Date: 16 August 2018 / Version: 1.2 / Status: Experimental
This document specifies a Slotted option for STANAG 5066 Annex K (CSMA Access) as an alternative to the Jitter approach that Annex K specifies. The Slotted option gives performance and resilience improvements for networks with a small number of nodes. This specification also clarifies handling of single and multiple simultaneous CAS-1 links in conjunction with Annex K.
This document is part of the STANAG 5066 Extension Protocol (S5066-EP) series. The complete set of documents in the series is:
- STANAG 5066 Extension Protocol Index (S5066-EP1)
- STANAG 5066 Padding DPDU (S5066-EP2)
- Pipelining the CAS 1 Linking Protocol (S5066-EP3)
- Data Rate Selection in STANAG 5066 for Autobaud Waveforms (S5066-EP4)
- STANAG 5066 Large Windows Support (S5066-EP5)
- Slotted Option for STANAG 5066 Annex K (S5066-EP6)
- Advertising Extended Capabilities (S5066-EP7)
- Block Based EOTs (S5066-EP8)
- Compact Acknowledgement (S5066-EP9)
- Extension DPDU (S5066-EP10)
- Variable C_PDU Segment Size (S5066-EP11)
1. Clarification of Annex K Status
Annex K is an optional Annex (information only). However, Annex J notes "A node may ― indeed, a node should ― implement the CSMA mode. If a CSMA mode is implemented, it shall be implemented in accordance with STANAG 5066 Annex K."
CSMA mode is often referred to as "listen before transmit" (LBT). It is clear industry practice to use LBT and it would be extremely foolish to implement Annex D (mandatory) without LBT/CSMA. It is hard to envisage a scenario where LBT is not used, as it is straightforward for a STANAG 5066 implementation to do this, and modern modems provide enhanced support for LBT.
A consequence of this is that Annex K is effectively mandatory, unless you choose to implement Annex L or an alternative MAC level approach such as TDMA.
2. Jitter vs. Slotted
In a CSMA network, nodes need to make independent decisions on when to transmit next. Two nodes transmitting together is highly undesirable, as there is no Collision Detect with HF.
When a node has finished transmitting there may be multiple nodes that wish to transmit. Annex K provides a Jitter algorithm to reduce risk of collision. The Jitter introduces random delay, which helps avoid two nodes transmitting at the same time.
An alternative approach set out here is to use a slotted approach, where each node on the network has a defined slot to start transmission in, relative to the end of the finishing transmission. Advantages of the slotted approach:
- The slots ensure that collisions are avoided.
- It can be faster than jitter for a network with a small number of nodes.
- It avoids collisions, even when there are long gaps.
Avoiding collisions is dependent on all nodes receiving at least some of the most recent transmission to determine when it ended (either by noting end of transmission or using an EOT based calculation).
The slotted approach is not "fair", as nodes with early slots will get priority over nodes with later slots. This characteristic will be good for some deployments and bad for others.
The slotted approach is not suitable for a network with large numbers of nodes. Both slotted and jitter approaches require all nodes to be configured with consistent parameters.
3. CAS-1 and Annex K
3.1 Single vs Multiple CAS-1 Physical Links
STANAG 5066 core and Annex K are written with a model of multiple simultaneous CAS-1 physical links. This all works cleanly, as the Annex K procedures can be followed at the end of each transmission to minimize risk of collision.
Many current STANAG 5066 implementations are designed to work with a single CAS-1 physical link. This has the benefit that, when a link is active, as soon as one node stops transmitting the other will start (i.e., Annex K is not used and delays are minimized). This approach optimizes communication between two nodes. It will also work well if one node needs to transfer a large block of data to two nodes at the same time on the same channel: it can open a CAS-1 link to the first node, transfer the data, and then repeat for the second node.
There are scenarios where use of multiple simultaneous CAS-1 links is beneficial. Consider where a node needs to transfer a small amount of data to two other nodes. If two CAS-1 links are open, the data for both nodes can be sent in a single transmission, which gives good latency and link utilization. Where S5066-EP3 is used, the CAS-1 links can be initiated in the same transmission. Annex K and this specification will minimize risk of the responses colliding.
This specification supports both approaches.
3.2 CAS-1 Link Lifecycle with Annex K
There are operational circumstances where (some) participating nodes on a multi-node channel do not implement Annex K or do not permit multiple simultaneous CAS-1 links. Alternatively, the common operational scenario for the channel may be that one node typically has bulk ARQ data to send to multiple nodes. In such cases it is recommended that nodes be configured so that CAS-1 links are only established when there is data to be sent and are closed quickly when the link becomes idle. This could be extended to restrict the establishment of new CAS-1 links (by a node) when any other CAS-1 links (involving other nodes) are known to exist. This strategy will enable maximum use of immediate retransmission and will avoid Annex K delays during CAS-1 links.
This approach means that commonly there will be only one CAS-1 link. Multiple simultaneous CAS-1 links should only be used on networks where all nodes can support this and where traffic that will benefit from multiple CAS-1 links is anticipated.
3.3 When a CAS-1 Node does not use Annex K
When a node is participating in a CAS-1 link and its peer finishes transmitting, it is desirable to avoid the Annex K delays whenever possible. This section sets out the rules for a node to determine if this is possible.
Other nodes will simply be following Annex K rules. They will detect the immediate transmission and so collision will be avoided. The node making the immediate transmission needs to be confident that it is the only node that will do this.
The primary decision is based on looking at DPDUs in the transmission that has just completed. If the transmission contains DPDUs for the node making the decision, and the node does not find any DPDUs directly addressed to other nodes, the node shall start to transmit immediately at the end of the received transmission. Note that the transmission received may contain non-ARQ DPDUs for broadcast and/or multicast destinations and that this does not affect the decision.
In the event that a node received a transmission but is not able to parse any DPDUs in the transmission, it may transmit immediately if the local node is known to be a peer in all known CAS-1 links. Under these conditions it is able to transmit safely, as no other node is expected to transmit at this point. In order to determine this, a node needs to monitor the status of all CAS-1 links on the channel (not just the CAS-1 links that the node is involved in).
When these conditions do not apply, Annex K and the related rules in this specification shall be used.
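The decision rules of this section can be illustrated with a short sketch. It is not part of the specification: the Dpdu summary type, the string node addresses, and the representation of a CAS-1 link as a pair of addresses are assumptions made for illustration. The sketch also assumes that the "no DPDUs parsed" rule only applies when at least one CAS-1 link is known, and it takes the permitted ("may") immediate transmission in that case.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Iterable


class Action(Enum):
    TRANSMIT_IMMEDIATELY = "transmit immediately"
    FOLLOW_ANNEX_K = "follow Annex K"


@dataclass(frozen=True)
class Dpdu:
    """Hypothetical summary of one parsed DPDU (not a STANAG 5066 structure)."""
    destination: str          # node address, assumed to be a string here
    directly_addressed: bool  # False for non-ARQ broadcast/multicast DPDUs


def decide_after_reception(
    local: str,
    parsed_dpdus: Iterable[Dpdu],
    known_links: Iterable[FrozenSet[str]],
) -> Action:
    """Decide what the local node does when a received transmission ends."""
    dpdus = list(parsed_dpdus)
    if dpdus:
        # Primary rule: broadcast/multicast DPDUs do not affect the decision.
        direct = [d for d in dpdus if d.directly_addressed]
        for_local = any(d.destination == local for d in direct)
        for_others = any(d.destination != local for d in direct)
        if for_local and not for_others:
            return Action.TRANSMIT_IMMEDIATELY
        return Action.FOLLOW_ANNEX_K
    # Fallback rule: nothing could be parsed, so rely on monitored link state.
    links = list(known_links)
    if links and all(local in link for link in links):
        return Action.TRANSMIT_IMMEDIATELY
    return Action.FOLLOW_ANNEX_K
```

For example, a node "A" that receives DPDUs addressed to itself plus a broadcast DPDU transmits immediately, while the presence of any DPDU directly addressed to a node "B" sends it back to the Annex K rules.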
4. Good Conditions LBT Timer
For slotted to work effectively, there needs to be good agreement as to the end of transmission time. EOTs are set at half second intervals, so an EOT based calculation should be accurate to about one second.
This EP introduces a new timer, LBT_WAIT_TIME_GC, that can be used in good conditions.
This timer must be used when an EOT based end of transmission can be calculated. The rationale for this is that the EOT based calculation of end of transmission is the most accurate approach, and this will allow a shorter value of LBT_WAIT_TIME_GC to be reliably used.
If end of transmission cannot be determined by EOT, then a longer timer is used. This gives time for another node which did hear the signal in full to accurately use LBT_WAIT_TIME_GC.
If EOTs from the last transmission are available, these shall be used to determine end of transmission time. Otherwise end of transmission is determined by the actual radio signal ending. Note that the radio signal may fade at the end, so that the actual signal end may be later than measured.
The standard LBT_WAIT_TIME must be used when end of transmission is determined from the radio signal. It is expected that this will be set to an appropriately conservative value (current default is 30 seconds).
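The timer selection described in this section can be sketched as follows. This is an illustration only: the function names are hypothetical, times are assumed to be seconds on a local clock, and the reference point from which an EOT value counts is left to the caller because it is not defined in this document. The value of 2.0 seconds used for LBT_WAIT_TIME_GC in the usage note is an arbitrary example, not a recommendation.

```python
from typing import Optional

EOT_UNIT_SECONDS = 0.5  # EOTs are set at half-second intervals


def end_time_from_eot(reference_time: float, eot_units: int) -> float:
    """Estimate end of transmission from an EOT value.

    reference_time is the instant the EOT value counts from; which instant
    that is must be supplied by the caller (assumption of this sketch).
    """
    return reference_time + eot_units * EOT_UNIT_SECONDS


def wait_expiry(
    eot_end: Optional[float],
    signal_end: float,
    lbt_wait_time_gc: float,
    lbt_wait_time: float = 30.0,  # current default for LBT_WAIT_TIME
) -> float:
    """Return the time at which the applicable wait timer expires."""
    if eot_end is not None:
        # EOT based end of transmission is available: use the short timer.
        return eot_end + lbt_wait_time_gc
    # Otherwise fall back to the measured signal end and the long timer.
    return signal_end + lbt_wait_time
```

With an EOT based end time of 103.5 and LBT_WAIT_TIME_GC of 2.0, the timer expires at 105.5. Without EOTs and with a measured signal end of 104.0, it expires at 134.0.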
5. Addition of Slotted
The slotted approach introduces three new parameters:
- SLOT_TIME: the time length of each slot.
- MAX_SLOTS: the maximum number of slots, which must be greater than or equal to the number of nodes on the network.
- NODE_SLOT_POSITION: A per node configuration in the range 1 to MAX_SLOTS.
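Because all nodes need consistent parameters, a deployment tool might check a planned configuration before use. The following is a minimal sketch of such a check. The SlotConfig type and check_network function are hypothetical, and the requirement that no two nodes share a NODE_SLOT_POSITION is an assumption drawn from each node having its own defined slot.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SlotConfig:
    """Slotted parameters as configured on one node (hypothetical type)."""
    slot_time: float          # SLOT_TIME in seconds
    max_slots: int            # MAX_SLOTS
    node_slot_position: int   # NODE_SLOT_POSITION, 1..MAX_SLOTS


def check_network(configs: Dict[str, SlotConfig]) -> List[str]:
    """Return a list of human-readable problems; empty means consistent."""
    problems: List[str] = []
    if not configs:
        return problems
    reference = next(iter(configs.values()))
    slot_owner: Dict[int, str] = {}
    for node, cfg in configs.items():
        if (cfg.slot_time, cfg.max_slots) != (reference.slot_time, reference.max_slots):
            problems.append(f"{node}: SLOT_TIME/MAX_SLOTS differ from other nodes")
        if cfg.max_slots < len(configs):
            problems.append(f"{node}: MAX_SLOTS is smaller than the number of nodes")
        if not 1 <= cfg.node_slot_position <= cfg.max_slots:
            problems.append(f"{node}: NODE_SLOT_POSITION outside 1..MAX_SLOTS")
        elif cfg.node_slot_position in slot_owner:
            owner = slot_owner[cfg.node_slot_position]
            problems.append(f"{node}: shares slot {cfg.node_slot_position} with {owner}")
        else:
            slot_owner[cfg.node_slot_position] = node
    return problems
```

A three node network with MAX_SLOTS of 4 and positions 1, 2 and 3 passes this check; reusing position 1 on two nodes is reported as a shared slot.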
The times when a node can start to transmit after LBT_WAIT_TIME or LBT_WAIT_TIME_GC has expired are:
(NODE_SLOT_POSITION - 1) * SLOT_TIME + n * MAX_SLOTS * SLOT_TIME
Where n is an integer value of zero or more. This allows slot co-ordination for an indefinite period after the last transmission.
The slotted mechanism applies also to the node which made the last transmission, which can calculate LBT_WAIT_TIME_GC. This node may have an option to require n to be at least 1, so that other nodes are given priority over the node that has just transmitted.
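As an illustration, the permitted start times can be computed as offsets in seconds from the expiry of LBT_WAIT_TIME or LBT_WAIT_TIME_GC. The function names and the min_n parameter are hypothetical; min_n=1 models the optional behaviour where the node that has just transmitted gives other nodes priority. The parameter values in the usage note are arbitrary examples.

```python
import math
from typing import Iterator


def slot_offsets(
    node_slot_position: int,
    slot_time: float,
    max_slots: int,
    min_n: int = 0,
) -> Iterator[float]:
    """Yield start offsets (seconds after timer expiry) for n = min_n, min_n+1, ..."""
    if not 1 <= node_slot_position <= max_slots:
        raise ValueError("NODE_SLOT_POSITION must be in the range 1 to MAX_SLOTS")
    if slot_time <= 0 or min_n < 0:
        raise ValueError("SLOT_TIME must be positive and n must be zero or more")
    n = min_n
    while True:
        yield (node_slot_position - 1) * slot_time + n * max_slots * slot_time
        n += 1


def next_slot_offset(
    elapsed: float,
    node_slot_position: int,
    slot_time: float,
    max_slots: int,
    min_n: int = 0,
) -> float:
    """Return the first permitted start offset at or after `elapsed` seconds."""
    base = (node_slot_position - 1) * slot_time
    cycle = max_slots * slot_time
    # Smallest n whose slot start is not earlier than `elapsed`.
    n = max(min_n, math.ceil((elapsed - base) / cycle))
    return base + n * cycle
```

With NODE_SLOT_POSITION of 2, SLOT_TIME of 1.5 seconds and MAX_SLOTS of 4, the first three start offsets are 1.5, 7.5 and 13.5 seconds. A node in that slot which becomes ready 8.0 seconds after timer expiry would wait until 13.5 seconds.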
An implementation of this specification may use repeats of the slots to avoid subsequent collision. To do this LBT_WAIT_TIME is set to:
LBT_WAIT_TIME_GC + n * MAX_SLOTS * SLOT_TIME
Where n has a value of 1 or greater. This will give slots that are coordinated between the two timers.
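A worked sketch of the coordinated setting: because LBT_WAIT_TIME differs from LBT_WAIT_TIME_GC by a whole number of slot cycles, slot boundaries counted from either timer coincide. The helper names below are hypothetical, and choosing the smallest n that reaches a desired minimum wait is one possible policy, not a requirement of this specification.

```python
import math


def coordinated_lbt_wait_time(
    lbt_wait_time_gc: float,
    slot_time: float,
    max_slots: int,
    n: int = 1,
) -> float:
    """LBT_WAIT_TIME = LBT_WAIT_TIME_GC + n * MAX_SLOTS * SLOT_TIME, with n >= 1."""
    if n < 1:
        raise ValueError("n must be 1 or greater")
    return lbt_wait_time_gc + n * max_slots * slot_time


def smallest_n_reaching(
    minimum_wait: float,
    lbt_wait_time_gc: float,
    slot_time: float,
    max_slots: int,
) -> int:
    """Smallest n (at least 1) giving a coordinated LBT_WAIT_TIME >= minimum_wait."""
    cycle = max_slots * slot_time
    return max(1, math.ceil((minimum_wait - lbt_wait_time_gc) / cycle))
```

For example, with LBT_WAIT_TIME_GC of 2.0 seconds, SLOT_TIME of 1.5 seconds and MAX_SLOTS of 4 (all arbitrary example values), reaching at least the 30 second default needs n = 5, which gives a coordinated LBT_WAIT_TIME of 32.0 seconds.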
If this process is not followed, there must be a delay before sending after previous transmission of at least:
LBT_WAIT_TIMER + (NUM_CONT_SLOTS x CONT_SLOT_WIDTH)
6. Repeat Transmission by Sender
When ARQ data is being transmitted, the rules of this specification mean that another node is expected to transmit next. When non-ARQ data is being transmitted other nodes may or may not transmit. In this case, the sender will want to transmit again. To support this, the following rule is added: a sender should wait for WAIT_BETWEEN_TX_TIME seconds after LBT_WAIT_TIME_GC has expired before it can transmit again. This allows receivers who heard the transmission to get in first (as they will also be triggered by LBT_WAIT_TIME_GC), but the sender can transmit again without needing to wait for LBT_WAIT_TIME. There are a number of considerations for setting WAIT_BETWEEN_TX_TIME:
- If it is known that only ARQ data is being transmitted, a conservative (long) time is appropriate.
- If the node has the last slot, it should be timed to reach this slot.
- If repeat slots are being observed, it should be timed to wait for the sender’s next slot.
- If repeat slots are not being observed, it should be timed to wait until after every other node has had a slot.
7. Notes on Choice of Timers
The choice of SLOT_TIME will depend on:
- Transmitting for long enough so that the node with the next slot can detect this transmission and so not transmit.
- Accuracy with which each node can determine end of transmission. The SLOT_TIME needs to be long enough to allow for such variation.
This version of the EP does not make any recommendation on a default value for SLOT_TIME. The default value of LBT_WAIT_TIME is 30 seconds.
It is anticipated that a much shorter value of LBT_WAIT_TIME_GC can be used. Keeping LBT_WAIT_TIME longer makes sense for two reasons: it gives confidence that any CAS-1 link has terminated, and it allows another node that has performed the EOT calculation to transmit first, which the local node may then hear.
It is anticipated that a recommendation for these timers will be made in future versions of this EP, based on operational measurements.
8. Changes to STANAG 5066
This document defines an additional algorithm that can be added to Annex K.
9. Backwards Compatibility
When CSMA is used, the slotted or jitter algorithm must be agreed and consistently configured for all nodes. This extension can therefore only be used on a network where all nodes support it.