Published on 14th February 2008
Overview
HF Radios are important for military communications. IP is widely used
and is the basis for most network communication. This paper looks at
use of IP over HF Radio and the efficiency of different types of application
over IP. The key findings are:
- IP can be operated over HF Radio, and doing so may be
useful, particularly to enable support of applications that are only
available to run over IP.
- Most IP based applications running over an HF link will make
very inefficient use of the HF link, and direct application use
of the HF link using STANAG 5066 will give much better performance.
This paper concludes that applications intended for regular use over
HF Radio should not use IP and should instead be directly integrated
with STANAG 5066.
IP
IP (Internet Protocol) is the basis of the Internet, and the most widely
used protocol. It supports a number of transports, including TCP, UDP
and RTP, and a myriad of end applications. IP is the universal interface
between physical networks and applications. Application writers develop
to IP, or to one of the standardized protocols or middleware systems
that run over IP. Networking technology providers seek to provide IP
operation as a primary goal.
"IP Everywhere" is such a strong culture, that it is difficult
to appreciate the few places where IP is not the right answer. IP is
universal and is the right answer almost all of the time. This paper
looks at why IP is not the right solution for HF Radio. Another example
of where IP is not right is deep space communications, where the long
round trip times require a different approach.
Key Factors for Good Performance over HF
HF Radio is very slow with typical rates around 1200 bits per second.
This slow speed is usually perceived as the primary difficulty with
using HF. For components above the modem level, speed is a given.
There are two general issues relating to performance that are now described.
Turnaround time
For data applications, the most significant problem with HF is turnaround
time. Turnaround time is the time taken to change direction of data
flow, and for HF Radio this is measured in seconds or tens of seconds.
Turnaround time is caused by factors at multiple levels:
- Modem turnaround time
- Interleaver completion time. This can be a significant time (e.g.,
2.16 seconds for STANAG 4539 short interleaver or 8.64 seconds for
the long interleaver).
- Delays due to Comsec layer.
- The HF Radio must switch from send to receive. (An HF Radio is
a simplex device, and cannot send and receive at the same time).
- HF Skywave latency times.
All of this leads to a turnaround time of at least a few seconds and
sometimes as much as 20-30 seconds. In order to make (reasonably) efficient
use of an HF link, it is critical to have transmit time longer than
turnaround time. This has a significant impact on applications.
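
The relationship between transmit time and turnaround time can be illustrated with a simple calculation. The sketch below is not taken from any standard: it assumes that every transmission is followed by exactly one turnaround, and the function name `link_efficiency` and the 10 second turnaround are illustrative choices.

```python
def link_efficiency(transmit_s: float, turnaround_s: float) -> float:
    """Fraction of elapsed time spent actually sending data.

    Simplifying assumption: one turnaround is paid for every
    transmission of length transmit_s.
    """
    if transmit_s <= 0 or turnaround_s < 0:
        raise ValueError("transmit_s must be positive, turnaround_s non-negative")
    return transmit_s / (transmit_s + turnaround_s)


# Illustrative turnaround of 10 seconds (within the range quoted above).
for transmit_s in (2.0, 10.0, 120.0):
    efficiency = link_efficiency(transmit_s, 10.0)
    print(f"{transmit_s:6.1f} s transmit -> {efficiency:.0%} of time sending data")
```

Under these assumptions a 2 second transmission spends about 17% of the elapsed time sending data, while a 120 second transmission spends about 92%.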
For VHF and higher frequencies, turnaround time is less, so the impact
is less significant. For VHF, full duplex transmission can also be used,
which removes the impact completely.
Efficient Use of the Pipe
It is important that applications use the HF link as efficiently as
possible, as bandwidth is limited. There are a number of things that
need to be considered:
- Data Compression. This will be important for most applications.
Compression is not discussed in this paper, but should be addressed
by applications operating over HF.
- Protocol and Header Overhead. It is important that protocols do
not consume undue bandwidth with headers and other information. This
should be optimized. The protocols discussed in this paper, and in
particular STANAG 5066, are well optimized.
- Grouping data. Where possible data should be sent together in long
transmissions, to minimize the effect of turnaround time.
- Avoiding repeat transmissions, except when data is lost. It is
clearly important that data is not sent twice when not needed, and
applications should be designed to avoid this.
These last two points have potential for significant inefficiency,
and are discussed in more detail in the context of relevant protocols.
STANAG 5066
STANAG 5066 is a NATO standard for running applications over HF Radio.
This is described in the Isode white paper "STANAG 5066: The Standard
for Data Applications over HF Radio".
An HF Modem provides a quite basic send/receive capability. STANAG
5066 provides application oriented capabilities over this. Capabilities
of relevance to this paper:
- Decoupling. Data can be accepted from one or more applications,
and queued for sending. A STANAG 5066 server can then “fill
the pipe” to avoid wasting capacity.
- Fragmentation. STANAG 5066 will break application data into blocks
appropriate for the modem speed (DPDUs). This is important both for
precedence handling and acknowledgement.
- Precedence handling. STANAG 5066 will send higher priority data
(DPDUs) first.
- Support for multicast, which can also be used to support nodes
in Radio Silence (also known as EMCON (Emission Control)).
- Reliable transmission is supported for non-multicast traffic, by
acknowledgement at the DPDU level. This means that when data is lost
(e.g., due to radio noise), only the lost DPDUs are retransmitted.
This is more efficient than application level retransmission.
- Minimizing turnarounds. STANAG 5066 has long transmit times (up
to 127.5 seconds) and works to maximize transmit time. Acknowledgements
are delayed wherever possible, so reliable transmission does not increase
the number of turnarounds.
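
The benefit of acknowledgement at the DPDU level can be sketched as follows. This is an illustration only: the 200 byte block size is an assumed value (real DPDU sizes depend on configuration and modem speed), and `fragment` and `retransmit_cost` are hypothetical helper names rather than part of STANAG 5066.

```python
from typing import Iterable, List, Tuple


def fragment(data: bytes, block_size: int) -> List[bytes]:
    """Split application data into fixed-size blocks (the last may be shorter)."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def retransmit_cost(data: bytes, block_size: int,
                    lost_blocks: Iterable[int]) -> Tuple[int, int]:
    """Return (bytes resent at block level, bytes resent at application level)."""
    blocks = fragment(data, block_size)
    lost = set(lost_blocks)
    block_level = sum(len(blocks[i]) for i in lost)
    # Without block-level acknowledgement the application resends the whole unit.
    application_level = len(data) if lost else 0
    return block_level, application_level


unit = bytes(2048)                       # one 2 kByte unit of application data
print(len(fragment(unit, 200)))          # number of blocks
print(retransmit_cost(unit, 200, {3}))   # a single lost block
```

With these assumed sizes the unit is split into 11 blocks, and losing one full block costs 200 bytes of retransmission rather than 2048.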
The central service in STANAG 5066 is "unit data". This service
allows transfer of a block of data, with a maximum size
set at the STANAG 5066 level (typically 2 kBytes). Applications using
STANAG 5066 transfer all information using unit data.
Unit data may be unacknowledged (best effort) or acknowledged (reliable).
The choice is made by the application using STANAG 5066. Unacknowledged
must be used where data is being sent to more than one recipient (broadcast
or multicast) or where the recipient is in EMCON (Emission Control)
and cannot transmit an acknowledgement.
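
The choice between the two forms of unit data can be expressed as a small decision rule. The sketch below is a hedged illustration of the constraints just described; `Destination`, `DeliveryMode` and `select_mode` are hypothetical names and do not correspond to any STANAG 5066 API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class DeliveryMode(Enum):
    ACKNOWLEDGED = "acknowledged (reliable)"
    UNACKNOWLEDGED = "unacknowledged (best effort)"


@dataclass(frozen=True)
class Destination:
    address: str            # illustrative label, not a real address format
    in_emcon: bool = False  # True if the recipient cannot transmit


def select_mode(destinations: Sequence[Destination],
                want_reliable: bool) -> DeliveryMode:
    """Pick a unit data mode that respects the multicast and EMCON limits."""
    if not destinations:
        raise ValueError("at least one destination is required")
    multiple = len(destinations) > 1
    any_emcon = any(d.in_emcon for d in destinations)
    if multiple or any_emcon:
        # No acknowledgement can be collected, so reliable mode is unavailable.
        return DeliveryMode.UNACKNOWLEDGED
    return DeliveryMode.ACKNOWLEDGED if want_reliable else DeliveryMode.UNACKNOWLEDGED


print(select_mode([Destination("ship-1")], want_reliable=True))
print(select_mode([Destination("ship-1", in_emcon=True)], want_reliable=True))
```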
IP over HF Radio
IP could be mapped onto HF Radio in a number of ways. There are two
mechanisms standardized (STANAG 5066 and STANAG 4538) which are discussed
here. These provide similar characteristics to the IP user, and so the
choice of mapping does not significantly affect the analysis in this
paper. It is hard to conceive of alternate mappings that would lead
to improvements.
IP over HF Radio using STANAG 5066
STANAG 5066 defines use of IP over STANAG 5066. This support is mandatory
in the current standard, although not all STANAG 5066 products support
it. The mapping is very simple: essentially an IP packet is mapped directly
onto STANAG 5066 unit data. There are two options for unit data, both
of which are valid for IP:
- Unreliable. This is a natural mapping for IP, as IP is defined as
an unreliable protocol.
- Reliable. This may be used for unicast and non-EMCON transmission.
There are a number of factors in the choice:
- If data needs to be retransmitted, reliable will be more efficient,
as this can be done at the DPDU level, rather than forcing the application
to retransmit the complete unit data.
- Some applications respond to lost data by either reducing transmission
rate, or by packet exchange which leads to more turnarounds. These
are both undesirable, and can be addressed by the reliable option.
- A consequence of minimizing turnarounds is that in the event of
data loss, IP packets using the reliable option may be considerably
delayed. Some applications and operating systems respond badly to this delay,
and data may be retransmitted unnecessarily.
This choice gives a flavor of the issues that will be examined in the
rest of this paper. The support of IP over HF radio is quite straightforward.
The issues arise from the interaction of applications using IP and consequences
of the underlying characteristics of HF Radio.
Where priority data is carried in IP, this can be mapped onto STANAG
5066 priority. Handling of priority at the IP level is discussed in
the Isode white paper "Sending FLASH Messages Quickly".
IP over HF Radio using STANAG 4538
STANAG 4538 provides an alternate set of data link services and ARQ
(acknowledgement mechanisms to support reliable data) to STANAG 5066.
STANAG 4538 and its associated specifications are an example of 3G HF Radio.
STANAG 4538 provides a reliable point to point data service that will
work with very poor radio conditions. This makes it ideal for use with
small “man pack” radios with whip aerials that will often
need to deal with poor signals. STANAG 4538 defines a mapping of IP
over its data services, which is a straightforward use of the underlying
reliable data transfer. It has characteristics very similar to the reliable
mapping of STANAG 5066.
STANAG 5066 and associated standards are an example of 2G HF Radio.
It might be assumed that 3G is always preferable to 2G, but this is
not the case. STANAG 5066 data link and ARQ will perform better than
STANAG 4538 in fair and good radio conditions, so is the best choice
in situations such as naval communications or strategic links where
more powerful radios and large aerials can be used. STANAG 5066 also
provides multiplexing, EMCON support and multicast which are not available
in STANAG 4538 data services.
STANAG 4538 data link may also be used in conjunction with STANAG 5066
application integration. This will enable an application to use STANAG
5066 as the mechanism for application separation and API, and then use
the STANAG 4538 data link services. This is the approach recommended
by this paper to support applications over STANAG 4538.
The Example Applications
This paper looks at two example applications to analyze the performance
of working with and without IP over STANAG 5066. These are:
- Internet messaging (message submission and transfer). IP and STANAG
5066 mappings are defined:
- SMTP (Simple Mail Transfer Protocol) is defined to work over
IP.
- STANAG 5066 Annex F defines HMTP (HF Message Transfer Protocol).
This is a variant of SMTP, that provides some simple SMTP level
optimizations and defines a direct mapping onto STANAG 5066.
- STANAG 4406 Military Messaging. STANAG 4406 defines operation over
low bandwidth in Annex E, and this operates over the ACP 142 protocol.
IP and STANAG 5066 mappings are defined:
- ACP 142 defines operation over the Internet Standard UDP (User
Datagram Protocol) which operates over IP.
- ACP 142 is defined so that it can be operated over different
underlying protocols. STANAG 4406 Annex E defines operation directly
over STANAG 5066.
SMTP gives a good insight into a TCP based application, which is a
common choice for Internet applications (it is not used for real time
applications such as voice and video, but is used for most other applications).
ACP 142 was designed for use over HF Radio, and is a good example of
a rate based approach.
As well as being important applications in their own right, the analysis
shows a number of key issues in the way the various protocol combinations
work. This paper looks to highlight key protocol choices, rather than
give detailed descriptions. Many of the protocols examined in this paper
are presented in a highly simplistic manner, in order to make clear
the major features and in particular characteristics that will impact
performance.
Data Streams & Internet Messaging
This section looks at the underlying mapping of Internet Messaging (SMTP
over TCP and HMTP direct over STANAG 5066).
TCP
In order to analyze performance, it is important to understand (in
a very simplistic way) how TCP works in conjunction with IP. In an IP
network there are no end to end circuits or virtual circuits. IP routers
simply switch packets. When congestion occurs, the routers drop packets.
TCP sends data packets in order, and gets back an acknowledgement (ACK)
for each data packet. If it does not get an ACK back within a reasonable
time, it resends the data packet. Both data packets and ACKs are carried
by IP, and data flows in both directions to support TCP. TCP is a windowing
protocol: the receiver specifies in ACK packets how many bytes beyond
the data acknowledged the sender may send. The window is the mechanism
used by TCP to control how fast it goes. When packets are lost, the TCP
sender will reduce (close down) its congestion window, which will in
turn reduce the data rate. This “fair” behavior
of TCP is a key element of controlling congestion in the Internet.
The sequence below shows how TCP with data flowing from client to server
would map onto IP packets:
Client: TCP SYN ->
Server: <- TCP SYN,ACK
Client: TCP ACK & DATA ->
Server: <- TCP ACK
Client: TCP DATA ->
Server: <- TCP ACK
Client: TCP DATA ->
Server: <- TCP ACK
Client: TCP DATA ->
Server: <- TCP ACK
Client: TCP DATA ->
Server: <- TCP ACK
Client: FIN,ACK ->
Server: <- TCP RST,ACK
If mapped onto a high latency link, the order of packets may change
as follows:
Client: TCP SYN ->
Server: <- TCP SYN,ACK ** Turnaround 1
Client: TCP ACK & DATA -> ** Turnaround 2
Client: TCP DATA ->
Client: TCP DATA ->
Client: TCP DATA ->
Server: <- TCP ACK ** Turnaround 3
Server: <- TCP ACK
Server: <- TCP ACK
Server: <- TCP ACK
Client: TCP DATA -> ** Turnaround 4
Server: <- TCP ACK ** Turnaround 5
Client: FIN,ACK -> ** Turnaround 6
Server: <- TCP RST,ACK ** Turnaround 7
The key change to note is that multiple DATA packets are sent before
the associated ACK comes back. The above diagram notes how the TCP data
flow causes turnarounds:
- Turnarounds 1 & 2 are associated with opening the TCP connection.
- Turnarounds 3 & 4 are associated with the TCP window, requiring
an ACK before more data can be sent.
- Turnaround 5 is used to ensure that the last data is received by
the application before closing the connection.
- Turnarounds 6 & 7 are associated with closing the connection.
It can be seen that if a suitable size of window is chosen, ongoing
data transfer can be mapped efficiently onto the underlying HF data
exchange. There are quite a few turnarounds associated with open and
close, so TCP would not be efficient for short interactions.
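
Counting turnarounds in a packet sequence is mechanical: a turnaround occurs whenever the sender of a packet differs from the sender of the previous one. The following minimal sketch applies this to the two TCP sequences above, using "C" for a packet sent by the client and "S" for one sent by the server. The function name `count_turnarounds` is an illustrative choice.

```python
from typing import Sequence


def count_turnarounds(senders: Sequence[str]) -> int:
    """Count changes of transmit direction in an ordered packet trace."""
    return sum(1 for prev, cur in zip(senders, senders[1:]) if prev != cur)


# Sender of each packet in the high latency sequence (C = client, S = server).
high_latency = list("CSCCCCSSSSCSCS")
# Sender of each packet in the strictly alternating sequence shown first.
in_order = list("CS" * 7)

print(count_turnarounds(high_latency))
print(count_turnarounds(in_order))
```

The high latency ordering needs 7 turnarounds, against 13 if every data packet had to be acknowledged before the next was sent.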
A key benefit of using TCP (over IP) is that a wide range of applications
come with TCP "out of the box".
A basic problem with TCP and HF, is that most "out of the box"
TCP implementations are tuned for much faster networks than HF. A good
TCP implementation will adapt its settings to network conditions, but
this will be likely to have a performance cost in extra turnarounds
and inefficient use of the pipe. Specific issues:
- Window is too small. This will cause the sender to wait for ACKs
too soon, and have too short a transmit time before the turnaround.
- Window is too large. This will cause the sender to transmit more
packets than can be queued (in the various queues between the sender
and the STANAG 5066 system) and IP packets will get dropped.
- Packets retransmitted without need. This is most likely to be a
problem early on, before the sender has an accurate measure of round
trip time.
In summary, efficiency will depend significantly on the TCP implementations,
and how they react to the network characteristics. TCP will be a poor
choice for short connections and for "chatty" applications.
It should give reasonable performance for long lived TCP connections
with steady data flows that can map onto the optimal "two minutes
each way" HF model.
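
The window sizing problem can be made concrete with some rough arithmetic. The sketch below ignores TCP/IP header overhead, compression and modem framing, and the function names are illustrative; it simply relates a window size in bytes to the time taken to transmit it at a given link speed.

```python
def window_for_transmit_time(rate_bps: float, transmit_s: float) -> int:
    """Bytes needed to keep the link busy for transmit_s seconds."""
    return int(rate_bps * transmit_s / 8)


def transmit_time_for_window(window_bytes: int, rate_bps: float) -> float:
    """Seconds taken to send one full window at rate_bps."""
    return window_bytes * 8 / rate_bps


RATE_BPS = 1200  # typical HF rate quoted earlier in this paper

print(window_for_transmit_time(RATE_BPS, 120))     # bytes for two minutes of sending
print(transmit_time_for_window(1500, RATE_BPS))    # seconds before a small window stalls
print(transmit_time_for_window(65535, RATE_BPS))   # seconds to drain a large window
```

At 1200 bits per second, about 18,000 bytes are needed to fill two minutes of transmission, while a 1,500 byte window is exhausted after 10 seconds. A 65,535 byte window would take over seven minutes to send, which is far more than a single transmission can carry.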
If a reliable mapping of IP onto STANAG 5066 or IP over STANAG 4538
is chosen, it is important that TCP is tuned so that it does not cause
turnarounds or retransmissions as a consequence of the long delays.
If data is lost with an unreliable mapping, additional turnarounds are
a likely consequence.
SMTP & Turnarounds
The core SMTP protocol is "chatty". The sending implementation
provides data (e.g., a recipient email address) and then waits for the
receiver to accept or reject the address. This leads to many application
turnarounds, where one end waits for the other.
In an HF environment, this is highly undesirable application behavior,
as it requires lots of turnarounds in order to support it.
The turnarounds of SMTP and TCP are illustrated below, with each line
representing an IP packet. The protocol exchange was taken from a real
interaction from Microsoft Outlook to Isode’s M-Switch server,
using Microsoft’s secure authentication.
Client: TCP SYN ->
Server: <- TCP SYN,ACK
Client: TCP ACK ->
Server: <- SMTP 220 (Welcome)
Client: SMTP EHLO ->
Server: <- TCP ACK
Server: <- SMTP 250 (OK)
Client: SMTP AUTH NTLM (Authenticate Session) ->
Server: <- SMTP 334 (Intermediate Authentication Response)
Client: SMTP Authentication Data ->
Server: <- SMTP 334
Client: SMTP Authentication Data ->
Server: <- SMTP 235 (Authentication successful)
Client: SMTP Mail From (Message Sender) ->
Server: <- TCP ACK
Server: <- SMTP 250 (OK)
Client: SMTP RCPT TO (Recipient) ->
Server: <- TCP ACK
Server: <- SMTP 250 (OK)
Client: SMTP DATA (Ask if ready for the message) ->
Server: <- TCP ACK
Server: <- SMTP 354 (Go ahead)
Client: SMTP Message Body ->
Server: <- TCP ACK
Client: SMTP EOM ->
Server: <- TCP ACK
Server: <- SMTP 250
Client: TCP ACK ->
Client: SMTP QUIT ->
Client: TCP FIN,ACK ->
Server: <- SMTP 221
Server: <- TCP RST,ACK
This exchange is shown at the IP level. Note that this exchange was
to transfer a very short message, which fitted into a single IP packet.
There are 23 changes of direction, which would require 23 turnarounds.
This is very inefficient.
The way TCP maps data onto IP packets affects performance. The exchange
above is optimized for a fast low latency network; over HF it could be
improved by combining TCP ACKs and SMTP data into shared IP packets.
Server to Server communication with a modern SMTP implementation making
full use of pipelining would need fewer packets, but the number of turnarounds
remains significant. A good SMTP implementation such as M-Switch might
give the following, assuming no authentication and an efficient mapping
to IP:
Client: TCP SYN ->
Server: <- TCP SYN,ACK
Client: TCP ACK ->
Server: <- SMTP 220 (Welcome)
Client: SMTP EHLO ->
Server: <- TCP ACK + SMTP 250 (OK)
Client: SMTP Addressing Information ->
Server: <- SMTP Address Response
Client: SMTP Message ->
Server: <- SMTP Message Response
Client: TCP FIN,ACK + SMTP QUIT ->
Server: <- TCP RST,ACK SMTP 221
It can be seen that this is much better than the previous example,
with just eleven turnarounds, but still has a significant turnaround
overhead.
HMTP Direct Mapping to STANAG 5066
HMTP has two key differences from SMTP. The first is that it defines
a mode of operation very similar to standard SMTP pipelining that minimizes
the number of turnarounds (to two for a small message). It also fixes options
to maximize interoperability without the need for service negotiation.
The key feature of HMTP is that it defines a mapping onto unit data
with ARQ. The ARQ is essential, as HMTP cannot deal with data loss.
For a message that fits into three MTUs, this would lead to the following
sender/receiver interaction:
Sender: Data (HMTP Commands & Message) ->
Sender: Data (HMTP Commands & Message) ->
Sender: Data (HMTP Commands & Message) ->
Receiver: <- ARQ (for 3 Data)
Receiver: <- Data (HMTP Response)
Sender: ARQ ->
It can be seen that this is very efficient, with just two turnarounds
needed (where the message can be transmitted within 127.5 seconds).
The mapping of a TCP based application directly onto STANAG 5066 offers
significantly better performance.
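
The turnaround counts quoted in this section (23 for the Outlook exchange, 11 for the pipelined SMTP exchange, and two for HMTP) can be converted into an approximate time cost. The sketch below assumes a fixed 10 second turnaround, which is an illustrative figure within the range given earlier, and ignores the time spent transmitting the data itself.

```python
from typing import Dict

# Turnaround counts taken from the exchanges shown in this section.
TURNAROUNDS: Dict[str, int] = {
    "SMTP over TCP (Outlook exchange)": 23,
    "SMTP over TCP (pipelined)": 11,
    "HMTP over STANAG 5066": 2,
}


def turnaround_overhead_s(turnarounds: int, turnaround_s: float) -> float:
    """Total time lost to turnarounds, excluding data transmission time."""
    return turnarounds * turnaround_s


ASSUMED_TURNAROUND_S = 10.0  # illustrative assumption, not a measured value

for name, count in TURNAROUNDS.items():
    overhead = turnaround_overhead_s(count, ASSUMED_TURNAROUND_S)
    print(f"{name}: {overhead:.0f} s of turnaround overhead")
```

With this assumed turnaround time, the overhead falls from 230 seconds to 110 seconds to 20 seconds across the three cases, for a message that fits in a handful of packets.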
ACP 142
ACP 142 is a protocol for supporting Multicast and EMCON transmission
of data, over HF and other networks. STANAG 4406 Annex E uses ACP 142
by providing it with a single compressed file to transfer to one or
more destinations. This is described in the Isode white paper "Military
Messaging over HF Radio and Satellite using STANAG 4406 Annex E".
How ACP 142 works
ACP 142 works by dividing the data to be transferred into multiple
packets. It sends out each packet in turn, to unicast or broadcast addresses.
Each recipient will inform the sender of any missing packets (so that
they can be re-transmitted) and at the end tells the sender that it
has all of the packets. Key features of ACP 142:
- It uses an unreliable datagram service, and does not require low
level acknowledgements.
- There is no windowing mechanism to control data transfer rate.
A key issue to address with underlying mappings is how to optimize
data transfer rate.
ACP 142 direct over STANAG 5066
ACP 142 has a straightforward mapping onto the unit data service of
STANAG 5066, using the unreliable (non-ARQ) option.
ACP 142 provides functionality to support multicast and EMCON transmissions.
It also provides an “optimal” transfer of data to a single
recipient. The following sequence is directly comparable to the HMTP
sequence shown earlier:
Sender: Data (ACP 142 Addressing Information) ->
Sender: Data (ACP 142 Data) ->
Sender: Data (ACP 142 Data) ->
Sender: Data (ACP 142 Data) ->
Receiver: <- Data (ACP 142 Ack)
It can be seen that there is only a single turnaround, after the data
has been sent (as ARQ is not used). If data is lost, this will be handled
by the ACP 142 protocol. This uses the absolute minimum of turnarounds
for reliable data transfer.
ACP 142 handles loss of intermediate packets, so if a packet is lost
at the modem level, only the lost packet needs to be retransmitted.
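
The selective retransmission behavior described here can be sketched in a few lines. This is a simplified model and not the ACP 142 PDU formats or state machine: packets are identified by a sequence number, the receiver reports which numbers are missing, and retransmissions are assumed to arrive. The names `missing_packets` and `transfer` are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple


def missing_packets(total: int, received: Set[int]) -> List[int]:
    """Sequence numbers the receiver still needs, in ascending order."""
    return [seq for seq in range(total) if seq not in received]


def transfer(packets: Sequence[bytes],
             lost_first_time: Set[int]) -> Tuple[bytes, int]:
    """Send every packet once, then resend only those reported missing.

    Returns the reassembled data and the total number of packets sent.
    """
    received: Dict[int, bytes] = {}
    sent = 0
    for seq, packet in enumerate(packets):
        sent += 1
        if seq not in lost_first_time:
            received[seq] = packet
    # After the turnaround, the receiver reports what is missing.
    for seq in missing_packets(len(packets), set(received)):
        sent += 1
        received[seq] = packets[seq]
    data = b"".join(received[seq] for seq in range(len(packets)))
    return data, sent


print(transfer([b"AA", b"BB", b"CC", b"DD"], lost_first_time={2}))
```

In this example one of four packets is lost, so five packets are sent in total rather than eight.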
Rate control with STANAG 5066 is straightforward, and is handled by
the SIS protocol. The application can send data as fast as it wishes
using the unit data service. If the STANAG 5066 SIS server has too much
data from the sending application (or from another application) it can
request the application to stop sending. It will inform the application
later when it can send data again. Because the SIS server interacts
with the modem and handles all of the data being sent, it has all the
information necessary to optimize use of HF link. The flow control enables
the ACP 142 implementation to send data at the optimal rate.
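
The stop and resume behavior described above can be modelled as a sender that only submits data while flow is on. The sketch below is a toy model under stated assumptions: `FlowControlledSender`, `flow_off`, `flow_on` and the `submit` callback are hypothetical names, and the real SIS protocol primitives and their encoding are not shown.

```python
from collections import deque
from typing import Callable, Deque, Iterable, List, Tuple


class FlowControlledSender:
    """Queues unit data and submits it only while flow is on."""

    def __init__(self, units: Iterable[bytes],
                 submit: Callable[[bytes], None]) -> None:
        self._pending: Deque[bytes] = deque(units)
        self._submit = submit      # hypothetical hand-off to the server
        self._flow_on = True

    @property
    def backlog(self) -> int:
        return len(self._pending)

    def flow_off(self) -> None:
        """Server asks the application to stop sending."""
        self._flow_on = False

    def flow_on(self) -> None:
        """Server tells the application it may send again."""
        self._flow_on = True
        self.pump()

    def pump(self) -> None:
        while self._flow_on and self._pending:
            self._submit(self._pending.popleft())


def run_demo() -> Tuple[List[bytes], int]:
    accepted: List[bytes] = []

    def submit(unit: bytes) -> None:
        accepted.append(unit)
        # Pretend the server asks us to stop after every second unit.
        if len(accepted) % 2 == 0:
            sender.flow_off()

    sender = FlowControlledSender([b"u1", b"u2", b"u3", b"u4", b"u5"], submit)
    sender.pump()
    backlog_after_first_burst = sender.backlog
    while sender.backlog:
        sender.flow_on()   # server signals that it can take more data
    return accepted, backlog_after_first_burst


print(run_demo())
```

The application never needs to know the link speed: it sends as fast as it is allowed to, and the server paces it.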
ACP 142 over IP
ACP 142 maps cleanly onto IP using the User Datagram Protocol (UDP),
which is a simple application service layered directly on IP. When IP
is mapped without ARQ, the result is very similar to the direct mapping
to STANAG 5066, with a small overhead of the UDP and IP headers.
A transmitting ACP 142 application needs to control the rate at which
it sends out packets, and when using IP there is no protocol mechanism
to achieve this. The approach adopted by implementations we are aware
of is to configure a rate at which packets are sent out. This value
will be set to match the underlying (HF) network. Setting the rate needs
care:
- If the rate is set too low, bandwidth is wasted.
- If the rate is set too high, IP packets will get dropped. This
will cause ACP 142 to retransmit in response to missing packet information
from receivers. This will be inefficient.
The difficulty is that the rate for a real system will be variable,
and the application has no mechanism to determine this rate. The HF
bandwidth available may change according to conditions. The link may
be shared with other STANAG 5066 applications (which may have higher
precedence), and their use will not be visible.
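
The cost of a mismatched rate can be estimated with simple arithmetic. The sketch below is illustrative only: it assumes a constant configured send rate, a constant link rate, and a single queue of fixed size between the application and the HF link. The function name `rate_mismatch` and the example figures are assumptions.

```python
from typing import Dict, Optional


def rate_mismatch(configured_bps: float, link_bps: float,
                  queue_limit_bytes: int) -> Dict[str, Optional[float]]:
    """Estimate the effect of a fixed send rate on a link of a given speed.

    Returns the fraction of link capacity left idle, and the time until
    the queue overflows (None if the queue never fills).
    """
    if configured_bps <= 0 or link_bps <= 0:
        raise ValueError("rates must be positive")
    if configured_bps <= link_bps:
        # Sending too slowly: the link sits idle for part of the time.
        return {"wasted_fraction": 1 - configured_bps / link_bps,
                "seconds_until_drops": None}
    # Sending too fast: the queue grows until packets are dropped.
    excess_bps = configured_bps - link_bps
    return {"wasted_fraction": 0.0,
            "seconds_until_drops": queue_limit_bytes * 8 / excess_bps}


# Rate configured for 1200 bps, but the link has dropped to 600 bps.
print(rate_mismatch(1200, 600, queue_limit_bytes=3000))
# Rate configured for 600 bps, but the link could carry 1200 bps.
print(rate_mismatch(600, 1200, queue_limit_bytes=3000))
```

In the first case a 3000 byte queue overflows after 40 seconds; in the second, half of the available capacity is unused.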
Some implementations use ICMP Source Quench to provide flow control.
This works to some extent, but has a number of problems:
- Source Quench is not allowed for multicast addresses. This is a
problem as ACP 142 is often used for multicast destinations.
- The Source Quench may be generated after packet discard, and the
ACP 142 implementation cannot determine whether or not to retransmit.
- The Source Quench is a crude “slow down” signal. It
does not indicate how much to slow down or when or if it is safe to
speed up again.
In summary direct mapping of ACP 142 to STANAG 5066 gives better performance
than use of IP, due to the ability to provide optimal control of data
rate.
ACP 142 vs Data Stream
Although this paper is intended to primarily consider the use of IP
with HF Radio, it also provides a useful comparison between ACP 142
and a data stream mechanism such as the one used by HMTP for carrying
"bulk data". Where it is available as an option, ACP 142 has
the following advantages over a simple data stream approach.
- It supports multicast transmission.
- It supports EMCON transmission.
- For a transfer to a single recipient of data larger than the MTU
size, it requires fewer turnarounds, and is likely to give better
performance.
Analysis of IP over HF Radio
The paper has looked at two specific applications, and compared operation
with and without use of IP. This section looks at how this specific
comparison applies more generally. Reliable applications fall into two
major classes according to how data transfer rate is controlled: rate
based and window based. These are now considered.
ACP 142 is a good example of a rate based protocol. The key performance
issue with a rate based protocol and HF is getting the rate correct.
This is straightforward when using STANAG 5066 directly, and not possible
when using IP. For rate based protocols, using IP is undesirable.
TCP is a windowing protocol, and in practice the only one that matters.
The key issue for TCP is to minimize turnarounds. For a chatty application
such as SMTP, the cost of turnarounds is prohibitive. A TCP based application
could give reasonable performance if the following factors are all present:
- A long lived application (so the TCP start/stop overhead can be
amortized).
- Stable HF data rate (i.e., no change in modem speed, or other applications
sharing the link), to avoid overhead and turnarounds of adjusting
the window.
- Low transmission errors (as these will lead to extra turnarounds).
These conditions are quite restrictive, so in most situations, a direct
mapping onto STANAG 5066 is going to give much better performance
than TCP over HF Radio.
Another class of application is based on unreliable communication,
typically using the User Datagram Protocol (UDP). In situations of low
packet loss, a UDP based approach is in practice reasonably reliable.
It would be sensible to use a reliable mapping of IP onto STANAG 5066
or STANAG 4538 in support of such protocols, to minimize packet loss
over the HF link. This sort of application is only going to be useful
for low volumes of data. A key problem is that there is no feedback
to the application if more data is sent than can be handled by the link.
Unreliable communication is only sensible for a specialized class of
applications, as reliability is generally desirable.
HF Links are also used for voice communication. This is generally given
priority over data, and data transfer cannot generally co-exist with
voice. This means that voice usage can interrupt data, and needs to
be taken into account by data applications.
Further Reading
Some useful papers relating to this are available online:
There are also several papers in the IEE 9th International Conference
on HF Radio Systems and Techniques:
- IP over HF as a bearer service for NATO formal messages
- Bowman HF IP network solution
- IP traffic over STANAG 5066
With some of these papers, it is important to read the details and
look at the numbers, which are all in line with the findings in this
paper. Some papers are written to look at use of IP, and summarize how
it works, and sometimes performance is reasonable. This is not inconsistent
with the technical conclusion here that it works, and performance varies
from atrocious to sub-optimal.
Conclusions
This paper has shown quite clearly that direct use of STANAG 5066 will
be substantially more efficient than the use of IP. The consequences
of this are:
- That IP can be operated over HF Radio, and that doing so may be
useful, particularly to enable support of applications that are only
available to run over IP.
- That most IP based applications running over an HF link will make
very inefficient use of the HF link, and that direct application use
of the HF link using STANAG 5066 will give much better performance.
This paper concludes that applications intended for regular use over
HF Radio should not use IP and should instead be integrated with HF
Radio using STANAG 5066.