HF as an IP Subnet vs Applications Optimized for HFApproaches to integrating HF Radio into a modern IP based military network are described in the Isode white paper HF Radio & Network Centric Warfare. This paper also gives a general description as to the importance and role of HF Radio.
The above diagram shows how HF Radio could be deployed as an IP Subnet bearer. This is a standard approach to integrating different network technologies, and use of IP as the single common layer that separates application from network. This architecture has a number of major benefits. In particular:
An alternative architecture shown above is described in the white paper HF Radio & Network Centric Warfare. The key point is that an application proxy or relay is used, and specialized protocols are used directly over the HF radio. This approach has a number of benefits and disadvantages relative to using HF Radio as an IP subnet bearer, which are discussed in the white paper. This paper looks at one key aspect of the trade-off: relative performance. Given the inherently poor performance of HF Radio (low bandwidth and high error rate), it is important that the architecture chosen makes efficient use of the underlying network. Running IP over HF using STANAG 5066
STANAG 5066 provides a standardized separation between application and HF Radio data link layer, described in the Isode White Paper STANAG 5066: The Standard for Data Applications over HF Radio. STANAG 5066 defines (and mandates) in Annex F support of IP running over STANAG 5066. The architecture for this is shown above. Note that IP is not an end to end protocol, and is used in conjunction with end to end layer protocols. The most common are:
The diagram above shows how IP over STANAG 5066 would fit into the overall architecture. STANAG 5066 is used for the link communication between a pair of IP system. This will usually be between a pair of routers (intermediate systems) but may also be with a host (end system). STANAG 5066 Annex F defines the details of how this works.
STANAG 5066 is deployed using the SIS (Subnet Interface Service) protocol, which runs over TCP/IP. This will decouple the STANAG 5066 application (in this case the IP Service) from the STANAG 5066 server and associated modem/radio subsystem. This architecture is described further in the Isode White Paper on STANAG 5066. The diagram above shows how the SIS protocol and STANAG 5066 Server fit with an IP Router. IP can also be run over STANAG 4538. We would expect this to lead to results very similar to those obtained here for STANAG 5066 ARQ mode. It would be useful to confirm this by measurement. Previous Supporting MeasurementsThe measurements of this paper need to be interpreted in the context of two earlier sets of measurements made by Isode, which are referenced in the following text as: “The STANAG 5066 Performance Paper” – STANAG 5066 Performance Measurements over HF Radio gives a set of base measurements as to how STANAG 5066 performs. It only makes sense to consider the performance of protocols running over STANAG 5066 in the context of the characteristics of the service provided by STANAG 5066 to the application. “The HF Messaging Performance Paper” – Performance Measurements of Messaging Protocols over HF Radio gives measurements of messaging protocols designed to work over STANAG 5066. These measurements give a base line for comparing the performance of messaging applications running over IP. Test Setup
The test setup used here is essentially the same as the one used in the other Performance Papers. We used RM6 HF Modems from RapidM and the associated STANAG 5066 Servers, connected back to back with an audio link. This setup should give “perfect transfer”. A software bug (now fixed) led to a small level of packet loss. This small data loss seems realistic, and showed a number of interesting characteristics during tests.
The diagram above shows the IP and STANAG 5066 setup used for all of the tests. The IP Routers used for the tests are Red Hat Linux servers – the data volumes are low, and so there is no need to use special purpose hardware or operating system. The STANAG 5066 IP Client for Linux was developed by NATO NC3A, who provided the software for these tests. The applications used in the various tests then simply connect to the routers using IP, and this provides the applications with an IP path that is switched over HF Radio. UDP TestsThe first set of tests uses UDP (User Datagram Protocol). UDP is a simple transport layer protocol that effectively provides the application with direct access to the IP service. It provides an unreliable service, as IP is unreliable (i.e., routers and network components may discard IP packets). UDP measurements give a clear insight into the (raw) performance of IP over STANAG 5066. Testing Framework and Analysis
The test framework consists of a UDP generator that sends out UDP datagrams and a peer test analyser. The test generator sends out packets at size and rate specified for the test. Each packet is numbered, and includes a time stamp. The analyser records which packets are received (and by implication which packets are lost) and the delay. UDP MeasurementsIP over STANAG 5066 uses a direct mapping of IP packets onto the STANAG 5066 Unit Data service. Mapping may by ARQ (reliable) or non-ARQ (unreliable). The STANAG 5066 Performance paper made direct measurements of ARQ and non-ARQ transmission of Unit Data. The first set of UDP performance tests was seeking to measure throughput performance, and measurements equivalent to the STANAG 5066 tests were made. The UDP tests will add an overhead of 28 bytes for each datagram sent (8 bytes from UDP and 20 bytes from IPv4). This overhead can be calculated and deducted, so that a direct comparison with the STANAG 5066 results can be made. Packets were sent at a rate that of 100% of what the link can cope with, and so some packet loss occurred in the tests. This happens when the STANAG 5066 Server flow controls the IP Client, which will drop packets that cannot be sent. All tests were done in ARQ mode. It was seen as more important to carefully investigate characteristics of ARQ mode, as TCP could not be made to work when non-ARQ mode was used. The subsequent ACP 142 tests use IP over STANAG 5066 in non-ARQ mode. The following table shows for various combinations of line speed and PDU Size (the user data provided to UDP) the throughput, and the percentage overhead (comparing throughput to line speed). The IP/UDP overhead shown is a calculated value, with the remaining overhead assigned to STANAG 5066. Where equivalent STANAG 5066 measurements were made in the STANAG 5066 performance paper, the measured overhead is shown.
Observations on this data:
A second set of tests was made to look at latency for different network loading. These tests were done at 1200 bits per second with a PDU size of 512 bytes.
The applied load (70%) in this first graph is at a level which can get through without loss. Traffic settles to give a latency ranging broadly between 20-30 seconds. The sawtooth interval is about 30 secs, reflecting STANAG 5066 transmissions and turnaround of about this time. The larger spikes correspond to data loss. When this happens, the STANAG 5066 will retransmit. This gives a delay, because the retransmission will come after two more turnarounds. Because packets are delivered in order, this delay will also impact subsequent UDP datagrams. The double spike to the right is probably caused by the interaction of two data losses and associated retransmissions.
The second graph is for 20% loading (which should be easy to handle). The pattern is similar to 70% loading, but with a reduced typical latency time in the region of 10-20 seconds.
The final test had an applied load of 700%, so most UDP datagrams sent would be dropped off the back of the IP queue. After an initial build up, latency varies in the range 150-450 seconds, with a sawtooth pattern caused by the interaction of the IP queue, STANAG 5066 behaviour, and occasional data loss. The sawtooth spikes occur at approximately two minute intervals, which is in line with the 127.5 second STANAG 5066 maximum transmit time. AnalysisApart from the anomalous behavior with large PDUs at 75 bits per second, IP (and UDP) performance over STANAG 5066 is broadly in line with what would be expected from STANAG 5066 measurements. The overhead is somewhat higher than a theoretical calculation would suggest, but the difference is not large. At the IP level, performance is reasonable. The delay characteristics are interesting, particularly as load increases. Understanding these characteristics will be helpful for analyzing the performance of applications and protocol layers running over IP over HF. Application over UDP (ACP 142)ACP 142 is a protocol designed to operate over HF Radio, and can use either STANAG 5066 UDOP (Unreliable Datagram Oriented Protocol) or UDP. For more details see the whitepaper Military Messaging over HF Radio and Satellite using STANAG 4406 Annex E. ACP 142 Architecture
Isode's M-Switch product supports a number of options for ACP 142 based messaging shown above. The left hand diagram shows the options for operation over STANAG 5066. The HF Messaging Performance Paper gives performance measurements for this stack and also describes how tests were done. The right hand diagram shows operation over IP. Note that IP is not end to end, so the application should not be considered as running over IP over STANAG 5066. Rather, IP will run over HF for one of the subnets in the end to end communication. ACP 142 Measurements
The diagram above shows how ACP 142 works with STANAG 5066 when operating over UDP/IP. There is no direct connection from the application to STANAG 5066 – it connects via an IP Router. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
These measurements give a comparison of performance over IP and over STANAG 5066. The messages have 0, 1, 10 and 100 kByte payloads. The sizes given here are the ACP 142 file sizes; analysis relative to these file sizes gives a clear measurement of ACP 142 performance over the underlying networks. The STANAG 5066 mappings used an MTU size of 2048. The IP mappings used an MTU size of 1500 (typical LAN IP MTU size choice) and a non-ARQ mapping of IP onto STANAG 5066 (so that both sets of measurements used non-ARQ). A key additional setting is the rate control, which controls how fast the sending system sends out IP packets. The settings of this were chosen empirically. If the setting was too high, we observed that there was packet loss due to IP packets being dropped by the STANAG 5066 IP Client. This is highly undesirable, as it will require application level requests for retransmission. We tuned the values to a level where we did not see IP packet loss. This was primarily for the longer tests. We were surprised as to how low we had to go in order to avoid IP packet loss. For small messages the overheads using IP are small (around 10% additional overhead) and this ties with the UDP measurements. For larger messages, the performance degradation was more significant. It would almost certainly be possible to get some performance increase by very careful tuning of the rate parameter. However, if increased too far, performance will drop significantly due to retransmissions. AnalysisOperating ACP 142 over IP over HF can give reasonable performance (10% overhead relative to direct use of STANAG 5066) in some situations, particularly where messages are small and transmissions are infrequent. Performance will not be so good in other situations, including:
Changing the rate will help to tune performance, but setting the value for complex situations will be hard, and will almost certainly impact performance in some situations. Where possible, it is clear the direct operation over STANAG 5066 is preferable to use of IP. Other applications over UDPThere are relatively few applications built to operate over UDP. Often applications using UDP have low data volumes with infrequent traffic. This sort of application would be expected to work reasonably over HF, provided that the application is tolerant of long delays. An example application that works in this way is the Bowman situational awareness application. This uses an ARQ mapping of IP, and the application treats IP as a reliable datagram service for information update. It would be desirable to have applications that transfer bulk data be able to adapt to transfer rate. Designing such a protocol would be very hard (perhaps impossible) because of the very long delays and turnaround times. It is much easier for applications that can make use of STANAG 5066 flow control. TCP TestsWhy TCP is ImportantTCP (Transmission Control Protocol) is used for most Internet applications. It provides reliable data stream, and is a sensible choice for any application needing reliable data transfer. Because of this, understanding the TCP interaction with HF is key to understanding application performance over HF. How TCP WorksIn order to understand the results and analysis here, it is important to have some understanding as to how TCP works. The Wikipedia article is a reasonable starting point. Key concepts (simplified):
Testing Framework
The TCP Test framework is similar to the one used for UDP testing. The test generator opens up a TCP stream, and writes data as fast as it can, waiting for operating system flow control. It includes time stamps within the data sent. The test analyser reads data from the TCP stream. It records how long it took to establish a connection, and then arrival time of subsequent data blocks. This gives a simple measurement of throughput and latency. TCP MeasurementsIn order to make the system work, the TCP was configured on the test systems to have a larger than normal number of connection attempts. This was necessary to establish the first connection to a peer. Once done, the RTT was cached, and so suitable time values were used on subsequent attempts. Measurements were made at 9600, 1200, and 300 bits per second. We were unable to get TCP to work at 75 bits per second, which was where we made STANAG 5066 measurements. At each speed we used MTU of 500 and 1500. The following graph is typical.
It can be seen that the average throughput rises steadily and reached a stable value. This pattern was seen for all bar one of the tests. The latency does not reach a steady state, and there is an oscillating pattern. This reflects bursty arrival of data over the test, so although throughput is optimized, the system never really reaches a stable state. For each of the tests run, the final throughput level is measured, and the time taken to reach 50% and 80% of the maximum throughput. This is shown below.
It can be seen that for many tests it took a long time to reach stable state. This is because TCP uses “slow start”, where the window is initially small and the increased. Because of the long RTT, it takes a long time to open up the window to the optimum amount. This has most impact at higher speeds, because here the optimum window size is larger. At 300 bits/sec, the starting window size is reasonable. When MTU size is increased, the optimum window size is smaller, and so the system reaches this value more quickly. Larger MTU also appears to noticeably increase efficiency at higher data rates. At 300 bits/second the system was barely able to work with the larger MTU, but was fine with 500. The time for first data to arrive is recorded. The differences between 500 and 1500 bytes are significant. We do not understand why this is the case. Some more graphs are shown now, to give a picture of some of the behaviors observed.
This graph shows that throughput quickly reaches a stable value. Latency varies dramatically, with a rather different pattern to the previous graph, and appears to stabilize after an hour.
This graphs appears to reach a more stable latency quickly, but after a long period of time shifts to a higher delay, so clearly there was instability.
This shows the one that did not really work. Throughput started poorly and degraded (to 2%), with latency going to a very high value. It was not able to achieve a sensible use of the link, although it did work after a fashion. AnalysisIn some of the tests TCP was able to achieve reasonable throughput. However, there are a number of concerns:
We believe that the impact of the long underlying IP delays, and the variability of underlying latency dependent on load, has a complex interaction with the TCP protocol, causing some of the effects seen. Impact of Errors and Varying ConditionsThe tests were undertaken in near perfect conditions. In a real deployment, there would be variability, due to errors; varying speed of underlying links; competition with other applications; periods of no traffic. From our observations of the rather unstable nature of the interaction between TCP and HF, it seems likely that operational conditions would lead to significantly poor performance than observed here. SMTP TestsThe next section looks at application level measurements. We use the Internet Standard SMTP (Simple Mail Transfer Protocol) which runs over TCP, and can be directly compared with protocols optimized for HF, described in the HF Messaging Performance Paper. Testing Framework and Comparison
The messaging test framework described in the HF Messaging Performance Paper was used to measure SMTP running over TCP/IP with IP running over HF. CO-ACP 142 is used as the reference protocol here, as in the paper. The diagram above shows the testing setup for SMTP. SMTP Measurements |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
This table shows performance for messages of differing sizes (0, 1, 10 and 100 kByte payloads) with size of submitted message measured. Because CO-ACP 142 does compression, utilization can be over 100%. MTU size of 1500 bytes was used for 1200 and 9600 bits/sec and 500 bytes for 300 bits/sec (the best performing value was chosen for each speed). Observations:
Other Applications over TCPSMTP is just one application, and other applications will have different characteristics operating over TCP. It is important to consider how applicable the results obtained here will be to other applications running over TCP. We believe that SMTP is quite a typical Internet application, so these will be a reasonable general guideline. Things noted about SMTP:
There is no reason to believe that (relative to other Internet applications) that SMTP will have particularly poor performance over HF, or that other applications will perform significantly better or worse. Applications not using TCP or UDPThis paper has focused on TCP and UDP. The third class of applications uses RTP. Most RTP applications such as streaming video are not appropriate for HF Radio because bandwidth demands are too high. Voice can be used. VOIP (which uses RTP) might be usable if a non-ARQ IP mapping is chosen, in order to eliminate the delay spikes which arise from an ARQ mapping. Rate control would be a problem, and it would be affected by the same issues discussed in the context of ACP 142. As HF Radio supports optimized digital voice directly, it seems to make more sense to gateway it, and use the optimized protocols over HF. AcknowledgementsIsode would like to thank:
ConclusionsThis paper has measured the performance of IP over HF Radio, looking at UDP, ACP 142, TCP and SMTP. Use of UDP and UDP based protocols can work reasonably well in certain situations. ACP 142 could work reasonably in a lightly loaded network transferring small messages. However, this is not a general purpose protocol, and would not really fit with the HF as an IP Subnet approach. Given this, if you are using special protocols, it will generally be advantageous to use protocols that operate directly over STANAG 5066. Measurements of SMTP and TCP are more relevant to considering the choice of general HF as an IP Subnet vs use of special protocols for HF. Link utilization with TCP can be reasonable in some situations, primarily for longer transfers operating at mid-range HF speeds. However, a number of concerns arise:
Isode’s conclusion from this is that there are too many issues and problems with operation over IP for HF, to make it a generally viable operational approach. As a consequence, Isode strongly recommends that IP is not used over HF Radio for mission critical applications, except in certain very specialized circumstances.
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| Copyright © 2009 Isode | privacy feedback
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||