Transporting VoIP Traffic with UDP and RTP

In the H.323 article you learned that VoIP communication uses a combination of both TCP and UDP at the transport layer. TCP is the transport protocol used for primary call control functions and signaling, including call establishment, flow control, codec negotiation, and so forth. Call control functions depend on reliable delivery to ensure that calls are completed and maintained correctly. In contrast, actual voice data is very time-sensitive, and as such benefits from the lower latency associated with UDP. You should recall that UDP headers include neither sequence numbers nor reliability mechanisms such as acknowledgements, so a lost packet is never retransmitted. For voice this is acceptable, since a retransmitted sample would arrive too late to be played anyway. UDP is built for speed, and that’s really the name of the game when it comes to transporting voice traffic over a packet-switched network.

While UDP helps to reduce the delay associated with transporting VoIP traffic across a packet-switched network, the fact that UDP does not include sequence numbers or any type of timing information presents an issue: the receiver cannot tell on its own whether packets arrived out of order, or how far apart they should be played back. As such, UDP relies upon an upper-layer protocol to provide these features, namely the Real-time Transport Protocol (RTP). RTP runs over UDP and adds a sequence number to each packet, so that the receiver can put packets back in the correct order and detect loss, and a timestamp, so that issues such as variable network delay can be accounted for and compensated. Some of the techniques used to compensate for delay and other issues on VoIP networks will be explored in more detail in the next section.
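
To make the sequencing and timing fields more concrete, here is a minimal Python sketch of the 12-byte fixed RTP header defined in RFC 3550. The function names (build_rtp, parse_rtp, playout_order) and the SSRC value are made up for illustration, and the sketch ignores header extensions, padding, and sequence number wraparound. It assumes a G.729 stream (static payload type 18) with an 8000 Hz timestamp clock, so each 20 ms packet advances the timestamp by 160.

```python
import struct
from typing import NamedTuple

# 12-byte fixed RTP header (RFC 3550), network byte order:
# flags byte, marker/payload-type byte, sequence, timestamp, SSRC
RTP_HEADER = struct.Struct("!BBHII")


class RtpPacket(NamedTuple):
    payload_type: int
    sequence: int    # 16-bit counter, incremented by one per packet sent
    timestamp: int   # 32-bit sampling clock value for the first sample
    ssrc: int        # identifies the sending stream
    payload: bytes


def build_rtp(payload_type, sequence, timestamp, ssrc, payload):
    """Prepend a minimal RTP header (hypothetical helper for this sketch)."""
    byte0 = 2 << 6               # version 2, no padding, no extension, no CSRCs
    byte1 = payload_type & 0x7F  # marker bit left clear
    header = RTP_HEADER.pack(byte0, byte1, sequence & 0xFFFF,
                             timestamp & 0xFFFFFFFF, ssrc)
    return header + payload


def parse_rtp(datagram):
    """Split a UDP payload into RTP header fields and voice payload."""
    byte0, byte1, sequence, timestamp, ssrc = RTP_HEADER.unpack_from(datagram)
    if byte0 >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    csrc_count = byte0 & 0x0F
    offset = RTP_HEADER.size + 4 * csrc_count  # skip any CSRC identifiers
    return RtpPacket(byte1 & 0x7F, sequence, timestamp, ssrc, datagram[offset:])


def playout_order(datagrams):
    """Sort received packets by sequence number (wraparound ignored for brevity)."""
    return sorted((parse_rtp(d) for d in datagrams), key=lambda p: p.sequence)


# Three 20-byte G.729 packets, 20 ms apart, that arrive out of order.
sent = [build_rtp(18, 100 + i, 160 * i, 0x1234ABCD, bytes([i]) * 20)
        for i in range(3)]
received = [sent[1], sent[0], sent[2]]

for pkt in playout_order(received):
    print(pkt.sequence, pkt.timestamp, len(pkt.payload))
```

UDP delivers the datagrams in whatever order the network happens to produce. The receiver uses the sequence numbers to restore the order and the timestamps to space the samples correctly during playback.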

One key consideration when looking at the transport of time-sensitive traffic over a packet-switched network is the size of the actual packets transferred. The standard payload of a voice packet on a Cisco network is a 20 ms sample of voice. With the G.729 codec this works out to 20 bytes, although the size varies with the codec used (G.711 uses a 160-byte payload, while G.726 uses 40 to 80 bytes at its 16 to 32 kbps rates). In contrast, the combined headers of RTP (12 bytes), UDP (8 bytes), and IP (20 bytes) add up to a total of 40 bytes, meaning that “overhead” accounts for roughly two-thirds of the size of a packet with a 20-byte voice payload, which is not very efficient at all.
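
The arithmetic behind these figures can be sketched in a few lines of Python. The helper names are made up for illustration, and the calculation assumes a 20 ms sample and counts only the IP, UDP, and RTP headers (no Layer 2 framing).

```python
HEADER_BYTES = 20 + 8 + 12  # IP + UDP + RTP headers


def payload_bytes(codec_bps, sample_ms=20):
    """Bytes of voice produced by a codec during one sample interval."""
    return codec_bps * sample_ms // 8000  # bits per second -> bytes per interval


def overhead_fraction(payload, header=HEADER_BYTES):
    """Share of the packet taken up by headers."""
    return header / (header + payload)


for name, bps in [("G.729", 8000), ("G.726 (24 kbps)", 24000), ("G.711", 64000)]:
    size = payload_bytes(bps)
    print(f"{name}: {size}-byte payload, {overhead_fraction(size):.1%} header overhead")
```

The smaller the payload a codec produces, the larger the share of each packet the fixed 40-byte header consumes. This is why low-bit-rate codecs such as G.729 benefit the most from header compression.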

While high-speed networks like switched Ethernet can easily absorb the overhead associated with RTP packets, WAN links are typically much slower, and bandwidth is at a premium. Earlier in this chapter you were briefly introduced to a compression method supported on Cisco equipment, namely RTP header compression, or cRTP. In cases where VoIP traffic needs to traverse slower serial links, enabling cRTP is a great idea, since it compresses the combined RTP/UDP/IP headers from 40 bytes to anywhere between 2 and 5 bytes. This is a significant savings: assuming a 20-byte payload, header overhead drops from roughly two-thirds of each packet to between 9% and 20%.
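
As a rough illustration of what this means per call, the following Python sketch compares the bandwidth of a single voice stream with and without cRTP. It assumes a 20-byte payload sent every 20 ms (50 packets per second) and leaves out Layer 2 framing, so real links will carry somewhat more. The function name is made up for this example.

```python
PACKETS_PER_SECOND = 50  # one packet per 20 ms voice sample
PAYLOAD_BYTES = 20       # for example, G.729


def call_bandwidth_bps(header_bytes, payload=PAYLOAD_BYTES):
    """One-way bandwidth of a single voice stream, excluding Layer 2 framing."""
    return (header_bytes + payload) * 8 * PACKETS_PER_SECOND


for label, header in [("uncompressed", 40), ("cRTP, 5-byte header", 5),
                      ("cRTP, 2-byte header", 2)]:
    overhead = header / (header + PAYLOAD_BYTES)
    print(f"{label}: {call_bandwidth_bps(header)} bps, {overhead:.1%} overhead")
```

Under these assumptions a call that needs 24,000 bps uncompressed fits in 8,800 to 10,000 bps with cRTP, which means a slow serial link can carry more than twice as many simultaneous calls.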

RTP header compression is enabled on a link-by-link basis on Cisco routers, and it is recommended only on links with speeds up to 2 Mbps. In fact, Cisco supports cRTP only on ISDN interfaces and on serial interfaces using Frame Relay (Cisco encapsulation), HDLC, or PPP encapsulation. RTP header compression is not used on higher-speed interfaces (like Ethernet) because the bandwidth saved there does not justify the higher CPU utilization that compression imposes on the router.

Author: Dan DiNicolo

Dan DiNicolo is a freelance author, consultant, trainer, and managing editor. He is the author of the CCNA Study Guide found on this site, as well as many books including the PC Magazine titles Windows XP Security Solutions and Windows Vista Security Solutions.