The Challenge of H.323 Video Conferencing with Network Address Translation

Originally published in Teleconference Magazine

Video conferencing has been around since the mid-1960s, but only recently has the arrival of desktop video conferencing made it a practical tool for everyday personal communication. The prominence of the Internet as a mainstream phenomenon has enabled one technology, H.323 conferencing, to emerge as the leading candidate for universal adoption. Unfortunately, several challenges threaten the usefulness of H.323 conferencing in certain network situations, and they are oddly intertwined. The H.323 standard works on Internet Protocol (IP) networks, which use IP addressing to route information from point A to point B. However, the current IP standard predates the escalating boom of “connected” devices and, in the not-too-distant future, the world will run out of unallocated IP addresses. To alleviate this problem, as well as to save money and increase security, many organizations are turning to network address translation (NAT) as a method of IP address conservation.

From the standpoint of video conferencing, the critical issue is that network address translation prevents H.323 conferencing products from working properly. Without specific intervention, a NAT device effectively blocks all incoming calls. The more popular H.323 conferencing becomes, the more of a problem this is likely to be.

This document examines these issues and discusses a recently announced solution to the problem: Sorenson Glasses(tm).

H.323 Video Conferencing

The original video conferencing standard, H.320, was developed in the early nineties as a common language to allow different manufacturers’ products to communicate with one another. H.320 was created as an “umbrella standard” to specify methods for transmitting various types of data. The relatively old-school H.320 conferencing scheme was generally used in expensive, dedicated installations such as board rooms, with multiple high-speed data lines and prohibitively expensive equipment.

In 1996, the International Telecommunication Union (ITU) established the H.323 standard, which specifies methods for sending and receiving video, audio, and data communications over IP networks. The standard also allows for services such as converting data between networks, handling latency, and dealing with lost packets. H.323 conferencing works over any network that uses the Internet Protocol (IP) for connectivity, including local area networks (LANs), wide area networks (WANs) and the Internet. Consequently, the H.323 revolution has taken video conferencing out of the board room and put it on the desktop.

In a typical H.323 call, a person at one endpoint (a computer or other device enabled with an H.323 application) calls a person at another endpoint. For a successful call, each of the endpoints needs to have a unique, “public” IP address and must have authority to open several network data ports to send and receive multimedia data. Once an H.323 call is negotiated, it can open channels to handle various kinds of data. Different applications provide different functions. Most video conferencing applications offer audio, video, and T.120 data, which is used to control collaboration processes such as application sharing and whiteboarding. The H.323 standard also provides a way to implement proprietary data channels for specialized applications such as network gaming, interactive technical support, and distance learning.

The importance of conferencing standards cannot be overstated. Since video conferencing is not a government-sanctioned service, and since no monolithic company has risen to dominate the market, customers rely on various vendors to provide them with the tools they need to set up their systems. In the absence of a standard, companies would be at the mercy of scattered, proprietary technologies, and users purchasing systems would be locked into a specific brand of equipment. Since, at least in theory, any communications product created according to the H.323 standard will interoperate correctly with any other standards-compliant product, users are free to enter the world of desktop video conferencing without pinning their hopes on the success of a single company.

Internet Protocol (IP) Addressing

The internet protocol provides two basic functions: data delivery (or routing) and fragmentation. In addition to these, other services include timeouts, prioritization, source routing, and route tracing.

Data delivery is probably the most important feature of IP. For the IP routing system to work, each machine on a network must be assigned a unique address that distinguishes it from every other machine on the network. With the current protocol, an address consists of four 8-bit values (octets), for a total of 32 bits in each address. When represented in decimal form, an IP address appears as four numbers separated by dots. Since the range of an 8-bit value is from 0 to 255, the lowest possible address is 0.0.0.0 and the highest is 255.255.255.255.
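The relationship between the dotted form and the underlying 32-bit value can be sketched in a few lines of Python. This is only an illustration; the function names are invented for the example and are not part of any standard.

```python
def dotted_to_int(address: str) -> int:
    """Pack a dotted-decimal IPv4 address into a single 32-bit integer."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a valid IPv4 address: {address!r}")
    value = 0
    for octet in octets:
        # Shift the accumulated bits left by one octet and append the next one.
        value = (value << 8) | octet
    return value


def int_to_dotted(value: int) -> str:
    """Unpack a 32-bit integer back into dotted-decimal form."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


lowest = dotted_to_int("0.0.0.0")
highest = dotted_to_int("255.255.255.255")
print(lowest, highest)
```

The lowest address maps to 0 and the highest to 2^32 - 1, which is why the address space holds exactly 2^32 values.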

When data is sent over a network, it is broken up into chunks of information called packets. Each IP packet contains a header which specifies the address of the machine sending the data and the address of the machine intended to receive the data.
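As a rough illustration, the sketch below builds a bare 20-byte IPv4 header and reads the two addresses back out of it. It assumes a header without options and leaves the checksum at zero, so it is not a packet a real network stack would accept; the helper names and the example addresses are made up for the illustration.

```python
import socket
import struct


def build_header(src: str, dst: str, payload_len: int = 0) -> bytes:
    """Build a simplified 20-byte IPv4 header (no options, checksum left at zero)."""
    version_ihl = (4 << 4) | 5          # IP version 4, header length of 5 32-bit words
    total_length = 20 + payload_len
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, total_length,   # version/IHL, type of service, total length
        0, 0,                           # identification, flags and fragment offset
        64, 17, 0,                      # time to live, protocol (17 = UDP), checksum
        socket.inet_aton(src),          # source address, bytes 12-15
        socket.inet_aton(dst),          # destination address, bytes 16-19
    )


def read_addresses(header: bytes):
    """Return the (source, destination) addresses stored in an IPv4 header."""
    src, dst = struct.unpack("!4s4s", header[12:20])
    return socket.inet_ntoa(src), socket.inet_ntoa(dst)


header = build_header("192.168.1.10", "203.0.113.5")
print(read_addresses(header))
```

Routers only need these two fields to decide where a packet goes next, which is why a device that rewrites them can redirect traffic without touching the payload.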

The biggest IP network in existence, the Internet, is a vast web of connected machines. Rarely does information on the Internet go directly from one computer to another. Instead, it usually makes a series of hops from one node to another, eventually getting to its intended destination. Routers at these nodes examine the addresses on the packets and choose a path for the packet to travel. Essentially, IP communication is an elaborate relay race, with routers passing packets like batons and IP addresses determining who eventually ends up with the data.

Running Low on Addresses

Back in the mid-70s, when Vinton Cerf, Jon Postel and Danny Cohen dreamed up the internet protocol, 32-bit addresses seemed like a pretty good idea. Four 8-bit values provide a range of almost 4.3 billion unique addresses (4,294,967,296 to be exact), and this seemed tantamount to overkill to the architects of Arpanet, the precursor to today’s Internet. As Vint Cerf muses:

The somewhat embarrassing thing is that the network address space is under pressure now. The original design of 1973 and 1974 contemplated a total of 256 networks. There was only one LAN at PARC, and all the other networks were regional or nationwide networks. We didn’t think there would be more than 256 research networks involved.
“How the Internet Came to Be,” by Vinton Cerf, as told to Bernard Aboba

Today, with about half of the available addresses already allocated, experts anticipate that the pool of free addresses will run dry before the end of the decade. This curious predicament has been referred to as “the great IP crunch of 2010,” or even more ominously, “the next Y2K.”

The Internet Engineering Task Force has a proposed solution for the problem: a new protocol called IP version 6. Determined not to make the same mistake again, the engineers of IPv6 took overkill to a new level. IPv6 uses 128-bit addressing, which provides roughly 3.4 x 10^38 addresses (that’s 340,000,000,000,000,000,000,000,000,000,000,000,000), or more than 300 billion addresses for every cubic centimeter of the earth’s volume (about 1.087 x 10^27 cm^3). Chances are, once IPv6 is finally implemented, we won’t be running out of addresses any time soon.
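The size of the two address spaces is easy to check directly. The short calculation below uses the earth-volume figure quoted above as its only input; the variable names are chosen for the example.

```python
IPV4_BITS = 32
IPV6_BITS = 128
EARTH_VOLUME_CM3 = 1.087e27   # volume figure quoted in the text, in cubic centimeters

ipv4_addresses = 2 ** IPV4_BITS
ipv6_addresses = 2 ** IPV6_BITS
ipv6_per_cm3 = ipv6_addresses / EARTH_VOLUME_CM3

print(f"IPv4 addresses: {ipv4_addresses:,}")
print(f"IPv6 addresses: {ipv6_addresses:.3e}")
print(f"IPv6 addresses per cubic centimeter of the earth: {ipv6_per_cm3:.3e}")
```

With these inputs the per-centimeter figure comes out at a little over 3 x 10^11, that is, a few hundred billion addresses for every cubic centimeter.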

The problem is, IPv6 is still quite new. The Internet Assigned Numbers Authority green-lighted its regional agents to begin assigning IPv6 addresses in July of 2000, but the resulting system of machines is essentially cut off from the rest of the networked world. Full implementation of IPv6 entails a complete revamp of all existing networking hardware and software, which is no small task. Steve Deering, one of the principal designers of IPv6, had the following to say about it: “It’s quite possible it won’t happen. It’s conceivable that we will just continue to do short-term hacks and band-aid whatever is required to keep living with IPv4.”

Whether IPv6 ever gets fully implemented, or whether it is eventually abandoned, the Internet will continue to route information. In the meantime, it is necessary to live with the current IP system, and the current hardware and software infrastructure.

Public and Private Networks

It’s important to underscore the point that H.323 conferencing requires each endpoint to have a routable IP address. An IP address is routable if it is visible to other machines on the network and if it is unique to that specific machine within the bounds of the network. The IP system itself doesn’t care whether a network is public or private. So the IP address problem only applies when computers are dealing with each other within the greater world of the Internet, and not in the much smaller world of an institutional intranet.

If an organization has ten computers connected to each other over an IP network, these ten computers must each be identified by a unique address. If these machines aren’t connected to the rest of the world (the Internet), it would not matter which specific ten of the 4.3 billion possible IP addresses were chosen for these computers. If, one day, the system administrator decided to give all ten machines access to the Internet (in addition to each other), it would be necessary to assign each computer an IP address that is unique to that machine and that machine alone within the entire Internet.

The following analogy makes the situation even simpler. Consider a hypothetical office with ten employees. Inexplicably, these employees don’t have a need to telephone anyone but each other. Their telephone system is a closed network, and they have a great degree of flexibility when assigning phone numbers. They still have to follow the convention of (###) ###-####, but within these boundaries they can get creative. One employee might prefer the easy-to-remember (012) 345-6789; another might want to be (666) 666-6666. It wouldn’t matter if someone else in Honolulu or Duluth had an identical telephone number. It’s inconsequential to discuss which prefixes are valid in their particular town, or which area code they’re in. Since their system can’t see or be seen from the outside world, these are all non-issues.

Consider, now, that the employees one day decide that they want to be able to call people in their own town, as well as Honolulu, Duluth, and the rest of the world. To do this, they have to work with their telephone company to modify their phone system so that each of them has a public, unique phone number. They are bound by area code and local prefix restrictions. What’s more, they have to pay for the right to use the ten telephone numbers they end up with.

Actually, the telephone system managers in this hypothetical scenario have another option. Suppose they want to save money by purchasing only a couple of “outside lines” for the office. Or maybe they want incoming calls to all be handled by an operator, so those pesky folks in Duluth can’t call their workers directly. Whatever the reason, they could set up the system so that the office has only a few (say, three) public telephone numbers that are shared by all of the employees. Incoming calls would be routed using an employee’s private extension number. Outgoing calls would go through one of the three outside lines. Employees would be protected from outside annoyances, and the office would save money on telephone lines. It’s a great deal all around.

Computer network administrators have a similar option, which produces similar results. It’s called “network address translation.”

Network Address Translation (NAT)

One of the “short-term hacks” referred to by Steve Deering is a tricky scheme called network address translation. IP addresses aren’t just in short supply. Like telephone lines, they also cost money. Also, some network administrators consider public IP addresses a security risk. If a computer is visible to everyone on the Internet, it may be vulnerable to attack.

A network address translation device allows an organization to use private IP addresses (analogous to private telephone extensions) for communication within an internal network, and to share a small pool of public IP addresses (analogous to an outside line) when communicating on an external network such as the Internet. The NAT performs the conversion transparently and on the fly, so all internal users get access to external services. The users behind the NAT can see the outside world, but at the same time, the users are protected from prying eyes because all communication with the Internet seems to come directly from the machine doing the translation.
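The translation step can be pictured with a toy model. The sketch below assumes the simplest case, a single shared public address with per-connection port numbers, rather than the pool of addresses described above; the class, its method names, and the example addresses are all hypothetical.

```python
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]   # (IP address, port)


class SimpleNat:
    """Toy port-translating NAT: many private hosts share one public address."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self.next_port = first_port
        self.out_map: Dict[Endpoint, int] = {}   # private endpoint -> public port
        self.in_map: Dict[int, Endpoint] = {}    # public port -> private endpoint

    def outbound(self, src_ip: str, src_port: int) -> Endpoint:
        """Rewrite the source of an outgoing packet to the shared public address."""
        key = (src_ip, src_port)
        if key not in self.out_map:
            # First packet of a new conversation: allocate a public port for it.
            self.out_map[key] = self.next_port
            self.in_map[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out_map[key]

    def reply(self, public_port: int) -> Optional[Endpoint]:
        """Map a reply arriving on a public port back to the private host, if known."""
        return self.in_map.get(public_port)


nat = SimpleNat("203.0.113.1")
print(nat.outbound("192.168.1.10", 5000))   # the outside world sees the NAT's address
print(nat.reply(40000))                     # replies are mapped back to the private host
```

Note that a mapping only comes into existence when an inside machine sends something out, which is the root of the inbound-call problem discussed next.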

Once again, it’s a great deal all around. That is, unless someone behind a NAT wants to be able to receive H.323 calls.

Consider what happens when Dick tries to place an H.323 call to Jane. Dick’s computer is behind a typical NAT, so his computer’s IP address is not routable. Jane, on the other hand, has a routable IP address. When Dick calls Jane, an interesting sequence of events occurs. His private IP address is encoded twice in the outgoing call: once as part of the H.323 stream and once when this stream is broken into individual data packets and sent over the network. When the data gets to the NAT, the private address in the packet headers is converted into a public address. Dick’s message now has two IP addresses as it goes out over the Internet, one that is routable and one that isn’t. When Jane’s computer receives the call setup data, her computer responds to the call by sending a message to the IP address encoded in the H.323 stream. The acknowledgement then goes out to Dick’s private IP address, which is not routable. When the data hits the Internet, the routers can’t find Dick’s machine and the packets are discarded. The call is now dead.
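The failure can be traced in a small simulation. This is a deliberately simplified model: a real H.323 setup message is a binary structure, not a two-field record, and the prefix test below merely stands in for "not routable on the Internet". All names and addresses are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class SetupPacket:
    header_src: str     # source address in the IP packet header
    embedded_src: str   # caller address carried inside the H.323 stream


def standard_nat(packet: SetupPacket, public_ip: str) -> SetupPacket:
    """A standard NAT rewrites the packet header but leaves the payload alone."""
    return SetupPacket(header_src=public_ip, embedded_src=packet.embedded_src)


def reply_target(packet: SetupPacket) -> str:
    """The callee answers the address embedded in the H.323 stream."""
    return packet.embedded_src


def is_routable(address: str, private_prefix: str = "192.168.") -> bool:
    """Assumption for this sketch: addresses under the prefix are private."""
    return not address.startswith(private_prefix)


dick = SetupPacket(header_src="192.168.1.10", embedded_src="192.168.1.10")
on_the_wire = standard_nat(dick, public_ip="203.0.113.1")

target = reply_target(on_the_wire)
print(on_the_wire.header_src, target, is_routable(target))
```

After translation the header carries the public address, but the reply is still aimed at the private one, so it can never be delivered.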

Some vendors have begun to create H.323-aware NAT products which allow outgoing calls to function correctly. They do this by translating the IP address in the H.323 stream as well as the IP address in the data packets themselves. If Dick’s system administrator were to upgrade their NAT with such a product, Dick’s call would go through because Jane’s acknowledgement would know where to go.
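In outline, an H.323-aware device performs the same header rewrite and then repeats it inside the payload. The sketch below represents a packet as a plain dictionary purely for illustration; actual products must decode and re-encode the binary H.323 messages, and the field names here are hypothetical.

```python
def h323_aware_nat(packet: dict, private_ip: str, public_ip: str) -> dict:
    """Rewrite the private address in both the packet header and the H.323 payload."""
    translated = dict(packet, header_src=public_ip)   # copy with the header rewritten
    payload = dict(packet["payload"])                 # copy so the input is not mutated
    if payload.get("caller_address") == private_ip:
        payload["caller_address"] = public_ip
    translated["payload"] = payload
    return translated


outgoing = {
    "header_src": "192.168.1.10",
    "payload": {"caller_address": "192.168.1.10"},
}
print(h323_aware_nat(outgoing, "192.168.1.10", "203.0.113.1"))
```

Because both copies of the address now agree, the callee's acknowledgement is sent to the public address and the NAT can map it back to the caller.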

Neither kind of NAT, though, whether the standard kind or the H.323-aware flavor, can allow a call from Jane to get to Dick. If Jane tries to call Dick using his computer’s private IP address, her outgoing call contains an unroutable address in two places, and is discarded by the first router it reaches. Sure, she could use the public IP address of the NAT device, but when her call gets to the NAT, the device has no way of knowing that Dick is the intended recipient. The call is still dead.

The only way for Jane to initiate an H.323 call with Dick is to contact him some other way (by phone, fax, email, or pigeon) and ask him to call her. This solution is less than satisfying.

Note that Dick can still call Henry, a coworker on the same local network, and vice versa. Though both Dick and Henry have IP addresses that are not externally routable, they are internally routable because their machines can see each other just fine. Internal calls aren’t affected by network address translation issues. But since calling between separate LANs is the most common type of H.323 communication, the NAT problems described above are critical to fully effective use of desktop video conferencing.

Sorenson Glasses

Sorenson Vision Inc., the company that developed the EnVision H.323 desktop video conferencing system, has introduced a powerful solution to the “NAT problem.” The product, Sorenson Glasses, is a multifunction H.323 server component for systems using network address translation. Glasses is an H.323 gatekeeper, an H.323 endpoint proxy, and an H.323-to-H.323 gateway, performing all of these functions when needed without user intervention.

Once installed, either behind or alongside a NAT, Glasses provides several services. As a gatekeeper, Glasses performs host resolution and bandwidth management. As an endpoint proxy, Glasses maintains network security by helping to maintain the barrier between a private network and the Internet. Most importantly, Glasses’ gateway function solves the problem that has been the subject of this document: it enables inbound calling to H.323 endpoints behind network address translation devices.

Sorenson Glasses’ gateway functionality allows a smooth yet secure transition between a private network and the Internet. Glasses uses port forwarding and embedded aliasing technologies to enable inbound calling to multiple H.323 endpoints behind a NAT. This ensures that incoming conferencing data survive the address translation process and get to the intended endpoint.

With Glasses installed on Dick’s network, Jane can now make H.323 calls to Dick without any problems. When Jane calls Dick, the call contains both the IP address of the NAT device and an alias to Dick’s routability-impaired computer. The call sails through the Internet without a hitch, since the NAT’s IP address is public and routable. When the call reaches Dick’s network, the NAT knows from the embedded alias that Dick is the intended recipient. The call gets negotiated and Jane and Dick are free to conference up a storm.
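The general idea of alias-based inbound routing can be sketched as a lookup table. This is not Sorenson's implementation, whose internals are not described here; the class, the alias format, and the addresses are assumptions made for the illustration.

```python
from typing import Dict, Optional


class AliasRouter:
    """Toy inbound router: one public address, many private endpoints found by alias."""

    def __init__(self, public_ip: str) -> None:
        self.public_ip = public_ip
        self.aliases: Dict[str, str] = {}   # alias -> private IP address

    def register(self, alias: str, private_ip: str) -> None:
        """Record which private endpoint an alias refers to."""
        self.aliases[alias] = private_ip

    def route_inbound(self, dest_ip: str, alias: Optional[str]) -> Optional[str]:
        """Return the private address for an inbound call, or None if undeliverable."""
        if dest_ip != self.public_ip:
            return None   # the call is not addressed to this network
        if alias is None:
            return None   # public address alone: the recipient is ambiguous
        return self.aliases.get(alias)


router = AliasRouter("203.0.113.1")
router.register("dick@example.com", "192.168.1.10")

print(router.route_inbound("203.0.113.1", None))                # no alias, no recipient
print(router.route_inbound("203.0.113.1", "dick@example.com"))  # alias resolves to Dick
```

The public address gets the call across the Internet, and the alias supplies the one piece of information a plain NAT lacks: which inside machine the call is for.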

Though gateway functionality is Glasses’ main raison d’être, the product’s gatekeeper and proxy services are both useful and usable. The first gatekeeper service, host resolution, is similar to ILS (Internet locator server) technology with the exception that it provides “dialing assistance” within the architecture of the H.323 standard. This feature allows users to eschew clunky IP addresses and place calls within a local network using email addresses or telephone numbers. The second service, bandwidth management, is more a system control than a user feature. It allows an IT manager to put a cap on the amount of network bandwidth available for H.323 conferencing. This is useful both on small networks, where bandwidth is a precious commodity, and on large networks, where there may be dozens of concurrent H.323 calls. By capping H.323 traffic and leaving room for other network traffic, this feature helps system administrators manage their networks more efficiently.
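A bandwidth cap of this kind amounts to a simple admission check: a new call is accepted only if it fits under the limit. The sketch below is a generic illustration, not a description of how Glasses works; the class name, the call identifiers, and the bit rates are invented.

```python
from typing import Dict


class BandwidthGate:
    """Toy admission control: admit calls only while total usage stays under a cap."""

    def __init__(self, cap_kbps: int) -> None:
        self.cap_kbps = cap_kbps
        self.calls: Dict[str, int] = {}   # call identifier -> bandwidth in kbps

    def in_use(self) -> int:
        """Total bandwidth currently granted to active calls."""
        return sum(self.calls.values())

    def admit(self, call_id: str, kbps: int) -> bool:
        """Grant the request if it fits under the cap; otherwise refuse it."""
        if self.in_use() + kbps > self.cap_kbps:
            return False
        self.calls[call_id] = kbps
        return True

    def release(self, call_id: str) -> None:
        """Free the bandwidth held by a finished call."""
        self.calls.pop(call_id, None)


gate = BandwidthGate(cap_kbps=1000)
print(gate.admit("call-1", 384), gate.admit("call-2", 384), gate.admit("call-3", 384))
```

With a 1000 kbps cap, two 384 kbps calls fit and the third is refused until one of the others ends.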

Glasses also functions as an endpoint proxy. Previously, firewall managers wanting to enable H.323 conferencing for internal machines would be required to open multiple ports to multiple hosts, possibly jeopardizing the security of the system. Implementing a NAT was not an option (or, at least, not a very good one) because of the problems inherent in address translation. With Glasses, a single machine, safe behind a firewall and/or NAT device, can serve as a proxy for all of the H.323 users on a network.

Essentially, Sorenson Glasses lets everyone have their NAT and conference too. Because of Glasses, users (and their system administrators) no longer have to choose between security and connectivity. Although Glasses was originally created for EnVision users, it is fully H.323-compliant and will function with most NAT systems to enable calling any H.323 endpoint. It is an industry-wide solution to an industry-wide problem.