Voice over IP (VoIP)

Sending telephone calls over the Internet, known as Voice over IP (VoIP), has been growing rapidly, and it has been predicted that most Americans will soon use VoIP in place of the traditional telephone system. The main incentive to switch from a traditional phone to VoIP is cost: VoIP is less expensive, particularly for people who make many long-distance calls.

VoIP requires a high-speed (broadband) Internet connection, but most homes now have one.

VoIP typically provides many special features, such as call forwarding, conference calls and caller ID, because they are easy to implement in software.

Issues:

VoIP generally uses wall power, so if the power goes off, you don't have phone service (traditional phone service does not rely on wall power).
Quality of service can vary. Real-time signals sent over the Internet are subject to jitter, packet delay, packet loss, etc.
Emergency 911 calls are hard. With traditional phone service, a call to 911 locates you. With VoIP, your location is not known from the call itself, and you may even be in a different state.
It is only as reliable as your broadband connection, and at the moment, this is far less stable than the phone system.
If your computer is busy running other tasks at the same time (multitasking), this can result in quality problems and even dropped connections.

VoIP requires the following components:

A device to convert a traditional phone signal to a VoIP signal and vice versa
A mechanism to connect IP to the traditional phone system and vice versa (a gateway)
A mechanism for translating IP addresses to phone numbers and vice versa
A protocol for transmitting voice over the internet with a minimum of jitter

There are a number of different and incompatible protocols for these.

Connecting a phone to the Internet

There are actually three ways to do this. The most common is an analog telephone adapter (ATA), a device that sits between an ordinary analog telephone and the computer or network connection and converts the analog signal to a digital one. The conversion is done by a CODEC (coder/decoder), which samples the analog signal 8000 times per second and converts each sample to an 8-bit value, giving a bit rate of 64 kbit/s.

An alternative is an IP phone, a special digital phone that can connect directly to Ethernet.
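The bit rate quoted above follows directly from the sampling parameters. Here is a small Python sketch of that arithmetic. The 20 ms packet interval is an assumption (a common choice, not something stated in these notes), and the function names are purely illustrative.

```python
def pcm_bit_rate(sample_rate_hz: int = 8000, bits_per_sample: int = 8) -> int:
    """Raw bit rate of an uncompressed voice stream, in bits per second."""
    return sample_rate_hz * bits_per_sample


def payload_bytes_per_packet(bit_rate: int, packet_ms: int = 20) -> int:
    """Bytes of voice data in one packet covering packet_ms milliseconds.

    The 20 ms default is an assumed packet interval, not a fixed rule.
    """
    return bit_rate * packet_ms // (8 * 1000)


rate = pcm_bit_rate()
print(rate)                            # 64000
print(payload_bytes_per_packet(rate))  # 160
```

Under that assumption, each packet carries 160 bytes of voice, so the protocol headers added in front of it are a noticeable fraction of every packet.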

The third option is software within your computer, which uses a microphone and the computer speakers.

VoIP gateways

Your computer sends signals to your provider. Your provider runs a gateway which, among other things, contains a soft switch. A soft switch is software, backed by a database, that maps IP addresses to phone numbers and vice versa, so it has to know where each IP address is located. If it does not know an address itself, it hands the request off to another switch.

The overall process of setting up a connection in the traditional phone system is called call signalling.
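The lookup-and-hand-off behaviour described above can be sketched in a few lines of Python. This is only a toy model: the class name, the methods, the phone numbers and the IP addresses are all made up for illustration, and a real soft switch does far more than a dictionary lookup.

```python
from typing import Dict, Optional


class SoftSwitch:
    """Toy soft switch: maps phone numbers to IP addresses, with hand-off."""

    def __init__(self, name: str, next_switch: Optional["SoftSwitch"] = None):
        self.name = name
        self.next_switch = next_switch  # switch to ask when we don't know
        self.number_to_ip: Dict[str, str] = {}

    def register(self, number: str, ip: str) -> None:
        """Record which IP address currently serves a phone number."""
        self.number_to_ip[number] = ip

    def resolve(self, number: str) -> Optional[str]:
        """Return the IP for a number, handing off if it is unknown here."""
        if number in self.number_to_ip:
            return self.number_to_ip[number]
        if self.next_switch is not None:
            return self.next_switch.resolve(number)
        return None

    def number_for(self, ip: str) -> Optional[str]:
        """Reverse lookup: the phone number registered for an IP, if any."""
        for number, known_ip in self.number_to_ip.items():
            if known_ip == ip:
                return number
        return None


# Hypothetical numbers and documentation-range IP addresses.
upstream = SoftSwitch("upstream")
upstream.register("555-0199", "198.51.100.7")
local = SoftSwitch("local", next_switch=upstream)
local.register("555-0100", "192.0.2.10")

print(local.resolve("555-0100"))  # 192.0.2.10 (known locally)
print(local.resolve("555-0199"))  # 198.51.100.7 (handed off to upstream)
print(local.resolve("555-0000"))  # None (no switch knows it)
```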

There are several different protocols for performing the signalling function, i.e. the interface between the Internet and the phone system.

H.323 is the oldest. It was originally developed for video conferencing and was later adapted for VoIP.
SIP (Session Initiation Protocol) is a newer protocol developed specifically for VoIP signalling.
It is text based. Callers and callees are identified by SIP addresses. When making a SIP call, a caller first locates the appropriate server and then sends a SIP request. The most common SIP operation is the invitation. Instead of directly reaching the intended callee, a SIP request may be redirected or may trigger a chain of new SIP requests by proxies. Users can register their location(s) with SIP servers.
Unlike the other signalling protocols, SIP has its roots in the IP community. It is peer-to-peer and highly decentralized.
Skinny (SCCP) is a Cisco proprietary protocol.
MGCP (Media Gateway Control Protocol)
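Because SIP is text based, an invitation is just a block of header lines, much like an HTTP request. The sketch below builds a minimal INVITE in Python to show the general shape. The addresses, host name, branch and tag values are made-up placeholders, and a real INVITE would normally also carry a body describing the media session, which is omitted here.

```python
def build_invite(caller: str, callee: str, call_id: str, via_host: str) -> str:
    """Build a bare-bones SIP INVITE request as text (illustrative only)."""
    lines = [
        f"INVITE sip:{callee} SIP/2.0",                       # request line
        f"Via: SIP/2.0/UDP {via_host};branch=z9hG4bK-example",  # placeholder branch
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag=example-tag",              # placeholder tag
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{caller}>",
        "Content-Length: 0",                                  # no body in this sketch
    ]
    # Header lines end with CRLF; a blank line ends the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


message = build_invite(
    caller="alice@example.org",
    callee="bob@example.com",
    call_id="abc123@pc.example.org",
    via_host="pc.example.org",
)
print(message)
```

A proxy that receives such a request can forward it, redirect it, or generate new requests of its own, which is the chaining behaviour mentioned in the list above.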

The Real Time Transport Protocol (RTP)

Real-time media such as voice or video must be delivered with much more consistent timing than ordinary Internet traffic. Even small variations in when the data is played out are annoying to the listener or viewer. The term for this variation in packet delay is jitter. To hide it, the program which receives the packets buffers them so that it can play the frames or voice packets at exactly the correct speed. Naturally this buffering introduces some delay, which is acceptable for one-way video but must be kept very small for a two-way VoIP conversation.

Note that if packets are delayed enough so that they arrive after their scheduled time, they are simply discarded; it is pointless to retransmit missing packets.
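The buffering and discard rule described above can be sketched as a small playout buffer. This is a simplified model, not how any particular VoIP client works: it assumes the sender's media clock and the receiver's clock share the same zero point and both count milliseconds, and the class and parameter names are invented for the example.

```python
import heapq


class JitterBuffer:
    """Toy playout buffer: holds packets for a fixed delay, drops late ones."""

    def __init__(self, playout_delay_ms: int):
        self.playout_delay_ms = playout_delay_ms
        self._heap = []   # entries are (playout_time_ms, seq, payload)
        self.dropped = 0  # packets that arrived after their playout time

    def push(self, seq: int, media_time_ms: int, arrival_ms: int, payload: bytes) -> bool:
        """Queue a packet; return False if it arrived too late to be played."""
        playout_time = media_time_ms + self.playout_delay_ms
        if arrival_ms > playout_time:
            self.dropped += 1  # too late: discard, never ask for a resend
            return False
        heapq.heappush(self._heap, (playout_time, seq, payload))
        return True

    def pop_due(self, now_ms: int) -> list:
        """Return payloads whose playout time has been reached, in order."""
        due = []
        while self._heap and self._heap[0][0] <= now_ms:
            due.append(heapq.heappop(self._heap)[2])
        return due


buf = JitterBuffer(playout_delay_ms=60)
buf.push(1, media_time_ms=0, arrival_ms=30, payload=b"a")
buf.push(3, media_time_ms=40, arrival_ms=50, payload=b"c")
buf.push(2, media_time_ms=20, arrival_ms=95, payload=b"b")  # due at 80, so dropped
print(buf.pop_due(100), buf.dropped)  # [b'a', b'c'] 1
```

A larger playout delay tolerates more jitter but adds to the end-to-end delay, which is exactly the trade-off that makes interactive voice harder than one-way video.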

There is a special protocol for real-time applications, the Real-time Transport Protocol (RTP). Despite its name, RTP is not really a transport-layer protocol; it runs on top of UDP. Also, RTP does not actually ensure on-time delivery of packets, or even that packets will not be lost. That is up to the software at the receiving end point. However, it does supply two pieces of information which are helpful to the receiver: every packet carries a sequence number and a timestamp.

RTP is used for many different media, and the granularity of the time stamp differs depending on the payload type.
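To make the point about granularity concrete, the sketch below converts timestamp differences into milliseconds for two clock rates. The rates used (8000 ticks per second for telephone-quality audio, 90000 for video) are the commonly used values rather than something these notes specify, and the helper names are illustrative.

```python
# Assumed clock rates in ticks per second (typical values, see lead-in).
AUDIO_CLOCK_HZ = 8000
VIDEO_CLOCK_HZ = 90000


def timestamp_delta(newer: int, older: int) -> int:
    """Difference between two 32-bit timestamps, allowing for wrap-around."""
    return (newer - older) & 0xFFFFFFFF


def ticks_to_ms(ticks: int, clock_rate_hz: int) -> float:
    """Convert a number of timestamp ticks to milliseconds."""
    return ticks * 1000 / clock_rate_hz


# 160 audio ticks is one 20 ms voice packet at 8000 Hz.
print(ticks_to_ms(160, AUDIO_CLOCK_HZ))              # 20.0
# The same tick count means far less time on a 90000 Hz video clock.
print(ticks_to_ms(160, VIDEO_CLOCK_HZ))
# Wrap-around: the counter passed 2**32 between these two packets.
print(timestamp_delta(10, 0xFFFFFFF0))               # 26
```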

Here is the RTP header.

Octets	Bits 0-1	Bit 2	Bit 3	Bits 4-7	Bit 8	Bits 9-15	Bits 16-31
1 - 4	V=2	P	X	CC	M	PT	Sequence number
5 - 8	Timestamp (bits 0-31)
9 - 12	Synchronisation source (SSRC) number (bits 0-31)

Version: Identifies the version of RTP (currently 2).
P (Padding): A flag which indicates whether padding octets have been appended to the packet after the payload data.
X (Header extension): Indicates whether an optional extension has been added after the fixed RTP header.
CC (CSRC count): Although not shown on this header diagram, the 12 octet header can optionally be expanded to include a list of up to 15 contributing sources; CC gives the number of entries in that list. Contributing sources are added by mixers, and are only relevant for conferencing applications where elements of the data payload have originated from different computers. For point to point communications, CSRCs are not required.
M (Marker): Allows significant events such as frame boundaries to be marked in the packet stream.
PT (Payload type): This field identifies the format of the RTP payload and determines its interpretation by the application.
Sequence number: A reference number which increments by one for each RTP packet sent. It allows the receiver to reconstruct the sender's packet sequence and to detect lost packets.
Timestamp: The time at which the first data in this packet was sampled. This field allows the receiver to buffer and play out the data in a continuous stream.
Synchronisation source (SSRC) number: A randomly chosen number which identifies the source of the data stream.

The RTP header is inserted after the UDP header and before the actual payload.
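As an illustration of the layout described above, here is a short Python sketch that unpacks the 12 octet fixed header from the start of a UDP payload using the standard struct module. The function and class names are made up for this example, and the sketch ignores the optional CSRC list and header extension.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12 octet fixed RTP header at the start of a UDP payload."""
    if len(packet) < 12:
        raise ValueError("RTP fixed header is 12 octets")
    # Network byte order: two single octets, a 16-bit and two 32-bit fields.
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=b0 >> 6,             # bits 0-1
        padding=bool(b0 & 0x20),     # bit 2
        extension=bool(b0 & 0x10),   # bit 3
        csrc_count=b0 & 0x0F,        # bits 4-7
        marker=bool(b1 & 0x80),      # bit 8
        payload_type=b1 & 0x7F,      # bits 9-15
        sequence_number=seq,
        timestamp=ts,
        ssrc=ssrc,
    )


# A hand-built sample packet: version 2, payload type 0, 160 bytes of payload.
sample = struct.pack("!BBHII", 0x80, 0x00, 1, 160, 0x12345678) + b"\x00" * 160
hdr = parse_rtp_header(sample)
print(hdr.version, hdr.payload_type, hdr.sequence_number, hdr.timestamp)  # 2 0 1 160
```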

Applications which run RTP also have to run the RTP Control Protocol (RTCP). This allows the end points to provide out-of-band data to each other. The protocol supports various types of messages. For example, the sender periodically sends a sender report containing an absolute timestamp, which allows the receiver or receivers to resynchronize. The receiver periodically sends a receiver report describing how well it is receiving the signal. Other messages are involved in initiating or terminating streams.
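One of the statistics a receiver can report is a running estimate of jitter. The sketch below follows the smoothing formula given in the RTP specification (RFC 3550), in which each new packet moves the estimate one sixteenth of the way toward the latest observed variation. The function name and the way state is passed around are choices made for this example, and arrival times are assumed to be expressed in the same units as the RTP timestamp.

```python
from typing import Optional, Tuple


def update_jitter(
    jitter: float,
    prev_transit: Optional[int],
    arrival: int,
    rtp_timestamp: int,
) -> Tuple[float, int]:
    """Update the running jitter estimate for one received packet.

    arrival and rtp_timestamp must be in the same units (timestamp ticks).
    Returns the new jitter estimate and this packet's transit time.
    """
    transit = arrival - rtp_timestamp
    if prev_transit is None:
        return jitter, transit  # first packet: nothing to compare with yet
    d = abs(transit - prev_transit)  # change in transit time between packets
    jitter += (d - jitter) / 16.0    # move 1/16 of the way toward d
    return jitter, transit


# Packets are sent every 160 ticks; the third one arrives 20 ticks late.
jitter, transit = 0.0, None
for arrival, ts in [(1000, 0), (1160, 160), (1340, 320)]:
    jitter, transit = update_jitter(jitter, transit, arrival, ts)
print(jitter)  # 1.25
```

The division by 16 smooths out single late packets, so the reported value reflects sustained variation rather than one-off spikes.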

It would be nice if the Internet provided some sort of guaranteed Quality of Service (QoS), but at the moment it does not. Guaranteed QoS is only possible if routers are able to reserve bandwidth for a particular stream. IPv6 is much better designed to allow this than IPv4.

Protocols have been developed to reserve bandwidth and to guarantee quality of service, and they can be useful on intranets, but for them to be useful across the Internet as a whole, it would be necessary to reengineer the whole thing.