VoIP protocols and technologies fall into two major categories: distributed and centralized. MGCP is an example of a centralized VoIP protocol, since it is built on a client/server architecture. H.323 and SIP, by contrast, are classed as distributed, because call control is handled by the endpoints themselves, which communicate with one another as peers.
All VoIP technologies and protocols use a common protocol to carry speech: RTP (Real-time Transport Protocol), which packetizes voice traffic for transmission over IP and supports multiple codecs for compressing the data.
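To make the packetization step concrete, here is a minimal sketch of wrapping one voice frame in the 12-byte fixed RTP header. The function name `build_rtp_packet` and the sample values are illustrative assumptions, not part of any particular VoIP stack; header extensions, padding, and CSRC lists are left out.

```python
import struct

def build_rtp_packet(payload: bytes, seq: int, timestamp: int, ssrc: int,
                     payload_type: int = 0, marker: bool = False) -> bytes:
    """Prefix a voice frame with a fixed 12-byte RTP header (hypothetical helper)."""
    # First byte: version 2, no padding, no extension, zero CSRC entries.
    byte0 = 2 << 6
    # Second byte: marker bit plus 7-bit payload type (0 is G.711 mu-law).
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1,
                         seq & 0xFFFF,            # 16-bit sequence number
                         timestamp & 0xFFFFFFFF,  # 32-bit media timestamp
                         ssrc & 0xFFFFFFFF)       # 32-bit stream identifier
    return header + payload

# 20 ms of G.711 audio is 160 bytes (8000 samples/s, 1 byte per sample).
packet = build_rtp_packet(b"\x00" * 160, seq=1, timestamp=160, ssrc=0x1234)
```

The codec in use is signaled only by the payload type field, which is how a single transport can carry any of the codecs discussed in this section.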
The differences lie in how signaling is transmitted and in where the service logic and call management reside (at the endpoints or on a central server). Both architectures have advantages and disadvantages. Distributed models scale well and are more robust because they have no central node that can become a single point of failure. Centralized models are easier to manage and support traditional supplementary services (such as conferencing), but their scalability may be limited by the capacity of the central telephony server.
Speech encoder/decoder (codec) technology has made considerable progress in the past few years, thanks to advances in telephony hardware built on specialized digital signal processors (DSPs) and to research on human speech. The new codecs do more than analog-to-digital conversion: they use sophisticated predictive models to analyze the input speech signal and then transmit it using minimal bandwidth.
Simple pulse-code modulation (PCM) of speech, standardized as ITU-T G.711, encodes speech at 64 kbit/s using either mu-law or A-law companding. Both methods use logarithmic compression to achieve the quality of 12-13 bit linear PCM in 8 bits.
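The effect of logarithmic compression can be sketched with the continuous mu-law formula. This is an illustration only: real G.711 uses a segmented, bit-exact approximation of this curve, and the function names and the signed code range of -127..127 are assumptions made for the example.

```python
import math

MU = 255  # mu-law parameter used for 8-bit companding

def mu_law_compress(x: float) -> float:
    """Map a sample in [-1, 1] onto a logarithmic scale in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

def encode_8bit(x: float) -> int:
    """Compress, then quantize uniformly to a signed code in -127..127."""
    return round(mu_law_compress(x) * 127)

def decode_8bit(code: int) -> float:
    """Rebuild an approximate sample from the 8-bit code."""
    return mu_law_expand(code / 127)

quiet = 0.01  # a low-level sample, where uniform 8-bit steps are too coarse
companded_error = abs(decode_8bit(encode_8bit(quiet)) - quiet)
uniform_error = abs(round(quiet * 127) / 127 - quiet)
```

For quiet samples `companded_error` comes out far smaller than `uniform_error`, which is the sense in which 8 companded bits approach the quality of a wider linear code.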
Another commonly used compression method is adaptive differential pulse-code modulation (ADPCM). A typical case is ITU-T G.726 ADPCM with 4-bit quantization, which gives a speech bit rate of 32 kbit/s. Unlike PCM, the 4-bit code does not represent the amplitude of the speech itself; it encodes the difference in amplitude and the rate of change of amplitude, using fairly primitive linear prediction.
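The idea of coding differences rather than amplitudes can be shown with a toy encoder. This is not the G.726 algorithm: the predictor here is simply the previous reconstructed sample, and the step-size rule in `_adapt` is an assumption chosen for brevity. It only illustrates how a 4-bit code per sample yields 32 kbit/s at an 8 kHz sampling rate.

```python
SAMPLE_RATE_HZ = 8000
BITS_PER_CODE = 4

def _adapt(step: int, code: int) -> int:
    """Hypothetical adaptation rule: widen the step after large codes, narrow it after small ones."""
    if abs(code) >= 6:
        return min(step * 2, 2048)
    if abs(code) <= 1:
        return max(step // 2, 1)
    return step

def toy_adpcm_encode(samples, step=4):
    """Emit one 4-bit code (-8..7) per sample: the quantized prediction error."""
    codes, predicted = [], 0
    for sample in samples:
        code = max(-8, min(7, round((sample - predicted) / step)))
        codes.append(code)
        predicted += code * step  # track what the decoder will reconstruct
        step = _adapt(step, code)
    return codes

def toy_adpcm_decode(codes, step=4):
    """Rebuild samples by accumulating the scaled differences."""
    samples, predicted = [], 0
    for code in codes:
        predicted += code * step
        samples.append(predicted)
        step = _adapt(step, code)
    return samples

bit_rate = SAMPLE_RATE_HZ * BITS_PER_CODE  # 32000 bit/s
signal = [0, 10, 20, 30, 40, 50, 40, 30, 20, 10, 0]
codes = toy_adpcm_encode(signal)
decoded = toy_adpcm_decode(codes)
```

Because encoder and decoder apply the same adaptation rule to the same codes, the step size never has to be transmitted.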
Newer compression methods, such as LPC, CELP, and MP-MLQ, exploit features of the waveform beyond those used by PCM and ADPCM, drawing on knowledge of how speech is produced. These are signal-processing methods that compress speech by sending only simplified parametric information about the original excitation and the shape of the vocal tract, which requires less bandwidth. These methods can be grouped together as source codecs.
Currently, the following encoding methods are used in voice over IP:
*G.711 – PCM at 64 kbit/s
*G.726 – ADPCM at 40, 32, 24, and 16 kbit/s
*G.728 – CELP at 16 kbit/s
*G.729 – CELP at 8 kbit/s
*G.723.1 – MP-MLQ at 6.3 kbit/s and CELP at 5.3 kbit/s
When implementing VoIP networks, we need to focus on the bandwidth and latency (delay) requirements.
Bandwidth requirements are critical and are determined not only by the bit rate of the codec used (from 3-4 to 64 kbit/s), but also by the overhead that packet headers (IP, UDP, and RTP) add to every packet, and by other factors. Because conversations contain pauses, voice activity detection (VAD) was developed to avoid sending packets during silence. With VAD, bandwidth requirements are reduced roughly by half. For example, G.711 has a codec rate of 64 kbit/s; once header overhead is included and VAD is applied, the total bandwidth for a voice channel is about 40 kbit/s.
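The G.711 figure can be reproduced with a short calculation. The sketch below assumes 20 ms of speech per packet, 40 bytes of IP, UDP, and RTP headers per packet, no link-layer overhead, and a VAD activity factor of 0.5; all of these are illustrative assumptions, and the function name is hypothetical.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP fixed headers

def call_bandwidth_bps(codec_bps: float, packet_ms: float = 20,
                       vad_factor: float = 1.0) -> float:
    """Estimate one-way bandwidth of a voice channel, headers included."""
    packets_per_second = 1000 / packet_ms
    header_bps = IP_UDP_RTP_HEADER_BYTES * 8 * packets_per_second
    # vad_factor is the assumed fraction of time speech is actually sent.
    return (codec_bps + header_bps) * vad_factor

g711_no_vad = call_bandwidth_bps(64_000)                 # 80 kbit/s
g711_vad = call_bandwidth_bps(64_000, vad_factor=0.5)    # 40 kbit/s
g729_no_vad = call_bandwidth_bps(8_000)                  # 24 kbit/s
```

Under these assumptions the header overhead is a fixed 16 kbit/s per channel, so it dominates for low-rate codecs: G.729 triples from 8 to 24 kbit/s, while G.711 grows by only a quarter.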
Quality requirements for voice networks set the maximum acceptable latency at 150-200 ms. Greater delay usually degrades conversation quality significantly. Among the codecs, G.723.1 introduces the greatest delay (30 ms) and G.711 the lowest (0.75 ms). Note that circuit-switched networks introduce the smallest propagation delay and packet-switched networks (IP networks) the greatest, due to buffering. For this reason, VoIP is less attractive for voice transmission over the Internet than VoATM and VoFR. Nevertheless, VoIP quality over the Internet is quite acceptable for a corporate network that needs a maximum of 4-6 concurrent voice channels.
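A delay budget check along these lines can be sketched as follows. The codec delays are the figures quoted above; the network and jitter-buffer values in the examples are invented for illustration, and the function names are hypothetical.

```python
# Codec delays in milliseconds, as quoted in the text.
CODEC_DELAY_MS = {"G.711": 0.75, "G.723.1": 30.0}

def one_way_delay_ms(codec: str, network_ms: float, jitter_buffer_ms: float) -> float:
    """Sum the main contributors to one-way delay for a call."""
    return CODEC_DELAY_MS[codec] + network_ms + jitter_buffer_ms

def within_budget(delay_ms: float, limit_ms: float = 150.0) -> bool:
    """Compare against the stricter end of the 150-200 ms range."""
    return delay_ms <= limit_ms

# Assumed path: 90 ms network delay plus a 40 ms jitter buffer.
g711_delay = one_way_delay_ms("G.711", network_ms=90, jitter_buffer_ms=40)
g723_delay = one_way_delay_ms("G.723.1", network_ms=90, jitter_buffer_ms=40)
```

On this assumed path the G.711 call stays under 150 ms while the G.723.1 call exceeds it, which shows how codec choice can decide whether a marginal link is usable.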