By Igor Barkov
The transmission of voice and fax data over TCP/IP networks is becoming widespread. This article is intended for readers who already know the basics of VoIP; it covers some issues that have not yet been widely discussed.
Before commercial traffic is exchanged, the network of a new IP operator is tested, and the result affects the price of traffic termination in that network. Whether the test succeeds depends on two factors: the method of connection to the dial-up Telephony Network of Common Use (TNCU) and the quality of the connecting IP channel between gateways. The requirements for the delay and bandwidth of a connecting operator's network are quite high. For example (ITXC):
Besides these, ITXC sets requirements for the types and configuration of equipment and for the accessibility of the network for remote monitoring.
It is quite difficult to maintain a channel with such characteristics. A company entering the IP-telephony market is better off setting up a clear n × 64 Kbit/s channel to join a partner's IP network.
Building an n × 64 Kbit/s channel always takes a long time and is expensive. The cost depends on the bandwidth and the physical length. The efficiency of an IP channel depends on the traffic volume; in IP telephony this can be expressed as the maximum number of simultaneous connections. There are many ways to calculate the required channel bandwidth, e.g. http://www.iptelephony.org/frame/technology.html or www.iLocus.com.
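As a rough illustration of such a calculation, the sketch below estimates the IP-level bandwidth of one direction of a voice call and the number of simultaneous calls that fit into an n × 64 Kbit/s channel. The header sizes (20-byte IPv4, 8-byte UDP, 12-byte RTP) are the standard ones; the 20 ms packet interval, the function names, and the omission of link-layer overhead, pause suppression, and signaling traffic are simplifying assumptions, not figures from this article.

```python
IP_HEADER = 20    # bytes, IPv4 header without options
UDP_HEADER = 8    # bytes
RTP_HEADER = 12   # bytes, fixed RTP header without CSRC entries


def call_bandwidth_kbps(codec_kbps, packet_ms=20):
    """IP-level bandwidth of one direction of a call, in Kbit/s.

    codec_kbps -- bit rate of the coded voice stream (e.g. 64 for G.711)
    packet_ms  -- how much speech each packet carries (assumed 20 ms)
    """
    # Kbit/s multiplied by ms gives bits; divide by 8 to get bytes.
    payload_bytes = codec_kbps * packet_ms / 8
    packet_bytes = payload_bytes + IP_HEADER + UDP_HEADER + RTP_HEADER
    packets_per_second = 1000 / packet_ms
    return packet_bytes * 8 * packets_per_second / 1000


def max_calls(n, codec_kbps, packet_ms=20):
    """Whole number of one-way voice streams that fit into n x 64 Kbit/s."""
    channel_kbps = n * 64
    return int(channel_kbps // call_bandwidth_kbps(codec_kbps, packet_ms))


if __name__ == "__main__":
    for name, rate in [("G.711", 64), ("G.729", 8)]:
        print(name, call_bandwidth_kbps(rate), "Kbit/s per call,",
              max_calls(2, rate), "calls in 2 x 64 Kbit/s")
```

Under these assumptions the packet headers dominate for low-rate codecs, which is why the real bandwidth per call is noticeably higher than the codec bit rate alone suggests.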
The coding/decoding algorithm strongly influences how effectively the IP channel bandwidth is used.
All voice codecs can be divided into three groups:
Fig. 1 shows the average subjective voice quality for these codec types.
In voice gateways the term "codec" covers not only the coding/decoding algorithms but also their hardware implementation. Most codecs used in IP telephony are defined by the G-series recommendations referenced by the H.323 standard. For the detailed theory of voice codecs, please refer to http://www-mobile.ecs.soton.ac.uk/papers/papers.html. We now turn to the basic codecs used in carrier-grade IP-telephony gateways.
The G.711 recommendation describes a codec that uses PCM conversion of an analog signal with 8-bit resolution, an 8 kHz sampling frequency, and simple compression of the signal amplitude. The data rate at the converter output is 64 Kbit/s (8 bit × 8 kHz). To reduce quantization noise and improve the conversion of small-amplitude signals, nonlinear quantization according to the μ-law is used (see Fig. 2).
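The effect of nonlinear quantization can be illustrated with the continuous μ-law curve. The sketch below is only an approximation for illustration: a real G.711 coder uses a piecewise-linear (segmented) version of this curve, and the function names and the simple uniform quantizer are assumptions of this example.

```python
import math

MU = 255  # mu value used in the mu-law variant of PCM telephony


def mu_compress(x):
    """Compress a sample in [-1, 1] with the continuous mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_expand(y):
    """Inverse of mu_compress: restore the linear sample value."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


def quantize(value, bits=8):
    """Uniformly quantize a value in [-1, 1] to a signed code of `bits` bits."""
    levels = 2 ** (bits - 1) - 1
    return round(value * levels) / levels


if __name__ == "__main__":
    x = 0.01  # a small-amplitude sample
    linear_error = abs(quantize(x) - x)
    companded_error = abs(mu_expand(quantize(mu_compress(x))) - x)
    print("uniform 8-bit error:  ", linear_error)
    print("mu-law 8-bit error:   ", companded_error)
```

For a small sample the companded path gives a much smaller error than plain uniform 8-bit quantization, at the price of coarser steps for loud samples.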
The first PCM codecs appeared in the 1960s. The G.711 codec is widespread in traditional circuit-switched telephony. In IP telephony, however, it is rarely used because of its high requirements for channel bandwidth and delay. It may be used when maximum voice quality is needed and only a few calls run simultaneously.
The G.723.1 recommendation introduces combined codecs that use MP-MLQ (Multi-Pulse Maximum Likelihood Quantization). Such a codec is a combination of an ADC/DAC and a vocoder. Vocoders came from mobile communication systems. A vocoder lowers the data rate in the channel, which is important for the efficient use of both a radio channel and an IP channel. G.723.1 codecs convert the analog signal to a 64 Kbit/s data stream (PCM), then identify the phonemes by their frequency content, analyze them, and transmit information on the current state of the phonemes in the voice signal. This algorithm reduces the coded bit rate to 5.3 or 6.3 Kbit/s without noticeable degradation of voice quality. The codec's scheme is shown in Figure 3. The codec has two rates and two coding variants: 6.3 Kbit/s with the MP-MLQ algorithm and 5.3 Kbit/s with CELP. The first variant is intended for packet voice transmission.
The conversion requires 16.4 to 16.7 MIPS (million instructions per second) from the DSP and introduces a 37 ms delay. The G.723.1 codec is widely used in voice gateways and other IP-telephony devices.
G.729 combined codecs
This family includes G.729, G.729 Annex A, and G.729 Annex B. G.729 codecs use CS-ACELP (Conjugate Structure Algebraic Code Excited Linear Prediction). The conversion requires 21.5 MIPS and introduces a 15 ms delay. The coded voice bit rate is 8 Kbit/s.
The G.726 recommendation describes ADPCM coding at rates of 32, 24, and 16 Kbit/s. The conversion introduces no delay and requires 5.5 to 6.4 MIPS from the DSP. Figure 4 shows the block diagram.
The codec may be used together with G.711 to reduce the latter's bit rate. It is intended for videoconferencing systems.
The G.728 combined codec is based on LD-CELP (Low Delay Code Excited Linear Prediction). It provides a 16 Kbit/s bit rate, introduces a 3 to 5 ms delay, and is intended for videoconferencing systems. For more information refer to http://www.ecs.soton.ac.uk/. The table below shows H.323 codec characteristics.
AudioCodes offers a newer codec, NetCoder. Its quality is much better than that of G.723.1 and G.729, and it does not require significant computing resources. However, voice gateway manufacturers are in no hurry to integrate it into their products, and it is not included in the H.323 standard. NetCoder works at 4.8 to 9.6 Kbit/s, introduces a 20 ms delay, and has integrated mechanisms for efficient transmission of voice pauses and automatic data rate adjustment.
What is VAD?
VAD (Voice Activity Detection) is used together with many voice codecs. Fig. 5 illustrates the simplest VAD mechanism. The input analog signal arrives at a comparator, where its amplitude is measured and compared with a threshold value. If the amplitude exceeds the threshold (the red line), the signal goes to the codec input and is coded by the chosen algorithm (the T2 to T3 interval). If it is below the threshold (the T1 to T2 interval), only service information is transmitted: a marker for the start of the pause at T1 and a marker for its end at T2.
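The threshold mechanism described above can be sketched in a few lines. This is a minimal illustration, not a production detector: the frame-based processing, the event names, and the use of peak amplitude are assumptions of this example, and real VAD implementations typically add energy averaging and hold-over timers.

```python
def vad_events(frames, threshold):
    """Yield the events a simple threshold VAD would produce.

    frames    -- iterable of non-empty lists of samples (one list per frame)
    threshold -- amplitude above which a frame counts as voice

    Yields ("voice", i) for frames that go to the codec,
    ("pause_start", i) once when the signal drops below the threshold,
    and ("pause_end", i) once when it rises above it again.
    """
    in_pause = False
    for i, frame in enumerate(frames):
        level = max(abs(sample) for sample in frame)  # peak amplitude
        if level > threshold:
            if in_pause:
                in_pause = False
                yield ("pause_end", i)
            yield ("voice", i)
        elif not in_pause:
            in_pause = True
            yield ("pause_start", i)
        # Frames inside a pause produce nothing: no data is sent for them.


if __name__ == "__main__":
    frames = [[0.5, -0.4], [0.01, 0.02], [0.0, 0.01], [0.3, 0.6]]
    for event in vad_events(frames, threshold=0.1):
        print(event)
```

Only the frames marked as voice would be passed to the codec; the pause itself costs two short service messages instead of a continuous stream of coded silence.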
What codec is better?!
The specific nature of voice codecs makes it possible to compare them by MOS (Mean Opinion Score). Cisco provides test results comparing speech intelligibility. The better the quality, the higher the score.
* The results of testing NetCoder and the G.711, G.723.1, and G.729a codecs at different voice signal levels are shown in Figure 6.
The data rate in the gateway-to-gateway channel consists of several components. Fig. 7 shows how devices interact according to the H.323 standard.
Here you can see that besides the coded voice/fax data, which is carried by the Real-time Transport Protocol (RTP), the network also uses the interconnection protocol H.225 to transfer telephony call signaling (Q.931) and RAS (Registration, Admission and Status) information.
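To make the RTP component more concrete, the sketch below packs coded voice into an RTP packet with the 12-byte fixed header defined in RFC 3550. The payload type numbers are the static assignments from RFC 3551; the function name and the sample sequence number, timestamp, and SSRC values are hypothetical and chosen only for illustration.

```python
import struct

# Static RTP payload types (RFC 3551) for codecs discussed in this article.
PAYLOAD_TYPES = {
    "G.711 mu-law": 0,
    "G.723": 4,
    "G.711 A-law": 8,
    "G.728": 15,
    "G.729": 18,
}


def build_rtp_packet(payload_type, seq, timestamp, ssrc, payload, marker=False):
    """Return the 12-byte fixed RTP header followed by the payload bytes."""
    byte0 = 2 << 6  # version 2, no padding, no extension, CSRC count 0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII",                 # network byte order: 1 + 1 + 2 + 4 + 4 bytes
        byte0,
        byte1,
        seq & 0xFFFF,             # 16-bit sequence number
        timestamp & 0xFFFFFFFF,   # 32-bit media timestamp
        ssrc & 0xFFFFFFFF,        # 32-bit synchronization source identifier
    )
    return header + payload


if __name__ == "__main__":
    # 20 bytes of payload stand in for 20 ms of G.729 speech (8 Kbit/s).
    packet = build_rtp_packet(PAYLOAD_TYPES["G.729"], seq=1, timestamp=160,
                              ssrc=0x1234, payload=bytes(20))
    print(len(packet), "bytes, header:", packet[:12].hex())
```

The packet is then sent over UDP, while Q.931 and RAS messages travel separately as part of H.225.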
The structure below shows how the high-level protocols TCP and UDP and the H.323 components (in red) relate to IP.
Figure 9 shows the basic stages of gateway-to-gateway interconnection under the control of an H.323 gatekeeper for a call that arrives at gateway "A" from a telephone network and is directed to a subscriber connected to gateway "B".
Because the multiprotocol H.323 structure is complex to implement, some manufacturers have started to support and develop alternative protocols for IP gateway interconnection alongside H.323: for example, Nuera, Komode, Mediatrix, and Ericsson with SIP (Session Initiation Protocol), Cisco Systems with MGCP (Media Gateway Control Protocol) and SGCP (Simple Gateway Control Protocol), and some others.