Preface

Today everybody knows what audio capabilities a modern computer has. A music library on a computer, alongside audio cassettes and CDs, surprises no one. We all know that a CD is simple and easy to use. Why? Because it has a single fixed format that is the same everywhere, and the sound quality depends on the recording studio and is usually high. That is convenient. But what about music on a computer? PCM (pulse-code modulation), the format used on CD-DA discs, is not compact and is poorly suited to delivering music over the Net. That is why developers are working on a great many complex compression algorithms. They differ considerably in sound quality, so a user always has to choose an algorithm for recording his or her favorite music.

Despite the great variety of algorithms and formats, MPEG 1.0 Audio Layer III, known simply as MP3, is the absolute leader. There are many encoder programs for recording music in this format, and the Net is full of benchmark tests comparing programs that encode into MP3. The best one today is LAME, a free open project without any license restrictions. The position of MP3, however, may soon be shaken: a new algorithm called Ogg Vorbis has appeared on the scene. Since the Beta 3 version of this encoder appeared at the end of the summer of 2000, users have faced a real choice. At the beginning of 2001 two new versions, LAME 3.88 and Ogg Vorbis 1.0 Beta 4, appeared almost simultaneously. Both differ considerably from their predecessors, and today we will compare them.

Before the Start

The testing was carried out in 5 quality zones: "Super" (320 Kbit for LAME, 350 Kbit for OGG), "Good" (256 Kbit), "Not bad" (192 Kbit), "Middling" (160 Kbit) and "Satisfactory" (128 Kbit).
The OrlSoft MPeg eXtension 2.0 program was used as the encoder and decoder. The following programs were used to analyze the results: CoolEdit Pro 1.2, Steinberg WaveLab 3.02 and SpectraLab 4.32.13. We took the following compositions for our testing: Roxette "Crush On You", Richard Clayderman "Mano a Mano", DJ BOBO "What A Feeling", Bluemchen "Ist Deine Liebe Echt" and "Sehnsucht@herz.de", Chicane "Autumn Tactics", etc. The most characteristic results for each zone will be used for conclusions and illustrations.

Terms of comparison

Unfortunately, it is impossible to compare LAME and OGG on equal terms at the same bitrate. The Ogg Vorbis format and encoder are not intended for constant bitrate coding (as in MP3): each of the 6 preset modes (112, 128, 160, 192, 256, 350) is variable bitrate (VBR) coding, though it would be more correct to call it ABR, an average bitrate. Moreover, apart from the bitrate, which you can choose, OGG allows no manipulation of its parameters, while LAME lets you control nearly everything. That is why we will compare both coders in the coding modes recommended by the developers. We will deviate from this rule only for the lowpass filter (which passes only frequencies below a certain level): for LAME we will specify which frequencies should be kept, so that they are not cut too severely. An average bitrate mode appeared in LAME not long ago, though it has existed in OGG for a long time. According to the developers, at the same bitrate coding in ABR mode should be no worse than in the usual mode. That is why we will use two sets of coded samples to estimate LAME quality: one with a constant bitrate and one with a variable bitrate.

Format limitations

It is impossible to code non-stop albums into MP3 without pauses, since the files always end up with pauses at the beginning and at the end of tracks. LAME can correct the beginning of files.
In the Ogg Vorbis format, files coincide with the originals completely. But this coder can code files only at the sampling frequency of 44100 Hz, i.e. in the CD-DA format. OGG can work at 48000 Hz when coding files.

Terminology

Sample - an audio file, a fragment of a musical composition.
Original, original sample - a fragment in WAV format taken from an audio CD.
Coded sample - a sample coded (compressed) into one of the formats in question; here MP3 (LAME) or OGG.
Decoded sample - a coded sample converted from the compressed format back into ordinary WAV for our testing.
Coder, encoder - a program that compresses (codes) a sample from one format into another; here, from WAV into MP3 or OGG.
AFC - the amplitude-frequency characteristic of the sound: a graph of amplitude vs. frequency.
Sonogram - a representation of the sound as a graph of frequency vs. time.
Delta-signal - a difference (delta) signal obtained by subtracting one sample from another; it characterizes the difference between them. Here it is used to calculate the difference between the original and coded samples.

"Super"

The aim of coding here is to reach the maximum possible sound quality, so we set the maximum parameters for both coders. For LAME we take the 320 Kbit mode with the full sound range up to 22 KHz and the highest quality (-q0); the other parameters are set by the coder. For OGG we also set the highest quality mode, 350 Kbit. Unfortunately, its other parameters cannot be adjusted. So, what is the result? I can confirm that the psychoacoustic models of both coders have changed considerably, which is easy to see when analyzing the decoded samples. High frequency processing has changed a lot. Previously, in the 320 Kbit mode LAME kept the complete range up to 22 KHz; now these frequencies also pass through the psychoacoustic model. This can be clearly seen on the sonogram.
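A sonogram of the kind described above can be sketched in a few lines. This is only a minimal illustration, not the method used by the analysis programs in this test: it relies on a naive DFT from the Python standard library to stay self-contained, and the function names (frame_spectrum, sonogram, highest_active_hz), the frame size and the -60 dB floor are all assumptions chosen for the example.

```python
import cmath
import math


def frame_spectrum(frame):
    """Magnitude spectrum of one frame via a naive DFT (bins 0..N/2)."""
    n = len(frame)
    # Hann window to reduce leakage between neighbouring bins.
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(frame)]
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(w * cmath.exp(-2j * math.pi * k * i / n)
                  for i, w in enumerate(windowed))
        mags.append(abs(acc))
    return mags


def sonogram(samples, frame_size=64, hop=32):
    """List of magnitude spectra, one per time frame (frequency vs. time)."""
    return [frame_spectrum(samples[start:start + frame_size])
            for start in range(0, len(samples) - frame_size + 1, hop)]


def highest_active_hz(spectrum, sample_rate, frame_size, floor_db=-60.0):
    """Highest bin frequency whose level is within floor_db of the frame peak."""
    peak = max(spectrum)
    if peak == 0:
        return 0.0
    threshold = peak * 10 ** (floor_db / 20)
    top = max(k for k, m in enumerate(spectrum) if m >= threshold)
    return top * sample_rate / frame_size
```

For a decoded sample with a hard lowpass, highest_active_hz would be expected to stay near the cutoff in every frame, while for the original it would follow the music up to the top of the range. Real audio needs much larger frames and an FFT, since the naive DFT here is quadratic in the frame size.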
Compare the original and the decoded sample:

And now look at OGG at 350 Kbit: it cuts frequencies from as low as 16 KHz. The encoded signal even contains stretches with an AFC like this one (the vertical line marks 16 KHz). Such handling of high frequencies is rather strange for the highest quality coding mode.

Let's listen to the samples and compare them with the original recordings. Interestingly, the sound is so close to the original that it is difficult to tell the three versions apart. With maximum parameters both coders sound practically identical to the original CD. The only thing I noticed is that OGG gives a more transparent sound and better reproduction of the upper middle frequencies. But this difference is so tiny that it can be noticed only on high quality equipment. So both coders get the highest score, the only difference being that OGG has an average bitrate higher than 320 Kbit (usually between 340 and 380). The averaged AFCs of the original and coded samples differ little, despite such cavalier handling of the high frequencies.

Now let's evaluate the delta-signals, i.e. calculate them and compare the differences between the original and coded samples. The delta-signal of the LAME samples sounds like quiet wide-band noise through which one can hear the faint main sound with hoarse crackling and severely distorted high frequencies. For the OGG samples the picture is much more complicated: it resembles not just noise but a heavily distorted original with phase distortion effects (flanger or phaser effects). I think OGG treats the different ranges more carefully; they seem well thought out compared with LAME, which uses very similar psychoacoustic parameters for most subranges. This is clearly visible in the AFC of the delta-signals (the red graph is LAME, the white one is OGG).
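The delta-signal defined in the Terminology section is simply a sample-by-sample subtraction. Here is a minimal sketch in Python, assuming both signals are already time-aligned, equal-length lists of floats in the range -1..1 (codecs add delay and padding, so alignment has to be done first). The names delta_signal and rms_db are hypothetical helpers, not functions of any of the tools used in this test.

```python
import math


def delta_signal(original, decoded):
    """Sample-by-sample difference between two time-aligned signals."""
    if len(original) != len(decoded):
        raise ValueError("signals must be time-aligned and of equal length")
    return [o - d for o, d in zip(original, decoded)]


def rms_db(samples, full_scale=1.0):
    """RMS level in dB relative to full_scale; -inf for pure silence."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10 * math.log10(mean_square / (full_scale ** 2))


# Toy example: the "decoded" signal is the original attenuated by 10%.
original = [math.sin(2 * math.pi * 8 * i / 64) for i in range(64)]
decoded = [0.9 * s for s in original]
delta = delta_signal(original, decoded)
print(f"delta level: {rms_db(delta):.1f} dB")
```

A lower RMS level of the delta-signal means the decoded sample is closer to the original. The AFC graphs in this article make the same comparison per frequency rather than as one overall figure.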
The lower the signal, the better the sound quality at the corresponding frequencies. So, the OGG developers decided to simplify the coding of low and middle frequencies below 2 KHz and to improve reproduction of the upper middle and high frequencies up to 16 KHz. On the graph you can see that in the range up to 2 KHz LAME reproduces the sound better, while from 2 to 16 KHz OGG gives better results.

LAME, unlike OGG, offers wide control over the coding process, psychoacoustic parameters and filters. If we cut frequencies above 20 KHz (which are inaudible anyway) when coding at 320 Kbit, we can achieve a better sound. Let's look at the graph of the averaged AFCs of the delta-signals for the full-range sample and the sample cut at 20 KHz. The difference in the delta-signal level is up to 2 dB, which is roughly 20% in amplitude. It means that the whole range improves by 1-2 dB.

So, in the zone of the highest quality OGG and LAME are very close to each other, and music lovers can pick whichever coder they like more.

"Good"

A super high bitrate gives excellent sound quality, but it is not the preferred choice among the high bitrates, since at about 2.5 MBytes per minute the files are very big. As a rule, many prefer 256 Kbit as a reasonable compromise between good quality and file size. So let's compare the quality of both coders in this mode and estimate the losses relative to the highest bitrate. Here I will again cut frequencies above 20 KHz for LAME in order to improve the quality of the main audible range. LAME was tested in two modes: with a constant bitrate and with a variable one. First let's look at the frequency dynamics of the samples obtained (the change of the AFC over time, averaged over small intervals of 20 to 100 milliseconds). The trend of losing high frequencies persists.
OGG cuts the high frequencies (above 18 KHz) more than LAME does:

But OGG reproduces frequencies from 15 to 18 KHz much better, while LAME cuts them in the pauses between high amplitude peaks and at lower frequencies. Still, even this diagram shows that the frequency reproduction of LAME in ABR mode is much better than in the standard constant bitrate mode.

Now let's listen to what we have got. The LAME developers have changed the coder a lot: there is no longer any difference in the sound of the high frequencies. Samples coded in ABR are better than those coded in the standard mode, so I recommend giving up the constant bitrate in favor of ABR. OGG's cutting of frequencies above 18 KHz does not affect the overall sound of the samples much, and the difference from LAME is also insignificant. If you prefer 256 Kbit, keep using it, but take the newer version of LAME.

As for the delta-signals, they sound much as in the 320 Kbit mode. The OGG samples have some hoarse sounds in the high frequency range, and in the LAME samples we can hear a general increase in the noise level. At the same bitrate OGG drastically cuts frequencies above 18 KHz in order to improve the sound of the rest. Over practically the whole range the OGG samples are much closer to the original than the LAME ones. But such considerable differences are not typical of all samples. For example, in the Richard Clayderman sample all the flaws in the reproduction of high frequencies are clearly noticeable. In this case LAME performs much better, although OGG still shows higher quality at frequencies above 2 KHz. At the same time, LAME shows better quality when coding in ABR mode; the difference is 1 to 2 dB. I will try to explain the discrepancy between the two comparisons.
The first sample is very loud and dense, with the level pushed nearly to 0 dB, while the second sample is an orchestral composition recorded on average at -3 dB, i.e. without compression of the dynamic range. The denser the recording, the higher the quality of OGG compared with LAME. But remember that LAME has a kind of metallic effect at high frequencies, and OGG cuts frequencies above 18 KHz. Below you can see the AFCs of the delta-signals for a sample of average density. You can see that the picture has not changed much.

So, when choosing a coder for the 256 Kbit mode you should decide what is more important to you: the middle frequencies or those above 17-18 KHz. Also note that in ABR mode the coding quality of LAME is much better than with a constant bitrate. In my opinion, the leader is OGG.

256 vs 320/350: who wins?

Now I think we should compare the coding quality of these two zones. The listening tests of the LAME samples confirm my conclusion about the high quality of the ABR 256 Kbit mode. The difference between the 256 ABR and 320 samples is noticeable: the sound becomes harsher, excessively sharp, but it is not critical. So if maximum similarity to the original is not important to you and you just need high quality, the 256 ABR mode will suit you. With the OGG samples the situation is different: the sound becomes diffuse, though again not critically. These differences, however, can be heard only on special equipment.

And now look at these differences in graphical form. These are the averaged AFCs of the delta-signals for the 256 and 320/350 Kbit samples of both coders. The largest difference between the modes is in the OGG coder, and it is most noticeable at the frequencies where OGG sounds best. This can be explained by the difference in bitrate, since we are comparing 256 not against 320 but against 350 Kbit.
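To put the bitrates compared here into perspective, the size of one minute of audio follows directly from the bitrate. This is a small sketch, assuming 1 Kbit = 1000 bits and 1 MByte = 10^6 bytes and ignoring any container overhead. The name megabytes_per_minute is a hypothetical helper. For OGG the preset is only a nominal average, so real files will deviate from these figures.

```python
def megabytes_per_minute(kbit_per_second):
    """Stream size for one minute of audio, in MBytes (1 MByte = 10**6 bytes)."""
    bytes_per_second = kbit_per_second * 1000 / 8
    return bytes_per_second * 60 / 1_000_000


for rate in (256, 320, 350):
    print(f"{rate} Kbit/s: {megabytes_per_minute(rate):.2f} MBytes/min, "
          f"{rate / 256:.2f}x the 256 Kbit stream")
```

By this arithmetic a 350 Kbit stream carries about 37% more data than a 256 Kbit one, against 25% for 320 Kbit, which is consistent with OGG showing the larger gap between the two zones.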
In conclusion I should note that coding in the 256 Kbit mode is of such high quality that you can safely use this mode, unless it is vital for you to have the sound as close to the original as possible.

"Not bad"

The 192 Kbit mode is a middle-of-the-road solution, since it offers neither really good quality nor a small size (1.5 MBytes per minute). Let's check what has changed in the new versions. The situation with high frequencies has changed radically: OGG reproduces them better, LAME cuts them. ABR coding is still better than the usual mode, but it is still worse than OGG. On the sonogram you can clearly see that in the 192 Kbit mode OGG reproduces high frequencies better. Both coders cut frequencies above 16 KHz considerably, but OGG keeps more of them than LAME does.

Let's listen to our samples. The overall sound is of course worse than at 256 Kbit: there is some metallic effect, smearing of the high frequencies and a small loss of depth with both coders. But this time they sound very different. LAME reproduces the high frequencies tolerably, while OGG deserves a good mark in most cases. The biggest gap in their sound is on the Richard Clayderman sample. OGG only smears the sound, while LAME introduces a metallic tone. This is a disadvantage of LAME. At the same time, the middle frequencies sag a bit with OGG, which did not happen at higher bitrates. Nevertheless, on the whole the sound in the 192 Kbit mode is not that bad.

Let's look at the delta-signals of the coded samples where the difference in sound was greatest. In the LAME samples, apart from the general increase in noise, noticeable distortions have appeared in the high frequency range. In the OGG samples a strongly distorted original in the mids is heard quite clearly through the noise and the distorted highs. In the LAME samples the level of the middle frequencies is not so high.
It means that the level of the delta-signals, noise and distortion grows as the bitrate decreases. The lower the bitrate, the bigger the difference between the coded samples and the originals. It is difficult to tell from the graphs which coder is better. Here you can see the diagrams for the Roxette samples, which are very dense and loud.

The conclusion is the following: the coders sound different. LAME reproduces the sound reasonably well in general, but it makes the highs metallic. OGG smears them and falls behind LAME in the reproduction of middle frequencies. That is why I think that LAME in ABR mode codes music at 192 Kbit better than OGG, while in the constant bitrate mode it is beaten by OGG in all tests. I hope it is clear by now that the ABR mode always gives better quality. So, LAME in ABR mode leads at 192 Kbit.

"Middling"

Now we turn to the most popular bitrates, 160 and 128 Kbit. The 128 Kbit mode is not enough for high quality sound, since the stream is too narrow to code the highs. 160 Kbit, however, is quite enough for acceptable reproduction of frequencies up to 16-17 KHz. OGG is again the leader at high frequencies. The LAME sonograms are practically the same this time, with ABR a little better.

Let's listen to them. OGG sounds excellent on all samples! I could not have imagined that such quality could be achieved at 160 Kbit. LAME cannot handle the highs, despite artificial suppression above 18 KHz. They not only sound metallic, there is also a "chewing" effect. But at middle frequencies LAME sounds better than OGG.

Look at the averaged AFCs of the delta-signals. At the expense of poorer middle frequencies, OGG reproduces the highs and lows better, i.e. the frequencies that are traditionally cut at low bitrates. Cutting them is exactly what LAME does, as a traditional MP3 coder. The absolute leader at 160 Kbit is OGG, but LAME in ABR mode falls only a little behind it.
"Satisfactory"

The 128 Kbit mode is used mainly for music on the Net, where it is the most popular format. Let's look at our contestants. LAME coding is carried out with suppression of frequencies above 16500 Hz in order to improve the sound in the main frequency range. Moving the cutoff higher would not improve the highs, but the overall sound would become worse. Apart from this artificial suppression of the highs by LAME, the sonograms of the samples from both coders are similar, with the ABR sonogram a little better.

Listening to them, I hear the same picture as at 160 Kbit: the OGG samples sound much more pleasant and of higher quality than the LAME ones. The overall sound is of course worse than at 160 Kbit, but the trend persists: OGG reproduces the high and low frequencies better, LAME shines in the middle ones. For a 128 Kbit mode the coding quality is very good.

Let's look at the AFCs. You can see that everything depends on the original. The first diagram is for the very dense Roxette sample, the second is for the Clayderman sample. The OGG coder, trying to deliver excellent highs (beyond 16 KHz), economizes on the 2-15 KHz frequencies. Even so, the OGG samples sound much more pleasant; there is only some dip in the middle frequencies. You should not pay too much attention to it, since at 128 Kbit this is not an important criterion for judging a coder. So, the leader is the new version of the OGG coder.

128 vs 160: who wins?

The rivalry between the 128 and 160 Kbit bitrates is no weaker than that between 256 and 320. Let's compare the two modes separately for each coder. We will start with OGG. The difference is considerable. The lows sound quite different: they become diffuse, and the sharpness of percussion and bass is gone. The highs seem more metallic and diffuse at the same time.
The middle frequencies are not much different because, in my opinion, the developers are again saving on the highs and lows in favor of the middle range. On the whole, the sound in the 160 Kbit mode is richer, so you had better choose it; for the Net this size is still acceptable. Look at the spectra of the delta-signals. For OGG the difference is clearly noticeable, around 4 dB. On the whole the spectrum differs only in signal level, apart from a small shift near 2 KHz, which shows that the middle frequencies are reproduced identically.

Now comes LAME. Note that LAME started to code reasonably well in the 128 Kbit mode only recently, so the comparison should be very interesting. The quality of samples with a lot of high frequencies differs greatly. Whereas at 160 Kbit the reproduction was normal, at 128 Kbit the sound becomes jerky, rough and metallic, with a "chewing" effect and strong phase distortions. On samples with fewer highs the difference is not so severe, but the sound is rarely satisfactory. But there is also an advantage: the low and middle frequencies do not differ between these two modes! Even the difference in the middle frequencies that can be seen on the sonogram does not affect the sound. The difference in delta-signal level is much smaller for LAME than for OGG: the largest difference does not exceed 2 dB, while for OGG it can reach 4-5 dB. That is why the LAME samples seem to differ less when listened to.

LAME has the most problems with the reproduction of high frequencies; OGG also has trouble with the lows. In general, I would recommend 160 Kbit wherever possible, since the difference between 160 and 128 is not that big in size but matters a lot as far as quality is concerned.

Conclusion
Copyright © Byrds Research & Publishing, Ltd., 1997–2011. All rights reserved.