Given modem owners' growing interest in data compression protocols in particular, and in the efficiency of channel-level data transfer in general, we decided to look at the subject in a small test.
It is well known that data compression plays a significant role when a modem connection is used for browsing web pages, receiving text files or downloading large e-mail messages. Most often the compression is done by the modem itself, and the most popular compression protocol, V.42bis, serves exactly this purpose. The sending modem monitors the data stream and, if the data are compressible, compresses them and sends them over the telephone network in packed form. The receiving modem unpacks the data and passes them to the computer. Different data compress differently: the gain from modem compression ranges from a few percent to a factor of 5-10 relative to the original volume. That is why it is important to know how well this function works in a particular modem. This review therefore compares the effectiveness of different implementations of the compression protocols.
The compression effectiveness of the V.42bis protocol depends on two key parameters: the size of the compression dictionary and the length of a line (the V.42bis recommendation itself calls a line a "string"). A line is how the sending modem describes a particular sequence of symbols in the data stream from the computer; the modem replaces that sequence with a shorter codeword for transfer. Having received a codeword, the receiving modem converts it back into the corresponding line. The more symbols a line can hold, the longer the repeating fragments that can be replaced with codewords, and the more effective the compression. If the data contain few long repeating fragments, shorter codewords can still give good results. The set of lines in use, together with the codewords that correspond to them, makes up the dictionary; its size is measured in elements (lines). The structure of the dictionary is dynamic and changes during the transfer depending on which sequences occur most often. The larger the dictionary, the more lines can be used in compression. The physical resource that limits these V.42bis parameters in modems from different manufacturers is the amount of fast RAM.
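The role of the two parameters can be illustrated with a minimal sketch. V.42bis is built on an LZW-type dictionary algorithm; the code below is a simplified LZW coder and not the protocol itself. It leaves out the recycling of dictionary entries, the control codewords and the transparent mode, and the function names are made up for illustration.

```python
def lzw_compress(data: bytes, dict_size: int = 1024, max_string: int = 16) -> list[int]:
    """Encode data as codewords using an LZW dictionary with bounded size and line length."""
    # The dictionary starts with all 256 single-byte lines.
    table = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate          # keep extending the matched line
            continue
        codes.append(table[current])     # emit codeword for the longest match
        # Learn the new line only if the dictionary has room and the line is short enough.
        if len(table) < dict_size and len(candidate) <= max_string:
            table[candidate] = len(table)
        current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int], dict_size: int = 1024, max_string: int = 16) -> bytes:
    """Rebuild the data; the decoder grows its dictionary by the same rules as the encoder."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        # A code the decoder has not learned yet can only be "previous + its first byte".
        entry = table[code] if code in table else previous + previous[:1]
        candidate = previous + entry[:1]
        if len(table) < dict_size and len(candidate) <= max_string:
            table[len(table)] = candidate
        out.append(entry)
        previous = entry
    return b"".join(out)


if __name__ == "__main__":
    sample = b"abc" * 300
    small = lzw_compress(sample, dict_size=512, max_string=6)
    large = lzw_compress(sample, dict_size=4096, max_string=32)
    assert lzw_decompress(large, 4096, 32) == sample
    print(len(sample), len(small), len(large))  # input bytes, codewords (small), codewords (large)
```

On highly repetitive input the larger dictionary and longer line produce fewer codewords. On real data the gain depends on the content, which is what the tests below measure.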
Compression has its downside as well: if the data being sent are practically incompressible, an attempt to compress them only increases the volume sent. The V.42bis protocol checks whether compression is beneficial and, if it is not, switches to transparent mode. At the same time, frequent switching from one mode to the other reduces channel throughput. The switching criteria are set by manufacturers and may differ. The quality of the switching algorithm therefore affects the efficiency of data transfer.
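How the switching criteria influence the number of mode changes can be shown with a small sketch. The thresholds and the function name are hypothetical; real modems use vendor-specific criteria that are not documented here. The input is an estimated compression ratio (original size divided by compressed size) for each successive block of data.

```python
def plan_modes(ratios, to_transparent=0.95, to_compressed=1.05):
    """Return the mode chosen for each block and the total number of mode switches."""
    mode = "compressed"
    modes = []
    switches = 0
    for ratio in ratios:
        if mode == "compressed" and ratio < to_transparent:
            mode = "transparent"   # compression is making the data larger
            switches += 1
        elif mode == "transparent" and ratio > to_compressed:
            mode = "compressed"    # compression pays off again
            switches += 1
        modes.append(mode)
    return modes, switches


if __name__ == "__main__":
    borderline = [2.0, 0.98, 1.02, 0.98, 1.02]
    print(plan_modes(borderline))            # thresholds with a dead band
    print(plan_modes(borderline, 1.0, 1.0))  # a single threshold at 1.0
```

With a dead band between the two thresholds, data that hover around a ratio of 1 do not trigger repeated switching; with a single threshold every small fluctuation does.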
It would be wrong to say that we are testing the compression function alone. For the compression protocol to work, one more protocol is needed to control data reception and transfer. Unlike a data stream inside a computer, data sent over a telephone network can pick up noise. With plain streaming, any error in a modem transfer can badly corrupt the data stream. To avoid this, modems use a channel-level protocol, V.42, also known as an error correction protocol. It controls the integrity of the data transferred.
The functions of V.42 can be summarized as follows: data received from the computer (including data compressed with V.42bis) are divided into blocks of fixed length, or frames, and several such frames make up a window. The receiving modem takes the frames one by one and, in response to the last frame of a window, sends an acknowledgement. Having received it, the sending modem sends the next portion of data. If a frame is damaged in transit, the receiving modem requests retransmission of that frame. Building this logical structure of frames and windows costs some overhead. The amount of service data per frame is fixed, so the larger the frame, the smaller the share of overhead. The same applies to the number of frames in a window: with a 5-frame window and no errors, the modem has to acknowledge only every 5th successfully received frame.
Since the frame/window structure is implemented differently in different modems, the total overhead differs as well. Below we look at how large these differences are.
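The frame-size argument above can be put into numbers with a short sketch. The per-frame overhead of 6 octets is an assumed placeholder for flag, address, control and checksum fields. It is not a figure measured in these tests, and bit stuffing and higher-level protocol overhead are ignored.

```python
def frame_efficiency(frame_octets: int, overhead_octets: int = 6) -> float:
    """Share of transmitted octets that carry user data, for a fixed per-frame overhead."""
    return frame_octets / (frame_octets + overhead_octets)


if __name__ == "__main__":
    for size in (64, 128, 192, 256):
        print(size, round(frame_efficiency(size), 4))
    # Relative gain of a 256-octet frame over a 128-octet frame under the same assumption.
    print(frame_efficiency(256) / frame_efficiency(128))
```

Only the relative comparison between frame sizes is meaningful here; the absolute values depend entirely on the assumed overhead.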
As the basic set of test files we use the standard TSB-38 set. Here we test how the modems behave when receiving files.
Files are received from an FTP server of the provider MTU-Inform. The access server is a Cisco 5300 with MICA modems that support both V.42bis and V.44 (V.92/V.44 preview software). The line rate of the receiving modems is limited to 9600 bit/s, both to eliminate reception errors caused by possible line noise and to reduce the influence of COM-port bandwidth on the results.
The modem under test calls the host modem, and the test files are then transferred in terminal mode. We made 10 measurements with each file type for each modem. To exclude the influence of the error correction protocol, we counted only those attempts in which no frames were damaged in either direction. If the result of any attempt deviated from the average by more than 1%, the number of measurements was increased proportionally.
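The 1% acceptance rule can be expressed as a small check. This is only a sketch of the rule as described above: the function name is made up, and how many extra runs were added in each case is not specified in the text, so the code only reports whether more runs are needed.

```python
from statistics import mean


def needs_more_runs(cps_values, tolerance=0.01):
    """True if any measurement deviates from the average by more than the tolerance."""
    average = mean(cps_values)
    return any(abs(value - average) > tolerance * average for value in cps_values)


if __name__ == "__main__":
    steady = [1000, 1002, 998, 1001, 999, 1000, 1003, 997, 1000, 1000]
    noisy = [1000] * 9 + [1100]
    print(needs_more_runs(steady), needs_more_runs(noisy))
```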
In addition, we checked the effectiveness when sending a compound file: a special 661,440-byte test file obtained by replicating a block built from HTML text, pictures and JavaScript. An HTS Express served as the receiving modem, preset to an error correction frame of 128 octets and a dictionary/line size of 1024/16. This restriction levels out the physical differences between the channel-level protocol implementations in the tested modems, so that the differences in the compression algorithms show up first.
In the results of this optional part of the test we give the CPS value, the average number of switches between the compression and transparent modes, and the average modem compression ratio for the file transfer. The latter two parameters are based on statistics from the HTS Express modem.
Files included in the TSB-38.
Files used for measuring CPS are obtained by replication of standard files of five types:
Files are located at ftp://ftp.mtu.ru/pub/horgi/test/
How well each type compresses is shown in table 3 - "Comparison of efficiency of data compression with V.42bis, V.44 and usual pkzip".
File reception results.
The file transfer protocol (FTP) also adds some overhead and affects the CPS. On the one hand, for the best comparative estimate of the compression schemes (and, for debugging purposes, to check whether the value recommended by TSB-38 is reached) we should use a protocol with minimal overhead. On the other hand, considering how compression protocols are typically used, I find it reasonable to run the test in conditions as close as possible to everyday ones. This approach also gives the most objective and realistic picture.
Relative compression efficiency. The result of receiving files without compression is taken as 1:
The same results in a graphical form:
Absolute effective data reception rate, CPS:
* - the result without compression for the Omni 56K Pro and LT-Win modems.
The same results in a graphical form:
Comments on the results.
The V.44 results are moderate. It does outscore the others by a clear margin, but not by the 20-200% so often quoted in many sources. Comparing V.42bis and V.44 on the same modem, the LT Win (GM56PCI-L), we see a performance gain ranging from 38% (on uncompressed graphics files, which are rare on the web) down to 12% (on compound files, the closest in structure to typical web pages). When the "maximum" V.42bis scheme (Courier, 4096/32) is compared with V.44 (in the LT Win implementation), the gap narrows: 31% for exotic graphics formats, around 13% for pure text and less than 4% for compound data.
The comparison of different V.42bis schemes is more interesting: depending on the data structure, the spread between the "minimal" (1024/16) and the "maximum" (4096/32) scheme ranges from 42% for text files to 13% for compound data. Note, however, that the spread is much smaller for uncompressed graphics, only around 5%, and that for exe files the minimal V.42bis scheme actually wins by more than 9% in compression efficiency.
Another peculiarity concerns two ZyXEL models, the U-336E and the Omni 56K Pro. Despite a dictionary twice as large and a line twice as long, the senior model wins only on text files.
Now for the classic question: "is it possible to compress to a greater degree?" Of course it is, if instead of modem compression (limited by processing power and memory) we use conventional archivers.
Comparison of the efficiency of data compression with the V.42bis, V.44 and usual pkzip.
You can see that pkzip is much more effective than any modem compression, and that the gap between pkzip and V.44 is much larger than the gap between V.44 and V.42bis.
Today you can often come across links to www.v92.com. Those who have already seen the results there may be surprised: on their graphs the V.44 compression results lie midway between V.42bis and Winzip. But this seems strange only at first sight. While preparing the test results and comparing them with other sources, we calculated the compression ratio achieved by the archivers in this test and in the materials at www.v92.com. For a text file, the difference between pkzip and the Winzip of www.v92.com was more than 18% in favor of pkzip, and for the graphics file it was 49%. Why the archiver at www.v92.com compresses files so poorly is for our readers to decide.
So the principles of site structure and the general rules for preparing files for sending do not change even with the arrival of new modem compression protocols: everything compressible should be compressed in advance. Traditional archivers give a great benefit for exe files. For graphics it is likewise preferable to use formats without redundancy - gif, jpg and others.
Data transfer results of different modems.
This test was run for the V.42bis protocol, which is common to all the modems in question. In addition, we examined one of the junior USR models - the OEM modem 2977.
All participants supported a compression dictionary of 1024 elements and a line 16 symbols long. All of them worked with an error correction frame of 128 octets and a 15-frame window, except the Omni 56K Pro (a 16-frame window) and the LT Win (an 8-frame window).
You can see that, despite considerable differences in the mode-switching schemes, the maximum gap in effective speed is only 2%. Notably, this difference occurs between models from the same company(!), although the number of switches between the compression and transparent modes is almost the same for both modems. One possible reason lies in the criteria for mode switching. Nevertheless, I cannot single out a leader in this test.
What else affects the effective speed.
Experience shows that the data compression scheme used in the modem has the greatest effect on the results: for V.42bis this means different dictionary/line sizes, for V.44 a different compression algorithm. In most cases (text and compound data), the larger the dictionary, the more effective the compression. Most graphics formats on the Net are already-compressed images, and programs are either self-extracting archives or short setup.*** files. That is why you should pay more attention to the first and the last data types in the results.
But the results are also affected by other subjective and objective factors.
The influence of the main objective parameter (the frame size of the error correction protocol) is easy to estimate with the formula:
This formula does not account for overhead above the modem level. Although it cannot give an absolute result for a frame of a given size, it still lets us compare the results for frames of different sizes.
Below are the V.42 frame sizes used by the different modems (the second figure is the maximum window size):
Substituting the data from the table into the formula explains the 2% difference between particular modems in the results of table 2 (when receiving archived files).
The V.42 window size also influences the effective reception speed. But the difference between structures is harder to calculate: there is no ready formula, and the delays of acknowledgement frames depend both on channel conditions (signal propagation time) and on subjective factors: delays in the remote modem (in receiving acknowledgements and responding to them), the number of errors at the V.42 level, the distribution of these errors, etc. The larger the window, the fewer the delays.
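Although no exact formula is available, the standard sliding-window argument gives a rough upper bound: the sender cannot have more than one window of unacknowledged data in flight per round trip. The sketch below assumes an error-free link and a known round-trip time; the function name and the sample round-trip value are illustrative, and per-frame overhead is ignored.

```python
def window_limited_rate(line_bps: float, frame_octets: int,
                        window_frames: int, round_trip_s: float) -> float:
    """Upper bound on throughput in bit/s: the line rate or one window per round trip."""
    window_bits = frame_octets * 8 * window_frames
    return min(line_bps, window_bits / round_trip_s)


if __name__ == "__main__":
    # At 9600 bit/s a 15-frame window of 128-octet frames is never the bottleneck
    # for a 0.3 s round trip (an assumed value).
    print(window_limited_rate(9600, 128, 15, 0.3))
    # At a higher line rate a 4-frame window can become the limiting factor.
    print(window_limited_rate(33600, 128, 4, 0.3))
```

Under these assumptions the window matters only when one window of data takes less time to send than the round trip, which is more likely at high line rates and with small windows.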
And now let's return to the promised explanation of the discrepancy: the V.42 structure (namely, the window size) in the ZyXEL Omni 56K Pro modem. The table shows a frame size of 256 octets and a window of 8 frames. This structure is used by default. On the other hand, nothing prevents the fast RAM from being used differently. We made use of this in the optional compression test when sending data: there the Omni 56K Pro worked with a 128-octet frame and a 16-frame window. The Omni 56K Pro distributes its resources in the most efficient way. If the remote modem supports a 256-octet frame, the Omni takes this opportunity and saves on the window size. This is reasonable, since frame size matters more than window size. If the remote modem has a 128/15 (frame/window) structure, the Omni agrees to these values and rearranges its memory to match the remote modem's capabilities. Non-standard values are also possible: for example, with a 192-octet frame the window may be 10 frames.
The same idea is implemented in some provider-side modems. Among modems for end users, only the Omni 56K Pro can boast of it.
Apart from the factors whose influence can be estimated, there are some subjective factors that affect the effective reception speed but whose influence was not estimated here:
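The three frame/window pairs quoted above (256/8, 128/16 and 192/10) are all consistent with one fixed buffer divided into frames. The sketch below assumes a 2048-octet buffer, a figure inferred from 256 x 8 and 128 x 16 rather than stated by the manufacturer. The function names and the negotiation rule are likewise illustrative assumptions, not ZyXEL's documented algorithm.

```python
BUFFER_OCTETS = 2048   # assumed: inferred from 256 x 8 = 128 x 16 = 2048
MAX_FRAME_OCTETS = 256


def window_for_frame(frame_octets: int, buffer_octets: int = BUFFER_OCTETS) -> int:
    """Largest whole number of frames that fit into the fixed buffer."""
    return buffer_octets // frame_octets


def negotiate(remote_frame: int, remote_window: int) -> tuple[int, int]:
    """Pick the largest frame both sides support, then the largest window that fits."""
    frame = min(MAX_FRAME_OCTETS, remote_frame)
    window = min(window_for_frame(frame), remote_window)
    return frame, window


if __name__ == "__main__":
    for frame in (256, 192, 128):
        print(frame, window_for_frame(frame))
    print(negotiate(128, 15))  # remote modem with a 128/15 structure
```

Preferring the larger frame before the larger window follows the reasoning in the text that frame size has the stronger effect on throughput.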
Performance headroom of the modem's controller at high speeds. In this test the modems worked at a low speed to obtain as "pure" a result as possible. But a modem's controller may fail to keep up with preparing data for sending and reception. This problem will hardly occur at low line rates, but at high speeds (or with simultaneous reception and transmission) some modems may not cope with the increased data flow, and the estimated effective speed will not be reached.
Trellis coding scheme used in V.34. Today two schemes are widely used: the 16-state and the 64-state one. The 64-state scheme gives V.34 some advantage in noise immunity. At the same time, raising the number of coder states to 64 lengthens the data processing delays in the modem. These delays are not large, but it seems that precisely because of this many modems on the Conexant chipset contain an algorithm that limits the window to 4 frames for data transfer if the remote modem uses the 64-state trellis coding scheme and the SREJ mode (selective re-request of damaged frames) is not enabled in the local modem. It looks like a precaution, but no in-depth studies of this subject exist so far.
1. The speed increase from V.44 claimed by some manufacturers does not match reality, at least within our tests. This is especially noticeable on text and compound data, which web pages mostly consist of. Perhaps we should make allowance for the preview software the provider runs.
2. At the same time, we cannot deny that the V.44 protocol is more effective than V.42bis for every type of data.
3. The statement that "the V.44 algorithm provide a better data compression up to 6:1 against the maximum possible 4:1 for the V.42Bis" is wrong: it agrees with experience neither in the figures (a 50% difference in the protocols' efficiency) nor in the estimate of the maximum limit for the protocols in question. Comparative figures are meaningful only for a particular type of data.
4. With V.42bis, using compression structures of the maximum size (where supported, as in the new USR Courier model) can be beneficial in some cases.
5. Although different modems transfer data differently, this is hardly worth taking into account - the spread in the results is too small.
6. The estimated difference between compression schemes in data reception does not let us name clear leaders or outsiders. On the one hand, the spread on compound data, the most interesting kind for the Internet, is quite moderate. On the other hand, it is unreasonable to judge a modem or make a choice based on this test alone, without taking tests of the physical-level protocols into account.
7. From an end user's standpoint, what matters first of all is the implementation quality of the traditional physical-level protocols (V.34, V.90); the quality and reliability of the modem's operation at this level; the thoroughness of the error correction implementation; and how well the latter is matched with the physical-level protocols.
8. More flexible and effective implementations of V.44 may appear in future (recall what the Omni 56K Pro does for V.42).
9. Modem compression is less effective than traditional compression methods. But do not forget the alternatives: software data compression, both at the PPP level and at higher levels of the OSI model.