Sound: FAQ

By Grigory (* and **) and Maksim Lyadov (***)

<Q>: How to read this FAQ?

For your convenience, the answer to each question is given at three levels, according to the complexity of the material.

<A*>: For a user who knows little and doesn't want to dig into complicated terms.

<A**>: For a user who has basic knowledge of handling equipment, including a computer, and wants to know everything.

<A***>: For a user who thinks he knows a lot and likes to get to the root of things.

<Q>: What is sound?

<A*>: Sound is everything we hear with our ears.

<A**>: Sound is invisible waves that spread through the air and are caused by the oscillations of some source. We perceive them through nerve endings in our ears.

<A***>: An acoustic wave is a physical phenomenon that occurs in substances in different states of aggregation. Such waves propagate at a finite speed that characterizes the compressibility of the medium. In the general case, the propagation speed of small disturbances is $a = \sqrt{\partial p / \partial \rho}$, where $p$ is the pressure and $\rho$ is the density. For adiabatic (isentropic) processes in an ideal gas, $a = \sqrt{k p / \rho}$, where k is the adiabatic exponent. In each elementary volume the excess pressure oscillates. The energy of an acoustic wave is characterized by the acoustic pressure and the acoustic intensity. Acoustic waves have all the usual wave properties, as shown by the interference and diffraction that occur when they propagate.
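As a rough numerical check of the adiabatic formula, here is a minimal Python sketch. The pressure, density and k = 1.4 are assumed textbook values for dry air at about 20 °C; they are not taken from this FAQ.

```python
import math

def speed_of_sound(pressure_pa: float, density_kg_m3: float, k: float = 1.4) -> float:
    """Speed of small disturbances for an adiabatic process: a = sqrt(k * p / rho)."""
    return math.sqrt(k * pressure_pa / density_kg_m3)

# Assumed values for dry air near 20 degrees C at sea-level pressure.
a_air = speed_of_sound(pressure_pa=101_325.0, density_kg_m3=1.204)
print(f"a = {a_air:.0f} m/s")
```

With these inputs the result comes out at roughly 343 m/s, the commonly quoted speed of sound in air.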

<Q>: What is volume level?

<A*>: When you turn the knob marked "volume" on your tape recorder or TV set up or down, you change the volume level.

<A**>: Volume is the perceived intensity of a sound. It is measured in decibels [dB], a relative unit that shows how much the volume has increased or decreased compared with a reference. Taking barely audible sounds as zero, we get the following table:

Sound	Volume level, dB
Hearing threshold	0
Whisper	20
Usual speech	50
Street noise	80
Plane take-off	120

<A***>: Perceived volume is estimated via its level: $L = 10 \log_{10} (I / I_0)$ [dB], where $I_0$ is the reference intensity. According to the Weber-Fechner psychophysical law, this value is directly proportional to a human's subjective perception of a change in volume. Here $I = p^2 / (\rho a)$ is the acoustic intensity, $\rho$ is the density, and $a$ is the acoustic speed. In most cases, however, the volume level is measured via the sound pressure: $L = 20 \log_{10} (p / p_0)$. L < 0 means the sound is weakened, L > 0 means it is intensified.
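The two level definitions can be compared numerically. The sketch below assumes the conventional reference values p0 = 20 µPa and I0 = 1e-12 W/m^2 and typical air properties. None of these numbers come from this FAQ, so treat them as illustrative.

```python
import math

P0 = 20e-6   # Pa, conventional reference sound pressure (assumption)
I0 = 1e-12   # W/m^2, conventional reference intensity (assumption)

def level_from_intensity(intensity: float) -> float:
    """L = 10 * log10(I / I0), in dB."""
    return 10 * math.log10(intensity / I0)

def level_from_pressure(pressure: float) -> float:
    """L = 20 * log10(p / p0), in dB."""
    return 20 * math.log10(pressure / P0)

def intensity_from_pressure(pressure: float, rho: float = 1.204, a: float = 343.0) -> float:
    """I = p^2 / (rho * a) for a plane wave; rho and a are assumed values for air."""
    return pressure ** 2 / (rho * a)

p = 0.02  # Pa, an arbitrary example pressure
print(level_from_pressure(p))
print(level_from_intensity(intensity_from_pressure(p)))
```

Both routes give about 60 dB for this example. The small difference between them comes from the assumed air properties and reference values.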

<Q>: What is the pitch of a tone?

<A*>: Birdsong is an example of a high-pitched sound. People talking is a middle-pitched sound, and a bear's growl is a low-pitched sound.

<A**>: The more oscillations per second, the higher the sound. The number of oscillations per second is called the frequency and is measured in hertz [Hz].

<A***>: Let's look at a curve of oscillations U(t). The maximum deviation of the voltage from its average value is the amplitude of the signal, A. The time span between two consecutive oscillations is called the period, T. The frequency is the reciprocal of the period: $f = 1/T$.

[Figure: frequency scale showing infrasound, the range of audible frequencies, and ultrasound]
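The relation between period and frequency can be tried on a sampled signal. This is a minimal sketch: the 440 Hz test tone, the 48 kHz sample rate and the zero-crossing method are all choices made for the illustration.

```python
import math

def estimate_period(samples, sample_rate):
    """Average time between consecutive upward zero crossings, in seconds."""
    crossings = [i for i in range(1, len(samples))
                 if samples[i - 1] < 0 <= samples[i]]
    if len(crossings) < 2:
        raise ValueError("need at least two full oscillations")
    mean_samples = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    return mean_samples / sample_rate

sample_rate = 48_000          # samples per second (assumed)
f_true = 440.0                # Hz, test tone (assumed)
samples = [math.sin(2 * math.pi * f_true * n / sample_rate)
           for n in range(sample_rate)]   # one second of signal

T = estimate_period(samples, sample_rate)
f = 1 / T                     # frequency is the reciprocal of the period
print(f"T = {T * 1000:.3f} ms, f = {f:.1f} Hz")
```

The estimate lands very close to 440 Hz. A real recording with noise or strong harmonics would need a more robust method.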

<Q>: What is timbre?

<A*>: This is what distinguishes one person's voice from another's.

<A**>: Take a note of the same pitch played on two different musical instruments, a trumpet and a piano. The two sounds will still differ in a number of characteristics. Together these characteristics are called timbre.

When we turn the volume knob, we feel that the timbre changes as well. Some devices have a "tone corrector" button, which adjusts the perceived volume of different frequencies according to the psychophysical peculiarities of hearing.

We often come across the concept of a timbre regulator, and an equalizer in particular. Here the term has a slightly different meaning: a timbre regulator or an equalizer adjusts the volume of different frequency components separately.

<A***>: Let's consider the waveforms of two musical instruments, a trumpet and a piano:

They were obtained by re-recording the note A of the first octave through the codec in a WAV editor. The note was played by a SoundBlaster Live! sound card with an integrated 8 MB memory bank (GM instrument #56 Trumpet and GM instrument #0 Acoustic Grand Piano). The period of the fundamental oscillation determines the pitch of the tone, and the shape of the waveform determines the timbre.
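To illustrate "same period, different shape", the sketch below builds two synthetic tones from a fundamental plus a few harmonics. The harmonic amplitudes are invented for the example and are not measurements of a trumpet or a piano.

```python
import math

def tone(harmonic_amplitudes, f0, sample_rate, n_samples):
    """Sum of sine harmonics; harmonic_amplitudes[0] belongs to the fundamental f0."""
    return [
        sum(a * math.sin(2 * math.pi * f0 * (k + 1) * n / sample_rate)
            for k, a in enumerate(harmonic_amplitudes))
        for n in range(n_samples)
    ]

sample_rate = 44_000   # chosen so that one 440 Hz period is exactly 100 samples
f0 = 440.0
period = round(sample_rate / f0)

# Hypothetical spectra: one rich in harmonics, one dominated by the fundamental.
bright = tone([1.0, 0.8, 0.6, 0.4], f0, sample_rate, 2 * period)
mellow = tone([1.0, 0.3, 0.1], f0, sample_rate, 2 * period)

same_period = all(abs(bright[n] - bright[n + period]) < 1e-9 for n in range(period))
shape_difference = max(abs(b - m) for b, m in zip(bright, mellow))
print(same_period, round(shape_difference, 3))
```

Both tones repeat every 100 samples, so they have the same pitch, but their waveforms differ noticeably. That difference in shape is what the ear perceives as timbre.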

<Q>: What path does sound travel?

<A*>: First, a singer sings something into a microphone in a recording studio. After that the sound is processed and recorded on a CD. Having bought the CD, you can listen to what is recorded on it.

<A**>: In a microphone, acoustic waves are converted into an electric signal. In electric musical instruments the sound is represented by a modulated voltage or current, and in computers it is produced directly in digital form (sampler technologies). The signal then passes through several devices (compressor, limiter, equalizer, reverberator), either hardware or virtual. After that, all the digital sounds in a modern studio are mixed down into one sound file, which is prepared and recorded onto a CD-DA. When the disc is played on a home Hi-Fi CD player, the digital signal is converted into an analog one by a D/A converter, amplified, and fed to the acoustic systems, which convert the electric signal back into acoustic oscillations. This entire path is called the sound track.

Note that the sound quality may change as the signal passes through all these components. How much it changes depends on every link in the chain. For example, when buying speakers we prefer the ones that sound clearer, judging by ear. There are also standard measures of sound degradation (FS, SNR, THD, etc.).

<A***>: A computer contains the processing and the reproducing parts of the sound track. PCM (pulse code modulation) is currently the highest-quality format for coding audio data. On a PC it is most often stored in .wav files. However, the .wav extension does not always mean PCM: the file may instead contain data in MPEG Layer 3 (MP3) format.
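One way to see what a .wav file really contains is to read the format tag from its "fmt " chunk. The sketch below is a minimal reader, not a full RIFF parser. The helper name `wav_format_tag` is made up for this example. The tag values 0x0001 (PCM) and 0x0055 (MPEG Layer 3) are the registered WAVE format codes.

```python
import io
import struct
import wave

FORMAT_NAMES = {0x0001: "PCM", 0x0055: "MPEG Layer 3"}

def wav_format_tag(stream):
    """Return the wFormatTag value from the 'fmt ' chunk of a RIFF/WAVE stream."""
    riff, _size, wave_id = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or wave_id != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    while True:
        header = stream.read(8)
        if len(header) < 8:
            raise ValueError("no 'fmt ' chunk found")
        chunk_id, chunk_size = struct.unpack("<4sI", header)
        if chunk_id == b"fmt ":
            (tag,) = struct.unpack("<H", stream.read(2))
            return tag
        # Skip other chunks; chunks are padded to an even number of bytes.
        stream.seek(chunk_size + (chunk_size & 1), io.SEEK_CUR)

# Demo: write a tiny PCM file into memory with the standard wave module.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)        # 16-bit samples
    w.setframerate(44_100)
    w.writeframes(b"\x00\x00" * 10)
buffer.seek(0)

tag = wav_format_tag(buffer)
print(FORMAT_NAMES.get(tag, hex(tag)))
```

For a file on disk, the same function can be called on a stream opened with `open(path, "rb")`.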

<Q>: What is Frequency Response (FS)?

<A*>: It refers to some figures, e.g. 20-20000, that you can find on the last page of a user's manual.

<A**>: When looking at an FS curve, pay attention not to the minimum and maximum reproduced frequencies but to how uneven the curve is. Strong unevenness distorts the timbre badly. The graph should be as flat as possible, without sharp rises and falls. At high frequencies, a fall makes the sound dull and unclear, while a rise adds an unpleasant hissing noise. At low frequencies, a fall robs the sound of "richness", and a rise produces a kind of buzzing.

In high-quality acoustic systems the unevenness of the FS within the working range is no more than ±1 dB. For computer speakers, ±10 dB is quite acceptable.
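Unevenness can be estimated from a handful of measured points. In the sketch below the measurement points are made up, and "unevenness" is taken as half of the peak-to-peak spread inside the chosen range. That is only one reasonable reading of the ± figures above.

```python
def unevenness_db(response, f_low, f_high):
    """Half the peak-to-peak spread of the levels inside [f_low, f_high], in dB."""
    levels = [level for freq, level in response if f_low <= freq <= f_high]
    if not levels:
        raise ValueError("no measurement points in the requested range")
    return (max(levels) - min(levels)) / 2

# Made-up measurement points: (frequency in Hz, relative level in dB).
response = [
    (50, -12.0), (100, -3.0), (1_000, 0.0),
    (5_000, 2.0), (10_000, -2.0), (20_000, -15.0),
]

working = unevenness_db(response, 100, 10_000)   # the range where the speaker behaves
claimed = unevenness_db(response, 20, 20_000)    # the full range from the data sheet
print(f"+/-{working} dB in 100-10000 Hz, +/-{claimed} dB in 20-20000 Hz")
```

For these invented points the working range comes out at ±2.5 dB, while the full 20-20000 Hz range is ±8.5 dB. This shows why the range alone says little without the unevenness figure.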

<A***>: Let's consider a typical FS of a cheap plastic speaker (frequency is plotted along OX on a logarithmic scale, relative amplitude along OY):

You can see that the acoustic system has minimal distortion in the frequency range from 100 to 10 000 Hz. Human speech lies in the range from 80 to 10 000 Hz, and the range of a philharmonic orchestra is from 30 to 20 000 Hz. So this acoustic system is good only for listening to human speech. Of course you can listen to a philharmonic orchestra on it, but it won't sound natural.

Since the amplitude of a signal measured on a logarithmic scale is a relative value, the 0 on the OY axis can be placed anywhere, for example at -80 dB (relative to the 0 on this graph). After that you can easily write in the data sheet that the acoustic system reproduces a 20-20000 Hz frequency range, and this will really be true. The +90 dB unevenness, though, would be difficult to explain, which is why it is not specified!

<Q>: What is THD?

<A*>: A scary abbreviation that just stands for some figures. Don't be afraid, just enjoy the music.

<A**>: THD (total harmonic distortion) is an estimate of harmonic distortion. It is an averaged factor that does not unambiguously define the sound quality, which means that equipment with the same THD can sound different. Hi-Fi (high fidelity) implies the following: the lower the distortion, the better the sound. In Hi-Fi systems THD must not exceed 1.5% (at 1000 Hz).

<A***>: It is an integral factor that characterizes the harmonic distortion of a given system. To measure it, a test signal (as a rule, a 1 kHz sinusoid) is applied to the system, and the measured output is filtered in order to measure all the additional harmonics that appear due to the nonlinearity of the system. Usually the power of the second and third harmonics is measured, since they have the greatest effect. To convert from percent to decibels, use the following formula:

X [dB] = 20 log (X [%] / 100)
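The conversion is easy to check in Python. The sketch below also shows one common way to combine harmonic levels into a THD figure (RMS sum of the harmonics relative to the fundamental). The harmonic levels are hypothetical, and other definitions of THD exist.

```python
import math

def thd_percent(fundamental_rms, harmonic_rms):
    """THD as the RMS sum of the harmonics relative to the fundamental, in percent."""
    return 100 * math.sqrt(sum(h * h for h in harmonic_rms)) / fundamental_rms

def percent_to_db(x_percent):
    """X [dB] = 20 * log10(X [%] / 100)."""
    return 20 * math.log10(x_percent / 100)

# Hypothetical levels of the second and third harmonics for a 1.0 V fundamental.
thd = thd_percent(1.0, [0.006, 0.008])
print(f"THD = {thd:.2f} % = {percent_to_db(thd):.1f} dB")
```

With these made-up harmonics the THD is about 1%, which corresponds to about -40 dB. By the same formula, 0.1% is -60 dB.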

<Q>: What is noise (SNR)?

<A*>: Noise is when pshhhhhh, and it's no good. The less pshhhhh, the better.

<A**>: Noise is a random, low-volume signal that is added to the base (original) signal.

The signal-to-noise ratio (SNR) shows by how much the signal level exceeds the noise level. Noise can be broken down by frequency. It is most audible in the middle of the frequency spectrum. The most unpleasant noise is the kind that is distributed evenly over all frequencies (white noise).

However, noise can be filtered out, which is why it is not as unpleasant as distortion (see THD). SNR is measured in dB.
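As a small illustration, SNR can be computed from the RMS levels of a signal and of the noise added to it. The test tone, the noise level and the use of Gaussian noise are assumptions made for this sketch.

```python
import math
import random

def rms(samples):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def snr_db(signal, noise):
    """SNR in dB as the ratio of RMS levels: 20 * log10(rms(signal) / rms(noise))."""
    return 20 * math.log10(rms(signal) / rms(noise))

random.seed(0)                       # fixed seed so the run is repeatable
n = 48_000
signal = [math.sin(2 * math.pi * 1000 * i / n) for i in range(n)]   # full-scale 1 kHz sine
noise = [random.gauss(0.0, 0.001) for _ in range(n)]                # low-level random noise

snr = snr_db(signal, noise)
print(f"SNR = {snr:.1f} dB")
```

With these settings the SNR is about 57 dB.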

<A***>: Here is a rough guide to typical SNR values:

10-20 dB	Telephone
20-50 dB	Speakers for a player
50-60 dB	Portable radio sets, 8-bit sound cards
60-80 dB	Hi-Fi equipment
80-100 dB	Studio and Hi-End equipment

There is some confusion around the signal/noise concept. Instead of the SNR, manufacturers like to specify a slightly different figure: zero-signal noise. Why is this bad? It is quite simple for a manufacturer to build a so-called "gate" into the equipment. For example, when the input signal falls to -80 dB a switch trips, and the noise level drops to an unrealistically low value. That is why you can see specifications such as 96-97 dB SNR on cheap equipment. When a low-level signal is applied, this characteristic degrades sharply, by 20 (sometimes 30!) dB.

<Q>: THD+N

<A*>: In general, the higher the THD+N, the worse the quality.

<A**>: This factor combines the two previous ones and is intended for estimating the noise level and the harmonic distortion coefficient at the same time.

<A***>: THD+N is the most successful factor for digital equipment, since it allows the best signal level to be chosen for SNR and THD separately.
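A minimal sketch of how harmonics and noise can be folded into one figure. It uses one common convention, the RMS sum of everything except the fundamental relative to the fundamental. Some instruments divide by the total signal instead, and the input levels here are hypothetical.

```python
import math

def thd_plus_n_percent(fundamental_rms, harmonic_rms, noise_rms):
    """RMS sum of harmonics and noise relative to the fundamental, in percent."""
    residual = math.sqrt(sum(h * h for h in harmonic_rms) + noise_rms ** 2)
    return 100 * residual / fundamental_rms

# Hypothetical levels: one 0.3 % harmonic plus 0.4 % noise.
value = thd_plus_n_percent(1.0, [0.003], 0.004)
print(f"THD+N = {value:.2f} %")
```

Because the components add as powers, 0.3% of distortion and 0.4% of noise give about 0.5% THD+N, not 0.7%.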

<Q>: Power

<A*>: Power is not volume.

<A**>: The power specified by a manufacturer makes little sense as a criterion when choosing equipment. You can say of an acoustic system that its power is 10 W, or that its power is 1000 W, and neither value will be wrong. In the first case the power may be specified as "RMS" and in the second as "PMPO". If you want to compare two devices by their power ratings, pay attention to the distortion level (THD) at which the power is measured. For example, a set of 300 W RMS speakers at 10% THD is less preferable and may sound worse than 50 W RMS speakers at 0.1% THD.
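To see why power is not volume, it helps to express a power ratio in decibels. The sketch assumes everything else is equal (same speaker sensitivity, no distortion), which real equipment does not guarantee.

```python
import math

def power_ratio_db(p1_watts, p2_watts):
    """Level difference in dB between two powers: 10 * log10(P1 / P2)."""
    return 10 * math.log10(p1_watts / p2_watts)

doubling = power_ratio_db(20, 10)      # twice the power
big_vs_small = power_ratio_db(300, 50) # the 300 W vs 50 W example from the text
print(f"{doubling:.2f} dB, {big_vs_small:.2f} dB")
```

Doubling the power adds only about 3 dB, and 300 W against 50 W is under 8 dB. A sixfold difference in watts is therefore a fairly modest difference in level.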

<A***>: For more details see "Standards describing power in phonics".

<Q>: Dynamic range (DR)

<A*>: The difference between the quietest and the loudest sounds.

<A**>: For audio equipment, it is the reserve of sound dynamics between the noise threshold and the point where the acoustic systems and the amplifier start to overload. To reduce the DR and make music and speech easier to reproduce on cheap equipment, so-called sound compression is used (not to be confused with compressing the size of an audio file). That is why pop and rock sound quite decent even on cheap home equipment and computer speakers (the DR of such recordings is rather narrow, no more than 10-15 dB). For classical music the DR is much wider, about 50 dB, so the requirements for the whole sound track are much higher.
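The idea of sound compression can be sketched as a static level curve: below a threshold nothing changes, and above it the level grows more slowly. The threshold and ratio below are arbitrary illustration values. Real compressors also have attack and release times, which are left out here.

```python
def compress_level_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Static compressor curve: levels above the threshold rise `ratio` times slower."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio

levels = [-50.0, -30.0, -20.0, -10.0, 0.0]          # input levels in dB
compressed = [compress_level_db(l) for l in levels]

dr_before = max(levels) - min(levels)
dr_after = max(compressed) - min(compressed)
print(compressed)
print(f"DR before: {dr_before} dB, after: {dr_after} dB")
```

In this example a 50 dB range shrinks to 35 dB. After make-up gain, the quiet passages end up closer to the loud ones, which is easier for cheap equipment to reproduce.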

<A***>: For digital equipment it is the maximum SNR, where the noise is taken to consist of quantization noise in theory, and of the digital noise floor from dithering plus subharmonic distortion (noise floor + harmonic distortion) in practice. For an acoustic system it is the sensitivity, [dB/W*m]. For amplifiers it is, roughly, the linear part of the amplification curve.
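The theoretical limit set by quantization noise can be estimated with the standard textbook formula for a full-scale sine wave and ideal uniform quantization without dither, roughly 6.02 dB per bit plus 1.76 dB. This is an idealized sketch, and real converters come out below it.

```python
import math

def ideal_dynamic_range_db(bits):
    """Ideal SNR of an N-bit quantizer for a full-scale sine: 20*log10(2^N) + 10*log10(1.5)."""
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)

for bits in (8, 16, 24):
    print(f"{bits}-bit: {ideal_dynamic_range_db(bits):.1f} dB")
```

For 16 bits this gives about 98 dB, and for 8 bits about 50 dB.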
