Rus Articles Journal

Acoustic sagas: how was the MP3 format created?

The successful recording of digital sound on CDA (Compact Disc Audio) announced to the world the beginning of a new era in sound recording. In 1982, mass production of compact discs began in Langenhagen, near Hanover. The rapid development of computer hardware and digital technology that started a little later created a need to compress digital sound. There were plenty of reasons for this: saving disk space, speeding up digital data transmission, and the need for a sound format convenient for use in software.

In 1987 the German Fraunhofer Institute (Fraunhofer-Institut für Integrierte Schaltungen) began comprehensive research into the problem of coding digital sound. The institute holds the patent for MP3 technology. The father of the format is considered to be Karlheinz Brandenburg, a mathematician and electronics specialist who had been studying compression methods since 1977. In 1989, when the patent for the format was obtained, not a single MP3 file existed yet. In 1993 MP3 files were recognized as conforming to the international MPEG-1 standard.

What principles underlie the coding of digital sound in this format? The original sound file is divided into frames, each lasting a few hundredths of a second (an MPEG-1 Layer III frame holds 1152 samples, about 0.026 s at a 44.1 kHz sampling rate). Each frame is then analyzed. All frequencies lying outside the range of human hearing are discarded. In addition, oscillations whose amplitude is too high or too low are discarded. The upper bound of human perception is taken to be 96 dB. The lower bound depends strongly on the frequency of the sound: high and low frequencies have a higher amplitude threshold of perception.
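The framing step can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the frame size of 1152 samples is an assumption taken from MPEG-1 Layer III rather than from the text above.

```python
SAMPLES_PER_FRAME = 1152  # assumed frame size (MPEG-1 Layer III)


def split_into_frames(samples, frame_size=SAMPLES_PER_FRAME):
    """Cut a sequence of samples into consecutive frames, zero-padding the last one."""
    frames = []
    for start in range(0, len(samples), frame_size):
        frame = list(samples[start:start + frame_size])
        frame.extend([0] * (frame_size - len(frame)))  # pad the final frame
        frames.append(frame)
    return frames


def frame_duration(sample_rate, frame_size=SAMPLES_PER_FRAME):
    """Duration of one frame in seconds."""
    return frame_size / sample_rate


if __name__ == "__main__":
    one_second = [0] * 44100  # one second of silence at 44.1 kHz
    frames = split_into_frames(one_second)
    print(len(frames), "frames,", round(frame_duration(44100), 4), "s each")
```

Every later stage then works on one such frame at a time.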

At the first stage, the signal in each frame is represented by means of a Fourier transform as a sum of sinusoids of various amplitudes and frequencies (graphically, a sound of a single frequency is a sinusoid). The amplitudes and frequencies that enter the resulting formula are stored in memory.
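To make the first stage concrete, here is a minimal sketch of a plain discrete Fourier transform applied to one frame. It is a simplification under stated assumptions: real MP3 encoders use a filter bank and a modified discrete cosine transform rather than this naive formula, and the function names, the 8000 Hz sampling rate and the 256-sample frame are chosen only to keep the example small.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Return the magnitude of each frequency bin from 0 up to the Nyquist limit."""
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = 0j
        for t, sample in enumerate(frame):
            total += sample * cmath.exp(-2j * math.pi * k * t / n)
        magnitudes.append(abs(total))
    return magnitudes


def dominant_frequency(frame, sample_rate):
    """Frequency in Hz of the strongest bin in the frame."""
    magnitudes = dft_magnitudes(frame)
    peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return peak_bin * sample_rate / len(frame)


if __name__ == "__main__":
    rate = 8000  # Hz, illustrative value
    size = 256   # samples per frame, illustrative value
    tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(size)]
    print(dominant_frequency(tone, rate))
```

For a pure 1000 Hz sinusoid, almost all of the energy ends up in a single bin, which is exactly the "sum of sinusoids" picture: the frame is described by a short list of amplitudes and frequencies instead of raw samples.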

The second stage of processing relies on a psychoacoustic model of how the human ear perceives sound. For example, minor consecutive changes in frequency are discarded: a 5000 Hz signal followed by a 5100 Hz signal is recorded as a single 5000 Hz signal with the combined duration. Another property of the ear gives rise to frequency masking: a sound at a given frequency masks other sounds of close frequency but smaller amplitude, and those are discarded. The lag in the ear's perception means that, say, for some time after a loud clap a high-frequency, low-amplitude sound is simply not heard. Such sounds are removed by the filter too.
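Two of these rules can be illustrated with a toy model. The sketch below is not the psychoacoustic model of any real encoder: the function names and the thresholds (100 Hz tolerance, 200 Hz masking distance, 10 % amplitude ratio) are hypothetical values picked for the example.

```python
def merge_similar_tones(tones, tolerance_hz=100.0):
    """Merge consecutive tones whose frequencies differ by at most tolerance_hz.

    tones: list of (frequency_hz, duration_s) pairs in playback order.
    The merged tone keeps the first frequency and the summed duration.
    """
    merged = []
    for freq, duration in tones:
        if merged and abs(freq - merged[-1][0]) <= tolerance_hz:
            merged[-1] = (merged[-1][0], merged[-1][1] + duration)
        else:
            merged.append((freq, duration))
    return merged


def apply_frequency_masking(components, max_distance_hz=200.0, min_ratio=0.1):
    """Drop components that lie close to a much louder neighbour.

    components: list of (frequency_hz, amplitude) pairs sounding together.
    A component is discarded when some other component within
    max_distance_hz has more than 1 / min_ratio times its amplitude.
    """
    kept = []
    for freq, amp in components:
        masked = any(
            abs(freq - other_freq) <= max_distance_hz and amp < min_ratio * other_amp
            for other_freq, other_amp in components
        )
        if not masked:
            kept.append((freq, amp))
    return kept


if __name__ == "__main__":
    print(merge_similar_tones([(5000, 0.25), (5100, 0.25), (8000, 0.5)]))
    print(apply_frequency_masking([(1000, 1.0), (1100, 0.05), (5000, 0.05)]))
```

In the second call the quiet 1100 Hz component sits next to a loud 1000 Hz one and is dropped, while the equally quiet 5000 Hz component has no loud neighbour and survives.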

The third stage compresses the processed signal by well-known mathematical methods. Data compression in MP3 uses a slightly modified version of the Huffman algorithm, which is also used when creating archives in the PKZIP, LHA, ZOO and ARJ formats.
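The idea behind Huffman coding, giving shorter bit strings to more frequent symbols, can be sketched as follows. This is a generic textbook builder with hypothetical function names, not the modified variant or the code tables an actual MP3 encoder uses.

```python
import heapq
from collections import Counter


def huffman_codes(symbols):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    counts = Counter(symbols)
    if not counts:
        return {}
    if len(counts) == 1:  # a single distinct symbol still needs one bit
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code built so far})
    heap = [(weight, i, {sym: ""}) for i, (sym, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, codes):
    """Concatenate the code of every symbol into one bit string."""
    return "".join(codes[s] for s in symbols)


if __name__ == "__main__":
    data = "aaaaaaabbc"
    codes = huffman_codes(data)
    bits = encode(data, codes)
    print(codes, len(bits), "bits instead of", 8 * len(data))
```

This stage loses nothing: the original symbols can be recovered exactly from the bit string, because no code is a prefix of another.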

As a result of all three transformations, the information contained in the original sound file is compressed several times over. In modern encoders the degree of compression is expressed as a bitrate in kilobits per second (kbps) and can be set by the user. The user should remember, however, that as the bitrate is lowered in endless pursuit of a smaller file, the second stage of processing (which can be tuned without any particular restrictions) becomes more aggressive. Sounds that the human ear can distinguish begin to fall under the knife.

There is no consensus on the lowest bitrate that is acceptable when processing sound. Some say that 128 kbps (a compression ratio of about 10:1) is quite enough; others prefer twice that value. Today most players and encoders set the upper limit at 320 kbps, which is enough to satisfy any expert.
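The relation between bitrate and compression ratio is simple arithmetic. The sketch below assumes the uncompressed source is CD audio (44.1 kHz, 16 bits, two channels); the function names and the four-minute track are illustrative.

```python
CD_SAMPLE_RATE = 44_100  # samples per second
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2


def cd_bitrate_kbps():
    """Bitrate of uncompressed CD audio in kilobits per second."""
    return CD_SAMPLE_RATE * CD_BITS_PER_SAMPLE * CD_CHANNELS / 1000


def compression_ratio(mp3_kbps):
    """How many times smaller the MP3 stream is than CD audio."""
    return cd_bitrate_kbps() / mp3_kbps


def file_size_megabytes(kbps, seconds):
    """Size of a stream at the given bitrate, in megabytes (10**6 bytes)."""
    return kbps * 1000 * seconds / 8 / 1_000_000


if __name__ == "__main__":
    for rate in (128, 256, 320):
        print(rate, "kbps: ratio", round(compression_ratio(rate), 1),
              "size", round(file_size_megabytes(rate, 240), 2), "MB")
```

Under these assumptions 128 kbps works out to roughly 11:1, in line with the round figure quoted above, and 320 kbps to a little over 4:1.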

Finally, I will note that encoding sound in MP3 is a lossy method: some information about the sound is lost. Lossless methods, in which the second stage (the "psychoacoustic knife") is excluded and compression relies on mathematical methods alone, preserve the original information completely in the final compressed file.