MP3 Downloads!?!

MP3 music files have been available for several years, and file-sharing programs are more popular than ever. Still, many people are confused about what is legal and what isn't. Can downloading music put you at risk for legal action?

It's no wonder so many people are confused. The Net is full of sites with ads for "Napster replacements" that claim to be 100% legal. You may have read claims that "MP3 is legal" and "file sharing is legal." These statements are true, but very misleading.

Is MP3 legal? Yes, because MP3 is just a file format. Indeed, the vast majority of MP3 files found on the Web are perfectly legal, put up there by unknown bands who want to get noticed or by established artists promoting their current material.

Is file sharing legal? It can be, but the vast majority of files shared on P2P (peer-to-peer) networks like KaZaA and Shareaza violate copyright law.

What is illegal is unauthorized copying of commercial music.

This usually means MP3s that are made from CDs and then put on the Net by individuals who haven't sought permission from the artist or music company.

What do copyright laws allow? To put it simply, you may make a copy of your own CD for your personal use. That means you may record it to a cassette tape or rip it to MP3 files. You may not, however, give this copy to another person.

Many people believe that if no money is involved, then no law has been broken. This is false. Whether you give the copy away or sell it, it is still a violation of copyright law.

The bottom line: if it sounds too good to be true, it probably is. If you want the latest hits, you need to pay for them. Fortunately, the legitimate online music services are very good, and the competition is keeping the prices down.

Why should you buy a CD when you can get the music for free?

Let me put it another way: why do you want to get paid after a week of hard labour? Why don't you work for free?
Simply to put food on the shelf, to support your family, to be able to travel, to finance your hobbies...
Whatever you want to do, it costs money.

Well, an artist creates music to make a living, just as you go to work. His creations should be rewarded.
Buying a CD supports the artist. He gets to do useful things like eat, buy guitar strings and recording gear so that he can make more music for you to enjoy.
Creating a new song and putting it on the market takes a lot of hard work and a lot of money. Part of the price you pay for a CD goes to the artist. Then there is also the income he gets from merchandising, airplay revenue, concerts...
An artist can create as much as he wants, but if he doesn't sell, he has no income.
And then, one fine day, after one or more attempts, he writes that great song that everyone wants and loves.
He has a lucky strike and then...
He sells only a few copies of his masterpiece, while everyone is downloading and uploading it like mad.
His income from selling CDs is next to nothing, even though he tops all the charts...

How would you feel???

What is MPEG?


MPEG is the name of a working group established under the joint direction of the International Organization for Standardization and the International Electrotechnical Commission (ISO/IEC), whose goal is to create standards for digital video and audio compression. More precisely, MPEG defines the syntax of low-bitrate audio and video formats, as well as the operations to be performed by decoders.

The algorithms used by encoders are not defined by MPEG. This allows encoders to be improved continually and adapted to specific applications without requiring any change to the data format. Alongside audio and video coding, MPEG also defines methods for testing whether bitstreams and decoders conform to the standards, and publishes technical reports.

The stages


Two things need to be distinguished: stages and layers. First, MPEG works in stages. These stages are normally denoted by Arabic numerals (MPEG-1, MPEG-2, MPEG-4).
The first stage, MPEG-1, defines the coding of mono and stereo sound at the sampling frequencies commonly used for high-quality audio (48, 44.1 and 32 kHz).
The second stage, MPEG-2, extends this in two directions. The first is an extension to lower sampling frequencies, providing better sound quality at very low bitrates (less than 64 kbit/s for a mono signal). The second is an extension to multichannel sound.
MPEG-1 and MPEG-2 both have a three-layer structure. Each layer represents a family of coding algorithms. These layers are denoted by Roman numerals (Layer I, Layer II, Layer III).

The layers


Different layers have been defined, and each has its own advantages. Complexity increases from Layer I to Layer III.

Layer I has the lowest complexity and is specifically targeted at applications where the complexity of the encoder plays an important role.
Layer II requires a more complex encoder as well as a slightly more complex decoder. Compared to Layer I, Layer II is able to remove more redundancy from the signal and applies the psychoacoustic model in a more efficient way.
Layer III is more complex again and is targeted at applications needing the lowest data rates, which it achieves by removing redundancy in the signal and by using its filter bank to better isolate barely audible frequencies.


How does MPEG audio work?


MPEG audio compressors are based on a perceptual coding scheme. In perceptual coding, the codec does not try to keep the signal strictly identical to the original after the encoding and decoding phases; its goal is rather to ensure that the output signal sounds identical to the human ear.

The first psychoacoustic effect that perceptual coding exploits is the masking effect: because of the way the human auditory system works, some parts of the signal are not audible. To be able to remove these parts, the encoder includes a psychoacoustic model that tries to mimic the behaviour of the human ear. This model analyzes the input signal over several consecutive blocks and determines the spectrum of the signal for each block. It then models the masking properties of the human auditory system and estimates the minimum audible level.

During its quantization and coding phase, the encoder tries to allocate bits so as to respect both the masking properties and the permitted data rate. The decoder is far less complex, because it requires neither a psychoacoustic model nor a bit allocation procedure. Its only job is to reconstruct an audio signal from the information coded in the file.

The minimum hearing threshold


The minimum hearing threshold of the ear is not flat across frequencies. According to the Fletcher-Munson curves, it dips between 2 kHz and 5 kHz, where the ear is most sensitive. It is therefore not necessary to code sounds that lie below this threshold, because they will not be perceived.
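
As a rough illustration, the shape of such a threshold curve can be sketched with a closed-form approximation that is widely used in audio coding texts (usually attributed to Terhardt). This formula is an assumption brought in for illustration; it is not the model mandated by the MPEG standard, and the function name is hypothetical.

```python
import math

def threshold_in_quiet_db(freq_hz: float) -> float:
    """Approximate hearing threshold in quiet, in dB SPL.

    Assumption: Terhardt-style approximation, not taken from the MPEG spec.
    """
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

# Scan the audible range and find where the ear is most sensitive.
freqs = range(20, 20001, 10)
most_sensitive_hz = min(freqs, key=threshold_in_quiet_db)

for f in (100, 1000, 3000, 10000, 16000):
    print(f"{f:>6} Hz: {threshold_in_quiet_db(f):6.1f} dB SPL")
print("lowest threshold near", most_sensitive_hz, "Hz")
```

Under this approximation the curve is lowest between 2 kHz and 5 kHz and rises steeply toward both ends of the audible range, so a quiet component at 100 Hz or 16 kHz is far more likely to be dropped than one at 3 kHz.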

The masking effect


This technique relies on the masking properties of the human ear.
When you look at the sun and a bird flies in front of it, you do not see the bird because the sunlight is overwhelming. Audio is similar: during loud sounds, you do not hear the quietest ones. Take an organ piece as an example: when the organist is not playing, you hear the air hissing in the pipes, and when he plays, you no longer hear it because it is masked.

It is therefore not necessary to code all the sounds. This is the first property the MP3 format uses to save space. To do this, the MP3 encoder uses a psychoacoustic model that mimics the behaviour of the human ear.

The bit reservoir


Often, some passages of a piece of music cannot be coded at a given bitrate without degrading the sound quality. MP3 therefore uses a small reservoir of bits that acts as a buffer, borrowing capacity from passages that can be coded at a lower rate than the nominal bitrate.
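
The reservoir idea can be sketched with a toy model. Everything here is an assumption made for illustration: the function name, the per-frame budget, the reservoir cap and the "demand" figures are hypothetical, and a real encoder manages its reservoir according to the rules of the standard rather than this simplified bookkeeping.

```python
def simulate_reservoir(demands, frame_budget, reservoir_cap):
    """Return the bits granted to each frame under a toy reservoir model.

    demands       -- bits each frame would like to use (hypothetical values)
    frame_budget  -- bits every frame receives at the nominal bitrate
    reservoir_cap -- maximum number of unused bits that can be carried over
    """
    reservoir = 0
    granted = []
    for demand in demands:
        available = frame_budget + reservoir        # own budget plus savings
        used = min(demand, available)               # cannot spend more than that
        granted.append(used)
        # Bank whatever is left, up to the size of the reservoir.
        reservoir = min(reservoir_cap, available - used)
    return granted

demands = [600, 500, 700, 1500, 1400, 800]
with_reservoir = simulate_reservoir(demands, frame_budget=1000, reservoir_cap=800)
without_reservoir = simulate_reservoir(demands, frame_budget=1000, reservoir_cap=0)
print(with_reservoir)
print(without_reservoir)
```

In this toy run the easy frames at the start leave bits unused, so the demanding fourth and fifth frames can spend more than their nominal budget. With the cap set to zero, those frames are clipped to the nominal budget instead.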

The Joint Stereo coding


In the case of a stereophonic signal, the MP3 format can use a few more tools, referred to as Joint Stereo (JS) coding, to further shrink the compressed file size.

Many mid-range hi-fi sets have a single subwoofer. Yet you usually do not have the feeling that the sound comes from this subwoofer, but rather from the satellite speakers. Indeed, for very low and very high frequencies, the human ear can no longer locate the spatial origin of sounds with full accuracy. The MP3 format can therefore (optionally) exploit this by using what is called Intensity Stereo (IS): some frequencies are recorded as a monophonic signal, accompanied by a small amount of additional information to restore a minimum of spatialisation.

The second joint stereo tool is called Mid/Side (M/S) stereo. When the left and right channels are quite similar, a middle (L+R) channel and a side (L-R) channel are encoded instead of left and right. This reduces the final file size, because fewer bits are needed for the side channel. During playback, the MP3 decoder reconstructs the left and right channels.
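
A minimal sketch of the Mid/Side idea follows. The scaling by 1/2 is an assumption chosen so that the round trip is easy to follow; the actual MP3 format uses its own normalization, and the function names and sample values are hypothetical.

```python
def ms_encode(left, right):
    """Turn left/right samples into mid (sum) and side (difference) samples."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Rebuild left/right from mid/side: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# Two nearly identical channels (hypothetical sample values).
left = [0.50, 0.25, -0.75, 1.00]
right = [0.50, 0.25, -0.50, 0.75]

mid, side = ms_encode(left, right)
print("mid: ", mid)
print("side:", side)
print("round trip ok:", ms_decode(mid, side) == (left, right))
```

Because the two channels are similar, the side signal stays close to zero, which is why it can be coded with fewer bits than either original channel.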

The Huffman coding


MP3 also uses the classic Huffman algorithm. It is applied at the end of the compression chain to code the resulting data, so it is less a compression algorithm in its own right than a coding method.

This coding creates variable-length codes made up of a whole number of bits. Symbols with a higher probability get shorter codes. Huffman codes have the prefix property (no code is the prefix of another), so they can be decoded correctly in spite of their variable length. The decoding step is very fast (it uses a lookup table). On average, this kind of coding saves a little less than 20% of space.

It is an ideal complement to perceptual coding. During dense polyphonic passages, perceptual coding is very efficient because many sounds are masked or attenuated, but little information is repeated, so the Huffman algorithm is rarely efficient. During "pure" sounds there are few masking effects, but Huffman coding is then very efficient, because the digitized sound contains many repeated values, which are replaced by shorter codes.
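
The following sketch builds a Huffman code for a short list of values to show the two properties mentioned above: frequent symbols get shorter codes, and the codes can be decoded unambiguously. It constructs the table from the data itself, which is an assumption made to keep the example self-contained (MP3 uses predefined tables); the function names and sample values are hypothetical.

```python
import heapq
from collections import Counter

def huffman_code(data):
    """Build a Huffman code table {symbol: bitstring} for the symbols in data."""
    counts = Counter(data)
    if len(counts) == 1:                       # degenerate single-symbol input
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)        # the two lightest subtrees
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def huffman_decode(bits, table):
    """Decode a bitstring by reading bits until a complete code is recognised."""
    reverse = {code: sym for sym, code in table.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return out

# Hypothetical quantized values: 0 is very common, 2 is rare.
samples = [0, 0, 0, 0, 0, 0, 1, 1, 1, -1, -1, 2]
table = huffman_code(samples)
encoded = "".join(table[s] for s in samples)

print(table)
print(len(encoded), "bits instead of", 2 * len(samples), "with fixed 2-bit codes")
```

The lookup in `huffman_decode` works only because no code is the prefix of another: as soon as the bits read so far match a code, that match is the only possible one.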

MP3 limitations


Joint stereo limitations


MP3 cannot switch joint stereo mode for specific scalefactor bands. If joint stereo is used, it has to be used for all the bands. This is rather suboptimal, and it limits the use of joint stereo. As an example, imagine the following situation:
The lower frequencies feature an instrument playing on the far left, and frequencies around 1500 Hz feature a singer in the middle of the stage.
In such a situation, it is not possible to use joint stereo with MP3, because the lower frequencies differ too much between the two channels. A further bitrate reduction could have been achieved if it were possible to toggle joint stereo mode on a per-scalefactor-band basis (in this case regular stereo would have been used for the lower frequencies, and Mid/Side stereo for the rest of the frequency spectrum).

Too limited maximum frame size


Even though a buffer is available (the bit reservoir), the total size of the information belonging to a frame (data inside the frame plus data from the bit reservoir) is limited. The ISO standard defines the maximum size to be the size of the buffer for a 320 kbps frame. Unfortunately, in some (limited) cases this limit seems to be too low, leading to unavoidable degradation of the sound quality.
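
To give a sense of scale, the size of a single 320 kbps frame can be computed from the usual frame-length formula for MPEG-1 Layer III (1152 samples per frame, hence the factor 144). This is only a reference point: the helper name is hypothetical, and how exactly the limit in the standard maps onto this number is not stated here.

```python
def layer3_frame_bytes(bitrate_bps: int, sample_rate_hz: int, padding: bool = False) -> int:
    """Size in bytes of one MPEG-1 Layer III frame, header included.

    A frame holds 1152 samples, so it lasts 1152 / sample_rate seconds and
    carries bitrate * 1152 / sample_rate bits, i.e. 144 * bitrate / sample_rate bytes.
    """
    return 144 * bitrate_bps // sample_rate_hz + (1 if padding else 0)

for rate in (128_000, 192_000, 320_000):
    print(f"{rate // 1000} kbps at 44.1 kHz: {layer3_frame_bytes(rate, 44_100)} bytes per frame")
```

At 44.1 kHz a 320 kbps frame is a little over 1 KB, so even with the reservoir a single difficult frame can never draw on much more than that.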

Suboptimal window sizes


The time/frequency resolution of MP3 is suboptimal. A block is either 576 samples (long block) or 192 samples (short block).
On long blocks, the number of samples limits the frequency resolution, and therefore the coding efficiency.
On short blocks, the number of samples (being too high) limits the time resolution. At a sampling frequency of 44.1 kHz, 192 samples correspond to a time resolution of about 4.35 ms. This is too coarse for some percussive sounds, and can lead to a lack of sharpness or to pre-echo.
This was corrected by the ISO committee in the design of AAC, which uses window sizes of 1024 samples (for long blocks) or 128 samples (for short blocks).
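
The trade-off can be made concrete with a little arithmetic. The sketch below assumes that a block of N samples yields N frequency lines spread evenly between 0 Hz and half the sampling frequency; the helper names are hypothetical.

```python
SAMPLE_RATE_HZ = 44_100

def block_duration_ms(samples: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Time span covered by one block, in milliseconds."""
    return 1000.0 * samples / sample_rate_hz

def line_width_hz(samples: int, sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Width of one frequency line, assuming `samples` lines cover 0..Nyquist."""
    return (sample_rate_hz / 2) / samples

blocks = [
    ("MP3 long", 576),
    ("MP3 short", 192),
    ("AAC long", 1024),
    ("AAC short", 128),
]
for name, n in blocks:
    print(f"{name:<10} {n:>5} samples  "
          f"{block_duration_ms(n):6.2f} ms  {line_width_hz(n):7.2f} Hz per line")
```

Longer blocks give narrower frequency lines but smear events over more time, and shorter blocks do the opposite. Under these assumptions AAC is better on both counts: finer in frequency for long blocks and finer in time for short ones.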

Scalefactor band 21 problem


The last scalefactor band (sfb21 for long blocks or sfb12 for short blocks) has no scalefactor of its own. This scalefactor band covers the range from 16 kHz up to the upper frequency limit when using a 44.1 or 48 kHz sampling frequency.
If the psychoacoustic model determines that the resolution of this part of the spectrum must be increased, there is no local scalefactor available to adjust it. In this case, the only solution is to adjust the global gain value, but the global gain affects every scalefactor band.
To increase sfb21 resolution, the global gain value has to be reduced. To balance this, the scalefactors of the other scalefactor bands can be reduced. But once they reach a value of 0, they cannot be reduced any further, meaning that a higher resolution than needed will be used locally in those bands, which inflates the bitrate. When encoding sfb21 content, it is common to encounter some scalefactor bands that are encoded with too high a resolution just to accommodate the coding needs of sfb21.

Hybrid transform scheme


Layer III uses MDCT transforms, but in order to maintain backward compatibility with Layer II, it applies the MDCT on top of the 32 subbands produced by the PQMF filter bank of Layer II.
While the MDCT stage itself is lossless, this is not the case for the PQMF filter bank. In the transform process, this first stage introduces some noise that cannot be totally removed. Using a plain MDCT from the start would produce a better result (but would lose compatibility with Layer II).

Mixed blocks limitation


The MP3 standard allows mixed blocks, but only in a limited way.
Mixed blocks are blocks in which the first two subbands use the long block structure, while the upper bands use the short block structure. This is useful for reducing pre-echo on transients while keeping good frequency resolution in the lower part of the spectrum. Unfortunately, as defined by the ISO standard, a mixed block cannot follow or be followed by a short block. This is a severe restriction on when mixed blocks can be used, and it imposes additional complexity on any encoder that wants to use them.

Source: http://www.mp3-tech.org/


These pages were created by www.LyricsVault.net