Unless you've been living away from civilization (or unless you're an old man who steadfastly believes Computers Are Bad), chances are you've at least heard of MP3. We see it everywhere on the Internet, our portable music players now use MP3 audio instead of those good ol' cassettes, most mid-to-high-range phones can play MP3, the street dealers from around the block probably sell DVDs packed with songs in MP3, and your car's sound system can probably play MP3s from your iPod, a CD or a USB drive.
Okay, enough buildup. We're here to discuss how it works.
First, MP3 is more than a simple audio format: it is a lossy, frequency-domain, quantized audio compression algorithm. The name is short for MPEG-1 Audio Layer III, where MPEG is the Moving Picture Experts Group, the standards working group that develops these sorts of audio and video compression methods. In English: MP3 is a list of step-by-step instructions that can be followed by a computer program (an algorithm) that takes an uncompressed audio file, typically in the same format used on an audio CD, cuts the waveform into about 40 split-second snippets for each second, calculates the frequency components of each snippet (the frequency domain), uses these components to remove part of the information contained in the sequence (some information is lost in the process, so it's lossy), rounds the frequency components to some pre-defined values (quantizes them), and encodes the resulting information in a way that takes up less space than the original audio file (compresses it). So far, nothing unusual: there are many other lossy audio compression techniques out there, and many of them already existed in 1994, when the MP3 format was released.
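The steps in that paragraph (cut into snippets, go to the frequency domain, throw information away, quantize) can be sketched in a few lines. This is a toy under loud assumptions, not the MP3 algorithm: it swaps in a plain discrete Fourier transform where real MP3 uses a filterbank plus a modified discrete cosine transform, and it replaces the psychoacoustic model with a crude "keep the strongest components" rule. The names `frames`, `toy_encode`, `toy_decode`, `keep` and `step` are made up for this illustration.

```python
import cmath
import math


def frames(samples, size):
    """Cut the waveform into consecutive snippets of `size` samples each."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def dft(frame):
    """Naive discrete Fourier transform: time samples -> frequency components."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(bins):
    """Inverse transform; only the real part is kept, since audio samples are real."""
    n = len(bins)
    return [sum(bins[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def toy_encode(frame, keep, step):
    """Lossy step: keep the `keep` strongest components. Quantize: round to multiples of `step`."""
    bins = dft(frame)
    strongest = sorted(range(len(bins)), key=lambda k: abs(bins[k]), reverse=True)[:keep]
    return {k: (round(bins[k].real / step), round(bins[k].imag / step)) for k in strongest}


def toy_decode(coded, size, step):
    """Rebuild a snippet from the few small integers that survived encoding."""
    bins = [0j] * size
    for k, (re, im) in coded.items():
        bins[k] = complex(re * step, im * step)
    return idft(bins)


if __name__ == "__main__":
    size = 32
    # A loud tone plus a much quieter one, sampled over one snippet.
    wave = [math.sin(2 * math.pi * 4 * t / size) + 0.01 * math.sin(2 * math.pi * 9 * t / size)
            for t in range(size)]
    coded = toy_encode(wave, keep=2, step=0.5)
    decoded = toy_decode(coded, size, step=0.5)
    print("stored components:", len(coded), "of", size)
    print("worst sample error:", max(abs(a - b) for a, b in zip(wave, decoded)))
```

The snippet goes in as 32 numbers and comes out as two pairs of small integers, and the rebuilt waveform differs from the original only by the quiet tone that was dropped. A real encoder would additionally pack those integers with an entropy coder, which this sketch skips.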
However, there's one difference between MP3 and the other formats of the day: the part that removes information from the sound sequence was specially designed to remove only the sounds your ear and mind can't hear. This is due to something known as psychoacoustics. Basically, this means we don't hear the same wave that comes out of an audio source: our ear is much more sensitive to mid-range sounds than it is to bass and treble sounds, and our mind subconsciously filters out the background noise and amplifies the sounds we want to hear. This is why you can't seem to make anything out when you play back a video recorded in a classroom: in person, your mind filters out the classroom's chatter and lets you focus on your friend's voice, but the microphone can't do this, and just picks up all the environmental sound. MPEG-1 Audio Layer II (MP2) already had a psychoacoustic model, but MP3's psychoacoustic model is much more detailed, allowing more precise representation of the frequencies that are not dropped.
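To put a rough number on "more sensitive to mid-range sounds than to bass and treble", here is a sketch using a textbook approximation of the threshold of hearing in quiet (commonly attributed to Terhardt). This formula is not taken from this page and is not MP3's actual psychoacoustic model, which also accounts for louder sounds masking quieter ones nearby. The function names are invented for the example, and levels are in dB SPL, which in practice depends on how loud the playback is.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate quietest audible level (dB SPL) of a pure tone at `freq_hz`.

    Textbook approximation of the absolute hearing threshold; an assumption
    for illustration, not the model used by any particular MP3 encoder.
    """
    khz = freq_hz / 1000.0
    return (3.64 * khz ** -0.8
            - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
            + 1e-3 * khz ** 4)


def audible(freq_hz, level_db):
    """True if a lone tone at this level would sit above the threshold in quiet."""
    return level_db > threshold_in_quiet_db(freq_hz)


if __name__ == "__main__":
    for freq in (100, 1000, 3300, 10000, 16000):
        print(freq, "Hz:", round(threshold_in_quiet_db(freq), 1), "dB SPL")
```

The curve dips lowest around 3 to 4 kHz and climbs steeply at both ends, so a quiet component in the deep bass or the extreme treble is a cheap thing for an encoder to drop.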
So, to continue the analogy started on their pages: if a MIDI file is just sheet music and a WAV file is an entire orchestra, then an MP3 is half an orchestra. You know all the pianos and clarinets and timpani and stuff that aren't being used by this song? MP3 leaves 'em behind.
This is one of the main reasons why the MP3 format became so popular: thanks to its psychoacoustic model, you can strip a lot of information from the audio, and yet it still sounds pretty much the same as the original source. As a result, a medium-quality song encoded in MP3 format takes up very little space: usually around 4 MB, much less than the 30-60 MB it usually takes on a CD. And of course, back in 1995, when hard disk capacity and Internet bandwidth were at a very great premium (modems rarely went beyond 28.8 kbps, and a CD could hold as much as two medium hard drives!), an audio format that could turn a massive file into a very small file was obviously the best choice to save your songs. Another very important reason is that MP3 decoders are very efficient: they use very little CPU time and very little memory, and can run on very small hardware, e.g. a computer from 1994, a portable MP3 player, a mobile phone, or, of course, a gaming console. Of course, the Intertubes and the earliest file sharing networks (Napster, Audiogalaxy, Kazaa), and the marketing of the iPod as a fashionable device, played a very big part in MP3's popularity.
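The size figures above are easy to check with back-of-the-envelope arithmetic. The sketch below assumes standard CD audio (44,100 samples per second, 16 bits per sample, two channels), a constant-bitrate MP3 with no tags or other overhead, "kbps" meaning 1,000 bits per second, and a modem that actually sustains its rated speed. The function names are made up for the example.

```python
def cd_audio_bytes(seconds, sample_rate=44_100, bits_per_sample=16, channels=2):
    """Size of uncompressed CD-style audio for a track of the given length."""
    return seconds * sample_rate * bits_per_sample * channels // 8


def mp3_bytes(seconds, kbps):
    """Size of a constant-bitrate MP3, ignoring tags and other overhead."""
    return seconds * kbps * 1000 // 8


def download_seconds(size_bytes, link_kbps):
    """Idealized transfer time over a link running at its rated speed."""
    return size_bytes * 8 / (link_kbps * 1000)


if __name__ == "__main__":
    song = 4 * 60  # a 4-minute song, in seconds
    raw = cd_audio_bytes(song)
    small = mp3_bytes(song, 128)
    print("CD audio:", raw / 1e6, "MB")
    print("128 kbps MP3:", small / 1e6, "MB")
    print("compression ratio:", raw / small)
    print("minutes on a 28.8 kbps modem (MP3):", download_seconds(small, 28.8) / 60)
    print("hours on a 28.8 kbps modem (CD audio):", download_seconds(raw, 28.8) / 3600)
```

Under these assumptions a 4-minute track is about 42 MB as CD audio and under 4 MB as a 128 kbps MP3, which is the difference between an overnight download and one you could sit through on a mid-90s modem.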
There are many other compression formats out there, some of them born out of legal conflicts between patents and free/open-source software (like Vorbis), some of them actually better (like AAC and Vorbis), some of them lossless (like FLAC; no information is lost, so the audio is identical to the source, but the files are large), but for now, MP3 is the most popular audio format in the world. However, FLAC is gaining ground now that portable media players (PMPs) can finally play it back decently.
Interesting note: the original testing of the MP3 algorithms was done with "Tom's Diner" by Suzanne Vega. (The original acoustic version, not the hit remix by DNA.)
- Note on file size: a 4-minute song encoded as 128 kbps MP3, very close to CD-quality audio, weighs in at about 3.8 MB; at 192 kbps, which takes a trained ear to distinguish from CD audio, it is only half again as large.