The picture below shows the grooves of a vinyl record magnified 1000 times with an electron microscope (the black points are dust particles). The beauty of this image lies in the fact that the “landscape” we see is actually the sound wave recorded on a physical medium.
To understand how a sound wave can be recorded on a physical medium, we simply have to imagine a hard stylus acting like a blade: when the sound signal passes through it, the stylus vibrates and cuts the sound wave into the vinyl surface, instead of the signal going to a speaker as it normally does in a sound system.
The amplitude and frequency of the sound are represented in the shape of the grooves, so the louder a track is recorded, the less recording time is available. In the same way, the more bass and low-frequency sound a track has, the larger the distance between the grooves needs to be, which reduces the available recording time of the vinyl. This is the reason why a techno LP has less effective recording time than a classical music LP.
In the next image we can see how a stylus travels along a groove, playing the sound transcribed on it. The stylus vibrates as it “touches the sound wave” and transmits these vibrations to the cartridge (the head), which transforms them into an electrical signal that is finally amplified and sent to the speakers.
Many people wonder how stereo sound (2 channels) can be obtained from a vinyl record when there is only one reading point (1 stylus in a groove). The answer is that the two walls of the V-shaped groove are cut at 45 degrees and each wall carries one channel. As a result, the side-to-side movement of the stylus carries the sum of both channels, and its up-and-down movement carries the difference between them.
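As a rough illustration of how two channels can travel through a single stylus, here is a minimal sketch of the sum/difference idea used by stereo grooves: the side-to-side motion carries left plus right, and the up-and-down motion carries left minus right. The function names and the 1/2 scaling are assumptions chosen for the example, not a description of real cutting hardware.

```python
def groove_motion(left, right):
    """Turn one left/right sample pair into stylus motion.

    Assumption for this sketch: lateral motion is the average of both
    channels and vertical motion is half of their difference.
    """
    lateral = (left + right) / 2   # side-to-side movement
    vertical = (left - right) / 2  # up-and-down movement
    return lateral, vertical


def channels_from_motion(lateral, vertical):
    """Recover the left and right samples from the stylus motion."""
    left = lateral + vertical
    right = lateral - vertical
    return left, right


if __name__ == "__main__":
    lateral, vertical = groove_motion(0.5, -0.25)
    print(channels_from_motion(lateral, vertical))
```

Note that a mono signal (left equal to right) produces no vertical movement at all in this sketch, only side-to-side movement.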
The continuous friction between the stylus and the surface of the vinyl heats the contact point up to 260ºC (500ºF). However, as the heated area is so small, the heat is dissipated almost instantaneously.
Let’s see now how the digital format works. In the next picture we can see how the sound is coded on a CD/DVD, where the white spaces called “pits” are actually tiny marks made during the recording process in the thin layer of the disc (stamped on pressed discs, burned by a laser on recordable ones). The flat areas between the pits are called “lands”. The two symbols needed in the binary system -ones and zeros- are read from this pattern: each change between a pit and a land is read as a one, and the stretches without a change are read as zeros.
Another form of digital storage is the one used in hard disks, where the bits of information are coded by the direction of magnetization of tiny magnetic regions, as we can see in the image below.
After talking about the three main ways of storing sound (vinyl, CD-Audio, hard disk), the next important question is whether all of them store it with the same quality. The idea that “vinyl simply sounds better” than any digital format is widespread, but is that so? Let’s see if there is a scientific foundation for it.
Let’s start by analysing the best quality audio format: the WAV format (which stores the same PCM audio used by CD-Audio) is known as a “lossless” format. Is this correct? First of all, we have to take into account that any digitization process introduces losses by the very meaning of “digitization”, that is, encoding a finite number of points of the sound wave. However, the sound wave has an infinite number of points, so any attempt to encode a finite number of them means the final result will not be 100% equal to the initial sound wave.
Then why is it said that WAV is a lossless format? The answer is that, for the human ear, it is. This is because we can only hear from 20Hz to 20kHz, so following the Nyquist theorem the minimum sampling frequency needed is double the highest frequency: 20kHz * 2 = 40kHz. For this reason CD-quality WAV audio is sampled at 44.1kHz (44,100 times per second), slightly above that minimum.
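The Nyquist rule can be checked with a few lines of code. The sketch below computes the minimum sampling rate for a given highest frequency, and also shows what happens to a tone that is too high for the sampling rate: it "folds back" and appears as a lower, false frequency (aliasing). The function names are made up for this example.

```python
def nyquist_rate(max_frequency_hz):
    """Minimum sampling rate needed to capture tones up to max_frequency_hz."""
    return 2 * max_frequency_hz


def alias_frequency(signal_hz, sample_rate_hz):
    """Frequency at which a pure tone appears after being sampled.

    Tones up to half the sampling rate come out unchanged; higher tones
    fold back into the range below half the sampling rate.
    """
    folded = signal_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    print(nyquist_rate(20_000))             # minimum rate for human hearing
    print(alias_frequency(20_000, 44_100))  # an audible tone is preserved
    print(alias_frequency(30_000, 44_100))  # a tone too high for 44.1kHz folds back
```

This is why a recording chain has to filter out everything above half the sampling rate before sampling: otherwise those tones would reappear as audible ones that were never in the music.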
In the above image we can see the sampling and coding process. The digital information recorded represents a number of points that will be used to reconstruct as faithfully as possible the original sound wave. In the diagram below we can see how the reconstruction fidelity depends on the number of bits used.
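To get a feeling for how the number of bits affects the reconstruction, the following sketch rounds the samples of a sine wave to the nearest available level for a given bit depth and measures the worst rounding error. It is only an illustration: the 1kHz test tone, the 441 samples and the function names are assumptions made for this example.

```python
import math


def quantize(sample, bits):
    """Round a sample in [-1.0, 1.0] to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)
    return round((sample + 1.0) / step) * step - 1.0


def max_quantization_error(bits, frequency_hz=1000.0, sample_rate_hz=44_100, n_samples=441):
    """Worst difference between a sampled sine wave and its quantized version."""
    worst = 0.0
    for n in range(n_samples):
        x = math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        worst = max(worst, abs(x - quantize(x, bits)))
    return worst


if __name__ == "__main__":
    for bits in (4, 8, 16):
        print(bits, "bits -> worst error", max_quantization_error(bits))
```

The error can never be larger than half the distance between two levels, so every extra bit roughly halves it.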
It would be more accurate to say that the WAV format (CD-Audio or PCM) is lossless for the human ear. Curiously, for animals such as dogs -which can hear frequencies above the 20kHz limit of the human ear- the WAV format is lossy (although dogs are not the biggest music lovers I know). Newer audio formats such as DVD-Audio offer sampling rates of up to 192kHz, which can capture frequencies far beyond human hearing.
The sampling issue is technically known as Frequency Response. However, this is just one of the two main concepts to take into account when talking about sound quality. The second one is Dynamic Range.
Dynamic Range (DR) defines the maximum and minimum sound amplitude that can be recorded on a specific medium (analog or digital). The lower value defines the minimum sound level that can be distinguished from the background noise, measured as sound pressure level (SPL). The higher value specifies the maximum amplitude that can be stored without causing distortion. The human Dynamic Range goes from 0dB SPL to 140dB SPL.
When we sample a sound wave for a WAV file at 44.1kHz, the DR is given by the number of bits used to encode each sample. Normally 16b are used, which allows up to 96dB SPL (each bit adds about 6dB). However, the WAV files prepared for studio mastering are encoded with 24b, as this gives 144dB SPL, covering the whole Dynamic Range of the human ear.
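The 96dB and 144dB figures come from the ratio between the largest and the smallest value that can be written with a given number of bits, expressed in decibels. A minimal sketch of that calculation (the function name is just for this example):

```python
import math


def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM with the given bit depth.

    The ratio between the largest and smallest representable amplitude
    is 2**bits, and an amplitude ratio converts to decibels as 20*log10(ratio).
    """
    return 20 * math.log10(2 ** bits)


if __name__ == "__main__":
    for bits in (16, 24):
        print(bits, "bits ->", round(dynamic_range_db(bits)), "dB")
```

Each bit doubles the number of available levels, which is where the rule of thumb of about 6dB per bit comes from.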
It is important to know the theoretical limits, but these values will hardly ever be reached, as the tolerances and imperfections of the electronic components used in sound systems add background noise, which reduces the real DR to a maximum of about 100dB SPL.
Note! It is important not to confuse the Dynamic Range dB’s with the final gain of the sound played. The DR of an analog or digital medium specifies which sounds can be recorded on that medium, whereas the final volume sets the gain applied to the sounds contained within the DR. So, if they are well recorded, both vinyl and CD sound can be amplified in the same way.
Analog recording on vinyl also limits the DR, as there is a maximum size for the groove and for the space between the grooves, which sets the maximum DR at 90dB SPL. This means a WAV file coded with 24b (144dB SPL) has a higher DR than the one vinyl can support (90dB SPL), although this does not really matter, as the electronic components of sound systems add background noise that prevents the theoretical limits from being reached.
Curiously, this imperfection of the materials (stylus and vinyl surface) is what is called “vinyl warmth”. It consists of an almost imperceptible background noise that is added to the original sound, making it more imperfect, that is, more human. (Note! Do not confuse this sound with the crackling sounds generated by badly formed grooves.)
We could summarize by saying that the music recorded on vinyl is equivalent to the WAV format sampled at 44.1kHz and coded with 24b (144dB SPL), as these values are beyond the limits of the human ear. On the one hand we have a higher sample rate than needed (44.1kHz vs 40kHz), and on the other hand we have a higher Dynamic Range (144dB SPL vs 140dB SPL).
So, with the scientific facts on the table, we cannot say that vinyl sounds better than the digital WAV format, as both contain the recorded sound with the same fidelity. However, vinyl presents a warmer sound, as the elements which take part in the sound reproduction (stylus, head, vinyl surface and amplification stage) add extra harmonics to the original sound, giving it richness and body. That is, the “vinyl feeling” is not based on a higher fidelity but on a different sound, due to the imperfections of the elements which reproduce it (more imperfections means more human). Note also that the physical contact with the vinyl brings other feelings into play, enriching the listening experience.
Things change when we talk about the mp3 format, which really is a lossy format, as compression algorithms that discard information are applied. In order to understand why mp3 exists, we have to know that the bitrate of CD-quality WAV is 1.4Mbps. Let’s see how the calculation is made: (sampling frequency) * (encoding bits) * (channels) = 44.1kHz * 16b * 2 = 1,411,200bps ≈ 1.4Mbps. This value makes WAV files really heavy (~70MB), which made transmitting them over the Internet a hard task. For this reason a new, lighter format was designed, so that people could enjoy a similar sound quality with smaller file sizes, allowing easy and quick transfers.
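The bitrate calculation, and the file size that follows from it, can be reproduced with a short sketch. The function names and the 4-minute track length are assumptions chosen for the example.

```python
def pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels):
    """Bitrate of uncompressed PCM audio, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def file_size_mb(bitrate_bps, duration_s):
    """Approximate size of the audio data in megabytes (1MB = 1,000,000 bytes)."""
    return bitrate_bps * duration_s / 8 / 1_000_000


if __name__ == "__main__":
    cd_bitrate = pcm_bitrate_bps(44_100, 16, 2)
    print(cd_bitrate)                     # 1411200 bits per second
    print(file_size_mb(cd_bitrate, 240))  # about 42MB for a 4-minute track
```

The size grows linearly with the duration, so longer tracks quickly reach several tens of megabytes.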
The highest quality mp3 format has a bitrate of 320kbps, more than 4 times less than the WAV format, and it is being adopted as the ‘de facto’ standard because the quality losses are practically imperceptible for most of the population. In order to understand how a format more than 4 times smaller can sound similar to the WAV format, we have to understand how mp3 encoders work.
Mp3 encoders apply a technique called psychoacoustic analysis. This process consists of building a kind of “digital human ear” and passing the original WAV sound through it, deleting all the sounds that the human ear would never hear. Some examples of these deleted sounds are:
- Sounds outside the Frequency Response range of the human ear (20Hz – 20kHz)
- Quiet sounds close to a loud sound in the same frequency band, as the human ear won’t be able to distinguish the details of the low-volume sounds
- Constant sounds, which are compressed using standard compression techniques (similar to zip and rar)
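The first two rules of the list can be sketched as a toy filter. This is not how a real mp3 encoder is implemented (real encoders are far more elaborate); it only illustrates the idea of dropping what the ear would not notice. The sound is represented here as a list of (frequency, level) pairs, and the function name, the 200Hz masking width and the 30dB masking threshold are all made-up values for the example.

```python
def audible_components(components, mask_width_hz=200.0, mask_drop_db=30.0):
    """Keep only the components a listener would plausibly hear (toy model).

    components: list of (frequency_hz, level_db) pairs.
    A component is dropped when it lies outside 20Hz-20kHz, or when a
    component within mask_width_hz is at least mask_drop_db louder.
    Both thresholds are hypothetical values chosen for illustration.
    """
    # Rule 1: discard everything outside the range of human hearing.
    in_range = [(f, db) for f, db in components if 20.0 <= f <= 20_000.0]

    # Rule 2: discard quiet components hidden by a much louder neighbour.
    kept = []
    for f, db in in_range:
        masked = any(
            abs(f - other_f) <= mask_width_hz and other_db - db >= mask_drop_db
            for other_f, other_db in in_range
        )
        if not masked:
            kept.append((f, db))
    return kept


if __name__ == "__main__":
    sound = [(10, 60), (1000, 80), (1100, 40), (5000, 50), (25_000, 70)]
    print(audible_components(sound))
```

In this example the 10Hz and 25kHz components are dropped because they are out of range, and the quiet 1100Hz component is dropped because the loud 1000Hz one next to it hides it.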
In the image above we see a spectrogram produced as part of the psychoacoustic analysis of a track. The different frequency bands are shown (Y axis) along with the energy contained in each of them (cold colours represent less energy) over the duration of the track (X axis).
As a last thought, I would like to point out that although we hear sound as an analog wave (the same one that is received by our ears), it is coded as electrical pulses in the brain. That is, our own body performs a “digitization” of the sound, transforming it into quanta of information to be transmitted across the brain cortex and other inner parts of the brain. Thus, in the process of listening to any song there are both analog and digital parts, which alternate continually even inside our own bodies.
Microscope Images: Chris Supranowitz