Digital audio explained with stacks of paper

Different aspects of Digital Audio explained and demo’ed.

In this article, I explain storing and streaming Digital Audio and how to manage the data. I will make Digital Audio tangible and demonstrate and explain the file sizes with the visual metaphor of stacks of paper.

Samples and bit depth

A good place to start is the sample. A sample is an atomic slice of the audio file, like a single sheet of paper from a stack.

Audio sample compared to a sheet of paper

Bit depth applies to each single sample: it is how much information is contained on the sheet. So the white paper on the left will contain more information than the small yellow one. The white one represents a high bit depth and the yellow one a lower bit depth.

For music, the bit depths you will come across are 16 bit and 24 bit. For voice, like on phone systems, you will see 8 bits. While the sample rate is concerned with capturing frequency accurately, bit depth is related to dynamic range.

Dynamic range is the distance between the quietest and loudest sounds in a piece of music, and the quality of the resolution within this range.

For many years, 16 bit was the standard, the depth used on CDs. While 16 bit is still very common, 24 bit is now becoming more widely used for Hi-Res (HD) audio. Consumers can now purchase music in lossless formats that support higher sample rates and bit depths.
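As a rough illustration of why bit depth relates to dynamic range, the sketch below counts how many distinct values a sample can hold and converts that to decibels, which works out to roughly 6 dB per bit. The function names are my own, and the figures are theoretical ideals; real converters and recordings fall short of them.

```python
import math


def quantization_levels(bit_depth: int) -> int:
    """Number of distinct values a single sample can take."""
    return 2 ** bit_depth


def theoretical_dynamic_range_db(bit_depth: int) -> float:
    """Ratio between full scale and the smallest step, in decibels (ideal case)."""
    return 20 * math.log10(quantization_levels(bit_depth))


for bits in (8, 16, 24):
    levels = quantization_levels(bits)
    db = theoretical_dynamic_range_db(bits)
    print(f"{bits:>2} bit: {levels:>10,} levels, about {db:.0f} dB")
```

In paper terms: each extra bit doubles the number of distinct marks that fit on a sheet.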

The illustration below shows this concept in detail:

Sample rate

In order to produce sound, we now need to add time to the equation. The faster you move through the stack the more fidelity you get.

The sample rate is how many samples of data are taken per second. It is normally measured in hertz; for example, an audio file usually uses a sample rate of 44.1 kHz (44,100 audio samples per second).

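A single voice signal on the phone is sampled at 8 kHz (8,000 samples per second), which is enough to capture frequencies up to about 4 kHz. Each sample is quantized into 8 bits, yielding a rate of 8,000 × 8 = 64,000 bits per second (64 kbps), which is the standard rate on digital phone systems.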
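To connect sample rate to what can actually be captured: a sampled signal can only represent frequencies up to half the sample rate (the Nyquist limit). The sketch below applies that rule to the two sample rates mentioned so far and recomputes the 64 kbps phone figure. The function name is illustrative, not taken from any library.

```python
def nyquist_limit_hz(sample_rate_hz: int) -> float:
    """Highest frequency that a given sample rate can represent."""
    return sample_rate_hz / 2


PHONE_SAMPLE_RATE_HZ = 8_000
PHONE_BIT_DEPTH = 8

# Phone audio: 8,000 samples per second, 8 bits each, one channel.
phone_bitrate_bps = PHONE_SAMPLE_RATE_HZ * PHONE_BIT_DEPTH

print(nyquist_limit_hz(PHONE_SAMPLE_RATE_HZ))  # 4000.0
print(nyquist_limit_hz(44_100))                # 22050.0
print(phone_bitrate_bps)                       # 64000
```

This is why 44.1 kHz is enough for music: its limit of 22.05 kHz sits just above the roughly 20 kHz upper bound of human hearing.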

Bitrate

When audio streaming became the norm, bitrate became an important measurement: how much data is needed to provide the right quality of sound? Are we moving slowly or quickly through the stack, and how big are the sheets?

We use bitrate to describe the fidelity of audio files. An MP3 file that was compressed at 320 kbps will have a much better dynamic range and sound quality than one compressed at 128 kbps. Put differently, more information can be contained on a bigger sheet of paper than on a smaller one.

Bitrate is also the measure of the rate at which data is transferred from one point to another in time. Think of it as the volume of the pile of paper that needs to be downloaded.
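To put a number on the volume of the pile, here is a minimal sketch that turns a constant bitrate and a duration into a download size. The four-minute track length is a made-up example, and the function name is my own.

```python
def stream_size_bytes(bitrate_kbps: float, duration_seconds: float) -> float:
    """Data volume of a constant-bitrate stream: bits per second x seconds / 8."""
    return bitrate_kbps * 1000 * duration_seconds / 8


track_seconds = 4 * 60  # hypothetical four-minute track

for kbps in (128, 320):
    megabytes = stream_size_bytes(kbps, track_seconds) / 1_000_000
    print(f"{kbps} kbps: {megabytes:.2f} MB")
```

For the same track, the 320 kbps pile is 2.5 times as tall as the 128 kbps one.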

High-resolution audio has a high bit depth and a high sample rate. This means that the bitrate required to play the stream will be high as well.

With higher bitrate, audio files with higher bit depth and sample rate can be streamed, thus increasing the quality of the audio. However, this means an increase in bandwidth used for transmission.

Bitrate formula

You can see that sample rate and bitrate are related, but not the same. Here is the formula for it:

Bitrate = Sample rate × Bit depth × Number of channels

A typical, uncompressed high-quality audio file has a sample rate of 44,100 samples per second, a bit depth of 16 bits per sample and 2 channels of stereo audio. The bit rate for this file would be:

44,100 samples per second × 16 bits per sample × 2 channels = 1,411,200 bits per second (or 1,411.2 kbps)
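The same calculation as a small sketch, so you can plug in other values. The 96 kHz / 24 bit settings are an assumed example of a hi-res file, not a figure from a specific service, and the function name is my own.

```python
def uncompressed_bitrate_bps(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Bitrate = sample rate x bit depth x number of channels."""
    return sample_rate_hz * bit_depth * channels


cd_bps = uncompressed_bitrate_bps(44_100, 16, 2)
hi_res_bps = uncompressed_bitrate_bps(96_000, 24, 2)  # assumed hi-res settings

# One minute of audio in bytes: bits per second x 60 seconds / 8 bits per byte.
cd_bytes_per_minute = cd_bps * 60 // 8

print(cd_bps)               # 1411200
print(hi_res_bps)           # 4608000
print(cd_bytes_per_minute)  # 10584000
```

So one minute of CD-quality stereo is a stack of roughly 10.6 MB before any compression.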

Variable bitrate

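For unencoded streams the bit depth is fixed, so the bitrate is constant. With lossy compression codecs (like MP3 and AAC), the number of bits spent is decided during encoding, and with variable bitrate (VBR) encoding it can vary from one stretch of audio to the next. More on encoding in this article.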

Variable sample sizes.
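A minimal sketch of what variable bitrate means in numbers, assuming an MP3 stream at 44.1 kHz, where each frame holds 1,152 samples. The per-frame bitrates are made up for illustration, and the function names are my own.

```python
SAMPLES_PER_FRAME = 1152  # frame length for MPEG-1 Layer III (MP3)
SAMPLE_RATE_HZ = 44_100


def frame_size_bytes(bitrate_kbps: int) -> float:
    """Approximate size of one frame: bitrate x frame duration / 8."""
    frame_seconds = SAMPLES_PER_FRAME / SAMPLE_RATE_HZ
    return bitrate_kbps * 1000 * frame_seconds / 8


def average_bitrate_kbps(frame_bitrates_kbps: list[int]) -> float:
    """Average bitrate of a stream whose frames all cover the same duration."""
    return sum(frame_bitrates_kbps) / len(frame_bitrates_kbps)


# Made-up example: quiet passages get fewer bits, busy passages get more.
frames = [128, 128, 192, 320, 256, 96]

for kbps in frames:
    print(f"{kbps:>3} kbps frame: about {frame_size_bytes(kbps):.0f} bytes")
print(f"average: {average_bitrate_kbps(frames):.1f} kbps")
```

In the paper metaphor, the sheets in the stack are no longer all the same size: the encoder hands out bigger sheets only where the music needs them.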

Audio formats in applications

Application     | Audio codec | Sampling rate | Bit rate
Netflix         | Multiple    |               |
WhatsApp        | Opus        |               |
CD              | N/A         | 44.1 kHz      |
FaceTime        | AAC-LD      | 16 kHz        |
AM radio        |             | 5 kHz         |
Phone call      |             | 8 kHz         |
Apple Music     | AAC         |               |
Spotify Free    | AAC         | 44.1 kHz      | 128 kbps
Spotify Premium | AAC         | 44.1 kHz      | 256 kbps
Skype           | SILK        | Variable      |
