Sound
Simply put, sound is a vibration that travels through the air or another medium. When it reaches our eardrums, we hear it; when it reaches the diaphragm in a microphone, we can record it. All “natural” sounds come from an analog source or signal, such as your voice or an instrument.
DAW stands for “Digital Audio Workstation,” which is just your audio software of choice. I’m an Adobe Audition fanboy, but there are plenty of other options out there like Audacity and Pro Tools that you can try out.
We capture and digitize these analog signals using microphones and recorders and store them as an audio waveform, which your DAW draws as a wave of peaks and valleys.
Amplitude
Amplitude refers to the strength of the waveform. In audio, amplitude is measured in decibels (dB) and represents how loud an audio file is, or its volume. The higher the amplitude of a waveform, the louder the sound will be.
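If you're curious how a raw sample value turns into the dB number on your meter, here's a minimal sketch. It assumes the common digital convention where 1.0 is the loudest value a file can hold ("full scale", shown as 0 dBFS on most meters); the function name is just something I made up for illustration.

```python
import math

def amplitude_to_dbfs(amplitude: float) -> float:
    """Convert a linear sample amplitude (1.0 = full scale) to dBFS."""
    magnitude = abs(amplitude)
    if magnitude == 0:
        return float("-inf")  # pure silence has no finite dB value
    return 20 * math.log10(magnitude)

# Full scale, half amplitude, and one tenth amplitude
for level in (1.0, 0.5, 0.1):
    print(f"amplitude {level:.1f} -> {amplitude_to_dbfs(level):.1f} dBFS")
```

The takeaway: halving the amplitude drops the level by about 6 dB, which is why digital meters count down from 0 into negative numbers.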
Frequency
Next up, we have Frequency, which refers to the number of up and down cycles per second in a waveform and is measured in Hertz (Hz). You’ll probably understand frequency better as the pitch of the sound. The higher the frequency, or more cycles per second, the higher the pitch of the sound; on the flip side, the lower the frequency, or fewer cycles per second, the lower the pitch.
Fun Fact: Our ears can hear sounds within a frequency range of roughly 20 Hz (deep and bassy) to 20kHz (very high-pitched and potentially annoying and painful). Dogs can hear much higher pitches than we can, with a range from around 40 Hz to as high as 60kHz!
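To make "cycles per second" a bit more concrete, here's a rough sketch that builds one second of a pure tone and counts how many times the wave swings from negative back up to positive. The helper names and the 48,000 points per second used to draw the wave are my own assumptions for the demo, not anything your DAW exposes.

```python
import math

def sine_wave(frequency_hz, duration_s=1.0, points_per_second=48_000):
    """Build a pure tone as a list of values between -1.0 and 1.0."""
    count = int(duration_s * points_per_second)
    return [
        math.sin(2 * math.pi * frequency_hz * n / points_per_second)
        for n in range(count)
    ]

def count_cycles(samples):
    """Count upward zero crossings, roughly one per completed cycle."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev < 0 <= cur)

low_pitch = count_cycles(sine_wave(110))   # a low A
high_pitch = count_cycles(sine_wave(440))  # the A two octaves up
print(low_pitch, high_pitch)
```

The counts land within one cycle of 110 and 440, since the crossing at the very first sample isn't counted. More cycles in the same second means a higher pitch.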
Sample Rate
Next we have Sample Rate, which refers to how many times the sound is recorded (or sampled) per second. The higher the sample rate, the more accurate the waveform. Kind of sounds like frame rate in video, doesn’t it? Sample Rate is also measured in Hz just like frequency, which is handy because the sample rate also determines the highest frequency you’re capable of capturing.
Confused? Let’s break it down:
Take 48kHz (or 48,000 Hz), one of the most common sample rates you’ll come across in your recording equipment and software.
The 48,000 represents the number of samples per second, which is DOUBLE the max frequency it’s capable of capturing.
Keeping this in mind, we now know that 48kHz audio captures 48,000 samples per second and has a max frequency of 24kHz, a little over the max of human hearing!
This is all part of what is known as the Nyquist Sampling Theorem, but I’m not going to get into any of that here because the math scares me. Just be aware of it.
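The one bit of Nyquist math worth keeping in your pocket is "divide by two." Here's a small sketch of that rule; the function names are made up for illustration, and the 20kHz hearing limit is the rough figure from the fun fact earlier.

```python
HUMAN_HEARING_LIMIT_HZ = 20_000  # rough upper limit of human hearing

def max_capturable_frequency(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can capture (half the rate)."""
    return sample_rate_hz / 2

def covers_human_hearing(sample_rate_hz: float) -> bool:
    """True if the sample rate can capture everything up to ~20 kHz."""
    return max_capturable_frequency(sample_rate_hz) >= HUMAN_HEARING_LIMIT_HZ

for rate in (22_050, 44_100, 48_000, 96_000):
    top = max_capturable_frequency(rate)
    print(f"{rate} Hz -> up to {top:.0f} Hz, full hearing range: {covers_human_hearing(rate)}")
```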
Bit Depth
Working side by side with the Sample Rate is Bit Depth, which to me can best be described as the amount of amplitude information and detail captured in each sample.
Measured in bits, you can think of Bit Depth, along with Sample Rate, as the audio’s “resolution.”
When you record audio, the more samples taken and the more information in those samples, the “truer” the waveform is going to be to the original analog source.
Each amplitude value captured in this process affects the Dynamic Range of the audio, or how well the low and high (soft and loud) points of the waveform are recorded, thus affecting the amplitude we talked about earlier. Simply put, a higher bit depth gives you more room between the quietest and loudest sounds, so you can record at safer levels that are less likely to clip or distort!
What’s clipping? Clipping is when an audio signal’s amplitude exceeds the maximum level your recorder or file can represent (0 dB on a digital meter), causing the peaks of the waveform to be cut off and distorting the sound! Just look for the little red light on your level meter. Once you’ve hit solid red, you have clipped audio!
By the way, a key step in this process of turning real-life sounds into digital information is called Quantization, which in SUPER simple terms can be described as the recorder assigning a digital value to the analog signal as it's being recorded.
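Here's what clipping looks like in number form. This is a minimal sketch that assumes samples are stored as values between -1.0 and 1.0, with 1.0 being the loudest the file can hold; anything past that ceiling simply gets flattened. The function name is my own.

```python
def hard_clip(samples, ceiling=1.0):
    """Flatten any sample that goes past the ceiling, like a recorder would."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

too_hot = [0.2, 0.9, 1.4, -1.7, 0.5]  # two of these peaks exceed full scale
clipped = hard_clip(too_hot)
print(clipped)  # the 1.4 and -1.7 peaks are cut off at 1.0 and -1.0
```

Once those peaks are flattened, the original shape of the wave is gone for good, which is why clipped audio is so hard to repair afterwards.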
Remember, real-life doesn't have pixels or bits, so everything you record needs to be assigned a digital spot. The more spots (or pixels or bits) available, the more accurate the digital copy will be.
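To see those "spots" in action, here's a rough sketch that snaps a sample to the nearest available value at a given bit depth. It assumes plain signed-integer audio and samples between -1.0 and 1.0; real converters are fancier than this, so treat it as an illustration only.

```python
def quantize(sample: float, bit_depth: int) -> float:
    """Snap a sample in [-1.0, 1.0] to the nearest level of a signed integer format."""
    max_code = 2 ** (bit_depth - 1) - 1        # e.g. 32767 for 16-bit
    clamped = max(-1.0, min(1.0, sample))      # keep the input in range
    code = round(clamped * max_code)           # the integer actually stored
    return code / max_code                     # what you get back on playback

original = 0.3
for bits in (8, 16, 24):
    stored = quantize(original, bits)
    print(f"{bits}-bit: stored {stored:.8f}, error {abs(stored - original):.8f}")
```

The error shrinks as the bit depth goes up, which is the "more spots, more accurate copy" idea in practice.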
Let’s move on to the common Bit Depths you’re likely to come across in your recording equipment or DAWs:
8-bit
16-bit
24-bit
32-bit float
Without getting into the crazy math involved (again, math scares me), here’s a quick breakdown of how much “space” and dynamic range each bit depth allows:
8-bit – 256 possible amplitude values (48 dB dynamic range)
16-bit – 65,536 amplitude values (96 dB dynamic range)
24-bit – 16,777,216 available values (144 dB dynamic range)
32-bit float – Over 4 billion amplitude values (1528 dB dynamic range)
And for those of you wondering, 32-bit float is roughly 24 bits of amplitude detail plus 8 extra bits that scale the level up or down, giving you wiggle room for when things suddenly get loud.
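The math behind the 8-, 16-, and 24-bit numbers is actually pretty tame: each extra bit doubles the number of values and adds about 6 dB of dynamic range. Here's a small sketch for the integer bit depths only (32-bit float works differently, so it's left out); the function names are mine.

```python
import math

def amplitude_values(bit_depth: int) -> int:
    """Number of distinct amplitude values an integer bit depth can store."""
    return 2 ** bit_depth

def dynamic_range_db(bit_depth: int) -> float:
    """Theoretical dynamic range in dB, about 6.02 dB per bit."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (8, 16, 24):
    print(f"{bits}-bit: {amplitude_values(bits):,} values, "
          f"about {dynamic_range_db(bits):.0f} dB")
```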
Let’s now look at how Sample Rate and Bit Depth show up in everyday life within commercially available music and media.
CDs are encoded at 44.1kHz at 16-bit, meaning that the song is sampled 44,100 times per second, has a max frequency of 22.05kHz, and a 96 dB dynamic range. While 44.1kHz at 16-bit is still considered high fidelity audio and is still used in a lot of audio-only media like music and podcasts, it’s also considered a dated sample rate and bit depth combo that was chosen more for technical reasons than quality ones.
Meanwhile, digital music you can download or stream online has more variety! A song I bought from Amazon, for example, was encoded at 44.1kHz at 32-bit, meaning that the song was sampled 44,100 times per second, has a max frequency of 22.05kHz and, assuming that’s 32-bit float, a 1528 dB dynamic range. Much more dynamic range than on a CD!
On the flip side, the audio standard for video is 48kHz at 24 bits, or higher! Audio captured at these sample rate and bit depth combinations is now considered “HD” and will get you through many, if not all, recordings with no issues.
Common “HD” Sample Rates include:
48kHz
88.2kHz
96kHz
192kHz
Disclaimer: Before you start recording at extremely high sample rates and bit depths like 192kHz at 32-bit, it's important to consider your computer's processing power and storage capacity. Also consider what your actual needs are, such as whether you’re recording an interview for a YouTube video that’s going to be compressed a few times after you export and upload, or sound engineering a live concert.
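To get a feel for the storage side of that trade-off, here's a back-of-the-napkin sketch for uncompressed audio. It assumes plain PCM (like a typical WAV file) and ignores the small file header; the function name is just for illustration.

```python
def uncompressed_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Approximate size of raw PCM audio, ignoring any file header."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds

one_minute = 60
standard = uncompressed_size_bytes(48_000, 24, 2, one_minute)
extreme = uncompressed_size_bytes(192_000, 32, 2, one_minute)

print(f"48kHz / 24-bit stereo:  {standard / 1_000_000:.2f} MB per minute")
print(f"192kHz / 32-bit stereo: {extreme / 1_000_000:.2f} MB per minute")
```

That works out to roughly 17 MB versus 92 MB per minute of stereo audio, so an hour-long interview at the extreme settings eats several gigabytes.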
How important is all this to know? It depends. If you just want to get solid sounding audio for your projects, it's good to be aware of all this, but not crucial. Most recorders and cameras default to standard sample rates, so you'll be fine in most situations. I myself didn't fully grasp the concepts of Sample Rate and Bit Depth until years after I had been working with audio, both as a student and “professionally.”
BUT, once I did learn more about these audio basics, it definitely influenced the way I approach audio projects and explained a few issues I would encounter from time to time.
Bitrate
Like in video, Bitrate (not to be confused with bit depth) refers to the amount of data that is processed per second. If Sample Rate and Bit Depth are how much information we capture, then Bitrate is how much of that information the final file gets to keep for every second of audio. Higher bitrates result in higher-quality files but also larger file sizes. You’ll almost only have to deal with Bitrate when you’re encoding and exporting your projects once they’re ready for the world to hear!
Let’s think about it in video terms by comparing a high-bitrate 1080p video with a low-bitrate 4K one. Despite the higher resolution of the 4K video, the higher-bitrate 1080p video will look much cleaner and retain more detail than the low-bitrate 4K one. At the end of the day, it’s just a matter of finding the right balance between bitrate and file size for both video and audio.
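For uncompressed audio, the bitrate falls straight out of the settings we've already covered: sample rate times bit depth times the number of channels. Here's a minimal sketch; the function name is mine, and the 320 kbps figure is just a common MP3 export setting used for comparison.

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels=2):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000

cd_quality = pcm_bitrate_kbps(44_100, 16)   # stereo CD audio
video_audio = pcm_bitrate_kbps(48_000, 24)  # stereo audio for video
mp3_export = 320                            # a common high-quality MP3 setting

print(f"CD audio:       {cd_quality:.1f} kbps")
print(f"48kHz / 24-bit: {video_audio:.1f} kbps")
print(f"CD audio carries about {cd_quality / mp3_export:.1f}x the data of a 320 kbps MP3")
```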
To simplify things, just try to remember the following terms: Lossless and Lossy.
Lossless means the files don't lose any audio information, but they occupy a lot of space on your hard drive. They’re big boys! WAV, FLAC, and AIFF are some examples of lossless audio formats. WAV and AIFF usually store the audio uncompressed, while FLAC shrinks the file without throwing anything away.
Lossy files, on the other hand, are compressed in a way that throws away information to save space. MP3, WMA, and AAC are common examples of lossy audio formats.
Lossless and Lossy apply to video and photography as well in their respective formats, such as Camera RAW being lossless for photography or most .mp4 files being lossy for video.
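Here's a toy demonstration of the difference. To be loud about the assumptions: the "lossless" half uses general-purpose zlib compression as a stand-in for something like FLAC, and the "lossy" half just throws away the bottom 8 bits of each sample, which is NOT how MP3 or AAC actually work. The tone, its 8,000 Hz sample rate, and all the names are made up to keep the example small.

```python
import math
import struct
import zlib

SAMPLE_RATE = 8_000  # tiny demo rate, not a recommendation

# One second of a 440 Hz tone as 16-bit integer samples
samples = [
    round(32767 * 0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE))
    for n in range(SAMPLE_RATE)
]
pcm = struct.pack(f"<{len(samples)}h", *samples)  # raw 16-bit little-endian bytes

# "Lossless": smaller on disk, and decompressing gives back every byte
packed = zlib.compress(pcm)
lossless_ok = zlib.decompress(packed) == pcm

# Crude "lossy": keep only the top 8 bits of each sample, detail is gone for good
lossy = [(s >> 8) << 8 for s in samples]
lossy_identical = lossy == samples

print(f"original: {len(pcm)} bytes, compressed: {len(packed)} bytes")
print(f"lossless round trip identical: {lossless_ok}")
print(f"lossy version identical: {lossy_identical}")
```

The lossless round trip returns the exact same bytes, while the lossy version can never be turned back into the original, no matter what you do to it afterwards.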
Conclusion
So that was a lot, right? But don't worry, you don't need to memorize everything to get started. Again, this was all just to help you be aware of the terms and concepts you’ll come across while recording and editing.
Let me know in the comments if you found this helpful, hated everything about it, if I got everything wrong, or if you have any feedback. I'm always happy to chat!
-A
Links and Resources:
https://www.adobe.com/uk/creativecloud/video/discover/audio-sampling.html
https://www.headphonesty.com/2019/07/sample-rate-bit-depth-bit-rate/
https://zoomcorp.com/en/us/news/32-bit-float-everything-you-need-to-know/
https://www.izotope.com/en/learn/digital-audio-basics-sample-rate-and-bit-depth.html
https://www.blackghostaudio.com/blog/sample-rate-bit-depth-explained