In the previous article, we explained the mathematical effect of quantization and the quantization noise that results from it. In this article, we will hear how quantization noise actually sounds. As a teaser, listen to the following:
# For running this code, the code snippets below need to be run beforehand
display(HTML("Original signal:" + Audio(data=data_music, rate=rate)._repr_html_()))
showQuantization(data_music, U=1,bits=4, showSignals=False);
Clearly, we hear a significant noise floor underneath the original music signal. Let's look deeper into what actually happens. First, we define a function to load an audio file from the internet:
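import requests
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
from scipy.io import wavfile
from IPython.display import Audio, HTML, display

def loadAudio(url, start, length):
    # Download the file and convert it to WAV (the "!" shell escape requires IPython/Jupyter)
    R = requests.get(url)
    with open("sound.mp3", "wb") as f:
        f.write(R.content)
    !ffmpeg -y -i sound.mp3 sound.wav > /dev/null 2>&1
    rate, data = wavfile.read("sound.wav")
    if len(data.shape) > 1:  # stereo to mono conversion
        data = data.sum(axis=1)
    data = data.astype(np.float32)
    data = data / np.abs(data).max()  # normalize to [-1, 1]
    # Cut out `length` seconds, beginning `start` seconds into the file
    dataPart = data[rate*start : rate*(start+length)]
    targetRate = 10000  # Resample signal to 10kHz sampling rate
    targetSamples = int(len(dataPart) * targetRate / rate)
    resampled = signal.resample(dataPart, targetSamples)
    return targetRate, resampled / abs(resampled).max()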
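# Utility function to display two audio players side by side in the notebook
def audioSideBySide(name1, audio1, name2, audio2):
    # Two-column table: captions in the first row, players in the second row
    text = '<table><tr><td>%s</td><td>%s</td></tr><tr><td>%s</td><td>%s</td></tr></table>' % (
        name1, name2, audio1._repr_html_(), audio2._repr_html_())
    display(HTML(text))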
Then, we load some music from the internet and extract a 10-second excerpt from it:
url_music = "http://www.scientificinvesting.eu/a/Mozart%20-%20Symphony%20n.10%20K.74%20in%20G%20-%201%20Allegro.mp3"
rate_music, data_music = loadAudio(url_music, 40, 10)
rate = rate_music
Next, we define the functions that calculate the quantization levels and perform the actual quantization. Admittedly, we have shamelessly copied them from our previous article:
# Calculate the quantization levels with a uniform quantizer
def calcLevels(U, b, quantization_type):
    N_levels = 2**b
    delta = 2*U / N_levels
    if quantization_type == 'mid-rise':
        levels = -U + delta/2 + np.arange(N_levels) * delta
    elif quantization_type == 'mid-tread':
        levels = -U + np.arange(N_levels) * delta
    else:
        raise RuntimeError("Unknown quantization type!")
    return levels

# Map the input array x to the nearest values in S
def quantize(x, S):
    X = x.reshape((-1,1))
    S = S.reshape((1,-1))
    dists = abs(X-S)
    nearestIndex = dists.argmin(axis=1)
    quantized = S.flat[nearestIndex]
    return quantized.reshape(x.shape)
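As a quick sanity check of these two helpers, the following self-contained sketch recomputes the mid-rise levels for $U=1$ and 2 bits and maps a few sample values onto them. The names `midrise_levels` and `nearest_level` are hypothetical, compact restatements of `calcLevels` and `quantize`, and the sample values are made up for illustration.

```python
import numpy as np

def midrise_levels(U, b):
    """Reconstruction levels of a uniform mid-rise quantizer with range [-U, U]."""
    n_levels = 2 ** b
    delta = 2 * U / n_levels
    return -U + delta / 2 + np.arange(n_levels) * delta

def nearest_level(x, levels):
    """Map every entry of x to the closest value in levels."""
    x = np.asarray(x, dtype=float)
    idx = np.abs(x[..., np.newaxis] - levels).argmin(axis=-1)
    return levels[idx]

levels = midrise_levels(U=1, b=2)
samples = np.array([-0.9, -0.1, 0.05, 0.4, 1.3])  # illustrative values, 1.3 lies outside the range
quantized = nearest_level(samples, levels)
print(levels)     # [-0.75 -0.25  0.25  0.75]
print(quantized)  # [-0.75 -0.25  0.25  0.25  0.75]
```

Note that the out-of-range sample 1.3 is simply mapped to the outermost level 0.75: nearest-level mapping clips implicitly.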
Let us now define a convenience function that performs the quantization of a signal, shows the resulting signals and creates the audio objects:
def showQuantization(audio, U, bits, quantization_type='mid-rise', showNoise=True, showSignals=True):
    S = calcLevels(U=U, b=bits, quantization_type=quantization_type)
    quantized = quantize(audio, S)  # Perform quantization
    q_noise = audio - quantized  # Calculate quantization noise
    P_signal = np.sum(np.abs(audio)**2)  # Calculate SNR in dB
    P_noise = np.sum(np.abs(q_noise)**2)
    SNR = 10*np.log10(P_signal/P_noise)
    print("SNR: %.1f dB" % SNR)  # report the SNR so that runs can be compared
    audioSideBySide("Quantized to q=%d bits" % bits, Audio(data=quantized, rate=rate),
                    "Quantization Noise", Audio(data=q_noise, rate=rate))
    t = np.arange(len(audio)) / rate
    if showSignals:
        plt.plot(t, audio, label='Original')
        plt.plot(t, quantized, label='Quantized to q=%d bits' % bits)
        if showNoise:
            plt.plot(t, q_noise, label='Quantization Noise')
    return t, quantized, q_noise
Now, we are ready to listen to the quantized music and also look at the resulting quantized signals. First, listen to the original signal once again:
Audio(data=data_music, rate=rate)
Let us first hear how the number of quantization bits $q$ influences the sound quality. The player on the left presents the quantized sound. The player on the right plays the quantization noise, i.e. the difference between the original and the quantized version.
showQuantization(data_music, U=1,bits=2)
showQuantization(data_music, U=1,bits=3)
showQuantization(data_music, U=1,bits=4)
showQuantization(data_music, U=1,bits=8);
Clearly, we can hear some distortion in the quantized signals, depending on the number of bits $q$ used for quantization. In particular, we can recognize two effects of an increased bit count:
- The quantized signal gets clearer, and the noise floor and audible distortions decrease. This is in line with the calculated SNR values. As derived in the previous article on uniform quantization, the SNR for a sine wave increases by about $6$dB for each additional bit. The measured SNRs roughly confirm this rule in practice.
- With more quantization bits $q$, the quantization noise becomes more and more monotonous: for $q\in\{2, 3\}$, the original signal is audible within the quantization noise. This indicates that 2 or 3 bits are far from sufficient to convey the whole information of the music signal, so the remaining error still contains a significant portion of the signal. For $q=4$, we can barely hear any structure within the noise, and for $q=8$, we experience a very smooth and quiet noise. We say that the noise becomes white with more bits.
In other words, for $q\in\{2,3\}$, we clearly hear strong distortions of the quantized signal and not just a noise floor. In comparison, for $q=4$ we hear a monotonous noise floor, and for $q=8$ we can barely hear any distortion.
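The 6 dB-per-bit rule mentioned in the list above can be checked numerically. The following minimal sketch quantizes a full-scale sine wave (an artificial test tone, not the music signal) and compares the measured SNR with the textbook approximation $6.02\,q + 1.76$dB. The helper names `quantize_midrise` and `snr_db` and the tone frequency are assumptions made for this illustration; the floor-based quantizer is a shortcut that gives the same result as nearest-level mapping for mid-rise levels.

```python
import numpy as np

def quantize_midrise(x, U, b):
    """Uniform mid-rise quantization of x to 2**b levels in [-U, U], with clipping."""
    delta = 2 * U / 2 ** b
    idx = np.floor(x / delta)                             # index of the quantization cell
    idx = np.clip(idx, -2 ** (b - 1), 2 ** (b - 1) - 1)   # restrict to the available levels
    return (idx + 0.5) * delta                            # cell centre = reconstruction level

def snr_db(x, xq):
    """Signal-to-quantization-noise ratio in dB."""
    return 10 * np.log10(np.sum(x ** 2) / np.sum((x - xq) ** 2))

t = np.arange(100000) / 10000.0             # 10 s at a 10 kHz sampling rate
sine = np.sin(2 * np.pi * 123.4 * t)        # full-scale test tone (frequency chosen arbitrarily)

snrs = {b: snr_db(sine, quantize_midrise(sine, U=1, b=b)) for b in range(2, 11)}
for b, s in snrs.items():
    print("q=%2d bits: measured SNR = %5.2f dB, rule of thumb = %5.2f dB" % (b, s, 6.02 * b + 1.76))
```

For small $q$ the measured values deviate noticeably from the rule, because the error is then strongly correlated with the signal rather than noise-like. This matches the listening impression described above.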
We have seen that 8-bit quantization already gives a very clear signal with hardly any audible noise. Let us now reduce the dynamic range of the quantizer, i.e. the maximum amplitude $U$ it is able to quantize uniformly. If the signal grows beyond this amplitude, the quantizer just outputs its largest level:
showQuantization(data_music, U=0.3,bits=8, showNoise=False);
In the signal plots, the quantized signal is limited to $|x(t)|\leq 0.3$ and no longer represents the overall signal correctly. Hence, especially in the louder parts of the signal, we hear a strong distortion. This distortion is due to clipping: wherever the input signal exceeds the maximum quantization level, the quantizer clips it to the highest possible amplitude. We can hear the clipped signal parts in the quantization noise, since the quantization noise contains exactly the information that is missing from the quantized signal.
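To make the two error mechanisms visible in numbers, the sketch below separates the samples inside the quantizer range from the clipped ones. It uses a synthetic Laplacian-distributed signal as a stand-in for the music (an assumption, chosen only because it is peaky like audio), so the printed values will differ from those of the Mozart excerpt.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.laplace(scale=0.1, size=50000)   # synthetic stand-in for the music signal (assumption)
x = x / np.abs(x).max()                   # normalize to [-1, 1], as loadAudio does

U, bits = 0.3, 8
delta = 2 * U / 2 ** bits
idx = np.clip(np.floor(x / delta), -2 ** (bits - 1), 2 ** (bits - 1) - 1)
quantized = (idx + 0.5) * delta           # mid-rise quantizer with range [-U, U]

error = x - quantized
overloaded = np.abs(x) > U                # samples beyond the quantizer range get clipped

print("share of clipped samples:        %.2f %%" % (100 * overloaded.mean()))
print("largest error inside the range:  %.5f (delta/2 = %.5f)" % (np.abs(error[~overloaded]).max(), delta / 2))
print("largest error outside the range: %.5f" % np.abs(error[overloaded]).max())
```

Inside the range, the error never exceeds half a quantization step. For clipped samples, it grows with the distance between the sample and the outermost level, which is why the loud passages dominate the audible distortion.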
Let us again have a look at the quantized signal for $q=2$:
showQuantization(data_music, U=1,bits=2, showNoise=False);
We know that for $q=2$ there are 4 quantization levels: $\{-0.75, -0.25, 0.25, 0.75\}$. However, looking at the waveform of the quantized signal, we see that the levels $\pm 0.75$ are rarely used and the quantized signal mostly jumps between $\pm 0.25$. The original waveform explains this behaviour: it mostly shows small amplitudes as well. Accordingly, the quantizer switches between the two levels $\pm0.25$ most of the time and wastes the potential of the remaining 2 levels to improve the signal. We can also hear this effect in the quantization noise: for louder signal portions (e.g. around $t=8$s), the noise becomes more uniform than in the quiet parts (e.g. the beginning of the audio).
What can we do to improve our quantized signal? Clearly, we can increase $q$. As a tradeoff, however, we can also reduce the dynamic range of the quantizer and accept that clipping occurs. On the positive side, the $4$ levels of the quantizer are then used more uniformly, and we reach a better overall SNR. See what happens if we set the maximum quantization amplitude to $U=0.5$:
showQuantization(data_music, U=0.5,bits=2, showNoise=False);
The measured SNR increases from $1.7$dB to $7.8$dB, which is a huge gain! When listening to the noise of both signals, despite the bad audio quality of both, we hear a smoother quantization noise floor for the clipping quantizer.
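The tradeoff between clipping and level usage can be explored by sweeping the range $U$ of a 2-bit quantizer. The sketch below does this for a synthetic Laplacian signal, which is an assumption standing in for the music, so the SNR values will not match the $1.7$dB and $7.8$dB measured above. The helper names are hypothetical.

```python
import numpy as np

def quantize_midrise(x, U, b):
    """Uniform mid-rise quantization of x to 2**b levels in [-U, U], with clipping."""
    delta = 2 * U / 2 ** b
    idx = np.clip(np.floor(x / delta), -2 ** (b - 1), 2 ** (b - 1) - 1)
    return (idx + 0.5) * delta

def snr_db(x, xq):
    """Signal-to-quantization-noise ratio in dB."""
    return 10 * np.log10(np.sum(x ** 2) / np.sum((x - xq) ** 2))

rng = np.random.default_rng(1)
x = rng.laplace(scale=0.1, size=50000)    # peaky synthetic signal (assumption)
x = x / np.abs(x).max()                    # normalize to [-1, 1]

U_values = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
snr_by_U = {U: snr_db(x, quantize_midrise(x, U, b=2)) for U in U_values}
best_U = max(snr_by_U, key=snr_by_U.get)   # range with the highest SNR in this sweep

for U, s in snr_by_U.items():
    print("U = %.1f: SNR = %6.2f dB" % (U, s))
print("best range in this sweep: U = %.1f" % best_U)
```

If $U$ is too large, the outer levels are wasted; if it is too small, the clipping error dominates. The best range therefore lies somewhere in between and depends on the amplitude distribution of the signal.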
The fact that most signals rarely reach high amplitudes and mostly concentrate in the small-amplitude region (i.e. they usually exhibit a relatively high crest factor/PAPR) is exploited by non-uniform quantizers, for example Lloyd-Max quantizers. These quantizers adapt their thresholds and levels to the distribution of the amplitudes in order to minimize the quantization error.
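To give an impression of how such a quantizer can be designed, here is a minimal sketch of the sample-based Lloyd iteration, which alternates between placing thresholds midway between the levels and moving each level to the mean of the samples in its cell. It is an illustration under assumptions (synthetic Laplacian data, a fixed number of iterations, the hypothetical name `lloyd_iteration`), not a reference implementation.

```python
import numpy as np

def lloyd_iteration(x, n_levels, n_iter=50):
    """Design a non-uniform quantizer for the samples x by Lloyd's iteration."""
    # Start from uniform mid-rise levels that span the signal range
    U = np.abs(x).max()
    delta = 2 * U / n_levels
    levels = -U + delta / 2 + np.arange(n_levels) * delta
    for _ in range(n_iter):
        thresholds = (levels[:-1] + levels[1:]) / 2   # thresholds halfway between levels
        cells = np.digitize(x, thresholds)            # cell index of every sample
        for k in range(n_levels):
            members = x[cells == k]
            if members.size > 0:                      # keep the old level if a cell is empty
                levels[k] = members.mean()            # move the level to the cell mean
    thresholds = (levels[:-1] + levels[1:]) / 2
    return levels, thresholds

rng = np.random.default_rng(2)
x = rng.laplace(scale=0.1, size=50000)    # peaky synthetic signal (assumption)
x = x / np.abs(x).max()                    # normalize to [-1, 1]

levels, thresholds = lloyd_iteration(x, n_levels=4)
x_lloyd = levels[np.digitize(x, thresholds)]

uniform_levels = np.array([-0.75, -0.25, 0.25, 0.75])   # 2-bit uniform quantizer, U=1
x_uniform = uniform_levels[np.abs(x[:, None] - uniform_levels).argmin(axis=1)]

mse_lloyd = np.mean((x - x_lloyd) ** 2)
mse_uniform = np.mean((x - x_uniform) ** 2)
print("optimized levels:", np.round(levels, 3))
print("MSE uniform: %.5f, MSE after Lloyd iteration: %.5f" % (mse_uniform, mse_lloyd))
```

The optimized levels move towards zero, where most of the samples are, and the mean squared error drops compared with the uniform 2-bit quantizer.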
Let us now go to the extreme and listen to a signal that is quantized with $q=1$ bit. Essentially, 1-bit quantization is just a detection of zero crossings, since one bit can only store the sign of the signal.
showQuantization(data_music, U=1,bits=1, showNoise=False);
As we can hear, the music can still be recognized despite the strong distortion. This underlines that a lot of signal information is contained in the zero crossings.
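That a 1-bit mid-rise quantizer keeps nothing but the sign can be verified in a few lines. The sketch below uses random test samples (an assumption, any signal without exact zeros would do) and compares nearest-level quantization to the two levels $\pm 0.5$ with a scaled sign function.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, size=1000)          # arbitrary test samples in [-1, 1]

levels = np.array([-0.5, 0.5])             # mid-rise levels for U=1 and 1 bit
one_bit = levels[np.abs(x[:, None] - levels).argmin(axis=1)]   # nearest-level quantization
sign_only = 0.5 * np.sign(x)               # scaled sign of the signal

same = np.array_equal(one_bit, sign_only)
crossings_original = np.count_nonzero(np.diff(np.sign(x)))
crossings_one_bit = np.count_nonzero(np.diff(np.sign(one_bit)))
print("1-bit quantization equals the scaled sign:", same)
print("zero crossings: original %d, 1-bit %d" % (crossings_original, crossings_one_bit))
```

The only exception would be a sample that is exactly zero, where the sign function returns 0 while the quantizer has to pick one of the two levels.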
To illustrate that the zero crossings carry information, let us generate an artificial signal:
t = np.arange(0, 5, 1/rate)
y = np.cos(2*np.pi*100*t*(t+1))
quantized = (y>0) - 0.5 # quantize to +- 0.5
audioSideBySide("Original", Audio(data=y, rate=rate), "1-bit quantized", Audio(data=quantized, rate=rate))
As the original signal, we hear a sweep of increasing frequency. Clearly, the 1-bit quantized signal conveys a similar signal: the audible frequency increases over time. So, even when we only consider the sign of a signal, we can get information out of it. However, the signal is strongly distorted and sounds more like an 80s computer game.
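The range of the sweep follows directly from the argument of the cosine in the code above. Writing this argument as the phase $\phi(t)$, the instantaneous frequency is its derivative divided by $2\pi$:

$$\phi(t) = 2\pi\cdot 100\cdot t\,(t+1), \qquad f(t) = \frac{1}{2\pi}\frac{\mathrm{d}\phi(t)}{\mathrm{d}t} = 100\,(2t+1)\,\text{Hz}.$$

Hence, the tone starts at $f(0)=100$Hz and reaches $f(5)=1100$Hz after 5 seconds, which is well below the Nyquist frequency of $5$kHz for the sampling rate of $10$kHz.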
Sidenote: Can you hear the aliasing frequencies in the quantized signal? These are the sounds that wobble up and down at the higher frequencies. This aliasing occurs because the quantization noise is not bandlimited (there are abrupt jumps in the quantized signal), so it cannot be represented correctly at the sampling rate of our signal. Hence, we get aliasing, which can be heard as the wobbling higher frequencies.
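Where these wobbling tones end up can be estimated with a small calculation. The 1-bit quantized sweep is approximately a square wave, which contains odd harmonics of its fundamental frequency; every harmonic above half the sampling rate folds back into the audible band. The sketch below computes the folded frequencies for one momentary sweep frequency. The function name `alias_frequency` and the example value of 900 Hz are assumptions for illustration.

```python
import numpy as np

def alias_frequency(f, fs):
    """Frequency in [0, fs/2] at which a tone of frequency f appears after sampling at rate fs."""
    f = np.mod(f, fs)
    return np.minimum(f, fs - f)

fs = 10000                                  # sampling rate used in this article
f0 = 900                                    # momentary frequency of the sweep (example value)
harmonics = f0 * np.arange(1, 16, 2)        # odd harmonics 1, 3, ..., 15 of a square wave
aliased = alias_frequency(harmonics, fs)

for h, a in zip(harmonics, aliased):
    print("harmonic at %5d Hz is heard at %4d Hz" % (h, a))
```

For example, the 7th harmonic at 6300 Hz is heard at 3700 Hz. When the sweep frequency rises, this folded component moves downwards while others move upwards, which produces the wobbling impression.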
- Quantization results in quantization noise, which can be heard in the quantized signal.
- The more bits are used for the quantization, the smoother, more uniform and whiter the quantization noise becomes.
- For signals with a high crest factor, the quantization SNR can be improved by reducing the dynamic range of the quantizer, at the price of clipping.
- A significant part of the signal information is contained in the sign of the signal, which can already be resolved with 1-bit quantization.
Do you have questions or comments? Let's discuss below!