The discrete Fourier Transform is of extreme importance for all kinds of digital signal processing tasks. Given an input sequence $x[n]$ of length $N$ samples, the DFT is given by $$X[k] = \sum_{n=0}^{N-1}x[n]\exp(-j2\pi\frac{nk}{N}).$$ It is used for filtering, spectral analysis, complexity reduction, equalization and so many other kinds of analysis. In this article, we are going to illustrate the properties of the discrete Fourier Transform which are different from the conventional, time-continuous Fourier transform. We will describe the effect of zero-padding versus using a larger FFT window for spectral analysis. Finally, we will apply this knowledge to the detection of dual-tone multi-frequency signaling, how it is used in the standard phone dialing.

Let us first pose the central property of the DFT:

The Discrete Fourier Transform (DFT) assumes that its input signal is one period of a periodic signal. Its output are the discrete frequencies of this periodic signal.

This central property is very important to understand. Let us first examine the (continuous-time) Fourier transform of some periodic signal. To do this, we define a function that calculates the continuous-time Fourier Transform (see also this post)

```
def cft(g, f):
"""Numerically evaluate the Fourier Transform of g for the given frequencies"""
result = np.zeros(len(f), dtype=complex)
# Loop over all frequencies and calculate integral value
for i, ff in enumerate(f):
# Evaluate the Fourier Integral for a single frequency ff,
# assuming the function is time-limited to abs(t)<5
result[i] = complex_quad(lambda t: g(t)*np.exp(-2j*np.pi*ff*t), -10, 10)
return result
def complex_quad(g, a, b):
"""Return definite integral of complex-valued g from a to b,
using Simpson's rule"""
# 2501: Amount of used samples for the trapezoidal rule
t = np.linspace(a, b, 2501)
x = g(t)
return integrate.simps(y=x, x=t)
```

```
def triang(t):
return (1-abs(t)) * (abs(t)<1).astype(float)
Fs = 10000
t = np.arange(-10, 10, 1/Fs)
f = np.arange(-3, 3, 1/20.)
plt.subplot(121)
plt.plot(t, triang(t))
plt.subplot(122)
cft_triang = cft(triang, f)
plt.plot(f, abs(cft_triang));
```

```
def periodic_triang(t):
return triang(np.mod(t+1, 2)-1)
plt.subplot(121)
plt.plot(t, periodic_triang(t));
plt.subplot(122)
# Energy correction /10, since the periodic signal has 10 periods (i.e. 10 times the power)
cft_periodic_triang = cft(periodic_triang, f) / 10
plt.plot(f, abs(cft_periodic_triang), '-', label='$X(f)$ (periodic)')
plt.plot(f, abs(cft_triang), label='$X(f)$ non-periodic', lw=2);
```

As is shown, suddenly the spectrum of the periodic signal becomes discrete, but follows the shape of the continuous spectrum. I.e. **he spectrum of the periodic signal is a sampled version of the continuous spectrum**. The input signal has a period $T=2$, i.e. the signal is repeated every $T=2$ seconds. Accordingly, its spectrum is non-zero every $1/T=0.5Hz$.

Now, let us consider the spectrum of an originally periodic function, but which was windowed by a rectangular function:

```
def rect(t):
return (abs(t)<0.5).astype(float)
window = lambda t: rect((t-1)/4)
def windowed_periodic_triang(t):
return periodic_triang(t) * window(t)
plt.subplot(121)
plt.plot(t, windowed_periodic_triang(t), 'b-', label='windowed', lw=2);
plt.plot(t, window(t), 'r-', label='window')
plt.plot(t, periodic_triang(t), 'g-x', label='periodic', markevery=2000)
cft_windowed_periodic_triang = cft(windowed_periodic_triang, f) / 2
cft_window = cft(window, f) / 4
plt.subplot(122)
plt.plot(f, abs(cft_windowed_periodic_triang), 'b-', label='windowed', lw=2);
plt.plot(f, abs(cft_periodic_triang), 'g-x', label='periodic')
plt.plot(f, abs(cft_window), 'r-', label='window')
```

Suddenly, the spectrum becomes continuous again (that's clear, as the signal is not periodic anymore), but it crosses the value at the discrete points from the periodic signal (green line). Mathematically, this is explained by the fact that multiplication in time-domain (i.e. windowing by the rect function of width $T=4$) relates to convolution in frequency domain (i.e. convolution with a sinc-function that has zeros every $F=1/T=0.25$Hz, see red curve).

Now, let us make an extreme, and window our periodic function to only one period:

```
window = lambda t: rect(t/2)
def windowed_periodic_triang(t):
return periodic_triang(t) * window(t)
plt.subplot(121)
plt.plot(t, windowed_periodic_triang(t), 'b-', label='windowed triang', lw=2);
plt.plot(t, window(t), 'r-', label='window')
plt.plot(t, periodic_triang(t), 'g-x', label='periodic', markevery=2000)
cft_windowed_periodic_triang = cft(windowed_periodic_triang, f)
cft_window = cft(window, f) / 2
plt.subplot(122)
plt.plot(f, abs(cft_windowed_periodic_triang), 'b-', label='windowed', lw=2);
plt.plot(f, abs(cft_periodic_triang), 'g-x', label='periodic')
plt.plot(f, abs(cft_window), 'r-', label='window')
plt.plot(f, abs(cft_triang), 'k-x', label='non-periodic')
```

Let us summarize our findings on the continuous-time Fourier Transform:

- The Fourier Transform of a non-periodic function is a continuous function.
- The Fourier Transform of a periodic function with period $T$ is a discrete spectrum, where the spectral lines are $1/T$ apart.
- The Fourier Transform of a periodic function with period $T$, that was windowed with a (rectangular) window of length $T_W$ is again a continuous function. Its value equals the convolution of the discrete spectrum of the continuous function with the spectrum of the window.

Now, that we have found properties of the continuous-time Fourier Transform regarding periodicity and windowing, let us recall the fundamental thing for the DFT:

The Discrete Fourier Transform (DFT) assumes that its input signal is one period of a periodic signal. Its outputs are the discrete frequencies of this periodic signal.

What does that mean?

1) If we calculate the DFT of some sequence, the DFT assumes the signal is actually a periodic repetition of this sequence:

```
n = np.arange(32)
xn = np.linspace(0, 1, len(n))
plt.subplot(121)
plt.stem(n, xn);
xn_periodic = np.hstack([xn]*5)
plt.subplot(122)
plt.stem(xn_periodic)
```

2) As we know from before, the spectrum of a periodic function with period $T$ is discrete, where the spectral lines occur in distance $1/T$. Let us now consider, our DFT input sequence consists of $N$ samples (e.g. $N=$128), that are sampled with frequency $F_s$ (e.g. $F_s=$1kHz), i.e. the samples are $T_s=1/F_s$ apart in time (e.g. $T_s=$1ms). Then, the overall length of the input sequence is $T=N/F_s$ (e.g. $T=$128ms). Hence, the DFT calculates the spectrum at the spectral lines, which are $\Delta_f=1/T$ apart (e.g. $\Delta_f=7.8125$Hz). This leads to the fist important measure for the DFT:

The distance between frequency bins $\Delta_f$ of the DFT output only depends on the length of the input sequence $T$ and is given by $$\Delta_f=1/T_s.$$ The distance between frequency bins does not depend on the sampling frequency.

The output of the DFT consists of $N$ frequency bins, which are $\Delta_f$ apart. Hence, the maximum frequency that is representable by the DFT is given by $F_{\max}=N\Delta_f=N/T=N/(N/F_s)=F_s$, which leads us to the second important property of the frequency bins.

The frequency range that is represented by the output of the DFT is given by $$F_{\max}=F_s.$$

Note, that above holds for the sampling of a complex-valued signal. If we have a real-valued signal, then the output of the DFT is symmetric to the DC frequency (i.e. first bin). This means, that actual information about the signal is only contained in the first $N/2$ bins, which corresponds to the common statement of $F_{\max}=F_s/2$ due to the Nyquist sampling theorem.

So, with the above information, if the DFT input consists of $N$ samples that are sampled with frequency $F_s$, output of the DFT corresponds to the frequencies $F=[0, \frac{F_s}{N}, 2\frac{F_s}{N},\dots,(N-1)\frac{F_s}{N}]$.

Let us verify this finding with a cosine wave of known frequency $f_0$ that is sampled with sampling frequency $F_s$:

```
Fs = 1000
f0 = 20
t = np.arange(0, 1, 1/Fs) # Calculate the time points where the sampling occurs
N = len(t) # number of samples in the sampled signal
cos_t = np.cos(2*np.pi*f0*t)
plt.figure(figsize=(8,3))
plt.subplot(131)
plt.plot(t, cos_t)
X_f = np.fft.fft(cos_t)
f = np.arange(0, Fs, Fs/N)
plt.subplot(132)
plt.plot(f, abs(X_f))
plt.subplot(133)
plt.plot(f, abs(X_f))
```

As we know, the cosine wave consists only of a single frequency. However, when looking at the full spectrum (in the center figure) there are 2 peaks: One in the very beginning (actually, it's at $f=20$Hz) and the other at the very and of the x-axis (it's at $f=(1000-20)$Hz). Why's that? Remember the Nyquist sampling theorem, that we can actually only use the frequencies up to $f=F_s/2$, as the higher frequencies are just a mirror of the lower frequencies.

From math we know $$ \cos(2\pi f_0t) = \frac{1}{2}(\exp(j2\pi f_0t)+\exp(-j2\pi f_0t),$$ i.e. we can represent a cosine as a sum of two complex exponentials, one with positive frequency $f_0$ and one with negative frequency $-f_0$. How does this fit into our calculations, where are the negative frequencies?

The $k$th bin of the DFT is calculated with the exponential term $\exp(-j2\pi\frac{nk}{N})$. Now, knowing about the periodicity of the exponential, the $k$th last DFT bin uses the exponential $\exp(-j2\pi n\frac{(N-k)}{N}=\exp(-j2\pi \frac{-kn}{N})$. This means, we can identify the $k$th last DFT bin with a negative frequency. As such, we identify the second half of the frequency axes as the negative frequencies, such that the overall frequencies range from $F=[-F_s/2, -F_s/2+F_s/N, -F_s/2+2F_s/N, \dots, 0, F_s/N, \dots, F_s/2-F_s/N]$. This more common understanding of the DFT frequency bins requires to rotate the DFT output as well. This is, where the function `fftshift`

comes into play: It swaps the first and second half of its input sequence:

```
N2 = 20 # N/2
H1 = 5*np.sin(np.arange(N2)) # first half of the signal
H2 = np.arange(N2) # second half of the signal
H = np.hstack([H1, H2])
plt.subplot(121)
plt.stem(H)
plt.subplot(122)
plt.stem(np.fft.fftshift(H))
```

`fftshift`

and negative frequencies:

```
plt.figure(figsize=(8,3))
plt.subplot(131)
plt.plot(t, cos_t)
f2 = np.arange(-Fs/2, Fs/2, Fs/N) # the frequency axis including negative frequencies
plt.subplot(132)
plt.plot(f2, np.fft.fftshift(abs(X_f)))
plt.subplot(133)
plt.plot(f2, np.fft.fftshift(abs(X_f)))
```

Now, we see the common representation of the cosine as the sum of two diracs, that occur at positive and negative frequencies.

To summarize:

- The DFT output frequency bins correspond to the frequencies $F_k=k\frac{F_s}{N}$.
- With the more common frequency axis of half positive, half negative frequencies, the frequency bins for a DFT are given by
`f=np.arange(-Fs/2, Fs/2, Fs/N)`

. In this case, an`fftshift`

of the DFT output is required before plotting the result. (This expression is only exact when N is even. If N is odd, slightly more complex operation would need to be calculated)

Finally, let us define a convenience function that calculates the `fftshift`

of the `fft`

of a sequence.

```
def dft(xn, fftLen=None):
if fftLen is None:
fftLen = len(xn)
return np.fft.fftshift(np.fft.fft(xn, fftLen))
```

```
Fs = 50 # sampling frequency
f0 = 2# signal frequency
T = 0.8 # time duration
t = np.arange(0, T, 1/Fs) # the time samples
t_fine = np.arange(0, T, 1/(100*Fs))
t_long = np.arange(-1, 2, 1/Fs)
N = len(t)
f = np.arange(-Fs/2, Fs/2, Fs/N) # the frequency bin axis
cos_t = np.cos(2*np.pi*f0*t) # generate the sampled signal
cos_tfine = np.cos(2*np.pi*f0*t_fine)
plt.subplot(131)
plt.plot(t_long, np.cos(2*np.pi*f0*t_long))
plt.gca().add_patch(patches.Rectangle((0, -1), T, 2, hatch='/', alpha=0.3))
plt.text(T/2, 0, "FFT window", ha='center', va='center', rotation=45)
plt.subplot(132)
plt.stem(t, cos_t)
plt.plot(t_fine, cos_tfine)
plt.subplot(133)
plt.stem(f, abs(dft(cos_t)))
```

```
print (np.round(f[N//2:N//2+10], 3))
```

*spreads* the measured energy for $f_0=2Hz$ onto the neighboring frequency bins. The nearest frequency bin ($f=2.5$) gets the highest value, $f=1.25$ the second highest, $f=3.75$ the theird and so on. This explains qualitatively what we see.

But, let's try to analyze this in more detail. Let us recap the main property of the DFT:

The Discrete Fourier Transform (DFT) assumes that its input signal is one period of a periodic signal. Its outputs are the discrete frequencies of this periodic signal.

This means, what the DFT actually assumes is that the signal looks like this:

```
dft_assumed = np.hstack([cos_t]*5)
plt.plot(dft_assumed);
for n in range(5):
plt.axvline(n*N, lw=1, color='red')
```

**spectral leakage** because even though the signal $x(t)$ is a periodic signal of frequency $f_0$, if we take a part of the signal and calculate the DFT spectrum from it, we see multiple frequencies occuring, due to the strange behaviour at the period's boundary. In case we measure exactly an integer multiple of the signal period, the spectral leakage will disappear, because the DFT sees a purely periodic signal in this case:

```
T = 1; N = Fs*T
t = np.arange(0,T,1/Fs)
cos_t = np.cos(2*np.pi*f0*t)
plt.subplot(121)
plt.stem(t, cos_t);
f = np.arange(-Fs/2, Fs/2, Fs/N)
plt.subplot(122)
plt.stem(f, abs(dft(cos_t)))
```

The previous section has shown that, if we window a signal with a length that is not an integer multiple of its period, we will suffer from spectral leakage and the true frequency of the underlying tone will be invisible. Can we still find the true frequency of the tone?

Let us remember the section about the continuous Fourier Transform. The CTFT of a period signal is a discrete spectrum. But, if we multiply the periodic signal with a rectangular window, the spectrum becomes continuous, as it is the convolution of the discrete spectrum with a sinc-function (originating from the multiplication with the rectangular window in the time domain). Let's do this again with our previous example, but using the CTFT this time:

```
T = 0.8
Fs = 50
f0 = 2
cos_fn = lambda t: np.cos(2*np.pi*f0*t)
periodic_cos_fn = lambda t: cos_fn(np.mod(t, T))
windowed_cos_fn = lambda t: cos_fn(t) * ((abs(t-T/2) <= T/2).astype(float))
t = np.arange(-2, 4, 1/Fs)
f = np.arange(-5, 5, 1/20)
X_periodic = cft(periodic_cos_fn, f)
X_windowed = cft(windowed_cos_fn, f)*20/0.8
plt.figure(figsize=(8,3))
plt.subplot(121)
plt.plot(t, periodic_cos_fn(t), label='periodic')
plt.plot(t, windowed_cos_fn(t), label='windowed')
plt.subplot(122)
plt.plot(f, abs(X_periodic), label='periodic')
plt.plot(f, abs(X_windowed), label='windowed')
```

**zero padding** the signal. The idea is to approximate the CTFT of the windowed signal by letting the input signal to the DFT also look like an aperiodic windowed version. However, we know that the DFT always assumes the signal is periodic, but we can still get a similar effect: Let us take our measured window and append zeros to it:

```
T=0.8; Fs = 50; f0 = 2;
t = np.arange(0, T, 1/Fs)
N = len(t)
cos_t = np.cos(2*np.pi*f0*t) # the measured signal
N = len(t) # number of samples
N_zp = 7*N # The number of zeros to add
zero_padded_cos_t = np.hstack([cos_t, np.zeros(N_zp)])
f = np.arange(-Fs/2, Fs/2, Fs/N)
N_fft = len(zero_padded_cos_t)
f_zp = np.arange(-Fs/2, Fs/2, Fs/N_fft)
plt.figure(figsize=(8,3))
plt.subplot(121)
plt.stem(zero_padded_cos_t);
plt.subplot(122)
plt.plot(f_zp, abs(dft(zero_padded_cos_t)), 'g-', label='With ZP')
plt.stem(f, abs(dft(cos_t)), label='Without ZP')
```

Here, we have performed zero-padding by adding $7N$ zeros to the windowed signal. Accordingly, the distance between two adjacent frequency bins becomes $F_s/(8N)$ since the DFT input signal now has length of $8N$ samples.

Looking at the figure, also the DFT can tell us, what was the frequency of the original signal: $f_0=2Hz$, since the maximum of the blue curve occurs at 2Hz. For comparison, look at the frequency points that were calculated by the non-zeropadded DFT: we have a much coarser spacing between the DFT frequency bins, making it hard to tell the exact frequency. However, the ZP curve in green is only an interpolated version of the blue dots. As such, zero-padding a signal does not increase the amount of information that is contained in the signal. But, it can help in identifying dominant frequencies more accurately.

Let us now consider a signal that conists of two tones of different frequencies $f_0, f_1$ which are close to each other:$$x(t)=\cos(2\pi f_0t) + \cos(2\pi f_1 t).$$

The task is now to provide information on the frequency of both tones. Let us first generate such a signal and see hot it looks like:

```
Fs = 50
f0 = 2
f1 = 2.3
T = 10
t = np.arange(0, T, 1/Fs)
x = lambda t: np.cos(2*np.pi*f0*t) + np.cos(2*np.pi*f1*t)
plt.plot(t, x(t));
```

```
T = 1; N = Fs*T
t = np.arange(0, T, 1/Fs)
plt.subplot(121)
plt.plot(t, x(t));
f = np.arange(-Fs/2, Fs/2, Fs/N)
plt.subplot(122)
plt.plot(f, abs(dft(x(t))))
```

```
f_zp = np.arange(-Fs/2, Fs/2, Fs/(32*N))
plt.plot(f_zp, abs(dft(x(t), 32*N)))
```

Apparently, also with Zero-padding, we cannot distinguish the two frequencies. The estimate would be that the signal consists of a tone with $f=2.2Hz$. This is in line with the statement, that ZP does not reveal extra information from the spectrum. Instead it merely interpolates the coarse spectrum to become more smooth. It does not help in distinguishing between two close frequencies.

So, how can we improve the resolution of our DFT? We know, the DFT bins are determined by the length of the recorded signal: The longer the signal, the finer the useful DFT bins (in contrast to the interpolated bins from zero-padding). So, the solution is to record a longer portion of the signal, such that both frequencies will occur on separate bins. Let's measure the signal with a duration of 10s:

```
T = 10
t = np.arange(0, T, 1/Fs)
N = len(t)
f = np.arange(-Fs/2, Fs/2, Fs/N)
plt.subplot(121)
plt.plot(t, x(t))
plt.subplot(122)
plt.plot(f, abs(dft(x(t))))
```

*Note that the spectrum is zoomed strongly zoomed too see the two different peaks*). With this amount of real measurement, we can easiliy distinguish between the two tones in the signal: There is one tone at $f=2Hz$ and one at $f=2.3Hz$.

We can now conclude the following

- The DFT is vulnerable to
spectral leakage. Spectral leakage occurs when a non-integer number of periods of a signal is sent to the DFT. Spectral leakage lets a single-tone signal be spread among several frequencies after the DFT operation. This makes it hard to find the actual frequency of the signal.- For a single-tone signal, we can find its actual frequency even when spectral leakage occurs. We can
zero-padthe signal and perform a larger DFT to get a more frequency bins.- Zero-padding a signal does not reveal more information about the spectrum, but it only interpolates between the frequency bins that would occur when no zero-padding is applied. In particular,
zero-padding does not increase the spectral resolution.- To
increase the spectral resolution, longer durations of real measurements are necessary. A longer measurement obtains more information from the measured signal, since the resulting frequency bins are distributed finer.

Let us now apply the gained knowledge to an intuitive problem: phone dialing. When your phone dials a number, you can hear a sequence of sounds. There, each sound encodes a different digit to be dialed. The sounds are two-tone signals according to $$x(t)=\cos(2\pi f_1 t) + \cos(2\pi f_2 t),$$ where $f_1$ and $f_2$ depend on the digit to be dialed. The frequencies are encoded according to the following table:

1209Hz | 1336Hz | 1477Hz | 1633Hz | |
---|---|---|---|---|

697Hz | 1 | 2 | 3 | A |

770Hz | 4 | 5 | 6 | B |

852Hz | 7 | 8 | 9 | C |

941Hz | * | 0 | # | D |

So, for example, when dialing the digit 5, the corresponding signal has the form $$x_5(t)=\cos(2\pi f_1 t) + \cos(2\pi f_2 t), \quad\text{with } f_1=770Hz \quad f_2=1336Hz.$$

Let us replicate this table in code:

```
F1 = np.array([697, 770, 852, 941])
F2 = np.array([1209, 1336, 1477, 1633])
tones = {'1': (0,0),
'2': (0,1),
'3': (0,2),
'4': (1,0),
'5': (1,1),
'6': (1,2),
'7': (2,0),
'8': (2,1),
'9': (2,2),
'0': (3,1),
'*': (3,0),
'#': (3,2),
'A': (0,3),
'B': (1,3),
'C': (2,3),
'D': (3,3)}
for number, inds in sorted(tones.items()):
print ("%s: %dHz %dHz (delta-F: %dHz)" % (number, F1[inds[0]], F2[inds[1]], (F2[inds[1]]-F1[inds[0]])))
```

Now, let's write function that generates the signal for a given digit with a certain duration:

```
Fs = 8000 # sampling frequency of our discrete system
def tone(number, duration):
"""generate the sound for number with given duration"""
inds = tones[number]
f1 = F1[inds[0]]
f2 = F2[inds[1]] # get both frequencies for the digit
t = np.arange(0, duration, 1/Fs)
return np.cos(2*np.pi*f1*t) + np.cos(2*np.pi*f2*t) # calculate signal
# Generate tone for digit 2 with duration of 1s
Audio(data=tone(number='2', duration=1), rate=Fs)
```

```
def dialNumber(numbers, toneDuration):
data = [tone(n, toneDuration) for n in numbers]
return np.hstack(data)
Audio(data=dialNumber("0049176854864321", toneDuration=0.1), rate=Fs)
```

The task now is to write a program that can extract the dialed number from this sound sequence.

How should we proceed? Lets follow this rough idea:

- Assuming we know the duration of each tone, split the sound into the tone for each digit.
- Perform DFT of the separate tones.
- Check for two distinct peaks in the spectrum to find, which frequencies were contained in the signal.
- Map the frequencies to the dialed numbers

The crucial step is step 3., where we need to distinguish between two frequency components that might be close to each other. First, let's run a naive DFT on the sound for some digit. We choose the digit `*`

, because it has the least distance between the two contained frequencies, which makes it the most challenging digit.

```
def markPossibleTones():
for f in F1:
plt.axvline(f, color='r')
for f in F2:
plt.axvline(f, color='k')
data = tone(number='*', duration=1e-2)
f = np.arange(-Fs/2, Fs/2, Fs/len(data))
plt.plot(f, abs(dft(data)))
markPossibleTones()
```

```
f_ZP = np.arange(-Fs/2, Fs/2, Fs/(8*len(data)))
plt.plot(f, abs(dft(data)), label='DFT')
plt.plot(f_ZP, abs(dft(data, 8*len(data))), label='Zero-padded DFT')
markPossibleTones()
```

Now, we clearly see, which tones are contained in the signal: It's the highest of the lower and lowest of the higher frequencies, corresponding to digit `*`

. Hence, we should resort to zero-padding our signal before the DFT to get more accurate information about the actually contained tones.

Now, let's write a function to perform the estimation of a tone given its spectrum:

```
def estimateDigit(sound):
N = len(sound)
f_ZP = np.arange(-Fs/2, Fs/2, Fs/(8*N)) # 8 times zeropadding
X = abs(dft(sound, 8*N))
# Technique: We check each of the possible frequencies
# and choose the frequency which has the highest amplitude in the spectrum
```