Introduction to Fourier Transform

Initialize NeuroAnalyzer

using Plots
using FFTW
using DSP
using NeuroAnalyzer

The Fourier Transform is a mathematical tool used to decompose a signal into its constituent frequencies. It transforms a time-domain signal into its frequency-domain representation, allowing us to analyze the frequency components of the signal.

Fourier Theorem

Fourier Theorem

According to the Fourier theorem, any signal can be represented as a sum of sine waves, each with its own frequency, amplitude, and phase. This means that complex signals can be broken down into simpler sinusoidal components.

Mathematical Representation

The Fourier Transform of a continuous-time signal \(s(t)\) is given by:

\[ X(f) = \int_{-\infty}^{\infty} s(t) e^{-i 2\pi ft} \, \mathrm{d}t \]

For a discrete-time signal \(s[n]\), the Discrete Fourier Transform (DFT) is:

\[ X[k] = \sum_{n=0}^{N - 1} s[n] e^{-i 2\pi kn/N} \]

The Fast Fourier Transform (FFT) is an efficient algorithm to compute the Discrete Fourier Transform (DFT) and its inverse. The FFT speeds up the computation of the DFT by exploiting the symmetry and periodicity properties of the complex exponential functions used in the DFT.

How It Works

The Fourier Transform works by comparing the signal to sinusoids oscillating at different frequencies \(f_j\). When the signal and a sinusoid at a particular frequency align (i.e., their dot product is non-zero), the summation in the Fourier Transform yields a large value for the corresponding frequency component \(X_j\).

Stationarity Assumption

The Fourier Transform assumes that the signal is stationary, meaning its statistical properties (such as mean and variance) do not change over time.
For non-stationary signals, such as a chirp (a signal where the frequency changes over time), the Fourier Transform may not accurately capture the time-varying frequency components.

Issues with FFT of Non-Stationary Signals

When applying the Fast Fourier Transform (FFT) to non-stationary signals (signals whose frequency content changes over time), several issues can arise due to the inherent limitations of FFT. Here’s a breakdown of the problems you mentioned:

Smeared Out Peaks:

FFT assumes that the signal is stationary and infinite in duration. For non-stationary signals, the frequency components are not constant over time.
When FFT is applied, the frequency components that change over time get “smeared” across the frequency spectrum. This smearing occurs because FFT cannot localize when these frequency changes happen.

Side-Band Ripples Around Peaks:

Non-stationary signals often have abrupt changes in frequency or amplitude. FFT interprets these changes as additional frequency components, leading to side-band ripples or spectral leakage.
These ripples appear as smaller peaks or oscillations around the main frequency peaks, making it difficult to distinguish true signal components from artifacts.

Amplitudes Smaller Than in Signal:

FFT distributes the energy of time-varying frequency components across the entire frequency spectrum. This distribution can result in lower amplitude values for specific frequencies than what might be observed in localized time segments of the signal.
Essentially, the energy of a frequency component that is present only for a short duration gets spread out, reducing its peak amplitude in the FFT result.

Lossless Property

The Fourier Transform is a lossless transformation, meaning that no information is lost when converting a signal from the time domain to the frequency domain and back. Here’s a breakdown of why this is the case:

Lossless Nature of Fourier Transform

Equal Number of Points and Components:

If a signal has \(n\) time points, its Fourier Transform will consist of \(n\) sine and cosine components (or complex exponentials).
This one-to-one correspondence ensures that all information in the time domain is preserved in the frequency domain.

Degrees of Freedom:

The degrees of freedom in a Fourier Transform are zero because the transformation is bijective (one-to-one and onto). Each point in the time domain corresponds to a unique set of Fourier coefficients in the frequency domain.

100% Variance Explained:

The Fourier Transform decomposes the signal into a sum of sine and cosine waves that completely represent the original signal. This means that 100% of the variance in the signal is explained by its Fourier components.

Unique Frequency Components:

Each Fourier coefficient corresponds to a unique frequency component of the signal. The set of these coefficients fully describes the signal in the frequency domain.

Complex Fourier Coefficients

The Fourier Transform converts a time-domain signal into a set of complex numbers (Fourier coefficients) for each frequency component of the signal. These coefficients represent the signal in the frequency domain. Here’s a concise explanation of what these coefficients signify:

Each complex number can be expressed in the form \(a + bi\), where \(a\) is the real part and \(b\) is the imaginary part.

Example:

x = rand(5)
X = fft(x)

5-element Vector{ComplexF64}:
  3.8842318614963403 + 0.0im
 0.09588104749214789 - 0.31114639474112676im
 0.22478060563875177 - 0.13180601029512268im
 0.22478060563875177 + 0.13180601029512268im
 0.09588104749214789 + 0.31114639474112676im

Magnitude and Phase:

Magnitude: Represents the amplitude (or strength) of each frequency component in the signal. Calculated as: \(\text{Magnitude} = \sqrt{a^2 + b^2}\)
Phase: Represents the phase shift (or timing) of each frequency component. Calculated as: \(\text{Phase} = \arctan\left(\frac{b}{a}\right)\)

Magnitude:

mag = abs.(X)

5-element Vector{Float64}:
 3.8842318614963403
 0.3255844809394216
 0.26057464385707607
 0.26057464385707607
 0.3255844809394216

since

\[ \vert 3+2i \vert = \sqrt{3^2 + 2^2} \]

Phase (in radians):

ph = DSP.angle.(X)

5-element Vector{Float64}:
  0.0
 -1.2718755663771808
 -0.5303418099811322
  0.5303418099811322
  1.2718755663771808

Phase is independent of amplitude, BUT: when amplitude is 0, phase is undefined. Therefore, for small amplitude, phase estimation may be less accurate.

Due to the division, it is practically useful to remove very small values of \(\vert X \vert\) since they will affect calculations. We also use the angle() function (theta = DSP.angle.(X)) instead of theta = atan.(imag.(X), real.(X)).

Interpretation:

The magnitude tells you how much of each frequency is present in the signal.
The phase tells you where each frequency component is located in time relative to the start of the signal.

But this does not relate to physical properties of the signal, just shows the composition of frequencies in the signal.

Phase angles have several uses in neuroscience:

In time-frequency analysis: phases are used to quantify the consistency of the timing of rhythmic activity over repeated trials.
In synchronization analysis: phases are compared across different electrodes to determine whether two brain regions are coupled (functionally interacting).
In spike-field coherence: phases are used to determine whether the timing of action potentials is locked to the population-level field potentials.

Phase angles MUST NOT be averaged. Instead, calculate an average vector, its length represents how close the angles are (e.g. /_ or _._).

Zero Padding

Key Concepts

Purpose: Zero padding adds zeros to the end of a time-domain signal before computing the Fast Fourier Transform (FFT). This does not add new information but increases the frequency resolution of the resulting spectrum. Effect: The FFT output appears “smoother” and more interpolated, making it easier to visualize and analyze frequency components.

Mathematical Implementation

Padding:

Original signal: s (length = L).
Padded signal: s_padded (length = L + n, where n is the number of zeros added).

Normalization:

To maintain correct amplitude scaling, normalize by the original signal length (L): FFT(s, L + n) / L

In NeuroAnalyzer there is fft0(s, n) function available, which will pad the signal s with n zeros.

Inputs:

s: The original signal (time series).
n: Number of zeros to append.

Output: The FFT of the zero-padded signal, normalized by the original length of s.

Example:

If s is an EEG segment of 1000 samples and you call fft0(s, 1000), the function:

Pads s with 1000 zeros (total length = 2000).
Computes the FFT of the padded signal.

Why Use Zero Padding?

Improved Visualization: Smoother frequency spectra for better interpretation.
Interpolation: Helps identify peaks or features that might be missed with coarse frequency bins.
No Information Gain: Zero padding does not improve spectral resolution beyond the original signal’s limits (determined by the sampling rate and original length).
Matching FFT Length for Convolution: Convolution requires both signals to have the same length (e.g., for linear convolution). How: Pad the shorter signal with zeros to match the length of the longer signal.
Increasing Frequency Resolution: Zero padding interpolates the FFT, making the frequency spectrum appear smoother and increasing frequency resolution (number of frequency bins). Key Point: It does not improve frequency precision (the ability to distinguish close frequencies). Rule of Thumb: Doubling the signal length (via zero padding) doubles the number of frequency bins.
Smoother/Interpolated Plots: Zero padding creates a denser frequency axis, useful for visualization or identifying peaks.
Optimizing FFT Performance: FFT algorithms (e.g., Cooley-Tukey) are fastest for lengths that are powers of two. Use nextpow2() to find the next power of two ≥ your signal length, then pad with zeros.

s = rand(10)
l = nextpow2(length(s))
fft0(s, l - length(s))

Practical Note

Zero padding is especially useful in EEG analysis for visualizing high-frequency components or subtle spectral features.
Always remember to normalize by the original signal length to avoid misinterpreting amplitudes.
Always report padding: State the original and padded lengths (e.g., “Padded from 1000 to 2000 samples”).
No Free Lunch: Zero padding doesn’t add information - it only interpolates existing data.

Power Spectrum

The power spectrum is a fundamental concept in signal processing that describes how the total power of a signal is distributed across its frequency components.

It is closely related to the power spectral density (PSD), but while the PSD represents power per unit frequency, the power spectrum typically refers to the total power in each frequency bin (e.g., for discrete or sampled signals).

The power spectrum \(S(k)\) of a signal \(x(t)\) is the squared magnitude of its Fourier transform, often normalized by the signal length:

\[ S(k) = \frac{1}{N} \vert X(k) \vert^2 \]

where:
\(X(k)\) is the discrete Fourier transform (DFT) of \(x(n)\),
\(N\) is the number of samples,
\(k\) is the frequency bin index.

x = rand(10)
n = length(x)
X = fft(x)
S = abs2.(X) ./ n

For continuous-time signals, the power spectrum is the integral of the PSD over a frequency band.

Normalization

To get the Fourier coefficients in the same scale as the original data, we must divide the coefficients by the number of samples.

Interpretation

The power spectrum shows which frequencies are present in the signal and their relative strengths.

Peaks in the power spectrum indicate dominant frequencies (e.g., the fundamental frequency of a sine wave).

The sum of the power spectrum over all frequencies equals the total power of the signal (Parseval’s theorem).

Units

For discrete signals, the power spectrum is often dimensionless (since it’s normalized by \(N\)).

These methods are valid if we are using a rectangular window, the signal is stationary and we calculate discrete power spectrum (which means that individual frequencies are distanced; otherwise spectra of these frequencies will merge and nothing can be said about individual frequencies).

Bin width is the distance between to samples in the spectrum.

\[ bw = \frac{f_s}{N} \]

where:
\(f_s\) is the sampling rate,
\(N\) is the number of samples.

fs = 10
k = 0:(n / 2) # 0.0:1.0:5.0
bw = fs / n
freq = collect(k .* bw)

6-element Vector{Float64}:
 0.0
 1.0
 2.0
 3.0
 4.0
 5.0

Therefore, to have a more narrow distance between separate frequencies in the spectrum, we need longer data (more samples) = more frequency resolution.

Same may be obtain by padding the signal with 0 s. However, zero-padding only increases visualization, but not frequency resolution.

Positive and Negative Frequencies

For real-valued signals, the power spectrum is symmetric about zero frequency. Only the first half (single-sided) is usually plotted.

Fourier transform returns coefficients above Nyquist frequency; these are negative frequencies - below frequencies going backward.

Negative frequencies are a mathematical artifact of using complex exponentials to represent real signals. For real-valued signals, the negative and positive frequency components are complex conjugates of each other and contain the same information.

To reconstruct amplitudes from negative frequencies we multiple amplitudes of positive frequencies by 2. However: amplitude at 0 Hz and Nyquist frequency also gets doubled this way; correctly all frequencies above 0 and below Nyquist should be doubled.

Here’s how to compute the single-sided, normalized power spectrum of a signal:

x = rand(10)
n = length(x)
X = fft(x)
S = abs2.(X) ./ n
S[2:(end - 1)] .*= 2

For real-valued signal (e.g. EEG) the negative frequencies mirror the positive frequencies and the amplitude gets split between the positive and negative frequencies. The negative frequencies are important only for signal of complex numbers (they are different than positive signal then), in neuroscience we usually ignore negative frequencies and double positive frequencies instead (we use rfft() instead of fft(), which requires FFTW.jl).

However, negative frequencies are necessary for inverse Fourier transform.

Amplitude Spectrum

The amplitude spectrum is a fundamental tool in signal processing that represents the magnitude of the frequency components of a signal. It shows how much of each frequency is present in the signal, regardless of phase. Unlike the power spectrum (which represents the squared magnitude), the amplitude spectrum directly reflects the strength of each frequency component.

Units

The units of the amplitude spectrum depend on the units of the original signal.

Single-Sided vs. Double-Sided Spectrum

For real-valued signals, the amplitude spectrum is symmetric about zero frequency. Only the single-sided spectrum (from 0 to fs/2) is typically plotted, as the negative frequencies are redundant.

Here’s how to compute the single-sided, normalized amplitude spectrum of a signal:

x = rand(10)
n = length(x)
X = fft(x)
A = abs.(X) ./ n
A[2:(end - 1)] .*= 2

A discrete spectrum occurs when a signal is composed of a finite or countable number of distinct frequency components. These signals are typically periodic (or a sum of periodic signals) and their Fourier transform consists of impulses (Dirac delta functions) at specific frequencies.

Amplitude and power spectra are useful for signal of discrete spectrum. For a signal with continuous spectrum, Power Spectral Density should be used.

Power Spectral Density

Power Spectral Density (PSD) is the power spectrum normalized for the sampling frequency.

\[ PSD_{xx}(f_k) = \frac{1}{f_sN} \vert X(f_k) \vert^2 \]

where:
\(X(f_k)\) is the Discrete Fourier Transform (DFT) of \(x[n]\),
\(f_k\) are the frequency bins,
\(f_s\) is the sampling frequency,
\(N\) is the number of samples.

For real-valued signals, the PSD is symmetric around 0 Hz, so we often use the one-sided PSD (for f ≥ 0):

\[ PSD_{xx}(f_k) = \frac{2}{f_sN} \vert X(f_k) \vert^2 \]

for \(f_k > 0\).

The factor of 2 accounts for the energy in both positive and negative frequencies.

Normalization

The \(\frac{1}{f_sN}\) term normalizes the PSD so that:

The total power (integral of PSD over frequency) equals the variance of the signal.
The units of PSD are power per Hz (e.g., μV²/Hz for EEG signals).

Parseval’s Theorem: the total power in the time domain equals the integral of the PSD over frequency.

# sampling frequency (Hz)
fs = 250
# mumber of samples
N = 1000
# ime vector (1 second)
t = range(0, 1, length=N)
# 10 Hz + 20 Hz signal
x = generate_sine(10, t, 1) .+ generate_sine(20, t, 0.5)

# compute DFT
# two-sided DFT
X = fft(x)
# take one-sided spectrum (for real signals)
X_one_sided = X[1:N÷2+1]

# frequency vector
# frequency bins (0 to Nyquist)
f = collect(range(0, fs/2, length=N÷2+1))

# one-sided PSD
PSD = (2 .* abs.(X_one_sided).^2) ./ (fs * N)
# DC component (0 Hz) should not be doubled
PSD[1] /= 2
# Nyquist frequency should not be doubled
PSD[end] /= 2

# plot
plot_psd(f,
         PSD,
         xlabel="Frequency (Hz)",
         ylabel="PSD (V²/Hz)",
         title="Power Spectral Density")

Common PSD Estimation Methods

Periodogram (as above): Simple but noisy for short signals.
Welch’s Method: Averages periodograms of overlapping segments for smoother estimates.
Multitaper Method: Uses multiple tapers to reduce variance.

# compute PSD using Welch's method
psd_result = welch_pgram(x,
                         fs=fs)
# frequency vector
f = Float64.(psd_result.freq)
# PSD vector
PSD = psd_result.power 

# plot
plot_psd(f,
         PSD,
         xlabel="Frequency (Hz)",
         ylabel="PSD (V²/Hz)",
         title="Power Spectral Density",
         flim=(0, 20))

Euler’s Formula

Core Concepts

Euler’s formula connects exponential, trigonometric, and complex number representations:

\[ e^{i \times \theta} = \cos(\theta) + i \times \sin(\theta) \]

where \(\theta\) is the phase.

\(e^{1i \times \theta}\) (exp(1im * theta)) is an efficient way to represent oscillatory information using polar notation.

Unit Circle Representation:

e^{1i } describes a unit vector (magnitude = 1) rotating at angle \(\theta\) in the complex plane.
Real part: \(\Re = \cos(\theta)\) (x-coordinate).
Imaginary part: \(\Im = \sin(\theta)\) (y-coordinate).

Raising a complex number to the natural log base gives a vector with absolute value of 1 and an angle of the real component of the number.

Euler’s formula for vectors larger than \([1, 1]\) (with magnitude \(m\)):

\[ m \times e^{1i \theta} = m \cos(\theta) + i m \sin(\theta) \]

Applications in Signal Processing

Representing Oscillations

Complex exponentials (\(e^{i\theta}\)) are the eigenfunctions of linear time-invariant (LTI) systems.
Any real-world oscillatory signal (e.g., EEG) can be expressed as a sum of complex exponentials using the Fourier transform.

Extracting Amplitude and Phase

For a complex number \(z = x + iy\)

Amplitude (magnitude): \(\vert z \vert = \sqrt{x^2 + y^2}\).
Phase (angle): \(\theta = \tan^{-1}(y/x)\).

In polar form:

\[ z = \vert z \vert e^{i\theta} \]

This is how the Hilbert transform extracts instantaneous amplitude/phase from real signals.

Power and Phase Information

Power: \(\vert z \vert ^2 = x^2 + y^2\) (energy of the oscillation).
Phase: \(\theta\) (timing information, e.g., “when” a frequency component peaks).

Euler’s formula for \(\theta = \pi\):

\[ e^{\pi i} + 1 = 0 \]

Used for extracting power and phase information; phase is a temporal information (when a frequency occurred).

theta = pi / 4 # 45°
z = 1.2
plot_polar([theta z], ticks = true)

theta = pi / 4 # 45°
euler = z * exp(1im * theta)
# or
euler = z * (cos(theta) + 1im * sin(theta))
plot_polar([DSP.angle(euler) abs(euler)]; ticks = true)

Example

Generating 2-second cosine wave of 10 Hz frequency, amplitude of 2 (arbitrary units) and phase shift of +30° (≈ 0.52 radians) and performing the Fourier transform to calculate amplitude and phase:

fs = 1000
t = collect(0:(1 / fs):2)
s = generate_cosine(10, t, 2, 30)
# X will contain:
# - Fourier coefficients (X.c)
# - amplitudes (X.a)
# - powers (X.p)
# - phases (X.ph)
X = ftransform(s, db=false)
# f will contain vector of frequencies
f = NeuroAnalyzer.freqs(s, fs)[1]

Plotting the signal:

NeuroAnalyzer.plot(t, s, ylabel = "Amplitude [AU]")

Plot power spectrum, ignore 0 Hz:

# we need to discard 0 Hz
NeuroAnalyzer.plot_psd(f,
                       X.p,
                       flim = (1, 40),
                       ylabel = "Power [AU^2]")

Similar results can be obtained using a periodogram (note that units are different):

psd_data = psd(s, fs = fs, db = false)
plot_psd(psd_data.f,
         psd_data.p,
         flim = (1, 40),
         ylabel = "Power [AU^2/Hz]")

The frequencies where there is no signal (which is all except for the 10 Hz frequency), the phase will be given by noise. You can set the phase here to zero:

X.ph[X.a .< maximum(X.a)] .= 0;

Plot phases:

plot_phase(rad2deg.(X.ph[1:100]), f[1:100],
           unit = :deg,
           type = :stem)

There is a 30° phase shift at 10 Hz, as expected.

Remember that sine has initial phase of -π/2 (-90°). Also, if the input signal does not have an integer of periods, it will cause some phase shift.

To avoid edge artifacts, windowing (e.g. Hann) could be applied:

w = generate_window(:hann, length(s))
NeuroAnalyzer.plot(t, w)

Multiply the signal by the window:

sw = s .* w
NeuroAnalyzer.plot(t, sw)

Re-run the Fourier transform:

Xw = ftransform(sw)

Plot amplitude spectrum:

plot_psd(f,
         Xw.a,
         flim = (0, 50),
         ylabel = "Amplitude [AU]")

Plot power spectrum:

plot_psd(f,
         Xw.p,
         flim = (0, 50),
         ylabel = "Amplitude [AU^2]")

Plot phases:

Xw.ph[Xw.a .< maximum(Xw.a)] .= 0
plot_phase(rad2deg.(Xw.ph[1:100]), f[1:100],
           unit = :deg,
           type = :stem)

Inverse Fourier Transform

Because the Fourier Transform is lossless, the Inverse Fourier Transform (IFT) can perfectly reconstruct the original time-domain signal from its frequency-domain representation.

For discrete signals, the Inverse Discrete Fourier Transform (IDFT) is:

\[ s[n] = \frac{1}{N} \sum_{k = 0}^{N - 1} X[k] e^{i 2\pi kn/N} \]

fs = 1000
t = collect(0:(1 / fs):2)
s = generate_cosine(10, t, 2, 30)
# f will contain vector of frequencies
f = freqs(s, fs)[1]
# Fourier transform
X = ftransform(s, nf = true)
# Inverse Fourier transform
s_ifft = real(ifft0(X.c))

Tip: By default NeuroAnalyzer transform() and ftransform() functions use Fourier transform and returns coefficients for positive frequencies only (internally it uses FFTW.rfft()). For inverse Fourier transform we need positive and negative frequencies in the Fourier components. To do so, set nf=true for transform() or ftransform().

Tip: ifft0() applies re-normalization (multiplies by the length of the Fourier coefficients).

NeuroAnalyzer.plot(t,
                   s)

NeuroAnalyzer.plot(t,
                   s_ifft)

Frequency Resolution

Frequency resolution determines minimal difference between two frequencies that can be separated (e.g., separating 10 Hz from 10.5 Hz).

If you need higher resolution, increase \(N_{FFT}\) (e.g., use longer segments or zero-padding).

The frequencies in the Fourier transform are in normalized units, where 1 is the data sampling rate.

The frequencies in the Fourier transform will always start at 0 Hz, regardless of the data sampling rate or the number of time points.

Frequency resolution:

\[ \Delta f = \frac{1}{T} = \frac{f_s}{N} \]

where:
\(\Delta f\) is the frequency resolution (Hz),
\(T\) is the length of the signal (in seconds),
\(f_s\) is the sampling rate (Hz),
\(N\) is the number of points in the FFT (or the length of your time-domain signal segment).

For example, if we want 0.2 Hz frequency resolution, the signal must be

\[ T = 1/0.2 = 5.0 \]

seconds long.

How to Improve Frequency Resolution

Increase \(N_{FFT}\):

Longer Segments: Use a longer time-domain segment of your signal. This increases N directly.
Zero-Padding: Artificially increases N by appending zeros to your signal. This doesn’t add information but interpolates the frequency spectrum, making it appear smoother.

Note: Zero-padding improves the visual resolution (more points in the frequency domain) but does not improve the actual ability to resolve close frequencies. Only longer time segments do that.