Sampling

Initialize NeuroAnalyzer

using NeuroAnalyzer
using Plots
eeg = load("files/eeg.hdf")

Sampling is the process of converting a continuous-time signal (e.g., EEG, audio, or sensor data) into a discrete-time signal by taking measurements at regular intervals. It involves two key steps:

Discretization: Dividing the continuous signal into equal time intervals (sampling points).
Quantization: Approximating the amplitude of the signal at each sampling point to a finite set of values (e.g., 16-bit, 32-bit).

Quantization Error: The difference between the original continuous signal and the quantized discrete signal.

Cause: Limited precision in representing amplitudes (e.g., rounding to the nearest integer or floating-point value).
Impact: Introduces noise into the signal, reducing accuracy.
Example: A 16-bit ADC (Analog-to-Digital Converter) has a quantization error of ±0.5 LSB (Least Significant Bit).
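The ±0.5 LSB bound can be checked numerically. The sketch below is only an illustration: `lsb_size` and `quantize` are hypothetical helpers (not NeuroAnalyzer functions), and it assumes evenly spaced levels spanning the range [-1, 1] with rounding to the nearest level.

```julia
# Hypothetical helpers for illustration; not part of NeuroAnalyzer.
# Assumption: levels are evenly spaced and span [lo, hi] inclusive.
lsb_size(bits; lo = -1.0, hi = 1.0) = (hi - lo) / (2^bits - 1)

function quantize(x, bits; lo = -1.0, hi = 1.0)
    step = lsb_size(bits; lo = lo, hi = hi)
    return lo + round((x - lo) / step) * step   # snap to the nearest level
end

t = 0:0.001:1                       # 1 s of time points
x = sin.(2π * 5 .* t)               # 5 Hz sine with amplitude in [-1, 1]
bits = 8
xq = quantize.(x, bits)             # quantized copy of the signal
max_err = maximum(abs.(x .- xq))    # worst-case quantization error

println("1 LSB = ", lsb_size(bits))
println("max error = ", max_err, " (bound: ", lsb_size(bits) / 2, ")")
```

Raising `bits` shrinks the LSB and therefore the worst-case error, which is why higher-resolution converters add less quantization noise.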

Aliasing: A distortion that occurs when a signal is sampled at a rate lower than twice its highest frequency component. This causes high-frequency signals to appear as low-frequency signals, leading to phantom slow activity in the data.

Cause: Undersampling violates the Nyquist-Shannon sampling theorem.
Impact:
- Loss of high-frequency information.
- Introduction of false low-frequency components.
- Irreversible—once aliased, the original high-frequency signal cannot be recovered.
Example: A 100 Hz sine wave sampled at 80 Hz will appear as a 20 Hz sine wave (100 Hz - 80 Hz = 20 Hz).
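The 100 Hz / 80 Hz example can be reproduced in a few lines of plain Julia. This is a minimal sketch; `alias_frequency` is a hypothetical helper that folds a frequency into the range from 0 Hz to half the sampling rate.

```julia
# Hypothetical helper: frequency at which a tone of `f` Hz shows up
# when sampled at `fs` Hz.
alias_frequency(f, fs) = abs(f - fs * round(f / fs))

fs = 80                              # sampling rate (Hz)
t = (0:fs - 1) ./ fs                 # 1 s of sample times
x_100 = sin.(2π * 100 .* t)          # 100 Hz sine, undersampled
x_20 = sin.(2π * 20 .* t)            # 20 Hz sine at the same sample times

println(alias_frequency(100, fs))            # 20.0
println(maximum(abs.(x_100 .- x_20)))        # numerically close to zero
```

The two sample vectors coincide up to floating-point error, so nothing in the sampled data distinguishes the 100 Hz tone from a genuine 20 Hz one.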

Anti-Aliasing Filter: A low-pass filter applied before sampling to remove frequencies above the Nyquist frequency.

Purpose: Ensures that the signal contains no frequencies higher than half the sampling rate. Prevents aliasing by cutting off high-frequency components that could distort the sampled signal.
Impact on Temporal Precision: The cut-off frequency of the anti-aliasing filter determines the highest measurable frequency and thus the temporal precision of the data (on the order of milliseconds).

Nyquist Frequency: The highest frequency that can be accurately represented in a sampled signal.

Formula:

\[ Nf = \frac{sr}{2} \]

where:
\(Nf\) is the Nyquist Frequency,
\(sr\) is the sampling rate.

Interpretation:

Frequencies above the Nyquist frequency will be aliased (appear as lower frequencies).
The Nyquist frequency is the theoretical limit for frequency extraction from the data.

Number of Frequencies:

\[ N_{freqs} = \frac{N}{2} + 1 \]

where \(N\) is the number of samples in the signal. This is a count of the distinct frequencies that can be extracted, not a frequency in Hz:

\[ \frac{N}{2} \text{ (for frequencies above 0 Hz, up to Nyquist)} + 1 \text{ (for 0 Hz)} \]

Rayleigh Frequency: The lowest frequency that can be extracted from a signal.

Formula:

\[ \text{Rayleigh Frequency} (rf) = \frac{1}{\text{Total Time}} \]

Interpretation:

Determines the longest period that can be resolved in the data.
For example, a 1-second signal has a Rayleigh frequency of 1 Hz.
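As a worked example, the limits defined above can be computed for a recording of known length. The sampling rate and sample count below are arbitrary example values, not taken from the tutorial data.

```julia
fs = 256                  # sampling rate (Hz); example value
N = 2560                  # number of samples, i.e. 10 s of data
total_time = N / fs       # duration of the signal (s)

nyquist = fs / 2          # highest representable frequency (Hz)
rayleigh = 1 / total_time # lowest resolvable frequency (Hz)
nfreqs = N ÷ 2 + 1        # number of frequencies from 0 Hz to Nyquist

println("Nyquist: $nyquist Hz, Rayleigh: $rayleigh Hz, frequencies: $nfreqs")
```

A longer recording lowers the lowest resolvable frequency, while a higher sampling rate raises the Nyquist frequency; the two limits are set independently.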

Nyquist–Shannon Sampling Theorem: To perfectly reconstruct a continuous signal from its samples, the sampling rate must be at least twice the highest frequency component of the signal.

Formula:

\[ f_s \geq 2 \times f_{max} \]

where:
\(f_s\) is the sampling rate,
\(f_{max}\) is the highest frequency in the signal.

Practical Implications:

Minimum Sampling Rate: \(2 \times f_{max}\) (Nyquist rate).
Optimal Sampling Rate: 2–5× the highest frequency to ensure accuracy and reduce artifacts.

Example: For a signal with a maximum frequency of 200 Hz, the optimal sampling rate is 400–1000 Hz.
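The rules of thumb above are simple enough to encode directly. The helper names below are hypothetical and exist only for this sketch.

```julia
# Hypothetical helpers encoding the rules of thumb above.
nyquist_rate(fmax) = 2 * fmax                        # theoretical minimum
recommended_rates(fmax) = (2 * fmax, 5 * fmax)       # 2-5x the highest frequency
meets_criterion(fs, fmax) = fs >= nyquist_rate(fmax) # f_s >= 2 * f_max

fmax = 200
println(nyquist_rate(fmax))          # 400
println(recommended_rates(fmax))     # (400, 1000)
println(meets_criterion(256, fmax))  # false
println(meets_criterion(512, fmax))  # true
```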

Oversampling: Sampling at a rate well above the required minimum (e.g., sampling a 10 Hz cosine wave at 32× its frequency uses a sampling rate of 320 Hz, i.e., 32 samples per cycle).

General Rules for Sampling

Minimum Sampling:

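At least 2 points per cycle of the highest frequency are required to avoid aliasing.
Minimum sampling frequency = 2× the highest frequency in the signal, so that this frequency does not exceed the Nyquist frequency.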

Optimal Sampling Rate:

2–5× the highest frequency of interest.
Example: For a signal with a maximum frequency of 100 Hz, the optimal sampling rate is 200–500 Hz.

Avoid Subsampling: Sampling rate should be 5–10× the rate of the fastest changes of interest in the signal to prevent subsampling artifacts.

Narrowband vs. Broadband Signals:

For narrowband signals (e.g., a single frequency), a lower sampling rate may suffice.
For broadband signals (e.g., EEG with multiple frequency components), the Nyquist criterion must be strictly followed.

Sampling interval (\(dt\)): The time between two consecutive samples.

\[ dt = \frac{1}{f_s} \]

\[ dt = t[2] - t[1] \]

where \(t\) are the time points.

Sampling Rate (\(f_s\)): The number of samples taken per second.

\[ f_s = \frac{1}{dt} \]

Frequency Step Size (\(df\)): The smallest frequency difference that can be resolved in the frequency domain.

\[ df = 1 / (n \times dt) \]

where \(n\) is the number of time points.

Generating a vector of frequencies:

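fs = 200                  # sampling rate (Hz); Nyquist frequency = 100 Hz
dt = 1 / fs               # sampling interval (s)
t = 0:dt:(10 - dt)        # 10 s of time points
n = length(t)             # number of time points
df = fs / n               # frequency step, same as 1 / (n * dt)
hz = 0:df:(fs / 2)        # frequencies from 0 Hz to Nyquist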

fs = 100          # sampling rate (Hz)
dt = 1 / fs       # sampling interval (s)
t = 0:dt:10       # time points
freqs(t)          # frequencies for the time vector t

# 50 linearly spaced points from 0 to 100 Hz (endpoints included)
linspace(0, 100, 50)

Getting the index of a particular frequency:

frq_value = 57
_, frq_index = findmin(abs.(hz .- frq_value))
hz[frq_index]

57.0

frq_index = vsearch(frq_value, hz)[1]
hz[frq_index]

57.0
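When the frequency grid is uniform and starts at 0 Hz, the index can also be computed directly from the step size instead of searching. The sketch below uses only Base Julia; the grid and step are example values, and the final check compares the result with an exhaustive nearest-value search.

```julia
df = 0.1                                  # frequency step (Hz); example value
hz = 0:df:100                             # uniform frequency grid
frq_value = 57

# On a uniform grid starting at 0 Hz, the nearest index follows from the step size
frq_index = clamp(round(Int, frq_value / df) + 1, 1, length(hz))

# Cross-check against the exhaustive search
@assert frq_index == argmin(abs.(hz .- frq_value))
println(hz[frq_index])
```

The `clamp` call keeps the index valid when the requested frequency lies outside the grid, in which case the nearest end point is returned.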

Resampling

Resampling (Downsampling and Upsampling) is the process of changing the sampling rate of a discrete-time signal. It is commonly used in EEG, audio, and signal processing to:

Reduce data size (downsampling).
Align signals with different sampling rates (upsampling).
Match the sampling rate of different devices or analysis tools.

Resampling must be done carefully to avoid aliasing and distortion.

Downsampling

Definition: Reducing the sampling rate of a signal by removing samples.

Key Steps:

Apply an Anti-Aliasing Filter:

Purpose: Remove frequencies above the new Nyquist frequency (half of the new sampling rate) to prevent aliasing.
Cutoff Frequency: Set to the Nyquist frequency of the new sampling rate, not the original.
Example: Downsampling from 1000 Hz to 250 Hz → New Nyquist frequency = 125 Hz. The anti-aliasing filter should cut off at ≤125 Hz.
Why? Frequencies above the new Nyquist frequency would alias into the lower frequency range during downsampling.

Remove Samples:

After filtering, keep every \(k\)-th sample and discard the rest to achieve the new sampling rate.
Example: Downsampling from 1000 Hz to 250 Hz → Keep 1 out of every 4 samples (\(1000 / 250 = 4\)).

Why Use Integer Factors?

Efficiency: Downsampling by an integer factor (e.g., 1000 Hz → 250 Hz, a factor of 4) means the original sampling rate is an exact multiple of the new one, so samples can simply be dropped without interpolation, avoiding fractional delays or phase distortions.
Compatibility: Many analysis tools and devices expect sampling rates that are integer multiples of each other.
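The two downsampling steps (anti-aliasing filter, then sample removal) can be sketched in plain Julia. This is an illustration under stated assumptions, not NeuroAnalyzer's implementation: all helpers are hypothetical, the filter is a 101-tap Hamming-windowed sinc, and its cutoff is placed at 80% of the new Nyquist frequency to leave room for the transition band.

```julia
# All helpers below are hypothetical, written only for this sketch.

# Windowed-sinc low-pass FIR (Hamming window), normalized to unity gain at 0 Hz.
function lowpass_fir(fc, fs; ntaps = 101)
    k = (0:ntaps - 1) .- (ntaps - 1) / 2       # tap positions centered on 0
    h = sinc.(2 * fc / fs .* k)                # ideal low-pass; Base sinc(x) = sin(πx)/(πx)
    w = 0.54 .- 0.46 .* cos.(2π .* (0:ntaps - 1) ./ (ntaps - 1))   # Hamming window
    h = h .* w
    return h ./ sum(h)
end

# Centered convolution with a symmetric kernel, so the output is not delayed.
function filter_same(x, h)
    n, half = length(x), (length(h) - 1) ÷ 2
    y = zeros(n)
    for i in 1:n, j in eachindex(h)
        idx = i + half - (j - 1)
        if 1 <= idx <= n
            y[i] += h[j] * x[idx]
        end
    end
    return y
end

function downsample_by(x, fs, k)
    new_fs = fs / k
    h = lowpass_fir(0.8 * new_fs / 2, fs)      # cutoff just below the new Nyquist
    return filter_same(x, h)[1:k:end]          # filter first, then keep every k-th sample
end

fs, k = 1000, 4                                # 1000 Hz -> 250 Hz
t = (0:fs - 1) ./ fs                           # 1 s of sample times
x = sin.(2π * 10 .* t) .+ sin.(2π * 300 .* t)  # 300 Hz lies above the new Nyquist (125 Hz)
y = downsample_by(x, fs, k)
println(length(x), " -> ", length(y), " samples")
```

Away from the edges, `y` follows the 10 Hz component alone. Skipping the filter and taking `x[1:k:end]` directly would fold the 300 Hz component down to 50 Hz.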

Upsampling

Definition: Increasing the sampling rate of a signal by adding new samples.

Key Steps:

Insert Zeros:

Add zeros between existing samples to increase the sampling rate.
Example: Upsampling from 250 Hz to 1000 Hz → Insert 3 zeros between each sample.

Apply a Low-Pass Filter:

Purpose: Remove the imaging artifacts (high-frequency components introduced by zero insertion).
Cutoff Frequency: Set to the original Nyquist frequency (half of the original sampling rate).
Why? The inserted zeros create high-frequency artifacts that must be filtered out.
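The two upsampling steps can be sketched the same way in plain Julia. Again this is only an illustration under assumptions: `upsample_by` is a hypothetical helper, the low-pass is a 101-tap Hamming-windowed sinc with its cutoff at the original Nyquist frequency, and its gain is set to the upsampling factor to compensate for the inserted zeros.

```julia
# Hypothetical helper written for this sketch.
function upsample_by(x, fs, L; ntaps = 101)
    new_fs = fs * L
    # 1. Zero insertion: L - 1 zeros between consecutive samples
    up = zeros(length(x) * L)
    up[1:L:end] .= x
    # 2. Windowed-sinc low-pass (Hamming) with cutoff at the original Nyquist
    k = (0:ntaps - 1) .- (ntaps - 1) / 2
    w = 0.54 .- 0.46 .* cos.(2π .* (0:ntaps - 1) ./ (ntaps - 1))
    h = sinc.(fs / new_fs .* k) .* w
    h = L .* h ./ sum(h)                 # gain L restores the original amplitude
    # 3. Centered convolution, so the output is not delayed
    half = (ntaps - 1) ÷ 2
    y = zeros(length(up))
    for i in eachindex(up), j in 1:ntaps
        idx = i + half - (j - 1)
        if 1 <= idx <= length(up)
            y[i] += h[j] * up[idx]
        end
    end
    return y
end

fs, L = 250, 4                           # 250 Hz -> 1000 Hz
t = (0:fs - 1) ./ fs                     # 1 s of sample times
x = sin.(2π * 10 .* t)                   # 10 Hz sine sampled at 250 Hz
y = upsample_by(x, fs, L)
println(length(x), " -> ", length(y), " samples")
```

Away from the edges, `y` matches a 10 Hz sine sampled at 1000 Hz. No frequency content is created: the filter only interpolates between the original samples.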

Why Upsample to the Highest Rate?

Consistency: When combining signals from different devices (e.g., one EEG at 256 Hz and another at 500 Hz), upsampling to the highest rate ensures all signals are aligned in time.
Avoid Loss of Information: Upsampling preserves the original signal’s frequency content while increasing the sampling rate.

Resampling in practice

When working with multi-rate signals (e.g., one EEG recording at 256 Hz and another at 500 Hz), the goal is to align the sampling rates so that the signals can be compared, combined, or analyzed together. The usual choice is to upsample to the highest sampling rate.

Upsampling keeps the original signal's frequency content intact, whereas downsampling the 500 Hz signal to 256 Hz would discard everything between the new Nyquist frequency (128 Hz) and the old one (250 Hz).
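Rates such as 256 Hz and 500 Hz are not integer multiples of each other. One common approach in that case, assumed here for illustration (the source does not say how `resample()` handles it internally), is rational resampling: upsample by an integer L, then downsample by an integer M, with L/M equal to the rate ratio. Julia's built-in rationals give the smallest such pair:

```julia
fs_in, fs_out = 256, 500
ratio = fs_out // fs_in                     # exact rational, reduced to lowest terms
L, M = numerator(ratio), denominator(ratio) # upsample by L, then downsample by M

n_in = 10 * fs_in                           # 10 s recorded at 256 Hz
n_out = n_in * L ÷ M                        # sample count after resampling
println("ratio = $ratio, L = $L, M = $M, samples: $n_in -> $n_out")
```

Here L = 125 and M = 64, so 10 s of data goes from 2560 to 5000 samples.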

Showing the sampling rate:

sr(eeg)

Resampling to 512 Hz:

eeg_512 = NeuroAnalyzer.resample(eeg,
                                 new_sr = 512)
sr(eeg_512)

Resampling to 128 Hz:

eeg_128 = NeuroAnalyzer.resample(eeg,
                                 new_sr = 128)
sr(eeg_128)

Tip: resample() will choose upsample() or downsample() automatically, depending on the signal sampling frequency.

Getting frequencies

Frequencies and Nyquist frequency of the signals:

f_data = freqs(eeg)
println("Number of frequencies: $(length(f_data.hz))")
println("Nyquist frequency: $(f_data.nqf) Hz")

Number of frequencies: 128001
Nyquist frequency: 128.0 Hz

f_data = freqs(eeg_512)
println("Number of frequencies: $(length(f_data.hz))")
println("Nyquist frequency: $(f_data.nqf) Hz")

Number of frequencies: 256002
Nyquist frequency: 256.0 Hz

f_data = freqs(eeg_128)
println("Number of frequencies: $(length(f_data.hz))")
println("Nyquist frequency: $(f_data.nqf) Hz")

Number of frequencies: 64001
Nyquist frequency: 64.0 Hz

Converting frequencies from normalized units to Hz: create N/2 + 1 linearly spaced numbers between 0 and the Nyquist:

N = 1000
nf = 20
linspace(0, nf, N ÷ 2 + 1)

or equivalently:

linspace(0, nf, floor(Int64, N / 2) + 1)

or, starting from the sampling rate:

fs = 40
linspace(0, fs / 2, N ÷ 2 + 1)
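For an even number of samples, the linearly spaced grid is the same as multiplying each bin index by the frequency step fs / N. The sketch below checks this with Base Julia's `range` used as a stand-in for `linspace` (an assumption made so that the example runs without NeuroAnalyzer loaded).

```julia
N = 1000                                   # number of samples (even)
fs = 40                                    # sampling rate (Hz)

hz_a = range(0, fs / 2, length = N ÷ 2 + 1)   # N/2 + 1 points from 0 Hz to Nyquist
hz_b = (0:(N ÷ 2)) .* (fs / N)                # bin index times the frequency step

println(length(hz_a), " frequencies, step = ", step(hz_a), " Hz")
@assert all(isapprox.(hz_a, hz_b; atol = 1e-12))
```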