Introduction to EEG Analysis

Initialize NeuroAnalyzer

using NeuroAnalyzer
using LinearAlgebra
using StatsKit

EEG analysis often involves working with correlated time series. In this tutorial, you’ll learn basic aspects of such analysis pipelines.

Best Practices for Robust Data Analysis

  1. Simplicity and Clarity
  • Keep analyses simple: Focus on straightforward, interpretable methods. Complexity should serve clarity, not obscure it.
  2. Data Integrity
  • Stay close to the data: Avoid excessive transformations or assumptions that distance you from the raw observations.
  • Record clean signals: Always check impedance before each trial to ensure high-quality data collection.
  3. Understanding and Transparency
  • Understand your analyses: Be able to explain every step of your methodology and its rationale.
  • Write a clear Methods section: Document your process thoroughly so others can replicate and build on your work.
  4. Avoiding Pitfalls
  • Prevent over-fitting: Ensure your model generalizes beyond the training data.
  • Avoid mistakes: Use simulated data to validate results and perform sanity checks (e.g., data visualization, consistency tests).
  5. Reproducibility and Validation
  • Do replicable research: Design your study so that others can reproduce your findings.
  • Split-half replication: Validate results by analyzing one half of the data and replicating with the other half.
  • Collect extra data: Aim for at least 10% more data than needed to account for artifacts or exclusions.
  6. Exploring Parameters
  • Test different parameters: Compare results under varying conditions (e.g., filter cutoffs at 6 Hz vs. 8 Hz) to ensure robustness.
  7. Pilot Testing
  • Run pilot studies: Test stimuli, equipment, and participant understanding of instructions to identify potential issues early.
  8. Informed Decision-Making
  • Make informed choices: Base your methods and analyses on evidence, best practices, and the specific goals of your research.

Domains for Signal Analysis

Signal analysis can be performed in different domains, each offering unique insights into the data:

  1. Time Domain
  • Represented as \(x(t)\) (continuous-time) or \(x[n]\) (discrete-time).
  • Focuses on how the signal evolves over time.
  2. Frequency Domain
  • Reveals the signal’s frequency components, showing which frequencies are present and their amplitudes.
  3. Fourier Transform
  • A mathematical tool that converts a time-domain signal into its frequency-domain representation.
  • Essential for analyzing periodic signals and identifying frequency content.
  4. Laplace Transform
  • Used for continuous-time signals.
  • Generalizes the Fourier transform and is particularly useful for analyzing system stability and transient responses.
  5. Z-Transform
  • Used for discrete-time signals.
  • The discrete-time counterpart to the Laplace transform, ideal for digital signal processing.

Key Relationship

Time Domain → Fourier Transform → Frequency Domain

The Fourier transform bridges the time and frequency domains, allowing you to analyze signals from both perspectives.
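
To make this concrete, the snippet below recovers the frequency of a pure sine wave with a naive discrete Fourier transform (a hand-rolled sketch for illustration; a real pipeline would use an FFT routine):

```julia
# Recover the dominant frequency of a 10 Hz sine sampled at 100 Hz
# using a naive DFT (O(N²); for illustration only).
fs = 100                      # sampling rate [Hz]
t = (0:fs-1) ./ fs            # 1 s of samples
x = sin.(2π .* 10 .* t)       # 10 Hz sine wave

N = length(x)
# single-sided amplitude at bin k (here bin k = k Hz, since T = 1 s)
amp(k) = 2 * abs(sum(x .* exp.(-2π * im * k .* (0:N-1) ./ N))) / N

amps = [amp(k) for k in 0:fs÷2]
peak_hz = argmax(amps) - 1    # bin index → frequency in Hz
println(peak_hz)              # 10
```

The peak lands exactly on a bin here because the record is one full second long; in general, spectral leakage spreads energy across neighboring bins.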

Signal Representation in EEG Analysis

EEG signals can be represented in various ways to highlight different aspects of brain activity:

  1. Amplitude-Based Representations
  • Voltage Parameters: Direct measurement of signal amplitude (e.g., microvolts, μV).
  • EEG Wave Amplitudes: Amplitudes of specific frequency bands:
    • Delta (0.5–4 Hz)
    • Theta (4–8 Hz)
    • Alpha (8–12 Hz)
    • Beta (13–30 Hz)
  2. EEG Pattern Amplitudes: Amplitudes of characteristic EEG patterns, such as:
  • Spike/wave complexes
  • Sharp waves
  • K-complexes
  • Sigma spindles
  3. Frequency-Based Representations: After applying the Fast Fourier Transform (FFT), the signal’s frequency content can be analyzed as:
  • Spectra of Background Activity: Frequency spectra for alpha, beta, delta, and theta bands.
  4. Key Frequency Metrics
  • Peak Frequency: The dominant frequency within a band of interest.
  • Bandpower: Power spectral analysis (μV²), emphasizing the most dominant frequency bands.
  • Amplitude Spectral Analysis: Square root of power (μV), representing all frequencies equally.
  5. Relative Activity
  • Relative Amplitude: The amplitude at a specific frequency (e.g., 35 μV at 11 Hz) divided by the total activity (e.g., area under the curve for 0–31 Hz: 190 μV), expressed as a percentage (e.g., 18%).
  6. Coherence Function
  • Coherence: Measures synchrony between two signals, providing a normalized estimate of their cross-correlation as a function of frequency.
  • Range: 0 to 1, where 1 indicates perfect similarity between the two signals.
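
The relative-amplitude arithmetic from the example above can be sketched directly (the numbers are the illustrative values from the text):

```julia
# Relative amplitude: amplitude at one frequency divided by the
# total activity in the band of interest, as a percentage.
amp_11hz = 35.0          # amplitude at 11 Hz [μV]
total_0_31hz = 190.0     # area under the curve, 0–31 Hz [μV]

relative = amp_11hz / total_0_31hz * 100
println(round(relative; digits=1))   # 18.4, reported as ≈ 18%
```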

Domains of EEG Analysis

EEG analysis spans multiple domains, each offering unique insights into brain activity. Below is an overview of the key domains and their characteristics:

  1. Time-Domain Analysis: Purpose: Maps the temporal sequence of cognitive operations. Limitations: Averaging can lose information due to noise and variability.
  • Event-Related Potentials (ERPs) / Fields (ERFs):
    • Derived by averaging many single trials.
    • Challenges:
      • Noisy input signals reduce amplitude.
      • Jittered or non-phase-locked events are lost.
      • Connectivity cannot be assessed.
      • Biological mechanisms and physiological links are unclear.
      • Low signal-to-noise ratio and statistical power.
      • Averaging discards significant information.
  • Microstates: Short-lived, stable topographic maps of brain activity.
  • Fractal-Based Dynamics: Analyzes self-similar patterns in EEG signals.
  2. Frequency-Domain Analysis: Purpose: Reveals frequency components of the signal. Limitation: Only interpretable for stationary data; FFT obscures temporal information.
  3. Time-Frequency Domain Analysis: Purpose: Combines temporal and spectral information, closely aligning with neurophysiology. Advantages: Suitable for both exploratory and hypothesis-driven analyses. Limitation: Reduced temporal precision compared to pure time-domain analysis.
  • Spectral Analysis: Examines frequency content over time.
  • Wavelet Convolution: Uses wavelets to analyze signals at different scales.
  • Connectivity Measures:
    • Phase-based connectivity
    • Power-based connectivity
    • Cross-frequency coupling
    • Mutual information
  • Spatial Transforms:
    • Independent Component Analysis (ICA)
    • Principal Component Analysis (PCA)
    • Beamforming / Minimum Norm Estimation (MNE)
    • Dipole fitting
  4. Spatial-Domain Analysis: Purpose: Focuses on the spatial distribution of EEG signals across the scalp.
  5. Advanced Processing Techniques: Multiway Processing: Analyzes data across multiple dimensions (e.g., time, frequency, space).
  • Topography: Maps the spatial distribution of EEG activity.
  • Source Separation: Isolates distinct sources of brain activity from mixed signals.
  • Source Localization: Estimates the origin of EEG signals within the brain.

Major Types of EEG Analysis

  1. Core Analysis Techniques
  • Spectral Analysis (Frequency-Domain): Focuses on frequency bands such as alpha, beta, gamma, delta, and theta.
  • Event-Related Potentials (ERPs): Analyzes brain responses to specific stimuli, such as P300 and Error-Related Negativity (ERN).
  • Source Imaging (ESI): Estimates the location, direction, and distribution of EEG sources by solving inverse problems.
  2. Additional Analysis Types
  • Steady-State Topography: Includes Steady-State Visual Evoked Potentials (SSVEP) and Auditory Steady-State Responses (ASSR).
  • Hemispheric Asymmetry: Examines differences in activity between the left and right hemispheres.
  • Coherence: Measures synchrony between signals across different brain regions.
  • Power Spectral Density (PSD): Quantifies the power of signal components at different frequencies.
  • Entropy-Based Measures:
    • Entropy: General measure of signal complexity.
    • Multiscale Entropy: Entropy computed at multiple time scales via coarse-graining of the signal.
    • Differential Entropy (DE): Entropy of a continuous-valued signal; for a Gaussian signal it reduces to a function of the variance.
    • Differential Asymmetry (DASM): Difference in DE between symmetric left–right electrode pairs.
    • Rational Asymmetry (RASM): Ratio of DE between symmetric left–right electrode pairs.
    • Asymmetry (ASM): General hemispheric asymmetry.
  • Causality and Synchronization:
    • Differential Causality (DCAU): Assesses causal relationships.
    • Synchronization Index (SI): Quantifies signal synchronization.
  • Temporal Dynamics:
    • Detrended Fluctuation Analysis (DFA): Studies long-range correlations.
    • Recurrence Quantitative Analysis (RQA): Analyzes recurring patterns in data.
  3. Non-Linear Techniques
  • Correlation Dimension (CD): Measures the dimensionality of the signal’s attractor.
  • Largest Lyapunov Exponent (LLE): Quantifies chaos and predictability.
  • Hurst Exponent (H): Assesses long-term memory and self-similarity.
  • Fractal Dimension (FD): Describes the complexity of signal patterns.
  • Higher-Order Spectra (HOS): Analyzes non-linear interactions in frequency domains.
  • Phase Space Plots: Visualizes system dynamics in phase space.
  • Recurrence Plots: Identifies recurring states in time-series data.

Levels of Data Analysis

Data analysis in research can be conducted at two primary levels, each serving distinct purposes:

  1. Subject-Level (First-Level) Analysis: Focus: Analysis of data within individual subjects, typically across epochs or trials.
  • Descriptive Statistics: Measures such as mean, variance, or standard deviation to summarize data.
  • Test Statistics: Statistical tests like t-tests to evaluate differences or effects within the subject’s data.
  • Model Parameters: Parameters derived from fitting statistical models (e.g., General Linear Model) to the data.
  • Contrasts: Comparisons between:
    • Different time periods within a condition (e.g., event-related response vs. baseline).
    • Different conditions (e.g., task vs. control).
  2. Group-Level (Second-Level) Analysis: Focus: Aggregating subject-level measures to derive group-level insights.
  • Between-Subject Effects: Examines differences between groups (e.g., comparing patients vs. healthy controls).
  • Within-Subject Effects: Investigates differences between conditions within the same group (e.g., pre-treatment vs. post-treatment).
  • Combined Effects: Assesses interactions between between-subject and within-subject factors.

Permutation Testing

Overview

Permutation testing is a non-parametric statistical method commonly used to assess the significance of observed effects without relying on distributional assumptions. It works by systematically reshuffling the data to create a null distribution of the test statistic.

How It Works

  1. Permutation Process:
  • For subject-level analysis, condition labels are randomly permuted across trials.
  • For group-level analysis, condition labels are randomly permuted across subjects.
  2. Recalculating the Test Statistic:
  • The test statistic (e.g., mean difference, t-value) is recalculated for each permutation of the data.
  3. Building the Null Distribution:
  • The collection of test statistics from all permutations forms the null distribution, representing the expected range of values under the null hypothesis (no true difference between conditions).

Evaluating Significance

  • The test statistic from the original, non-permuted data is compared to the null distribution.
  • The p-value is calculated as the proportion of permuted test statistics that are as extreme or more extreme than the observed statistic.
  • If this p-value falls below the chosen statistical threshold (e.g., 0.05), the null hypothesis can be rejected, indicating a significant difference between conditions.
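
A minimal permutation test might look as follows (synthetic data; the effect size and `nperm` are arbitrary choices for this sketch):

```julia
using Random, Statistics
Random.seed!(1)

# Two conditions with a true mean difference of 1.0
a = randn(50) .+ 1.0
b = randn(50)

obs = mean(a) - mean(b)            # observed test statistic
pooled = vcat(a, b)

# null distribution: reshuffle condition labels and recompute
nperm = 5_000
null = map(1:nperm) do _
    s = shuffle(pooled)
    mean(s[1:50]) - mean(s[51:end])
end

# two-sided p-value: proportion of permutations at least as extreme
p = mean(abs.(null) .>= abs(obs))
println(p)
```

With a true effect this large, the observed difference exceeds virtually every permuted one, so the p-value is close to zero.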

Mass Univariate Analysis

Definition

Mass univariate analysis treats each sample - across space (sensors or sources), time, and/or frequency - as a separate univariate variable. Although each sample technically represents a single variable, EEG data is inherently correlated, so treating samples as independent is often an oversimplification.

Process

  • A statistical test is applied simultaneously to all samples (or a preselected subset) within the dataset.
  • The goal is to identify experimental effects at any individual sample, rather than focusing on a single, isolated sample.

Key Characteristics

  • Massive Scale: The analysis involves thousands to millions of tests, depending on the number of samples across time, frequency, sensors, or sources.
  • Multiple Comparisons Problem: The sheer number of tests increases the risk of false positives (Type I errors).

Addressing Multiple Comparisons

  • False Discovery Rate (FDR): A statistical method to control the expected proportion of false positives among significant results.
    • FDR provides a criterion for significance that accounts for the multiple comparisons problem.
    • It defines the expected rate of Type I errors across all tests, ensuring robust interpretation of results.

The Multiple Comparisons Problem

Definition

The multiple comparisons problem arises when conducting numerous statistical tests: as the number of tests increases, so does the probability of a Type I error (false positive) occurring in at least one test. This overall probability is known as the family-wise error rate (FWER).

Family-Wise Error Rate (FWER): The probability of making at least one Type I error across a set of statistical tests. Controlling FWER is a common method for correcting multiple comparisons.
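
The growth of the FWER with the number of independent tests follows directly from the definition:

```julia
# FWER for m independent tests at per-test level α: 1 − (1 − α)^m
α = 0.05
for m in (1, 10, 100)
    println(m, " tests → FWER ≈ ", round(1 - (1 - α)^m; digits=3))
end
# 1 test keeps FWER at 0.05; 10 tests push it to ≈ 0.401;
# 100 tests push it to ≈ 0.994 — a false positive is near-certain.
```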

Mitigating the Problem

  1. Reducing the Number of Comparisons
  • Region of Interest (ROI) Approach:
    • Limit statistical tests to predefined regions (e.g., specific sensors, sources, time windows, or frequency bands).
    • Example: In fMRI, tests are restricted to voxels within predefined spatial ROIs. Similarly, in EEG/MEG, focus on pre-selected sensors, sources, time windows, or frequency bands.
    • This approach dramatically reduces the number of comparisons requiring correction.
  • Collapsing Data:
    • Instead of treating each sample as a univariate measure, collapse data across samples to create summary measures (e.g., mean, maximum).
    • Example: In event-related response analysis, summarize components using measures like amplitude or latency rather than testing each sample individually.
  2. Avoiding Bias in Sample Selection
  • Independence Criterion:
    • Criteria for selecting samples or summary measures must be independent of the experimental effect being tested.
    • Example: It is invalid to define an ROI based on samples showing the largest condition differences, as this biases tests toward rejecting the null hypothesis.
  • Circular Inference:
    • Problem: Selecting time-frequency windows where maximal condition differences are observed introduces bias.
    • Solution:
      • Average data across subjects, conditions, trials, or epochs.
      • Use the averaged data to define windows of interest.
      • Apply these windows to compare subjects, conditions, trials, or epochs.
    • Example workflow: Trials/Epochs → Average per subject → Group-level analysis on averaged data.

Solutions to the Multiple Comparisons Problem

The Challenge

When conducting multiple statistical tests, the chance of finding a significant effect somewhere in the data when no genuine effect exists anywhere will be much larger than the nominal false positive rate set by the statistical threshold. This inflates the risk of incorrectly identifying a significant effect where none exists.

To maintain accurate control of the Type I error rate across the family of tests, it is therefore necessary to reduce the statistical threshold used for the individual tests to compensate for the presence of multiple comparisons.

Methods for Correction

  1. Bonferroni Correction
  • Approach: Adjusts the significance level based on the number of comparisons.
  • Limitation: Can be overly conservative, reducing statistical power.
  2. False Discovery Rate (FDR)
  • Approach: Controls the expected proportion of false positives among significant results, rather than the probability of any false positive.
  • Advantages:
    • Less stringent than Bonferroni correction.
    • Reduces Type II errors (false negatives).
  • Implementation: Often applied using the Benjamini-Hochberg procedure.
  3. Null Distribution of the Maximum Test Statistic
  • Approach: Compares individual test statistics against the null distribution of the maximum test statistic across all tests.
  • Advantages:
    • Provides accurate control of the family-wise error rate (FWER).
    • Generally results in fewer Type II errors compared to Bonferroni correction.
  • Implementation:
    • For parametric tests, the null distribution can be calculated using random field theory.
    • For non-parametric tests (e.g., permutation testing), the null distribution is derived by measuring the distribution of the maximum test statistic across permutations.
  4. Cluster Correction
  • Approach: Only considers activations as significant if they form part of a spatially or temporally contiguous cluster.
  • Implementation:
    • Determine cluster significance by comparing cluster sizes to those derived from the null distribution of maximum cluster sizes.
    • Helps control FWER while preserving sensitivity to true effects.
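
The difference in stringency between Bonferroni and FDR control can be seen on a toy set of already-sorted p-values (the values are made up for illustration; in practice use a statistics package):

```julia
# Compare Bonferroni and Benjamini–Hochberg on sorted p-values
p = [0.001, 0.008, 0.020, 0.028, 0.035, 0.2, 0.5]
α = 0.05
m = length(p)

# Bonferroni: reject tests with p ≤ α/m
n_bonf = count(p .<= α / m)

# Benjamini–Hochberg: find the largest k with p[k] ≤ (k/m)·α,
# then reject the k smallest p-values
ks = findall(k -> p[k] <= k / m * α, 1:m)
n_bh = isempty(ks) ? 0 : maximum(ks)

println((n_bonf, n_bh))   # (1, 5): FDR control rejects more tests
```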

Bonferroni Correction

Limitations in Time-Frequency Analysis

The Bonferroni correction is often too stringent for comparing time-frequency data points. This method adjusts the significance threshold based solely on the number of tests (e.g., the number of time points), without considering the overall pattern in the spectrogram.

Consequences

  • Reduced Statistical Power: The correction significantly lowers the ability to detect genuine effects.
  • Increased Type II Errors: This leads to a higher likelihood of failing to detect true effects (false negatives).

Cluster-based analysis

Overview

Cluster-based analysis addresses the multiple comparisons problem by focusing on clusters of contiguous samples that exhibit an experimental effect (e.g., a segment of spectrogram), rather than analyzing each sample individually. This approach reduces the need for extensive corrections while maintaining statistical rigor.

Key Steps

  1. Initial Statistical Comparisons:
  • Perform statistical comparisons for individual samples, similar to mass univariate analysis.
  2. Cluster Identification:
  • Use the results of these comparisons to identify clusters of samples that exceed a predefined statistical threshold.
  3. Cluster Quantification:
  • Quantify each cluster using a cluster-level statistic, such as the sum of test statistics within the cluster (often referred to as cluster mass).
  4. Statistical Testing of Clusters:
  • Calculate the probability of observing a cluster statistic of equal or greater magnitude anywhere in the data under the null hypothesis.
  5. Null Distribution Calculation:
  • The null distribution of cluster statistics can be derived using random field theory, but for MEG/EEG data, a non-parametric permutation test is more commonly used.
  • Permutation Process:
    • Randomly permute condition labels across subjects (or conditions in within-subject designs).
    • Repeat the clustering procedure for the permuted data.
    • Record the maximum cluster statistic for each permutation.
    • The distribution of these maximum values forms the null distribution.
  6. Hypothesis Testing:
  • Compare the observed cluster statistics to the null distribution to calculate p-values and determine statistical significance.
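
The whole procedure can be sketched on synthetic one-dimensional data (paired design; the threshold, sample sizes, and sign-flip permutation scheme are illustrative choices, not a prescription):

```julia
using Random, Statistics
Random.seed!(2)

# 20 subjects × 100 time points per condition;
# condition A carries an effect at samples 40–60
nsub, ntime = 20, 100
A = randn(nsub, ntime); A[:, 40:60] .+= 1.0
B = randn(nsub, ntime)
D = A .- B                     # within-subject differences

# paired t-value at each time point
tvals(D) = vec(mean(D; dims=1) ./ (std(D; dims=1) ./ sqrt(size(D, 1))))

# contiguous supra-threshold runs → cluster masses (sums of |t|)
function clustermasses(t, thr)
    masses, m = Float64[], 0.0
    for v in t
        if abs(v) > thr
            m += abs(v)
        else
            m > 0 && push!(masses, m)
            m = 0.0
        end
    end
    m > 0 && push!(masses, m)
    return masses
end

thr = 2.09                     # ≈ two-tailed t-critical, df = 19
obs = maximum(clustermasses(tvals(D), thr))

# null distribution of the maximum cluster mass:
# randomly flip the sign of each subject's difference time course
null = map(1:1000) do _
    flips = rand([-1.0, 1.0], nsub)
    cm = clustermasses(tvals(D .* flips), thr)
    isempty(cm) ? 0.0 : maximum(cm)
end

p = mean(null .>= obs)
println(p)
```

The long run of supra-threshold t-values around samples 40–60 yields a cluster mass far larger than anything in the sign-flipped null, so the cluster-level p-value is near zero.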

Advantages and Limitations

  • Advantages:
    • Avoids the need for multiple comparison corrections by focusing on clusters.
    • Provides a robust way to identify spatially or temporally extended effects.
  • Limitations:
    • Lacks precision in localizing effects to individual samples.
    • Statistical effects are interpreted at the cluster level, not at the level of individual data points.

Stationarity

Definition

Stationarity refers to a signal whose statistical properties - such as mean, variance, frequency, and autocovariance - remain constant over time. For example, in a group of sampled signals (an ensemble), the histograms of amplitudes should be similar across different time points.

Testing for Stationarity

To assess stationarity, compare the signal’s statistical properties at different time points.
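
A crude check along these lines, comparing the first and second half of a signal, can be sketched as follows (not a formal stationarity test):

```julia
using Random, Statistics
Random.seed!(3)

# Compare mean and variance across the two halves of a signal
stationary = randn(1_000)
drifting = randn(1_000) .+ range(0, 3; length=1_000)  # mean drifts upward

halves(x) = (x[1:end÷2], x[end÷2+1:end])
for (name, x) in (("stationary", stationary), ("drifting", drifting))
    a, b = halves(x)
    println(rpad(name, 12), "Δmean = ", round(abs(mean(a) - mean(b)); digits=2),
            "  Δvar = ", round(abs(var(a) - var(b)); digits=2))
end
```

The drifting signal shows a large between-half mean difference (≈ 1.5 by construction), flagging a violation of mean stationarity.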

Characteristics of a Stationary Signal

  • Average power remains constant over time.
  • A signal may exhibit mean stationarity (constant mean) but not variance stationarity (constant variance).

Ergodic Process

A process is ergodic when a single sampled signal is representative of the entire ensemble, i.e., its time-averaged statistical properties match the ensemble averages.

Non-Stationarity in EEG

  • EEG signals are inherently non-stationary due to dynamic changes in the states of neuronal assemblies during brain functions (e.g., transitions between different cognitive or physiological states).
  • Non-stationarities in EEG reflect underlying neural events and processes.

Challenges with Non-Stationary Signals

  • Many signal processing methods, such as the Fast Fourier Transform (FFT), assume stationarity. Applying FFT to non-stationary signals can introduce high-frequency noise and distort results.

Addressing Non-Stationarity

To achieve local stationarity, EEG signals can be segmented using:

  1. Fixed Segments: Dividing the signal into equal-length intervals.
  2. Adaptive Methods: Dynamically adjusting segment boundaries based on signal characteristics.
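
Fixed-length segmentation is just a reshape when the signal length is a multiple of the segment length (the sampling rate and epoch length below are arbitrary):

```julia
fs = 250                          # sampling rate [Hz]
x = randn(10 * fs)                # 10 s of signal
seg_len = 2 * fs                  # 2 s epochs

epochs = reshape(x, seg_len, :)   # one column per epoch
println(size(epochs))             # (500, 5)
```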

Handling Violations of Mean Stationarity

If a signal violates mean stationarity, consider the following approaches:

  • Detrending: Remove trends from the signal.
  • Pre-whitening: Compute the derivative of the signal to attenuate low-frequency dynamics.
  • Time-Varying Mean Subtraction: Calculate a time series of the time-varying mean and subtract it from the original signal.
  • Filtering: Apply high-pass (HP), low-pass (LP), or band-pass (BP) filters to isolate relevant frequency components.
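
Detrending with an ordinary least-squares straight-line fit is a one-liner in Julia (a sketch for linear trends; higher-order trends need a richer design matrix):

```julia
using Statistics

n = 1_000
t = collect(1.0:n)
x = 0.01 .* t .+ randn(n)     # signal with a linear upward trend

X = [ones(n) t]               # design matrix: intercept + slope
β = X \ x                     # least-squares coefficients
detrended = x .- X * β        # residuals: trend removed

println(abs(mean(detrended)) < 1e-8)   # true: residual mean ≈ 0
```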

Univariate and Multivariate Analysis

Univariate Analysis

Definition: Analyzes one channel at a time.
Process: Each channel is evaluated independently, and results are summarized afterward.
Example: “80% of channels showed a spectral peak at 23.4 Hz.”

Multivariate Analysis

Definition: Analyzes multiple channels simultaneously.
Key Features:

  • Considers the conjunctive relationships between multiple measurements (e.g., all EEG channels).
  • Extracts features from the entire dataset, capturing interactions among channels.
  • Requires multichannel data to perform effectively.

Generating multivariate time series data:

v = [1 0.5 0; 0.5 1 0; 0 0 1]

# upper Cholesky factor of the positive-definite covariance v*v'
c = cholesky(v * v').U

# 10_000 time points
n = 10_000

# d contains 10_000 × 3 random numbers
# columns 1 and 2 are correlated at 0.8
# column 3 is uncorrelated
d = randn(n, size(v, 1)) * c
display(d[1:4, :])
4×3 Matrix{Float64}:
  0.242899   -0.667166  -1.35143
  0.713569    0.250843  -2.24936
  0.0405446  -0.199368  -0.793128
 -1.56027    -1.74489    0.996115
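
As a sanity check (in the spirit of the best practices above), the empirical correlation matrix of data generated this way should approach the target; regenerating here with more samples for a tighter estimate:

```julia
using LinearAlgebra, Statistics

v = [1 0.5 0; 0.5 1 0; 0 0 1]
c = cholesky(v * v').U          # upper Cholesky factor of the covariance
d = randn(100_000, 3) * c

# off-diagonals should be ≈ 0.8 (columns 1–2) and ≈ 0 elsewhere
println(round.(cor(d); digits=2))
```

The exact values vary slightly from run to run, shrinking toward the target as the number of samples grows.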

Signal Decomposition

Definition

Signal decomposition involves separating a signal into distinct components to analyze its inherent features. This process helps reveal underlying patterns, frequencies, or modes that may not be apparent in the raw signal.

Common Methods

  1. Fourier Transform (FT): Decomposes a signal into its constituent frequencies, revealing its spectral content.
  2. Wavelet Transform (WT): Provides a time-frequency representation, capturing both when and at what frequency a feature occurs.
  3. Empirical Mode Decomposition (EMD): Adaptively decomposes a signal into intrinsic mode functions (IMFs), which represent oscillatory modes embedded in the data.
  4. Singular Value Decomposition (SVD): Factorizes a signal matrix into orthogonal components, useful for dimensionality reduction and noise filtering.
  5. Principal Component Analysis (PCA): Transforms correlated variables into a set of uncorrelated components, highlighting the most significant patterns.
  6. Proper Orthogonal Decomposition (POD): Extracts spatial or temporal modes that capture the most energetic structures in the signal.
  7. Dynamic Mode Decomposition (DMD): Identifies dynamic patterns and their evolution over time, useful for analyzing complex systems.
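
As a concrete example of method 4, the SVD of a noisy rank-one "recording" concentrates the underlying source in the first component (the channel weights and noise level below are made up):

```julia
using LinearAlgebra, Random
Random.seed!(4)

t = range(0, 1; length=200)
source = sin.(2π .* 8 .* t)            # one underlying 8 Hz mode
weights = [1.0, 0.5, -0.8]             # mixing into three channels
X = weights * source' .+ 0.1 .* randn(3, 200)

U, S, V = svd(X)
share = S[1] / sum(S)                  # energy share of the first mode
rank1 = S[1] * U[:, 1] * V[:, 1]'      # rank-1 reconstruction

println(share > 0.7)                   # true: mode 1 dominates
```

Keeping only `rank1` denoises the recording at the cost of discarding the weaker components, which is the essence of SVD-based noise filtering.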

Key Concepts

Mode: A distinct portion or component of the signal, often representing a specific pattern or oscillation.

Multivariate Empirical Mode Decomposition (MEMD):

  • Extends EMD to multivariate signals by using real-valued signal projections to define multi-dimensional envelopes of local extrema.
  • Simultaneous Decomposition: All signals are decomposed together, ensuring that scales common to multiple signals align in equally indexed IMFs (“mode alignment property”).