Introduction to EEG Experiments

Initialize Packages
using CairoMakie
using StatsKit
using Random

Conducting EEG experiments is a powerful way to study brain activity by recording electrical signals from the scalp. EEG is widely used in neuroscience, psychology, clinical research, and even brain-computer interfaces (BCIs). Here’s a structured introduction to help you understand the basics of EEG experiments:

What is EEG?

  • Definition: EEG measures the electrical activity of the brain using electrodes placed on the scalp.
  • What it captures: Voltage fluctuations resulting from ionic currents in neurons, primarily from the cerebral cortex.
  • Temporal resolution: Excellent (millisecond precision), but spatial resolution is limited compared to techniques like fMRI.

Hardware

  • Electrodes: Typically 10–256 electrodes; the international 10–20 system defines standard scalp placement.
  • Amplifiers: Boost the tiny brain signals (microvolts) for recording.
  • Digitizer: Converts analog signals to digital data.
  • Recording device: Stores data for analysis (e.g., EEG caps, portable systems).

Steps in an EEG Experiment

  • Define your research question: What brain activity or cognitive process are you studying? (e.g., attention, memory, epilepsy, sleep).
  • Design the paradigm: Choose a task or stimulus (e.g., oddball paradigm, resting-state, event-related potentials (ERPs)).
  • Select participants: Consider age, health, and sample size.

Data Collection

  • Electrode placement: Follow the 10-20 system or high-density arrays.
  • Impedance check: Ensure good contact (<5–10 kΩ) between electrodes and scalp.
  • Calibration: Record baseline activity (e.g., eyes open/closed).
  • Run the experiment: Present stimuli or tasks while recording EEG.

Data Processing

  • Preprocessing:
    • Filtering (e.g., bandpass 0.5–40 Hz).
    • Artifact removal (e.g., eye blinks, muscle noise).
    • Re-referencing (e.g., average or mastoid reference).
  • Analysis:
    • Time-domain: Event-related potentials (ERPs) like P300 or N170.
    • Frequency-domain: Power spectral density (PSD), alpha/beta/gamma bands.
    • Time-frequency analysis: Wavelets or short-time Fourier transform (STFT).
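As a concrete example of one preprocessing step, average re-referencing subtracts the mean across channels from every sample. A minimal sketch on simulated data (the channel count and amplitudes below are arbitrary, not recommendations):

```julia
using Statistics  # stdlib; provides `mean`

# Simulated raw EEG: channels × samples (values in microvolts)
n_channels, n_samples = 32, 1000
raw = randn(n_channels, n_samples) .* 10

# Average reference: subtract the across-channel mean at every time point
reref = raw .- mean(raw, dims=1)

# After re-referencing, each time point's channel mean is (numerically) zero
println(maximum(abs.(mean(reref, dims=1))))
```

The same one-liner works regardless of channel count; mastoid re-referencing would instead subtract the mean of the mastoid channels only.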

Interpretation

  • Link EEG features to cognitive/neural processes.
  • Statistical analysis (e.g., t-tests, ANOVA, machine learning).

Common EEG Paradigms

  • Event-Related Potentials (ERPs): Measure brain responses to stimuli (e.g., P300 for attention).
  • Resting-State EEG: Record brain activity while the participant is relaxed (e.g., alpha waves).
  • Steady-State Evoked Responses (SSERs): Use flickering or otherwise periodic stimuli to study visual/auditory processing.
  • Sleep Studies: Monitor brain activity during sleep stages.

Challenges in EEG Experiments

  • Artifacts: Eye movements, muscle activity, or electrical noise can contaminate signals.
  • Volume conduction: Signals spread and smear as they pass through tissue, skull, and scalp, limiting spatial localization; activity from deep structures is strongly attenuated.
  • Interpretation: EEG signals are complex and require careful analysis.

Applications of EEG

  • Clinical: Diagnosing epilepsy, sleep disorders, or brain injuries.
  • Cognitive neuroscience: Studying perception, memory, and attention.
  • Brain-Computer Interfaces (BCIs): Controlling devices with brain signals (e.g., prosthetics, spellers).

Example Workflow

Hypothesis: “Alpha waves (8–12 Hz) increase during eyes-closed relaxation.”

Experiment: Record EEG from participants with eyes open and closed.

Analysis: Compare alpha power between conditions.

Result: Confirm or refute the hypothesis.
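The workflow above can be sketched end to end on simulated data. Band power is estimated here with a naive DFT projection so no DSP package is needed; the 5× alpha amplitude, sampling rate, and durations are all illustrative, not real effect sizes:

```julia
using Random, Statistics

Random.seed!(1)
fs = 250                     # sampling rate (Hz)
t = (0:fs*2-1) ./ fs         # 2 s of data

# Power at a single frequency f (Hz) via one DFT coefficient
bandpower(x, f, fs) = abs(sum(x .* exp.(-2im * pi * f .* (0:length(x)-1) ./ fs)))^2 / length(x)

# Simulate: eyes-closed has a strong 10 Hz alpha component, eyes-open does not
eyes_open   = randn(length(t))
eyes_closed = randn(length(t)) .+ 5 .* sin.(2pi * 10 .* t)

# Average power over the alpha band (8–12 Hz) for each condition
alpha_open   = mean(bandpower(eyes_open,   f, fs) for f in 8:12)
alpha_closed = mean(bandpower(eyes_closed, f, fs) for f in 8:12)

println("alpha power, eyes open: $alpha_open, eyes closed: $alpha_closed")
```

In a real study the comparison would run over many participants and use a statistical test rather than a single simulated pair of traces.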

  1. Literature Review
  • Ensure your study is novel and methodologically sound.
  • Search databases (PubMed, Google Scholar) for similar studies.
  • Has this been done before? Can the methodology be improved? Adapt your design based on past findings and limitations.
  2. Plan Thoroughly
  • Avoid methodological flaws and logistical issues.
  • Outline every step: hypothesis, paradigm, equipment, participant recruitment, data analysis, and contingencies.
  • Create a timeline and budget.
  • Rule of thumb: Over-plan to anticipate challenges.
  3. Address EEG-Specific Challenges
  • Minimize artifacts and ensure data quality.
  • Movement artifacts: Use a chin rest, bite bar, or comfortable seating to reduce head/body movement.
  • Muscle artifacts: Avoid tasks requiring excessive facial or neck muscle use.
  • Environmental noise: Shield from electrical interference (e.g., Faraday cage if possible).
  4. Ensure Adequate Repetitions
  • Improve signal-to-noise ratio and statistical power.
  • Aim for >30 repetitions per condition, ideally >100 for robust averaging (e.g., ERP studies).
  • Balance repetition with participant fatigue.
  5. Keep Experiments Simple
  • Isolate variables and simplify interpretation.
  • Test one hypothesis at a time.
  • Avoid complex multi-task paradigms unless necessary.
  • Simpler designs yield clearer, more interpretable results.
  6. Perform Power Calculations
  • Determine the required sample size for statistical significance.
  • Use tools like G*Power or simulations based on pilot data.
  • Calculate power for your expected effect size, alpha, and desired power (typically 0.8).
  7. Pre-Register Your Study
  • Enhance transparency and credibility.
  • Register your hypothesis, methods, and analysis plan on platforms like OSF, AsPredicted, or ClinicalTrials.gov.
  • Prevents “p-hacking” and confirms your study’s rigor.
  8. Verify Stimulus Presentation
  • Ensure precise timing and synchronization with EEG recordings.
  • Check monitor refresh rate (e.g., 60 Hz = 16.7 ms per frame).
  • Use high-precision software and hardware (e.g., photodiodes) to confirm timing accuracy.
  • Log stimulus onsets relative to EEG triggers.
  9. Conduct Pilot Testing
  • Identify and resolve issues before full data collection.
  • Run at least 1 pilot participant (ideally 2–3).
  • Test the entire pipeline: setup, task, data quality, and participant comfort. Adjust protocols based on pilot feedback.
  10. Control Environmental Conditions
  • Optimize signal quality and participant comfort.
  • Temperature: Keep the room cool (~20–22°C). Heat increases sweat and electrode impedance.
  • Lighting: Dim but sufficient to avoid eye strain.
  • Noise: Minimize auditory distractions.
  11. Document Everything
  • Ensure reproducibility and troubleshoot issues.
  • Keep detailed notes during data collection:
    • Participant ID, date, time, and any unusual events (e.g., “Participant sneezed at 10:23”).
    • Technical issues (e.g., “Electrode Fz lost contact; re-applied gel”).
  12. Organize Data in BIDS Format
  • Standardize data for easy sharing and analysis.
  • Use the Brain Imaging Data Structure (BIDS) to organize files:
    • Consistent naming (e.g., sub-01_task-nback_eeg.set).
    • Include metadata (e.g., channel locations, task events).
  13. Avoid Mid-Experiment Changes
  • Maintain consistency and validity.
  • Do not modify the paradigm, equipment, or analysis plan after data collection begins.
  • If changes are unavoidable, document them and analyze their impact separately.
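The power-calculation step above can also be approximated by Monte Carlo simulation when pilot data suggest an effect size. A minimal sketch (the effect size, group size, and critical value below are illustrative assumptions; a crude fixed t cutoff is used to avoid extra dependencies):

```julia
using Random, Statistics

# Estimate power: the fraction of simulated two-group experiments in which
# the t statistic exceeds the (approximate) two-sided critical value.
function simulated_power(; n_per_group=20, effect_size=0.8, n_sims=2000)
    hits = 0
    for _ in 1:n_sims
        a = randn(n_per_group)                  # control group
        b = randn(n_per_group) .+ effect_size   # experimental group (Cohen's d)
        # Welch-style t statistic
        t = (mean(b) - mean(a)) / sqrt(var(a)/n_per_group + var(b)/n_per_group)
        # ~2.024 is the two-sided 5% cutoff for t with df = 38
        hits += abs(t) > 2.024 ? 1 : 0
    end
    return hits / n_sims
end

Random.seed!(42)
println("estimated power: ", simulated_power())
```

Increasing `n_per_group` or `effect_size` drives the estimate toward 1; with `effect_size=0`, the result should hover near the nominal 5% false-positive rate.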

Final Tip: Iterate and Improve

After completing your study, review what worked and what didn’t. Use these insights to refine future experiments.

Approaches

Exploratory Approach

Goal: Discover patterns, generate hypotheses, or identify unexpected effects without strong a priori predictions.

Characteristics:

  • Flexible and open-ended: No rigid hypotheses; data drives the discovery.
  • Statistical methods:
    • Permutation testing (non-parametric, robust to violations of assumptions).
    • Multiple comparisons correction (e.g., FDR, Bonferroni) to control false positives.
  • Assumptions: Minimal; avoids reliance on distributional assumptions (e.g., normality).
  • Sensitivity: Lower for small effects (higher risk of Type II errors).
  • Typical use cases:
    • Initial exploration of novel datasets.
    • Pilot studies or secondary analyses.
    • 2-group comparisons (e.g., patients vs. controls) or correlational analyses.

Pros:

  • Uncovers unexpected patterns or effects.
  • Useful when little is known about the phenomenon.

Cons:

  • Risk of false positives if corrections are inadequate.
  • Harder to interpret without follow-up confirmation.

Hypothesis-Driven Approach

Goal: Test specific, pre-defined hypotheses with controlled experiments.

Characteristics:

  • Structured and focused: Hypotheses are formulated a priori based on theory or prior evidence.
  • Statistical methods:
    • Traditional parametric tests (e.g., ANOVA, t-tests, regression) if assumptions are met.
    • No need for multiple comparisons correction if hypotheses are limited and orthogonal.
  • Assumptions: Requires careful attention to statistical assumptions (e.g., normality, sphericity).
  • Sensitivity: Higher for detecting predicted effects (lower Type II error risk).
  • Typical use cases:
    • Factorial designs (e.g., 2×2 interactions).
    • Confirmatory studies to validate exploratory findings.
    • Mechanistic or causal questions (e.g., “Does stimulus X evoke ERP Y?”).

Pros:

  • Clear interpretation of results.
  • Higher reproducibility and rigor.
  • Efficient for testing specific theories.

Cons:

  • May miss unexpected or novel effects outside the hypothesis.
  • Requires strong prior knowledge or theoretical grounding.

Key Differences Summary

| Feature | Exploratory | Hypothesis-Driven |
|---|---|---|
| Primary Goal | Discover patterns/hypotheses | Test pre-defined hypotheses |
| Flexibility | High (data-driven) | Low (theory-driven) |
| Statistical Methods | Permutation tests, corrections | Parametric tests (ANOVA, t-tests) |
| Assumptions | Minimal | Often strict (e.g., normality) |
| Sensitivity | Lower for small effects | Higher for predicted effects |
| Multiple Comparisons Correction | Required | Often not required |
| Risk of False Positives | Higher (if uncorrected) | Lower (controlled) |
| Risk of False Negatives | Higher (misses small effects) | Lower (focused on predicted effects) |
| Use Cases | Pilot studies, novel datasets | Confirmatory, factorial designs |

When to Use Each Approach?

Exploratory:

  • Early-stage research.
  • Complex datasets (e.g., high-dimensional EEG, fMRI).
  • When you suspect “unknown unknowns.”

Hypothesis-Driven:

  • Testing specific theories or mechanisms.
  • Replicating/validating prior findings.
  • Experiments with clear predictions (e.g., “Stimulus A will increase alpha power”).

Hybrid Approach

Many studies combine both:

  1. Start with exploratory analyses to identify patterns.
  2. Follow up with hypothesis-driven experiments to confirm and refine findings.

Group- vs Subject-Level

Subject-Level Analysis

Focus: Examining effects within individual participants across trials or conditions.

Key Characteristics:

  • Variation over trials: Analyzes consistency or variability of effects (e.g., ERPs, oscillatory power) within a single subject.
  • No population generalization: Results apply only to the individual; cannot infer broader trends.
  • Effect size requirements: Needs large, robust effects to be meaningful (small effects may be noise or idiosyncratic).
  • Use cases:
    • Single-case studies (e.g., clinical cases, BCIs).
    • Exploring individual differences or outliers.
    • Pilot data to identify consistent patterns before group analysis.

Pros:

  • Highlights individual variability (e.g., some subjects may show strong effects, others none).
  • Useful for personalized medicine or adaptive paradigms.

Cons:

  • Limited generalizability.
  • Risk of overinterpreting noise as signal.

Group-Level Analysis

Focus: Examining effects across a group of participants to generalize to a population.

Key Characteristics:

  • Consistency across groups: Averages data across subjects to identify effects that are reliable across the population.
  • Population generalization: Results can be inferred to the broader population (if sampling is representative).
  • Sensitivity to small effects: Detects subtle but consistent effects (e.g., small ERP amplitude changes) due to increased statistical power.
  • Use cases:
    • Hypothesis testing in experimental designs.
    • Clinical trials or comparative studies (e.g., patients vs. controls).
    • Replicating or validating prior findings.

Pros:

  • Higher statistical power and generalizability.
  • Reduces impact of individual variability or outliers.

Cons:

  • May obscure meaningful individual differences.
  • Requires careful sampling to avoid bias.

Comparison Table

| Feature | Subject-Level Analysis | Group-Level Analysis |
|---|---|---|
| Scope | Within-individual (trials) | Across individuals (population) |
| Generalizability | None (single subject) | Yes (to population) |
| Effect Size | Large effects needed | Can detect small, consistent effects |
| Variability | Highlights individual differences | Averages across subjects |
| Statistical Power | Low (limited by N=1) | High (increases with sample size) |
| Use Cases | Single-case studies, BCIs, pilots | Experimental designs, clinical trials |

When to Use Each Approach?

Subject-level:

  • You’re interested in individual responses (e.g., a patient’s unique brain activity).
  • Testing a proof-of-concept (e.g., a BCI prototype).
  • Exploring outliers or extreme cases.

Group-level:

  • You want to generalize findings to a population.
  • Testing theoretical hypotheses (e.g., “Does medication X reduce alpha power in group Y?”).
  • Comparing groups (e.g., healthy vs. clinical populations).

Combining Both Approaches

  1. Start with group-level analysis to identify consistent effects.
  2. Follow up with subject-level analysis to explore individual variability or outliers.
  3. Use mixed-effects models to account for both group and subject-level variance.
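The trade-off between the two levels can be illustrated with a stdlib-only simulation: a per-trial effect too small to estimate reliably in any single subject becomes precise once subject-level means are pooled at the group level (all numbers below are illustrative):

```julia
using Random, Statistics

Random.seed!(7)
n_subjects, n_trials = 30, 100
true_effect = 0.2   # small per-trial effect (arbitrary units)

# Subject level: each subject's estimate is the mean of noisy trials,
# so its standard error is about 1/sqrt(n_trials) = 0.1 — half the effect itself
subject_means = [mean(true_effect .+ randn(n_trials)) for _ in 1:n_subjects]

# Group level: averaging subject means shrinks the noise by another sqrt(n_subjects)
group_mean = mean(subject_means)
group_sem  = std(subject_means) / sqrt(n_subjects)

println("group mean = $group_mean ± $group_sem")
```

Mixed-effects models generalize this idea by estimating the trial-level and subject-level variance components jointly instead of collapsing to per-subject means first.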

Block-Based (Boxcar) Experiments

Overview and Applications

Block-based designs, also known as boxcar designs, are ideal for studying sustained brain responses during continuous tasks or stimuli. Unlike event-related designs, which focus on transient responses to discrete events, block designs capture ongoing brain activity over extended periods.

Key Features of Block-Based Experiments

Structure:

  • Blocks of Conditions: Different experimental conditions (e.g., tasks or stimuli) are presented in separate blocks (e.g., 10–30 seconds each).
  • Event Triggers: Mark the onset of each block for data segmentation and averaging.
  • Baseline Periods: Include rest or inter-block intervals to serve as a baseline for comparison.

Data Analysis:

  • Averaging Across Blocks: Brain signals are averaged across blocks of the same condition, rather than individual events.
  • Loss of Temporal Resolution: Collapsing data within a block removes fine-grained timing information about brain responses.

Sustained Responses:

  • Block designs measure continuous brain activity (e.g., sustained attention, ongoing sensory processing).
  • Example: Presenting a visual stimulus at 4 Hz generates a steady-state evoked response (SSER) at 4 Hz (or harmonics).

Advantages of Block Designs

  • Sensitivity to Sustained Activity: Ideal for detecting prolonged or tonic brain responses (e.g., sustained attention, continuous motor tasks).
  • Signal-to-Noise Ratio: Averaging across blocks improves the signal-to-noise ratio for sustained effects.
  • Frequency Tagging: Different stimuli can be modulated at distinct frequencies (e.g., 4 Hz for Stimulus A, 6 Hz for Stimulus B). Brain responses to each stimulus can then be isolated by analyzing activity at the corresponding frequencies.

Applications

  • Steady-State Evoked Responses (SSERs):
    • Example: Flickering a visual stimulus at 10 Hz elicits a 10 Hz brain response, allowing precise measurement of sensory processing.
  • Cognitive Tasks:
    • Example: Comparing brain activity during blocks of a working memory task vs. rest.
  • Frequency Tagging:
    • Example: Tagging two overlapping stimuli at 12 Hz and 15 Hz to separate their neural responses in the frequency domain.

Comparison with Event-Related Designs

| Feature | Block-Based Design | Event-Related Design |
|---|---|---|
| Focus | Sustained brain activity | Transient brain responses |
| Timing | Continuous (seconds to minutes) | Discrete (milliseconds to seconds) |
| Analysis | Averaging across blocks | Averaging across trials |
| Temporal Resolution | Low (sustained activity) | High (time-locked to events) |
| Baseline | Rest blocks or inter-block intervals | Pre-stimulus fixation period |
| Use Cases | Sustained attention, SSERs, frequency tagging | Transient ERPs, rapid stimuli |

Considerations

  • Baseline Selection: Use rest blocks or pre-block intervals to establish a baseline for comparison.
  • Artifacts: Minimize movement or muscle artifacts during long blocks.
  • Design Flexibility: Blocks can be counterbalanced or randomized to control for order effects.

Example: Frequency Tagging

  • Stimulus A: Flickers at 12 Hz → Brain response at 12 Hz.
  • Stimulus B: Flickers at 15 Hz → Brain response at 15 Hz.
  • Analysis: Measure EEG power at 12 Hz and 15 Hz to separate responses to each stimulus.
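This frequency-tagging analysis can be sketched on simulated data using a naive DFT projection (no FFT package is assumed; the amplitudes, sampling rate, and noise level are arbitrary):

```julia
using Random

Random.seed!(3)
fs = 300
n = fs * 2                 # 2 s of data
t = (0:n-1) ./ fs

# One channel containing responses to both tagged stimuli, plus noise
x = sin.(2pi * 12 .* t) .+ 0.5 .* sin.(2pi * 15 .* t) .+ 0.3 .* randn(n)

# Power at frequency f (Hz) via a single DFT coefficient
power_at(x, f, fs) = abs(sum(x .* exp.(-2im * pi * f .* (0:length(x)-1) ./ fs)))^2 / length(x)

p12 = power_at(x, 12, fs)   # response to Stimulus A
p15 = power_at(x, 15, fs)   # response to Stimulus B
p13 = power_at(x, 13, fs)   # untagged control frequency

println("12 Hz: $p12, 15 Hz: $p15, 13 Hz (control): $p13")
```

Because the two stimuli occupy distinct frequency bins, their neural responses separate cleanly in the spectrum even though they overlap in time.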

Triggers

Triggers are critical for synchronizing brain activity with experimental events (e.g., stimuli, responses). They mark the precise timing of events in your EEG data, enabling accurate analysis of event-related brain activity.

How Triggers Work

Transmission:

  • Triggers are typically sent using Transistor-Transistor Logic (TTL), which transmits integer values (e.g., 0 = no trigger, 1/2/3 = specific event codes).
  • Triggers can also be sent via analog signals (e.g., voltage changes from peripheral devices) or based on physiological measures (e.g., button presses, eye movements).

Timing:

  • Triggers are usually short-duration pulses (e.g., 5–50 ms) to precisely mark the onset of an event (e.g., stimulus presentation, response execution).

Synchronization:

  • Triggers align EEG data with experimental events, allowing you to:
    • Segment data into epochs (e.g., -200 to 800 ms around stimulus onset).
    • Average across trials to extract event-related potentials (ERPs) or time-frequency responses.
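The segmentation step can be sketched in a few lines; `trigger_samples` below are hypothetical sample indices at which TTL pulses were logged, and the sampling rate and epoch window are illustrative:

```julia
fs = 500                          # sampling rate (Hz)
pre, post = 0.2, 0.8              # epoch window: -200 ms to +800 ms
n_pre, n_post = round(Int, pre*fs), round(Int, post*fs)

signal = randn(10_000)            # one continuous EEG channel (simulated)
trigger_samples = [1000, 2500, 4000, 7000]   # hypothetical logged trigger onsets

# One epoch per trigger, skipping any trigger too close to the recording edges
epochs = [signal[s-n_pre : s+n_post-1]
          for s in trigger_samples
          if s - n_pre >= 1 && s + n_post - 1 <= length(signal)]

erp = sum(epochs) ./ length(epochs)   # trial average (the ERP)
println(length(epochs), " epochs of ", length(erp), " samples each")
```

With random noise the average tends toward zero; with a real stimulus-locked response, the same averaging is what makes the ERP emerge from the background EEG.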

Common Sources of Triggers

| Source | Example | Trigger Type |
|---|---|---|
| Stimulus Presentation | Onset of a visual/auditory stimulus | TTL pulse (e.g., value = 1) |
| Behavioral Responses | Button press, vocal response | TTL or analog signal |
| Physiological Events | Eye blink, heart rate (ECG) | Analog or TTL |
| External Devices | fMRI scanner, eye tracker | TTL or analog |

Challenges with Triggers

Delays:

  • Triggers may be delayed due to:
    • Transmission latency (e.g., hardware/software processing).
    • Registering delays (e.g., time for the EEG system to log the trigger).
  • Solution: Measure and correct for delays during analysis (e.g., time-shift triggers in your data).

Jitter:

  • Inconsistent timing between trigger onset and the actual event (e.g., due to monitor refresh rates).
  • Solution: Use high-precision hardware/software (e.g., PsychoPy, LabStreamingLayer) and log exact timing.

Missed Triggers:

  • Triggers may fail to register due to hardware/software issues.
  • Solution: Verify trigger logging during pilot testing and include redundancy (e.g., parallel port + USB triggers).

Best Practices for Using Triggers

Test Your Setup:

  • Run a pilot session to confirm triggers are accurately logged and aligned with events.
  • Use tools like EEGLAB or MNE-Python to visualize trigger timing relative to EEG data.

Document Trigger Codes:

  • Maintain a trigger code legend (e.g., 1 = happy face, 2 = sad face, 3 = button press).
  • Example:

| Trigger Value | Event Description |
|---|---|
| 1 | Happy Face Stimulus |
| 2 | Sad Face Stimulus |
| 3 | Correct Button Press |
| 4 | Incorrect Button Press |

Synchronize Clocks:

  • Ensure your stimulus presentation software and EEG system are synchronized (e.g., using LabStreamingLayer (LSL) or parallel port triggers).

Correct for Delays:

  • Measure the delay between trigger onset and event occurrence (e.g., using a photodiode for visual stimuli).
  • Apply time shifts during preprocessing to align triggers with actual events.
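Correcting a constant, measured delay amounts to shifting the trigger sample indices before epoching. A minimal sketch (the 23 ms delay is a made-up photodiode measurement, and the trigger indices are hypothetical):

```julia
fs = 500                       # sampling rate (Hz)
measured_delay_s = 0.023       # hypothetical photodiode-measured delay (s)

trigger_samples = [1000, 2500, 4000]          # logged trigger onsets (samples)
delay_samples = round(Int, measured_delay_s * fs)

# The stimuli actually appeared AFTER the triggers were logged,
# so shift the triggers forward by the measured delay
corrected = trigger_samples .+ delay_samples
println(corrected)
```

Jitter (a variable delay) cannot be fixed by a constant shift; it requires per-trial timing from a photodiode or equivalent hardware channel.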

Example: Visual Stimulus Presentation

  • Event: A happy face appears on the screen.
  • Trigger: A TTL pulse (value = 1) is sent from the stimulus PC to the EEG system at the exact onset of the face.
  • EEG Analysis: Epochs are created from -200 ms to 800 ms around the trigger, and ERPs are averaged across trials.

Visualization of Triggers

In the recorded data stream, triggers appear as short pulses on a dedicated event channel. Each pulse corresponds to a specific event (e.g., stimulus onset, response).

Estimating Number of Trials

To estimate the number of trials required for reliable EEG data analysis, especially in time-frequency domains, you can use a correlation-based approach. This method helps determine when adding more trials stops significantly improving the signal-to-noise ratio (SNR). Here’s a step-by-step guide to implementing this approach:

Step-by-Step Guide to Estimating Number of Trials

1. Extract Time-Frequency Data

  • Start with your EEG data epoched around events of interest (e.g., stimulus onset).
  • Compute time-frequency representations (e.g., using wavelets or short-time Fourier transform) for each trial.
  • Focus on power values in specific frequency bands (e.g., theta, alpha, beta, gamma).

2. Correlate Trial Averages with Full Mean

  • For each frequency band, calculate the mean power across all trials (this is your “ground truth” reference).
  • For each trial count n (from 1 to the total number of trials):
    • Compute the mean power across the first n trials.
    • Calculate the Spearman correlation between this mean and the full mean (Spearman is used because power data are non-normal, skewed, and non-negative).
    • Repeat for all frequencies of interest.

3. Plot Correlation vs. Number of Trials

  • Create a plot with:
    • X-axis: Number of trials n.
    • Y-axis: Spearman correlation between the mean of the first n trials and the full mean.
    • Separate lines or panels for each frequency band.
Code
# Set random seed for reproducibility
Random.seed!(123)

# Simulate random power data: trials × time × frequency
n_trials = 50
n_times = 100
n_freqs = 20

# Random power values (0-100)
power_data = rand(n_trials, n_times, n_freqs) .* 100

# Compute full mean (ground truth)
# Average across trials
full_mean = dropdims(mean(power_data, dims=1), dims=1)

# Initialize array to store correlations
max_trials = n_trials
correlations = zeros(max_trials, n_freqs)

# Compute Spearman correlation for each frequency and trial count
for freq in 1:n_freqs
    for n in 1:max_trials
        # Mean of first n trials
        partial_mean = dropdims(mean(power_data[1:n, :, freq], dims=1), dims=1)
        c = corspearman(vec(full_mean[:, freq]),
                        vec(partial_mean))
        correlations[n, freq] = c
    end
end

# Plot correlations vs. number of trials for each frequency
fig = Figure()
ax = Axis(fig[1, 1],
          xlabel = "Number of Trials",
          ylabel = "Spearman Correlation",
          title="Convergence of Trial Averages by Frequency")
for freq in 1:n_freqs
    lines!(1:max_trials, correlations[:, freq])
end

# Add a horizontal line at correlation = 0.9 for reference
hlines!(0.9,
        color=:black,
        linestyle=:dash)
text!(5,
      0.9,
      text = "R = 0.9")
fig

  • Interpretation:
    • The correlation increases as more trials are included.
    • The curve plateaus when adding more trials no longer improves reliability (i.e., the correlation stabilizes).
    • The plateau point estimates the optimal number of trials for each frequency.

4. Key Observations

  • Low frequencies (e.g., delta, theta):
    • Typically have higher SNR, so fewer trials are needed to reach a plateau.
  • High frequencies (e.g., gamma):
    • Typically have lower SNR, so more trials are needed to stabilize the correlation.

5. Practical Example

Suppose you analyze alpha (8–12 Hz) and gamma (30–50 Hz) bands:

  • For alpha, the correlation plateaus at ~20 trials.
  • For gamma, the correlation plateaus at ~50 trials.
  • Conclusion: Use at least 20 trials for alpha and 50 for gamma to ensure reliable estimates.
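Once the correlation curve is computed (as in the code above), the plateau can be read off programmatically, e.g. as the smallest trial count whose correlation exceeds a chosen reliability threshold. A stdlib-only sketch on a made-up saturating curve:

```julia
# Hypothetical correlation curve: rises and saturates with trial count
curve = [1 - exp(-n / 10) for n in 1:60]

# Smallest n whose correlation meets the chosen reliability threshold
min_trials(curve, threshold=0.9) = findfirst(>=(threshold), curve)

println("trials needed: ", min_trials(curve))
```

Applied to the `correlations` matrix from the earlier code, `min_trials(correlations[:, freq])` would give a per-frequency trial requirement, with higher frequencies typically needing more trials.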

6. Additional Tips

  • Baseline Correction: Apply baseline normalization to power values before correlation to reduce non-stimulus-related variability.
  • Artifact Rejection: Exclude trials with artifacts (e.g., muscle activity, eye blinks) to avoid bias.
  • Frequency-Specific Analysis: Repeat the process for each frequency band of interest.

Why This Works

  • Spearman correlation is robust to non-normal distributions and outliers, making it ideal for power data.
  • The plateau in the correlation curve indicates when the law of diminishing returns sets in—adding more trials doesn’t significantly improve reliability.