In EEG analysis, covariance and correlation are statistical measures used to understand relationships between signals from different electrodes or brain regions.
Applications in EEG Analysis
Connectivity Analysis: Covariance and correlation are used to assess functional connectivity between different brain regions. High correlation between EEG signals from different electrodes may indicate strong functional connectivity.
Feature Extraction: These measures can be used to extract features for further analysis, such as identifying patterns associated with specific cognitive or pathological states.
Artifact Detection: Correlation can help in identifying artifacts or noise that may be common across multiple channels.
Covariance
Variance measures the spread or variability within a single set of data.
Covariance measures the degree to which two ordered variables move together - whether they tend to increase and decrease in tandem.
A few important properties:
Units: covariance retains the units of both variables, which can make it harder to interpret. For example, cov(height, weight) has units of cm × kg.
Mean-centering: variables must be mean-centered before computing covariance.
Symmetry: covariance is commutative - cov(x, y) = cov(y, x) - meaning the result is the same regardless of which variable is treated as x or y. As a consequence, covariance alone cannot be used to infer causality or directionality.
Covariance measures the degree of linear relationship between two variables - how strongly and in what direction they are linearly associated.
A convenient way to summarize these relationships across multiple variables is the variance-covariance matrix (also called the covariance matrix). It arranges variances along the main diagonal and pairwise covariances in the off-diagonal elements, giving a compact view of both individual spread and inter-variable relationships in a single square matrix.
        x           y           z
x    var(x)     cov(x, y)   cov(x, z)
y    cov(x, y)  var(y)      cov(y, z)
z    cov(x, z)  cov(y, z)   var(z)
Interpretation:
Diagonal terms (variances): large values indicate high variability in that variable - and by extension, potentially interesting structure in the data worth retaining or examining further.
Off-diagonal terms (covariances): large magnitudes indicate strong linear relationships between pairs of variables, and therefore high redundancy - the two variables are carrying overlapping information.
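As a minimal sketch of this structure (synthetic data; only Julia's standard Statistics library - the signal shapes and sizes are illustrative assumptions), a covariance matrix for three channels can be computed and inspected like this:

```julia
using Statistics

# Synthetic "channels": cov/cor expect observations in rows and
# variables in columns, so data are arranged as time points × channels.
t = range(0, 1, length=1000)
x = sin.(2π .* 10 .* t)                          # 10 Hz rhythm
y = sin.(2π .* 10 .* t) .+ 0.1 .* randn(1000)    # same rhythm + noise
z = randn(1000)                                  # unrelated noise
X = hcat(x, y, z)

C = cov(X)         # 3 × 3 variance-covariance matrix

C[1, 1] ≈ var(x)   # diagonal: per-channel variances
C[1, 2] ≈ C[2, 1]  # symmetry: cov(x, y) == cov(y, x)
# C[1, 2] is large (x and y carry redundant information),
# while C[1, 3] is near zero (x and z are unrelated)
```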
Covariance captures how two variables vary together - summarising, across paired observations, whether deviations from their respective means tend to coincide.
The sign of the covariance indicates the direction of the relationship:
Positive - the variables tend to increase and decrease together (positive slope)
Negative - one variable tends to increase as the other decreases (negative slope)
Zero - no linear relationship between the variables
One important limitation of covariance is that its magnitude is sensitive to the scale of the data - it does not, on its own, indicate the strength of a relationship, only its direction. Normalizing by the standard deviations of both variables yields the more interpretable Pearson correlation coefficient.
Despite this, covariance is a foundational quantity in many multichannel analyses, including Principal Component Analysis (PCA), Independent Component Analysis (ICA), source-space imaging of M/EEG data, and least-squares fitting.
For a dataset with N channels, covariance produces an N × N covariance matrix.
Important: covariance is only valid when computed between zero-mean variables - subtract the mean of each variable before computing.
The diagonal elements (variances) can themselves be informative. Interpolating them over electrode locations produces a topographic map of signal variability across the scalp - revealing, for example, that certain electrodes consistently show higher variance than others. Since EEG amplitudes are measured in μV, these maps are expressed in units of μV².
Seeded covariance topographic maps illustrate the covariance between a selected seed electrode and all other electrodes, providing a spatial view of how strongly each channel co-varies with the chosen reference point.
Correlation
Correlation is a standardized form of covariance - scaled to the range \([-1, 1]\) so that the strength of the linear relationship between two variables can be compared regardless of their original units or scale.
For two EEG signals (s1 and s2), the correlation is computed as cor(s1, s2). For a multichannel signal s (channels × time points), the channel-by-channel correlation matrix is computed as cor(s') - the transpose puts observations (time points) in rows and variables (channels) in columns, as cor expects.
A correlation matrix is a covariance matrix normalized by the standard deviations of the individual variables, rescaling all values to the range \([-1, +1]\).
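This normalization can be checked directly (a sketch with random data, using only the standard Statistics library):

```julia
using Statistics

X = randn(500, 3)      # 500 observations × 3 variables
C = cov(X)             # covariance matrix
s = std(X, dims=1)     # 1 × 3 row vector of standard deviations

# Dividing each covariance by the product of the corresponding
# standard deviations rescales it into a correlation.
R = C ./ (s' * s)
R ≈ cor(X)             # true: this is exactly what cor computes
```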
Two correlation methods are commonly used:
Pearson correlation - measures the strength of a linear relationship between two normally distributed variables. It is the standard choice when linearity and normality can be assumed.
Spearman correlation - tests for a monotonic relationship by first converting values to rank order (e.g. \([0.1, 1, 10, 100]\) becomes \([1, 2, 3, 4]\)). It makes no assumption about linearity or distribution, making it more appropriate for non-Gaussian data such as power spectra, neuron firing rates, and image pixel values.
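The difference is easy to demonstrate on the ranked example above (plain Julia; the rank helper is a simplified stand-in that assumes no tied values - dedicated packages such as StatsBase handle ties properly):

```julia
using Statistics

# Rank transform: the smallest value gets rank 1 (assumes no ties)
ranks(v) = invperm(sortperm(v))

x = [1.0, 2.0, 3.0, 4.0]
y = [0.1, 1.0, 10.0, 100.0]    # monotonic in x, but far from linear

pearson  = cor(x, y)                   # noticeably below 1
spearman = cor(ranks(x), ranks(y))     # exactly 1: the relation is monotonic
```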
Geometrically, correlation is equivalent to cosine similarity - it equals the cosine of the angle between the two mean-centered signal vectors in a high-dimensional space.
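This geometric identity can be verified numerically (a sketch on random data, using the standard Statistics and LinearAlgebra libraries):

```julia
using Statistics, LinearAlgebra

x = randn(256)
y = 0.7 .* x .+ randn(256)

xc = x .- mean(x)    # mean-center both signals
yc = y .- mean(y)

# cosine of the angle between the mean-centered vectors
cosine = dot(xc, yc) / (norm(xc) * norm(yc))
cosine ≈ cor(x, y)   # true: Pearson correlation is exactly this cosine
```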
Both covariance and correlation can also be computed between time-shifted signals, giving two lagged variants:
Auto- - a signal compared against a shifted version of itself
Cross- - one signal compared against a shifted version of another
In both cases, the comparison is performed across a range of lags (denoted \(L\)), producing a function that describes how the relationship evolves as one signal is progressively shifted in time relative to the other.
The cross-covariance between two zero-mean signals \(x\) and \(y\) at lag \(L\) is:
\[
r_{xy}(L) = \sum_{n} x(n) \times y(n + L)
\]
Normalizing by the zero-lag auto-covariances of both signals gives the corresponding correlation:
\[
R_{xy}(L) = \frac{r_{xy}(L)}{\sqrt{r_{xx}(0)\, r_{yy}(0)}}
\]
which constrains the result to the range \([−1, +1]\), with \(R_{xx}(0) = 1\) always.
Auto-covariance
Auto-covariance measures the dependence structure of a signal with respect to itself - detecting rhythmic or periodic activity by quantifying how well the signal at time \(n\) predicts the signal at a later (or earlier) time. For example, 50 Hz line noise will produce strong auto-covariance peaks at lags of 1/50 s and its multiples.
Algorithm:
Subtract the mean from the signal (the formula assumes zero-mean data)
For each sample at index \(n\), multiply it by the sample at index \(n \pm L\)
Sum all products across all indices \(n\):
\[
r_{xx}(L) = \sum_{n} x(n) \times x(n \pm L)
\]
where \(L\) is the lag.
Biased auto-covariance divides the sum by the total data length \(N\), regardless of the lag:
\[
r_{xx}(L) = \frac{1}{N} \sum_{n} x(n) \times x(n \pm L)
\]
This estimator is called “biased” because at large lags, fewer sample pairs contribute to the sum - yet the denominator remains \(N\), systematically underestimating the true covariance at those lags. It is however guaranteed to produce a positive semi-definite matrix, which is a desirable property in many applications.
Unbiased auto-covariance divides the sum by the number of sample pairs actually contributing at each lag, \(N - L\):
\[
r_{xx}(L) = \frac{1}{N - L} \sum_{n} x(n) \times x(n \pm L)
\]
This corrects the underestimation of the biased estimator at large lags. However, because the denominator shrinks as \(L\) increases, estimates at large lags are based on progressively fewer sample pairs - making them less reliable and increasing variance. For this reason, auto-covariance estimates at large lags should be interpreted with caution regardless of which estimator is used.
One practical consequence of this trade-off is that the biased estimator is often more reliable at large lags - despite its systematic underestimation, the fixed denominator \(N\) stabilizes the estimate by implicitly down-weighting contributions from lag ranges where fewer sample pairs are available.
This matters in practice: for signals with known periodic structure - such as 50 Hz line noise, which should produce consistent auto-covariance peaks at every multiple of 1/50 s - the estimator should behave uniformly across all lags. The biased estimator satisfies this requirement better than the unbiased one, whose shrinking denominator introduces unequal weighting across lags and can distort the periodic structure you are trying to detect.
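The behaviour of the two estimators can be sketched with a hand-rolled implementation (a hypothetical autocov_est helper, not NeuroAnalyzer's acov) applied to simulated 50 Hz line noise:

```julia
using Statistics

# Auto-covariance at lags 0..maxlag: the biased variant divides by N
# at every lag, the unbiased one by the N - L contributing pairs.
function autocov_est(x; maxlag=30, biased=true)
    xc = x .- mean(x)
    N = length(xc)
    return [sum(xc[n] * xc[n + L] for n in 1:(N - L)) / (biased ? N : N - L)
            for L in 0:maxlag]
end

fs = 1000                    # sampling rate, Hz
t = (0:fs-1) ./ fs           # 1 s of data
x = sin.(2π .* 50 .* t)      # 50 Hz "line noise"

b = autocov_est(x; biased=true)
u = autocov_est(x; biased=false)

# b[L + 1] holds lag L: lag 20 = one full 50 Hz cycle (positive peak),
# lag 10 = half a cycle (negative peak). The biased estimate at lag 20
# is systematically smaller than the unbiased one.
(b[21], u[21], b[11])
```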
At lag \(L = 0\) auto-covariance reaches its maximum - the signal is perfectly aligned with itself, so all products \(x(n) \times x(n)\) are positive and their sum is maximized.
As the lag increases, the auto-covariance reveals the periodic structure of the signal:
Positive peaks occur at lags corresponding to full cycles of the dominant frequency - e.g. every 1/50 = 0.02 s for 50 Hz line noise, where the shifted signal realigns with itself.
Negative peaks occur at lags corresponding to half-cycles - e.g. every 0.01 s for 50 Hz noise, where the shifted signal is in anti-phase with the original.
Computing unbiased auto-covariance:
ac, l = acov(eeg_noisy, ch="eeg", l=20, biased=false)
Plotting auto-covariance:
plot_xac(ac[1, :, 1], l, title="Auto-covariance")
The auto-covariance plot shows repetitive positive peaks every 0.02 s and negative peaks every 0.01 s, both arising from dominant 50 Hz line noise in the source signal - the positive peaks occur at full-cycle intervals (1/50 s) and the negative peaks at half-cycle intervals (1/100 s).
The function axc2frq() can be used to automatically detect these peaks and convert their lag positions into corresponding frequencies:
ac, l = acor(eeg_noisy, ch="all", l=10)
# channel 2 of the first epoch
axc2frq(ac[2, :, 1], l)
1-element Vector{Float64}:
50.0
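The idea behind this lag-to-frequency conversion can be sketched in plain Julia (a from-scratch illustration, not NeuroAnalyzer's axc2frq implementation; the 500 Hz sampling rate is an assumption): find the first positive auto-covariance peak at a non-zero lag, and the frequency is the sampling rate divided by that lag:

```julia
using Statistics

fs = 500                                  # assumed sampling rate, Hz
x = sin.(2π .* 50 .* (0:2fs-1) ./ fs)     # 2 s with a dominant 50 Hz component

# Biased auto-covariance at lags 0..50 (ac[i] holds lag i - 1)
xc = x .- mean(x)
N = length(xc)
ac = [sum(xc[n] * xc[n + L] for n in 1:(N - L)) / N for L in 0:50]

# The first local maximum at a non-zero lag marks one full period
i = findfirst(j -> ac[j] > ac[j - 1] && ac[j] > ac[j + 1], 3:50)
lag = (3:50)[i] - 1     # convert the array index back to a lag in samples
frq = fs / lag          # a 10-sample period at fs = 500 Hz → 50.0 Hz
```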
This can be verified by examining a Power Spectral Density (PSD) plot.
plot_psd(eeg_noisy, ch="Fp2")
By default, biased auto-covariance is calculated. To calculate unbiased auto-covariance:
ac, l = acov(eeg_noisy, ch="all", biased=false)
Auto-correlation
Auto-correlation measures the Pearson correlation between a signal and a delayed copy of itself - quantifying how similar the signal is to itself as a function of the time lag between the two copies. Formally, it is the correlation between values \(x_i\) and \(x_{i + n}\) for varying \(n\).
Key properties:
At lag \(L = 0\), auto-correlation is always 1 - the signal is perfectly correlated with itself.
For periodic signals, auto-correlation is an oscillating function whose frequency matches the dominant rhythmic component of the original signal - revealing both the presence and the period length of any repeating structure.
For random signals (e.g. white noise), auto-correlation should be near zero at all non-zero lags.
Auto-correlation is therefore a practical tool for answering two questions:
Is the signal periodic? - look for oscillating structure in the auto-correlation function.
What is the period? - the lag at which the first positive peak occurs corresponds to the length of one cycle.
Common applications:
Detecting non-randomness - evaluate auto-correlation at lag \(L = 1\); a value significantly different from zero suggests temporal structure in the data.
Identifying a time series model - evaluate auto-correlation across multiple lags to characterise the dependence structure and select an appropriate model.
Wiener-Khinchin theorem: the auto-correlation function and the power spectral density are a Fourier transform pair - the Fourier transform of the auto-correlation function equals the power spectral density of the signal, and vice versa.
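The theorem can be verified numerically without any FFT library, using a naive O(N²) DFT and a circular auto-covariance (a self-contained sketch on random data):

```julia
# Naive O(N²) DFT - slow, but dependency-free and enough for a demo
dft(v) = [sum(v[n + 1] * cis(-2π * k * n / length(v)) for n in 0:length(v)-1)
          for k in 0:length(v)-1]

N = 64
x = randn(N)

# Circular auto-covariance (wrap-around lags, unnormalized)
r = [sum(x[n] * x[mod1(n + L, N)] for n in 1:N) for L in 0:N-1]

# Wiener-Khinchin: the DFT of the auto-covariance equals the power spectrum
psd_from_r = real.(dft(r))
psd_direct = abs2.(dft(x))
maximum(abs.(psd_from_r .- psd_direct))   # ≈ 0 up to round-off
```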
Computing auto-correlation:
ac, l = acor(e10, ch="all", l=100)
Plotting auto-correlation (which is a standardized covariance):
plot_xac(ac[1, :, 1], l, title="Auto-correlation")
Cross-covariance
Cross-covariance extends the concept of auto-covariance to two different signals - measuring their similarity as a function of the time lag applied to one signal relative to the other.
Where auto-covariance asks “how similar is this signal to a delayed version of itself?”, cross-covariance asks “how similar is signal \(x\) to a delayed version of signal \(y\)?”
This makes cross-covariance particularly useful for detecting shared periodic structure between two channels, estimating time delays between signals, and identifying directional relationships in multichannel data.
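Time-delay estimation can be sketched with a hand-rolled cross-covariance (a hypothetical crosscov_est helper; for real recordings NeuroAnalyzer's xcov does this job). The peak lag recovers the delay imposed on the second signal:

```julia
using Statistics

# Cross-covariance at lags -maxlag..maxlag (biased normalization)
function crosscov_est(x, y; maxlag=10)
    xc, yc = x .- mean(x), y .- mean(y)
    N = length(xc)
    lags = -maxlag:maxlag
    r = [sum(xc[n] * yc[n + L] for n in max(1, 1 - L):min(N, N - L)) / N
         for L in lags]
    return r, lags
end

fs = 200
t = (0:fs-1) ./ fs
x = sin.(2π .* 8 .* t) .+ 0.01 .* randn(fs)
y = circshift(x, 5)        # y is x delayed by 5 samples

r, lags = crosscov_est(x, y)
best = lags[argmax(r)]     # the peak lag recovers the 5-sample delay
```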
Biased cross-covariance divides the sum by the total data length \(N\), regardless of the lag:
\[
r_{xy}(L) = \frac{1}{N} \sum_{n} x(n) \times y(n + L)
\]
As with biased auto-covariance, the denominator remains fixed at \(N\) for all lags - underestimating the true covariance at large lags where fewer sample pairs contribute, but producing a stable and positive semi-definite result.
Unbiased cross-covariance divides the sum by the number of sample pairs actually contributing at each lag, \(N - L\):
\[
r_{xy}(L) = \frac{1}{N - L} \sum_{n} x(n) \times y(n + L)
\]
Computing cross-covariance:
xc, l = xcov(eeg1, eeg2, ch1="Fp1", ch2="Fp1", l=100)
Plotting cross-covariance:
plot_xac(xc[1, :, 1], l, title="Cross-covariance")
Cross-correlation
Cross-correlation is the normalized form of cross-covariance - scaled by the standard deviations of both signals so that results fall in the range \([−1, +1]\). It measures the similarity between two signals as a function of the lag applied to one relative to the other, and reveals at which offset the two signals are most strongly related.
Computationally, cross-correlation is equivalent to a sliding dot product - at each lag, the dot product between the two signals is computed, producing a function that peaks where the signals are most aligned.
\[
R_{xy}(L) = \frac{r_{xy}(L)}{\sigma_x \sigma_y}
\]
where \(r_{xy}(L)\) is the cross-covariance at lag \(L\), and \(\sigma_x\), \(\sigma_y\) are the standard deviations of \(x\) and \(y\) respectively.
Computing cross-correlation (which is a standardized cross-covariance):
xc, l = xcor(eeg1, eeg2, ch1="Fp1", ch2="Fp1", l=100)
Plotting cross-correlation:
plot_xac(xc[1, :, 1], l, title="Cross-correlation")
The peak offset of the cross-correlation between two channels indicates the delay between the activation of channel 1 and channel 2; if the peak offset falls within a specified range, the two channels can be considered connected.