Added augmentation methods
This document is about how to work with datasets. The basic idea is to research how others do it and implement it in our pipeline.
Our approach will be as follows: train from scratch on as large a mix of datasets as possible, then fine-tune on some benchmark dataset and evaluate on it.
This document describes the datasets used for training and evaluation, and how each is used.
## Unify datasets
Many decisions still need to be made.
Convert each dataset from its original format to `mne.io.Raw`. Resample to a common frequency and select only the relevant channels.
Common frequency: TODO.
Selected common channels: TODO. Missing channels will be filled with 0.
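
The zero-filling could look like this with plain NumPy (channel names are illustrative):

```python
import numpy as np

COMMON_CHANNELS = ["C3", "Cz", "C4"]

def fill_missing_channels(data, ch_names):
    """Map (n_present, n_samples) data onto the common channel layout,
    zero-filling any common channel the recording does not have."""
    out = np.zeros((len(COMMON_CHANNELS), data.shape[1]))
    for i, ch in enumerate(COMMON_CHANNELS):
        if ch in ch_names:
            out[i] = data[ch_names.index(ch)]
    return out

# Recording with only C3 and C4 -> the Cz row stays zero.
filled = fill_missing_channels(np.ones((2, 5)), ["C3", "C4"])
```
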
Normalize values to a common interval.
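
One candidate is per-channel min-max scaling to [0, 1], which is also the normalization the GAN paper below uses:

```python
import numpy as np

def min_max_normalize(data):
    """Scale each channel (row) independently to the [0, 1] interval."""
    lo = data.min(axis=1, keepdims=True)
    hi = data.max(axis=1, keepdims=True)
    return (data - lo) / (hi - lo + 1e-12)  # epsilon guards flat channels

normalized = min_max_normalize(np.array([[0.0, 5.0, 10.0]]))
```
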
Bandpass filtering: what should the parameters be?
## How others do it
### [[Augmentation methods#EEG Data Augmentation Method for Identity Recognition Based on Spatial–Temporal Generating Adversarial Network]]
This paper uses a GAN to augment data and trains something like a brain-identification model on it.
More info on that in [[Augmentation methods]].
They used the BCI Competition IV dataset 2a.
This dataset records EEG data during motor imagery tasks involving left hand, right hand, both feet, and tongue movements performed by 9 subjects. Each subject performed 72 trials of each of the 4 tasks during a single experiment, and each motor imagery trial lasted for 3 s. The EEG data were recorded using 22 Ag/AgCl electrodes at a sampling frequency of 250 Hz and were bandpass filtered between 0.5 and 100 Hz.
Furthermore, the authors used a 50 Hz notch filter to suppress line noise and excluded three channels recording eye movement.
For each individual’s EEG data, a third-order Butterworth IIR filter was applied in the 4–40 Hz frequency band to reduce the influence of eye movements.
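
An equivalent filter can be built with SciPy (assuming the dataset's 250 Hz sampling rate; the zero-phase forward-backward application via `sosfiltfilt` is one common choice, not stated in the paper):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 250.0  # sampling frequency of BCI Competition IV 2a

# Third-order Butterworth bandpass, 4-40 Hz, as described in the paper.
sos = butter(3, [4.0, 40.0], btype="bandpass", fs=FS, output="sos")

# Toy signal: 1 Hz drift (outside the band) plus a 10 Hz component (inside).
t = np.arange(0, 2.0, 1.0 / FS)
signal = np.sin(2 * np.pi * 1.0 * t) + 0.5 * np.sin(2 * np.pi * 10.0 * t)
filtered = sosfiltfilt(sos, signal)  # drift removed, 10 Hz component kept
```
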
Subsequently, the data were min-max normalized to the range [0, 1].
**The dataset was divided into training and testing sets in a 4:1 ratio, with each individual’s training set consisting of 864 samples.**
### [[Augmentation methods#Generative Adversarial Networks-Based Data Augmentation for Brain–Computer Interface(2020)]]
**Evaluation using their own dataset**:
Leave-one-subject-out: train on all subjects except one, then test on that one.
Adaptive training: train on all other subjects plus half of one subject's data; test on the second half of that subject's data.
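
Both evaluation schemes can be sketched as index bookkeeping over a toy dataset (subject and trial counts here are made up):

```python
import numpy as np

# Toy layout: 3 subjects, 10 trial indices each.
subjects = {s: np.arange(s * 10, s * 10 + 10) for s in range(3)}

def leave_one_subject_out(test_subject):
    """Train on every subject except one, test on that one."""
    train = np.concatenate(
        [idx for s, idx in subjects.items() if s != test_subject])
    return train, subjects[test_subject]

def adaptive_split(target_subject):
    """Train on all other subjects plus the first half of the target
    subject's trials; test on the second half."""
    half = len(subjects[target_subject]) // 2
    train = np.concatenate(
        [idx for s, idx in subjects.items() if s != target_subject]
        + [subjects[target_subject][:half]])
    return train, subjects[target_subject][half:]

loso_train, loso_test = leave_one_subject_out(0)
ad_train, ad_test = adaptive_split(0)
```
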
**Evaluation on BCI Competition III dataset IV a**:
Down-sampled to 100 Hz. Only generalizability is tested, using adaptive training with and without augmented data.