Spectral information of EEG signals with respect to epilepsy classification

Abstract

Background

The spectral information of the EEG signal with respect to epilepsy is examined in this study.

Method

To assess the impact of alternative definitions of the analysed frequency sub-bands, a number of spectral thresholds are defined and the respective frequency sub-band combinations are generated. For each of these frequency sub-band combinations, the EEG signal is analysed and a vector of spectral characteristics is defined. Based on this feature vector, a classification scheme is used to measure the appropriateness of the specific frequency sub-band combination, in terms of epileptic EEG classification accuracy.

Results

The obtained results indicate that additional frequency band analysis is beneficial towards epilepsy detection.

Conclusions

This work includes the first systematic assessment of the impact of the frequency sub-bands on epileptic EEG classification accuracy. The obtained results reveal several frequency sub-band combinations that achieve high classification accuracy and have not previously been reported in the literature.

1 Introduction

Signal processing of the electroencephalogram (EEG) is a field that has drawn significant attention in recent years. As a result, numerous EEG processing methodologies have been presented in the literature. One of the most popular fields in EEG signal processing is epilepsy detection and classification. Being one of the most common neurological disorders [1], epilepsy has been the focus of hundreds of EEG analysis studies. Epilepsy is a chronic brain disorder, characterized by recurrent seizures, which cannot be predicted. The severity of the condition can vary greatly, while seizures may fall into a large variety of types [2].

Most studies on epileptic activity detection/classification using EEG signal processing formulate methodologies that analyse the EEG signal by extracting informative features from it [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]. To this end, spectral analysis of the EEG signal is essential, since epileptic activity interrupts normal brain functionality. Analysing the EEG signal frequency patterns in order to extract spectral characteristics is one of the most common types of EEG analysis, either by itself (i.e. by focusing on the frequency domain) or combined with other types of analysis (such as non-linear analysis), thus resulting in a vector of features. These features are then used as input to a classifier, which classifies the epileptic signals.

The EEG spectral analysis is based on a set of frequency sub-bands. Researchers have mainly used wavelet transform (WT) [3,4,5,6,7,8,9,10,11,12,13,14,15,16] and time-frequency distributions (TFD) [17,18,19,20] to analyse the EEG spectral patterns. However, although spectral analysis is a well-known approach, with numerous studies including spectral characteristics in the features extracted from the EEG, the importance of the frequency sub-bands that are used to analyse the signal has never been thoroughly investigated in the literature. It is medically established that brainwaves are divided based on their frequency into several sub-bands, namely delta (1–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz) and gamma (30–80 Hz) [21]. Several researchers therefore roughly focus on these sub-bands [3,4,5,6,7,8,9,10,11,12,13,14, 17, 18], within the technical limitations that the analysis technique (e.g. WT) imposes. The importance of the frequency sub-bands and their limits has thus not been analysed in the literature: in WT-based approaches the frequency sub-bands are set automatically [3,4,5,6,7,8,9,10,11,12,13,14,15,16], while in TFD-based methodologies an attempt to compare the impact of different sub-bands has been presented [17], but it was not systematic, since only four different sub-band combinations were analysed.

The main aim of this study is to examine the impact of frequency sub-band selection on EEG epilepsy classification. To this end, a methodology has been developed, which initially defines the number of spectral thresholds (which determines the number of frequency sub-bands that are created) from 0 to 12, with 0 meaning that the overall frequency spectrum of the EEG is considered as a single frequency sub-band and all other values (1–12) yielding one more frequency sub-band than thresholds (e.g. for five spectral thresholds, six frequency sub-bands are created). Then, all possible combinations of these sub-bands are created, subject to a simple limitation (the range of each sub-band is forced to be ≥ 2 Hz). From each combination, a set of features is extracted and used in a classifier. The Bonn EEG database has been employed and results are obtained in terms of classification accuracy. To the best of the author’s knowledge, this is the first systematic analysis of the impact of the number and range of frequency sub-bands presented in the literature. Furthermore, the results reveal frequency sub-bands that achieved high classification accuracy and have not been studied in the literature before.

2 Related work

2.1 Dataset

The Bonn EEG database [22], a well-known benchmark dataset for this problem, has been employed in this study. The database includes recordings for both healthy and epileptic subjects, divided into five subsets (denoted as A–E and named Z, O, N, F and S, respectively), each containing 100 single-channel EEG recordings. Sets A and B (Z and O files) are recordings from five healthy volunteers with eyes open and eyes closed, respectively. The recordings are made extracranially, using the standard 10–20 electrode positioning system. Sets C and D (N and F files) are seizure-free recordings from five epileptic patients, from the epileptogenic zone (set D) and the hippocampal formation of the opposite brain hemisphere (set C), while set E (S files) contains seizure activity, selected from several recording sites exhibiting ictal activity. Sets C, D and E are recorded intracranially, using depth electrodes implanted symmetrically into the hippocampal formation and strip electrodes implanted onto the lateral and basal regions (middle and bottom) of the neocortex. An example recording of each set is illustrated in Fig. 1. The sampling rate of the EEG data is 173.61 Hz, and each recording has a duration of 23.6 s (4096 samples), recorded using 12-bit resolution, while the spectral bandwidth is 0.5 to 85 Hz.

Fig. 1 Recordings from the five sets of the Bonn EEG database

2.2 Methods using wavelet transform

The WT-based methods presented in the literature for the analysis of epilepsy in EEG mainly apply the discrete wavelet transform (DWT) or wavelet packet decomposition (WPD). WT is a time-frequency technique, which provides both time and frequency views of a signal [23]. Thus, it can accurately capture and localize transient features in the data, such as epileptic spikes. In wavelet analysis, the initial signal is represented by a linear combination of specific functions, which are obtained by dilation and translation of the mother wavelet. The signal is decomposed into segments of half its size and spectrum with the use of the mother wavelet. In DWT in particular, the scaling and translation parameters are expressed in powers of two. A series of quadrature mirror filters (QMF) is used, serving as high-pass and low-pass filters. In the first level, the conjugate filters (high-pass and low-pass) are applied to the input signal, resulting in a set of coefficients, named wavelet coefficients. The output of the low-pass filter (“approximation”) is decomposed further in the next level, whereas the output of the high-pass filter (“detail”) is not. The procedure is repeated for the approximation only, until the signal is decomposed to reveal the band of interest.

WPD is a wavelet transform that can also be interpreted as an expansion of the DWT, wherein the signal is analysed with a set of QMFs that divide the frequency axis into separate intervals of various sizes [24]. However, in the WPD, the signal is passed through more filters than in the DWT, and both the detail and approximation coefficients are decomposed. In the first level of decomposition, the obtained wavelet packet coefficients are referred to as the first-level approximation and detail, respectively. In the second level, the approximation of the approximation (AA), the detail of the approximation (DA), the approximation of the detail (AD) and the detail of the detail (DD) coefficients are computed, and this recursive algorithm renders each newly computed wavelet packet coefficient the root of its own analysis tree. This recurrent splitting is represented as a binary tree. The steps of the methodological approaches presented in the literature are common in both cases: the EEG signal is decomposed into several frequency sub-bands and features are extracted, creating a feature vector that is most commonly used as input to a classifier.

2.2.1 DWT-based studies

The sampling frequency of the EEG recordings in the Bonn database is 173.61 Hz, and thus the frequency range is 0–86.8 Hz. In the majority of methods, the entire spectrum of the EEG recordings was analysed. However, frequencies higher than 60 Hz are often characterized as noise and are subsequently discarded. For that reason, some researchers have initially applied a band-pass filter, which removes the redundant frequencies and focuses only on the spectrum that corresponds to the five medically established EEG rhythms, i.e. delta (0–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), beta (13–30 Hz) and gamma (30–60 Hz or 30–80 Hz).

Subasi [3] used DWT to decompose the EEG signals into six frequency sub-bands. However, only the wavelet coefficients that correspond to the frequency range of interest 0–21.7 Hz, meaning the details D3-D5 and the approximation A5, were used to calculate the features and train a mixture of experts (ME)-based classifier. Guo et al. [4] also used the DWT to analyse the EEG signals, applying a four-level decomposition, dividing the selected EEG recordings into five frequency sub-bands. The line length feature was extracted from each of the five sub-signals (D1-D4 and A4) forming the feature vector that trained a multilayer perceptron neural network (MLP). Ocak [5] applied a decomposition of three levels in the entire spectrum (0–86.8 Hz). Approximate entropy (ApEn) values, calculated for all the frequency bands, were used to define a threshold which classified the EEG segments. Kumar et al. [6] applied a five-level decomposition and calculated the ApEn in each decomposition level. The generated feature vector was fed to an MLP classifier. In a subsequent study, the same group applied a decomposition of five levels (as they previously suggested in [6]), using the fuzzy approximate entropy (fApEn) and support vector machines (SVM) for classification.
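The correspondence between decomposition levels and frequency ranges quoted above follows from the dyadic halving of the spectrum. The following is a minimal sketch, assuming ideal half-band filters and the 173.61 Hz sampling rate of the Bonn recordings; the function name `dwt_bands` is illustrative and does not come from any of the cited studies.

```python
FS = 173.61  # sampling rate of the Bonn recordings in Hz (from the text)


def dwt_bands(fs, levels):
    """Nominal frequency range of each DWT sub-signal, assuming ideal half-band filters."""
    bands = {}
    high = fs / 2.0  # Nyquist frequency, upper edge of the spectrum
    for level in range(1, levels + 1):
        low = high / 2.0
        bands[f"D{level}"] = (low, high)  # detail keeps the upper half
        high = low                        # approximation is split again
    bands[f"A{levels}"] = (0.0, high)
    return bands


bands = dwt_bands(FS, 5)
for name, (low, high) in bands.items():
    print(f"{name}: {low:.2f}-{high:.2f} Hz")
```

Under these assumptions, D3, D4, D5 and A5 together span 0 to roughly 21.7 Hz, which matches the range of interest reported for the five-level decompositions.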

A comparison of three feature extraction techniques, principal component analysis (PCA), independent component analysis (ICA) and linear discriminant analysis (LDA), was presented in [8]. The EEG recordings were subjected to a five-level decomposition, and statistical features were extracted only from the sub-signals D3, D4, D5 and A5, which correspond to the frequency range of 0–21.7 Hz. The dimension of the resulting feature set was reduced by using PCA, ICA and LDA, and the feature vector was used as input to an SVM classifier. In another DWT-based study [9], the authors’ main target was the implementation of a feature extraction system based on genetic programming. They therefore applied a four-level decomposition to analyse the signal into sub-signals, followed by genetic programming to reduce the dimension of the extracted feature vector. The full and the reduced feature sets were each used to train a k-nearest neighbour (KNN) classifier. Results indicated that the reduced feature vector improved the classifier’s performance. A comprehensive methodology based on optimized extreme learning machine (OELM) was proposed in [10]. In this methodology, wavelet-based statistical features were extracted from a four-level decomposition and the OELM classifier was trained with the features extracted from the entire spectrum (0–86.8 Hz). Five classification problems were addressed (among them the five-class problem Z-O-N-F-S), and the performance was measured in terms of accuracy, which exceeded 94% for all of the problems.

Another approach is to isolate the frequency band that covers the five EEG rhythms from the redundant frequencies of the signal by applying a band-pass filter. A wavelet-chaos methodology was presented by Adeli et al. [11], where a low-pass finite impulse response (FIR) filter was used to limit the EEG signal to the 0–60 Hz band. The EEG recordings were then subjected to a four-level decomposition, and the average values and standard deviations of two parameters (namely correlation dimension and largest Lyapunov exponent) were calculated in each wavelet sub-signal (D1-D4 and A4), representing the system’s chaoticity. In a subsequent study [12], the same authors applied wavelet analysis and decomposed the signals into the same frequency sub-bands, evaluating different methods of classification. A similar approach is described in study [13], wherein the authors applied a band-pass filter and cut off all the signal activity outside the 0–60 Hz range to prepare the EEG signals for further processing. In the next stage, a four-level decomposition was applied and the calculated autoregressive (AR) parameters of each sub-band were fed to an MLP classifier. Wang et al. [14] presented a novel classification algorithm based on a voting strategy and a hardware implementation. The authors used a band-pass filter to focus only on the 0–32 Hz range and then applied a three-level decomposition and extracted the sample entropy (SampEn) only from the detail coefficients (D1, D2, D3).

2.2.2 WPD-based studies

Ocak [15] decomposed the EEG segments through a four-level wavelet packet decomposition. ApEn values of the wavelet coefficients of all 31 nodes of the decomposition tree were used as a feature vector, while a genetic algorithm was employed to reduce the number of features and find the optimal feature subset that maximizes the classification performance of a learning vector quantization (LVQ) scheme. Swami et al. [16] used wavelet packet decomposition to extract valuable information from the EEG signal. A six-level wavelet packet decomposition yielding 64 nodes was performed, and several statistical features were extracted from each node. The authors tested seven different combinations of the feature vector and identified the best pair, reaching high levels of accuracy. Table 1 summarizes WT-based methods (DWT and WPD) presented in the literature.
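The node counts quoted for the two WPD studies can be related to the depth of a full binary decomposition tree. The sketch below is one reading that is consistent with the reported numbers (31 counted over all levels including the root, 64 counted at the deepest level only); the cited papers may count nodes differently, and `wpd_node_counts` is a hypothetical helper.

```python
def wpd_node_counts(levels):
    """Return (terminal nodes, all nodes) of a full wavelet packet tree.

    Level 0 is the original signal; every node splits into an
    approximation and a detail child at the next level.
    """
    per_level = [2 ** k for k in range(levels + 1)]
    return per_level[-1], sum(per_level)


leaves4, total4 = wpd_node_counts(4)
leaves6, total6 = wpd_node_counts(6)
print(total4)   # all nodes of a four-level tree
print(leaves6)  # terminal nodes of a six-level tree
```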

Table 1 WT-based methods for EEG analysis

2.3 Methods using time-frequency analysis

The smoothed pseudo Wigner-Ville distribution (SPWVD) was applied in study [17]. Various lengths of time-frequency resolutions (64, 128, 256 and 512), time windows (3 and 5) and frequency sub-bands (4, 5, 7 and 13) were analysed, aiming to extract several features from the spectrum of the signal reflecting the energy distribution over the time-frequency plane. PCA was applied to the obtained features, and then an artificial neural network (ANN) was employed for classification. In [18], the same group presented a comprehensive study wherein the short-time Fourier transform (STFT) and 12 other TFDs were evaluated. The power spectrum density (PSD) of each segment was also extracted and used as input to an ANN classifier.

A methodology based on fast Fourier transform (FFT) and ApEn was proposed in [19]. The average power spectrum was extracted in each sub-band of 4 Hz along with the ApEn. In total, 16 features were extracted, and the ability of genetic programming and PCA to reduce the dimension of feature vector was examined. The SVM classifier with linear and radial basis functions (kernel functions) was also employed.

In study [20], EEG analysis was performed using TFDs, particularly the spectrogram (SP), the Choi-Williams distribution (CWD) and the SPWVD. The purpose of the study was both the identification of the seizure peaks and the classification of the EEG signals. For the identification of the seizure peaks, the TFDs were calculated and the maximum values were found. The normalized Renyi marginal entropy (RME) was extracted for various window lengths (11, 17, 27, 41, 49, 93, 151, 205, 255) for SP and SPWVD, and the best CWD result was obtained using the best window lengths found for SP and SPWVD. The SPWVD with the RME provided the best results in terms of time-frequency resolution for the peak identification problem. Each signal of the entire dataset was segmented into six sub-bands, and the energy from the sub-bands B1, B2 and B3, corresponding to the frequency range of interest of 0.5 to 12 Hz, was extracted. A vector of 200 values of energy for the three sub-bands of interest was obtained, and the moving averages were extracted. The classification of the signals was performed by a threshold which was defined by the mean of the moving average of energy for each band. The obtained results were used as input to a score function to classify each signal. Methods based on TFD analysis are summarized in Table 2.

Table 2 TFD-based methods for EEG analysis

3 Method

The flowchart of the methodology followed in this study to assess the spectral characteristics of the EEG signals is presented in Fig. 2.

Fig. 2 Flowchart of the methodology followed in this study

3.1 Select number of thresholds

Initially, the number of spectral thresholds is selected, which determines the number of frequency sub-bands that are created; for N spectral thresholds, N + 1 frequency sub-bands are analysed. The number of spectral thresholds that are examined in this study varied from N = 0 (thus considering all EEG spectrum to be a single sub-band) to N = 12 (thus creating 13 spectral sub-bands).

3.2 Create combinations

For each number of thresholds, all possible threshold combinations are generated, subject to a single constraint: no two consecutive thresholds can be closer than 2 Hz. The limits for the spectral analysis are set to [0, 42] Hz. For N spectral thresholds, the threshold set TN is defined as:

$$ {T}^N=\left\{{t}_i\right\},i=1:N $$
(1)

with t_0 = 0 Hz and t_{N+1} = 42 Hz, so that:

$$ {t}_{i+1}-{t}_i\ge 2\ \mathrm{Hz},\forall i=0:N, $$
(2)

while each frequency sub-band is defined as:

$$ {f}_i=\left[{t}_i,{t}_{i+1}\right],i=0:N $$
(3)

and the frequency sub-bands set FN is defined as:

$$ {F}^N=\left\{{f}_i\right\},i=0:N $$
(4)

with |FN| = N + 1. For example, for N = 5, F5 = {[0, t1], [t1, t2], [t2, t3], [t3, t4], [t4, t5], [t5, 42]} Hz.
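Equations (1) to (4) can be illustrated with a short sketch that turns a threshold set into its sub-bands and enforces the 2 Hz minimum width. The function and constant names are illustrative assumptions rather than part of the original method.

```python
F_MIN, F_MAX = 0, 42  # analysis limits in Hz (from the text)
MIN_WIDTH = 2         # minimum sub-band width in Hz (from the text)


def sub_bands(thresholds):
    """Build the sub-bands [t_i, t_(i+1)] for an ordered list of thresholds.

    t_0 = F_MIN and t_(N+1) = F_MAX are added automatically, so N thresholds
    give N + 1 sub-bands. Raises ValueError if any band is narrower than MIN_WIDTH.
    """
    edges = [F_MIN, *thresholds, F_MAX]
    bands = list(zip(edges[:-1], edges[1:]))
    for low, high in bands:
        if high - low < MIN_WIDTH:
            raise ValueError(f"sub-band [{low}, {high}] Hz is narrower than {MIN_WIDTH} Hz")
    return bands


# Four thresholds that reproduce the conventional rhythm boundaries within 0-42 Hz.
print(sub_bands([4, 8, 13, 30]))
```

With an empty threshold list (N = 0) the function returns the single band [0, 42] Hz.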

In order to create all different threshold combinations CN that satisfy the above limitation, only integer values of thresholds are considered. Thus, ti ∈ [2, 40] Hz, i = 1:N, since all frequency sub-bands must be ≥ 2 Hz, and:

$$ {C}^N=\left\{\mathrm{all}\ \mathrm{different}\ \mathrm{combinations}\ \mathrm{of}\ {F}^N\right\} $$
(5)

The number of combinations varies greatly as N increases; N vs |CN| is presented in Fig. 3.

Fig. 3 Number of spectral threshold combinations (x-axis) for all values of N (y-axis)
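The growth of |CN| with N can be checked numerically. The sketch below counts integer threshold sets in [2, 40] Hz whose consecutive members are at least 2 Hz apart. The closed form C(40 − N, N) is a standard combinatorial identity applied here as an assumption (it is not stated in the text), and it is cross-checked against brute-force enumeration for small N.

```python
from itertools import combinations
from math import comb


def count_brute_force(n, low=2, high=40, min_gap=2):
    """Count valid threshold sets by enumerating every candidate."""
    count = 0
    for ts in combinations(range(low, high + 1), n):
        if all(b - a >= min_gap for a, b in zip(ts, ts[1:])):
            count += 1
    return count


def count_closed_form(n):
    """Number of ways to pick n integers from 2..40 with gaps of at least 2."""
    return comb(40 - n, n)


# The two counts agree wherever brute force is cheap enough to run.
assert all(count_brute_force(n) == count_closed_form(n) for n in range(5))

counts = {n: count_closed_form(n) for n in range(13)}
total = sum(counts.values())
for n, c in counts.items():
    print(n, c)
print("total:", total)
```

Summed over N = 0 to 12, this gives a total on the order of 1.3 × 10^8 combinations, in line with the figure quoted in the Discussion.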

3.3 Spectral feature extraction

3.3.1 Sub-band energy

All EEG signals are initially filtered using a low-pass filter with a cut-off frequency of 42 Hz. Then, each threshold set combination CN is used to define a set of filters for the EEG signal: one low-pass filter for the [0, t_1] Hz sub-band, one high-pass filter for the [t_N, 42] Hz sub-band and N − 1 band-pass filters for the [t_i, t_{i+1}] Hz sub-bands, i = 1:N − 1. All EEG filters are designed as elliptic IIR filters, with t_i ± 0.5 Hz values as fstop and fpass thresholds, respectively. The overall procedure is illustrated in Fig. 4.

Fig. 4 Spectral feature extraction step. After initial filtering (0–42 Hz), the signal is filtered with N + 1 elliptic filters (middle column), resulting in N + 1 filtered signals (right column)
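The layout of the filter bank described above can be written down as a list of filter specifications. This sketch only derives filter types and band edges; it does not design the elliptic filters themselves. It assumes that each threshold sits in the middle of a 1 Hz transition band, so the passband edge lies 0.5 Hz inside the sub-band and the stopband edge 0.5 Hz outside it, which is one plausible reading of the ± 0.5 Hz rule. All names are hypothetical.

```python
F_MAX = 42  # upper analysis limit in Hz (from the text)
HALF = 0.5  # half-width of the transition band in Hz (from the text)


def filter_bank(thresholds):
    """Describe the N + 1 filters implied by an ordered list of N thresholds."""
    if not thresholds:
        # N = 0: the initial 0-42 Hz low-pass already isolates the single band.
        return [{"kind": "none", "band": (0, F_MAX)}]
    first, last = thresholds[0], thresholds[-1]
    specs = [{"kind": "lowpass", "band": (0, first),
              "fpass": first - HALF, "fstop": first + HALF}]
    for low, high in zip(thresholds, thresholds[1:]):
        specs.append({"kind": "bandpass", "band": (low, high),
                      "fpass": (low + HALF, high - HALF),
                      "fstop": (low - HALF, high + HALF)})
    specs.append({"kind": "highpass", "band": (last, F_MAX),
                  "fpass": last + HALF, "fstop": last - HALF})
    return specs


bank = filter_bank([4, 8, 13, 30])
for spec in bank:
    print(spec)
```

Edge pairs of this kind are the usual inputs to an elliptic filter order estimator, together with passband ripple and stopband attenuation values that the text does not specify.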

The energy of each of the N + 1 filtered signals (ei) is then calculated, and the vector of energies (EN) is used for the classification.

3.3.2 Total EEG energy

The total EEG energy (TE) is also calculated as the sum of all sub-band energies:

$$ \mathrm{TE}=\Sigma {e}_i $$
(6)

3.3.3 Sub-band fractional energy

Besides the energy of each sub-band, the fractional energy (fei) is also calculated:

$$ {fe}_i={e}_i/\mathrm{TE} $$
(7)

The vector of fractional energies (FEN) is also used as input for the classification step.
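Equations (6) and (7) can be illustrated with a few lines of code. The text does not define how the energy of a filtered signal is computed, so this sketch assumes the common sum-of-squares definition; the function names and the toy signals are illustrative only.

```python
def energy(samples):
    """Signal energy as the sum of squared samples (assumed definition)."""
    return sum(x * x for x in samples)


def energy_features(filtered_signals):
    """Return (E, TE, FE): sub-band energies, total energy, fractional energies."""
    e = [energy(s) for s in filtered_signals]
    te = sum(e)                  # Eq. (6)
    fe = [ei / te for ei in e]   # Eq. (7)
    return e, te, fe


# Three toy "filtered signals" standing in for the N + 1 sub-band outputs.
e, te, fe = energy_features([[1.0, 2.0], [2.0, 0.0], [0.0, 1.0]])
print(e, te, fe)
```

By construction the fractional energies sum to one, so they describe how the total energy is distributed across the sub-bands independently of the overall signal amplitude.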

3.3.4 Spectral entropy

The spectral entropy (SEn) is the Shannon entropy of the power spectrum density of each EEG signal, calculated as:

$$ \mathrm{SEn}=-\Sigma \left({P}_k\log\ {P}_k\right)/\log (M) $$
(8)

where Pk is the normalized spectral power at frequency bin k (so that ΣPk = 1) and M is the number of frequency bins.
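Equation (8) can be sketched as follows, taking a list of non-negative power values as input. How the power spectrum density is estimated is not specified in the text, so the input is treated as given; bins with zero power are skipped, using the usual convention that 0 · log 0 = 0.

```python
from math import log


def spectral_entropy(power):
    """Normalized Shannon entropy of a power spectrum, following Eq. (8)."""
    total = sum(power)
    p = [value / total for value in power]  # normalize so that sum(P_k) = 1
    m = len(p)                              # number of frequency bins
    entropy = -sum(pk * log(pk) for pk in p if pk > 0)
    return entropy / log(m)


print(spectral_entropy([1.0, 1.0, 1.0, 1.0]))  # flat spectrum
print(spectral_entropy([0.0, 5.0, 0.0, 0.0]))  # single spectral peak
```

The division by log(M) bounds the value between 0 (all power in one bin) and 1 (power spread evenly over all bins).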

3.4 Classification

The spectral feature vector created in the previous step is FV = {EN, TE, FEN, SEn}. Thus, the size of FV is 2N + 4, except in the case of N = 0 (i.e. when the entire EEG spectrum is considered as a single sub-band), where |FV| = 1 (i.e. a single feature is included). The number of spectral sub-bands (FN), spectral threshold combinations (CN) and the size of the feature vector (FVN) with respect to the number of spectral thresholds (N) are presented in Table 3. Classification is based on a random forest classifier [25], which is an ensemble learning method based on the construction of a multitude of decision trees. In this study, random forests were constructed with standard parameters, i.e. each forest contains 100 decision trees, which are grown to full depth.
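The layout of the feature vector and its size can be checked with a small sketch. It covers N ≥ 1 only, since the text does not state which single feature is kept for N = 0. The numeric values below are toy placeholders, not results from the study.

```python
def feature_vector(e, te, fe, sen):
    """Concatenate FV = {E, TE, FE, SEn} into one flat list."""
    return [*e, te, *fe, sen]


def expected_size(n):
    """Feature vector size for n >= 1 thresholds: (n + 1) + 1 + (n + 1) + 1."""
    return 2 * n + 4


# Toy values for N = 1 (two sub-bands).
fv = feature_vector(e=[6.0, 4.0], te=10.0, fe=[0.6, 0.4], sen=0.8)
print(len(fv), expected_size(1))
```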

Table 3 Number of spectral thresholds and size of spectral threshold set (N/TN) and respective number of spectral sub-bands (FN), spectral threshold combinations (CN) and size of feature vector (FVN)

The overall methodology is presented in Algorithm 1.

figure a

4 Results

The study focused on two different classification problems: the five-class problem (i.e. classifying all Z, O, N, F and S categories), with the main objective being to identify the spectral sub-bands that carry the maximum information, and the three-class problem (i.e. ZO-NF-S categories), which is a well-known medically established problem in this area. The obtained results are reported in terms of classification accuracy. The 10-fold stratified cross-validation technique has been employed in the classification: the dataset is divided into 10 equally sized subsets, each having the same number of EEG recordings from each category, and in each round nine of them are used for training the classifier and the remaining one for testing. This procedure is applied 10 times, so that each subset is used once for testing, resulting in 10 confusion matrices; the final confusion matrix (used to calculate classification accuracy) is their summation.
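The validation procedure described above can be sketched in plain Python. The classifier is passed in as a function, so a random forest could be plugged in; here a trivial nearest-class-mean rule on a one-dimensional toy feature stands in for it. The names `stratified_folds`, `cross_validate` and `nearest_mean` are hypothetical, and the toy data are not from the Bonn database.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds with equal class counts per fold."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def cross_validate(features, labels, fit_predict, k=10):
    """Sum the per-fold confusion matrices and return (matrix, accuracy)."""
    classes = sorted(set(labels))
    row = {c: i for i, c in enumerate(classes)}
    confusion = [[0] * len(classes) for _ in classes]
    for test_idx in stratified_folds(labels, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        predicted = fit_predict([features[i] for i in train_idx],
                                [labels[i] for i in train_idx],
                                [features[i] for i in test_idx])
        for i, guess in zip(test_idx, predicted):
            confusion[row[labels[i]]][row[guess]] += 1
    correct = sum(confusion[i][i] for i in range(len(classes)))
    return confusion, correct / len(labels)


def nearest_mean(train_x, train_y, test_x):
    """Stand-in classifier: assign each sample to the closest class mean."""
    sums = {}
    for x, y in zip(train_x, train_y):
        s, n = sums.get(y, (0.0, 0))
        sums[y] = (s + x[0], n + 1)
    means = {y: s / n for y, (s, n) in sums.items()}
    return [min(means, key=lambda y: abs(means[y] - x[0])) for x in test_x]


# Toy data: 100 recordings per class, one perfectly separable feature.
labels = [c for c in "ZONFS" for _ in range(100)]
features = [[float("ZONFS".index(c))] for c in labels]
confusion, accuracy = cross_validate(features, labels, nearest_mean)
print(accuracy)
```

With 500 recordings and five balanced classes, each fold holds 50 recordings, 10 from each class, which matches the description in the text.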

Table 4 presents the best accuracy obtained for the five-class problem for each number of thresholds N (max accuracy), together with the average of the top-10 classification accuracies for each N (average accuracy). The results are illustrated in Fig. 5.

Table 4 Maximum obtained accuracy (max accuracy) and average value of the top 10 obtained values for accuracy (average accuracy) for the five-class problem, for N = 0–12

Fig. 5 Maximum obtained accuracy (max accuracy) and average value of the top-10 obtained values for accuracy (average accuracy) for the five-class problem, for N = 0–12

The obtained accuracy results for N = 1 are presented in Fig. 6. The value of the threshold (t1) is on the x-axis; thus, the respective accuracy result is obtained using features extracted from frequency sub-bands F1 = {[0, t1], [t1, 42]} Hz, with the size of the feature vector FV1 = 6. For example, for t1 = 4 Hz, the frequency sub-bands are {[0, 4], [4, 42]} Hz and the accuracy result is 73.60%. Also, the accuracy result of N = 0 (F0 = [0, 42] Hz, FV0 = 1), being 44.80%, is depicted in Fig. 6 (black line) as a baseline result.

Fig. 6 Obtained accuracy results for N = 1 (t1 = 2:40). Black line denotes the accuracy for N = 0 (44.80%)

The obtained accuracy results for N = 2 are presented in Fig. 7. Since two spectral thresholds are used, the obtained results form a matrix M (with M (t1, t2) = accuracy obtained using these spectral thresholds) and are depicted in a 3D image. The value of the t1 threshold (Hz) is on the x-axis and the value of the t2 threshold (Hz) is on the y-axis. Thus, each point is the accuracy result for frequency sub-bands F2 = {[0, t1], [t1, t2], [t2, 42]} Hz (with size of feature vector FV2 = 8). For example, for t1 = 4 Hz and t2 = 6 Hz, the frequency sub-bands are {[0, 4], [4, 6], [6, 42]} Hz.

Fig. 7 Obtained accuracy for N = 2 (t1 = 2:38 Hz, t2 = 4:40 Hz)

For values of N greater than 2, the obtained results cannot be plotted against the individual ti values. Thus, results for N > 2 are presented in Fig. 8a–j with respect to the overall number of combinations (CN). Vertical lines represent the changes of t1. For example, the first part of Fig. 8a (denoted with gray color) presents the results of all C3 combinations with t1 = 2 Hz (which is the first valid value for t1, since t1 − t0 must be ≥ 2 Hz) and thus t2 ∈ [4, 38] Hz and t3 ∈ [6, 40] Hz. The sequence of C3 combinations for t1 = 2 Hz is {{[0, 2], [2, 4], [4, 6], [6, 42]}, {[0, 2], [2, 4], [4, 7], [7, 42]}, … {[0, 2], [2, 4], [4, 40], [40, 42]}, {[0, 2], [2, 5], [5, 7], [7, 42]}, …, {[0, 2], [2, 38], [38, 40], [40, 42]}}.
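The ordering of combinations along the x-axis can be reproduced with a small recursive generator that yields threshold sets in lexicographic order. This is a sketch of one way to obtain the sequence described above, with thresholds listed instead of full sub-bands; the generator name and its parameters are illustrative assumptions.

```python
def threshold_sets(n, low=2, high=40, min_gap=2):
    """Yield all valid threshold tuples of length n in lexicographic order."""
    def extend(prefix, start):
        remaining = n - len(prefix)
        if remaining == 0:
            yield tuple(prefix)
            return
        # Leave room for the thresholds that still have to follow this one.
        last = high - min_gap * (remaining - 1)
        for t in range(start, last + 1):
            yield from extend(prefix + [t], t + min_gap)
    yield from extend([], low)


# All C3 combinations that start with t1 = 2 Hz, in plotting order.
c3_t1_2 = [ts for ts in threshold_sets(3) if ts[0] == 2]
print(c3_t1_2[:2], c3_t1_2[-1], len(c3_t1_2))
```

The first entries are (2, 4, 6) and (2, 4, 7), the block with t2 = 4 ends at (2, 4, 40) and is followed by (2, 5, 7), and the final entry is (2, 38, 40), matching the sequence given in the text.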

Fig. 8 Obtained accuracy results for N = 3 (a), N = 4 (b), N = 5 (c), N = 6 (d), N = 7 (e), N = 8 (f), N = 9 (g), N = 10 (h), N = 11 (i) and N = 12 (j)

To make the plots of Fig. 8 clearer, the results of C2 combinations are also generated in this form (Fig. 9). The subplots (a) to (f) in Fig. 9 correspond to the parts of the main plot that are connected with the red lines, each for a specific value of t1. Figure 9a (the first part of the main plot) corresponds to t1 = 2 Hz and thus t2 ∈ [4, 40] Hz, Fig. 9b (the second part of the main plot) corresponds to t1 = 3 Hz and thus t2 ∈ [5, 40] Hz, Fig. 9c corresponds to t1 = 4 Hz and t2 ∈ [6, 40] Hz, Fig. 9d corresponds to t1 = 5 Hz and t2 ∈ [7, 40] Hz, Fig. 9e corresponds to t1 = 6 Hz and t2 ∈ [8, 40] Hz and Fig. 9f corresponds to t1 = 7 Hz and t2 ∈ [9, 40] Hz.

Fig. 9 Accuracy results (%) for N = 2. Results for (a) t1 = 2 Hz and t2 = 4:40 Hz, (b) t1 = 3 Hz and t2 = 5:40 Hz, (c) t1 = 4 Hz and t2 = 6:40 Hz, (d) t1 = 5 Hz and t2 = 7:40 Hz, (e) t1 = 6 Hz and t2 = 8:40 Hz, (f) t1 = 7 Hz and t2 = 9:40 Hz

The top five obtained classification accuracy results for each N value, and the respective FN are presented in Table 5.

Table 5 Top 5 accuracy results for the five-class problem and the respective frequency sub-bands (FN)

Besides the five-class problem, the well-known three-class problem (ZO-NF-S) is also addressed. In this case, the main focus is a medically established problem, addressed by several researchers in the literature [17, 26,27,28,29,30,31]. Again, the results are in terms of classification accuracy, and the 10-fold stratified cross-validation technique has been employed. The obtained results are presented in Table 6.

Table 6 Max and average accuracy for the three-class problem, for N = 0–12

5 Discussion

A methodology for the systematic analysis of the frequency sub-band definition in EEG analysis for epilepsy is presented in this work, in order to assess the impact of different numbers and alternative definitions of frequency sub-bands on this problem. The methodology is based on the definition of a number of spectral thresholds, based on which a set of frequency sub-bands is created. Then, a set of spectral features is extracted and used to train a random forest classifier. For each specific number of spectral thresholds (ranging from 0 to 12), all combinations of sub-band definition are analysed, with the limitation that each sub-band range must be at least 2 Hz, resulting in a total of ~ 1.32 × 10^8 frequency sub-band combinations. The methodology has been applied on a benchmark dataset, the Bonn EEG database, for the five-class (Z-O-N-F-S) and the three-class (ZO-NF-S) problems.

For the five-class problem, the maximum accuracy obtained for each N (presented in Table 4) ranges from 44.80% (for N = 0) to 91.20% (obtained for two combinations with N = 9). An important conclusion from this analysis is that increasing the number of frequency sub-bands beyond this point does not improve the classification accuracy, since the results peak at N = 9 and then decrease slightly with N (Fig. 5). The same conclusion is reached when the average accuracy of the top 10 results is taken into consideration; the maximum average accuracy is 90.72% (obtained for N = 9), decreasing to 90.08% (for N = 12). Evidence for this conclusion can also be found in Tzallas et al. [17] and Liang et al. [19], where 13 and 15 frequency sub-bands were examined, respectively, although drawn from single experiments rather than a systematic analysis. In [17], the results decrease for 13 frequency sub-bands compared to the results obtained for five and seven frequency sub-bands (although the five-class problem is not included in the analysis of [17]), while in [19] the obtained accuracy for the five-class problem is 85.90% using 15 frequency sub-bands. Furthermore, combinations with N = 5–12 achieved classification results ≥ 90%, which is in accordance with the majority of researchers using four to seven frequency sub-bands in their analysis (without, however, any justification for this selection).

Considering the delta, theta, alpha, beta and gamma frequency sub-bands (medically established rhythms) that correspond to the {[0–4], [4–8], [8–13], [13–30], [30–42]} Hz combination for four spectral thresholds (N = 4), the obtained accuracy is 82.80%, which is 6.8 percentage points lower than the maximum classification accuracy obtained for N = 4 (89.60%) and 8.4 percentage points lower than the best classification accuracy obtained in this study (91.20%, obtained for two frequency sub-band combinations for N = 9). Several of the frequency sub-band combinations that achieved high classification accuracy (≥ 90%) include frequency sub-bands that correlate with the medically established rhythms, but also sub-bands that clearly differ from them. For N = 4 spectral thresholds, the {[0–3], [3–8], [8–18], [18–33], [33–42]} Hz combination, which achieved the best classification accuracy (for N = 4), includes the [0–3] Hz band (resembling delta) and the [3–8] Hz band (resembling theta); however, the other bands are somewhat different. Also, the {[0–2], [2–8], [8–16], [16–25], [25–35], [35–42]} Hz combination, which is one of the frequency sub-band combinations that achieved maximum classification accuracy for N = 5, includes the [8–16] Hz band (close to the alpha rhythm) but differs significantly for all other rhythms. Furthermore, for N > 4, additional frequency sub-bands that carry significant information regarding this problem are revealed.

The frequency sub-band combinations that achieved maximum classification accuracy are in the first two lines for N = 9 in Table 5. Both include the [0–3] Hz and [3–7] Hz bands, closely related to delta and theta rhythms, but an additional band, [7–8] Hz, between theta and alpha rhythms, is also included. The beta rhythm is split into four and three smaller bands for the first and second combination, respectively. The gamma rhythm is also split into smaller bands (two for the first combination and three for the second). The low-frequency bands [0–3] and [3–7] are the most common among the ones that achieved high classification accuracy (≥ 90%). This is in agreement with several works presented in the literature [5, 7, 8, 11,12,13,14, 17,18,19,20]. In higher frequencies, however, there are major differences in the frequency sub-band combinations that achieved maximum results in this study. Especially with the WPD-based studies [15, 16], the frequency sub-bands used are in complete disagreement with the results obtained in this study. A band (0–43.4 Hz), included in [15, 16] studies, carries little information for this problem, while low-frequency sub-bands, extensively included in the high-accuracy achieving combinations in this study, are excluded from the WPD-based studies.

Considering the three-class problem, the maximum accuracy obtained for each N (presented in Table 6) ranges from 56% (for N = 0) to 98.8% (obtained for several combinations with N = 8 and N = 9). Again, increasing the number of frequency sub-bands does not always have a positive impact on the classification accuracy; the maximum values are obtained for N = 8 and the results then decrease with respect to N. In this case also, the combination that corresponds to the medically established rhythms obtained much lower classification accuracy. Among the frequency sub-band combinations that achieved high classification accuracy (≥ 90%), the low-frequency bands [0–3] and [3–7] are the most common, while there are significant differences in the high-frequency bands.

Table 7 compares methodologies presented in the literature for the five-class problem. Although the focus of this study is to assess the impact of the number of frequency sub-bands and the different frequency sub-band combinations on the classification of EEG regarding epilepsy, the obtained results compare well with the ones reported in the literature. The works by Guler and Ubeyli [32, 33] and Murugavel and Ramakrishnan [10] reported high classification accuracy; however, they are validated using a 50% holdout technique and not a cross-validation procedure. The results obtained using a cross-validation technique [17, 19, 34, 35] range from 86.10 to 93.75%, with the best result obtained in this study being 91.20%.
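The difference between the two validation schemes mentioned above can be sketched as follows. This is a minimal, dependency-free illustration of how sample indices are partitioned, not the evaluation code of any cited work; the function names, the fixed seed, the choice of k = 10 and the sample count of 500 are assumptions made for the example.

```python
import random


def holdout_split(n_samples, test_fraction=0.5, seed=0):
    """Single random split: each sample is used either for training or testing."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n_samples * test_fraction))
    return idx[n_test:], idx[:n_test]  # (train, test)


def k_fold_splits(n_samples, k=10, seed=0):
    """k-fold cross-validation: every sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


# Hypothetical data set size, for illustration only.
n = 500
train, test = holdout_split(n)
print("holdout:", len(train), "train /", len(test), "test")

for fold_no, (train, test) in enumerate(k_fold_splits(n, k=10), start=1):
    print("fold", fold_no, ":", len(train), "train /", len(test), "test")
```

With a holdout split the reported accuracy depends on one particular partition, whereas cross-validation averages over k partitions that together test every sample, which is why results from the two schemes are not directly comparable.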

Table 7 Comparison of methodologies presented in the literature for the five-class (Z-O-N-F-S) problem

A comparison of methodologies presented in the literature for the three-class problem is presented in Table 8. The results reported in the literature range from 95.6 to 98.8%, with the proposed method achieving 98.8%. Again, some researchers used different validation techniques; however, the results of works employing a 10-fold cross-validation technique [29,30,31] range from 98.28 to 98.8%.

Table 8 Comparison of methodologies presented in the literature for the three-class (ZO-NF-S) problem

6 Conclusions

The first systematic analysis in the literature regarding the impact of the frequency sub-band definition on the epileptic EEG classification problem is presented in this study. The study revealed significant conclusions, some of which are in accordance with the majority of works presented in the literature, while others contradict published works. A major conclusion of this study is that examining additional frequency sub-bands (and not only focusing on the medically established rhythms) can greatly benefit studies focusing on EEG analysis for epilepsy detection.

A limitation of this study is that the range of each sub-band was forced to be ≥ 2 Hz, thus not examining the frequency sub-bands in greater detail. The main reason for this limit was the high number of spectral threshold combinations as the number of spectral thresholds increases. In the future, the results obtained in this study will be validated on additional EEG recordings and other well-known EEG databases [36], including different types of seizure activity; the latter is of major importance since different types of epileptic seizure activity may present different spectral patterns. Also, the application of frequency-based EEG analysis (as in this work) is advantageous compared to other types of EEG processing, since it is of low computational complexity and can be applied in real time. Furthermore, the author will exploit the conclusions from this study (i.e. frequency sub-band combinations that achieve maximum classification accuracy) in the design of an EEG epilepsy classification procedure based on more complex signal processing techniques (such as using this combination for a time-frequency grid, as in [17]). Also, the employment of additional classification methods, such as neural networks and deep learning networks [37,38,39], will be studied in future communications.
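The growth in the number of spectral threshold combinations, given above as the reason for the ≥ 2 Hz limit, can be illustrated with a short count. The sketch assumes integer-valued thresholds over a 0–42 Hz range (both inferred from the combinations quoted in this paper, not stated as the actual search procedure) and counts every admissible placement; the function names are hypothetical. Writing each band width as 2 Hz plus a non-negative remainder turns the problem into a stars-and-bars count.

```python
from itertools import combinations
from math import comb


def count_combinations(n_thresholds, f_max=42, min_width=2):
    """Closed-form count of integer threshold placements with bands >= min_width.

    N thresholds give N + 1 bands. After reserving min_width Hz per band,
    the remaining slack is distributed freely over the N + 1 bands.
    """
    bands = n_thresholds + 1
    slack = f_max - min_width * bands
    if slack < 0:
        return 0
    return comb(slack + n_thresholds, n_thresholds)


def brute_force_count(n_thresholds, f_max, min_width=2):
    """Direct enumeration, used only to cross-check the formula on small ranges."""
    count = 0
    for thresholds in combinations(range(1, f_max), n_thresholds):
        edges = (0, *thresholds, f_max)
        if all(hi - lo >= min_width for lo, hi in zip(edges, edges[1:])):
            count += 1
    return count


# Cross-check on a small range, then tabulate the assumed 0-42 Hz case.
assert brute_force_count(3, f_max=12) == count_combinations(3, f_max=12)
for n in range(0, 10):
    print("N =", n, "->", count_combinations(n), "combinations")
```

Under these assumptions the count for 0–42 Hz reduces to the binomial coefficient C(40 − N, N), which grows quickly with N; lowering the minimum width to 1 Hz would enlarge the search space further.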

Abbreviations

ANN: Artificial neural network
ApEn: Approximate entropy
AR: Autoregressive
CWD: Choi-Williams distribution
DWT: Discrete wavelet transform
EEG: Electroencephalogram
fApEn: Fuzzy approximate entropy
FFT: Fast Fourier transform
FIR: Finite impulse response
ICA: Independent component analysis
KNN: k-nearest neighbor
LDA: Linear discriminant analysis
LVQ: Learning vector quantization
ME: Mixture of experts
MLP: Multilayer perceptron neural network
OELM: Optimized extreme learning machine
PCA: Principal component analysis
PSD: Power spectrum density
QMF: Quadrature mirror filters
RME: Normalized Renyi marginal entropy
SampEn: Sample entropy
SEn: Spectral entropy
SP: Spectrogram
SPWVD: Smoothed pseudo Wigner-Ville distribution
STFT: Short-time Fourier transform
SVM: Support vector machines
TFD: Time-frequency distributions
WPD: Wavelet packet decomposition
WT: Wavelet transform

References

  1. D. Hirtz, D.J. Thurman, K. Gwinn-Hardy, M. Mohamed, A.R. Chaudhuri, R. Zalutsky, How common are the “common” neurologic disorders? Neurology 68(5), 326–337 (2007). https://doi.org/10.1212/01.wnl.0000252807.38124.a3

  2. S.F. Robert, W.E. Boas, W. Blume, C. Elger, P. Genton, P.L.J. Engel, Epileptic seizures and epilepsy: Definitions proposed by the international league against epilepsy (ILAE) and the International Bureau for Epilepsy (IBE). Epilepsia 46(4), 470–472 (2005). https://doi.org/10.1111/j.0013-9580.2005.66104.x

  3. A. Subasi, EEG signal classification using wavelet feature extraction and a mixture of expert model. Expert Syst. Appl. 32(4), 1084–1093 (2007). https://doi.org/10.1016/j.eswa.2006.02.005

  4. L. Guo, D. Rivero, J. Dorado, J.R. Rabunal, A. Pazos, Automatic epileptic seizure detection in EEGs based on line length feature and artificial neural networks. J Neurosci Methods 191(1), 101–109 (2010). https://doi.org/10.1016/j.jneumeth.2010.05.020

  5. H. Ocak, Automatic detection of epileptic seizures in EEG using discrete wavelet transform and approximate entropy. Expert Syst. Appl. 36(2), 2027–2036 (2009). https://doi.org/10.1016/j.eswa.2007.12.065

  6. Y. Kumar, M.L. Dewal, R.S. Anand, Epileptic seizures detection in EEG using DWT-based ApEn and artificial neural network. SIViP 8(7), 1323–1334 (2014). https://doi.org/10.1007/s11760-012-0362-9

  7. Y. Kumar, M.L. Dewal, R.S. Anand, Epileptic seizure detection using DWT based fuzzy approximate entropy and support vector machine. Neurocomputing 133, 271–279 (2014). https://doi.org/10.1016/j.neucom.2013.11.009

  8. A. Subasi, M.I. Gursoy, EEG signal classification using PCA, ICA, LDA and support vector machines. Expert Syst. Appl. 37(12), 8659–8666 (2010). https://doi.org/10.1016/j.eswa.2010.06.065

  9. L. Guo, D. Rivero, J. Dorado, C.R. Munteanu, A. Pazos, Automatic feature extraction using genetic programming: An application to epileptic EEG classification. Expert Syst. Appl. 38(8), 10425–10436 (2011). https://doi.org/10.1016/j.eswa.2011.02.118

  10. A.M. Murugavel, S. Ramakrishnan, An optimized extreme learning machine for epileptic seizure detection. IAENG Int J Comput Sci 41(4), 212–221 (2014)

  11. H. Adeli, S. Ghosh-Dastidar, N. Dadmehr, A wavelet-chaos methodology for analysis of EEGs and EEG subbands to detect seizure and epilepsy. IEEE Trans Biomed Eng 54(2), 205–211 (2007). https://doi.org/10.1109/TBME.2006.886855

  12. S. Ghosh-Dastidar, H. Adeli, N. Dadmehr, Mixed-band wavelet-chaos-neural network methodology for epilepsy and epileptic seizure detection. IEEE Trans Biomed Eng 54(9), 1545–1551 (2007). https://doi.org/10.1109/TBME.2007.891945

  13. S.R. Mousavi, M. Niknazar, B.V. Vahdat, Epileptic seizure detection using AR model on EEG signals. Cairo International Biomedical Engineering Conference 2008 (CIBEC 2008) (IEEE, Cairo), p. 2008

  14. Y. Wang, Z. Li, L. Feng, C. Wang, W. Jing, Y. Zhang, Hardware Design of Seizure Detection Based on wavelet transform and sample entropy. J Circuits Syst Comp 25(9), 1650101 (2016). https://doi.org/10.1142/S0218126616501012

  15. H. Ocak, Optimal classification of epileptic seizures in EEG using wavelet analysis and genetic algorithm. Signal Process. 88(7), 1858–1867 (2008). https://doi.org/10.1016/j.sigpro.2008.01.026

  16. P. Swami, A.K. Godiyal, J. Santhosh, B.K. Panigrahi, M. Bhatia, S. Anand, Robust expert system design for automated detection of epileptic seizures using SVM classifier. International Conference on Parallel, Distributed and Grid Computing (PDGC 2014). IEEE (2014). https://doi.org/10.1109/PDGC.2014.7030745

  17. A.T. Tzallas, M.G. Tsipouras, D.I. Fotiadis, Automatic seizure detection based on time-frequency analysis and artificial neural networks. Comput Intell Neurosci, 80510 (2007). https://doi.org/10.1155/2007/80510

  18. A.T. Tzallas, M.G. Tsipouras, D.I. Fotiadis, Epileptic seizure detection in EEGs using time–frequency analysis. IEEE Trans Inf Technol Biomed 13(5), 703–710 (2009). https://doi.org/10.1109/TITB.2009.2017939

  19. S.F. Liang, H.C. Wang, W.L. Chang, Combination of EEG complexity and spectral analysis for epilepsy diagnosis and seizure detection. EURASIP J Adv Signal Process 1, 853434 (2010). https://doi.org/10.1155/2010/853434

  20. A. Ridouh, D. Boutana, S. Bourennane, EEG signals classification based on time frequency analysis. J Circuits Syst Comp 26(12), 1750198 (2017). https://doi.org/10.1142/S0218126617501985

  21. N.E. Crone, A. Korzeniewska, P.J. Franaszczuk, Cortical gamma responses: Searching high and low. Int. J. Psychophysiol. 79(1), 9–15 (2011). https://doi.org/10.1016/j.ijpsycho.2010.10.013

  22. R.G. Andrzejak, K. Lehnertz, F. Mormann, C. Rieke, P. David, C.E. Elger, Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: Dependence on recording region and brain state. Phys. Rev. E 64, 061907 (2001)

  23. S.G. Mallat, A theory for multiresolution signal decomposition: The wavelet representation. IEEE Trans Pattern Anal Mach Intell 11(7), 674–693 (1989). https://doi.org/10.1109/34.192463

  24. S.G. Mallat, A wavelet tour of signal processing (Academic press, 1999)

  25. L. Breiman, Random Forests. Mach. Learn. 45(1), 5–32 (2001). https://doi.org/10.1023/A:1010933404324

  26. U.R. Acharya, S.V. Sree, S. Chattopadhyay, W. Yu, P.C.A. Ang, Application of recurrence quantification analysis for the automated identification of epileptic EEG signals. Int J of Neural Syst 21(3), 199–211 (2011). https://doi.org/10.1142/S0129065711002808

  27. U. Orhan, M. Hekim, M. Ozer, EEG signals classification using the K-means clustering and a multilayer perceptron neural network model. Expert Syst. Appl. 38, 13475–13481 (2011). https://doi.org/10.1016/j.eswa.2011.04.149

  28. U.R. Acharya, F. Molinari, S.V. Sree, S. Chattopadhyay, K.H. Ng, J.S. Suri, Automated diagnosis of epileptic EEG using entropies. Biomed Signal Process Control 7, 401–408 (2012). https://doi.org/10.1016/j.bspc.2011.07.007

  29. M. Peker, B. Sen, D. Delen, A novel method for automated diagnosis of epilepsy using complex-valued classifiers. IEEE J Biomed Health Inform 20(1), 108–118 (2016). https://doi.org/10.1109/JBHI.2014.2387795

  30. A.K. Tiwari, R.B. Pachori, V. Kanhangad, B. Panigrahi, Automated diagnosis of epilepsy using key-point based local binary pattern of EEG signals. IEEE J Biomed Health Inform 21(4), 888–896 (2017). https://doi.org/10.1109/JBHI.2016.2589971

  31. A. Bhattacharyya, R.B. Pachori, A. Upadhyay, U.R. Acharya, Tunable-Q wavelet transform based multiscale entropy measure for automated classification of epileptic EEG signals. Appl. Sci. 7, 385 (2017). https://doi.org/10.3390/app7040385

  32. I. Guler, E.D. Ubeyli, Adaptive neuro-fuzzy inference system for classification of EEG signals using wavelet coefficients. J Neurosci Methods 148(2), 113–121 (2005). https://doi.org/10.1016/j.jneumeth.2005.04.013

  33. E.D. Ubeyli, I. Guler, Features extracted by eigenvector methods for detecting variability of EEG signals. Pattern Recogn. Lett. 28(5), 592–603 (2007). https://doi.org/10.1016/j.patrec.2006.10.004

  34. N. Nicolaou, J. Georgiou, Detection of epileptic electroencephalogram based on permutation entropy and support vector machines. Expert Syst. Appl. 39(1), 202–209 (2012). https://doi.org/10.1016/j.eswa.2011.07.008

  35. N.S. Tawfik, S.M. Youssef, M. Kholief, A hybrid automated detection of epileptic seizures in EEG records. Comput Electr Eng 53, 177–190 (2016). https://doi.org/10.1016/j.compeleceng.2015.09.001

  36. P. Fergus, A. Hussain, D. Hignett, D. Al-Jumeily, K. Abdel-Aziz, H. Hamdan, A machine learning system for automated whole-brain seizure detection. Appl Comput Inform 12(1), 70–89 (2016). https://doi.org/10.1016/j.aci.2015.01.001

  37. U.R. Acharya, S.L. Oh, Y. Hagiwara, J.H. Tan, H. Adeli, Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals. Comput. Biol. Med. 100(1), 270–278 (2018). https://doi.org/10.1016/j.compbiomed.2017.09.017

  38. P. Thodoroff, J. Pineau, A. Lim, Learning robust features using deep learning for automatic seizure detection. arXiv:1608.00220 (2016)

  39. O. Faust, Y. Hagiwara, T.J. Hong, O.S. Lih, U.R. Acharya, Deep learning for healthcare applications based on physiological signals: A review. Comput. Methods Prog. Biomed., 161 (2018). https://doi.org/10.1016/j.cmpb.2018.04.005

Funding

This research has been co-financed by the European Union and Greek national funds through the Operational Program Competitiveness, Entrepreneurship and Innovation, under the call RESEARCH – CREATE – INNOVATE (project code: T1EDK-01958).

Availability of data and materials

All data used in this manuscript are publicly available in [22].

Author information

Contributions

Markos G. Tsipouras is the sole author. The author read and approved the final manuscript.

Corresponding author

Correspondence to Markos G. Tsipouras.

Ethics declarations

Author’s information

MGT was born in Athens, Greece, in 1977. He received the diploma degree in Computer Science from the University of Ioannina, Greece, in 1999, and M.Sc. and Ph.D degrees in computer science, in 2002 and 2008 respectively, from the same department. Also, he received a Natural Sciences diploma from the Hellenic Open University in 2013. He has participated in more than 15 European and National Research & Development Projects as a researcher/developer. He has published more than 40 papers in peer-reviewed scientific journals, and more than 60 articles in peer-reviewed conference proceedings. Also, he has published 7 book chapters, and he has co-authored one book. His research interests include digital signal and image processing, medical informatics, artificial intelligence, fuzzy logic, data mining, decision support systems and expert systems.

Competing interests

The author declares that he has no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Cite this article

Tsipouras, M.G. Spectral information of EEG signals with respect to epilepsy classification. EURASIP J. Adv. Signal Process. 2019, 10 (2019). https://doi.org/10.1186/s13634-019-0606-8

  • DOI: https://doi.org/10.1186/s13634-019-0606-8

Keywords