Data Descriptor
Peer-Review Record

Training Datasets for Epilepsy Analysis: Preprocessing and Feature Extraction from Electroencephalography Time Series

by Christian Riccio 1, Angelo Martone 2, Gaetano Zazzaro 3,* and Luigi Pavone 4
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 20 March 2024 / Revised: 23 April 2024 / Accepted: 23 April 2024 / Published: 26 April 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

 

• The main objective of the manuscript is unclear

  Some papers have the aim of presenting and making available (freely or commercially) a certain database of datasets, describing its characteristics and giving guidelines about how it can be used.

  Other papers have the aim of presenting and making available (freely or commercially) a certain software toolbox designed to analyze a specific type of data, e.g. long-term EEGs, either locally (as client) or at server level. In order for the toolbox to be widely useful, it should be easily applicable not only to a specific group of datasets (e.g. the Freiburg EEG database), but to any dataset satisfying a simple set of criteria (e.g. recordings from any EEG laboratory).

  The current manuscript appears to attempt a little bit of both: It offers a group of preprocessed and semi-processed datasets from the Freiburg EEG database by using the Training Builder (TrB) toolbox. It does not present or describe a novel database, nor does it appear to allow a client to use TrB in order to perform a series of analyses on raw EEG data of his/her choice either locally or after being uploaded to the server.

   If the latter statement is incorrect and TrB can indeed be used on raw EEG data either locally or after being uploaded to the server, this should be clearly stated and guidelines should be provided.

  If TrB cannot be applied to raw EEG data by anyone except its designers at the server level, can it be modified so that a client could easily apply it to raw EEG data locally?  Is there a plan for such a modification?

  The manuscript might be useful to computer engineers and related specialists working on the development of software and hardware for seizure prediction and/or forecasting, especially if the toolbox can be applied to raw EEG data.

 The manuscript is, on the other hand, unlikely to be useful for professionals (including computer engineers, physicists or mathematicians) who work in epilepsy monitoring units and need a ready-to-use tool for seizure prediction/forecasting or for presurgical Epileptogenic Zone identification.

 

 

• Availability of the raw Freiburg EEG database

The Freiburg EEG database page is:

https://epilepsy.uni-freiburg.de/freiburg-seizure-prediction-project/eeg-database

As of 10 April 2024, the page informs us that:

The EEG Database is discontinued and not further available for download since it is superseded by the new European Epilepsy Database.  (available for purchase)

 

  Is the manuscript affected by the change of availability of the Freiburg database from free to paid?

 

  Whether free or not, what, if anything, distinguishes the Freiburg EEG database from other EEG databases of patients with epilepsy?

  In any case, the TrB tool (with or without freely available code) would be much more useful if it could be applied to any raw EEG dataset.

 

• Sampling rate

  As stated in lines 91-92:

 The temporal resolution of the database is very high, with recordings sampled at 256 Hz.

The Epileptogenic Zone (EZ) is usually characterized by interictal and ictal activity in the High Frequency Oscillation (HFO) range, i.e. from approximately 100 Hz to 400 Hz. In order to localize the EZ, HFOs are routinely sought in intracranial recordings and are increasingly recognized, although with technical difficulties, in extracranial recordings as well. Reliable analysis in the frequency domain can be obtained up to frequencies of approximately one third of the sampling rate. With fs = 256 Hz, this is up to about 85 Hz, borderline for high gamma frequencies and clearly inadequate for HFOs.

  Therefore, the Freiburg database, with fs=256 Hz, does not appear to be particularly suitable for research and development on EZ localization. Its suitability for research on seizure prediction / forecasting is also limited to some extent due to its relatively low sampling rate.

 

  Can TrB analyze frequencies higher than the high gamma range, assuming that data with higher sampling rate is provided?
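The sampling-rate arithmetic in this comment can be checked with a short sketch. The one-third rule of thumb and the 100 Hz to 400 Hz HFO range are taken from the comment itself; the function and variable names are illustrative assumptions and are not part of TrB.

```python
def nyquist_frequency(fs_hz: float) -> float:
    """Theoretical upper limit of representable frequencies (fs / 2)."""
    return fs_hz / 2.0


def max_reliable_frequency(fs_hz: float, fraction: float = 1.0 / 3.0) -> float:
    """Upper limit under the rule of thumb quoted above (about fs / 3)."""
    return fs_hz * fraction


fs = 256.0                      # sampling rate reported for the Freiburg database
hfo_band = (100.0, 400.0)       # approximate HFO range from the comment

nyquist = nyquist_frequency(fs)             # 128.0 Hz
reliable = max_reliable_frequency(fs)       # about 85.3 Hz
covers_hfo = reliable >= hfo_band[1]        # False: HFOs are out of reach

# Sampling rate that the same rule of thumb would require for the full HFO band
min_fs_for_hfo = hfo_band[1] * 3.0          # 1200.0 Hz

print(f"Nyquist: {nyquist:.1f} Hz, reliable limit: {reliable:.1f} Hz")
print(f"Covers HFO band: {covers_hfo}, required fs: {min_fs_for_hfo:.0f} Hz")
```

Under these assumptions, even the lower edge of the HFO range (100 Hz) lies above the reliable limit of about 85 Hz, which is the point the comment makes.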

Author Response

Dear Reviewer 1,

Thank you very much indeed for your Comments and Suggestions. They have been highly beneficial as they have provided us with the opportunity to delve deeper and enhance our manuscript.

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Dear authors.

I suggest that you provide guidelines for defining the duration of the pre-ictal, post-ictal and inter-ictal periods in Section 1.1. Also give approximate guidelines for the selection of windowing time parameters L and S. Will you provide an open implementation of your Training Builder (TrB) tool? Finally, the link https://doi.org/10.5281/zenodo.10808054 doesn't seem to work. 
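To make the question about L and S concrete, the following minimal sketch shows how two such parameters typically determine the segmentation of a recording. It assumes that L denotes the window length and S the shift between consecutive windows, both in seconds; the manuscript's definitions may differ, and the function name is hypothetical rather than part of TrB.

```python
def sliding_windows(n_samples: int, fs_hz: float, L_sec: float, S_sec: float):
    """Return (start, end) sample indices of windows of length L shifted by S.

    Assumption: L is the window length and S the step, both in seconds.
    Windows that would run past the end of the recording are dropped.
    """
    win = int(round(L_sec * fs_hz))
    step = int(round(S_sec * fs_hz))
    if win <= 0 or step <= 0:
        raise ValueError("L and S must correspond to at least one sample")
    return [(start, start + win)
            for start in range(0, n_samples - win + 1, step)]


# Example: 10 s of signal at 256 Hz, 5 s windows shifted by 1 s
windows = sliding_windows(n_samples=2560, fs_hz=256.0, L_sec=5.0, S_sec=1.0)
print(len(windows), windows[0], windows[-1])
```

With S smaller than L the windows overlap, so the choice of the two values trades temporal resolution against the number (and redundancy) of feature rows produced per recording.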

Author Response

Dear Reviewer 2,

Thank you very much indeed for your Comments and Suggestions. They have been highly beneficial as they have provided us with the opportunity to delve deeper and enhance our manuscript.

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This manuscript aims to introduce epilepsy EEG data that the authors have made publicly available. The data result from preprocessing and feature extraction performed on 20 subjects from another freely available dataset, the Freiburg Seizure Prediction Database (FSPD).  I should first mention that I was not able to access the database of this paper, despite several attempts. Neither logging in to "Zenodo" nor the link provided in the manuscript worked (https://doi.org/10.5281/zenodo.10808054). When clicking on the link in different browsers, this error shows up: "This DOI cannot be found in the DOI System." I am not sure whether something on my system is preventing access. If not, the authors should resolve this error by inserting a working link in the manuscript.

The authors extracted multiple features from the time series of the FSPD: briefly, 22 features for each electrode, each frequency band, and each time window, computed with a tool called “TrB”. The goal is to provide good training datasets for machine learning studies in epilepsy, with samples labeled as pre-ictal, ictal, post-ictal, and inter-ictal. These features were already tested for automatic detection of the labels in a previous study (Reference # 3), which reached great performance using TrB on just one subject of the FSPD. The data spare researchers from preprocessing the long FSPD recordings and provide features for short segments of EEG time series in different bands and electrodes.

The output files that TrB produces from the FSPD data can be used in other software, such as MATLAB, and are useful for machine learning approaches in epilepsy studies.

Here are a couple of concerns that the authors are expected to address:

- It seems that TrB has different options for filtering EEG artifacts. It is unclear how TrB removes artifacts, as the filtering embedded in TrB is not, on its own, enough to remove the artifacts normally removed using ICA/PCA-based methods.

- The method used to compute the features was already tested (Zazzaro et al., Reference # 3), but Zazzaro’s study does not provide the mathematics of the features (its Reference # 20 is not available, so the algorithms cannot be checked). Since the aim is to provide training data that might be used in other software such as MATLAB, one needs to know the exact formulas of the features in order to rule out subtle differences in calculation between software. However, the manuscript mentions that TrB has been developed to be as extensible as possible, so that it can run feature calculation algorithms developed in different programming languages, and this may remove this concern in the future.
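As an illustration of the kind of subtle difference meant here, the sketch below computes the variance of one short segment under two common conventions. It is not taken from TrB, whose feature formulas are not given in this record; it only shows why an exact formula matters. For reference, MATLAB's `var` normalizes by N-1 by default, whereas NumPy's `np.var` normalizes by N by default.

```python
import statistics

# A toy four-sample segment (illustrative values, not EEG data)
segment = [1.0, 2.0, 3.0, 4.0]

# Population variance: sum of squared deviations divided by N
population_variance = statistics.pvariance(segment)   # 1.25

# Sample variance: sum of squared deviations divided by N - 1
sample_variance = statistics.variance(segment)        # 5/3, about 1.667

relative_gap = (sample_variance - population_variance) / sample_variance
print(population_variance, sample_variance, relative_gap)
```

The gap shrinks as the window grows (it equals 1/N relative to the sample variance), but for short windows it is large enough to make features computed in two tools disagree unless the normalization is stated.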

 

Author Response

Dear Reviewer 3,

Thank you very much indeed for your Comments and Suggestions. They have been highly beneficial as they have provided us with the opportunity to delve deeper and enhance our manuscript.

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

 

    Important points have been clarified in the response / cover letter. Nevertheless, in the Abstract and Summary, the Objectives and Conclusions are still somewhat confusingly overstated.

    Modifying at least 2 important sentences is necessary to restore accuracy and clear the confusion about the (in reality limited) objectives and the (also limited) targeted audience of the manuscript:

 

 

(1) Abstract: Conclusions

    “The 20 datasets can significantly enhance the training of accurate data-driven models for epilepsy analysis, providing a detailed view of seizure dynamics and promising improvements in epilepsy management and treatment strategies. This highlights the importance of such datasets in advancing diagnostic and therapeutic methods.”

    It cannot be convincingly argued that these datasets with preprocessing and feature extraction by TrB “promis[e] improvements in epilepsy management and treatment strategies.” What they do is “provide a foundational corpus for analyzing seizures and training analytical models for seizure detection, prediction and clustering”, as stated in Summary.

    It is also totally unfounded to claim that “This highlights the importance of such datasets in advancing diagnostic and therapeutic methods.”

    The importance of such datasets would be highlighted if they were not just derived but actually SUCCESSFULLY USED to demonstrate promise in seizure detection, prediction or clustering. Therefore, the concluding sentence should be removed.

    The lack of any demonstration of the usefulness of these datasets is probably the main limitation of the manuscript.

(2) Summary, lines 40-43.

    In response to the request for clarification of the paper’s objective, the authors have added the following in the revised version:

    The main objective of this paper is to describe 20 datasets derived from the EEG data of as many epileptic patients, which we are making available to encourage further research into epilepsy. These datasets provide a foundational corpus for analyzing seizures and training analytical models for seizure detection, prediction, clustering, and others topics.

    It would be more accurate and informative to leave out ambiguous generalities, such as “encourage further research into epilepsy” and “other topics”, focusing on the tangible objective, which might be described, for example, as follows:

  “The main objective of this paper is to provide a foundational corpus for analyzing seizures and training analytical models for seizure detection, prediction and clustering by describing the derivation (through signal filtering and feature extraction) of 20 datasets derived from EEG patients with focal epilepsy and making them available.”

 

    In conclusion, it should be admitted and made clear that the objectives of the manuscript are relatively limited and addressed to a specialized audience, i.e. computer engineers and related professionals interested in developing software for seizure detection, prediction and clustering. It is not addressed to epilepsy center engineers and physicists who need ready-to-use seizure detection software.

    This is clearly within the scope and, probably, adequate for the DATA journal, with the caveat of being of low interest to readers except for a specialized audience.

Author Response

Dear Reviewer 1,

Thank you very much indeed for your additional comments and suggestions. They have been highly beneficial, particularly in the sections discussing objectives and conclusions.

We completely agree with the considerations and comments provided. We acknowledge that our initial draft contained sentences that lacked in-depth analysis and well-founded research. We are committed to making these changes, thanks to your suggestions, in our latest revised manuscript. Specifically, we have added your suggested edits in red and removed the conclusions from the abstract that lacked scientific foundation.

Reviewer 3 Report

Comments and Suggestions for Authors

No comments, thanks

Author Response

Thank you very much for your kind support.

 
