Athanasios Tsanas (‘Thanasis’)

 

 

This is a collection of datasets that I have used in my research and that are freely available. Please cite the relevant papers if you use these datasets in your research. The datasets have also been deposited in the UCI Machine Learning Repository.

 

Parkinson’s telemonitoring

[MATLAB]

[Excel]

UCI

This study investigated the mapping of dysphonia measures (speech signal characteristics) to a standard clinical metric of Parkinson’s disease symptom severity. The dataset comprises 5875 samples and 16 features used to predict a real-valued response (regression problem). It can also be treated as a multi-class classification problem if the response is rounded to the nearest integer. More details can be found in my IEEE Transactions on Biomedical Engineering 2010 paper. Please include the following citation if you use it in your work:

- A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig: "Accurate telemonitoring of Parkinson's disease progression by non-invasive speech tests", IEEE Transactions on Biomedical Engineering, Vol. 57, pp. 884-893, April 2010
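The rounding mentioned above, which turns the regression response into integer class labels, could be sketched as follows. This is only an illustration: the function name and the example scores are hypothetical and are not taken from the dataset files.

```python
import numpy as np

def to_integer_classes(response):
    """Round a real-valued response to the nearest integer class label."""
    # Note: np.rint rounds exact halves to the nearest even integer (2.5 -> 2).
    return np.rint(np.asarray(response, dtype=float)).astype(int)

# Hypothetical severity scores, for illustration only (not real data).
scores = np.array([12.3, 12.7, 30.49, 7.0])
labels = to_integer_classes(scores)
print(labels.tolist())  # [12, 13, 30, 7]
```

Each distinct integer then acts as one class, so the number of classes depends on the range of the response in the data.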

 

Energy efficiency

[MATLAB]

[Excel]

UCI

 

This study investigated the assessment of heating load and cooling load (that is, energy efficiency) as a function of building parameters. The dataset comprises 768 samples and 8 features used to predict two real-valued responses (regression problem). It can also be treated as a multi-class classification problem if each response is rounded to the nearest integer. More details can be found in my Energy and Buildings 2012 paper. See also this Supplementary Material with additional information. Please include the following citation if you use it in your work:

- A. Tsanas, A. Xifara: "Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools", Energy and Buildings, Vol. 49, pp. 560-567, 2012
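As a rough sketch of what a regression problem with 8 features and two responses looks like in practice, the snippet below fits ordinary least squares to both responses at once. The arrays are synthetic placeholders with the shapes described above, not the real building measurements, and the linear model is a simple stand-in chosen for illustration; it is not claimed to be the method used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic placeholder data with the dataset's shape: 768 samples,
# 8 features, 2 responses. The values are NOT the real measurements.
n_samples, n_features, n_responses = 768, 8, 2
X = rng.normal(size=(n_samples, n_features))
true_W = rng.normal(size=(n_features, n_responses))
Y = X @ true_W + 0.1 * rng.normal(size=(n_samples, n_responses))

def fit_linear(X, Y):
    """Ordinary least squares with an intercept, for all response columns at once."""
    X1 = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X1, Y, rcond=None)
    return coef  # shape: (n_features + 1, n_responses)

def predict_linear(coef, X):
    """Apply the fitted coefficients (intercept in the first row) to new inputs."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ coef

coef = fit_linear(X, Y)
pred = predict_linear(coef, X)

# Mean absolute error per response column (one value per response).
mae = np.mean(np.abs(pred - Y), axis=0)
print(coef.shape, mae.shape)
```

With the real data, the two columns of `Y` would hold the heating load and the cooling load, and the error should be estimated on held-out samples rather than on the training set as done here.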

 

LSVT Voice rehabilitation

[MATLAB]

[Excel]

UCI

 

This study uses 309 speech signal processing algorithms to characterize 126 signals from 14 individuals collected during voice rehabilitation. The aim is to replicate the experts’ assessment denoting whether these voice signals are considered “acceptable” or “unacceptable” (binary classification problem). More details can be found in my IEEE Transactions on Neural Systems and Rehabilitation Engineering 2014 paper. Please include the following citation if you use it in your work:

- A. Tsanas, M.A. Little, C. Fox, L.O. Ramig: “Objective automatic assessment of rehabilitative speech treatment in Parkinson’s disease”, IEEE Transactions on Neural Systems and Rehabilitation Engineering, Vol. 22, pp. 181-190, January 2014
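Because the 126 signals come from only 14 individuals, one option when evaluating a classifier is to hold out all signals from one individual at a time, so that no individual appears in both the training and the test set. The sketch below illustrates this under stated assumptions: the subject labels are hypothetical (an even split of 9 signals per individual), and I am not stating here whether the distributed files include subject identifiers or whether this was the validation scheme used in the paper.

```python
import numpy as np

def leave_one_subject_out(subject_ids):
    """Yield (subject, train_idx, test_idx), holding out one subject per fold."""
    subject_ids = np.asarray(subject_ids)
    for subject in np.unique(subject_ids):
        test_mask = subject_ids == subject
        yield subject, np.flatnonzero(~test_mask), np.flatnonzero(test_mask)

# Hypothetical assignment: 126 signals spread evenly over 14 subjects (9 each).
# The real per-subject counts may differ.
subject_ids = np.repeat(np.arange(14), 9)

folds = list(leave_one_subject_out(subject_ids))
print(len(folds))  # 14
```

Each fold's index arrays can be used to slice the 126 x 309 feature matrix and the binary label vector before fitting and testing a classifier.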

 

Sustained vowels /a/ with F0 ground truth

[Zipped data]

 

 

The accurate estimation of the fundamental frequency (F0) is a well-known, challenging problem in the speech signal processing research community. Unfortunately, it is difficult to obtain objective ground truth values with contemporary approaches, which rely on electroglottography (EGG). Here, we used a sophisticated, state-of-the-art physiological model of voice production to construct sustained /a/ vowels for which the exact ground truth F0 values are known. We benchmarked 10 established F0 estimation algorithms and proposed a novel fusion approach to further improve F0 estimates. We would like to encourage researchers to use this database when evaluating F0 estimation algorithms, so that results in this application can be benchmarked. More details can be found in my Journal of the Acoustical Society of America 2014 paper. (Note that here I am providing 130 *.wav files, and the ground truth values are provided in an Excel spreadsheet.) Please include the following citation if you use it in your work:

- A. Tsanas, M. Zañartu, M.A. Little, C. Fox, L.O. Ramig, G.D. Clifford: “Robust fundamental frequency estimation in sustained vowels: detailed algorithmic comparisons and information fusion with adaptive Kalman filtering”, Journal of the Acoustical Society of America, Vol. 135, pp. 2885-2901, 2014
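To illustrate what evaluating an F0 estimator against a known ground truth involves, here is a minimal sketch. It uses a basic autocorrelation estimator as a stand-in (it is not claimed to be one of the 10 benchmarked algorithms, nor the fusion approach) and a synthetic two-harmonic test tone instead of the provided *.wav files; the function name, sampling rate, and search range are assumptions made for the example.

```python
import numpy as np

def estimate_f0_autocorr(signal, fs, f0_min=50.0, f0_max=500.0):
    """Estimate a single F0 value (Hz) from the autocorrelation peak in a lag range."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()
    # Full autocorrelation; keep the non-negative lags only.
    acf = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    lag_min = int(fs / f0_max)  # shortest period considered, in samples
    lag_max = int(fs / f0_min)  # longest period considered, in samples
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / best_lag

# Synthetic test tone with a known F0 (a stand-in for a signal with ground truth).
fs = 44100
true_f0 = 120.0
t = np.arange(int(0.1 * fs)) / fs
x = np.sin(2 * np.pi * true_f0 * t) + 0.5 * np.sin(2 * np.pi * 2 * true_f0 * t)

est_f0 = estimate_f0_autocorr(x, fs)
abs_error = abs(est_f0 - true_f0)
print(f"estimated F0: {est_f0:.2f} Hz, absolute error: {abs_error:.2f} Hz")
```

With the database, the same absolute error would be computed against the ground truth values from the spreadsheet, typically frame by frame and then summarized over all files.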

 

 

© Athanasios Tsanas

Last updated: 23 November 2016