Athanasios Tsanas (‘Thanasis’)

 

 

This page offers specialised software that I use in my research, which may prove useful to researchers working in related areas. All functions are implemented in Matlab (occasionally using *.mex files) and have been tested with version 2010b on a 64-bit Windows machine (I have also tested most functions on a Linux machine). The code is heavily annotated to facilitate further experimentation, and each function includes references and an explanation of the key ideas. To use a toolbox, download the zipped file, extract its contents into a single folder, and add that folder to Matlab's path.

In all cases the source code is provided under the standard GPL license and is free for academic use. Please contact me for commercial use.

Copyright © A. Tsanas, 2014

Please cite my relevant publications if you make use of this code. I would be grateful to receive feedback regarding bugs, suggestions to improve the code, or additional functions/options to be included in forthcoming versions of the functions/toolboxes.

Information fusion using adaptive Kalman filtering

[MATLAB]

 

The concept of combining multiple experts or sensors is fascinating. One way to achieve this is the adaptive Kalman filter, which uses confidence metrics (also known as signal quality indices) to account for the variable confidence in the estimates of each expert (or sensor). This variability may be due to the particular characteristics of the measured quantity, or to inherent limitations of the expert (sensor) in providing accurate estimates at a given instant (this can also be assessed with respect to the estimates of the other experts). The function provided here is for estimating the fundamental frequency of sustained vowels, but the framework is generic, and the user could adapt the function with confidence metrics appropriate for a different application. More details can be found in my JASA2014 paper. Please include the following citation if you use it in your work:

- A. Tsanas, M. Zañartu, M.A. Little, C. Fox, L.O. Ramig, G.D. Clifford: “Robust fundamental frequency estimation in sustained vowels: detailed algorithmic comparisons and information fusion with adaptive Kalman filtering”, Journal of the Acoustical Society of America, Vol. 135, pp. 2885-2901, 2014
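To make the fusion idea above concrete, here is a minimal Python sketch (not the Matlab function offered on this page, and not the exact algorithm of the JASA2014 paper). It assumes a scalar random-walk state model, confidences in (0, 1], and a hypothetical mapping from confidence to measurement noise variance (`base_meas_var / confidence`); the function name and all parameters are illustrative.

```python
def fuse_estimates(estimates, confidences, process_var=1.0,
                   base_meas_var=1.0, initial_var=1e6):
    """Fuse per-frame estimates from several experts with a scalar Kalman filter.

    estimates[t][k]   : estimate of expert k at time step t
    confidences[t][k] : confidence in (0, 1] for that estimate (hypothetical
                        signal quality index; higher means more trusted)
    """
    eps = 1e-6
    state = estimates[0][0]   # arbitrary start; the large initial_var makes it irrelevant
    var = initial_var
    fused = []
    for z_t, c_t in zip(estimates, confidences):
        # Predict: random-walk model, the state carries over and uncertainty grows.
        var += process_var
        # Update: process the experts one at a time (sequential scalar updates,
        # which assumes their measurement errors are independent).
        for z, c in zip(z_t, c_t):
            # Assumed mapping: low confidence -> large measurement noise variance.
            meas_var = base_meas_var / max(c, eps)
            gain = var / (var + meas_var)
            state += gain * (z - state)
            var *= (1.0 - gain)
        fused.append(state)
    return fused


# Two experts: the first is trusted, the second is given very low confidence.
track = fuse_estimates([[100.0, 200.0]] * 5, [[1.0, 0.01]] * 5)
```

With these inputs the fused track stays close to the trusted expert; with equal confidences the filter settles near the average of the two.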

 

Stage-independent, single-lead EEG sleep spindle detector

[MATLAB]

 

This is a robust and efficient sleep spindle detector with minimal requirements: it is independent of sleep staging and processes a single EEG lead (most algorithms in the research literature require additional EEG leads, and frequently also the hypnogram). More details can be found in my Frontiers in Human Neuroscience 2015 paper. Please include the following citation if you use it in your work:

- A. Tsanas, G.D. Clifford: “Stage-independent, single lead EEG sleep spindle detection using the continuous wavelet transform and local weighted smoothing”, Frontiers in Human Neuroscience 9:181, 2015

 

Simple correlation-based feature selection: mRMRSpearman

[MATLAB]

 

This function is a simplified, computationally efficient, and robust approach to feature selection using the minimum Redundancy Maximum Relevance (mRMR) principle. The original paper by Peng et al. (2005) used the mutual information criterion to select features; its computation is extremely demanding if done via proper density estimation, and problematic if done via crude histograms as in the open source code provided by Peng. Instead, here I opt for the Spearman correlation coefficient, which allows for a fast and computationally inexpensive feature selection algorithm (hence I call this technique mRMRSpearman). More details can be found in my book chapter, a simple methodological guide for data analysis. Please include the following citation if you use it in your work:

- A. Tsanas, M.A. Little, P.E. McSharry: “A methodology for the analysis of medical data”, Chapter 7 in Handbook of Systems and Complexity in Health, pp. 113-125, Eds. J.P. Sturmberg, and C.M. Martin, Springer, 2013
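The mRMR principle with a Spearman criterion can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Matlab function itself: relevance is taken as the absolute Spearman correlation with the target, redundancy as the mean absolute Spearman correlation with the features already selected, and the two are combined by simple subtraction. The actual function may combine them differently; all names here are hypothetical.

```python
import math


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(a, b):
    """Spearman correlation: the Pearson correlation of the ranks."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def mrmr_spearman(columns, target, n_select):
    """Greedy selection; columns[j] holds the values of feature j."""
    relevance = [abs(spearman(col, target)) for col in columns]
    selected = [max(range(len(columns)), key=relevance.__getitem__)]
    while len(selected) < min(n_select, len(columns)):
        best, best_score = None, -float("inf")
        for j in range(len(columns)):
            if j in selected:
                continue
            redundancy = sum(abs(spearman(columns[j], columns[s]))
                             for s in selected) / len(selected)
            score = relevance[j] - redundancy   # assumed difference form
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected


a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 1, 2, 3, 0, 1, 2, 3]
target = [4 * u + v for u, v in zip(a, b)]
duplicate_of_a = [10 * u + 5 for u in a]
chosen = mrmr_spearman([a, duplicate_of_a, b], target, 2)
```

In this toy example the rescaled copy of the first feature is as relevant as the original but fully redundant, so the second pick is the weaker yet complementary feature `b`.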

 

Voting mechanism for feature selection

[MATLAB]

 

Applying most feature selection algorithms to perturbed versions of a dataset will likely result in different feature subsets being selected. This function takes as input a matrix with the feature subsets computed across L repetitions, where each repetition uses a perturbed (e.g. bootstrapped) version of the original design matrix, and votes for the most appropriate final feature ranking. More details can be found in my TNSRE2014 paper. Please include the following citation if you use it in your work:

- A. Tsanas, M.A. Little, C. Fox, L.O. Ramig: “Objective automatic assessment of rehabilitative speech treatment in Parkinson’s disease”, IEEE Transactions on Neural Systems and Rehabilitation Engineering, Vol. 22(1), pp. 181-190, January 2014
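One simple way to aggregate rankings across repetitions is a Borda-style count, sketched below in Python. This is a stand-in chosen for illustration and is not necessarily the voting rule used in the function or in the TNSRE2014 paper. It assumes each row lists feature indices in the order they were selected in one repetition; the function name is hypothetical.

```python
def vote_feature_ranking(subsets):
    """Aggregate per-repetition rankings with a Borda-style count.

    subsets[l] lists feature indices in selection order for repetition l.
    """
    n_positions = max(len(row) for row in subsets)
    points = {}
    for row in subsets:
        for position, feature in enumerate(row):
            # Earlier positions earn more points.
            points[feature] = points.get(feature, 0) + (n_positions - position)
    # Highest score first; ties broken by feature index for determinism.
    return sorted(points, key=lambda f: (-points[f], f))


final_ranking = vote_feature_ranking([[2, 0, 1], [2, 1, 0], [0, 2, 1]])
```

Here feature 2 is ranked first in two of the three repetitions and second in the other, so it leads the final ranking.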

 

Estimation of mutual information (vanilla KDE-based approach)

[MATLAB]

 

This function computes the mutual information, which can be thought of as a more general alternative to correlation coefficients for quantifying the association between two random variables (vectors). Most freely available implementations of mutual information estimation rely on sub-optimal intermediate steps, such as estimating probability densities using histograms; here I provide a simple proof-of-concept approach that estimates the densities with kernel density estimation before computing the mutual information. Note that there are more sophisticated and accurate approaches for computing the mutual information, but this (one might say naïve) implementation is simple, easy to understand, and computationally fairly efficient. The mutual information has no upper bound, which makes its direct interpretation difficult; for this reason I also provide a normalised version. The normalised mutual information ranges between 0 and 1, where 0 denotes no association between the two random variables (that is, they are independent) and 1 denotes perfect association (knowledge of one random variable allows perfect prediction of the other). The function was created with a standard goal of data analysis in mind: determining the univariate association of each feature (attribute) with the outcome (target) we aim to predict. Please include the following citation if you use it in your work:

- A. Tsanas: Accurate telemonitoring of Parkinson’s disease symptom severity using nonlinear speech signal processing and statistical machine learning, D.Phil. thesis, Oxford Centre for Industrial and Applied Mathematics, University of Oxford, UK, 2012

Alternatively, if you prefer a journal paper citation:

- A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig: “Nonlinear speech analysis algorithms mapped to a standard metric achieve clinically useful quantification of average Parkinson’s disease symptom severity”, Journal of the Royal Society Interface, Vol. 8, pp. 842-855, June 2011
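The KDE-based estimate described above can be illustrated with a short Python sketch. It is not the Matlab function: it assumes Gaussian kernels, a rule-of-thumb bandwidth (1.06 · std · n^(-1/5)), a product kernel for the joint density, and a resubstitution average of the log density ratio over the sample points. It returns the unnormalised mutual information in nats; the normalised version is not reproduced here because its exact definition is not given on this page. Inputs are assumed to be non-constant.

```python
import math
import random


def rule_of_thumb_bandwidth(values):
    """Gaussian-kernel bandwidth: 1.06 * std * n^(-1/5) (an assumed choice)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return 1.06 * std * n ** (-0.2)


def kde_mutual_information(x, y):
    """Estimate I(X;Y) in nats as the mean of log(p_xy / (p_x * p_y))."""
    n = len(x)
    hx, hy = rule_of_thumb_bandwidth(x), rule_of_thumb_bandwidth(y)
    total = 0.0
    for i in range(n):
        # Unnormalised Gaussian kernel weights; the normalising constants
        # of p_xy and p_x * p_y are identical and cancel in the ratio.
        kx = [math.exp(-0.5 * ((x[i] - x[j]) / hx) ** 2) for j in range(n)]
        ky = [math.exp(-0.5 * ((y[i] - y[j]) / hy) ** 2) for j in range(n)]
        p_x = sum(kx) / n
        p_y = sum(ky) / n
        p_xy = sum(u * v for u, v in zip(kx, ky)) / n
        total += math.log(p_xy / (p_x * p_y))
    # Clip small negative values caused by estimation noise.
    return max(total / n, 0.0)


rng = random.Random(0)
feature = [rng.gauss(0.0, 1.0) for _ in range(150)]
outcome = [v + 0.1 * rng.gauss(0.0, 1.0) for v in feature]
mi = kde_mutual_information(feature, outcome)
```

The cost grows quadratically with the number of samples, which is acceptable for the univariate feature-versus-outcome screening the text has in mind.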

 

Mapping UPDRS to Hoehn & Yahr

[MATLAB]

[SAS] (thanks to Andrew Kramer)

 

This function maps the Unified Parkinson’s Disease Rating Scale (UPDRS), a commonly used Parkinson’s disease symptom severity rating scale, to Hoehn and Yahr (H&Y) staging. I am grateful to Dr. A. Kramer for developing the SAS code. More details can be found in my Parkinsonism and Related Disorders 2012 paper. Please include the following citation if you use it in your work:

- A. Tsanas, M.A. Little, P.E. McSharry, B.K. Scanlon, S. Papapetropoulos: “Statistical analysis and mapping of the Unified Parkinson’s Disease Rating Scale to Hoehn and Yahr staging”, Parkinsonism and Related Disorders, Vol. 18 (5), pp. 697-699, 2012

 

Time-series analysis: features using wavelet decomposition

[MATLAB]

 

 

This function aims to characterize time series (extracting a feature vector) using standard wavelet decomposition. It was originally proposed to analyze properties of voice signals, but the technique is generic and could be, in principle, applied to any time series. More details can be found in my Nonlinear Theory and its Applications 2010 paper. Please include the following citation if you use it in your work:

- A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig: “New nonlinear markers and insights into speech signal degradation for effective tracking of Parkinson’s disease symptom severity”, International Symposium on Nonlinear Theory and its Applications (NOLTA), pp. 457-460, Krakow, Poland, 5-8 September 2010 (won the student paper award)
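As a rough illustration of turning a wavelet decomposition into a feature vector, the Python sketch below uses the orthonormal Haar wavelet and takes the energy of the detail coefficients at each level, plus the energy of the final approximation. Both the wavelet and the choice of energy as the feature are assumptions made for brevity; the Matlab function and the NOLTA paper may use a different wavelet family and a richer set of features.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform (a trailing odd sample is dropped)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [s * (signal[i] + signal[i + 1]) for i in pairs]
    detail = [s * (signal[i] - signal[i + 1]) for i in pairs]
    return approx, detail


def wavelet_energy_features(signal, levels):
    """Return [detail energy at level 1, ..., level L, final approximation energy]."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


features = wavelet_energy_features([1, 2, 3, 4, 5, 6, 7, 8], levels=3)
```

Because the Haar transform used here is orthonormal, the features sum to the total energy of the signal whenever every level has an even number of samples, which gives a quick sanity check.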

 

Voice Analysis Toolbox (version 1.0)

[MATLAB]

[executable] (requires Windows)

 

This toolbox provides a number of speech signal processing algorithms for the objective characterization of voice, and in particular the assessment of voice disorders. These algorithms mainly quantify perturbations in amplitude (shimmer variants) and frequency (jitter variants), and increased noise (signal-to-noise measures). Note that the toolbox was developed for, and has only been validated with, the sustained vowel /a/. The algorithms may generalize to other sustained vowels, but they are definitely not appropriate for conversational speech. The toolbox was developed over a series of journal and conference studies; the most important are listed below. Please include the following citations if you use this toolbox in your work:

- A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig: “New nonlinear markers and insights into speech signal degradation for effective tracking of Parkinson’s disease symptom severity”, International Symposium on Nonlinear Theory and its Applications (NOLTA), pp. 457-460, Krakow, Poland, 5-8 September 2010

- A. Tsanas, M.A. Little, P.E. McSharry, L.O. Ramig: “Nonlinear speech analysis algorithms mapped to a standard metric achieve clinically useful quantification of average Parkinson’s disease symptom severity”, Journal of the Royal Society Interface, Vol. 8, pp. 842-855, June 2011

- A. Tsanas: Accurate telemonitoring of Parkinson’s disease symptom severity using nonlinear speech signal processing and statistical machine learning, D.Phil. thesis, Oxford Centre for Industrial and Applied Mathematics, University of Oxford, UK, 2012
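For readers unfamiliar with jitter and shimmer, the simplest "local" variants can be written as one formula: the mean absolute difference between consecutive cycles, divided by the mean value, as a percentage. The Python sketch below applies it to pitch periods (jitter) and to cycle peak amplitudes (shimmer). It illustrates only this textbook definition, assumes the cycle periods and amplitudes have already been extracted, and does not reproduce the variants implemented in the toolbox; the function name is hypothetical.

```python
def local_perturbation_percent(values):
    """Mean absolute cycle-to-cycle difference relative to the mean, in percent."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Hypothetical per-cycle measurements from a sustained vowel.
periods_s = [0.0100, 0.0110, 0.0100, 0.0110]   # pitch periods in seconds
amplitudes = [0.80, 0.82, 0.79, 0.81]          # peak amplitude per cycle

jitter_percent = local_perturbation_percent(periods_s)
shimmer_percent = local_perturbation_percent(amplitudes)
```

A perfectly periodic signal with constant amplitude gives zero for both measures.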

 

Statistical Machine Learning Toolbox (version 1.0)

[MATLAB]

 

This toolbox contains functions which can be used in a variety of data analysis applications. The toolbox tackles problems in the general field of statistical machine learning, including functions for data visualisation, feature selection, regression and classification, using a wide range of available and refined methods. It is fairly basic for now, but I intend to keep updating it with additional functions for methodological concepts. Please include the following citation if you use it in your work, or look at specific functions within the toolbox for appropriate referencing and citing purposes:

- A. Tsanas: Accurate telemonitoring of Parkinson’s disease symptom severity using nonlinear speech signal processing and statistical machine learning, D.Phil. thesis, Oxford Centre for Industrial and Applied Mathematics, University of Oxford, UK, 2012

 

© Athanasios Tsanas

Last updated: 23 November 2016