Parkinson’s Biomarker Challenge

Title: Parkinson’s Biomarker [Archived Project]

July 6 2017 - October 2017


Recent advances in mobile health have demonstrated great potential to leverage sensor-based technologies for quantitative, remote monitoring of health and disease - particularly for diseases affecting motor function such as Parkinson’s disease. Such approaches have been rolled out using research-grade wearable sensors and, increasingly, through the use of smartphones and consumer wearables, such as smart watches and fitness trackers. These devices not only provide the ability to measure much more detailed disease phenotypes but also provide the ability to follow patients longitudinally with much higher frequency than is possible through clinical exams. However, the conversion of sensor-based data streams into digital biomarkers is complex and no methodological standards have yet evolved to guide this process.

Parkinson’s disease (PD) is a neurodegenerative disease that primarily affects the motor system but also exhibits other symptoms. Typical motor symptoms of the disease include tremors, slowness (bradykinesia), posture and walking perturbations, muscle rigidity and speech perturbations. In the clinic, symptoms are evaluated using physician observation and patient reports. Multiple approaches are under investigation for development of digital biomarkers in PD using accelerometer data from mobile sensor devices with the goal of improving monitoring of treatment efficacy and disease progression for use in clinical care and drug development.

The Parkinson’s Disease Digital Biomarker DREAM Challenge is a first-of-its-kind challenge, designed to benchmark methods for processing sensor data to develop digital signatures reflective of Parkinson’s disease. Participants will be provided with raw sensor (accelerometer, gyroscope, and magnetometer) time series data recorded during the performance of pre-specified motor tasks, and will be asked to extract data features that are predictive of PD pathology. In contrast to traditional DREAM challenges, this one will focus on feature extraction rather than predictive modeling, and submissions will be evaluated based on their ability to predict disease phenotype using an array of standard machine learning algorithms.

!Synapse:syn8717496/wiki/422884

Leader(s): Xinlin Song (slack: @spidermanxyz, email:

Participants: Arya Farahi (@aryaf), Junhao Wang (@junhao), Yi-Lun Wu (@Yi-Lun), Chun-Yu Hsiung (@Bear5566)

Link to the Code:

Link to the Dataset: Access to data requires registration. Speak to Xinlin if you need access.


Parkinson’s disease (PD) is a degenerative disorder of the central nervous system that mainly affects the motor system. Currently, there is no objective test to diagnose PD, and bedside examination by a neurologist remains the most important diagnostic tool. The examination assesses motor symptoms such as shaking, rigidity, slowness of movement and postural instability. However, these motor symptoms begin to appear only at a very late stage of the disease. Smartphones and smartwatches carry sensitive sensors (accelerometer, gyroscope and pedometer) that can track the user’s motion far more frequently than clinical examinations, and at much lower cost. Although the sensors record the movement information, the raw sensor data is hard to interpret and by itself gives limited help in diagnosing PD.

In the Parkinson’s Biomarker Challenge, we are tasked with extracting useful features from time series accelerometer and gyroscope data. The data for Challenge 1 consist of ~35,000 records collected from ~3,000 participants using a phone app in their daily lives. The final goal is to predict whether or not a participant has Parkinson’s disease. The data for Challenge 2 consist of records from ~20 patients performing different tasks (such as drinking water, folding towels, and assembling nuts and bolts) in research labs. The goal there is to predict the severity of limb action tremor.

The general method we used in both challenges is to generate multiple features from the time series sensor data and then perform feature selection to keep the top features. Finally, a machine learning model is built on those top features. The details of the methods we used can be found here:

Challenge 1: !Synapse:syn10894377/wiki/470036

Challenge 2: !Synapse:syn11317207/wiki/486357

Our best result was 4th place in Challenge 2.
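
To make the general approach described above more concrete (generate features from the time series, select the top features, then build a model on them), here is a minimal Python sketch. It is not the team's actual pipeline: the feature set, the correlation-based ranking, the nearest-centroid classifier, the synthetic "tremor" signal and all function names are assumptions chosen only for illustration, and it uses the standard library so that it runs without the challenge data.

```python
import math
import random
import statistics


def extract_features(signal):
    """Summary features for one 1-D sensor recording (hypothetical feature set)."""
    mean = statistics.fmean(signal)
    centered = [x - mean for x in signal]
    diffs = [abs(b - a) for a, b in zip(signal, signal[1:])]
    crossings = sum(1 for a, b in zip(centered, centered[1:]) if a * b < 0)
    return {
        "mean": mean,
        "std": statistics.pstdev(signal),
        "range": max(signal) - min(signal),
        "mean_abs_diff": statistics.fmean(diffs),
        "zero_cross_rate": crossings / (len(signal) - 1),
    }


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def select_top_features(rows, labels, k):
    """Rank features by |correlation with the label| and keep the best k."""
    names = list(rows[0])
    scores = {n: abs(pearson([r[n] for r in rows], labels)) for n in names}
    return sorted(names, key=scores.get, reverse=True)[:k]


def fit_centroids(rows, labels, names):
    """Nearest-centroid model: one mean vector per class plus per-feature scales."""
    scale = {n: statistics.pstdev([r[n] for r in rows]) or 1.0 for n in names}
    centroids = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = {n: statistics.fmean([m[n] for m in members]) for n in names}
    return centroids, scale


def predict(row, centroids, scale, names):
    """Return the class whose centroid is closest in scaled feature space."""
    def dist(c):
        return sum(((row[n] - c[n]) / scale[n]) ** 2 for n in names)
    return min(centroids, key=lambda label: dist(centroids[label]))


def synthetic_recording(rng, tremor, n=200, rate_hz=50.0):
    """Toy signal: slow movement, optional 5 Hz oscillation, Gaussian noise."""
    amp = 0.5 if tremor else 0.0
    return [
        math.sin(2 * math.pi * 0.5 * t / rate_hz)
        + amp * math.sin(2 * math.pi * 5.0 * t / rate_hz)
        + rng.gauss(0.0, 0.05)
        for t in range(n)
    ]


rng = random.Random(0)
labels = [i % 2 for i in range(60)]  # 1 = synthetic "tremor", 0 = none
rows = [extract_features(synthetic_recording(rng, bool(y))) for y in labels]

# Select features and fit on the first 40 recordings, evaluate on the last 20.
train_rows, test_rows = rows[:40], rows[40:]
train_y, test_y = labels[:40], labels[40:]
top = select_top_features(train_rows, train_y, k=2)
centroids, scale = fit_centroids(train_rows, train_y, top)
preds = [predict(r, centroids, scale, top) for r in test_rows]
accuracy = sum(p == y for p, y in zip(preds, test_y)) / len(test_y)
print("selected features:", top, "held-out accuracy:", accuracy)
```

Feature selection is done on the training split only, so the held-out accuracy is not inflated by information from the test recordings. Real accelerometer and gyroscope records would call for richer features (for example, spectral features) and a stronger model than this toy setup.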