The healthcare and AI communities have witnessed a growing interest in the development of AI-assisted systems for automated diagnosis of Parkinson's Disease (PD), one of the most prevalent neurodegenerative disorders. However, progress in this area has been significantly impeded by the absence of a unified, publicly available benchmark, which prevents comprehensive evaluation of existing PD analysis methods and the development of advanced models. This work overcomes these challenges by introducing YouTubePD -- the first publicly available multimodal benchmark designed for PD analysis. We crowd-source existing videos featuring PD from YouTube, exploit multimodal information including in-the-wild videos, audio data, and facial landmarks across 200+ subject videos, and provide dense and diverse annotations from a clinical expert. Based on our benchmark, we propose three challenging and complementary tasks encompassing both discriminative and generative tasks, along with a comprehensive set of corresponding baselines. Experimental evaluation showcases the potential of modern deep learning and computer vision techniques, in particular the generalizability of models developed on YouTubePD to real-world clinical settings, while revealing their limitations. We hope our work paves the way for future research in this direction.
@inproceedings{YouTubePD2023,
author = {Zhou, Andy and Li, Samuel and Sriram, Pranav and Li, Xiang and Dong, Jiahua and Sharma, Ansh and Zhong, Yuanyi and Luo, Shirui and Jaromin, Maria and Kindratenko, Volodymyr and Heintz, George and Zallek, Christopher and Wang, Yu-Xiong},
title = {YouTubePD: A Multimodal Benchmark for Parkinson's Disease Analysis},
booktitle = {Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year = {2023},
}