Challenge Description and Goal:
The pathogenesis of COVID-19 increasingly points to impairment of the respiratory system. It is therefore natural to ask: can sound samples serve as acoustic biomarkers of COVID-19? If so, acoustics-based COVID-19 diagnosis could provide a fast, contactless, and inexpensive testing scheme, with the potential to supplement existing testing methods such as RT-PCR and rapid antigen tests (RAT). This Challenge explores ideas for answering that question.
The data is provided in two sets. The first set is a development dataset allowing participants to design and train classifier models. The second set is a blind test set.
Development dataset description:
- The development set includes audio samples from 965 subjects. Each subject contributes three audio files: breathing, cough, and speech.
- Subject distribution:
  - Age: Subjects range from 15 to 90 years old, with the majority between 15 and 40.
  - Sex: Of the 965 subjects, 242 are female and the remaining 723 are male.
  - Health status: 172 subjects are COVID-19 positive (asymptomatic, mild, or moderate); 793 subjects are COVID-19 negative and comprise a mix of individuals who are completely healthy, have respiratory ailments (such as tuberculosis, pneumonia, or chronic lung disease), or have COVID-19-like symptoms (cough, fever, etc.).
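The health-status counts above imply an imbalanced development set, which participants may want to account for when training. The following is a minimal sketch of one common option, inverse-frequency class weighting. It is not prescribed by the Challenge; the function name `balanced_class_weights` and the label strings are assumptions made for illustration.

```python
# Counts taken from the development dataset description above.
N_POSITIVE = 172
N_NEGATIVE = 793


def balanced_class_weights(counts):
    """Return inverse-frequency weights: n_total / (n_classes * n_class).

    Hypothetical helper for illustration; not part of the Challenge kit.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


counts = {"positive": N_POSITIVE, "negative": N_NEGATIVE}
prevalence = N_POSITIVE / (N_POSITIVE + N_NEGATIVE)
weights = balanced_class_weights(counts)

print(f"positive prevalence: {prevalence:.3f}")
for label, weight in weights.items():
    print(f"weight[{label}] = {weight:.3f}")
```

With these weights, each class contributes the same total weight to a weighted loss, so the roughly 18% of positive subjects are not swamped by the negative majority.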
A blind test set will be provided for model evaluation. You will submit your model's results on the blind set and the validation lists to a leaderboard interface, which also shows the performance of other teams on the same dataset.
- A committee of judges will assess the performance of each model. Winners will be decided based on:
  - ROC-AUC obtained on the blind test set (higher is better).
  - Interpretability of the developed models.
- The results and code for the classifier must be posted on GitHub to be considered in the competition.
- Each competitor or group should submit a paper of up to 4 pages, following the standard IEEE format for conference paper submissions. The paper should:
  - summarize the method and results, including the interpretability approach applied and the resulting interpretations of the model’s results.
  - provide the link to the GitHub repository.
  - contain the proper citation / acknowledgement for the data use.
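Since ROC-AUC on the blind test set is one of the judging criteria listed above, it can help to check the metric locally on held-out development data. The sketch below computes ROC-AUC from the Mann-Whitney U statistic using only the standard library. It is an illustration, not the official scoring script; the function name `roc_auc` and the label convention (1 = COVID-19 positive, 0 = negative) are assumptions.

```python
def roc_auc(labels, scores):
    """Compute ROC-AUC via the Mann-Whitney U statistic.

    labels: sequence of 0/1 (assumed: 1 = COVID-19 positive).
    scores: sequence of model scores, higher meaning more likely positive.
    Tied scores receive their average rank.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs at least one sample of each class")

    # Rank the scores in ascending order, averaging ranks within ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U counts the (positive, negative) pairs ranked in the correct order.
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = pos_rank_sum - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy example with made-up scores, not Challenge data.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(example_labels, example_scores))
```

ROC-AUC depends only on how the scores rank the samples, so it needs no decision threshold and is unaffected by the positive/negative ratio of the evaluation set.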
References & Resources:
- C. Brown, J. Chauhan, A. Grammenos, J. Han, A. Hasthanasombat, D. Spathis, T. Xia, P. Cicuta, C. Mascolo, “Exploring automatic diagnosis of COVID-19 from crowdsourced respiratory sound data,” in: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Association for Computing Machinery, New York, NY, USA, 2020, pp. 3474–3484.
- A. Imran, I. Posokhova, H. N. Qureshi, U. Masood, M. S. Riaz, K. Ali, C. N. John, M. I. Hussain, M. Nabeel, “AI4COVID-19: AI enabled preliminary diagnosis for COVID-19 from cough samples via an app,” Informatics in Medicine Unlocked 20 (2020) 100378.