Challenge Description and Goal:

Growing evidence on the pathogenesis of COVID-19 points to impairment of the respiratory system. In this light, it is natural to ask: can sound samples serve as acoustic biomarkers of COVID-19? If so, acoustics-based COVID-19 diagnosis could provide a fast, contactless, and inexpensive testing scheme, with the potential to supplement existing molecular testing methods such as RT-PCR and RAT. This Challenge explores ideas for answering that question.


Participants will receive the dataset after filling out the registration form available here and sending the signed Terms and Conditions form to

The data is provided in two sets. The first set is a development dataset allowing participants to design and train classifier models. The second set is a blind test set.

Development dataset description

  1. The development dataset includes audio samples from 965 subjects. Each subject contributes three audio files: breathing, cough, and speech samples.
  2. Subject distribution:
    • Age: Subjects are 15-90 years old, with the majority aged 15-40 years.
    • Sex: Of the 965 subjects, 242 are female and the remaining 723 are male.
    • Health status: 172 subjects are COVID-19 positive (asymptomatic, mild, or moderate); 793 subjects are COVID-19 negative and comprise a mix of individuals who are completely healthy, have respiratory ailments (such as tuberculosis, pneumonia, and chronic lung disease), or have COVID-19-like symptoms (cough, fever, etc.).
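
Because only 172 of the 965 subjects are COVID-19 positive, a random hold-out split can easily end up with a skewed class ratio. The following is a minimal sketch of a stratified split that keeps the positive/negative ratio roughly equal in both parts. The function name `stratified_split`, the label encoding (1 = positive, 0 = negative), and the 20% validation fraction are assumptions made for illustration; they are not part of the Challenge materials, and any validation lists supplied by the organizers should take precedence.

```python
import random


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Return (train_idx, val_idx) with the class ratio preserved in each part.

    labels: sequence of class labels, one per subject (assumed 1 = positive).
    """
    rng = random.Random(seed)

    # Group subject indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Take the same fraction from every class.
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Hypothetical label list matching the counts stated above:
# 172 positive and 793 negative subjects.
labels = [1] * 172 + [0] * 793
train_idx, val_idx = stratified_split(labels)
n_val_pos = sum(labels[i] for i in val_idx)
print(len(train_idx), len(val_idx), n_val_pos)
```

Splitting by subject rather than by audio file keeps the breathing, cough, and speech recordings of one person on the same side of the split.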


A blind test set will be provided for model evaluation. Participants submit their model results on the blind test set and the validation lists to a leaderboard interface, which also shows the performance of other teams on the same dataset.

Evaluation criteria

  1. The performance of each model will be assessed by a committee of judges. The winners will be decided based on:
    • ROC-AUC obtained on the blind test set (higher is better).
    • Interpretability of the developed models.
  2. The results and code of the classifier model must be posted on GitHub to be considered in the competition.
  3. Each competitor or group should submit a paper of up to 4 pages, following the standard IEEE format for conference paper submissions. The paper should:
    • summarize the method and results, including the interpretability approach applied and the resulting interpretations of the model’s results.
    • provide the link to the GitHub post.
    • contain the proper citation / acknowledgement for the data use.
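
For teams that want to check the ROC-AUC criterion locally before submitting, the metric can be sketched in plain Python through its rank-sum (Mann-Whitney U) formulation: ROC-AUC equals the probability that a randomly chosen positive subject receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the label encoding (1 = COVID-19 positive, 0 = negative) are assumptions made for illustration; the leaderboard's own scoring code is not described here and may differ in detail.

```python
def roc_auc(labels, scores):
    """ROC-AUC computed from ranks (Mann-Whitney U identity).

    labels: sequence of 0/1 ground-truth labels (assumed 1 = positive).
    scores: sequence of model scores, higher meaning more likely positive.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    # Assign 1-based ranks, giving tied scores their average rank.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")

    # U statistic of the positive class, normalized by the number of pairs.
    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


# Toy example: of the four positive/negative pairs, three are ordered correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)  # 0.75
```

Because the metric depends only on the ordering of the scores, it is unaffected by the choice of decision threshold, which suits a dataset with far more negative than positive subjects.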

References & Resources:

  1. C. Brown, J. Chauhan, A. Grammenos, J. Han, A. Hasthanasombat, D. Spathis, T. Xia, P. Cicuta, C. Mascolo, “Exploring automatic diagnosis of COVID-19 from crowdsourced respiratory sound data,” in: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Association for Computing Machinery, New York, NY, USA, 2020, pp. 3474–3484.
  2. A. Imran, I. Posokhova, H. N. Qureshi, U. Masood, M. S. Riaz, K. Ali, C. N. John, M. I. Hussain, M. Nabeel, “AI4COVID-19: AI enabled preliminary diagnosis for COVID-19 from cough samples via an app,” Informatics in Medicine Unlocked 20 (2020) 100378.