
The National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) Central Repository is hosting a data challenge platform aimed at augmenting and enhancing existing Repository data for future secondary research, including data-driven discovery by artificial intelligence researchers.

Join the Data-Centric movement!

All NIDDK-CR Challenges are listed below. Each challenge's status is indicated below its name.

About the Data Challenge Platform


The NIDDK Central Repository (NIDDK-CR) is implementing a data challenge platform to support a suite of data-centric challenge activities aimed at augmenting and enhancing existing Repository data for future uses, including artificial intelligence (AI) and machine learning (ML) applications. The NIDDK-CR program methodically strives to increase the utilization and impact of the resources under its guardianship. In fall 2021, NIDDK-CR initiated efforts to enhance data quality for AI-readiness, employing natural language processing in a small pilot project to tag study variables. In 2022, NIDDK-CR upgraded internal processes and adopted industry-leading data standards to align with FAIR and TRUST principles, improving technical data and metadata quality. Building on these accomplishments, and to further promote the visibility of its resources and increase their potential for reuse in innovative research, NIDDK-CR established a platform to host a series of data challenges. These challenges will leverage existing data in the Repository and provide tools that participants can use to develop solutions for increasing the AI-readiness of NIDDK data and to compete for prizes in a collaborative environment.

NIDDK-CR data challenge activities will build on one another to develop tools, approaches, models, and methods that increase the interoperability and usability of NIDDK data, including for AI/ML applications. The resulting enhanced NIDDK-CR practices and AI-ready datasets will then be used in subsequent data challenge activities focusing on hypothesis generation and new analysis methodology in NIDDK mission areas. Contestants of all experience levels in data science and analytics, from novice to expert, are encouraged to participate in NIDDK-CR data challenge activities. Submissions will be evaluated on performance and innovation within each participation tier. Tiered participation will allow NIDDK to build and retain a community of participants across the activities and to engage more junior researchers, encouraging the next generation of data scientists and secondary researchers.

Have Questions?


Register for an NIDDK-CR account to stay up to date on future challenge-related activities hosted by the Repository. The NIDDK-CR support staff are also available to answer questions at NIDDK-CRsupport@niddk.nih.gov.

Office Hours


NIDDK-CR, in partnership with Booz Allen Hamilton, hosts Office Hours, which feature talks from experts in data science and AI and provide a forum for data challengers to ask questions.

In association with the Data-Centric Challenge (December 2023-January 2024), NIDDK-CR hosted eight (8) office hour sessions on data science and AI-readiness, with topics ranging from preparing AI-ready datasets in R to performing principal component analysis. The sessions gave data challengers and the broader research community an opportunity to learn about tools, models, and approaches for AI-driven research.

Recordings and materials from past AI-Readiness Office Hours are available on the NIDDK-CR website.

Previous Challenges


NIDDK-CR Data Centric Challenge

DEC 1, 2023 - JAN 22, 2024 Closed

In September 2023, NIDDK-CR announced the Data Centric Challenge aimed at enhancing NIDDK datasets for future AI applications. Challenge participants were tasked with generating an AI-ready dataset that could be used for future data challenges and producing methods to enhance the AI-readiness of NIDDK data. Participation in the Challenge was tiered (i.e., beginner-level and intermediate/advanced-level) and utilized data from two longitudinal studies focused on type 1 diabetes (TEDDY and TrialNet).