Welcome to the GDI Data Catalogue
Welcome to the GDI Data Catalogue user guide!
The GDI Data Catalogue is the central management system for Europe's genomic dataset network. As part of the 1+ Million Genomes Initiative, this catalogue serves as the source for dataset information that users can access through the public-facing GDI Data Portal.
This guide is for catalogue managers—data stewards, repository administrators, and data managers—who are responsible for publishing, curating, and managing genomic datasets within the GDI ecosystem.
Learn more about Genomic Data Infrastructure (GDI) and its founding initiatives.
Manage datasets in two ways
- Harvest datasets automatically: Set up automated harvesting from external sources like FAIR Data Points and HealthDCAT-AP endpoints to synchronise datasets continuously.
- Manage datasets manually: Create and update individual datasets through the catalogue interface: add detailed metadata, upload resources, and ensure data quality standards are met.
Datasets flow to the Data Portal
All datasets you manage in this Data Catalogue are displayed in the public-facing GDI Data Portal, making them discoverable to researchers across Europe. Your work ensures high-quality genomic data is available for research and clinical purposes.