Harvest from HealthDCAT-AP endpoints
Connect to European data portals using the HealthDCAT-AP standard to import standardised health-related datasets.
HealthDCAT-AP is a European standard for describing health data catalogues, extending the DCAT-AP specification with health-specific metadata. The GDI platform is HealthDCAT-AP compliant, ensuring standardised metadata across the European health data ecosystem.
For complete technical specifications, see the HealthDCAT-AP documentation.
Configure the HealthDCAT-AP source
The dcat_rdf_harvester extension must be added to the CKAN plugins for this harvester to be available. Additionally, ensure the CKAN.ini file contains:
ckanext.dcat.rdf.profiles = (space-separated list of profiles)
ckanext.dcat.compatibility_mode = true
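As a quick sanity check before adding a harvest source, the settings above can be verified with a short script. This is a minimal sketch, not part of CKAN or ckanext-dcat: the helper name `check_ckan_ini` is hypothetical, and it assumes the settings live in the usual `[app:main]` section with plugins listed under `ckan.plugins`.

```python
import configparser


def check_ckan_ini(ini_text: str) -> list[str]:
    """Return a list of problems found in the CKAN ini text (empty if none)."""
    # Interpolation is disabled because CKAN ini files often contain
    # values such as %(here)s that configparser would try to expand.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)

    # Assumption: the settings are in the [app:main] section.
    if not parser.has_section("app:main"):
        return ["missing [app:main] section"]

    main = parser["app:main"]
    problems = []

    # ckan.plugins is a whitespace-separated list, possibly spanning lines.
    plugins = main.get("ckan.plugins", "").split()
    if "dcat_rdf_harvester" not in plugins:
        problems.append("dcat_rdf_harvester is not listed in ckan.plugins")

    # At least one profile should be configured.
    if not main.get("ckanext.dcat.rdf.profiles", "").split():
        problems.append("ckanext.dcat.rdf.profiles is empty or missing")

    if main.get("ckanext.dcat.compatibility_mode", "").strip().lower() != "true":
        problems.append("ckanext.dcat.compatibility_mode is not set to true")

    return problems


if __name__ == "__main__":
    # Illustrative ini content only; the plugin list of a real deployment
    # will differ.
    sample = """
[app:main]
ckan.plugins = harvest dcat dcat_rdf_harvester
ckanext.dcat.rdf.profiles = fairdatapoint_dcat_ap
ckanext.dcat.compatibility_mode = true
"""
    print(check_ckan_ini(sample) or "configuration looks complete")
```

To check a real deployment, read the ini file from disk and pass its contents to the function.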
When adding a harvest source, use these settings for HealthDCAT-AP endpoints:
| Field | Description |
|---|---|
| URL | Enter the HealthDCAT-AP endpoint URL. Examples: • https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.xml • https://raw.githubusercontent.com/Health-RI/starter-kit-info/main/example.ttl |
| Source type | Select Generic DCAT RDF Harvester from the dropdown |
| Configuration | Enter: { "profile": "fairdatapoint_dcat_ap", "rdf_format": "text/turtle", "force_all": "true" } Note: set rdf_format to match your file format: • text/turtle for .ttl files • application/rdf+xml for .rdf/.xml files |
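Typing the Configuration JSON by hand makes it easy to introduce a quoting or format mistake. The sketch below generates it instead. The function name `make_source_config` is hypothetical, and the key names and values are copied from the table above rather than from a full list of the harvester's options.

```python
import json

# The two RDF serialisations mentioned in this guide.
SUPPORTED_FORMATS = {"text/turtle", "application/rdf+xml"}


def make_source_config(rdf_format: str, profile: str = "fairdatapoint_dcat_ap") -> str:
    """Return the JSON text to paste into the Configuration field."""
    if rdf_format not in SUPPORTED_FORMATS:
        raise ValueError(
            f"rdf_format must be one of {sorted(SUPPORTED_FORMATS)}, got {rdf_format!r}"
        )
    config = {
        "profile": profile,
        "rdf_format": rdf_format,
        # Kept as the string "true", exactly as shown in the table above.
        "force_all": "true",
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(make_source_config("text/turtle"))
```

Because the output comes from `json.dumps`, it is always valid JSON, which rules out one common cause of a rejected harvest source.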
To harvest data sources, the system looks at MIME types:
- For Turtle files (.ttl): text/turtle
- For RDF/XML files (.rdf): application/rdf+xml
If you're experiencing harvesting issues, verify the rdf_format in your configuration matches your file type.
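The check described above can be automated. This is a rough sketch that compares the file extension in the source URL with the rdf_format value in the configuration. The helper `check_rdf_format` is hypothetical, and the extension-to-MIME mapping only covers the formats named in this guide. Endpoints whose URL has no file extension cannot be checked this way.

```python
import json
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Extension-to-MIME mapping taken from this guide.
EXPECTED_FORMAT = {
    ".ttl": "text/turtle",
    ".rdf": "application/rdf+xml",
    ".xml": "application/rdf+xml",
}


def check_rdf_format(source_url: str, config_text: str) -> Optional[str]:
    """Return a warning if rdf_format does not match the URL's extension.

    Returns None when the configured format matches.
    """
    suffix = PurePosixPath(urlparse(source_url).path).suffix.lower()
    expected = EXPECTED_FORMAT.get(suffix)
    configured = json.loads(config_text).get("rdf_format")

    if expected is None:
        return f"cannot infer the format from extension {suffix!r}; check it manually"
    if configured != expected:
        return f"rdf_format is {configured!r} but {suffix} files need {expected!r}"
    return None


if __name__ == "__main__":
    url = "https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.xml"
    config = '{"profile": "fairdatapoint_dcat_ap", "rdf_format": "text/turtle", "force_all": "true"}'
    # Reports a mismatch, because an .xml file needs application/rdf+xml.
    print(check_rdf_format(url, config))
```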