Harvest from HealthDCAT-AP endpoints

Connect to European data portals using the HealthDCAT-AP standard to import standardised health-related datasets.

What is HealthDCAT-AP?

HealthDCAT-AP is a European standard for describing health data catalogues, extending the DCAT-AP specification with health-specific metadata. The GDI platform is HealthDCAT-AP compliant, ensuring standardised metadata across the European health data ecosystem.

For complete technical specifications, see the HealthDCAT-AP documentation.

Configure the HealthDCAT-AP source

Prerequisites

The dcat_rdf_harvester plugin must be listed in the CKAN plugins (the ckan.plugins setting) for this harvester to be available. Additionally, ensure the CKAN configuration file (ckan.ini) contains:

  • ckanext.dcat.rdf.profiles = (space-separated list of profiles)
  • ckanext.dcat.compatibility_mode = true
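
The prerequisites above can be checked before creating a harvest source. The following is a minimal sketch, not part of the GDI platform: it assumes the settings live in the [app:main] section of ckan.ini, and the sample file content (including the profile names) is a hypothetical placeholder.

```python
import configparser

# Hypothetical ckan.ini excerpt; the profile names are placeholders, not recommendations.
SAMPLE_INI = """
[app:main]
ckan.plugins = harvest dcat dcat_rdf_harvester
ckanext.dcat.rdf.profiles = euro_dcat_ap_2 fairdatapoint_dcat_ap
ckanext.dcat.compatibility_mode = true
"""


def check_prerequisites(ini_text, section="app:main"):
    """Return a list of problems found; an empty list means all checks passed."""
    # Interpolation is disabled because real ckan.ini files contain '%' placeholders.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if not parser.has_section(section):
        return [f"missing [{section}] section"]

    settings = parser[section]
    problems = []
    if "dcat_rdf_harvester" not in settings.get("ckan.plugins", "").split():
        problems.append("dcat_rdf_harvester is not listed in ckan.plugins")
    if not settings.get("ckanext.dcat.rdf.profiles", "").split():
        problems.append("ckanext.dcat.rdf.profiles is empty or missing")
    if settings.get("ckanext.dcat.compatibility_mode", "").strip().lower() != "true":
        problems.append("ckanext.dcat.compatibility_mode is not set to true")
    return problems


print(check_prerequisites(SAMPLE_INI))  # prints []
```

To check a real installation, read the file content from your own ckan.ini path and pass it to check_prerequisites.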

When adding a harvest source, use these settings for HealthDCAT-AP endpoints:

Field          Description
URL            Enter the HealthDCAT-AP endpoint URL. Examples:
               https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.xml
               https://raw.githubusercontent.com/Health-RI/starter-kit-info/main/example.ttl
Source type    Select Generic DCAT RDF Harvester from the dropdown.
Configuration  Enter: { "profile": "fairdatapoint_dcat_ap", "rdf_format": "text/turtle", "force_all": "true" }
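
If you prefer to prepare the same settings programmatically, the sketch below assembles them into a single payload. It only builds and prints the data and sends nothing. The payload keys (name, url, source_type, config), the source type identifier "dcat_rdf", and the source name "healthdcat-example" are assumptions for illustration; check them against your CKAN harvest extension before use.

```python
import json


def build_harvest_source(name, url, rdf_format="text/turtle"):
    """Build a harvest source payload mirroring the form fields above."""
    config = {
        "profile": "fairdatapoint_dcat_ap",
        "rdf_format": rdf_format,
        "force_all": "true",
    }
    return {
        "name": name,                  # hypothetical source identifier
        "url": url,                    # the HealthDCAT-AP endpoint URL
        "source_type": "dcat_rdf",     # assumed ID of the Generic DCAT RDF Harvester
        "config": json.dumps(config),  # the Configuration field, serialised as JSON text
    }


payload = build_harvest_source(
    "healthdcat-example",
    "https://raw.githubusercontent.com/Health-RI/starter-kit-info/main/example.ttl",
)
print(json.dumps(payload, indent=2))
```

Serialising the configuration with json.dumps guarantees valid JSON, which avoids quoting mistakes that are easy to make when typing the Configuration field by hand.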

Note: Set rdf_format to match your file format:

  • text/turtle for .ttl files
  • application/rdf+xml for .rdf/.xml files

Troubleshooting MIME types

The harvester relies on MIME types to identify the RDF serialisation of a source:

  • For Turtle files (.ttl): text/turtle
  • For RDF/XML files (.rdf, .xml): application/rdf+xml

If you're experiencing harvesting issues, verify the rdf_format in your configuration matches your file type.
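
This check can be automated with a small helper. The sketch below compares the rdf_format in a configuration against the file extension of the source URL, using the extension-to-MIME-type mapping listed above. The function names are hypothetical, and the check looks only at the URL, not at what the server actually returns.

```python
import json
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Mapping taken from the list above.
EXPECTED_FORMATS = {
    ".ttl": "text/turtle",
    ".rdf": "application/rdf+xml",
    ".xml": "application/rdf+xml",
}


def expected_rdf_format(url):
    """Return the MIME type implied by the URL's file extension, or None if unknown."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return EXPECTED_FORMATS.get(suffix)


def check_rdf_format(url, config_json):
    """Compare the configured rdf_format with the one implied by the URL."""
    configured = json.loads(config_json).get("rdf_format")
    expected = expected_rdf_format(url)
    if expected is None:
        return "unknown extension: cannot infer the expected rdf_format"
    if configured != expected:
        return f"mismatch: configured {configured!r}, expected {expected!r}"
    return "ok"


CONFIG = '{ "profile": "fairdatapoint_dcat_ap", "rdf_format": "text/turtle", "force_all": "true" }'

print(check_rdf_format(
    "https://raw.githubusercontent.com/Health-RI/starter-kit-info/main/example.ttl",
    CONFIG,
))  # prints ok
print(check_rdf_format(
    "https://opendata.swiss/en/dataset/verbreitung-der-steinbockkolonien.xml",
    CONFIG,
))  # reports a mismatch, because .xml implies application/rdf+xml
```

The second call shows the typical failure: the example configuration uses text/turtle, so it needs rdf_format changed to application/rdf+xml before it is used with an .xml endpoint.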

Next steps