This document reports on evidence and implementations of the Data Catalog vocabulary version 2 (DCAT2) Candidate Recommendation [[VOCAB-DCAT-2]]. In particular, it demonstrates that the revisions proposed in [[VOCAB-DCAT-2]] are already in use and are also implementable.
Introduction
DCAT is an RDF vocabulary, first published as a W3C Recommendation in 2014 [[VOCAB-DCAT]], designed to facilitate interoperability between data catalogs published on the Web. The Dataset Exchange Working Group (DXWG) has developed a new version of DCAT [[VOCAB-DCAT-2]], which extends [[VOCAB-DCAT]] to support further use cases and requirements [[DCAT-UCR]]. These include the possibility of cataloging other resources in addition to datasets, such as data services, and of describing relationships between datasets as well as between datasets and other cataloged resources.
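As a rough illustration of the additions described above, the following minimal sketch builds a handful of RDF triples for a catalog that lists two datasets and a data service, and prints them as N-Triples. The `https://example.org/` IRIs and the resource names are hypothetical; the class and property names (`dcat:Catalog`, `dcat:Dataset`, `dcat:DataService`, `dcat:dataset`, `dcat:service`, `dcat:servesDataset`, `dcat:endpointURL`) come from the DCAT namespace, and `dct:source` is a Dublin Core term used here as one possible way of relating two datasets. The sketch uses only the Python standard library and does not validate the output.

```python
# Minimal sketch: a DCAT 2 style catalog expressed as plain triples.
# All example.org IRIs below are hypothetical placeholders.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "https://example.org/"  # assumed base IRI, for illustration only


def build_catalog_triples():
    """Return (subject, predicate, object) IRI triples for a small catalog."""
    catalog = EX + "catalog"
    dataset = EX + "dataset/air-quality"
    derived = EX + "dataset/air-quality-summary"
    service = EX + "service/air-quality-api"
    return [
        (catalog, RDF_TYPE, DCAT + "Catalog"),
        (catalog, DCAT + "dataset", dataset),
        (catalog, DCAT + "dataset", derived),
        # A data service cataloged alongside the datasets.
        (catalog, DCAT + "service", service),
        (dataset, RDF_TYPE, DCAT + "Dataset"),
        (derived, RDF_TYPE, DCAT + "Dataset"),
        # A relationship between two datasets.
        (derived, DCT + "source", dataset),
        (service, RDF_TYPE, DCAT + "DataService"),
        # A relationship between a data service and a dataset.
        (service, DCAT + "servesDataset", dataset),
        (service, DCAT + "endpointURL", EX + "api/air-quality"),
    ]


def to_ntriples(triples):
    """Serialize IRI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(build_catalog_triples()))
```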
To show that the revisions proposed in [[VOCAB-DCAT-2]] are implementable as well as broadly adopted, we collected evidence in the form of vocabularies, data catalogs, data services, and datasets. The results are summarized in this report.
Methodology
We followed the steps described below to collect evidence for the revisions proposed in [[VOCAB-DCAT-2]]:
- Identify the actual list of revisions proposed in [[VOCAB-DCAT-2]] for which implementation evidence should be collected (see the summary of proposed revisions for more details).
- Review existing vocabularies based on [[VOCAB-DCAT]], and identify whether they already include the changes proposed in [[VOCAB-DCAT-2]], and/or whether their alignment with [[VOCAB-DCAT-2]] is planned. Planned implementations are included because these vocabularies are unlikely to adopt the revisions proposed in [[VOCAB-DCAT-2]] until it becomes a W3C Recommendation.
- Review existing guidelines and implementation reports of [[VOCAB-DCAT]], as well as of vocabularies based on it, to verify whether the changes proposed in [[VOCAB-DCAT-2]] are implemented in existing data catalogs, data services, and datasets.
- Develop an Implementation Report with details of the steps above.
As noted, to have a broader coverage of [[VOCAB-DCAT-2]] adoption we considered different types of evidence:
- DCAT-based vocabularies
- Data catalogs, data services, and datasets
Due to the large number of DCAT-based vocabularies and data catalogs supporting [[VOCAB-DCAT]], this report includes only a representative subset, providing nonetheless enough implementation evidence of the revisions proposed in [[VOCAB-DCAT-2]].
Meeting the exit criteria
As described in the DXWG charter, to advance to Proposed Recommendation, evidence will be adduced in order to demonstrate that each of the changes made to the original DCAT Recommendation [[VOCAB-DCAT]] has been published and consumed independently in catalogs and related systems at least once, although a higher number is expected for the majority of terms.
Evidence of implementation was gathered from existing data catalogs, data services, and datasets, which already implement the proposed changes, as well as from existing application profiles of [[VOCAB-DCAT]].
Summary of proposed revisions
The following table shows the set of changes proposed in [[VOCAB-DCAT-2]].
Implementation Evidence
The table below shows the evidence collected for each of the revisions proposed in [[VOCAB-DCAT-2]]. For each revision, existing implementations are flagged with "✔" and planned implementations are flagged with "(✔)". The vocabularies and data catalogs/data services/datasets corresponding to columns in this table are listed in subsequent sections.
The reported planned implementations reflect the following (not mutually exclusive) scenarios:
- The vocabulary or the metadata schema of the data catalog is being aligned with [[VOCAB-DCAT-2]] (or this alignment is planned).
- The vocabulary or data catalog supports equivalent or similar properties and classes that may eventually be replaced by those defined in [[VOCAB-DCAT-2]].
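The flagging convention described above can be tallied mechanically. The sketch below is one possible way to do so, not the procedure used for this report: it counts the "✔" and "(✔)" flags in a row of the evidence table and derives a status for the revision. The function names and the three example rows are hypothetical placeholders and do not reproduce actual results; only the column identifiers (V01, D01, D05) follow the ID scheme used in this report.

```python
# Minimal sketch of tallying evidence flags for one revision.
EXISTING = "✔"    # existing implementation
PLANNED = "(✔)"   # planned implementation


def summarize(row):
    """Count existing and planned flags in one row (column ID -> flag)."""
    flags = list(row.values())
    return {"existing": flags.count(EXISTING), "planned": flags.count(PLANNED)}


def status(row):
    """Classify a revision by the strongest kind of evidence it has."""
    counts = summarize(row)
    if counts["existing"] > 0:
        return "existing"
    if counts["planned"] > 0:
        return "planned only"
    return "no evidence"


# Hypothetical rows, for illustration only (not taken from the report).
example_rows = {
    "revision-A": {"V01": "✔", "D01": "✔", "D05": "(✔)"},
    "revision-B": {"V01": "(✔)"},
    "revision-C": {},
}

if __name__ == "__main__":
    for name, row in example_rows.items():
        print(name, status(row), summarize(row))
```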
DCAT-based vocabularies
The following table shows DCAT-based vocabularies providing implementation evidence of the revisions proposed in [[VOCAB-DCAT-2]].
ID | Application Profile | Creator | Country | Year |
---|---|---|---|---|
V01 | [[DCAT-AP]] | European Commission | Europe | 2018 |
V02 | [[GeoDCAT-AP]] | European Commission | Europe | 2016 |
V03 | [[StatDCAT-AP]] | European Commission | Europe | 2016 |
V04 | [[CiteDCAT-AP]] | European Commission | Europe | 2019 |
V05 | [[DCAT-AP-JRC]] | European Commission | Europe | 2019 |
The criteria used to select the relevant vocabularies are the following ones:
- They are based on [[VOCAB-DCAT-2]] or they extend [[VOCAB-DCAT]] with properties and classes included in [[VOCAB-DCAT-2]].
- Preferably, they are used across multiple data catalogs and/or domains/disciplines.
At the time this report was written, it was not possible to identify vocabularies based on [[VOCAB-DCAT-2]]. However, there are quite a few vocabularies extending [[VOCAB-DCAT]] (a non-exhaustive list is included in § 14 DCAT Profiles of [[VOCAB-DCAT-2]]). Among these, we have selected the most representative ones in terms of geographic and domain coverage as well as tooling support, which already include at least some of the revisions proposed in [[VOCAB-DCAT-2]], and/or whose alignment with [[VOCAB-DCAT-2]] is under way or planned.
The vocabularies listed earlier in this section have been chosen for the following reasons:
- [[DCAT-AP]] is a profile of [[VOCAB-DCAT]] used across Europe since 2014 as a metadata interchange format, primarily for catalogs of government data, and, to some extent, for scientific data. As such, it has a broad geographic coverage, and it is supported in data catalogs (e.g., the European Data Portal) and catalog platforms (e.g., CKAN).
- [[GeoDCAT-AP]] and [[StatDCAT-AP]] are domain-specific extensions of [[DCAT-AP]] for geospatial and statistical data, respectively, and they share the same geographic coverage as [[DCAT-AP]].
- [[CiteDCAT-AP]] and [[DCAT-AP-JRC]] are extensions of [[DCAT-AP]] specifically designed for multidisciplinary research data, and they are implemented in the corporate catalog of the European Commission's Joint Research Centre. Moreover, [[CiteDCAT-AP]] is supported in Zenodo, the research data catalog and repository most widely used in Europe.
[[DCAT-AP]] has also been used as a basis for the development of country-specific extensions (see [[DCAT-AP-EXT]]). Such extensions have not been included in this report, but they provide additional support to the implementation evidence for the revisions proposed in [[VOCAB-DCAT-2]] already included in [[DCAT-AP]].
Moreover, at the time this report was written, work was under way to release a new version of [[DCAT-AP]], aligned with [[VOCAB-DCAT-2]], which will eventually be reflected in the [[DCAT-AP]] extensions.
Data catalogs, data services, and datasets
The following table shows data catalogs, data services, and datasets providing implementation evidence of the revisions proposed in [[VOCAB-DCAT-2]].
ID | Service | Evidence | Category | Supported DCAT profile | Domain | Catalog platform?* |
---|---|---|---|---|---|---|
D01 | European Data Portal | [[DCAT-AP-USE]] includes statistics for the metadata elements used in the European Data Portal, along with the SPARQL queries used for them. | Data catalog | [[DCAT-AP]], [[GeoDCAT-AP]], [[StatDCAT-AP]] | Cross Domain | CKAN |
D02 | Zenodo | DCAT version of Record 3467639 | Data catalog, Catalog platform | [[CiteDCAT-AP]] | Cross Domain | Zenodo |
D03 | JRC Data Catalogue | [[DCAT-AP-JRC]] includes examples taken directly from the JRC Data Catalogue | Data catalog | [[DCAT-AP-JRC]], [[CiteDCAT-AP]] | Cross Domain | CKAN |
D04 | Katalog der Deutschen Nationalbibliothek | authorities.ttl | Data catalog | [[VOCAB-DCAT-2]] | Cross Domain | Proprietary |
D05 | LusTRE | LusTRE-DCAT2.ttl | Data service, Dataset | [[VOCAB-DCAT-2]] | Environment | Proprietary |
D06 | NERC Vocabulary Service | SPARQL endpoint | Dataset | [[VOCAB-DCAT-2]] | Geospatial | Proprietary |
D07 | G-NAF | Endpoint description | Data service, Dataset | [[VOCAB-DCAT-2]] | Geospatial | Proprietary |
D08 | Media Types Web Service | Endpoint description | Data service, Dataset | [[VOCAB-DCAT-2]] | Cross Domain | Proprietary |
The criteria used to select the relevant data catalogs are the following ones:
- They implement [[VOCAB-DCAT-2]] or one of the vocabularies listed in the DCAT-based vocabularies section.
- Preferably, they have a broad geographic and/or community coverage.
As explained in the DCAT-based vocabularies section, there are many data catalogs supporting [[DCAT-AP]] and its extensions. Among these, we have selected the European Data Portal, since it provides a single access point to data published in catalogs across Europe (at the time this report was written, the European Data Portal included nearly 1,000,000 metadata records), and Zenodo, which is the most used repository in Europe for scientific data (at the time this report was written, it included nearly 1,500,000 metadata records).
The rest of the selected data catalogs have been chosen because they support either [[VOCAB-DCAT-2]] or the other vocabularies listed in the DCAT-based vocabularies section.
As already mentioned in the DCAT-based vocabularies section, CKAN, a popular catalog platform on which some of the data catalogs listed earlier are based, supports [[DCAT-AP]] via a specific extension, and therefore it has not been included here.
Finally, among other reported implementations, it is worth mentioning the LinkedPipes DCAT-AP Viewer and the LinkedPipes DCAT-AP Forms, where support for [[VOCAB-DCAT-2]] was being implemented at the time this report was written.
Guidelines and reports
The following table lists guidelines and reports that demonstrate actual implementations of the revisions proposed in [[VOCAB-DCAT-2]]. These examples are drawn from the data catalogs and DCAT-based vocabularies listed in the previous sections.
ID | Guide | Creator | Country | Year |
---|---|---|---|---|
[[DCAT-AP-IG]] | DCAT-AP Implementation Guidelines | European Commission | Europe | 2016 |
[[DCAT-AP-USE]] | Report on DCAT-AP use | European Commission | Europe | 2017 |
[[DCAT-AP-EXT]] | Analysis of the DCAT-AP extensions | European Commission | Europe | 2017 |
General analysis
Based on the collected evidence, every revision proposed in [[VOCAB-DCAT-2]] has at least one implementation when both existing and planned implementations are considered, and the large majority have at least one existing implementation.
More precisely:
- 46 revisions out of 81 have at least 2 existing implementations.
- 22 revisions out of the remaining 35 have 1 existing implementation.
- 6 revisions out of the remaining 13 have at least 2 planned implementations.
- The remaining 7 revisions have 1 planned implementation.
The corresponding statistics are summarised in the accompanying figure.
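The breakdown above can be checked with a few lines of arithmetic. The following sketch uses only the counts stated in the list (81, 46, 22, 6, and 7) and confirms that they are mutually consistent: 68 revisions have at least one existing implementation, and the remaining 13 have planned implementations only. The variable names are illustrative.

```python
# Consistency check of the counts reported in the breakdown above.
TOTAL_REVISIONS = 81
AT_LEAST_TWO_EXISTING = 46
ONE_EXISTING = 22
AT_LEAST_TWO_PLANNED = 6
ONE_PLANNED = 7

# Revisions with at least one existing implementation.
with_existing = AT_LEAST_TWO_EXISTING + ONE_EXISTING
# Revisions covered by planned implementations only.
planned_only = AT_LEAST_TWO_PLANNED + ONE_PLANNED

# Each "remaining" figure in the list follows from the previous line.
assert TOTAL_REVISIONS - AT_LEAST_TWO_EXISTING == 35
assert 35 - ONE_EXISTING == 13
assert 13 - AT_LEAST_TWO_PLANNED == ONE_PLANNED
# The four groups together cover every revision.
assert with_existing + planned_only == TOTAL_REVISIONS

if __name__ == "__main__":
    print(f"existing: {with_existing} of {TOTAL_REVISIONS} "
          f"({100 * with_existing / TOTAL_REVISIONS:.1f}%)")
    print(f"planned only: {planned_only} of {TOTAL_REVISIONS} "
          f"({100 * planned_only / TOTAL_REVISIONS:.1f}%)")
```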
For the 68 revisions with existing implementations, the number of pieces of evidence ranges from 1 to 8, according to the distribution illustrated in the accompanying figure. Considering that this evidence has been collected over a representative subset of vocabularies, data catalogs, data services, and datasets, we can state that these revisions already have a high level of adoption. In particular, these implementations reflect revisions and extensions to [[VOCAB-DCAT]] which have been documented in the use cases included in [[DCAT-UCR]].
The 13 revisions with no existing implementations are listed in the following table. They all have at least one planned implementation in [[DCAT-AP]], which, at the time this report was written, was in the process of being aligned with [[VOCAB-DCAT-2]] (see the relevant issues in the [[DCAT-AP]] GitHub repository). As already explained in the DCAT-based vocabularies section, [[DCAT-AP]] has been used as a basis for country- and domain-specific extensions, which may in turn be aligned with [[VOCAB-DCAT-2]] as well. For these reasons, we can state that there is enough evidence that these revisions are implementable.
Acknowledgements
The editors gratefully acknowledge the contributions made to gathering evidence for [[VOCAB-DCAT-2]] and reviewing this report by all members of the working group, especially Annette Greiner.
The editors would also like to thank Jakub Klímek for the evidence he provided.