Standard / DCAT (and profiles) export (#7600)
* Standard / ISO / DCAT formatters

  Proposal to better organize the various flavors of DCAT. The target is to support:

  * DCAT
  * EU DCAT-AP
  * EU DCAT-AP mobility
  * EU GeoDCAT-AP

* Formatter / DCAT / Improvements.
* Formatter / DCAT / Core.
* Formatter / DCAT / Core / Associated.
* Formatter / DCAT / Core / Lineage.
* Formatter / DCAT / Core / Dataset / Temporal.
* Formatter / DCAT / Core / Dataset / Distributions.
* Formatter / DCAT / Core / Dataset / Services.
* Formatter / DCAT / Core / Multilingual support.
* Formatter / DCAT / RDF valid model.
* Formatter / DCAT / Tests.
* Formatter / EU-DCAT-AP / CatalogRecord.
* Formatter / DCAT / Test dep.
* Formatter / EU-DCAT-AP / Draft with test.
* Added DCAT-AP validation with SHACL rules in FormatterApiTest.
* Library / Update Jena for SHACL validation (issue on various commons-codec versions)
* Formatter / EU-DCAT-AP / Improve shacl validation status - license and status.
* Formatter / EU-DCAT-AP / Improve shacl validation status - bytesize.
* Formatter / EU-DCAT-AP / Improve shacl validation status - 1 format, 1 accrualPeriodicity.
* Formatter / EU-DCAT-AP / Improve shacl validation status - license, accessRights and rights.
* Formatter / EU-DCAT-AP / Improve shacl validation status - dcat:theme from EU vocabulary as skos:Concept.
* Formatter / EU-DCAT-AP / Improve shacl validation status - primaryTopic is required even if isPrimaryTopic is defined. Only one rights allowed.
* Formatter / EU-DCAT-AP / Improve shacl validation status - only one dct:type allowed.
* Formatter / EU-DCAT-AP / Add 2.1.1 base shacl validation level.
* Formatter / EU-DCAT-AP / Validation / Shacl 2.1.1 / Success.
* Formatter / EU-DCAT-AP-HVD / Base conversion with HVD category
* Formatter / EU-DCAT-AP-HVD / Add Shacl validation - which is failing for now.
* Formatter / EU-GEODCAT-AP / CatalogRecord.
* Formatter / EU-GEODCAT-AP / Shacl validation.
* Formatter / DCAT / Declare ISO conformity for the source and DCAT-AP conformity for the CatalogueRecord.
* Formatter / DCAT / ISO19139 bridges.
* Added Mobility DCAT - Mobility Theme concept.
* CSW / Improve outputschema configuration and add DCAT outputs.
* CSW / Improve outputschema configuration / Add config for ISO19110 and ISO19139 (which is the same as 115-3)
* DCAT / Mobility / Remove unused import.
* Added Mobility DCAT - support of georeferencingMethod, networkCoverage and transportMode.
* WIP mobility dcat - distribution
* Dependencies / Declare version in root pom.
* DCAT / ISO19139 / Fallback URI builder when no metadata linkage exists. To be discussed what is the best fallback. Former DCAT conversion was using a setting for resourcePrefix
* DCAT / Add formatters to test list.
* Standard / ISO / DCAT formatters / GeoDCAT-AP / Use dct:description instead of rdfs:label for potentially long texts SEMICeu/GeoDCAT-AP#108.
* Standard / ISO / DCAT formatters / EU-DCAT-AP / Update shacl validation rules to latest.
* Standard / ISO / DCAT formatters / Testing / Turn off shacl phase for now and pass test.
* Standard / ISO / DCAT formatters / EU-DCAT-AP / Use dct:provenance for lineage. DCAT core uses versionNotes.
* Standard / ISO / DCAT formatters / Distributions / Do not repeat elements from the resources by default eg. rights, license, resolution, applicable legislation.
* Standard / ISO / DCAT formatters / GeoDCAT-AP spatial representation technique TODO.
* Standard / ISO / DCAT formatters / GeoDCAT-AP spatialResolutionAsText TODO.
* Standard / ISO / DCAT formatters / adms:identifier / Schema Agency only if codespace is defined.
* Standard / ISO / DCAT formatters / Bytesize / Range changed to xsd:nonNegativeInteger SEMICeu/GeoDCAT-AP#91.
* Standard / ISO / DCAT formatters / GeoDCAT-AP element without mapping.
* Standard / ISO / DCAT formatters / HVD / Add applicable legislation.
* Standard / ISO / DCAT formatters / Assume individual name is not multilingual. Use first CharacterString or Anchor text element.
* Standard / ISO / DCAT formatters / Add references for org and individual.
* Standard / ISO / DCAT formatters / Cleanup. Remove validation mode which will be probably either shacl or schematron at some point.
* Standard / ISO / DCAT formatters / Refactor for easier object reference configuration. References are added to org, individual, keywords now.
* API / Formatter / If no output parameter is provided, try to infer content type from formatter id. eg. DCAT formatters are usually XML. Formatters with JSON in their name are JSON.
* Standard / ISO / DCAT formatters / A bit closer to what EU-DCAT-AP SHACL rules expect. Tested with https://www.itb.ec.europa.eu/shacl/dcat-ap/upload. Difficulties to reproduce SHACL errors similar to the online validator. Could be related to not using the same SHACL rules or a difference with Jena validator?
* Standard / ISO / DCAT formatters / Shacl rules are now in the same folder. Fix path in comments
* Documentation / DCAT
* Documentation / DCAT / Image
* Standard / ISO / DCAT formatters / HVD extend DCAT-AP. Fix missing DCAT-AP theme (based on mapping from INSPIRE themes and topic category).
* Test / EU-DCAT-AP / SHACL / Fix Jena issue. Related to SEMICeu/DCAT-AP#366.
* Standard / ISO / DCAT formatters / Fix range for MediaTypeOrExtent.
* Standard / ISO / DCAT formatters / SHACL / Rules are now working. Then we have to fix input document or decide to lose information on constraints.
* Standard / ISO / DCAT formatters / HVD / Fix test.
* Standard / ISO / DCAT formatters / Fix namespace of contactPoint.
* Standard / ISO / DCAT formatters / Better support multilingual records.
* Admin / CSW / Add more GET request examples.
* Standard / ISO / DCAT formatters / Fix namespace of contactPoint
* Standard / ISO / DCAT formatters / Better handle HVD concept by normalizing long labels.
* DCAT / Shacl rule update.
  From SEMICeu/DCAT-AP@591fae6
* Standards / ISO / Formatters / DCAT / Multilingual / Do not output empty text

  ```xml
  <dcat:keyword xml:lang="fre">Observation</dcat:keyword>
  <dcat:keyword xml:lang="eng">Observation</dcat:keyword>
  <dcat:keyword xml:lang="fre">Observation par point</dcat:keyword>
  <dcat:keyword xml:lang="eng"/>
  ```
* Standards / ISO / Formatters / DCAT / Distribution / Map taking into account protocol (and not only function).
* Standards / ISO / Formatters / DCAT / Period of time using beginPosition.
* Standards / ISO / Formatters / DCAT / Documentation and entry point for license mapping.
* Updated license mapping to map to EU licenses for dcat-ap profile (and keep original license in dcat-core). Mapping is done in the dcat-core profile (behaviour is triggered with a variable in the dcat-ap profile).
* Updated license mapping to map to EU licenses for dcat-ap profile (removed xsl messages)
* Standards / ISO / Formatters / DCAT / Update SEMICeu conversion - following GeoDCAT-AP 3 revision working group progress.
* Standards / ISO / Formatters / DCAT / Updates following GeoDCAT-AP working group meeting.
* Standards / ISO / Formatters / DCAT / Update SEMICeu conversion - fix template.
* Standards / ISO / Formatters / DCAT / Update SEMICeu conversion - more robust on CRS.
* Formatter / DCAT / ISO common names are unused for keywords. A dedicated template exists in dcat-core-keyword
* Formatter / DCAT / Mobility DCAT improvement.

  * Add mapping for referenceSystem
  * Add test
  * Disable distribution for now and delegate to DCAT-AP.

* Formatter / DCAT / accrualPeriodicity / Add support for userDefinedMaintenanceFrequency which can be mapped to mobility DCAT vocabulary https://mobilitydcat-ap.github.io/controlled-vocabularies/update-frequency/latest/index.html#.
* DCAT / Formatter / Simplify template for constraint. Match first resource constraints and then map to access and use elements.
* Formatter / DCAT / Improve mapping for organization with multiple individuals.
* Formatter / DCAT / Mobility DCAT - improve accrual periodicity mapping.

  Cardinality:

  * ISO 0..n
  * DCAT 0..n
  * DCAT-AP 0..1
  * Mobility DCAT 1..1 (in ISO either use corresponding period eg. P0Y0M0DT1H0M0S or extend the codelist with the proper vocabulary)

  accrualPeriodicity mapping is done using the ISO to Dublin Core value mapping, but additional checks are done when ISO records extend the codelist and may use the EU Publications Office frequency codes or the Mobility DCAT-AP update frequency codes. Domain-specific codelists take priority over the DC or ISO codelists. eg. `<mmi:MD_MaintenanceFrequencyCode codeListValue="15min"/>`

  multipleAccrualPeriodicityAllowed is a parameter that can be set to true to allow multiple accrualPeriodicity values. Defaults to false for EU formatters, true for DCAT.

* Missing test file update.
* Formatter / DCAT / Applicable legislations

  https://semiceu.github.io/DCAT-AP/r5r/releases/3.0.0/#applicableLegislation

  Add the element to DCAT-AP base. Element is 0..n in DCAT-AP and should be present in extensions (mobility, hvd, geodcat). HVD requires at least http://data.europa.eu/eli/reg_impl/2023/138/oj and cardinality is 1..n.

  Do not restrict to a particular legislation list. A sample vocabulary is provided but it can be extended depending on catalogue domains.

* Formatter / DCAT / Identifier(s)

  In DCAT and DCAT-AP, `dct:identifier` is 0..n. Mobility DCAT restricts it to 0..1. In DCAT-AP and extensions, only convert the first identifier as `dct:identifier`; others as `adms:identifier`.

* Formatter / DCAT / Identifier / Urn

  Use `:` separator for URN-like identifiers.

* Formatter / DCAT / Map ISO language codes (bibliographic codes) to DCAT language codes (terminology codes)
* Formatter / DCAT / Map ISO language codes / Test.

---------

Co-authored-by: GeryNi <[email protected]>
Co-authored-by: Jose García <[email protected]>
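
Note on the identifier and accrualPeriodicity rules above: the sketch below restates them in Python for illustration only. It is not the formatter code, which is not shown in this message. The profile names, the function names `map_identifiers` and `map_accrual_periodicity`, and the vocabulary values are all hypothetical placeholders. It also assumes that the DCAT core profile emits every identifier as `dct:identifier`, and that "priority" means a code is looked up in the domain-specific codelist before the ISO to Dublin Core mapping.

```python
from typing import Dict, List

# Hypothetical profile ids for DCAT-AP and its extensions (assumption).
SINGLE_IDENTIFIER_PROFILES = {
    "eu-dcat-ap", "eu-dcat-ap-hvd", "eu-dcat-ap-mobility", "eu-geodcat-ap",
}

# Placeholder vocabularies, not the real code lists.
DOMAIN_CODES = {"15min": "mobility-frequency:15min", "daily": "mobility-frequency:daily"}
ISO_TO_DC = {"daily": "dc-frequency:daily", "annually": "dc-frequency:annual"}


def map_identifiers(identifiers: List[str], profile: str) -> Dict[str, List[str]]:
    """Split identifiers between dct:identifier and adms:identifier."""
    if profile in SINGLE_IDENTIFIER_PROFILES:
        # Only the first identifier stays a dct:identifier; the rest move to adms:identifier.
        return {"dct:identifier": identifiers[:1], "adms:identifier": identifiers[1:]}
    # Assumed behaviour for DCAT core: dct:identifier is 0..n, so keep them all.
    return {"dct:identifier": list(identifiers), "adms:identifier": []}


def map_accrual_periodicity(
    codes: List[str],
    domain_codes: Dict[str, str],
    iso_to_dc: Dict[str, str],
    multiple_allowed: bool = False,
) -> List[str]:
    """Map ISO maintenance frequency codes, preferring the domain-specific codelist."""
    mapped: List[str] = []
    for code in codes:
        # The domain-specific codelist wins over the ISO to Dublin Core mapping.
        value = domain_codes.get(code) or iso_to_dc.get(code)
        if value and value not in mapped:
            mapped.append(value)
    # Mirrors multipleAccrualPeriodicityAllowed: keep one value unless allowed.
    return mapped if multiple_allowed else mapped[:1]
```

With these placeholders, a record carrying three identifiers under a DCAT-AP profile yields one `dct:identifier` and two `adms:identifier` values, and a code present in both codelists resolves to the domain-specific value.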