OGC Metadata Codesprint 2024 - Proposals and use cases #17
Proposal - Can Metadata Standards Play Nicely? The idea is to determine and demonstrate where and when different standards (namely STAC, OGC API Records, ISO 19115, and GeoDCAT) can work in tandem to provide the most relevant information to the right users at the right time. The goal is to create a demonstration where the strengths of each are highlighted while they work together.
OGC will provide an open source framework for testing mappings between standards, using test case examples covering Records, STAC and DCAT. Codesprint participants would be able to extend this to XML-based metadata standards such as ISO 19115, in a way that is extensible to the various profiles in use. This could leverage skills in any or all of coding, data modelling, metadata standards, or application domain requirements.
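A minimal sketch of what such a mapping test case could look like in practice: run an ISO-to-DCAT transform and compare the result against a hand-written expected graph. The file names here are hypothetical placeholders, not artefacts of the OGC framework itself.

```python
# Sketch of a mapping test case: apply an ISO -> DCAT XSLT, then compare the
# output against an expected graph using RDF graph isomorphism.
from lxml import etree
from rdflib import Graph
from rdflib.compare import isomorphic

transform = etree.XSLT(etree.parse("iso2dcat.xsl"))
result = transform(etree.parse("iso-record.xml"))  # RDF/XML output

actual = Graph().parse(data=str(result), format="xml")
expected = Graph().parse("expected-dcat.ttl", format="turtle")

assert isomorphic(actual, expected), "mapping output drifted from the test case"
```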
Proposal: what's the future for ISO 19115? Background: roughly half the ISO/TC 211 members consider that ISO 19115-1:2014 should be revised, and yet a lot of areas have not yet moved from the earlier version. There is a lot of discussion, e.g. in Europe, about moving from ISO 19115:2003 to DCAT instead. I would like to canvass views on what could be useful for ISO/TC 211 to do, for example:
I'm sure I've not thought of all the "possibly useful futures"; I hope the assembled brains will help!
Hi Peter, in terms of a specific "code sprint" activity, we could look to define and exercise the "mapping" - this could be done in several ways, and the GeoDCAT Building Blocks can be used to execute, test, and publish these mappings. Five options I can immediately see:
Note that option 4 could be used to generate the transforms for 1, 2 or 5. Option 5 will be the most OGC API-friendly solution.
Thanks Rob; I was thinking more to have a short 'think & talk' break where people just give their views on what they'd like to see improved/changed in ISO 19115. The revision project in TC 211 hasn't even started yet, so we're not ready for any hands-on stuff. The activities you list would all sit in the area of using the current ISO 19115-1:2014 (with the accompanying XML spec 19115-3 and the soon-to-come JSON spec 19115-4), or even the old ISO 19115:2003/ISO 19139:2006 (as still used in Europe quite a lot). Paul Jansen is leading that second project, which has involved some of 1 & 2 in your list. Option 5 looks like a useful input to the upcoming ISO 19115-1:2014 revision project.
I would suggest skipping OWL and going straight to SHACL.
Hi all, We would like to submit a proposal on FAIRness evaluation & GeoDCAT: Evaluating the FAIRness of geospatial data - Extending the F-UJI tool for GeoDCAT. Evaluating the FAIRness of geospatial data requires specific adaptations of the FAIR principles and related evaluation tools. Taking F-UJI (https://github.com/pangaea-data-publisher/fuji), the well-known open-source evaluation tool, as a software project for our use case, we consider the following GeoDCAT extensions for further brainstorming:
We intend to share the developed extension with the Earth System Sciences community as open source, facilitating its reuse in geodata portals/catalogues, such as the European data portal, the NFDI4Earth Services Knowledge Hub, or OneStop4All.
Proposal: GeoDCAT response for OGC API Records. We could use pygeoapi's CSW facade as a starting point and transform the ISO 19115/ISO 19139 metadata from a CSW to GeoDCAT. Your thoughts on this would be greatly appreciated (especially as we have only limited knowledge of OGC API Records).
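A rough sketch of that pipeline: pull an ISO 19139 record from a CSW via GetRecordById, then transform it to GeoDCAT with an XSLT. The endpoint URL, record id, and stylesheet name are hypothetical placeholders.

```python
# Fetch an ISO 19139 record from a CSW 2.0.2 endpoint, then apply an
# ISO-to-GeoDCAT XSLT to produce an RDF/XML GeoDCAT record.
import requests
from lxml import etree

CSW_URL = "https://example.org/csw"
params = {
    "service": "CSW",
    "version": "2.0.2",
    "request": "GetRecordById",
    "id": "some-record-id",
    "outputSchema": "http://www.isotc211.org/2005/gmd",
    "elementSetName": "full",
}
iso_xml = etree.fromstring(requests.get(CSW_URL, params=params, timeout=30).content)

to_geodcat = etree.XSLT(etree.parse("iso19139-to-geodcat.xsl"))
print(str(to_geodcat(iso_xml)))  # serialized GeoDCAT (RDF/XML) record
```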
Does the GeoNetwork (draft?) OGC API Records implementation provide an alternative starting point? It already provides an OGC API Records DCAT output for records that are "natively" stored as ISO 19139 (or presumably other forms). See the example (with unattractive GUI layout!) at https://osmetadata.astuntechnology.com/geonetwork/api/collections/main/items (use ?f=dcat or click the link).
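For reference, fetching that DCAT output programmatically is a one-liner, using the f=dcat parameter mentioned above:

```python
# Fetch the DCAT serialization from the GeoNetwork endpoint linked above.
import requests

url = "https://osmetadata.astuntechnology.com/geonetwork/api/collections/main/items"
resp = requests.get(url, params={"f": "dcat"}, timeout=30)
resp.raise_for_status()
print(resp.text[:500])  # first few hundred characters of the payload
```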
Review https://docs.ogc.org/bp/17-084r1/17-084r1.html and explore how this overlaps with Records or other options for JSON encoding. (Thanks, Uwe)
Hi All, ISO/TC 211 Working Group 7 (Information Communities) is developing a JSON implementation of the ISO 19115 metadata standard: ISO 19115-4, Geographic information — Metadata. We are looking for help on creating a GeoJSON schema and want to put that into the code sprint. Our approach is the following:
The goal and work to be done is to rework the first concept of the GeoJSON schema on the basis of the agreed GeoJSON data example (a validation sketch follows below). What is needed:
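As a starting point for that work, a minimal sketch of checking the agreed GeoJSON data example against a draft of the 19115-4 GeoJSON schema; both file names are hypothetical placeholders.

```python
# Validate a GeoJSON metadata example against a draft JSON Schema and report
# every violation with its location in the instance.
import json

from jsonschema import Draft202012Validator

with open("iso19115-4.schema.json") as f:
    schema = json.load(f)
with open("metadata-example.geojson") as f:
    instance = json.load(f)

for error in Draft202012Validator(schema).iter_errors(instance):
    print(f"{error.json_path}: {error.message}")
```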
Note I have moved this to a new issue to allow targeted discussions.
A valuable input for a specification of a JSON(-LD) encoding of the ISO metadata could be the OGC BP https://docs.ogc.org/bp/17-084r1/17-084r1.html. A convergence of the proposed JSON-LD mapping with GeoDCAT-AP was a further intention. The GeoJSON encoding is defined as a compaction, through a normative context, of the proposed JSON-LD encoding, with some extensions. JSON is human readable and easily parseable; moreover, the BP is based on a normative JSON-LD context which allows each property to be explicitly defined as a URI. Furthermore, the JSON encoding is defined using JSON Schema, which allows validation of instances against these schemas. On JSON-LD contexts and compaction, see "JSON-LD 1.0: A JSON-based Serialization for Linked Data", W3C Recommendation, 2014, http://www.w3.org/TR/json-ld/. For more details, examples, etc. refer to https://docs.ogc.org/bp/17-084r1/17-084r1.html. You can also experiment (compaction/expansion, ...) with the BP examples in the JSON-LD playground: https://json-ld.org/playground/
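To make the compaction mechanism concrete, a minimal sketch using the pyld library. The two-line context here is illustrative, not the BP's normative context.

```python
# Compact an expanded JSON-LD snippet against a context: properties become
# short keys that remain resolvable to full URIs.
from pyld import jsonld

expanded = [{"http://purl.org/dc/terms/title": [{"@value": "Sample dataset"}]}]
context = {"title": "http://purl.org/dc/terms/title"}

print(jsonld.compact(expanded, context))
# -> {'@context': {'title': 'http://purl.org/dc/terms/title'},
#     'title': 'Sample dataset'}
```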
@uvoges I have created a "BuildingBlock" using these resources, refactoring a little to use more modern schemas and reference existing standard Building Blocks: https://ogcincubator.github.io/iso19115-json/bblock/ogc.metacat.iso19115.bp17-084 I think we could easily refactor further to use standard provenance building blocks, push other components into reusable pieces, make the context 19115-element-specific to inherit standard contexts, and add SHACL validation rules.
Proposal: Extend the API Records response to support GeoDCAT, as proposed by the API Records SWG (Peter Vretanos): Records, on purpose, defines a very lightweight "default" model for describing resources to be discovered, along the same lines as csw:Record. The idea is that the record is extended as necessary by communities of interest. In order to make it clear how exactly a record has been extended, the model includes a "conformsTo" member, which is a place where conformance URIs can be included in the record. This allows clients to determine if the record contains extensions they understand. So, for example, a record extended with GeoDCAT elements would include a URI in the "conformsTo" member to indicate that the record includes GeoDCAT members. GeoDCAT-aware clients can look for this URI (or URIs, if there are a number of GeoDCAT conformance classes) and know right away if the GeoDCAT extensions are present.
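A minimal sketch of how a GeoDCAT-aware client might use this mechanism. The conformance URI and the record layout are illustrative assumptions, not values from the Records specification.

```python
# Test a record for the (hypothetical) GeoDCAT extension via its "conformsTo"
# member, as described above.
GEODCAT_CORE = "http://example.org/spec/geodcat-record/1.0/conf/core"

def has_geodcat_extension(record: dict) -> bool:
    """Return True if the record declares the hypothetical GeoDCAT class."""
    return GEODCAT_CORE in record.get("conformsTo", [])

record = {"id": "rec-1", "conformsTo": [GEODCAT_CORE], "properties": {}}
print(has_geodcat_extension(record))  # True
```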
See also: #17 (comment)
Proposal: GeoNetwork 5 - OGC API Records with DCAT-AP/GeoDCAT (country-specific APs). Jose Garcia, Jeroen Ticheler, and I have been working with François Prunayre on making BIG improvements to GeoNetwork's OGC API Records implementation (see recently merged pull requests). François Prunayre has recently improved GeoNetwork 4's DCAT-AP/GeoDCAT (country-specific APs), and the next step is to incorporate these into GeoNetwork 5 as well as incorporate the new OGC API Records implementation. We will be working in the EU (UTC+1) as well as the West Coast Canada (UTC-8) time zone. We hope to be attending most of the main meetings. See the tickets for the work outlined in the diagram above. GeoNetwork 5 board: https://github.com/orgs/geonetwork/projects/4
Great. Can you provide details yet on how you handle multiple specific APs, and what we might focus on to make these machine-actionable, for configuration perhaps?
Hi Rob, the plan is to move the GeoNetwork 4 DCAT-AP/GeoDCAT outputs across. Right now, choosing the output format selects a specific XSLT that produces the AP-specific DCAT output. This is quite a bit of work, so I expect it will not be completed next week.
Note: ISO19193 => ISO19139 |
@avillar can we have RDF/XML examples validated with the current building block pipeline? What about XSLT transforms? If we do this, then we can define various profiles in SHACL and also test Records and native DCAT examples.
Proposal: Data Format and Validation Service for Interoperability Between Analytics and Data. As systems become more complicated, and a system-of-systems approach becomes the norm rather than the dream, it's important that datasets and containerised analytics are able to declare, trust, and verify that data meets certain assertions and standards. At AURIN we've been throwing around an idea for building a Data Format and Validation Registry, first internally and then externally, that would mix persistent identifiers, human-readable data assertions, and machine-readable and executable assertions on data formats to add an automated "trust, with verification" layer to our data/analytics ecosystem. The basic sketch -- which is as far as we've progressed it -- would be to have a minted persistent identifier (similar to a DOI or an ORCID) of some description that links to a human-readable and machine-readable page describing what this data format is and what assertions data would have to meet to be considered valid. A back-of-the-envelope service could have something like a standard DOI-type link that would resolve to a page with human-readable metadata about the data format, and then validation scripts or attached assertions in a validation language like Great Expectations or something similar. The challenge, beyond the obvious one of vetting whether this is a good approach in the first place, would be to think about what metadata should be provided on the other end of that "format persistent identifier" link that would allow a service to independently and automatically verify that a dataset complies with the standard it claims to adhere to. Lots to unpack there, and not a lot of direction, but hopefully useful grist for the mill!
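A hedged sketch of that "trust, with verification" flow: resolve a format persistent identifier to a machine-readable descriptor, then run the attached assertions. The PID URL and descriptor layout are invented for illustration, and a JSON Schema stands in for the assertion language; a real registry might attach Great Expectations suites instead.

```python
# Resolve a (hypothetical) format PID to its descriptor, then verify that a
# dataset's metadata satisfies the assertions the descriptor declares.
import requests
from jsonschema import ValidationError, validate

FORMAT_PID = "https://pid.example.org/format/aurin-demo-0.1"  # invented PID

descriptor = requests.get(
    FORMAT_PID, headers={"Accept": "application/json"}, timeout=30
).json()
dataset_metadata = {"crs": "EPSG:4326", "rows": 1200}  # the thing under test

try:
    validate(instance=dataset_metadata, schema=descriptor["assertions"])
    print("dataset conforms to the format it claims to adhere to")
except ValidationError as err:
    print(f"verification failed: {err.message}")
```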
This use case is now tracked in #25.
@rob-metalinkage How could we integrate such a check in GeoNetwork? Is there an endpoint to submit a record and retrieve the resulting report?
This crosses a whole bunch of implementation concerns, including infrastructure. IMHO the first step is to have registries of profiles (of standards) for different application domains, and to focus on machine-readability of these. Hence the OGC Building Blocks - but we need to articulate the patterns better; please join in the report writing, where we will try to outline what's common here. I would think GeoNetwork should embed profile validation, using cacheable resources from "fine grained" profiles, so you can validate parts of resources quickly and elegantly in user interactions if you choose (a sketch follows below). We will be adding XSLT transforms into the Building Block toolkits so we can exploit SHACL validation rules, and could possibly include XSD validation. The utility of XSD alone seems low IMHO, as so much often depends on what you put into the schema, not whether you follow the structure. So, let's look at it from the perspective of ISO and OGC: what can we do to make GeoNetwork users' lives easier? What's the goal, and what are the first steps we can take?
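As an illustration of the kind of profile validation step GeoNetwork could embed, a minimal sketch with the pyshacl library; the record and shapes file names are hypothetical placeholders.

```python
# Check a record (as RDF) against a profile's SHACL shapes, e.g. fetched from
# a building-block register, and print the report on failure.
from pyshacl import validate

conforms, _report_graph, report_text = validate(
    "record.ttl",                      # the record to check, as RDF
    shacl_graph="profile-shapes.ttl",  # cacheable shapes for the profile
    inference="rdfs",
)
print("conforms" if conforms else report_text)
```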
Could you use the describes element at the end of the ISO XML record?
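For example, a quick check for that element in a 19139 record might look like this; the record file name is a hypothetical placeholder.

```python
# Look for a trailing gmd:describes element in an ISO 19139 metadata record.
from lxml import etree

NS = {"gmd": "http://www.isotc211.org/2005/gmd"}
doc = etree.parse("iso19139-record.xml")
for el in doc.xpath("/gmd:MD_Metadata/gmd:describes", namespaces=NS):
    print(etree.tostring(el, pretty_print=True).decode())
```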
Regarding spdx:checksum - if the idea is to validate the file which is at the end of a gcx:Anchor, I'd see that as more of a limitation of the xlink namespace; it could probably do with a checksum attribute. If it's about validating a file at the end of one of the URLs that are more specifically part of ISO 19115 (e.g. a citation or online resource), then I guess that is a "limitation" of ISO 19115-1: it didn't imagine also having to cope with information security/integrity.
For xlink, could possibly use
For any proposed solutions, I'd like to see how you would map them to DCAT - we are adding support for XSLT to the building block register so we can make this more easily found and shared (FAIR).
I probably am misunderstanding, but as, for example, checksum is a property of the DCAT 3 Distribution Class, you would want a mapping in ISO 19115 associated with MD_Distribution? See [1] https://semiceu.github.io/DCAT-AP/releases/3.0.0/#Checksum
@nmtoken - thanks for the profile example. It will be useful when looking at GeoDCAT scope. We need some 19115 XML aficionados to provide XML examples and profile descriptions. For example, I don't see how the mapping here works: https://wiki.earthdata.nasa.gov/pages/viewpage.action?pageId=6226303 /gmi:MI_Metadata/gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/gmd:CI_Citation/gmd:identifier/gmd:MD_Identifier/gmd:code/gco:CharacterString seems to be just smuggling it into a generic identifier...
A very philosophical question - would a checksum be a good identifier? It's likely (but not guaranteed!) to be unique, but it's hardly likely to be actually used to retrieve the data object (although I guess if a system designer chose to make the checksum the identifier, then it could be). Anyway, if NASA's DataFileContainer.Checksum is their identifier for the dataset, then that seems a reasonable place. But you would need a separate MD_DataIdentification for each file (distribution) - very different from many people's view of a dataset having many distributions (e.g. formats). I agree with Rob that this isn't what I'd see as a generic mapping. Note: I've only really considered mapping from ISO 19115 to DCAT, not the other way round...
I actually missed that DCAT 3 also has a Checksum Class (https://www.w3.org/TR/vocab-dcat-3/#Class:Checksum), which isn't associated with the Distribution Class. I assume that leaves it open as to how to map to ISO 19115, if you wanted to do that exercise.
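For concreteness, a sketch of the DCAT side of such a mapping in the DCAT-AP 3 style referenced above, where a Distribution points to a spdx:Checksum node; the resource URI and digest value are illustrative.

```python
# Build a dcat:Distribution carrying a spdx:Checksum node and serialize it.
from rdflib import BNode, Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

DCAT = Namespace("http://www.w3.org/ns/dcat#")
SPDX = Namespace("http://spdx.org/rdf/terms#")

g = Graph()
dist = URIRef("http://example.org/dataset/1/distribution/csv")
checksum = BNode()
g.add((dist, RDF.type, DCAT.Distribution))
g.add((dist, SPDX.checksum, checksum))
g.add((checksum, RDF.type, SPDX.Checksum))
g.add((checksum, SPDX.algorithm, SPDX.checksumAlgorithm_sha256))
g.add((checksum, SPDX.checksumValue, Literal("93a1ab...")))  # hex digest
print(g.serialize(format="turtle"))
```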
Please use this space to submit ideas for issues that could be addressed during the OGC Metadata Codesprint, which will be held from November 18 to 19, 2024, in Sydney, Australia, and online.
Alternatively, create a new issue for your codesprint idea and tag it with Codesprint.
Participants are encouraged to collaborate and comment on proposals.