Concept mapping to EDR #135
Comments
@chris-little will probably have some ideas |
Making a start - amended table from the STA docs...
|
This gets at the thorny issue of what an EDR "collection" means in a sensor data context. I could see collections being mapped to Datastreams, Thing/Locations, or even ObservedProperties(x)/Datastreams. This is why I think that, in concept mapping, we need to think in parallel about the query mapping.
|
I think ObservedProperty can be mapped to the EDR parameter. The hard part is indeed defining what a "collection" is. When applying Features I would normally map each EntityType to a collection, but clearly that doesn't work for EDR. Let's build a query:
For many STA servers this would work fine, since services generally don't mix observations from different domains. In this case there is only 1 collection for the entire service. Of course, for servers where there are very different uses for the same ObservedProperty, like indoor and outdoor temperatures, this would not work, and additional filtering is needed. This additional filtering would need to come from the collection. For example:
We can then make two collections that add the filter parts:
Of course one could also make an EDR endpoint that returns Things instead of Observations... |
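Editorial sketch of the idea in the comment above (not the commenter's original queries, which are not preserved here): the EDR parameter name selects the ObservedProperty, and each collection contributes an extra filter part. The collection ids, the `properties/environment` path, and the base URL are all hypothetical and would depend on how a given STA server models its data.

```python
from urllib.parse import quote

# Hypothetical per-collection filter parts; a real server would define its own.
COLLECTION_FILTERS = {
    "indoor": "Datastream/Thing/properties/environment eq 'indoor'",
    "outdoor": "Datastream/Thing/properties/environment eq 'outdoor'",
}


def build_filter(collection_id, parameter_name):
    """Combine the parameter filter with the collection's extra filter, if any."""
    parts = [f"Datastream/ObservedProperty/name eq '{parameter_name}'"]
    extra = COLLECTION_FILTERS.get(collection_id)
    if extra:
        parts.append(f"({extra})")
    return " and ".join(parts)


def build_url(base_url, collection_id, parameter_name):
    """Build an STA Observations request URL with a percent-encoded $filter."""
    return f"{base_url}/Observations?$filter={quote(build_filter(collection_id, parameter_name))}"


print(build_url("https://example.org/sta/v1.1", "indoor", "temperature"))
```

A collection with no entry in `COLLECTION_FILTERS` falls back to the parameter filter alone, which matches the single-collection case described above.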
This is a productive conversation -- thanks for opening it @ksonda -- a couple things to help steer the analysis as I don't think much of the above is taking in the full scope of what EDR specified. Jumping to the end, the EDR First, in regards to the idea of a mapping in the first place. Second, in regards to the nature of EDR implementations. For STA implementations that choose to use the "items" endpoint, we have something to work with. But what we have is actually just a collection that has some set of items that can be accessed as OGC-API Features. These items have a specific set of attributes that map onto the valid EDR query patterns for each item -- e.g. valid parameter-names, time range, etc. see here for more on this pattern. Note that these items need not be sensors -- they could also be pre-determined EDR queries against any backing data. e.g. pre-defined useful point, area, or trajectory queries that people may want to discover as I could go on, but will leave it there and add my two cents on where there are some firm mappings. An EDR Apologies for not being more specific on STA details -- I have to admit that I still am just kind of confused by the API pattern and how it maps to backing data. Hopefully the description of the intention for EDR implementation patterns above is useful. |
I agree that STA implementation patterns are infinite which means we're not going to come up with a mapping that could be considered "standard". For use cases that STA is particularly suited for, like moving platforms or platforms that change their feature of interest over time and space, and extremely complex OData queries with filters and selects of attributes of multiple entities, etc. then EDR may just not be the correct interface for that kind of query. In this category I would put maybe many of the "Smart Cities" implementations that track transportation infrastructure and vehicles. In practice though we have a relatively small community in the environmental monitoring space, where we are dealing with stationary monitoring locations with discrete, stable features of interest. The motivating use cases are essentially:
Sure, for providers of such STA endpoints, could we simply ask them to publish netCDF/zarr versions of the underlying data and publish EDR based on that? But I think there is desire among STA providers to move directly from STA to EDR, so that EDR can be supported for simple RESTful queries and STA for those users who need OData without needing to construct and maintain a separate back end for each of them. For these implementations I think we have some traction between the monitoring network mockup from @dblodgett-usgs and what @hylkevds put together from above. In practice, I think the major sources of variation in STA implementations of stationary environmental sensor networks have been
I think 1. and 2. and 3. can be handled across STA implementations (in the stationary sensor case) by just concatenating all the information into a I think 4. is both trickier and more important. Maybe there is some room for flexibility in whether the EDR |
👍 from me -- this is spot on. Getting at your:
If we rewind the clock a bit to the HDWG best practice for SOS2 and WaterML2-Timeseries... For the sake of interoperability, the "feature of interest" was restricted to be the sensor's location. That was not intended to limit association to other features of interest, but the interpretation of "what feature does this observation characterize" is a bit too open ended for interoperability purposes -- especially when we start to think about the schema used to describe a feature of interest. I think a similar approach would need to be taken here -- and clearly is the approach that some STA implementers are taking. EDR doesn't attempt to take on the issue of "domain" feature of interest. Rather, it focuses on the sampling feature which may or may not be an a-priori identified (EDR) |
This suggests that to cover bases within the intention of EDR, EDR |
@ksonda And of course we put Making a start - amended table from the STA docs...
|
Yes, and I think we need to concatenate many of the STA entities into some EDR collection to enable the kind of combined space + parameter query that EDR is fundamentally about. I think Dealing with a similar problem with OAF, @webb-ben and I put together an ad-hoc STA->OAF mapping for pygeoapi that @KoalaGeo was generous enough to try out. That mapping roughly looks like this:
Maybe all of the above is way too opinionated though, and my ambitions for even that level of consensus are not feasible |
We are making progress here. A few important points to surface.
So we might have: Note that the Taking another stab at this table...
[1] Note that the more typical EDR use case (a data cube that we want to sample with some sampling geometry) is quite different with respect to STA -- potentially different enough that it isn't worth pursuing a mapping between the two. [2] The EDR GeoJSON schema contains required parameters for a geojson document:
[3] Note that EDR "instances" are intended to be instances of a collection. The real intention here is for collections that have versions (like forecast models that have runs or ensembles). Use of instances for Datastreams would be a very different implementation pattern where a collection is a Thing with one item and one location. I don't think this is worth pursuing. |
Thanks, this is helpful. This combined with these docs in particular are helping think through this better. In the stationary sensor/discrete sample location use case I'm imagining (@KoalaGeo and @jkreft-usgs chime in if I'm off base), As for STA entities, The EDR Capabilities endpoints Now, the other endpoints get more complicated. The EDR schema for the responses for requests for a given metadata endpoint requires information from multiple STA entities, such that it doesn't make much sense to me to map for example an STA Thing to an EDR item. Rather, an EDR item corresponds to a document that would need to be cobbled together from an STA EDR Collection metadata endpoints
|
@ksonda A minor point, but what did you intend by group? |
@chris-little the short answer is I don't know. It's in both the ReDoc and swagger docs under Capabilities endpoints but not in the standard (at least according to a ctrl + f). |
@mburgoyne Any suggestions about the /group endpoint? |
Looks like it's coming from this? https://github.com/opengeospatial/ogcapi-environmental-data-retrieval/blob/master/standard/openapi/schemas/groups.yaml - I opened opengeospatial/ogcapi-environmental-data-retrieval#327 to track this. |
@ksonda at the moment, agree However I have talked with @hylkevds previously about adding our groundwater forecast ensembles to our STA endpoint which occurs already do -
If the Borehole is your Thing, you can make a new Datastream for the forecast data. Someone who wants the current forecast can request Observations and filter on "validTime not lt now()" or with FROST: "overlaps(validTime, now())". For this use case, then |
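Editorial sketch of the "current forecast" request described above, using the FROST filter expression quoted in the comment. The base URL and the Datastream id are hypothetical; only the filter string comes from the thread.

```python
from urllib.parse import quote

# Filter expression as quoted in the comment above (FROST syntax).
CURRENT_FORECAST_FILTER = "overlaps(validTime, now())"


def current_forecast_url(base_url, datastream_id):
    """Build a request for the Observations of one forecast Datastream
    whose validTime overlaps the current time."""
    return (
        f"{base_url}/Datastreams({datastream_id})/Observations"
        f"?$filter={quote(CURRENT_FORECAST_FILTER)}"
    )


# Hypothetical service root and Datastream id, for illustration only.
print(current_forecast_url("https://example.org/sta/v1.1", 42))
```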
I retract the irrelevance of instances :) However, it does present similar problems as collections, in that the STA provider has wide latitude in defining what an instance is, the indicator for which could be instantiated as a name, description, or arbitrary property attribute of Things, Sensors, Datastreams, or even just Observations (as a parameter). Is there any best-practice guidance within STA for how to deal with an idea similar to instance? If not, should there be? |
@ksonda A PR is being prepared to remove it, in V1.0.1 of the standard, which is the Master branch now. Paleo-ware, software archaeology? |
If /group is not actually part of the standard then I don't think there's anything to align about it! |
How the EDR concept of |
Spot on @hylkevds -- You've illustrated the perfect example of why there are very few standards that attempt to handle "versions" of a dataset where there is any semantic structure to what the versions are. The fact is, STA does not have a meta-dimension to say "this set is the same in all ways from the containing set except in this one way that makes it a useful version of the same thing that we want to track uniquely." In EDR, "instance" is really intended to support use cases where the same model has been run multiple times and produced slightly different results. But as you all have pointed out above, there are other use cases where this kind of grouping may be useful. IMHO, we would do ourselves a favor to avoid instances in this conversation except in the case that we have multiple STA endpoints that are identical to each other except in some specific difference, like software version, model initialization conditions, etc. In that case, we would call each of the two or more essentially identical STA endpoints a separate instance in EDR. |
Connected discussion started on opengeospatial/ogcapi-environmental-data-retrieval#373 |
IMO if we can use the water quality IE to define what the semantics are for this kind of observation data, we may make progress on this as both EDR and STA concepts could be mapped more clearly to O&M/SSN/SOSA semantics. |
@liangsteve @chris-little is there scope for an "Official" OGC interoperability experiment/sprint or similar to look at definitive mappings between SensorThingsAPI & EDR? @ghobona @hylkevds @KathiSchleidt @m-burgoyne for info |
@KoalaGeo @liangsteve I suppose there is scope, though spare effort is scarce. The current EDR focus is on V1.1 (supporting POST/GET, and custom and categorical dimensions) and the V1.2 focus is Pub-Sub, which has overlap with streaming/async and perhaps STA. |
Discussion of this was raised at the HDWG Spring meetings today. We should look for an opportunity to advance the topic. Perhaps stepping back just a bit and looking at it through the lens of an OMS mapping to EDR and STA? opengeospatial/ogcapi-environmental-data-retrieval#373 (comment) |
@ksonda -- my read of this is that you were on a good track with your mapping of expansions of location to EDR items and locations. You had said that you would come back to it in your comment above but never did. Do you think you could revisit that comment and refine it based on your latest understanding? I feel like there's a best practice report in this or something. See opengeospatial/ogcapi-environmental-data-retrieval#373 (comment) for my latest take over in the EDR issue on monitoring networks. |
See my response there: opengeospatial/ogcapi-environmental-data-retrieval#373 (comment). I know EDR less well than STA, but I think a de minimis mapping is possible, despite the variety in STA implementations. |
cool -- so what do you think closure criteria for this issue should look like? |
We have implemented an experimental yet operational mapping. Compare USGS STA and EDR proxy. Essentially, sta/Things?$expand=Locations maps to /items and /locations. Sensors and FoI are not directly mapped but conceivably could populate /items somehow |
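Editorial sketch of what such a mapping might look like in outline; this is not the USGS proxy's code. It assumes each expanded Location carries a GeoJSON geometry in its `location` field and uses only the first Location of each Thing, both of which are simplifying assumptions.

```python
def things_to_feature_collection(sta_response):
    """Turn a Things?$expand=Locations response into a GeoJSON FeatureCollection,
    one feature per Thing, suitable as a starting point for /items."""
    features = []
    for thing in sta_response.get("value", []):
        locations = thing.get("Locations", [])
        # Assumption: the first Location is the Thing's current position.
        geometry = locations[0].get("location") if locations else None
        features.append({
            "type": "Feature",
            "id": thing["@iot.id"],
            "geometry": geometry,
            "properties": {
                "name": thing.get("name"),
                "description": thing.get("description"),
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up sample response for illustration only.
sample = {
    "value": [
        {
            "@iot.id": 1,
            "name": "Well A",
            "description": "Example monitoring well",
            "Locations": [
                {"location": {"type": "Point", "coordinates": [-105.0, 40.0]}}
            ],
        }
    ]
}
print(things_to_feature_collection(sample)["features"][0]["geometry"])
```

Things without any Location are kept with a null geometry, which GeoJSON permits; a real implementation might prefer to skip them.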
This may be a can of worms.
Over at geopython/pygeoapi#817 (comment) there is discussion of implementing an STA provider for an EDR endpoint. This of course raises the question of how to map STA queries to EDR queries. I think this discussion is better had here, for possible general application and broader thoughts from the STA community, than siloed within pygeoapi.
@hylkevds @KoalaGeo @jkreft-usgs