ODS harvesting via simple URL not working #6962
Comments
Related to the breaking change introduced in #6677.
Definitely not an ODS expert, but it looks like the dataset id path differs depending on whether you request API version 1 or version 2. In the PR you're pointing at, ODS API v2 support was added (a3db440), and it should have preserved compatibility with the version 1 API. So before 4.2.3, harvesting an ODS API v2 was not working. Running the following harvester config on main
{"@id":"228","@type":"simpleurl","owner":["1"],"ownerGroup":["2"],"ownerUser":["undefined"],"site":{"name":"6962","uuid":"d3e54543-097d-4bb8-bfe6-0fa9c04bb73d","account":{"use":false,"username":[],"password":[]},"url":"https://opendata.lillemetropole.fr/api/datasets/1.0/search?refine.publisher=M%C3%A9tropole+Europ%C3%A9enne+de+Lille&start=0&rows=20","icon":"blank.png","loopElement":"/datasets","numberOfRecordPath":"/nhits","recordIdPath":"/datasetid","pageSizeParam":"rows","pageFromParam":"start","toISOConversion":"schema:iso19115-3.2018:convert/fromJsonOpenDataSoft"},"content":{"validate":"NOVALIDATION","importxslt":"none","batchEdits":"[]"},"options":{"every":"0 0 0 ? * *","oneRunOnly":false,"overrideUuid":"SKIP","status":"active"},"privileges":[{"@id":"1","operation":[{"@name":"view"},{"@name":"dynamic"},{"@name":"download"}]}],"ifRecordExistAppendPrivileges":false,"info":{"lastRun":"2023-05-05T05:25:19.923Z","running":false,"result":{"added":"224","atomicDatasetRecords":"0","badFormat":"0","collectionDatasetRecords":"0","datasetUuidExist":"0","privilegesAppendedOnExistingRecord":"0","doesNotValidate":"0","xpathFilterExcluded":"0","duplicatedResource":"0","fragmentsMatched":"0","fragmentsReturned":"0","fragmentsUnknownSchema":"0","incompatible":"0","recordsBuilt":"0","recordsUpdated":"0","removed":"0","serviceRecords":"0","subtemplatesAdded":"0","subtemplatesRemoved":"0","subtemplatesUpdated":"0","total":"224","unchanged":"0","unknownSchema":"0","unretrievable":"0","updated":"0","thumbnails":"0","thumbnailsFailed":"0"}}}
for the v1 API collects 224 records.
And running
{"@id":"373","@type":"simpleurl","owner":["1"],"ownerGroup":["2"],"ownerUser":["undefined"],"site":{"name":"6962 v2","uuid":"cc6c2ae1-34a8-4ac6-bd19-8df33098f61b","account":{"use":false,"username":[],"password":[]},"url":"https://opendata.lillemetropole.fr/api/explore/v2.0/catalog/datasets?rows=100","icon":"blank.png","loopElement":"/datasets","numberOfRecordPath":"/nhits","recordIdPath":"/dataset/dataset_id","pageSizeParam":"rows","pageFromParam":"start","toISOConversion":"schema:iso19115-3.2018:convert/fromJsonOpenDataSoft"},"content":{"validate":"NOVALIDATION","importxslt":"none","batchEdits":"[]"},"options":{"every":"0 0 0 ? * *","oneRunOnly":false,"overrideUuid":"SKIP","status":"active"},"privileges":[{"@id":"1","operation":[{"@name":"view"},{"@name":"dynamic"},{"@name":"download"}]}],"ifRecordExistAppendPrivileges":false,"info":{"lastRun":"2023-05-05T05:46:25.882Z","running":false,"result":{"added":"10","atomicDatasetRecords":"0","badFormat":"0","collectionDatasetRecords":"0","datasetUuidExist":"0","privilegesAppendedOnExistingRecord":"0","doesNotValidate":"0","xpathFilterExcluded":"0","duplicatedResource":"0","fragmentsMatched":"0","fragmentsReturned":"0","fragmentsUnknownSchema":"0","incompatible":"0","recordsBuilt":"0","recordsUpdated":"0","removed":"1","serviceRecords":"0","subtemplatesAdded":"0","subtemplatesRemoved":"0","subtemplatesUpdated":"0","total":"10","unchanged":"0","unknownSchema":"0","unretrievable":"0","updated":"0","thumbnails":"0","thumbnailsFailed":"0"}}}
for the v2 API collects 100 records. So this seems fine to me, no?
So your issue was related to
String uuid = this.extractUuidFromIdentifier(record.get(params.recordIdPath).asText());
which only works if the property you need is a direct property of the loopElement node. That is not the case for every JSON harvester, and not for ODS API v2. So it was indeed changed to
String uuid = this.extractUuidFromIdentifier(record.at(params.recordIdPath).asText());
This explains why your 4.2.2 config did not work in 4.2.3. By the way, a quite clear error is reported in the harvester log.
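To make the difference between the two lookups concrete, here is a minimal sketch. It does not use the harvester code or Jackson itself: plain Java Maps stand in for the record node, and the `get` and `at` helpers below are hypothetical stand-ins that only mimic a direct field lookup versus a pointer-style lookup with a leading `/`. The record shapes and id values are assumptions inferred from the two `recordIdPath` values in the configs above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: plain Maps stand in for the JSON record node. */
public class RecordIdLookup {

    /** Direct field lookup (analogous to record.get(name)): one level only. */
    static Object get(Map<String, Object> record, String fieldName) {
        return record.get(fieldName);
    }

    /**
     * Pointer-style lookup (analogous to record.at(pointer)): walks "/a/b"
     * segment by segment. Escape sequences of real JSON Pointers are ignored.
     */
    @SuppressWarnings("unchecked")
    static Object at(Map<String, Object> record, String pointer) {
        if (pointer.isEmpty()) {
            return record; // the empty pointer refers to the node itself
        }
        if (!pointer.startsWith("/")) {
            throw new IllegalArgumentException("Pointer must start with '/': " + pointer);
        }
        Object current = record;
        for (String segment : pointer.substring(1).split("/")) {
            if (!(current instanceof Map)) {
                return null; // the path goes deeper than the data
            }
            current = ((Map<String, Object>) current).get(segment);
        }
        return current;
    }

    /** Assumed v1-style item: the id is a direct property of the item. */
    static Map<String, Object> v1Record() {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("datasetid", "example-v1"); // hypothetical id value
        return item;
    }

    /** Assumed v2-style item: the id is nested one level down. */
    static Map<String, Object> v2Record() {
        Map<String, Object> dataset = new LinkedHashMap<>();
        dataset.put("dataset_id", "example-v2"); // hypothetical id value
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("dataset", dataset);
        return item;
    }

    public static void main(String[] args) {
        // Old style: a bare property key, direct children only.
        System.out.println(get(v1Record(), "datasetid"));
        // A nested path cannot be expressed as a single key, so nothing is found.
        System.out.println(get(v2Record(), "/dataset/dataset_id"));
        // New style: both shapes work once the path starts with '/'.
        System.out.println(at(v1Record(), "/datasetid"));
        System.out.println(at(v2Record(), "/dataset/dataset_id"));
    }
}
```

In this sketch, passing the old bare key `datasetid` to `at` throws, which is the kind of failure a 4.2.2-style config would be expected to hit after the change. Adding the leading `/` resolves it.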
Thanks for looking into this @fxprunayre. Indeed, in the end, it's just the missing … I didn't notice that the mentioned PR was using ODS v2, which has a different hierarchy and keys.
Just to clarify, this had nothing to do with ODS API v2 (which we don't use). It was an error on our side; it does work with the new format for the record id pointer. Thanks @fxprunayre.
FYI, I opened geonetwork/doc#240 regarding this.
Describe the bug
Harvesting the following ODS catalog via the simple URL harvester (which works on version 4.2.2) does not seem to work anymore. I have the feeling this is related to the change that the recordIdPath input now expects a path, /datasets/datasetid (from the document root?). Or is it just me indicating the wrong path? In version 4.2.2, only the property key datasetid is indicated here.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Harvest ~208 records from the catalog.
Log file
harvester_simpleUrl_MEL_ODS_GN_main_202303301528.log
Desktop (please complete the following information):