It seems that if a call to the publication service is interrupted, it
leaves a record in Solr with an incomplete list of files. This can
be seen in CoG as a discrepancy between the "number of files" in the
summary metadata and the number of files in the actual file list - see
attached image with relevant part of a screenshot. Indeed, when
publishing a large dataset to the index (large enough that the
publication time is long relative to the master-slave sync interval in
Solr), I can watch the number of files grow in CoG.
This has left some items in the CEDA index in an inconsistent state,
because I was assuming that once a record had appeared, publication
to the index had succeeded. (Thanks to Katharina for noticing this.)
Is it possible to make the publication service atomic? If Solr cannot
guarantee full atomicity over the short time it actually takes to
write the Solr document (I'd be surprised if so), that is probably
not a big problem, but could the service at least gather all the
necessary information first and then write it in a single call to
Solr?
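
To illustrate the request, here is a minimal sketch of the "gather
first, write once" idea. It is not the publication service's actual
code. The field name `number_of_files`, the function names, and the
injected `post` callable are all hypothetical. The only Solr behaviour
it relies on is that the JSON update handler accepts an array of
documents in one request and that `commit=true` commits at the end of
that request. Whether readers could still see a partial state depends
on autocommit settings and on how dataset and file records are split
across cores, which this sketch does not address.

```python
import json
from typing import Callable, Iterable


def build_update_payload(dataset: dict, files: Iterable[dict]) -> str:
    """Collect the dataset record and all file records into one JSON array."""
    # Materialise every file record before anything is sent to Solr.
    file_docs = list(files)

    # Hypothetical field name: refuse to publish if the summary count
    # disagrees with the number of file records actually gathered.
    declared = dataset.get("number_of_files")
    if declared is not None and declared != len(file_docs):
        raise ValueError(
            f"dataset declares {declared} files but {len(file_docs)} were gathered"
        )

    return json.dumps([dataset, *file_docs])


def publish(
    dataset: dict,
    files: Iterable[dict],
    post: Callable[[str, str], None],
    update_url: str,
) -> None:
    """Send all documents in a single update request with one commit."""
    payload = build_update_payload(dataset, files)
    # `post` is an injected HTTP helper (hypothetical); it is called once.
    post(f"{update_url}?commit=true", payload)
```

With this shape, an interruption while the records are being gathered
happens before any request is sent, so nothing is written to the index
rather than a record with a truncated file list.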
Who: Alyn