forked from ontoportal/ontologies_linked_data
Feature: Add index all data step #136
Merged
syphax-bouazzouni merged 3 commits into development from feature/add-index-all-data-step on Apr 21, 2024
Conversation
syphax-bouazzouni force-pushed the feature/add-index-all-data-step branch 2 times, most recently from 87b87ed to 038883b (April 20, 2024 22:38)
syphax-bouazzouni force-pushed the feature/add-index-all-data-step branch from 038883b to 2c3b3a2 (April 20, 2024 22:47)
syphax-bouazzouni added a commit that referenced this pull request (May 22, 2024):
…ndexing ontologies data and metadata, URI data fetching (#135)

* Feature: Optimize tests run time by 50% (#107)
* update bubastis.jar v1.4.0 and fix missing import exception
* optimize mappings tests
* optimize provisional relation tests
* optimize notes tests
* fix mappings tests
* optimize instances tests
* add generate_missing_labels and extract_metadata process options
* don't index and extract by default in submission process in tests
* optimize ontology submission tests run time
* Feature: Add Virtuoso, AllegroGraph and GraphDB integration to OLD (#106)
* set up multi-store unit-tests environment
* fix unit tests
* add vo parsing optimization
* update RDF version, replaced RDF::SKOS with RDF::Vocab::SKOS (#131)
* Fix: an issue after updating the RDF gem to 3.0 that froze request params (#133)
* fix an issue after updating the RDF gem to 3.0 that froze request params
* handle the case when the sparql endpoint default value is empty
* Feature: Migrate SOLR configuration files to use SOLR Schema API (#126)
* use standard SOLR in docker compose with no ontoportal old configs
* migrate ontology properties SOLR configuration to use Schema API
* migrate ontology classes SOLR configuration to use Schema API
* migrate provisional classes indexation to use Schema API and model hooks
* update tests to handle the new indexation API
* simplify the ontology properties index schema
* update class and properties schema to use the existent dynamic names
* Feature: Index Ontologies metadata and content & Agents (#130)
* use standard SOLR in docker compose with no ontoportal old configs
* migrate ontology properties SOLR configuration to use Schema API
* migrate ontology classes SOLR configuration to use Schema API
* migrate provisional classes indexation to use Schema API and model hooks
* update tests to handle the new indexation API
* simplify the ontology properties index schema
* update class and properties schema to use the existent dynamic names
* index submission and ontologies metadata on save
* index agents metadata
* add ontology and agent metadata indexation tests
* make agent, name, acronym, email and identifiers searchable
* unindex ontology submission when archived
* make ontology acronym and name searchable
* update embedded ontology to all the fields and update submission in save
* fix embed docs search tests
* rename ontology unindex to unindex_all_data to prevent conflicts
* implement index all ontology content
* fix unescaping indexed properties naming
* fix an issue after updating the RDF gem to 3.0 that froze request params
* add parallel processing to the index_all_data step
* clear indexed data after ontology delete
* optimize index all data in Virtuoso and GraphDB by pre-fetching all ids
  - Before optimization
    - fs ⇒ 15.224490000051446s
    - ag ⇒ 19.238805999979377s
    - vo ⇒ 42.95274499990046s
    - gb ⇒ 33.52821200003382s
  - After optimization
    - fs ⇒ 15.369778999942355s
    - ag ⇒ 17.367580000078306s
    - vo ⇒ 16.564614000031725s
    - gb ⇒ 15.431716999970376s
* Feature: Add URI fetching related triples and serialization in different formats (#125)
* Add raptor library to parse ntriples data
* Add resource model to fetch id related triples and serialize it
* Add and enhance xml, ntriples, turtle and json serializers
* Updating rdf version in goo project
* updating resource model
* Adding tests for resource model and serializers
* update the resource test to have more complete data to test (array, bnodes, typed values)
* re-implement xml serializer using RDF/XML parser instead of Raptor
* implement array handling of resource to_object
* Enhance and refactor serializers ntriples, turtle and xml
* Enhance and refactor serializers ntriples, turtle and xml
* Handle blank nodes and reverse triples
  - handle blank nodes
  - fetch reverse triples
  - generate random name for models in to_object, because when two models are created at the same time one overrides the other
  - call the new serializer JSONLD and RDF_XML
* Implement new serializers jsonld and rdf_xml
  - implement jsonld serializer that uses json-ld library
  - revert changes in xml.rb file to the original implementation, and put the new implementation in rdf_xml.rb file
  - Add the media types :jsonld and :rdf_xml
* Add json-ld gem
* Enhance the test resource
  - Add some cases to the data tests
  - refactor the test of the serializers formats
* Fix test for fetch-related triples and json
* clean and refactor the resource serializer code
* Removed unused methods
* Extracted duplicated code in methods
* Removed skip from the tests

---------

Co-authored-by: Syphax bouazzouni <[email protected]>

* Feature: Add submission metrics to the indexed data
* Feature: isolate ontology submission process steps (#132)
* add an abstraction for submission process steps
* extract submission generate_rdf step to a file
* extract submission generate missing labels steps into a file
* extract the submission archiving step into a file
* add abstraction to diff tool & extract the submission step to a file
* extract the submission metrics generation step to a file
* extract the submission properties indexation step into a file
* extract the submission terms indexation step into a file
* move the extract metadata concern to submission process step file
* extract the submission generate obsolete classes step from generate rdf
* add the global submission process that calls the sub-steps
* Feature: Add index all data step (#136)
* move the submission_all_data concern to a submission process service
* add index_all step to submission parsing steps
* add index all data submission status
* send note creation notification also to the admins (#137)
* change sparql client branch to use development
* fix indexed all data being removed after the index terms step

---------

Co-authored-by: Imad Bourouche <[email protected]>
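For readers unfamiliar with the pattern, the commit list above describes isolating submission processing into separate steps and adding an "index all data" step with its own status. The following is a minimal Ruby sketch of that general idea only. Every class, method, and status name in it (`ProcessStep`, `IndexAllDataStep`, `SubmissionProcess`, the `*_done` / `*_error` statuses, the hash-based submission) is hypothetical and chosen for illustration; none of it is taken from the project's code.

```ruby
# Hypothetical sketch: all names here are illustrative, not the project's API.

# Base class for one step of a submission-processing pipeline.
class ProcessStep
  attr_reader :name

  def initialize(name)
    @name = name
  end

  # Subclasses do the actual work and return true on success.
  def run(_submission)
    raise NotImplementedError, "#{self.class} must implement #run"
  end
end

# Illustrative "index all data" step: walks the submission's triples in
# batches and hands each batch to an indexer callable.
class IndexAllDataStep < ProcessStep
  def initialize(indexer, batch_size: 2)
    super(:index_all_data)
    @indexer = indexer
    @batch_size = batch_size
  end

  def run(submission)
    submission[:triples].each_slice(@batch_size) { |batch| @indexer.call(batch) }
    true
  end
end

# Runs the steps in order, records one status per step, and stops at the
# first step that fails.
class SubmissionProcess
  def initialize(steps)
    @steps = steps
  end

  def process(submission)
    submission[:statuses] ||= []
    @steps.each do |step|
      ok = begin
        step.run(submission)
      rescue StandardError
        false
      end
      submission[:statuses] << :"#{step.name}_#{ok ? 'done' : 'error'}"
      break unless ok
    end
    submission
  end
end

# Usage: an in-memory array stands in for a real search index.
indexed = []
step = IndexAllDataStep.new(->(batch) { indexed.concat(batch) })
result = SubmissionProcess.new([step]).process({ triples: %w[t1 t2 t3] })
```

Keeping each step behind a common `run` interface is what lets a new step be appended to the pipeline without touching the others; the real project may structure this differently.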