Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New import #153

Merged
merged 11 commits into from
Jun 28, 2024
Merged

New import #153

merged 11 commits into from
Jun 28, 2024

Conversation

elsaperelli
Copy link
Contributor

Summary

This PR changes the workflow of the bulk $import operation. Previously, this server used the Ping and Pull approach, but now it uses the proposed bulk data import outlined in this IG.

New behavior

For more information on the previous Ping and Pull approach, refer to this repository's Wiki page here. The main difference between that approach and the new approach is that the new approach utilizes already exported inputs rather than this server kicking off the export request. This simplifies the process by including the export ndjson urls in the Import Manifest in the request body of the $import operation rather than just the export url.

Code changes

  • package.json / package-lock.json - removed bulk-data-utilities dependency.
  • dbOperations.js - two new functions for storing and getting ndjson file failed outcomes, inclusion of the import manifest in the bulkImportClient object so that it may be included on the Import Result.
  • importWorker.js - executeNewImportWorkflow replaces executePingAndPull.
  • ndjsonWorker.js - ndjson failed outcomes are now stored as well.
  • bulkstatus.service.js - WORK IN PROGRESS - the structure of the Import Result object has yet to be finalized.
  • import.service.js - use retrieveInputUrls.
  • measure.service.js - bulkSubmitData now follows the new $import workflow. This is okay for now because $bulk-submit-data may even go away.
  • exportUtils.js - retrieveInputUrls now returns all ndjson fileUrls included in the Import Manifest
  • test - deleted param files that are no longer used in tests, updated unit tests to follow new workflow

Testing guidance

  • npm run check
  • Utilize the bulk-export-server to get the export ndjson url files to include in the Import Manifest. This can be done by populating the bulk-export-server, kicking off the $export request, and then polling the bulkstatus location until ndjson file urls are returned. Make sure the bulk-export-server is running on a different port than deqm-test-server (PORT=3001 npm start).
  • Spin up the deqm-test-server: npm start
  • Send a GET request to http://localhost:3000/4_0_1/$import with an Import Manifest json object in the request body (defined in IG) that includes ndjson file urls from the bulk-export-server. I am happy to send examples, but I figured leaving this open would be better for testing to make sure that I interpreted the IG correctly.
  • Make sure the $import kickoff returns a 202 status.
  • Send a GET request to the content-location (bulkstatus polling location). Make sure this either returns in-progress or OK status with a Import Result object.

Import Results

Note that the structure of the Import Results (bulkstatus.service.js) is still a work in progress.

Copy link

github-actions bot commented Jun 10, 2024

Coverage report

St.
Category Percentage Covered / Total
🟢 Statements
91.64% (-2.03% 🔻)
1239/1352
🟢 Branches
82.61% (-3.67% 🔻)
342/414
🟢 Functions
91.59% (-2.16% 🔻)
207/226
🟢 Lines
91.81% (-1.78% 🔻)
1210/1318
Show files with reduced coverage 🔻
St.
File Statements Branches Functions Lines
🟢
... / populateTestData.js
86.96% (-10.87% 🔻)
72.22% (-27.78% 🔻)
100%
92.5% (-5% 🔻)
🟢
... / errorUtils.js
90% (-10% 🔻)
100%
87.5% (-12.5% 🔻)
90% (-10% 🔻)
🟢
... / dbOperations.js
88.46% (-4.32% 🔻)
100%
81.82% (-8.18% 🔻)
88.35% (-4.36% 🔻)
🟢
... / import.controller.js
80% (-20% 🔻)
100%
66.67% (-33.33% 🔻)
80% (-20% 🔻)
🟢
... / import.service.js
95% (-5% 🔻)
100% 100%
95% (-5% 🔻)
🟢
... / bundleUtils.js
89.01% (-1.1% 🔻)
83.33% (-2.78% 🔻)
90.91%
88.89% (-1.11% 🔻)
🟡
... / bulkstatus.service.js
75% (-19.94% 🔻)
47.06% (-29.61% 🔻)
71.43% (-28.57% 🔻)
76.54% (-18.33% 🔻)

Test suite run success

164 tests passing in 20 suites.

Report generated by 🧪jest coverage report action from 2d81da4

@elsaperelli elsaperelli requested a review from hossenlopp June 18, 2024 22:06
Copy link
Contributor

@hossenlopp hossenlopp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The $import requests if you give a non-existent file (one that would return 404) in the input. I'll look more closely at what's causing this to happen.

@hossenlopp
Copy link
Contributor

Additional error handling should be added to the ndjsonWorker.js file to catch any mongo or Axios errors (such as a 404) and probably note their outcome and either fail the whole import or just that single file and create the OperationOutcome for that ndjson.

Suggest wrapping the most of the ndjson.process handler with a try catch to handle any unforeseen errors. Potentially add try catch with re-throwing for the retrieveNDJSONFromLocation function if you want to make the error message more descriptive that it had to do with request for the data failing.

@elsaperelli elsaperelli requested a review from hossenlopp June 24, 2024 21:02
Copy link
Contributor

@hossenlopp hossenlopp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there with this! Just a logger statement would help with seeing whats going on on the console with our testing.

src/server/ndjsonWorker.js Show resolved Hide resolved
@elsaperelli elsaperelli requested a review from hossenlopp June 25, 2024 13:22
Copy link
Contributor

@hossenlopp hossenlopp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good and makes good use of the ping and pull stuff that was already there.

Copy link
Contributor

@lmd59 lmd59 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great and thanks for all of the hard work on this one! My only comment is certainly a nitpick, so feel free to ignore if you disagree:
There are comment references and function names that call this the "new" import workflow. In the end this will just be the import workflow, so should we remove the "new" from comments and in function name?

@elsaperelli elsaperelli requested review from lmd59 and hossenlopp June 27, 2024 14:22
Copy link
Contributor

@lmd59 lmd59 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

@elsaperelli elsaperelli merged commit 1e0a3e1 into main Jun 28, 2024
4 checks passed
@elsaperelli elsaperelli deleted the new-import branch June 28, 2024 13:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants