improve batch loading of UUIDs stats #154
base: master
@TeachMeTW at your convenience, could you test this locally on the I tested it with smaller ones and it looks promising, but I am not able to easily load large dumps
I've tested the application locally on the staging dataset and wanted to share my findings:
Error on Overview Page
Performance Observations
It appears that the issue might be related to how larger datasets are being handled. The superfast staging dataset
Screen.Recording.2024-12-13.at.8.12.56.PM.mov
Screen.Recording.2024-12-17.at.1.35.20.PM.mov
Tested on open_access. The UUIDs do not load in batches; the screen stays blank for 3-4 min and then everything loads at once, akin to the pre-batch behavior.
Thanks for testing. I must have messed something up. I'll check it out tomorrow.
https://dash.plotly.com/background-callbacks#using-set_props-within-a-callback

Implements a 'background' callback to improve the batching of add_user_stats. This avoids relying on a dcc.Interval with a fixed time between batches.

The new callback, which has background=True, keeps running on the server until all of the chunks have been processed. At the end of each chunk it calls set_props, which sends an update to the UI with the latest data for 'store-loaded-uuids'.

Via `running=`, the loading spinner is disabled during execution of the callback and reinstated when the callback finishes or is cancelled.

When there are 0 loaded-uuids-stats and >0 total uuids (i.e. while we are loading the first batch of 10), a secondary loading spinner is shown in place of the table.

This should speed things up further and prevent any glitches caused by the fixed interval.

The Output of the callback is 'all-uuids-stats-loaded', a new store which is not used anywhere. This seems pointless, but it is necessary because the callback's initialization depends on its Output. If it has no Output, it initializes on the Overview page, causing errors. If the Output is 'loaded-uuids-stats', it waits for the callback to finish before showing anything, defeating the purpose of batch loading. Thus, I resorted to making a new store for it.
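For readers who want to see the shape of the chunk loop described in the commit message above, here is a minimal sketch in plain Python. It is not the code from this PR: `load_in_batches`, `compute_stats`, and `push_update` are hypothetical names, and `push_update` merely stands in for Dash's `set_props(component_id, props)` so that the loop can run without a Dash server. The returned flag loosely mirrors the unused 'all-uuids-stats-loaded' Output, under the assumption that the real callback returns some completion value.

```python
from typing import Callable, Iterator, List


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def load_in_batches(
    uuids: List[str],
    compute_stats: Callable[[List[str]], List[dict]],  # hypothetical per-batch worker
    push_update: Callable[[str, dict], None],          # stand-in for dash.set_props
    batch_size: int = 10,
) -> bool:
    """Process every chunk in one long-running call, pushing progress after each."""
    loaded: List[dict] = []
    for batch in chunked(uuids, batch_size):
        loaded.extend(compute_stats(batch))
        # After each chunk, send the accumulated rows to the UI-side store.
        # A copy is sent so later batches do not mutate an earlier update.
        push_update("store-loaded-uuids", {"data": list(loaded)})
    # Completion flag; assumed to feed an otherwise unused Output.
    return True
```

Inside an actual background callback, `push_update` would presumably be `set_props` itself, and the function body would sit inside the callback decorated with `background=True`; the point of the sketch is only that progress is pushed from within a single long-running call rather than by re-triggering on a timer.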
This store is already declared in app_sidebar_collapsible (which is global to the entire app). Redeclaring it on this page probably does nothing, but it might cause issues because Dash doesn't expect two components with the same ID. Removing it.
Force-pushed from f7b773d to 616f7da.
@TeachMeTW I believe I have resolved the issue. Can you test again on I wanted to test it myself on a larger dataset so I went through the trouble of loading in the latest
Testing done:
Before the first batch is loaded
Callback cancels when switched to a different page or tab (it never got past 70, which is when I switched to the trips tab)
I think the reason for it being fast on Stage but slow on Laos + Open Access is simply that Stage does not have nearly as many trips as Laos + Open Access.
One hiccup I noticed that is annoying for the dev workflow is that autoreload sometimes breaks. Normally, when a file is modified while the dashboard is running, it refreshes itself. But if I do this while the UUIDs tab is showing, it gets stuck and I have to restart the docker container to see my changes. If I move off of the UUIDs tab and then modify a file, it reloads like normal. So I think there is something about having a background callback running that blocks the autoreload process.
@JGreenlee Tested with openaccess, it does batch loading as intended. It also resets progress, as you said, when switching between tabs.