I need some guidance: reading/writing at maximum speed with tick storage #911
Comments
I already took my tick database, loaded it entirely into a DataFrame, and then wrote it back to Mongo (so now I have 100k rows per document by default instead of 1 tick per document as before). Speed seems to have improved. I'll measure the differences and post them here.
Reading speed test:
Okay, grouping ticks into chunks of 100k rows makes a huge difference... Now I need to figure out how to accumulate 100k ticks before saving them to a document... Any ideas? In my case ticks don't arrive fast enough to fill 100k rows quickly, so I don't know how to hold the ticks while they build up to 100k rows: if something happens and the algo goes down, I could lose up to 99,999 ticks that were never written to Mongo. PS: another important advantage is that the size of the DB has decreased from 1060 MB to 73 MB, insane...
Collect it in a different data store and then copy it to Arctic once you have accumulated enough rows. For example, you could collect it in Redis with journaling.
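A minimal sketch of that pattern, assuming a Redis list as the journal and an Arctic TickStore library; the key name, field names, and flush size below are all made up for illustration:

```python
import json

import pandas as pd
import redis
from arctic import Arctic, TICK_STORE

r = redis.Redis()                      # Redis acts as the crash-safe journal
store = Arctic('localhost')
if 'ticks' not in store.list_libraries():
    store.initialize_library('ticks', lib_type=TICK_STORE)
tick_lib = store['ticks']

BUFFER_KEY = 'tick_buffer:EURUSD'      # hypothetical journal key, one per symbol
FLUSH_SIZE = 100_000                   # copy to Arctic once 100k ticks are journaled

def on_tick(symbol, timestamp, price, size):
    """Journal every incoming tick to Redis so a crash loses (almost) nothing."""
    r.rpush(BUFFER_KEY, json.dumps({'index': timestamp.isoformat(),
                                    'price': price, 'size': size}))
    if r.llen(BUFFER_KEY) >= FLUSH_SIZE:
        flush(symbol)

def flush(symbol):
    """Move the journaled ticks into Arctic as one large write, then clear the journal."""
    ticks = [json.loads(t) for t in r.lrange(BUFFER_KEY, 0, -1)]
    df = pd.DataFrame(ticks)
    df.index = pd.to_datetime(df.pop('index'), utc=True)  # TickStore expects a tz-aware index
    tick_lib.write(symbol, df)
    r.delete(BUFFER_KEY)               # a real version should trim the list atomically instead
```

With Redis persistence (AOF) turned on, the "I could lose up to 99,999 ticks" worry from the earlier comment shrinks to whatever the journal hasn't synced to disk yet.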
LZ compression works, approximately, by finding backreferences to content it has already compressed and emitting a reference to the previous content instead of repeating it. Arctic compresses all the rows in a column. When you only write a single row at a time you are compressing a single value at a time (one row, one column), so there is very little context in which to find repeated content.
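A standalone illustration of that effect (Arctic uses its own LZ4 bindings internally; the `lz4` PyPI package is used here only to show the size difference between compressing one row and compressing a 100k-row column):

```python
import numpy as np
import lz4.block

one_row = np.array([1.2345], dtype=np.float64).tobytes()            # 8 bytes
one_column = np.full(100_000, 1.2345, dtype=np.float64).tobytes()   # 800,000 bytes

print(len(lz4.block.compress(one_row)))     # no saving at all: nothing to back-reference
print(len(lz4.block.compress(one_column)))  # a tiny fraction of 800k: long back-references
```

The same mechanism explains the 1060 MB to 73 MB drop reported above once ticks were grouped into 100k-row documents.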
Thanks a lot for your advice!
Arctic Version
Arctic Store
Platform and version
Windows 10 x64, Intel I7-6700, 32GB RAM.
Description of problem and/or code sample that reproduces the issue
It's not really an issue with Arctic, but a problem on my side. I have a Python script writing tick data from different assets in real time, and I'm not having any problem writing every tick... The problem is that I'm writing EVERY tick I receive as its own write, and my read speed is VERY LOW:
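The pattern being described is, roughly, one single-row DataFrame written per tick. A sketch of that pattern (library name and tick fields are assumptions, not the poster's actual code):

```python
import pandas as pd
from arctic import Arctic

store = Arctic('localhost')
tick_lib = store['ticks']   # assumed tick library

def on_tick(symbol, timestamp, price, size):
    # one-row DataFrame per tick -> one tiny MongoDB document per write
    tick_dataframe = pd.DataFrame({'price': [price], 'size': [size]},
                                  index=[pd.Timestamp(timestamp, tz='UTC')])
    tick_lib.write(symbol, tick_dataframe)
```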
Notice that `tick_dataframe` is a DataFrame with a single row (the tick), parsed with a timestamp and written to MongoDB as a document. I'm not having any problem writing the data this way, but after reading some closed threads here I see that the efficient approach is to save at least 100k rows or ticks in just ONE document. Any advice on how to do that? Keep the tick data in a DataFrame and, once len > 100k, write those ticks as one document, then accumulate 100k more and write them as the next one (roughly the pattern sketched below)? I'm still a very new user...
How can I merge all the single ticks into just one document for faster read operations? Maybe that could be a solution for me.
Any other recommendations? Thanks in advance for reading, and also for this awesome library.
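A minimal sketch of that accumulate-then-write idea, with everything (buffer size, field names, library name) assumed. Note that a plain in-memory buffer is lost if the process dies, which is exactly the gap the Redis journaling suggestion above addresses:

```python
import pandas as pd
from arctic import Arctic

store = Arctic('localhost')
tick_lib = store['ticks']   # assumed tick library
FLUSH_SIZE = 100_000
buffer = []                 # one buffer per symbol in a real setup

def on_tick(symbol, timestamp, price, size):
    buffer.append({'index': pd.Timestamp(timestamp, tz='UTC'),
                   'price': price, 'size': size})
    if len(buffer) >= FLUSH_SIZE:
        df = pd.DataFrame(buffer).set_index('index')
        tick_lib.write(symbol, df)   # one large chunk instead of 100k single-row writes
        buffer.clear()
```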