
Improving archiver space efficiency #3

Open · AsterNighT opened this issue Jul 19, 2023 · 13 comments

AsterNighT (Contributor) commented Jul 19, 2023

The idea of this project is quite interesting. I haven't really tested it for long, but it seems promising. The greatest drawback right now seems to be the archiver's storage consumption: it takes only about 3 minutes to produce 100 MB of screenshots.

The screenshots seem to contain a lot of duplicated content. It would be good to have a filter before an image is ever archived. A first idea is to check for duplicate images with hashes. It is also possible to "rank" images based on the text extracted from them, but this would require careful research and design.
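The hash idea could be prototyped with a difference hash (dHash). This is a minimal sketch, assuming each screenshot has already been decoded (and, in practice, downscaled to something like 9×8 pixels) into a 2-D grid of grayscale values; the threshold is an assumption, not a tuned value:

```python
# Sketch of hash-based screenshot deduplication (hypothetical thresholds).
def dhash(pixels: list[list[int]]) -> int:
    """Difference hash: one bit per horizontal neighbour comparison."""
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

seen: list[int] = []  # hashes of screenshots already archived

def should_archive(pixels: list[list[int]], threshold: int = 2) -> bool:
    """Drop a screenshot whose hash is near-identical to one already kept."""
    h = dhash(pixels)
    if any(hamming(h, s) <= threshold for s in seen):
        return False
    seen.append(h)
    return True
```

Identical (or nearly identical) frames land on the same hash neighbourhood, so only the first copy is archived; the Hamming threshold controls how aggressive the filter is.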

STRRL (Owner) commented Jul 20, 2023

Most mainstream video codecs (H.265 and others) reduce storage usage by encoding only the differences between frames. So I think this goal could be achieved by the future video-based archiver, which is also on the ROADMAP.

What do you think about it?

STRRL (Owner) commented Jul 20, 2023

But there are a lot of tuning options in the video encoding process; the early implementation probably won't be the best one (most likely)... 🫣

STRRL (Owner) commented Jul 20, 2023

I just tried it on my Linux machine:

  • captured 2 screens at 4K resolution for about 10 minutes
  • the images take 176 MiB
  • used ffmpeg for video encoding, with the default profile, H.265 encoding, and a 0.5 fps framerate
  • encoding all the images into video took 12 s, but consumed all the CPU while it ran
  • the final video output is 4 MiB for one screen and 3 MiB for the other
  • compression ratio: 176 / (4 + 3) ≈ 25×
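The experiment above can be approximated with an ffmpeg invocation along these lines. This is a hypothetical reconstruction: the exact flags, paths, and profile used were not shown, so treat them as assumptions:

```python
# Hypothetical ffmpeg command for the experiment above: encode a directory
# of PNG screenshots at 0.5 fps with the default libx265 profile.
def build_ffmpeg_cmd(pattern: str, out: str, fps: float = 0.5) -> list[str]:
    return [
        "ffmpeg",
        "-framerate", str(fps),   # one input frame every 2 seconds
        "-pattern_type", "glob",
        "-i", pattern,            # e.g. "screen-0/*.png" (placeholder path)
        "-c:v", "libx265",        # H.265/HEVC with default preset/CRF
        out,
    ]

cmd = build_ffmpeg_cmd("screen-0/*.png", "screen-0.mp4")

# Reported compression ratio: 176 MiB of images -> 4 MiB + 3 MiB of video.
ratio = 176 / (4 + 3)  # ~25x
```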

STRRL (Owner) commented Jul 20, 2023

With the default H.265 encoding, the video takes about 4 MiB for every 10 minutes.

STRRL (Owner) commented Jul 20, 2023

So it would take about 200 MiB for 8 hours of daily usage, and about 1.4 GiB per week.

STRRL (Owner) commented Jul 20, 2023

I think it's close enough to the performance of Rewind on macOS.

STRRL (Owner) commented Jul 20, 2023

The file size of the video always depends on the content, so it would grow with more complex content, but I don't think it would exceed 10× more space.

I think it's good enough to use for now. 🤩

What do you think about it? @AsterNighT

AsterNighT (Author) commented Jul 21, 2023

That makes sense. Actually, I hadn't heard of Rewind before.
The way Dejavu runs now uses about 15% of my CPU time (laptop, 6800H; mostly tesseract, I suppose). And I think we would need something like a live-streaming encoder. Not sure how much extra CPU that would take.

AsterNighT (Author) commented Jul 24, 2023

I tried a few seemingly viable ways to do screen capturing and video encoding.

  1. Call ffmpeg directly. It works, and the overhead is minimal. ffmpeg itself is cross-platform, but its arguments are not, and it does not provide an interface for processing the frames.
  2. Capture screens and feed them to https://github.com/ralfbiedert/openh264-rs. This does not seem to support frame-by-frame encoding (or it is supported only by the raw APIs; the documentation is limited). The documentation claims it is cross-platform; I haven't verified that personally.
  3. https://github.com/astraw/vpx-encode gives an example of encoding with libvpx. From the code, it supports frame-by-frame encoding, but it builds on neither my Windows nor my Linux machine.
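For option 1, one way to still process frames in-process before encoding is to pipe raw frames into ffmpeg's stdin rather than pointing it at image files. A minimal sketch (the resolution, pixel format, and output path here are placeholders, not the project's actual configuration):

```python
import subprocess

def rawvideo_cmd(width: int, height: int, out: str,
                 fps: float = 0.5) -> list[str]:
    """ffmpeg arguments for encoding raw RGB frames read from stdin."""
    return [
        "ffmpeg",
        "-f", "rawvideo", "-pix_fmt", "rgb24",
        "-s", f"{width}x{height}",
        "-framerate", str(fps),
        "-i", "-",            # "-" = read frame data from stdin
        "-c:v", "libx265",
        out,
    ]

def start_encoder(width: int, height: int, out: str) -> subprocess.Popen:
    # Each captured frame (width*height*3 RGB bytes) can be filtered or
    # transformed in-process, then written to proc.stdin.
    return subprocess.Popen(rawvideo_cmd(width, height, out),
                            stdin=subprocess.PIPE)

# usage (assuming `frame` holds width*height*3 bytes of RGB data):
# proc = start_encoder(3840, 2160, "screen-0.mp4")
# proc.stdin.write(frame)
# proc.stdin.close(); proc.wait()
```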

STRRL (Owner) commented Jul 25, 2023

I dove into the details of what Rewind does.

A first idea to me is to check for duplicated images with hashes. It is also possible to "rank" the images based on the text extracted from it, but this would require careful research and design.

It really does something similar: when there aren't many changes in the content, it drops some pictures, falling back to about 1 image per 20 s. When there are lots of changes on the screen, it uses the full 0.5 fps.
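That adaptive behaviour could be sketched like this. The change metric and threshold are assumptions for illustration, not Rewind's actual values:

```python
# Pick the capture interval from how much the screen changed recently:
# quiet screen -> ~1 frame per 20 s; busy screen -> 0.5 fps (1 frame per 2 s).
def capture_interval(change_score: float, threshold: float = 0.05) -> float:
    """change_score: fraction of pixels differing from the last kept frame."""
    return 2.0 if change_score >= threshold else 20.0
```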

STRRL (Owner) commented Jul 25, 2023

There are lots of algorithms for image similarity detection... I have been lost in them.

Maybe I'll implement a simple one (histogram comparison) and a heavy one (OpenCV), and make it extensible.
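The "simple one" could be a histogram intersection over grayscale pixel values, along these lines (a stdlib-only sketch; in practice the pixels would come from a decoded image buffer, and the bin count is an arbitrary choice):

```python
def histogram(pixels: list[int], bins: int = 16) -> list[float]:
    """Normalised grayscale histogram for pixel values in 0-255."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // 256] += 1
    total = len(pixels)
    return [c / total for c in counts]

def similarity(a: list[int], b: list[int]) -> float:
    """Histogram intersection: 1.0 = identical distributions, 0.0 = disjoint."""
    ha, hb = histogram(a), histogram(b)
    return sum(min(x, y) for x, y in zip(ha, hb))
```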

AsterNighT (Author) commented:

I'm not sure, but would manually detecting image similarity outperform just encoding with a video codec? It would be more tunable, indeed. But if the goal is to compress text, wouldn't it be more accessible to use a compression algorithm than to manually detect text similarity and deduplicate it?

AsterNighT (Author) commented Jul 26, 2023

There are lots of algorithms for image similarity detection... I have been lost in them.

Maybe I'll implement a simple one (histogram comparison) and a heavy one (OpenCV), and make it extensible.

I don't think simple algorithms like histogram comparison would be very effective. Consider this: you are reading a very long markdown article on, say, GitHub. There will be loads of text, and clearly you would like that text to be recorded. But the histogram of the article stays almost the same as you scroll (after all, it is text only; in that sense the frames are "similar").

Or rather, maybe the filtering should be done after tesseract, not before, since it is ultimately the text that is searched, not the image itself.

I'm thinking of something like "retain the word set of the most recently captured X screenshots and calculate the similarity between the current picture and that set". I've never done such a thing before, so I'm not sure if it works.
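That last idea could be prototyped as a Jaccard similarity between the current screenshot's OCR'd words and the union of the last X kept screenshots' words. A sketch, where the window size, the 0.9 threshold, and the upstream tesseract integration are all assumptions:

```python
from collections import deque

class WordSetFilter:
    """Keep a screenshot only if its OCR'd words add enough new information
    relative to the most recently kept `window` screenshots."""

    def __init__(self, window: int = 5, threshold: float = 0.9):
        self.recent = deque(maxlen=window)  # word sets of kept screenshots
        self.threshold = threshold

    def should_keep(self, words: set) -> bool:
        pool = set().union(*self.recent) if self.recent else set()
        if pool and words:
            jaccard = len(words & pool) / len(words | pool)
            if jaccard >= self.threshold:
                return False  # mostly duplicate text; drop the screenshot
        self.recent.append(words)
        return True
```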
