Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable cache access #169

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from
Draft

Enable cache access #169

wants to merge 1 commit into from

Conversation

t-nil
Copy link
Contributor

@t-nil t-nil commented Jan 1, 2024

Hey,

I really want to be able to edit the cache. So I started refactoring a bit. Two things:

  • As you can see, I started to replace the Arc<> with just clones of String and PathBuf. Usage was inconsistent over the files and it seems unnecessary. I assume you wanted to gain speed/save RAM? Copying FFmpeg Param struct even over like 20 samples and 10 CRF search runs should be negligible RAM usage compared to what the video encoder takes, and speed should also not nearly be a bottleneck to any use case. If anything, the whole thread-safe accessing over tokyo threads and mutex locking could make things slower, compared to cheap memcpys (atleast thats my understanding). Would you be ok with me simplifying those structs?

  • Can I ask why you use sled? Again, performance should not be an issue (a few ms lookup vs minutes to hours of encoding), and with JSON we have the possibility to

    1. edit the cache with text editors (deleting entries primarily),
    2. version it in GIT (if experimenting or discovering bugs), and best of all
    3. saving cleartext params instead of only hashes (so we can precisely look up past runs).

    Would you allow me to change the backend to JSON (or similar)? It would probably make things easier instead of maintaining two formats (and should also make development alot easier since you can just look at the runs directly).

I don't want to come of as criticizing, I'm sure you noticed I enjoy working with and on your program and design decisions in the start are hard if not impossible to get right :) so thanks again!

@alexheretic
Copy link
Owner

I do intend to replace sled, I think with sqlite. Having multiple small json files is very inefficient on most filesystems, having one/few big json files is kinda just rolling our own embedded db.

I really want to be able to edit the cache

This isn't something I would generally envision users doing. What's your motivation for doing so?

For example, perhaps we could include time with cache and offer a cache pruning command + cache clearing command in the binary.

saving cleartext params instead of only hashes (so we can precisely look up past runs).

It would be interesting to more eagerly lookup info for crf-searches, but I do like the simplicity of the current approach and determinism of the search itself. I also will probably keep hashing filenames as it makes the cache less sensitive privacy wise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants