Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configurable threads #39

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open

Configurable threads #39

wants to merge 6 commits into from

Conversation

Zand3r24
Copy link

I added a lot between commits, which was a mistake, but the rundown is that I re-worked the threaded downloads.

Previously each 4chan thread would get its own process to check and download new images.
I created a queue (manager.list) system and made static workers/processes. By default, 4 will be created, but that is configurable with -p. As 'jobs' are pulled from the queue, a worker thread will work on it, then wait until another job is available to pull from the queue.

Let me know if you have any questions or want me to change anything.

Thanks!

@Exceen Exceen closed this Nov 10, 2022
@Exceen Exceen reopened this Nov 10, 2022
@Exceen
Copy link
Owner

Exceen commented Nov 10, 2022

Hi, I'm currently resolving the merge conflicts while also trying to improve the code a bit further. Could you explain why this is necessary? What is this check for? To me it looks like it doesn't do anything.

# check for any threads that have completed
for process in processes:
       process.join(0.25)  # this will clean up any processes that exited/crashed somehow, while also blocking for .25 seconds

@Exceen
Copy link
Owner

Exceen commented Nov 10, 2022

You are also never removing anything from the processes list, just adding to it.

# if, for some reason, we do not have the required amount of threads running, spin up new threads
while len(processes) < args.parallel_threads:
    p = Process(target=call_download_thread, args=(tasks_to_accomplish, args))
    processes.append(p)
    p.start()

This code never gets executed because of that.

@Zand3r24
Copy link
Author

Ah, my mistake. I can clean up that process list logic.

So from what I remember p.join() will wait for the process to exit and will clean up resources if the process has exited. I think (or thought) that without this the child process may exit but still exist until it was "joined" back into the main thread. But that could have been a misunderstanding, ill have to fact check that. I'm also using it to pause the main thread so it doesnt burn up too many cpu cycles looping too quickly.

Let me know what else you think should be done differently or cleaned up and i can either fix it in this pr or submit another if you'd like.

@Exceen
Copy link
Owner

Exceen commented Nov 11, 2022

The logic with the processes list definitely needs some rework. I didn't have enough time to completely check it but I have the feeling that we don't need running_links AND tasks_to_accomplish. I guess the code can be refactored so that only tasks_to_accomplish is needed.

Other than that I resolved the merge conflicts and improved the code a bit.

@onioneffect onioneffect mentioned this pull request Nov 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants