soft timeout for long running text extractions #397

Open
hi-ko opened this issue Apr 4, 2022 · 0 comments

hi-ko commented Apr 4, 2022

As discussed on Discord, the new transformer framework degrades scalability and stability because it leads to more long-running threads.

The only workaround today is to increase the timeouts for the HTTP client, but that lets the number of blocked threads pile up, which is not a good idea. For example:

solr.http.socket.timeout=30000
solr.http.connection.timeout=10000

To fix this, the tracker or the repo web script should support a soft timeout that releases the waiting thread and triggers a mechanism, as discussed in #396, to mark the node so that the content tracker does not pick it up. Visibility to the tracker should be restored automatically once the content has been transformed by a T-Engine.
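
To make the proposal more concrete, here is a minimal sketch of the soft-timeout idea in plain Java. It is not based on existing Alfresco code: the class `SoftTimeoutSketch`, the method `extractWithSoftTimeout` and the `pendingNodes` set are hypothetical names, and the in-memory set only stands in for whatever marking mechanism comes out of #396. The caller waits up to the soft timeout and then returns without a result instead of blocking, while the transformation keeps running on a separate pool and clears the marker when it finishes.

```java
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

public class SoftTimeoutSketch {
    // Hypothetical stand-in for the node marker discussed in #396:
    // nodes in this set would be skipped by the content tracker.
    static final Set<Long> pendingNodes = ConcurrentHashMap.newKeySet();

    /**
     * Runs the transformation on the given pool and waits at most softTimeoutMs.
     * Returns the extracted text, or Optional.empty() if the soft timeout hit.
     * Assumes the transformation never returns null.
     */
    static Optional<String> extractWithSoftTimeout(long nodeId,
                                                   Supplier<String> transform,
                                                   ExecutorService pool,
                                                   long softTimeoutMs) {
        CompletableFuture<String> future = CompletableFuture.supplyAsync(transform, pool);

        // Whenever the transformation finishes (or fails), clear the marker
        // so the node becomes visible to the tracker again.
        future.whenComplete((text, error) -> pendingNodes.remove(nodeId));

        try {
            return Optional.of(future.get(softTimeoutMs, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // Soft timeout: mark the node and release the calling thread.
            pendingNodes.add(nodeId);
            // The transformation may have finished just before the marker was set;
            // in that case remove the marker again so the node is not stuck.
            if (future.isDone()) {
                pendingNodes.remove(nodeId);
            }
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for extraction", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("extraction failed", e.getCause());
        }
    }
}
```

In this sketch the transformation is not cancelled on timeout, so the work is not lost, but the pool that runs it still has to be sized and bounded separately. A real implementation would also need to persist the marker rather than keep it in memory.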
