Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move regex to lazy lock #37

Closed
wants to merge 2 commits into from
Closed

Conversation

ruwi-next
Copy link

Hi!

We can compile the message regex once and reuse it everywhere. This has the benefit of not recompiling the regex if a new iterator is created. It also means the test does not need a copy of the regex. And it is one less reference to pass around.

Let me know what you think :)

Copy link

google-cla bot commented Dec 2, 2024

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@jrouaix
Copy link
Contributor

jrouaix commented Dec 3, 2024

Hi there, about regexes in LazyLock.
We had our share of mem leak problems in production because of those Regexes in Lazies.
Somehow Regex capture use some share of memory per thread and do not release until the regex is dropped.

We sometimes fixed it by .clone() the regex before using it.

some readings :

It's now a weak/monitored point in our software : we never really know when the regexes will bite us.

So we avoid having lazy ones, we encapsulate when in some "per execution structure" that compile them once, in a big while, then the regexed are dropped in the end of the execution.

This lib could be used in a long running process, I'd advice not to lazy anything, enabling it's memory footprint to go back to 0 when it's not used.

@ruwi-next
Copy link
Author

Hi there, about regexes in LazyLock. We had our share of mem leak problems in production because of those Regexes in Lazies. Somehow Regex capture use some share of memory per thread and do not release until the regex is dropped.

We sometimes fixed it by .clone() the regex before using it.

some readings :

It's now a weak/monitored point in our software : we never really know when the regexes will bite us.

So we avoid having lazy ones, we encapsulate when in some "per execution structure" that compile them once, in a big while, then the regexed are dropped in the end of the execution.

This lib could be used in a long running process, I'd advice not to lazy anything, enabling it's memory footprint to go back to 0 when it's not used.

Hi, thanks for the explanation,

I did not know about the regex memory leak using TLS as a cache, it is very interesting.

https://docs.rs/regex/latest/regex/#sharing-a-regex-across-threads-can-result-in-contention seems to be the best documentation for it. I agree this is not worth the potential issues for dependencies.

Thanks

@ruwi-next ruwi-next closed this Dec 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants