Potential Memory Leak #45
Comments
Oh fun. First, thanks for all the work you've done here.

So, this is basically an asynchronous stack overflow, and so isn't a memory-leak in the traditional sense, so much as increased memory retention/overhead from retained parent domains while child domains remain. Normally, this is not an issue, since mostly I use However, it would cause a memory-leak equivalent in circumstances where you had an indefinitely growing asynchronous call stack (A calls B calls C calls D ... ad infinitum), which I suspect is what you're seeing. This is analogous to V8 historically limiting a call-stack to 200 frames for memory reasons.

So..... Since This is all stupid in the sense that Not sure when I'll have a chance to get to this since it may be a decent amount of work. Then again, since we'd just be removing domains with some shallow EE instance, it may potentially be very easy.

TLDR; Domains should be removed. Your hack solves your issue, but breaks catching of async
I'm honestly trying to think of a short-term solution for you, but there really isn't one. Your fix is your best option, since the async catch nesting isn't necessarily a big concern for you. Also, domains could be kept in a WeakMap, on the catchFn / EE state object, so that the async error nesting would work, but allow domains to be GC'd. This is maybe a more attractive solution, though I'm not sure it's any easier, so it doesn't help you at the moment.
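One possible reading of the WeakMap idea, as a minimal sketch rather than anything trycatch actually does: each context is mapped to an error-handler closure that references only catch callbacks, never the parent context itself. `createContext`, `deliver`, and the plain-object context are hypothetical names used for illustration.

```javascript
// Hypothetical sketch: context object -> error handler closure.
// Keys are held weakly, so an entry disappears once its context is collected.
const handlers = new WeakMap();

function createContext(catchFn, parentCtx) {
  // Capture the parent's handler (a chain of callbacks), not the parent context.
  const parentHandler = parentCtx ? handlers.get(parentCtx) : undefined;
  const ctx = {}; // stands in for a domain / EE state object
  handlers.set(ctx, function handle(err) {
    try {
      catchFn(err);
    } catch (nested) {
      // A failing catchFn escalates to the enclosing catch callback, if any.
      if (!parentHandler) throw nested;
      parentHandler(nested);
    }
  });
  return ctx;
}

// Route an error to the handler registered for ctx.
function deliver(ctx, err) {
  handlers.get(ctx)(err);
}
```

In this sketch the only thing that accumulates per nesting level is the catch callback itself, so the nested error handling keeps working without keeping each parent context reachable.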
Looks like |
Thanks a lot for the feedback! Your mention of I then turned my attention to what was arguably the actual source of this issue - the potentially infinite nesting of

function onError(err){/* handle error */}
var initiateMessageProcessing = function(){
  trycatch(function doTry(){
      fetchMessage(function messageReceived(err, msg){
        initiateMessageProcessing(); //Start processing/fetching of next message
        if (err) return onError(err);
        //do actual (async) message processing here
      });
    },
    function doCatch(err){
      onError(err); //pass the caught error through to the handler
    }
  );
};
initiateMessageProcessing();

Looking at this, the infinite In this case, the nesting of

//Inside messageReceived
var activeDomain = domain.active;
activeDomain && activeDomain.exit();
initiateMessageProcessing()
activeDomain && activeDomain.enter();

That didn't work, because as it turned out, each time that code was hit, the active domain also happened to be the next domain on the stack, so calling My final port of call was to create my own root domain and bind

var rootDomain = domain.active || domain.create();
var initiateMessageProcessing = function(){/* as before */};
initiateMessageProcessing = rootDomain.bind(initiateMessageProcessing);
initiateMessageProcessing();

Fortunately, this worked! I'm relatively satisfied with this solution, so I'm going to go with it for now. So, in closing, since it turned out that it was actually my code which caused the nesting of
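The effect of binding to a root domain can be illustrated with a minimal sketch. It assumes a simplified stand-in for trycatch (`miniTrycatch`, hypothetical) that creates one domain per call and links it to whichever domain was active at call time; `parentDomain` is an illustrative property, not a Node API, and the message fetch is replaced by a direct synchronous call so the chain can be measured deterministically.

```javascript
const domain = require('domain');

// Hypothetical stand-in for trycatch: one new domain per call, linked to
// the domain that was active when the call was made.
function miniTrycatch(tryFn, catchFn) {
  const child = domain.create();
  child.parentDomain = domain.active || null; // illustrative property only
  child.on('error', catchFn);
  child.run(tryFn);
  return child;
}

// Count how many ancestor domains a domain keeps reachable.
function chainDepth(d) {
  let depth = 0;
  for (let p = d.parentDomain; p; p = p.parentDomain) depth++;
  return depth;
}

// Process `count` messages; each one starts the next from inside its own tryFn.
// Returns the ancestor-chain depth observed for each message.
function measure(count, useRoot) {
  const depths = [];
  let remaining = count;
  let initiate = function () {
    if (remaining-- <= 0) return;
    miniTrycatch(function doTry() {
      depths.push(chainDepth(domain.active));
      initiate(); // start the next message, as messageReceived does
    }, function doCatch() {});
  };
  // The workaround: every call re-enters the same root domain first.
  if (useRoot) initiate = domain.create().bind(initiate);
  initiate();
  return depths;
}
```

Under these assumptions, `measure(5, false)` yields a chain that grows by one per message, while `measure(5, true)` keeps every domain exactly one level below the root.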
I appreciate you working with me to reach a solution. You're a model user / contributor. ;) I think there's definitely a solution here using WeakMap (I know you concluded the opposite), that would allow the parent domains to close, while maintaining a reference to the catch callback. This would completely eliminate all memory growth except the catch callback function, which would be relatively negligible. With long-stack-traces on, it would also keep a ~100B reference to a call stack string. Both would add up for long-running processes. There still needs to be a way to avoid these slow leaks, so I'm going to reopen this. Your

trycatch.noNest(tryFn, catchFn)
// or...
trycatch.root(tryFn, catchFn)

This will be relatively easy to add, as it would just skip the check for a currently active domain. Thanks!
We've been trying to track down a memory leak which has been killing our prod boxes. Eventually, with the help of the heapdump module, we've been led to believe that it has something to do with either trycatch, or the way we use it. See attached screenshot of profile.
The doCatch function is the closure we pass as the catchFn to trycatch, and onError and onMessageHandled are functions referenced from doCatch.
According to the profile, it seems that these functions can't get cleaned up due to persistent references from inside trycatch, in particular via the _trycatchOnError function. This function in turn creates a reference from the child domain to the parent domain. It does this via
trycatch/lib/trycatch.js, line 58 in 0fc57b3
trycatch/lib/trycatch.js, line 98 in 0fc57b3
To test this theory, I hacked trycatch locally to pass a callback to the tryFn (incidentally, the catchFn doesn't need to be invoked to create the leak). This callback would then remove the listener added at trycatch/lib/trycatch.js, line 58 in 0fc57b3.
I'm not sure whether this is a feasible route to go down though, as handling the error case, where the catchFn actually needs to be called, which itself could fail, would be challenging.
What is also confusing is that we do use trycatch elsewhere and I haven't noticed any leaks there, though that could be because we use it at a much larger scale in this use case, which makes the leaks visible.
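The local hack described above can be sketched roughly as follows. This is not the actual patch: `trycatchWithDone` and `onChildError` are hypothetical names, and a plain `EventEmitter` stands in for the child domain. The point is only that the error listener is the closure that retains the parent, and that detaching it once the work completes releases that reference.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for the patched trycatch: tryFn receives a `done`
// callback that detaches the error listener linking child to parent.
function trycatchWithDone(tryFn, catchFn, parent) {
  const child = new EventEmitter(); // stands in for the child domain

  function onChildError(err) {
    try {
      catchFn(err);
    } catch (nested) {
      // This closure is what keeps `parent` reachable from `child`.
      if (parent) parent.emit('error', nested);
      else throw nested;
    }
  }

  child.on('error', onChildError);
  tryFn(function done() {
    // After this, `child` no longer holds a reference to `parent`.
    child.removeListener('error', onChildError);
  });
  return child;
}
```

The trade-off is the one raised in the post: once `done` has run, an error emitted on the child has no listener left to route it to catchFn or to the parent.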