[BUG] CoreMessageReceiver.returnMesagesLoopDaemon is being executed too often causing higher CPU consumption #14550
Comments
Hi @ryan-murphy-sc someone from the
Hello, may I ask what the status of this issue is? We're affected by the bug as well. In the screenshot I'm attaching, there are two identical builds of one of our services; the second one has Service Bus disabled via configuration (i.e. no listeners are created, while the library itself is still included and active). When the listeners are created, there is essentially busy waiting in the daemon loop. Thank you.
Let me reproduce it and see how much CPU it takes. When I ran it myself some time back, I didn't see any significant additional CPU usage. Of course, "significant" means different things to different people. In both cases, how many receivers were there in your application when you captured CPU? Trace logging is not the best way to tell whether it is taking a lot of CPU, and the daemon only comes into play for receivers that receive very few or no messages.
We have about 100 receivers throughout our app server. A number of them receive very few messages (one per hour or even one per day). We're trying to consolidate to reduce the overhead caused by idle receivers, but doing so poses some design challenges, so it's going slowly. Thanks.
I have just tested it on my desktop. I created a Java program that runs 1000 receivers that receive no messages and do nothing else; the CPU usage is between 3% and 9%, averaging 6%. I don't really understand the 50% claim made with just 100 receivers; that 50% likely includes many other things. I don't think it is consuming too much CPU. You are free to reproduce this with a standalone app that just creates some receivers, and then check the CPU usage consumed by the app. Another point I want to make is that the daemon doesn't technically run every 1 millisecond. It sleeps for 1 millisecond after each run before running again, so you can estimate how many CPU cycles it consumes. In my test run on my desktop, when I changed the sleep duration from 1 millisecond to 10 milliseconds with the same 1000 receivers, CPU usage dropped to an average of 3%. I can make that change, but I am not sure it is the cause of your high CPU issues.
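For anyone who wants to run a similar measurement, a minimal standalone repro might look like the sketch below. It assumes the azure-servicebus 3.x SDK; the connection string, entity path, and receiver count are placeholders, and the exact factory overloads used in the original tests may have differed (for example, the receivers may have shared a single MessagingFactory instead of one connection each).

```java
import java.util.ArrayList;
import java.util.List;

import com.microsoft.azure.servicebus.ClientFactory;
import com.microsoft.azure.servicebus.IMessageReceiver;
import com.microsoft.azure.servicebus.ReceiveMode;
import com.microsoft.azure.servicebus.primitives.ConnectionStringBuilder;

public class IdleReceiverCpuTest {
    public static void main(String[] args) throws Exception {
        // Placeholders: point these at a namespace/entity with no traffic.
        String connectionString = "<namespace-connection-string>";
        String entityPath = "<topic>/subscriptions/<subscription>";
        int receiverCount = 1000;

        List<IMessageReceiver> receivers = new ArrayList<>();
        for (int i = 0; i < receiverCount; i++) {
            // Each receiver stays open but never sees a message.
            receivers.add(ClientFactory.createMessageReceiverFromConnectionStringBuilder(
                    new ConnectionStringBuilder(connectionString, entityPath),
                    ReceiveMode.PEEKANDLOCK));
        }

        System.out.println("Created " + receivers.size() + " idle receivers; observe process CPU now.");
        Thread.sleep(10 * 60 * 1000); // keep the process alive for 10 minutes of observation

        for (IMessageReceiver receiver : receivers) {
            receiver.close();
        }
    }
}
```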
The 50% included other workers as well, not only ASB receiver threads, although they were a big portion of that number. This was running with the JVM pinned to a single core of an Azure A2 processor, so we're fairly resource-constrained; your workstation probably has more horsepower.
@ryan-murphy-sc @mxmxcz The reason I am saying it might not be taking much CPU is that your high CPU issue may not be resolved even if I changed that daemon sleep time from 1 millisecond to 100 milliseconds.
@yvgopal Thank you for your answers. In the original screenshot I posted, the app was connected to just 1 subscription (with 1 listener and 1 instance). AFAIK there were no, or close to no, messages at the time, which is fairly common in our use case. I created a quick PoC as you suggested: https://github.com/mxmxcz/servicebus-test. It's a bare Java app containing just the Service Bus library in the latest version, not even Spring. To amplify the effect on the local (much more powerful) machine, it creates several listeners on the topic's subscription. On my laptop it averages about 6% - but bear in mind that it's a high-end machine with an i7-9750H CPU (12 logical cores @ 2.60 GHz); our pods in k8s are far less powerful for obvious reasons. I also tried running the app as several (10?) instances, each with 1 listener, which had a similar effect on some of them (those were running at ~8%, the rest at ~2%); I didn't find a way to reproduce that scenario reliably, though. There were no messages throughout the whole testing, and I was the only one connected to the subscription.
I absolutely agree that the daemon doesn't technically run every 1 millisecond, but when the daemon does nothing (which is the case when there are no messages, I suppose), the actual work takes much less than 1 millisecond (x << 1 ms), so it effectively runs every "1 ms + x". I completely understand that logging messages are by no means a reliable way to debug the problem, but they help to uncover what's going on. I don't understand the approach to threading in the library, but is it really necessary to keep polling every x milliseconds, instead of using some blocking mechanism and waking the thread only when needed?
@mxmxcz Thanks for testing it. I settled on that 1 millisecond sleep only after running it for some time to assess the CPU impact. The reason I took this daemon approach was to solve a producer-consumer problem without having to block a thread. A blocking queue would solve it for me, but it always blocks a thread; imagine 1000 receivers, each with one thread blocked inside it. I considered multiple options and decided a daemon was better. Maybe there is another way I can solve it. As long as the CPU impact is not bad and the code is stable, I thought it was OK to have a daemon. Having said that, I am not treating it as a bug that needs fixing now. It will be low priority for us, and I will consider increasing the sleep time.
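To make the trade-off being discussed concrete, the sketch below (an illustration, not the SDK's code) shows the kind of blocking alternative @mxmxcz asks about: one shared dispatcher thread that waits on a condition and is signalled only when a prefetched message or a pending receive arrives. It still parks a thread, but a single shared one rather than one per receiver, which is the per-receiver cost @yvgopal describes for a plain blocking queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative only - NOT the SDK's implementation. One dispatcher thread sleeps
// until there is actual work, instead of waking up on a fixed 1 ms schedule.
class SignalledDispatcher<T> {
    interface PendingReceive<T> { void complete(T message); }

    private final ReentrantLock lock = new ReentrantLock();
    private final Condition workAvailable = lock.newCondition();
    private final Queue<T> prefetched = new ArrayDeque<>();
    private final Queue<PendingReceive<T>> pending = new ArrayDeque<>();

    // Producer side: a message arrives from the wire.
    void onMessagePrefetched(T message) {
        lock.lock();
        try {
            prefetched.add(message);
            workAvailable.signal();   // wake the dispatcher only when there is work
        } finally {
            lock.unlock();
        }
    }

    // Consumer side: application code asks for a message.
    void onReceiveRequested(PendingReceive<T> request) {
        lock.lock();
        try {
            pending.add(request);
            workAvailable.signal();
        } finally {
            lock.unlock();
        }
    }

    // Dispatcher loop: blocks one shared thread, no polling, no 1 ms wake-ups.
    void runDispatchLoop() throws InterruptedException {
        while (!Thread.currentThread().isInterrupted()) {
            lock.lock();
            try {
                while (prefetched.isEmpty() || pending.isEmpty()) {
                    workAvailable.await();   // sleep until both a message and a request exist
                }
                pending.poll().complete(prefetched.poll());
            } finally {
                lock.unlock();
            }
        }
    }
}
```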
@yvgopal will there be a separate GitHub issue for the enhancement, or can we continue to use this one and change the title to indicate it's an enhancement? It's an important issue for us and I want to keep some visibility on it.
@yvgopal also, out of curiosity, is this daemon loop even relevant for message receivers that don't use prefetch and don't use MessageAndSessionPump? Our receivers rarely use prefetch and never use MessageAndSessionPump, so this entire daemon seems unnecessary for them. I also feel this was closed prematurely; it's a very real issue impacting Azure Service Bus customers.
Hi. This is a cross-post from Azure/azure-service-bus-java#405, since I don't know whether anyone is looking at those, and we are having the issue with Service Bus 3.x, which lives in this repo.
Describe the bug
CoreMessageReceiver.returnMesagesLoopDaemon is scheduled to run every millisecond even when there are neither prefetched messages nor pending receive calls, causing unnecessarily high CPU usage.
We migrated our clients from a JMS/QPID approach to this SDK, and our server now idles at about 50% CPU without any traffic.
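For readers unfamiliar with the pattern being reported, a loop of the shape sketched below (run once, sleep 1 ms, run again) wakes up roughly 1000 times per second per receiver even when there is nothing to do. This is only an illustration of the behaviour described in this report, not the SDK's actual source.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Illustration of the reported behaviour: a task that reschedules itself with a
// fixed 1 ms delay wakes up ~1000 times per second, even with no work at all.
public class SelfReschedulingLoop {
    private static final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor();

    public static void main(String[] args) {
        scheduleNextRun();
    }

    private static void scheduleNextRun() {
        timer.schedule(() -> {
            // Real work would go here: match prefetched messages to pending receives.
            // With no messages and no receive calls, this body returns immediately,
            // so the loop effectively runs every "1 ms + a few microseconds".
            scheduleNextRun();
        }, 1, TimeUnit.MILLISECONDS);
    }
}
```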
Exception or Stack Trace
N/A
To Reproduce
Although the 50% CPU figure involves many message receivers across multiple applications on the server, higher-than-expected CPU usage can be seen with just a handful of message receivers.
The message receivers are kept open and polled for messages periodically, due to some paradigm issues with the message pump approach; a sketch of this usage follows the code snippet below.
Code Snippet
ClientFactory.createMessageReceiverFromEntityPath(getMessagingFactory(), entityPath, receiveMode);
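A fuller sketch of how such a receiver might be kept open and polled periodically is shown below. The factory-creation call, connection string, entity path, and polling interval are placeholder assumptions (they do not appear in the original report); only the createMessageReceiverFromEntityPath call is taken from the snippet above.

```java
import java.time.Duration;

import com.microsoft.azure.servicebus.ClientFactory;
import com.microsoft.azure.servicebus.IMessage;
import com.microsoft.azure.servicebus.IMessageReceiver;
import com.microsoft.azure.servicebus.ReceiveMode;
import com.microsoft.azure.servicebus.primitives.MessagingFactory;

// Hypothetical usage sketch: a receiver that stays open and is polled
// periodically, which is the pattern described under "To Reproduce".
public class PeriodicPollingReceiver {
    public static void main(String[] args) throws Exception {
        // Placeholder values - not taken from the original report.
        MessagingFactory factory =
                MessagingFactory.createFromConnectionString("<namespace-connection-string>");
        String entityPath = "<queue-or-subscription-path>";

        IMessageReceiver receiver = ClientFactory.createMessageReceiverFromEntityPath(
                factory, entityPath, ReceiveMode.PEEKANDLOCK);

        while (true) {
            // Poll for a message, waiting up to 5 seconds on the server side.
            IMessage message = receiver.receive(Duration.ofSeconds(5));
            if (message != null) {
                // ... application processing ...
                receiver.complete(message.getLockToken());
            }
            // The receiver stays open between polls; idle receivers are where the
            // returnMesagesLoopDaemon overhead shows up.
            Thread.sleep(60_000);
        }
    }
}
```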
Expected behavior
The expected behaviour is lower idle CPU usage when no messages are being received.
Setup (please complete the following information):
Additional context
Information Checklist
Kindly make sure that you have added all of the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report.