tccd uses 3% CPU continuously #37

Open
xrat opened this issue Aug 4, 2020 · 52 comments
@xrat

xrat commented Aug 4, 2020

We received an InfinityBook S 14 v5 pre-installed with Tuxedo Ubuntu Budgie 20.04, and we observed the following on the same hardware with a fresh installation of Tuxedo Ubuntu 20.04 (i.e. with GNOME 3) as well: tccd continuously uses about 3% CPU according to top.

# delay=10
# LC_ALL=C.UTF-8 top -b -d $delay -p $(pgrep tccd) | awk -v OFS="," '$1=="top"{ time=$3 }
  $1+0>0 { print time,$1,$NF,$9; fflush() }'
...
22:45:08,6872,tccd,3.4
22:45:18,6872,tccd,3.7
22:45:28,6872,tccd,3.1
22:45:38,6872,tccd,2.7
22:45:48,6872,tccd,3.0
22:45:58,6872,tccd,3.1
22:46:08,6872,tccd,3.2
22:46:18,6872,tccd,2.7
22:46:28,6872,tccd,2.6
22:46:38,6872,tccd,2.7
22:46:48,6872,tccd,3.3
22:46:58,6872,tccd,2.8
22:47:08,6872,tccd,2.6
22:47:18,6872,tccd,3.4
22:47:28,6872,tccd,3.3
22:47:38,6872,tccd,3.3
22:47:48,6872,tccd,3.7
...
@tuxedoxt
Collaborator

Hello,

this is in percentage of one logical core right?

It is not clear what your exact issue is. Does this affect the performance of other programs you run? How is the CPU usage during other types of load? Does it have a noticeable effect on energy performance for you?

@nickma82

Same here, ~2.8% of a logical core on v1.0.3.
journalctl -b -u tccd -f is not logging anything suspicious.

@xrat
Author

xrat commented Aug 12, 2020

We just received our 2nd InfinityBook S 14 v5 with a 4 core Intel i7-10510U which shows the same behavior.

this is in percentage of one logical core right?

Yes, top reports per-process CPU usage as a percentage of one logical core.

Does this affect the performance of other programs you run?

No.

How is the CPU usage during other types of load?

It goes down to about 1%. To reproduce, run e.g. dd if=/dev/zero of=/dev/null. But apparently it does not depend on CPU load but rather(?) on interrupts, because I see the same drop when I start cheese.

Does it have a noticeable effect on energy performance for you?

This is hard to measure, as you know. Moreover, I have only very limited time where I can test the hardware before I have to give it to the users we bought it for. Anyway, 3% for a program that I expect to idle (especially when the system is idling) seems like too much.

BTW, I am not running the Control Center GUI, not even the tray icon.

BTW2, I realized the awk command in the OP does not work well with mawk, which is the default awk at least on the Tuxedo Budgie I received (mawk buffers its output when writing to a pipe, hence the -W interactive below):

# delay=2
# LC_ALL=C.UTF-8 top -b -d $delay -p $(pgrep tccd) | mawk -W interactive '$1=="top"{ time=$3 }
  $1+0>0 { print time,$1,$NF,$9 }'

@tuxedoxt
Collaborator

Right. Of course it is idling for the most part. According to previous profiling the "heaviest" part is the fan control. This is, of course, idling for the most part as well but spends some time reading/writing to the hardware, which I believe should not actually be CPU intensive.

How often to do this is a balance between responsiveness, "smoothness" and being lean. I'll have a look, when I get the chance, at what can be further optimized.

Otherwise please let me know if there are any concrete noticeable issues regarding energy performance. Our aim is of course to have all passive processes (tccd and the tray) use the least amount of resources while providing their benefits.

Note: With GUI, especially on the dashboard, more communication is done with tccd.

@mbway

mbway commented Aug 18, 2020

I have the same situation where tccd takes around 5-10% CPU at all times. After some fiddling I was able to run the service without having it packaged, so that I could use the Node profiler (node --prof).
Note: tuxedo-control-center (the GUI) is not running; this is the daemon running completely idle.

sudo NODE_PATH="./dist/tuxedo-control-center/data/service:${NODE_PATH}" node --prof ./dist/tuxedo-control-center/service-app/service-app/main.js --start
node --prof-process isolate* > processed.txt

An overview of the results:

 [Summary]:
   ticks  total  nonlib   name
     43    0.7%    0.9%  JavaScript
   4817   78.0%   99.1%  C++
    115    1.9%    2.4%  GC
   1316   21.3%          Shared libraries

 [C++]:
   ticks  total  nonlib   name
   3176   51.4%   65.3%  v8impl::(anonymous namespace)::FunctionCallbackWrapper::Invoke(v8::FunctionCallbackInfo<v8::Value> const&)
   1138   18.4%   23.4%  epoll_pwait
     87    1.4%    1.8%  __libc_read
     69    1.1%    1.4%  __libc_open
     68    1.1%    1.4%  access
     36    0.6%    0.7%  syscall
     29    0.5%    0.6%  node::native_module::NativeModuleEnv::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
     19    0.3%    0.4%  node::fs::ReadDir(v8::FunctionCallbackInfo<v8::Value> const&)
     13    0.2%    0.3%  void node::Buffer::(anonymous namespace)::StringSlice<(node::encoding)1>(v8::FunctionCallbackInfo<v8::Value> const&)
     12    0.2%    0.2%  node::fs::Open(v8::FunctionCallbackInfo<v8::Value> const&)
     12    0.2%    0.2%  node::fs::Access(v8::FunctionCallbackInfo<v8::Value> const&)
     12    0.2%    0.2%  __write

 [C++ entry points]:
   ticks    cpp   total   name
   3299   73.0%   53.4%  v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*)
    507   11.2%    8.2%  v8::internal::Builtin_JsonParse(int, unsigned long*, v8::internal::Isolate*)
    373    8.2%    6.0%  v8::internal::Builtin_JsonStringify(int, unsigned long*, v8::internal::Isolate*)
     87    1.9%    1.4%  __libc_read
     69    1.5%    1.1%  __libc_open
     68    1.5%    1.1%  access
     36    0.8%    0.6%  syscall
     19    0.4%    0.3%  v8::internal::Builtin_ArrayBufferConstructor(int, unsigned long*, v8::internal::Isolate*)
      9    0.2%    0.1%  __write

So, there are a few things to note from the results:

  • the JavaScript-based logic is insignificant and only accounts for ~1% CPU usage
  • I/O doesn't seem to account for much either (<5%) (at least, I/O initiated directly from JavaScript, not counting any I/O done in the native library)
  • there could be some wasted time spent parsing the config files even when they haven't changed since last time (~20%); maybe the mtime could be checked before re-parsing (see the sketch after this list)
  • FunctionCallbackWrapper::Invoke might be to do with the calls to the native library; profiling that might be tricky
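A minimal sketch of the mtime idea, assuming a hypothetical readConfigCached() helper and JSON config files; this is not tccd's actual config code:

```typescript
import * as fs from "fs";

// Hypothetical cache: re-parse a config file only when its mtime has changed.
const cache = new Map<string, { mtimeMs: number; parsed: unknown }>();

function readConfigCached(path: string): unknown {
    const { mtimeMs } = fs.statSync(path);
    const hit = cache.get(path);
    if (hit && hit.mtimeMs === mtimeMs) {
        return hit.parsed;                       // unchanged since last read: skip JSON.parse
    }
    const parsed = JSON.parse(fs.readFileSync(path, "utf8"));
    cache.set(path, { mtimeMs, parsed });
    return parsed;
}
```

This trades one stat() per read against a full read plus parse, so it only helps if the files rarely change, which appears to be the case here.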

From what I can tell, the epoll_pwait is to do with the libuv event loop inside Node.
I tried profiling a setTimeout call to see whether time spent sleeping is counted by --prof, and it doesn't look like it is. Looking at the implementation of epoll_pwait (specifically, ep_poll), it looks like the kernel enters a busy loop (which would make sense given that the function has "poll" in the name).

Using time, it looks like more time is spent in kernel mode than in user mode, which would agree with this hypothesis:

# time ... node .../main.js
real    0m58.922s
user    0m0.853s
sys     0m1.979s

I could be wrong about what is going on however. I haven't written any node applications so there may be some things I'm missing.

@sma-ops

sma-ops commented Feb 9, 2021

Having the same behavior (~4 % constant CPU usage) on my Infinity Book with Ubuntu 18. Control center is not running.

I saw no updates on the issue for quite some time. Is the current status that it is an expected / to be tolerated behavior?

@mbway

mbway commented Feb 9, 2021

I would still be interested in looking into this at some point, but it's a bit disheartening that there is no communication from TUXEDO Computers (my first pull request has been sitting idle for 5 months). If a simpler C++ daemon were written, it would be easier to profile since there would be no Node runtime polluting things, and it might turn out that Node was the culprit anyway, meaning the problem would automatically be solved by that C++ daemon. The first step towards this is to move the configuration state out of the TypeScript code and into configuration files so that TCC and tccd aren't so tightly coupled. That's what my pull request #44 starts to do.

@Matheus-Garbelini

Hi @mbway, I don't get high CPU usage, but I'm using a Ryzen 3900X. If you try to increase this value from 500 to, for example, 3000:

Do you get a reduction in CPU usage? It seems there is a lot of I/O fetching happening there every 500 ms, and dbus-next is the library being used to do that. So I'm not sure if it's their code or something in the dbus library that could be blocking somewhere.
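For illustration only: the kind of fixed-interval polling being discussed looks roughly like the sketch below. The referenced source line isn't quoted in the thread, so pollIntervalMs and readHardwareState() are hypothetical names, not TCC's actual code:

```typescript
// Illustrative polling loop, not tccd's real implementation.
// Raising pollIntervalMs from 500 to 3000 would run readHardwareState()
// (and any D-Bus/sysfs traffic it triggers) six times less often.
const pollIntervalMs = 500;

async function readHardwareState(): Promise<void> {
    // placeholder for the periodic I/O work (sysfs reads, D-Bus calls, ...)
}

setInterval(() => {
    readHardwareState().catch(err => console.error("poll failed:", err));
}, pollIntervalMs);
```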

@MartinLuethi

I just bought an InfinityBook S 15, very nice hardware. With the help of powertop I got the power consumption down to about 5 W, mostly screen and keyboard backlight.
tccd sucks up about 200 mW continuously (according to powertop) without doing much (no control center running). This also keeps the CPU in a higher power state than necessary. Switching off tccd gives me about 1 W less power consumption, which is 20%!

Is nodejs this inefficient? How often is the hardware polled, and can this be adjusted (I have a hard time finding anything in the code).
Thanks, Martin

@Matheus-Garbelini

@MartinLuethi maybe take a look here:

try increasing this time from 500 to something like 3000 to see if CPU usage reduces.

@MartinLuethi

Thanks! I tried this and it changed a lot!
Although, now using the original version of tccd again is much better too, and power is down to 3 mW now.
My best guess is that one of the million libraries that get pulled in when building tuxedo-control-center is different and might have caused the problem in the first place.

Never having used npm before, I found the instructions in the README not sufficient. Missing packages were:

  • pkg and electron (from the AUR on Arch Linux)
  • angular, which I installed using sudo npm i -g @angular/cli

So, maybe Angular was to blame for the bad performance?

@brunoais

Angular performs very poorly if not tamed (which is very hard to do). It could very well be Angular.

@Fuzzillogic

On my InfinityBook Pro 14 Gen 6, tccd constantly consumes more than 0.4 W, according to powertop. That's about as much as plasmashell, which does a lot more. Against ~6.5 W total idle consumption, that's quite significant. And I too wonder about the choice of web technologies such as Angular and Electron for an always-active background service on a device where power efficiency is a top priority. Wrong tool for the job?

@brunoais

I think so too. Electron is a bad tool for the job.

@datenfalke

I have the same issue, with tccd constantly using 5% CPU on my Infinity S17 Gen 6 :-(

@RomanHiden

I have a similar issue. Just take a look at the overall TIME that has been spent on tccd:
(screenshot)

@erikkallen

Same issue on pulse 14

@gary094

gary094 commented Dec 18, 2022

I'd like to report the same (Infinitybook S14 Gen6):

Screenshot_tccd

Also, the process seems to jump frequently between "S" (sleeping), "R" (running) and "D" (uninterruptible sleep).

@mt2506

mt2506 commented Jan 16, 2023

Same issue on Polaris 17 - Gen1 and tuxedo-control-center 1.2.4

@thrasymache

I have the impression that tccd is supposed to optimize battery life, but my Aura 15 Gen 2 had a single CPU core at 100% for tccd, and when I stopped the daemon the predicted battery life jumped from one hour to more than two, so it's a huge battery drain right now.

@alexbradd

This problem is still relevant as of TCC 2.1.6: tccd doesn't pin any core, however it consumes a continuous 2% of total CPU (on a Pulse 15 Gen 2 with a Ryzen 7500U; the kernel version doesn't seem to matter). Unfortunately this kills battery life since it never allows the CPU to idle, and thus I am forced to disable tccd until this is solved.

I have done some light digging and it seems that some peaks of CPU usage can be traced back to this line:

`ps -u $(id -u) -o pid= | xargs -I{} cat /proc/{}/environ 2>/dev/null | tr '\\0' '\\n'`

which seems to be regularly called, with a spike right after the command exits, sometimes upwards of 30/40% on various cores. I am on GNOME Wayland if it is relevant.

@brunoais

Wow! In terms of performance, that code excerpt is horrible.
It only wants to get 2 pieces of information and it's listing all environment variables from all processes from the current user. That's really bad.

@datenfalke

WTF

@tuxedoder
Contributor

I tested CPU usage myself with watch -n 1 ps -C tccd -o %cpu,cmd on the Pulse 15 Gen 2 with a 5700U and, on TUXEDO OS 2 (22.04), KDE X11, kernel 6.5.0-10022-tuxedo, with 2.1.6, I currently get as low as 0.5% at idle after some waiting. Similar for pidof tccd combined with top -p PID, but that does not show an average and varies a bit, so it shows ~0.3-0.7%.

At startup it is higher due to initialization, but it isn't very high after a while. I don't really think that rewriting the whole of tccd in another language will make a big difference, since most of tccd is just hardware interaction and CLI commands. Here is a screenshot:

(screenshot)

Keep in mind that tccd will show higher CPU usage once TCC is open, since TCC communicates with tccd. For the lowest CPU usage, close TCC so it is not even visible in the tray, or make sure that it is only in the tray and not in the taskbar.

Some workers run periodically at certain intervals to set values and check for availability, so there isn't full idle with constant 0% usage. The CPU usage should be a bit lower compared to earlier versions, since at some point I rewrote the fan control to only run on profile changes.

The ps command that is highlighted collects environment variables which are required for the refresh rate menu entry and the dashboard prime-select. It isn't possible to directly ask for specific environment variables because tccd runs as root and does not see the required variables. If I run printenv inside tccd I get:

LC_TIME=de_DE.UTF-8
LC_MONETARY=de_DE.UTF-8
SYSTEMD_EXEC_PID=812
PKG_EXECPATH=/opt/tuxedo-control-center/resources/dist/tuxedo-control-center/data/service/tccd
JOURNAL_STREAM=8:30470
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
INVOCATION_ID=0be724c3537b494b9740712e83306ffb
LC_ADDRESS=de_DE.UTF-8
LC_TELEPHONE=de_DE.UTF-8
LANG=de_DE.UTF-8
LC_NAME=de_DE.UTF-8
LC_MEASUREMENT=de_DE.UTF-8
LC_IDENTIFICATION=de_DE.UTF-8
PWD=/
XDG_DATA_DIRS=/var/lib/flatpak/exports/share:/usr/local/share/:/usr/share/
LC_NUMERIC=de_DE.UTF-8
LC_PAPER=de_DE.UTF-8

There is no DISPLAY, XAUTHORITY or XDG_SESSION_TYPE, and the ps command is a workaround for that. I added it because the Xauthority file can be in various places and XDG_SESSION_TYPE should be the most reliable way to check whether someone is using Wayland.

I do appreciate feedback and I am open to suggestions. I try to keep tccd's usage low, and that is easier to do if people can point out which lines of code should be changed or have concrete implementation ideas; but in this case setEnvVariables() only gets called on tccd startup, not periodically. What could be added is a retry count to stop trying to gather information after a certain number of failed attempts, but that is a rather minor optimization I could look into.
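One way to avoid spawning the ps/xargs/cat pipeline at all would be to read /proc/<pid>/environ directly from Node. A rough sketch under that assumption; the PID cap and the findSessionEnv() name are made up for illustration, and this is not tccd's actual setEnvVariables():

```typescript
import * as fs from "fs";

// Hypothetical helper: scan a bounded number of non-root /proc/<pid>/environ files
// for DISPLAY, XAUTHORITY and XDG_SESSION_TYPE without forking a shell pipeline.
function findSessionEnv(maxPids = 20): Map<string, string> {
    const wanted = ["DISPLAY", "XAUTHORITY", "XDG_SESSION_TYPE"];
    const found = new Map<string, string>();
    const pids = fs.readdirSync("/proc")
        .filter(name => /^\d+$/.test(name))
        .sort((a, b) => Number(a) - Number(b))
        .slice(-maxPids);                               // highest PIDs, similar to the tail approach
    for (const pid of pids) {
        let raw = "";
        try {
            if (fs.statSync(`/proc/${pid}`).uid === 0) continue;   // skip root, like pgrep -vu root
            raw = fs.readFileSync(`/proc/${pid}/environ`, "utf8");
        } catch {
            continue;                                   // process exited or environ not readable
        }
        for (const entry of raw.split("\0")) {
            const eq = entry.indexOf("=");
            if (eq <= 0) continue;
            const key = entry.slice(0, eq);
            if (wanted.includes(key) && !found.has(key)) found.set(key, entry.slice(eq + 1));
        }
        if (found.size === wanted.length) break;        // stop early once all three are found
    }
    return found;
}
```

Whether this is actually cheaper than the pgrep one-liners benchmarked below would need measuring; it mainly avoids the fork/exec cost of the shell pipeline.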

@datenfalke

datenfalke commented Feb 22, 2024

I tested it with the Control Center minimized to the taskbar. I am using TUXEDO OS 2 on an Infinity S17 Gen 6.

I still constantly get between 2 and 3.3% CPU usage shown in htop.

(screenshot)

@alexbradd

Thank you for the reply! Seems the command was a red herring; sorry about that. Still, TCC doesn't expose any kind of display settings for my device, so seeing something polling X-specific envvars was puzzling.

Regarding usage, I think the difference may be in our setups, since on mine, after not touching the laptop for ~20 minutes, tccd sits at 1% to 1.5% CPU (checked using watch -n 1 ps -C tccd -o %cpu,cmd) with TCC fully closed (not even reduced to the system tray).
Watching it with top revealed that the usage oscillates between 0.3% and 1-1.5% about every half second.

My setup is the following:

  • Pulse 15 gen 2
  • openSUSE Tumbleweed with latest kernel (as of writing 6.7.5) and a minimal installation
  • GNOME 45.3 running on Wayland

As soon as I have some free time I'll give it a spin on X11 to see if something is different and poke around disabling some control features to see if I can narrow the cause.

@datenfalke

Using watch -n 1 ps -C tccd -o %cpu,cmd I get 1.8 % CPU usage constantly.

Screenshot_20240222_181710

@munix9

munix9 commented Feb 22, 2024

2.3 % CPU usage constantly.

InfinityBook S 17 Gen7/NS5x_7xPU, BIOS 1.07.09RTR2 11/17/2022
openSUSE Tumbleweed 20240220
Kernel 6.7.5-1-default x86_64
KDE/X11
tcc 2.1.6
tux-drv 4.2.2

@brunoais

brunoais commented Feb 22, 2024

I do appreciate feedback and I am open to suggestions. I try to keep tccd's usage low, and that is easier to do if people can point out which lines of code should be changed or have concrete implementation ideas

I'd start by limiting the number of PIDs to get information from. The more processes are running, the worse this piece of code performs, and that's a huge no-no.

I have quite a few ideas here, so I'll try building tccd on my system and test them.


Edit:
As baseline:

(screenshots)

@brunoais

Initial comparative data:
(screenshot)

Leaving this to GNU's C code seems to work nicely.

Now I just need to test this, and then I'll work on a PR.

@brunoais

brunoais commented Feb 22, 2024

@tuxedoder, I made a proposal at #362. Check it when able please.

@tuxedoder
Contributor

TCC doesn't expose any kind of display settings for my device, so seeing something polling X-specific envvars was puzzling.

The goal was to not display certain settings when Wayland is used. xrandr doesn't work on Wayland, and XDG_SESSION_TYPE is used to hide certain configuration options in TCC. The current code looks up DISPLAY, XAUTHORITY and XDG_SESSION_TYPE all at once to avoid multiple child_process calls.

Check it when able please

Thank you for your suggestions, I will check it next week.

@brunoais

@tuxedoder

Check it when able please

Thank you for your suggestions, I will check it next week.

Any news?

@tuxedoder
Contributor

Any news?

Not yet. I was trying to at least test it this friday. Didn't have time due to other tasks.

@brunoais

brunoais commented Mar 1, 2024

@tuxedoder Any news?

@tuxedoder
Contributor

tuxedoder commented Mar 1, 2024

Average runtime per variant (hyperfine):

  • original (91.4 ms ± 5.1 ms):
    ps -u $(id -u) -o pid= | xargs -I{} cat /proc/{}/environ 2>/dev/null | tr '\0' '\n'
  • suggestion with 20 pids (8.4 ms ± 0.8 ms):
    ps -u $(id -u) -o pid= | tail --lines 20 | xargs -I{} cat /proc/{}/environ 2>/dev/null | tr '\0' '\n' | awk '/DISPLAY=/ && !countDisplay {print; countDisplay++} /XAUTHORITY=/ && !countXAuthority {print; countXAuthority++} /XDG_SESSION_TYPE=/ && !countSessionType {print; countSessionType++} {if (countDisplay && countXAuthority && countSessionType) exit}'
  • suggestion with 10 pids (8.4 ms ± 0.9 ms):
    ps -u $(id -u) -o pid= | tail --lines 10 | xargs -I{} cat /proc/{}/environ 2>/dev/null | tr '\0' '\n' | awk '/DISPLAY=/ && !countDisplay {print; countDisplay++} /XAUTHORITY=/ && !countXAuthority {print; countXAuthority++} /XDG_SESSION_TYPE=/ && !countSessionType {print; countSessionType++} {if (countDisplay && countXAuthority && countSessionType) exit}'
  • ps, up to 10 pids per non-root user (8.3 ms ± 0.9 ms):
    ps -N -u root -o pid= | xargs ps -o user=,pid= | awk '{if (++count[$1] <= 10) print $2}' | xargs -I{} cat /proc/{}/environ 2>/dev/null | tr '\0' '\n' | awk '/DISPLAY=/ && !countDisplay {print; countDisplay++} /XAUTHORITY=/ && !countXAuthority {print; countXAuthority++} /XDG_SESSION_TYPE=/ && !countSessionType {print; countSessionType++} {if (countDisplay && countXAuthority && countSessionType) exit}'
  • pgrep, up to 10 pids per non-root user (8.5 ms ± 1.0 ms):
    pgrep -vu root | xargs ps -o user=,pid= | awk '{if (++count[$1] <= 10) print $2}' | xargs -I{} cat /proc/{}/environ 2>/dev/null | tr '\0' '\n' | awk '/DISPLAY=/ && !countDisplay {print; countDisplay++} /XAUTHORITY=/ && !countXAuthority {print; countXAuthority++} /XDG_SESSION_TYPE=/ && !countSessionType {print; countSessionType++} {if (countDisplay && countXAuthority && countSessionType) exit}'
  • pgrep, 10 non-root pids (6.5 ms ± 1.0 ms):
    pgrep -vu root | tail --lines 10 | xargs -I{} cat /proc/{}/environ 2>/dev/null | tr '\0' '\n' | awk ' /DISPLAY=/ && !countDisplay {print; countDisplay++} /XAUTHORITY=/ && !countXAuthority {print; countXAuthority++} /XDG_SESSION_TYPE=/ && !countSessionType {print; countSessionType++} {if (countDisplay && countXAuthority && countSessionType) exit} '
  • pgrep, 10 non-root pids + xargs -P4 (6.1 ms ± 0.6 ms):
    pgrep -vu root | tail --lines 10 | xargs -P4 -I{} cat /proc/{}/environ 2>/dev/null | tr '\0' '\n' | awk ' /DISPLAY=/ && !countDisplay {print; countDisplay++} /XAUTHORITY=/ && !countXAuthority {print; countXAuthority++} /XDG_SESSION_TYPE=/ && !countSessionType {print; countSessionType++} {if (countDisplay && countXAuthority && countSessionType) exit} '

I did try to gather information before replying today. I think I managed to make it a little faster with your suggestions. All benchmarks ran sequentially, without waiting periods, using hyperfine <command> -m 1000 -w 100 to get an average runtime. At first glance 10 PIDs seem to be enough, but that needs more testing.

@brunoais

brunoais commented Mar 1, 2024

Regardless of how many PIDs you try, awk does limit the output; however, using tail ensures it doesn't keep churning through too many processes in case the variables do not exist or are not set.

I don't know how you get only 91 ms with the original code. How many processes are running on your system? That is critical for how long it takes. In my case, I had 411 processes when I first tested.
Also, if the variables aren't set, it takes much longer to run than if they are set. Keep that in mind. However, if you limit it to 10 PIDs, that difference shouldn't be very significant.

@tuxedoder
Contributor

tuxedoder commented Mar 1, 2024

Regardless of how many PIDs you try, awk does limit the output; however, using tail ensures it doesn't keep churning through too many processes in case the variables do not exist or are not set.

I mainly did it for benchmarking purposes. The goal was to limit the maximum.

I don't know how you get only 91 ms with the original code. How many processes are running on your system?

Currently I have 182 if I do pgrep -vu root | wc -l. ps -u $(id -u) -o pid= | wc -l is not a lot different and shows 173.

@brunoais

brunoais commented Mar 1, 2024

OK. You have about a third as many processes as I do, and maybe a more powerful CPU; it also depends on how much I limit my CPU.
If I put it at full power with considerable fan speed, I can make it go down from (approximate numbers) 1 s ± 200 ms to 0.6 s ± 100 ms. So it seems comparable.

@brunoais

brunoais commented Mar 5, 2024

@tuxedoder Were you able to find a solution?

@brunoais

@tuxedoder News?

@tuxedoder
Contributor

I don't have news yet. If there aren't more suggestions I would add the fastest pgrep command later at some point after some more testing.

@brunoais

Then I have this suggestion, which is also based on the one you provided. I increased the tail to 20 processes because some Flatpak or Snap processes seem to have a completely empty environ, and 20 gives that extra margin.

cat $(printf "'/proc/%s/environ' " $(pgrep -vu root | tail -n 20)) | tr '\0' '\n'

My times:
~23 ms
vs. your fastest:
~35 ms
(screenshot)

Note: it assumes that pgrep never outputs single quotes, and I think that's reasonable to assume. The code also assumes pgrep outputs no unexpected whitespace, but I think that's reasonable as well.

@tuxedoder
Contributor

tuxedoder commented Mar 13, 2024

The provided command didn't work on my system.

cat $(printf "'/proc/%s/environ' " $(pgrep -vu root | tail -n 20)) | tr '\0' '\n'
cat: "'/proc/24908/environ'": Datei oder Verzeichnis nicht gefunden
cat: "'/proc/24928/environ'": Datei oder Verzeichnis nicht gefunden
cat: "'/proc/25250/environ'": Datei oder Verzeichnis nicht gefunden
...
cat: "'/proc/31263/environ'": Datei oder Verzeichnis nicht gefunden
cat: "'/proc/31264/environ'": Datei oder Verzeichnis nicht gefunden
cat: "'/proc/31266/environ'": Datei oder Verzeichnis nicht gefunden

I fixed it by removing quotes like the screenshot shows.

cat $(printf "/proc/%s/environ " $(pgrep -vu root | tail -n 20)) | tr '\0' '\n'

When I run that command in my cli without su I get:

cat: /proc/109455/environDISPLAY=:0
XAUTHORITY=/home/test/.Xauthority
XDG_SESSION_TYPE=x11
: No such file or directory
cat: /proc/109456/environ: No such file or directory
cat: /proc/109458/environ: No such file or directory

So I added an additional 2>/dev/null to be sure there isn't error text in there.

cat $(printf "/proc/%s/environ " $(pgrep -vu root | tail -n 20)) 2>/dev/null | tr '\0' '\n' | awk ' /DISPLAY=/ && !countDisplay {print; countDisplay++} /XAUTHORITY=/ && !countXAuthority {print; countXAuthority++} /XDG_SESSION_TYPE=/ && !countSessionType {print; countSessionType++} {if (countDisplay && countXAuthority && countSessionType) exit} '
Average time with sh (hyperfine / multitime), per command:

  • previous best: 6.5 ms ± 0.7 ms [User: 8.3 ms, System: 2.8 ms] with sh (hyperfine); Mean (15 ms real, 11 ms user, 13 ms sys) with sh (multitime)
    pgrep -vu root | tail --lines 20 | xargs -P4 -I{} cat /proc/{}/environ 2>/dev/null | tr '\0' '\n' | awk ' /DISPLAY=/ && !countDisplay {print; countDisplay++} /XAUTHORITY=/ && !countXAuthority {print; countXAuthority++} /XDG_SESSION_TYPE=/ && !countSessionType {print; countSessionType++} {if (countDisplay && countXAuthority && countSessionType) exit} '
  • cat proc: 4.6 ms ± 0.5 ms [User: 4.2 ms, System: 2.0 ms] with sh (hyperfine); Mean (13 ms real, 10 ms user, 8 ms sys) with sh (multitime)
    cat $(printf "/proc/%s/environ " $(pgrep -vu root | tail -n 20)) 2>/dev/null | tr '\0' '\n' | awk ' /DISPLAY=/ && !countDisplay {print; countDisplay++} /XAUTHORITY=/ && !countXAuthority {print; countXAuthority++} /XDG_SESSION_TYPE=/ && !countSessionType {print; countSessionType++} {if (countDisplay && countXAuthority && countSessionType) exit} '

I tested hyperfine with -m 10000 -w 1000 and multitime with -q -n 10. It seems like the cat proc command is cheating a bit in hyperfine, and to avoid this I put it into an sh file.

Benchmark 1: cat /proc/10889/environ /proc/10893/environ /proc/10901/environ /proc/10917/environ /proc/10991/environ /proc/11085/environ /proc/11200/environ /proc/11278/environ /proc/11282/environ /proc/11333/environ /proc/11381/environ /proc/11437/environ /proc/11471/environ /proc/11639/environ /proc/11663/environ /proc/39731/environ /proc/63777/environ /proc/448077/environ /proc/448078/environ /proc/448080/environ  2>/dev/null | tr '\0' '\n' | awk ' /DISPLAY=/ && !countDisplay {print; countDisplay++} /XAUTHORITY=/ && !countXAuthority {print; countXAuthority++} /XDG_SESSION_TYPE=/ && !countSessionType {print; countSessionType++} {if (countDisplay && countXAuthority && countSessionType) exit} '
  Time (mean ± σ):       0.5 ms ±   0.2 ms    [User: 1.3 ms, System: 0.2 ms]

Your new approach seems ~41% faster if I take the hyperfine benchmarks, which roughly aligns with your ~52%. If I take the multitime numbers it is only ~26% for some reason. Thanks for the suggestion; since it is faster on average, I will try to get it into the code after some more testing.

@brunoais

brunoais commented Mar 13, 2024

I tested it with root running it. Isn't that the user that runs the command?
If it's the actual user and not root that runs it, then I'd use
pgrep -u "$(id -u)"
instead of:
pgrep -vu root

@brunoais

@tuxedoder News?

@tuxedoder
Contributor

I tested it with root running it. Isn't that the user that runs the command?

The command is for tccd, and tccd runs as root. My intent was to showcase what happens if a path is not available, and that 2>/dev/null can help in such situations, even if it is unlikely that paths will disappear during the short time between printf and cat. Such error messages are easier to provoke by running the command as a normal user, which is why I mentioned it. I am sorry for the confusion; I should have written more details.

@tuxedoder News?

I am trying to get a new release out soon, which should also fix two other unrelated issues. I also noticed a few things related to the display worker and adjusted the code a bit, but that was rather bug fixing.

@brunoais

brunoais commented May 9, 2024

I've been using 2.1.8 for a while now. From what I can tell, the CPU % used by tccd has improved significantly. It used to do spikes of 10-12% with a continuous 2-5%, and now it does spikes of 5% with a continuous 2-3% (according to htop).

I still don't know what else it's doing that causes those spikes.

@tuxedoder What is it programmed to do periodically? Maybe every second?

@tuxedoder
Contributor

tuxedoder commented Jun 12, 2024

Sorry for the late reply; I was actually trying to improve the code and gather data.

To answer the question: tccd contains workers that run periodically at different time intervals via onWork(). Here is a list of the workers in 2.1.8 (a minimal sketch of the scheduling pattern follows the lists below).

  • CpuWorker: 10s, set cpu values
  • DisplayBacklightWorker: 3s
  • DisplayRefreshRateWorker: 5s, set display refresh rate (if active in config)
  • FanControlWorker: 1s, set fan speed, read cpu temp
  • PrimeWorker: 10s, get prime status (tray/tcc)
  • StateSwitchWorker: 2s, check state and apply profile
  • TccDBusService: 1.5s
  • WebcamWorker: 2s, get webcam availability

There are a few more, but they aren't always active, only when requested via TCC:

  • CpuPowerWorker: 2s, get cpu power values for dashboard
  • GpuInfoWorker: 2.5s, get igpu/dgpu values for dashboard
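The scheduling pattern described above boils down to one timer per worker. The sketch below is only illustrative; apart from the onWork() hook named above, the class and method names are hypothetical, not tccd's actual worker base class:

```typescript
// Illustrative periodic-worker pattern; everything except onWork() is a made-up name.
abstract class PeriodicWorker {
    private timer?: NodeJS.Timeout;

    // intervalMs: how often onWork() runs, e.g. 1000 ms for the fan control worker above
    constructor(private readonly intervalMs: number) {}

    public start(): void {
        // Every active worker adds periodic wakeups, which is why tccd never sits at a flat 0% CPU.
        this.timer = setInterval(() => this.onWork(), this.intervalMs);
    }

    public stop(): void {
        if (this.timer !== undefined) clearInterval(this.timer);
    }

    // Periodic work: read/write hardware state, check availability, apply profiles, ...
    protected abstract onWork(): void;
}

// Example: a worker that would run once per second, like the fan control listed above.
class ExampleFanWorker extends PeriodicWorker {
    constructor() { super(1000); }
    protected onWork(): void {
        // read temperatures, set fan speeds, ...
    }
}
```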

While I was rewriting the fan control to properly add a new fan API for controlling new devices, I noticed that the usage varies a lot across devices. While reworking the fan code, I also tried to improve performance.

Notable changes to reduce calls between tccd and tuxedo-drivers:

  • tuxedo-io only has cpu0, gpu0 and gpu1; hardware that doesn't exist is no longer checked after init
  • not checking fan speed outside of the dashboard
  • making sure to completely disable a worker if it is globally disabled

Some results:

  • Rebooted, closed TCC in the tray and waited around 10 minutes, since init has higher usage and is not representative, and usage will be higher if the dashboard is shown due to data gathering
  • Results can be a bit lower with longer measurement periods, but 10 minutes is roughly enough
  • Using ps instead of htop, since htop changes the shown value a lot and percentage peaks are not that useful; ps shows an average, which is better
  • watch -n 1 ps -C tccd -o %cpu,%mem,cmd
  • Keep in mind that 100% refers to one core
  • 0.1% differences can be ignored
  • All tests used tuxedo-drivers 4.5.0
  • May or may not represent the final version, since I tested a development version and am still fixing the new fan control, but hopefully nothing major will change
| Device | EC communication | 2.1.10 | 2.1.0 + disabled fan in global settings | 2.1.10 + new fan control | 2.1.10 + new fan control + disabled fan in global settings |
| --- | --- | --- | --- | --- | --- |
| Polaris 15 Gen4 6800H 3060 (on-demand) | WMI | 0.6% | - | 0.6% | - |
| Polaris 15 Gen3 i7-11800H 3060 (on-demand) | WMI | 0.7% | 0.5% | 0.6% | 0.3% |
| Stellaris 16 Gen5 4060 (on-demand) | WMI | 0.6% | - | - | - |
| Pulse 15 Gen 1 4800H | WMI | 0.7% | - | 0.7% | - |
| Pulse 14 Gen 3 7840HS | direct | 0.8% | - | 0.8% | 0.5% |
| IB Pro 16 Gen 7 i7-12700H 3070ti (dgpu) | WMI | 0.9% | 1.0% | 1.0% | 0.6% |
| Gemini 17 Gen1 12900H 3070ti (on-demand) | ACPI | 2.1% | 0.9% | 1.6% | 0.4% |
| IBS17 Gen7 i5-1240P | ACPI | 2.5% | 1.0% | 1.2% | 0.4% |
| IBS15 Gen8 i5-1340P | ACPI | 3.6% | - | 1.7% | - |
| Aura 15 Gen2 5300U | ACPI | 4.2% | 1.4% | 1.8% | 0.5% |
sudo NODE_PATH="./dist/tuxedo-control-center/data/service:${NODE_PATH}" node --prof ./dist/tuxedo-control-center/service-app/service-app/main.js --start
node --prof-process isolate* > processed.txt
node profiler with 2.1.10 + new fan control on the IBS17 Gen7 (ACPI)
Statistical profiling result from isolate-0x47366d0-6151-v8.log, (6975 ticks, 0 unaccounted, 0 excluded).

 [Shared libraries]:
   ticks  total  nonlib   name
   4551   65.2%          /home/test/.nvm/versions/node/v14.21.3/bin/node
     95    1.4%          /usr/lib/x86_64-linux-gnu/libc.so.6
      2    0.0%          [vdso]
      1    0.0%          /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30

 [JavaScript]:
   ticks  total  nonlib   name
     23    0.3%    1.0%  LazyCompile: *join path.js:1142:7
      8    0.1%    0.3%  LazyCompile: *listOnTimeout internal/timers.js:505:25
      4    0.1%    0.2%  LazyCompile: *__awaiter /tuxedo-control-center/node_modules/tslib/tslib.js:161:26
      2    0.0%    0.1%  LazyCompile: *existsSync fs.js:258:20
      1    0.0%    0.0%  LazyCompile: *readFileSync fs.js:391:22
      1    0.0%    0.0%  LazyCompile: *percolateDown internal/priority_queue.js:49:16
      1    0.0%    0.0%  LazyCompile: *openSync fs.js:489:18
      1    0.0%    0.0%  LazyCompile: *normalizeString path.js:59:25
      1    0.0%    0.0%  LazyCompile: *getOptions internal/fs/utils.js:305:20
      1    0.0%    0.0%  LazyCompile: *dirname path.js:1247:10
      1    0.0%    0.0%  LazyCompile: *Module._nodeModulePaths internal/modules/cjs/loader.js:625:37
      1    0.0%    0.0%  LazyCompile: *LogicalCpuController /tuxedo-control-center/dist/tuxedo-control-center/service-app/common/classes/LogicalCpuController.js:32:16

 [C++]:
   ticks  total  nonlib   name
   1998   28.6%   85.9%  epoll_pwait@@GLIBC_2.6
    129    1.8%    5.5%  __read@@GLIBC_2.2.5
     54    0.8%    2.3%  __open@@GLIBC_2.2.5
     32    0.5%    1.4%  access@@GLIBC_2.2.5
     21    0.3%    0.9%  syscall@@GLIBC_2.2.5
     13    0.2%    0.6%  __write@@GLIBC_2.2.5
     10    0.1%    0.4%  __libc_malloc@@GLIBC_2.2.5
      4    0.1%    0.2%  __pthread_mutex_lock@GLIBC_2.2.5
      3    0.0%    0.1%  fwrite@@GLIBC_2.2.5
      3    0.0%    0.1%  __pthread_mutex_unlock@GLIBC_2.2.5
      3    0.0%    0.1%  __lll_lock_wait_private@@GLIBC_PRIVATE
      2    0.0%    0.1%  pthread_cond_signal@@GLIBC_2.3.2
      1    0.0%    0.0%  std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > > const& std::use_facet<std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > > >(std::locale const&)@@GLIBCXX_3.4
      1    0.0%    0.0%  std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)@@GLIBCXX_3.4.9
      1    0.0%    0.0%  sigaddset@@GLIBC_2.2.5
      1    0.0%    0.0%  pthread_testcancel@@GLIBC_2.34
      1    0.0%    0.0%  cfree@GLIBC_2.2.5
      1    0.0%    0.0%  __pthread_getspecific@GLIBC_2.2.5
      1    0.0%    0.0%  __mprotect@@GLIBC_PRIVATE
      1    0.0%    0.0%  __errno_location@@GLIBC_2.2.5
      1    0.0%    0.0%  _IO_file_xsputn@@GLIBC_2.2.5

 [Summary]:
   ticks  total  nonlib   name
     45    0.6%    1.9%  JavaScript
   2281   32.7%   98.1%  C++
    121    1.7%    5.2%  GC
   4649   66.7%          Shared libraries

 [C++ entry points]:
   ticks    cpp   total   name
    129   48.3%    1.8%  __read@@GLIBC_2.2.5
     54   20.2%    0.8%  __open@@GLIBC_2.2.5
     32   12.0%    0.5%  access@@GLIBC_2.2.5
     21    7.9%    0.3%  syscall@@GLIBC_2.2.5
      9    3.4%    0.1%  __write@@GLIBC_2.2.5
      9    3.4%    0.1%  __libc_malloc@@GLIBC_2.2.5
      3    1.1%    0.0%  __pthread_mutex_lock@GLIBC_2.2.5
      3    1.1%    0.0%  __lll_lock_wait_private@@GLIBC_PRIVATE
      2    0.7%    0.0%  fwrite@@GLIBC_2.2.5
      1    0.4%    0.0%  std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > > const& std::use_facet<std::num_get<char, std::istreambuf_iterator<char, std::char_traits<char> > > >(std::locale const&)@@GLIBCXX_3.4
      1    0.4%    0.0%  std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)@@GLIBCXX_3.4.9
      1    0.4%    0.0%  cfree@GLIBC_2.2.5
      1    0.4%    0.0%  __pthread_mutex_unlock@GLIBC_2.2.5
      1    0.4%    0.0%  _IO_file_xsputn@@GLIBC_2.2.5
...
node profiler with 2.1.10 + new fan control on the Pulse14 Gen3 (direct)
Statistical profiling result from isolate-0x508e6d0-7377-v8.log, (6094 ticks, 0 unaccounted, 0 excluded).

 [Shared libraries]:
   ticks  total  nonlib   name
   2007   32.9%          /home/tux/.nvm/versions/node/v14.21.3/bin/node
     95    1.6%          /usr/lib/x86_64-linux-gnu/libc.so.6
      2    0.0%          /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30
      1    0.0%          [vdso]

 [JavaScript]:
   ticks  total  nonlib   name
     10    0.2%    0.3%  LazyCompile: *join path.js:1142:7
      8    0.1%    0.2%  LazyCompile: *listOnTimeout internal/timers.js:505:25
      3    0.0%    0.1%  LazyCompile: *__awaiter /home/tux/Downloads/tuxedo-control-center/node_modules/tslib/tslib.js:161:26
      2    0.0%    0.1%  LazyCompile: *readFileSync fs.js:391:22
      1    0.0%    0.0%  LazyCompile: *slice buffer.js:1131:40
      1    0.0%    0.0%  LazyCompile: *resolve path.js:1067:10
      1    0.0%    0.0%  LazyCompile: *normalizeString path.js:59:25
      1    0.0%    0.0%  LazyCompile: *fulfilled /home/tux/Downloads/tuxedo-control-center/node_modules/tslib/tslib.js:164:31
      1    0.0%    0.0%  LazyCompile: *Module._nodeModulePaths internal/modules/cjs/loader.js:625:37
      1    0.0%    0.0%  LazyCompile: *<anonymous> /home/tux/Downloads/tuxedo-control-center/node_modules/tslib/tslib.js:163:50

 [C++]:
   ticks  total  nonlib   name
   3531   57.9%   88.5%  epoll_pwait@@GLIBC_2.6
    132    2.2%    3.3%  __read@@GLIBC_2.2.5
    120    2.0%    3.0%  __open@@GLIBC_2.2.5
     64    1.1%    1.6%  access@@GLIBC_2.2.5
     42    0.7%    1.1%  syscall@@GLIBC_2.2.5
     35    0.6%    0.9%  __write@@GLIBC_2.2.5
      7    0.1%    0.2%  __libc_malloc@@GLIBC_2.2.5
      6    0.1%    0.2%  pthread_cond_signal@@GLIBC_2.3.2
      3    0.0%    0.1%  epoll_ctl@@GLIBC_2.3.2
      3    0.0%    0.1%  cfree@GLIBC_2.2.5
      3    0.0%    0.1%  __pthread_mutex_lock@GLIBC_2.2.5
      3    0.0%    0.1%  __mprotect@@GLIBC_PRIVATE
      3    0.0%    0.1%  __mmap@@GLIBC_PRIVATE
      2    0.0%    0.1%  __pthread_getspecific@GLIBC_2.2.5
      1    0.0%    0.0%  std::ostream::operator<<(int)@@GLIBCXX_3.4
      1    0.0%    0.0%  std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)@@GLIBCXX_3.4.9
      1    0.0%    0.0%  sigaddset@@GLIBC_2.2.5
      1    0.0%    0.0%  __pthread_mutex_unlock@GLIBC_2.2.5
      1    0.0%    0.0%  __lll_lock_wait_private@@GLIBC_PRIVATE
      1    0.0%    0.0%  _IO_file_xsputn@@GLIBC_2.2.5

 [Summary]:
   ticks  total  nonlib   name
     29    0.5%    0.7%  JavaScript
   3960   65.0%   99.3%  C++
    151    2.5%    3.8%  GC
   2105   34.5%          Shared libraries

 [C++ entry points]:
   ticks    cpp   total   name
    124   32.6%    2.0%  __read@@GLIBC_2.2.5
    120   31.6%    2.0%  __open@@GLIBC_2.2.5
     64   16.8%    1.1%  access@@GLIBC_2.2.5
     42   11.1%    0.7%  syscall@@GLIBC_2.2.5
     20    5.3%    0.3%  __write@@GLIBC_2.2.5
      4    1.1%    0.1%  __libc_malloc@@GLIBC_2.2.5
      1    0.3%    0.0%  std::ostream::operator<<(int)@@GLIBCXX_3.4
      1    0.3%    0.0%  std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)@@GLIBCXX_3.4.9
      1    0.3%    0.0%  cfree@GLIBC_2.2.5
      1    0.3%    0.0%  __pthread_mutex_lock@GLIBC_2.2.5
      1    0.3%    0.0%  __mprotect@@GLIBC_PRIVATE
      1    0.3%    0.0%  __lll_lock_wait_private@@GLIBC_PRIVATE
...
node profiler with 2.1.10 + new fan control on the Pulse15 Gen1 (WMI)
Testing v8 version different from logging version
Statistical profiling result from isolate-0x56d467bd4030-6065-v8.log, (9496 ticks, 1 unaccounted, 0 excluded).

 [Shared libraries]:
   ticks  total  nonlib   name
   3445   36.3%          /usr/lib/x86_64-linux-gnu/libnode.so.72
    242    2.5%          /usr/lib/x86_64-linux-gnu/libc.so.6
     92    1.0%          [vdso]
      9    0.1%          /usr/lib/x86_64-linux-gnu/libuv.so.1.0.0
      2    0.0%          /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.30

 [JavaScript]:
   ticks  total  nonlib   name
     11    0.1%    0.2%  LazyCompile: *join path.js:1033:7
      6    0.1%    0.1%  LazyCompile: *parseSignature /tuxedo-control-center/node_modules/dbus-next/lib/signature.js:13:25
      6    0.1%    0.1%  LazyCompile: *listOnTimeout internal/timers.js:502:25
      6    0.1%    0.1%  LazyCompile: *<anonymous> /tuxedo-control-center/node_modules/@nornagon/put/index.js:26:45
      5    0.1%    0.1%  LazyCompile: *write /tuxedo-control-center/node_modules/dbus-next/lib/marshall.js:39:16
...

 [C++]:
   ticks  total  nonlib   name
   2975   31.3%   52.1%  epoll_pwait@@GLIBC_2.6
   1131   11.9%   19.8%  node::SyncProcessRunner::Spawn(v8::FunctionCallbackInfo<v8::Value> const&)
    566    6.0%    9.9%  __read@@GLIBC_2.2.5
    182    1.9%    3.2%  __open@@GLIBC_2.2.5
    166    1.7%    2.9%  node::contextify::ContextifyContext::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
    124    1.3%    2.2%  access@@GLIBC_2.2.5
     93    1.0%    1.6%  syscall@@GLIBC_2.2.5
     64    0.7%    1.1%  node::native_module::NativeModuleEnv::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
     62    0.7%    1.1%  __write@@GLIBC_2.2.5
...

 [Summary]:
   ticks  total  nonlib   name
     83    0.9%    1.5%  JavaScript
   5622   59.2%   98.5%  C++
    333    3.5%    5.8%  GC
   3790   39.9%          Shared libraries
      1    0.0%          Unaccounted

 [C++ entry points]:
   ticks    cpp   total   name
   1131   44.4%   11.9%  node::SyncProcessRunner::Spawn(v8::FunctionCallbackInfo<v8::Value> const&)
    558   21.9%    5.9%  __read@@GLIBC_2.2.5
    182    7.1%    1.9%  __open@@GLIBC_2.2.5
    166    6.5%    1.7%  node::contextify::ContextifyContext::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
    124    4.9%    1.3%  access@@GLIBC_2.2.5
     93    3.6%    1.0%  syscall@@GLIBC_2.2.5
     64    2.5%    0.7%  node::native_module::NativeModuleEnv::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
     51    2.0%    0.5%  pthread_sigmask@GLIBC_2.2.5
     42    1.6%    0.4%  node::fs::Access(v8::FunctionCallbackInfo<v8::Value> const&)
     37    1.5%    0.4%  __write@@GLIBC_2.2.5
     17    0.7%    0.2%  void node::StreamBase::JSMethod<&node::StreamBase::WriteBuffer>(v8::FunctionCallbackInfo<v8::Value> const&)
...
sudo strace -c -p PID
`strace` with 2.1.10 on the Pulse 14 Gen 3 (direct)
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 31,34    1,006404         481      2091           write
 17,49    0,561444           9     60397           read
 11,27    0,361744        1955       185           wait4
 11,16    0,358305          10     33586      1666 openat
  7,74    0,248437        1357       183           clone
  6,65    0,213494           6     30891       180 access
  4,14    0,133035           3     34582       706 close
  3,48    0,111847          21      5165           epoll_wait
  3,17    0,101814           3     29478           statx
  1,18    0,037997          17      2218           getdents64
...
------ ----------- ----------- --------- --------- ----------------
100,00    3,210878          15    206422      3337 total
`strace` with 2.1.10 + new fan control on the Pulse 14 Gen 3 (direct)
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45,85    1,259927          71     17578           epoll_wait
 14,19    0,389899        2107       185           wait4
 10,75    0,295390        1614       183           clone
  8,13    0,223493           4     45318           read
  7,55    0,207575           9     20885      1667 openat
  3,82    0,104837           5     17581       180 access
  3,05    0,083723           3     21881       707 close
  2,27    0,062369           3     18590           statx
  1,90    0,052285           5      9395       717 futex
  0,53    0,014447          14      1010           getdents64
...
------ ----------- ----------- --------- --------- ----------------
100,00    2,747916          17    159236      3606 total
`strace` with 2.1.10 on the Polaris 15 Gen 3 (WMI)
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45,72    0,758650          58     12860           ioctl
 11,05    0,183418         980       187           clone
 10,44    0,173252           7     21883           read
 10,36    0,171936         919       187           wait4
  7,62    0,126432           8     15219           openat
  4,76    0,078911           4     17219           close
  3,98    0,065987          12      5089           epoll_wait
  1,90    0,031471           3     10188       122 access
  1,33    0,022043           2     10174           statx
...
------ ----------- ----------- --------- --------- ----------------
100,00    1,659417          16    100732       628 total
`strace` with 2.1.10 + new fan control on the Polaris 15 Gen 3 (WMI)
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45,23    0,741612          76      9633           ioctl
 11,89    0,194985        1053       185           clone
 10,38    0,170218         920       185           wait4
 10,27    0,168314           7     21827           read
  7,25    0,118895           8     13916           openat
  4,45    0,072996          13      5333           epoll_wait
  4,39    0,072046           4     15892           close
  1,78    0,029215           2     10188       122 access
  1,29    0,021175           2     10160           statx
...
------ ----------- ----------- --------- --------- ----------------
100,00    1,639525          17     95124       636 total
`strace` with 2.1.10 on the IBS17 Gen 7 (ACPI)
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 88,65    9,645322         697     13822           ioctl
  2,65    0,288388          18     15317           openat
  1,62    0,176322           8     20357           read
  1,57    0,171265           9     17177           close
  1,54    0,167598         946       177           wait4
  1,33    0,144969         833       174           clone
  1,20    0,130460          26      4927           epoll_wait
...
------ ----------- ----------- --------- --------- ----------------
100,00   10,879853         111     98005       554 total
`strace` with 2.1.10 + new fan control on the IBS17 Gen 7 (ACPI)
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 71,52    2,869181         524      5469           ioctl
  4,82    0,193400           9     21091           read
  4,68    0,187833          15     12323           openat
  4,60    0,184542        1013       182           wait4
  3,90    0,156585         865       181           clone
  3,79    0,152161          28      5407           epoll_wait
  2,51    0,100618           7     14259           close
  1,19    0,047831           4      9854       118 access
...
------ ----------- ----------- --------- --------- ----------------
100,00    4,011514          46     86073       586 total
`strace` with 2.1.10 on the Aura 15 Gen2 (ACPI)
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 90,80   19,953861        1459     13675           ioctl
  3,45    0,758776        4335       175           wait4
  1,37    0,301187          19     15488       448 openat
  1,34    0,294102        1709       172           clone
  0,89    0,196332           9     19896           read
...
------ ----------- ----------- --------- --------- ----------------
100,00   21,976362         229     95962      1125 total
`strace` with 2.1.10 + new fan control on the Aura 15 Gen2 (ACPI)
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 79,13    7,214409        1404      5136           ioctl
  8,22    0,749808        4384       171           wait4
  3,23    0,294389        1731       170           clone
  2,46    0,224144          18     12053       448 openat
  2,13    0,194541           9     19829           read
  1,28    0,116887          23      4954           epoll_wait
  1,13    0,102992           7     13421           close
  0,95    0,086740          10      8345       168 access
...
------ ----------- ----------- --------- --------- ----------------
100,00    9,117462         113     80191      1126 total
`strace` with 2.1.10 on the Gemini 17 Gen1 (ACPI)
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 85,03    8,072399         548     14729           ioctl
  3,14    0,298045        1000       298           wait4
  3,05    0,289374          15     18683           openat
  2,22    0,210680           7     27162         1 read
  1,76    0,166863           7     21120           close
  1,56    0,147789          24      5956           epoll_wait
  1,54    0,146367         602       243           clone
  0,49    0,046258           3     12421        60 access
...
------ ----------- ----------- --------- --------- ----------------
100,00    9,493452          77    122010       591 total
`strace` with 2.1.10 + new fan control on the Gemini 17 Gen1 (ACPI)
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 79,42    5,425086         598      9058           ioctl
  4,28    0,292014        1017       287           wait4
  3,68    0,251283          15     15948           openat
  3,22    0,219673           8     26643         5 read
  2,55    0,174158         731       238           clone
  2,17    0,148552          26      5679         1 epoll_wait
  2,02    0,137764           7     18332           close
  0,75    0,050959           4     12214        59 access
...
------ ----------- ----------- --------- --------- ----------------
100,00    6,830562          62    109670       646 total

These mainly show that the usage comes from C++ and ioctl, but they also show the improvement from the new fan control.

Further analysis with perf was interesting.

sudo perf record -g -p PID
sudo perf report -G -i perf.data
`perf` with 2.1.10 on the IBS17 Gen 7 (ACPI)
Samples: 83K of event 'cpu_core/cycles:P/', Event count (approx.): 14475920526
  Children      Self  Command  Shared Object         Symbol
-   84,36%     0,03%  tccd     [kernel.kallsyms]     [k] entry_SYSCALL_64_after_hwframe                                            ◆
   - 84,34% entry_SYSCALL_64_after_hwframe
      - 84,31% do_syscall_64
         - 83,68% x64_sys_call
            - 74,60% __x64_sys_ioctl
               - 74,50% fop_ioctl
                  - 74,40% clevo_evaluate_method
                     - 74,39% clevo_acpi_interface_method_call
                        - clevo_acpi_evaluate
                           - 74,29% acpi_evaluate_dsm <- lowest level in tuxedo-drivers, this is a kernel function
                              - 74,26% acpi_evaluate_object
                                 - 73,83% acpi_ns_evaluate
                                    - 73,35% acpi_ps_execute_method
                                       - 72,88% acpi_ps_parse_aml
                                          - 71,17% acpi_ps_parse_loop
                                             - 47,10% acpi_ds_exec_end_op
                                                + 21,99% acpi_ex_opcode_1A_1T_1R
                                                + 7,74% acpi_ds_evaluate_name_path
                                                + 4,69% acpi_ds_create_operands
                                                + 2,45% acpi_ds_delete_result_if_not_used
                                                + 2,18% acpi_ex_resolve_operands
                                                + 1,94% acpi_ds_clear_operands
                                                  0,87% acpi_ds_get_predicate_value
                                                + 0,87% acpi_ds_eval_data_object_operands
                                                  0,57% acpi_ds_result_push
                                             + 7,41% acpi_ps_get_arguments.constprop.0
                                             + 6,86% acpi_ps_create_op
                                             + 5,44% acpi_ps_complete_op
                                             + 1,08% acpi_ps_push_scope
                                               0,56% acpi_ut_status_exit
                                            0,82% acpi_ds_terminate_control_method
            + 2,35% __x64_sys_clone
            + 2,22% __x64_sys_read
            + 1,98% __x64_sys_execve
            + 1,25% __x64_sys_openat
         + 0,56% syscall_exit_to_user_mode
+   84,31%     0,01%  tccd     [kernel.kallsyms]     [k] do_syscall_64
+   83,70%     0,07%  tccd     [kernel.kallsyms]     [k] x64_sys_call
+   75,67%     0,00%  tccd     tccd                  [.] 0x000000000089fb7c
+   75,59%     0,02%  tccd     TuxedoIOAPI.node      [.] Napi::details::CallbackData<Napi::Boolean (*)(Napi::CallbackInfo const&), Napi::Boolean>::Wrapper(napi_env__*, napi_callback_info__*)  ▒
+   74,87%     0,00%  tccd     libc.so.6             [.] 0x00007669d311a94f
+   74,61%     0,02%  tccd     [kernel.kallsyms]     [k] __x64_sys_ioctl
+   74,51%     0,02%  tccd     [tuxedo_io]           [k] fop_ioctl
+   74,46%     0,02%  tccd     [kernel.kallsyms]     [k] acpi_evaluate_object
+   74,41%     0,02%  tccd     [tuxedo_keyboard]     [k] clevo_evaluate_method
+   74,39%     0,03%  tccd     [clevo_acpi]          [k] clevo_acpi_interface_method_call
+   74,36%     0,02%  tccd     [clevo_acpi]          [k] clevo_acpi_evaluate
+   74,29%     0,00%  tccd     [kernel.kallsyms]     [k] acpi_evaluate_dsm
+   74,02%     0,02%  tccd     [kernel.kallsyms]     [k] acpi_ns_evaluate
+   73,51%     0,02%  tccd     [kernel.kallsyms]     [k] acpi_ps_execute_method
+   73,03%     0,04%  tccd     [kernel.kallsyms]     [k] acpi_ps_parse_aml
+   71,43%     0,48%  tccd     [kernel.kallsyms]     [k] acpi_ps_parse_loop
+   57,53%     0,02%  tccd     TuxedoIOAPI.node      [.] SetFanSpeedPercent(Napi::CallbackInfo const&)
+   47,58%     0,57%  tccd     [kernel.kallsyms]     [k] acpi_ds_exec_end_op
+   26,24%     0,06%  tccd     [kernel.kallsyms]     [k] acpi_ex_field_datum_io
+   26,13%     0,05%  tccd     [kernel.kallsyms]     [k] acpi_ex_access_region
+   25,99%     0,08%  tccd     [kernel.kallsyms]     [k] acpi_ev_address_space_dispatch
+   25,36%     0,07%  tccd     [kernel.kallsyms]     [k] acpi_ec_space_handler
+   25,24%     0,03%  tccd     [kernel.kallsyms]     [k] acpi_ec_transaction
+   25,14%     0,09%  tccd     [kernel.kallsyms]     [k] acpi_ec_transaction_unlocked
+   22,05%     0,19%  tccd     [kernel.kallsyms]     [k] acpi_ex_opcode_1A_1T_1R
+   21,61%     0,07%  tccd     [kernel.kallsyms]     [k] acpi_ex_store
+   21,06%     0,08%  tccd     [kernel.kallsyms]     [k] acpi_ex_store_object_to_node
+   20,77%     0,08%  tccd     [kernel.kallsyms]     [k] acpi_ex_write_data_to_field
+   20,25%     0,04%  tccd     [kernel.kallsyms]     [k] acpi_ex_insert_into_field
+   20,19%     0,03%  tccd     [kernel.kallsyms]     [k] acpi_ex_write_with_update_rule
+   16,76%     0,00%  tccd     [kernel.kallsyms]     [k] asm_common_interrupt
+   16,76%     0,00%  tccd     [kernel.kallsyms]     [k] common_interrupt
+   16,75%     0,00%  tccd     [kernel.kallsyms]     [k] __common_interrupt
+   16,75%     0,00%  tccd     [kernel.kallsyms]     [k] handle_fasteoi_irq
+   16,72%     0,00%  tccd     [kernel.kallsyms]     [k] handle_irq_event
+   16,70%     0,00%  tccd     [kernel.kallsyms]     [k] __handle_irq_event_percpu
+   16,70%     0,00%  tccd     [kernel.kallsyms]     [k] acpi_ev_sci_xrupt_handler
+   16,70%     0,00%  tccd     [kernel.kallsyms]     [k] acpi_irq
+   16,16%     0,03%  tccd     [kernel.kallsyms]     [k] acpi_ev_gpe_detect
+   15,98%     0,06%  tccd     [kernel.kallsyms]     [k] acpi_ev_detect_gpe
+   15,67%     0,02%  tccd     TuxedoIOAPI.node      [.] GetFanTemperature(Napi::CallbackInfo const&)
+   15,45%    15,45%  tccd     [kernel.kallsyms]     [k] advance_transaction
+   13,09%    13,07%  tccd     [kernel.kallsyms]     [k] acpi_os_read_port
+   12,64%     0,07%  tccd     [kernel.kallsyms]     [k] acpi_hw_gpe_read
+    9,42%     0,55%  tccd     [kernel.kallsyms]     [k] _raw_spin_unlock_irqrestore
+    8,11%     0,16%  tccd     [kernel.kallsyms]     [k] acpi_ex_resolve_to_value
+    7,78%     0,03%  tccd     [kernel.kallsyms]     [k] acpi_ds_evaluate_name_path
+    7,64%     0,52%  tccd     [kernel.kallsyms]     [k] acpi_ps_get_arguments.constprop.0
+    7,19%     0,70%  tccd     [kernel.kallsyms]     [k] acpi_ps_create_op
+    6,60%     0,05%  tccd     [kernel.kallsyms]     [k] acpi_ex_resolve_node_to_value
+    6,41%     0,06%  tccd     [kernel.kallsyms]     [k] acpi_ex_read_data_from_field
+    6,15%     0,03%  tccd     [kernel.kallsyms]     [k] acpi_ex_extract_from_field
+    5,89%     0,00%  tccd     [unknown]             [.] 0000000000000000
+    5,77%     0,28%  tccd     [kernel.kallsyms]     [k] acpi_ns_lookup
+    5,61%     0,35%  tccd     [kernel.kallsyms]     [k] acpi_ps_complete_op
+    5,31%     0,37%  tccd     [kernel.kallsyms]     [k] acpi_ds_create_operand
+    5,29%     0,15%  tccd     [kernel.kallsyms]     [k] acpi_ns_search_and_enter
+    5,23%     0,56%  tccd     [kernel.kallsyms]     [k] acpi_ut_update_object_reference
+    4,91%     0,23%  tccd     [kernel.kallsyms]     [k] acpi_ds_create_operands
+    4,64%     0,54%  tccd     [kernel.kallsyms]     [k] acpi_ut_update_ref_count.part.0
+    4,63%     3,02%  tccd     [kernel.kallsyms]     [k] acpi_ns_search_one_scope
...
`perf` with 2.1.10 + new fan control on the IBS17 Gen 7 (ACPI)
Samples: 38K of event 'cpu_core/cycles:P/', Event count (approx.): 6962670981
  Children      Self  Command  Shared Object         Symbol
-   67,22%     0,01%  tccd     [kernel.kallsyms]     [k] entry_SYSCALL_64_after_hwframe
   - 67,21% entry_SYSCALL_64_after_hwframe
      - 67,18% do_syscall_64
         - 66,31% x64_sys_call
            - 48,91% __x64_sys_ioctl
               - 48,82% fop_ioctl
                  - 48,74% clevo_evaluate_method
                     - 48,72% clevo_acpi_interface_method_call
                        - 48,70% clevo_acpi_evaluate
                           - 48,64% acpi_evaluate_dsm <- lowest level in tuxedo-drivers, this is a kernel function
                              - 48,63% acpi_evaluate_object
                                 - 48,30% acpi_ns_evaluate
                                    - 48,05% acpi_ps_execute_method
                                       - 47,75% acpi_ps_parse_aml
                                          - 46,57% acpi_ps_parse_loop
                                             - 29,99% acpi_ds_exec_end_op
                                                + 13,58% acpi_ex_opcode_1A_1T_1R
                                                + 4,92% acpi_ds_evaluate_name_path
                                                + 3,07% acpi_ds_create_operands
                                                + 1,59% acpi_ds_delete_result_if_not_used
                                                + 1,33% acpi_ex_resolve_operands
                                                + 1,27% acpi_ds_clear_operands
                                                  0,62% acpi_ds_eval_data_object_operands
                                                  0,59% acpi_ds_get_predicate_value
                                             + 5,52% acpi_ps_get_arguments.constprop.0
                                             + 4,38% acpi_ps_create_op
                                             + 3,70% acpi_ps_complete_op
                                               0,65% acpi_ps_push_scope
                                            0,55% acpi_ds_terminate_control_method
            + 4,84% __x64_sys_clone
            + 4,79% __x64_sys_read
            + 3,95% __x64_sys_execve
            + 1,69% __x64_sys_openat
            + 0,57% __x64_sys_access
              0,53% __x64_sys_epoll_wait
         + 0,72% syscall_exit_to_user_mode
+   67,19%     0,02%  tccd     [kernel.kallsyms]     [k] do_syscall_64
+   66,33%     0,10%  tccd     [kernel.kallsyms]     [k] x64_sys_call
+   49,85%     0,00%  tccd     tccd                  [.] 0x000000000089fb7c
+   49,82%     0,01%  tccd     TuxedoIOAPI.node      [.] Napi::details::CallbackData<Napi::Boolean (*)(Napi::CallbackInfo const&), Napi::Boolean>::Wrapper(napi_env__*, napi_callback_info__*)
+   49,14%     0,00%  tccd     libc.so.6             [.] 0x00006ffcc631a94f
+   49,06%     0,01%  tccd     [kernel.kallsyms]     [k] acpi_evaluate_object
+   48,92%     0,03%  tccd     [kernel.kallsyms]     [k] __x64_sys_ioctl
+   48,83%     0,02%  tccd     [tuxedo_io]           [k] fop_ioctl
+   48,74%     0,02%  tccd     [tuxedo_keyboard]     [k] clevo_evaluate_method
+   48,73%     0,01%  tccd     [kernel.kallsyms]     [k] acpi_ns_evaluate
+   48,73%     0,02%  tccd     [clevo_acpi]          [k] clevo_acpi_interface_method_call
+   48,70%     0,02%  tccd     [clevo_acpi]          [k] clevo_acpi_evaluate
+   48,64%     0,00%  tccd     [kernel.kallsyms]     [k] acpi_evaluate_dsm
+   48,39%     0,02%  tccd     [kernel.kallsyms]     [k] acpi_ps_execute_method
+   48,05%     0,03%  tccd     [kernel.kallsyms]     [k] acpi_ps_parse_aml
+   46,90%     0,32%  tccd     [kernel.kallsyms]     [k] acpi_ps_parse_loop
+   33,36%     0,01%  tccd     TuxedoIOAPI.node      [.] SetFanSpeedPercent(Napi::CallbackInfo const&)
+   30,43%     0,46%  tccd     [kernel.kallsyms]     [k] acpi_ds_exec_end_op
+   15,71%     0,05%  tccd     [kernel.kallsyms]     [k] acpi_ex_field_datum_io
+   15,62%     0,04%  tccd     [kernel.kallsyms]     [k] acpi_ex_access_region
+   15,53%     0,12%  tccd     [kernel.kallsyms]     [k] acpi_ev_address_space_dispatch
+   14,97%     0,05%  tccd     [kernel.kallsyms]     [k] acpi_ec_space_handler
+   14,88%     0,05%  tccd     [kernel.kallsyms]     [k] acpi_ec_transaction
+   14,79%     0,07%  tccd     [kernel.kallsyms]     [k] acpi_ec_transaction_unlocked
+   13,62%     0,11%  tccd     [kernel.kallsyms]     [k] acpi_ex_opcode_1A_1T_1R
+   13,35%     0,07%  tccd     [kernel.kallsyms]     [k] acpi_ex_store
+   13,00%     0,08%  tccd     [kernel.kallsyms]     [k] acpi_ex_store_object_to_node
+   12,73%     0,00%  tccd     [unknown]             [.] 0000000000000000
+   12,72%     0,06%  tccd     [kernel.kallsyms]     [k] acpi_ex_write_data_to_field
+   12,30%     0,04%  tccd     [kernel.kallsyms]     [k] acpi_ex_insert_into_field
+   12,25%     0,02%  tccd     [kernel.kallsyms]     [k] acpi_ex_write_with_update_rule
+   12,25%     0,03%  tccd     TuxedoIOAPI.node      [.] GetFanTemperature(Napi::CallbackInfo const&)
+    9,24%     0,00%  tccd     tccd                  [.] 0x0000000001704762
+    8,89%     8,89%  tccd     [kernel.kallsyms]     [k] advance_transaction
+    8,10%     0,00%  tccd     tccd                  [.] 0x0000000000b2783f
+    8,10%     0,00%  tccd     tccd                  [.] 0x0000000001702258
+    8,01%     0,00%  tccd     tccd                  [.] 0x000000000170247a
+    5,90%     0,00%  tccd     [kernel.kallsyms]     [k] asm_common_interrupt
+    5,90%     0,00%  tccd     [kernel.kallsyms]     [k] common_interrupt
+    5,90%     0,00%  tccd     [kernel.kallsyms]     [k] __common_interrupt
+    5,90%     0,00%  tccd     [kernel.kallsyms]     [k] handle_fasteoi_irq
+    5,88%     0,00%  tccd     [kernel.kallsyms]     [k] handle_irq_event
+    5,87%     0,00%  tccd     [kernel.kallsyms]     [k] __handle_irq_event_percpu
+    5,87%     0,00%  tccd     [kernel.kallsyms]     [k] acpi_irq
+    5,87%     0,00%  tccd     [kernel.kallsyms]     [k] acpi_ev_sci_xrupt_handler
+    5,75%     0,39%  tccd     [kernel.kallsyms]     [k] acpi_ps_get_arguments.constprop.0
+    5,65%     0,01%  tccd     [kernel.kallsyms]     [k] acpi_ev_gpe_detect
+    5,60%     0,02%  tccd     [kernel.kallsyms]     [k] acpi_ev_detect_gpe
+    4,99%     0,03%  tccd     [kernel.kallsyms]     [k] acpi_ds_evaluate_name_path
...
`perf` with 2.1.10 + new fan control on the Pulse 14 Gen 3 (direct)
Samples: 26K of event 'cycles:P', Event count (approx.): 10760818545
  Children      Self  Command  Shared Object         Symbol
-   59,37%     0,15%  tccd     [kernel.kallsyms]     [k] entry_SYSCALL_64_after_hwframe
   - 59,23% entry_SYSCALL_64_after_hwframe
      - 58,93% do_syscall_64
         - 56,59% x64_sys_call
            - 24,51% __x64_sys_write
               - 24,47% ksys_write
                  - 24,43% vfs_write
                     - 23,92% kernfs_fop_write_iter
                        - 23,75% sysfs_kf_write
                           - 23,70% dev_attr_store
                              - 12,17% fan2_pwm_store
                                 - 12,06% nb05_write_ec_ram <- writing directly into EC with tuxedo-drivers
                                      0,85% mutex_lock
                              - 11,42% fan1_pwm_store
                                 - 11,26% nb05_write_ec_ram <- writing directly into EC with tuxedo-drivers
                                      0,81% mutex_lock
            - 9,28% __x64_sys_clone
                 __do_sys_clone
               + kernel_clone
            - 5,90% __x64_sys_openat
               + 5,82% do_sys_openat2
            - 4,84% __x64_sys_execve
               + 4,84% do_execveat_common.isra.0
            - 3,98% __x64_sys_read
               + 3,94% ksys_read
            - 2,37% __x64_sys_access
               + 2,36% do_faccessat
            - 1,81% __x64_sys_epoll_wait
               + 1,73% do_epoll_wait
            - 1,69% __x64_sys_futex
               + 1,66% do_futex
            - 0,80% __x64_sys_statx
                 0,54% do_statx
         + 2,01% syscall_exit_to_user_mode
+   59,13%     0,25%  tccd     [kernel.kallsyms]     [k] do_syscall_64
+   56,80%     0,27%  tccd     [kernel.kallsyms]     [k] x64_sys_call
+   24,64%     0,01%  tccd     libc.so.6             [.] __GI___libc_write
+   24,53%     0,00%  tccd     [kernel.kallsyms]     [k] __x64_sys_write
+   24,47%     0,00%  tccd     [kernel.kallsyms]     [k] ksys_write
+   24,46%     0,02%  tccd     [kernel.kallsyms]     [k] vfs_write
+   23,92%     0,01%  tccd     [kernel.kallsyms]     [k] kernfs_fop_write_iter
+   23,92%    23,75%  tccd     [kernel.kallsyms]     [k] nb05_write_ec_ram
+   23,75%     0,00%  tccd     [kernel.kallsyms]     [k] sysfs_kf_write
+   23,70%     0,02%  tccd     [kernel.kallsyms]     [k] dev_attr_store
+   16,53%     0,00%  tccd     [unknown]             [.] 0000000000000000
+   12,17%     0,01%  tccd     [kernel.kallsyms]     [k] fan2_pwm_store
+   11,42%     0,00%  tccd     [kernel.kallsyms]     [k] fan1_pwm_store
+   10,08%     0,00%  tccd     tccd                  [.] 0x0000000001704762
+    9,32%     0,00%  tccd     libc.so.6             [.] _Fork
+    9,28%     0,00%  tccd     [kernel.kallsyms]     [k] __x64_sys_clone
+    9,28%     0,00%  tccd     [kernel.kallsyms]     [k] __do_sys_clone
+    9,28%     0,00%  tccd     [kernel.kallsyms]     [k] kernel_clone
+    9,21%     0,01%  tccd     [kernel.kallsyms]     [k] copy_process
+    8,88%     0,00%  tccd     [kernel.kallsyms]     [k] dup_mm.constprop.0
+    8,77%     0,17%  tccd     [kernel.kallsyms]     [k] dup_mmap
+    7,26%     0,00%  tccd     tccd                  [.] 0x0000000000b2783f
+    7,25%     0,00%  tccd     tccd                  [.] 0x0000000001702258
+    7,07%     0,00%  tccd     tccd                  [.] 0x000000000170247a
...
`perf` with 2.1.10 + new fan control on the Pulse 15 Gen 1 (WMI)
Samples: 20K of event 'cycles:P', Event count (approx.): 6589674133
  Children      Self  Command  Shared Object         Symbol
-   54,08%     0,18%  tccd     [kernel.kallsyms]     [k] entry_SYSCALL_64_after_hwframe
   - 53,89% entry_SYSCALL_64_after_hwframe
      - 53,45% do_syscall_64
         - 50,70% x64_sys_call
            - 12,51% __x64_sys_clone
               - __do_sys_clone
                  - 12,50% kernel_clone
                     - 12,36% copy_process
                        - 11,96% dup_mm.constprop.0
                           - 11,70% dup_mmap
                              - 7,37% copy_page_range
                                 - 7,25% copy_p4d_range
                                    - 5,10% copy_pte_range
                                         1,64% copy_present_pte
                                         1,37% _compound_head
                                       + 1,08% __pte_alloc
                                    + 0,92% __pmd_alloc
                                    + 0,63% __pud_alloc
                              + 1,48% anon_vma_fork
                              + 1,06% vm_area_dup
                              + 0,92% mas_store
            - 10,83% __x64_sys_execve
               - 10,83% do_execveat_common.isra.0
                  - 10,65% bprm_execve
                     - 10,63% bprm_execve.part.0
                        - 10,50% exec_binprm
                           - search_binary_handler
                              - 10,48% load_elf_binary
                                 - 10,38% begin_new_exec
                                    + 10,31% exec_mmap
            - 7,14% __x64_sys_ioctl
               - 7,08% fop_ioctl <- tuxedo-drivers fan control
                  - 4,38% uw_set_fan.isra.0
                     + 2,19% uniwill_write_ec_ram <- writing into EC with tuxedo-drivers
                     + 2,19% uniwill_read_ec_ram <- reading from EC with tuxedo-drivers
                  - 2,52% uniwill_read_ec_ram
                     + 2,50% uw_wmi_read_ec_ram <- reading from EC with tuxedo-drivers
            - 7,10% __x64_sys_openat
               + 6,93% do_sys_openat2
            - 4,59% __x64_sys_read
               - 4,57% ksys_read
                  - 4,33% vfs_read
                     - 3,92% kernfs_fop_read_iter
                        - 3,79% seq_read_iter
                           - 2,86% kernfs_seq_show
                              + 2,83% sysfs_kf_seq_show
                           + 0,56% kvmalloc_node
            - 3,88% __x64_sys_access
               - 3,87% do_faccessat
                  - 3,58% user_path_at_empty
                     - 3,21% filename_lookup
                        - 3,18% path_lookupat
                           - 2,72% link_path_walk.part.0.constprop.0
                              + 1,55% walk_component
                                0,62% inode_permission
            + 1,10% __x64_sys_statx
            + 1,07% __x64_sys_epoll_wait
         + 2,36% syscall_exit_to_user_mode
+   53,70%     0,30%  tccd     [kernel.kallsyms]     [k] do_syscall_64
+   50,89%     0,17%  tccd     [kernel.kallsyms]     [k] x64_sys_call
+   16,51%     0,00%  tccd     [unknown]             [.] 0000000000000000
+   12,57%     0,00%  tccd     libc.so.6             [.] _Fork
+   12,51%     0,00%  tccd     [kernel.kallsyms]     [k] __x64_sys_clone
+   12,51%     0,00%  tccd     [kernel.kallsyms]     [k] __do_sys_clone
+   12,50%     0,00%  tccd     [kernel.kallsyms]     [k] kernel_clone
+   12,36%     0,01%  tccd     [kernel.kallsyms]     [k] copy_process
+   11,96%     0,00%  tccd     [kernel.kallsyms]     [k] dup_mm.constprop.0
+   11,72%     0,51%  tccd     [kernel.kallsyms]     [k] dup_mmap
+   11,59%     0,00%  tccd     tccd                  [.] 0x0000000001704762
+   10,84%     0,00%  tccd     libc.so.6             [.] execve
+   10,83%     0,00%  tccd     [kernel.kallsyms]     [k] __x64_sys_execve
+   10,83%     0,01%  tccd     [kernel.kallsyms]     [k] do_execveat_common.isra.0
+   10,65%     0,00%  tccd     [kernel.kallsyms]     [k] bprm_execve
+   10,63%     0,00%  tccd     [kernel.kallsyms]     [k] bprm_execve.part.0
+   10,50%     0,00%  tccd     [kernel.kallsyms]     [k] exec_binprm
+   10,50%     0,00%  tccd     [kernel.kallsyms]     [k] search_binary_handler
+   10,48%     0,00%  tccd     [kernel.kallsyms]     [k] load_elf_binary
+   10,39%     0,00%  tccd     tccd                  [.] 0x0000000000b2783f
+   10,39%     0,00%  tccd     tccd                  [.] 0x0000000001702258
+   10,39%     0,00%  tccd     [kernel.kallsyms]     [k] begin_new_exec
+   10,31%     0,00%  tccd     [kernel.kallsyms]     [k] exec_mmap
+   10,29%     0,00%  tccd     [kernel.kallsyms]     [k] __mmput
+   10,29%     0,00%  tccd     tccd                  [.] 0x000000000170247a
+   10,29%     0,00%  tccd     [kernel.kallsyms]     [k] mmput
+   10,09%     0,08%  tccd     [kernel.kallsyms]     [k] exit_mmap
...
`perf` with 2.1.10 on the Gemini 17 Gen 1 (ACPI)
Samples: 74K of event 'cpu_core/cycles:P/', Event count (approx.): 18247185533
  Children      Self  Command       Shared Object                          Symbol
-   61,99%     0,02%  tccd          [kernel.kallsyms]                      [k] entry_SYSCALL_64_after_hwframe
   - 61,98% entry_SYSCALL_64_after_hwframe
      - 61,96% do_syscall_64
         - 61,31% x64_sys_call
            - 51,46% __x64_sys_ioctl
               - 51,39% fop_ioctl
                  - 51,30% clevo_evaluate_method
                     - 51,28% clevo_acpi_interface_method_call
                        - 51,26% clevo_acpi_evaluate
                           - 51,19% acpi_evaluate_dsm <- lowest level in tuxedo-drivers, this is a kernel function
                              - 51,18% acpi_evaluate_object
                                 - 50,93% acpi_ns_evaluate
                                    - 50,63% acpi_ps_execute_method
                                       - 50,31% acpi_ps_parse_aml
                                          - 49,08% acpi_ps_parse_loop
                                             - 32,70% acpi_ds_exec_end_op
                                                + 15,08% acpi_ex_opcode_1A_1T_1R
                                                + 5,61% acpi_ds_evaluate_name_path
                                                + 3,03% acpi_ds_create_operands
                                                + 1,64% acpi_ds_delete_result_if_not_used
                                                + 1,43% acpi_ds_clear_operands
                                                + 1,40% acpi_ex_resolve_operands
                                                  0,70% acpi_ds_eval_data_object_operands
                                                  0,65% acpi_ds_get_predicate_value
                                             + 5,09% acpi_ps_get_arguments.constprop.0
                                             + 4,57% acpi_ps_create_op
                                             + 3,68% acpi_ps_complete_op
                                               0,68% acpi_ps_push_scope
            + 2,73% __x64_sys_read
            + 2,68% __x64_sys_clone
            + 2,16% __x64_sys_execve
            + 1,02% __x64_sys_openat
         + 0,55% syscall_exit_to_user_mode
+   61,97%     0,03%  tccd          [kernel.kallsyms]                      [k] do_syscall_64
+   61,34%     0,07%  tccd          [kernel.kallsyms]                      [k] x64_sys_call
+   52,39%     0,00%  tccd          tccd                                   [.] 0x000000000089fb7c
+   52,32%     0,01%  tccd          TuxedoIOAPI.node                       [.] Napi::details::CallbackData<Napi::Boolean (*)(Napi::CallbackInfo const&), Napi::Boolean>::Wrapper(napi_env__*, napi_callback_info__*)
+   51,71%     0,03%  tccd          libc.so.6                              [.] __GI___ioctl
+   51,47%     0,03%  tccd          [kernel.kallsyms]                      [k] __x64_sys_ioctl
+   51,46%     0,02%  tccd          [kernel.kallsyms]                      [k] acpi_evaluate_object
+   51,40%     0,03%  tccd          [kernel.kallsyms]                      [k] fop_ioctl
+   51,30%     0,02%  tccd          [kernel.kallsyms]                      [k] clevo_evaluate_method
+   51,29%     0,02%  tccd          [kernel.kallsyms]                      [k] clevo_acpi_interface_method_call
+   51,26%     0,02%  tccd          [kernel.kallsyms]                      [k] clevo_acpi_evaluate
+   51,20%     0,02%  tccd          [kernel.kallsyms]                      [k] acpi_ns_evaluate
+   51,19%     0,00%  tccd          [kernel.kallsyms]                      [k] acpi_evaluate_dsm
+   50,87%     0,02%  tccd          [kernel.kallsyms]                      [k] acpi_ps_execute_method
+   50,53%     0,04%  tccd          [kernel.kallsyms]                      [k] acpi_ps_parse_aml
+   49,38%     0,40%  tccd          [kernel.kallsyms]                      [k] acpi_ps_parse_loop
+   39,19%     0,02%  tccd          TuxedoIOAPI.node                       [.] SetFanSpeedPercent(Napi::CallbackInfo const&)
+   33,15%     0,56%  tccd          [kernel.kallsyms]                      [k] acpi_ds_exec_end_op
+   18,18%     0,07%  tccd          [kernel.kallsyms]                      [k] acpi_ex_field_datum_io
+   18,08%     0,05%  tccd          [kernel.kallsyms]                      [k] acpi_ex_access_region
+   17,92%     0,10%  tccd          [kernel.kallsyms]                      [k] acpi_ev_address_space_dispatch
+   17,41%     0,06%  tccd          [kernel.kallsyms]                      [k] acpi_ec_space_handler
+   17,29%     0,03%  tccd          [kernel.kallsyms]                      [k] acpi_ec_transaction
+   17,20%     0,07%  tccd          [kernel.kallsyms]                      [k] acpi_ec_transaction_unlocked
+   15,13%     0,17%  tccd          [kernel.kallsyms]                      [k] acpi_ex_opcode_1A_1T_1R
+   14,83%     0,07%  tccd          [kernel.kallsyms]                      [k] acpi_ex_store
+   14,53%     0,09%  tccd          [kernel.kallsyms]                      [k] acpi_ex_store_object_to_node
+   14,50%    14,50%  tccd          [kernel.kallsyms]                      [k] advance_transaction
+   14,26%     0,07%  tccd          [kernel.kallsyms]                      [k] acpi_ex_write_data_to_field
+   13,87%     0,05%  tccd          [kernel.kallsyms]                      [k] acpi_ex_insert_into_field
+   13,81%     0,04%  tccd          [kernel.kallsyms]                      [k] acpi_ex_write_with_update_rule
+   11,24%     0,02%  tccd          TuxedoIOAPI.node                       [.] GetFanTemperature(Napi::CallbackInfo const&)
+    5,74%     0,14%  tccd          [kernel.kallsyms]                      [k] acpi_ex_resolve_to_value
+    5,65%     0,03%  tccd          [kernel.kallsyms]                      [k] acpi_ds_evaluate_name_path
...
`perf` with 2.1.10 + new fan control on the Gemini 17 Gen 1 (ACPI)
Samples: 62K of event 'cpu_core/cycles:P/', Event count (approx.): 15778506447
  Children      Self  Command       Shared Object                         Symbol
-   53,65%     0,02%  tccd          [kernel.kallsyms]                     [k] entry_SYSCALL_64_after_hwframe
   - 53,63% entry_SYSCALL_64_after_hwframe
      - 53,61% do_syscall_64
         - 53,00% x64_sys_call
            - 41,73% __x64_sys_ioctl
               - 41,66% fop_ioctl
                  - 41,59% clevo_evaluate_method
                     - 41,58% clevo_acpi_interface_method_call
                        - clevo_acpi_evaluate
                           - 41,52% acpi_evaluate_dsm <- lowest level in tuxedo-drivers, this is a kernel function
                              - 41,50% acpi_evaluate_object
                                 - 41,30% acpi_ns_evaluate
                                    - 41,00% acpi_ps_execute_method
                                       - 40,74% acpi_ps_parse_aml
                                          - 39,71% acpi_ps_parse_loop
                                             - 25,90% acpi_ds_exec_end_op
                                                + 11,71% acpi_ex_opcode_1A_1T_1R
                                                + 4,33% acpi_ds_evaluate_name_path
                                                + 2,57% acpi_ds_create_operands
                                                + 1,40% acpi_ds_delete_result_if_not_used
                                                + 1,19% acpi_ex_resolve_operands
                                                + 1,09% acpi_ds_clear_operands
                                                  0,54% acpi_ds_eval_data_object_operands
                                             + 4,34% acpi_ps_get_arguments.constprop.0
                                             + 3,84% acpi_ps_create_op
                                             + 2,99% acpi_ps_complete_op
                                               0,62% acpi_ps_push_scope
            + 3,10% __x64_sys_clone
            + 3,09% __x64_sys_read
            + 2,53% __x64_sys_execve
            + 1,19% __x64_sys_openat
         + 0,55% syscall_exit_to_user_mode
+   53,63%     0,03%  tccd          [kernel.kallsyms]                     [k] do_syscall_64
+   53,02%     0,07%  tccd          [kernel.kallsyms]                     [k] x64_sys_call
+   42,46%     0,00%  tccd          tccd                                  [.] 0x000000000089fb7c
+   42,45%     0,01%  tccd          TuxedoIOAPI.node                      [.] Napi::details::CallbackData<Napi::Boolean (*)(Napi::CallbackInfo const&), Napi::Boolean>::Wrapper(napi_env__*, napi_callback_info__*)
+   41,96%     0,03%  tccd          libc.so.6                             [.] __GI___ioctl
+   41,82%     0,02%  tccd          [kernel.kallsyms]                     [k] acpi_evaluate_object
+   41,74%     0,02%  tccd          [kernel.kallsyms]                     [k] __x64_sys_ioctl
+   41,66%     0,02%  tccd          [kernel.kallsyms]                     [k] fop_ioctl
+   41,60%     0,01%  tccd          [kernel.kallsyms]                     [k] acpi_ns_evaluate
+   41,60%     0,02%  tccd          [kernel.kallsyms]                     [k] clevo_evaluate_method
+   41,58%     0,02%  tccd          [kernel.kallsyms]                     [k] clevo_acpi_interface_method_call
+   41,56%     0,01%  tccd          [kernel.kallsyms]                     [k] clevo_acpi_evaluate
+   41,52%     0,00%  tccd          [kernel.kallsyms]                     [k] acpi_evaluate_dsm
+   41,28%     0,02%  tccd          [kernel.kallsyms]                     [k] acpi_ps_execute_method
+   40,98%     0,03%  tccd          [kernel.kallsyms]                     [k] acpi_ps_parse_aml
+   40,00%     0,36%  tccd          [kernel.kallsyms]                     [k] acpi_ps_parse_loop
+   30,70%     0,01%  tccd          TuxedoIOAPI.node                      [.] SetFanSpeedPercent(Napi::CallbackInfo const&)
+   26,28%     0,49%  tccd          [kernel.kallsyms]                     [k] acpi_ds_exec_end_op
+   13,90%     0,05%  tccd          [kernel.kallsyms]                     [k] acpi_ex_field_datum_io
+   13,81%     0,05%  tccd          [kernel.kallsyms]                     [k] acpi_ex_access_region
+   13,67%     0,09%  tccd          [kernel.kallsyms]                     [k] acpi_ev_address_space_dispatch
+   13,23%     0,06%  tccd          [kernel.kallsyms]                     [k] acpi_ec_space_handler
+   13,12%     0,03%  tccd          [kernel.kallsyms]                     [k] acpi_ec_transaction
+   13,04%     0,08%  tccd          [kernel.kallsyms]                     [k] acpi_ec_transaction_unlocked
+   11,76%     0,14%  tccd          [kernel.kallsyms]                     [k] acpi_ex_opcode_1A_1T_1R
+   11,53%     0,06%  tccd          [kernel.kallsyms]                     [k] acpi_ex_store
+   11,27%     0,09%  tccd          [kernel.kallsyms]                     [k] acpi_ex_store_object_to_node
+   11,01%     0,08%  tccd          [kernel.kallsyms]                     [k] acpi_ex_write_data_to_field
+   10,67%     0,04%  tccd          [kernel.kallsyms]                     [k] acpi_ex_insert_into_field
+   10,62%     0,03%  tccd          [kernel.kallsyms]                     [k] acpi_ex_write_with_update_rule
+   10,60%    10,59%  tccd          [kernel.kallsyms]                     [k] advance_transaction
+    9,71%     0,03%  tccd          TuxedoIOAPI.node                      [.] GetFanTemperature(Napi::CallbackInfo const&)
...

To be transparent, I included a lot of my measurements, but only kept the important parts due to GitHub text limits. The call graphs look different between some measurements because a different approach is used depending on the laptop, since the ECs differ.

For the WMI and ACPI APIs, tccd uses tuxedo_io_api.hh, which issues an ioctl for all fan controls; this shows up in all strace measurements on devices that use tuxedo-drivers via C++. Further down, tuxedo-drivers then handles the fan control differently depending on the device.
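
To make that call path more concrete, here is a minimal sketch of the daemon side, assuming a periodic worker that calls into the native TuxedoIOAPI.node addon. The addon path and the camelCase function names are my assumptions based on the symbols visible in the perf output; this is not the actual tccd code.

```typescript
// Minimal sketch, not the real tccd fan worker. Only the native symbols
// SetFanSpeedPercent/GetFanTemperature are taken from the perf output;
// the JS-facing names, signatures and addon path are assumptions.
const tuxedoIOAPI = require("./TuxedoIOAPI.node"); // hypothetical addon path

const FAN_UPDATE_INTERVAL_MS = 1000; // assumed update period

function decideFanSpeed(temperatureCelsius: number): number {
    // Placeholder curve: 0 % below 50 °C, linear ramp up to 100 % at 90 °C.
    if (temperatureCelsius <= 50) return 0;
    return Math.min(100, Math.round(((temperatureCelsius - 50) / 40) * 100));
}

setInterval(() => {
    const temperature: number = tuxedoIOAPI.getFanTemperature(0); // fan index 0
    const speedPercent = decideFanSpeed(temperature);
    // Each call goes native addon -> ioctl -> tuxedo_io -> clevo_acpi ->
    // acpi_evaluate_dsm, which is where the profiles above show the time.
    tuxedoIOAPI.setFanSpeedPercent(0, speedPercent);
}, FAN_UPDATE_INTERVAL_MS);
```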

tuxedo-drivers uses acpi_evaluate_dsm for some ECs; that is a kernel function, and most of the time is spent there. I have been told internally that it isn't really CPU usage, but rather waiting or blocking during EC communication while values are read/written.

To further lower tccd usage, you can disable fan control globally in tcc; the fans then run in automatic mode instead and tccd no longer needs to communicate with the EC.

Keep in mind that I am primarily a TypeScript dev and wasn't involved in kernel, driver, ACPI, WMI or firmware development, so my knowledge of the code below tccd is limited. Hopefully this explanation is enough to see why further optimization is hard from my side: all of that shows up as tccd usage, even though most of it most likely happens below tuxedo-drivers. The only thing I can do for now is to use tuxedo-drivers as little as possible.

I'm not sure when my new code will be released, since it isn't finished yet. It took me a while to gather all of this, but I wanted to explain it a bit more in depth. One last idea I have is to not write the fan speed if it is unchanged.
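
As a rough illustration of that last idea (the names are placeholders, not the actual tccd classes), the guard would look something like this:

```typescript
// Sketch of the "only write when changed" idea. writeFanSpeedPercent stands in
// for whatever backend the device uses (ioctl, sysfs pwm, ...).
class FanSpeedWriter {
    private lastWrittenPercent?: number;

    constructor(private readonly writeFanSpeedPercent: (percent: number) => void) {}

    update(targetPercent: number): void {
        // Skip the EC round trip entirely if the target value did not change.
        if (this.lastWrittenPercent === targetPercent) {
            return;
        }
        this.writeFanSpeedPercent(targetPercent);
        this.lastWrittenPercent = targetPercent;
    }
}
```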

@tuxedoder
Contributor

The issue was probably closed because I finished the fan control internally, but it isn't released yet. I will leave the issue open until the code is public.

The plan was to update the dependencies (Angular, Electron, Node, ...) and the fan control at once, since both are rather big changes. Writing the fan speed values only when they change seemingly helped further on some devices. To test the impact of partially writing values, I periodically ran stress for 60 seconds and then idled for 60 seconds with a script.
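
The original script isn't included here; a rough reconstruction of that load cycling, assuming `stress` is installed, could look like this:

```typescript
// Hypothetical reconstruction of the load-cycling test: 60 s of CPU load via
// "stress", then 60 s of idle, repeated, while tccd usage is observed.
import { execSync } from "node:child_process";
import { setTimeout as sleep } from "node:timers/promises";

const phaseSeconds = 60;

async function main(): Promise<void> {
    for (;;) {
        // -c 4 starts four CPU workers, -t limits the run to one phase length.
        execSync(`stress -c 4 -t ${phaseSeconds}`, { stdio: "inherit" });
        await sleep(phaseSeconds * 1000); // idle phase
    }
}

main();
```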

The dependency branch isn't finished yet, so these numbers aren't final, but I don't think the tccd values will change further: I am currently mainly adjusting tcc, tccd seems to work without errors, and I am out of ideas on what else to do. Testing was done with a deb built with Node 20, Electron 32 and Angular 18, using the default profile on kernel 6.11 and tuxedo-drivers 4.8.0.

| tccd usage on dependency test branch | tray only (idle) | no tray (idle) | no tray + globally off (idle) | no tray (changing load) |
| --- | --- | --- | --- | --- |
| Aura 15 Gen2 5300U | 0.9% | 0.8% | 0.4% | 0.9% |

I observed roughly 1% or less on average for most devices during development, measured with ps. The only exception I know of is the Sirius, but we didn't release a fan control for it in the first place due to high CPU usage.

@tuxedoder tuxedoder reopened this Oct 9, 2024
@mserajnik

@tuxedoder

The only exception I know of is Sirius, but we didn't release a fan control for it in the first place due to high cpu usage.

I'm sorry if this would be better asked elsewhere (the tuxedo-drivers repository perhaps?) but since you seem to know about this (and it's somewhat relevant to this issue, at least for Sirius users), I figured I might as well ask here:

The "high CPU usage" with Sirius that you speak of is/was related to implementing fan control for Sirius at the driver level, right? Since fan control is now available for Sirius, I assume a way has been found to implement it without causing the high CPU usage? Is there perhaps some more information about the issue and how it was fixed that you could point me to? I'm mainly interested because I've put in an order for a Sirius Gen2 and am trying to learn about potential culprits in advance (and even if this issue may now be fixed, it still interests me).

Also, does that mean that Sirius will see similarly low tccd CPU usage as the other devices once this new TCC release is made?

Finally, is there any timeframe for the release?

@tuxedoder
Contributor

implementing fan control for Sirius at the driver level, right?

I was referring to the fan implementation in tuxedo-drivers for the Sirius. To quote one of our devs:

For Sirius the existing sensors driver module using the WMI BS interface has proved to be too CPU intensive [...]. For this reason we have extended the ACPI interface with temperature and fan speed read out. This interface uses direct memory address look-up on the EC [...]

Because of that, tuxedo-drivers 4.11.3 now exposes two hwmon paths on a Sirius, and you need tcc 2.1.14 or newer for tccd to use the new "tuxi" hwmon interface. You may have to update your system for tuxi to be available.
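
To check whether the tuxi interface is available on your system, you can look at the hwmon names under sysfs. A small sketch, assuming the standard /sys/class/hwmon layout (the name tuxedo_tuxi_sensors is the one from the measurements below):

```typescript
// Sketch: print every hwmon device and its name so you can verify that the
// "tuxi" interface (reported as "tuxedo_tuxi_sensors" on a Sirius) shows up.
// Assumes the standard /sys/class/hwmon layout; not part of tccd itself.
import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

const hwmonRoot = "/sys/class/hwmon";

for (const entry of readdirSync(hwmonRoot)) {
    try {
        const name = readFileSync(join(hwmonRoot, entry, "name"), "utf8").trim();
        console.log(`${entry}: ${name}`);
    } catch {
        // Some hwmon entries may not expose a readable name file; skip them.
    }
}
```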

does that mean that Sirius will see similarly low tccd CPU usage as the other devices once this new TCC release is made?

The fan rework won't do a lot on a Sirius, because tuxi causes very little utilization to begin with. I did some measurements on already released tcc versions. I haven't added tuxi to the rework yet, so I don't have numbers for that.

| Sirius Gen 1 (wayland) | hwmon name | tray only + idle, tccd average after 10 min |
| --- | --- | --- |
| tuxedo-drivers 4.11.3 + tcc 2.1.13 | tuxedo | 6.0% |
| tuxedo-drivers 4.11.3 + tcc 2.1.14 | tuxedo_tuxi_sensors | 0.6% |

is there any timeframe for the release?

I assume you mean the fan rework and dependency update, because tuxi is already used as of 2.1.14. Currently the CSS needs to be adjusted because of MDC. I am also trying to add more error handling/printing and to adjust/replace some code. After that I need to clean up the code I wrote, get the UI changes approved and do a lot of testing. I was also busy with kernel debugging, package testing and tcc maintenance, so it is hard to make a good estimate.
