I am aware that this can be implemented by specifying multiple runners, but that would have consequences for the concurrency specifications. Specifically, my reading of the code is that if N is the number of actions the worker can run in parallel, then the concurrency of two runners would have to be split so that N1 + N2 <= N; otherwise the worker can be overloaded if the scheduler allocates all the runners at once.
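To make the N1 + N2 <= N constraint concrete, here is a minimal sketch of the arithmetic. It is not Buildbarn code; the function names, the runner names, and the value N = 16 are assumptions chosen for illustration.

```python
# Hypothetical illustration of the concurrency split; not Buildbarn's API.

def worst_case_load(runner_concurrency):
    """Number of simultaneous actions if the scheduler fills every runner."""
    return sum(runner_concurrency.values())


def is_overload_possible(runner_concurrency, worker_capacity):
    """True if fully allocated runners would exceed the worker's capacity."""
    return worst_case_load(runner_concurrency) > worker_capacity


N = 16  # assumed capacity of the worker

# Splitting the capacity keeps the worst case within N ...
split = {"LinuxWindows": 8, "LinuxOnly": 8}
# ... while giving each runner the full capacity does not.
unsplit = {"LinuxWindows": 16, "LinuxOnly": 16}

print(is_overload_possible(split, N))    # False
print(is_overload_possible(unsplit, N))  # True
```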
Background: I recently tried to add a couple of old machines as workers to our Goma/Buildbarn system, with Windows cross-compile on Linux, but the system became unstable for some reason. While testing the upgraded system in the past couple of days I found that the instability is still present, and it seems to be caused by the case-insensitive file system mount (ciopfs) we need to use for the Windows cross-compile (many Windows SDK files are included with incorrectly cased names, from inside the SDK). ciopfs seems to stall at times, causing long periods where the worker and ciopfs are not doing any building; one case lasted about 40 minutes. My guess is that this problem is related to both SSD disk speed and possibly the number of parallel processes (we have the same ciopfs configuration on a different worker, with a much faster CPU, more cores/threads, and an NVMe disk, which does not have this problem).
While there may be other ways to get a case-insensitive filesystem running, one alternative possibility would be to assign these workers to be Linux-only workers, without Windows-cross-compile (I have not yet tested this configuration).
However, it does not seem like the action system permits multiple platform specifications, i.e. "use one of these platforms". This indicates that the Windows cross-compile workers need to be specified as "LinuxWindows" platforms, while Linux workers have to be specified as "LinuxOnly". However, the "LinuxWindows" workers should also be able to run "LinuxOnly" builds, and AFAICT that is not possible, except by specifying multiple runners and splitting the concurrency number between them, which also means halving the performance unless Windows and Linux builds are going on at the same time.
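The "halving" effect can be sketched as follows. This is a hypothetical model, not Buildbarn's scheduler: each runner is represented as a set of platform names plus a concurrency, and `usable_slots` counts the slots that a queue containing only one kind of action can use. The capacity N = 16 is an assumption.

```python
# Hypothetical model: a runner is (set of platforms it accepts, concurrency).

def usable_slots(runners, queued_platform):
    """Slots available to actions that all request `queued_platform`."""
    return sum(
        concurrency
        for platforms, concurrency in runners
        if queued_platform in platforms
    )


N = 16  # assumed capacity of the worker

# Today's workaround: two runners, each with one platform and half of N.
split_runners = [({"LinuxWindows"}, N // 2), ({"LinuxOnly"}, N // 2)]

# Requested behaviour: one runner group that accepts either platform.
shared_runner = [({"LinuxWindows", "LinuxOnly"}, N)]

print(usable_slots(split_runners, "LinuxOnly"))  # 8
print(usable_slots(shared_runner, "LinuxOnly"))  # 16
```

With the split configuration, a Linux-only build can use only half of the worker; the other half sits idle unless Windows cross-compile actions arrive at the same time.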
IMO either the concurrency system must be changed so that at most N runner threads can be active on a worker at a time, regardless of which runner they belong to, or a single runner group should be available for multiple platforms.
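The first option, a worker-wide cap shared by all runners, could look roughly like the sketch below. It is a toy simulation under stated assumptions, not a proposal for the actual implementation: `WorkerBudget` and `simulate` are hypothetical names, the runner names come from the example above, and actions are simulated with a short sleep.

```python
import threading
import time


class WorkerBudget:
    """Hypothetical worker-wide limit shared by every runner on the worker."""

    def __init__(self, capacity):
        self._slots = threading.BoundedSemaphore(capacity)
        self._lock = threading.Lock()
        self.active = 0      # actions running right now
        self.peak = 0        # highest value `active` ever reached
        self.completed = 0   # actions finished so far

    def run(self, action):
        # Block until one of the worker-wide slots is free.
        with self._slots:
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                action()
            finally:
                with self._lock:
                    self.active -= 1
                    self.completed += 1


def simulate(capacity, threads_per_runner, runner_names, actions_per_thread=3):
    """Run every runner at full concurrency against one shared budget."""
    budget = WorkerBudget(capacity)

    def runner_thread():
        for _ in range(actions_per_thread):
            budget.run(lambda: time.sleep(0.005))  # stand-in for a build action

    threads = [
        threading.Thread(target=runner_thread)
        for _ in runner_names
        for _ in range(threads_per_runner)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return budget


# Two runners, each configured with the full concurrency of 4,
# but never more than 4 actions active on the worker as a whole.
budget = simulate(capacity=4, threads_per_runner=4,
                  runner_names=["LinuxWindows", "LinuxOnly"])
print(budget.peak <= 4, budget.completed)
```

Each runner can then be configured with the full concurrency N, and whichever platform has work queued gets the whole machine, while the total never exceeds N.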
Are you aware of issue #40? It's not identical to what you requested here, but I think that if implemented properly, it could also be used to achieve something similar. Instead of letting workers announce multiple platforms, we could have a mechanism where the scheduler notifies some helper process, requesting it to spin up new workers of a given kind. In case resources aren't elastic (e.g., on physical systems), you could use that process to simply reconfigure a worker from variant A to variant B.
Does it make sense to keep this open, or do we want to fold this into #40?
I did see it, but I think our use case could be different enough that a separate report should be filed.
In our case we are using a fixed set of hardware workers, not a Docker-type system. The limits on CPU usage are therefore fixed in a way that those of a cloud-based system might not be.
It could be that the solutions will be the same, but there could be differences in how the two variations work, too.
Based on recent testing, having separate work areas (runners) for the Windows cross-compile and Linux could be useful, as I noticed that Linux build actions sometimes ran into trouble due to the case-insensitivity used in the work area mount. But as mentioned, multiple runners on a worker require an overall management of concurrency that AFAICT is not currently possible.
As I understand the worker configuration definition, a worker can have a combination of properties, such as this one in the Docker example:
However, as far as I can tell, the platform specification cannot list multiple property sets for a given runner, e.g. like this: