-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add test for round robin threads #125
base: master
Are you sure you want to change the base?
Conversation
Make Tick message slightly more informative by including the thread id. Signed-off-by: Gerwin Klein <[email protected]>
0f18a9e
to
e283a63
Compare
if (0 < id) { | ||
test_eq(counters[id], counters[id - 1]); | ||
} | ||
printf("Tick (%d)\n", id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a explicit note here that the blocking printing is not interfering with the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m not sure it isn’t actually, I’m still experimenting. A good 30% of the configs where this test is enabled are still failing. But of course not when I’m trying to run it manually..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it will interfere with the test, as it messes up all timing and hence scheduling.
delay_loop(5000000); | ||
} | ||
|
||
test_eq(counters[id], (unsigned long) SCHED0023_NUM_ROUNDS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a note there why this would ever fail. Is there any other cause besides memory corruption?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just making sure every thread is only writing to its own counter.
/* used in test_round_robin below */ | ||
static void round_robin_thread(int id, volatile unsigned long *counters) | ||
{ | ||
delay_loop(id * 100); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this delay actually, is it to ensure counters
is initialized and visible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s to make sure that the writes to the counters are in order. When the threads are first started, this is not necessary, but when each thread gets scheduled the second time, we’d be relying on that each of them got the exact same number of cycles to make progress. If thread 0 got just 10 cycles less, the second counter increment in thread 0 might happen in the time slice after the second counter increment in thread 1. I don’t think the timer interrupts are that accurate for all platforms, so I’m adding some safety.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Please add this in a comment in the code.
printf("Tick (%d)\n", id); | ||
/* busy wait so we can be preempted at least once before we increment | ||
the counter again */ | ||
delay_loop(5000000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a note here why this magic value seem quite safe based on the assumption about time slice and core frequency/IPS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like it isn’t safe yet on all machines, esp when printing is off. I might have to actually try to compute a value here instead. Just wanted to avoid making any syscalls in these threads that could affect scheduling.
Add test SCHED0023 to check that round robin threads are indeed scheduled round robin. Signed-off-by: Gerwin Klein <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole test seems incredibly fragile.
The test should be inherently stable, so the busy looping at the start should be unnecessary. But if you do loop, at least add a nop or something, in case empty asm strings are optimised away.
To start the test I wouldn't lower the priority of the main thread, but instead let the main thread block till all tests are done. But if you do, please increase it again at the end of the test. I'd let the main thread start and block on a timer so the whole test has a fixed runtime (or use MCS with period = test period and budget -1 us + yield as a timer for the main thread).
Why not increment and check the counters in a busy loop? Check that the counter values are always lower than the previous task's one.
To avoid the problem of small jitter making tasks overtaking each other, you can give them progressively slightly shorter periods, but with differences small enough that a non-round-robin schedule is detected. You probably want to increase the period for this, as that reduces the jitter and scheduling overhead relatively.
The test relies on relative timing of a busy wait loop. The loop has to take at least 1 ms so that the round-robin test thread is preempted at least once before it prints+increments again.