Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add test for round robin threads #125

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from
Draft

add test for round robin threads #125

wants to merge 2 commits into from

Conversation

lsf37
Copy link
Member

@lsf37 lsf37 commented Jul 24, 2024

  • add test SCHED0023 to check that round robin threads are indeed scheduled round robin.
  • print thread id in periodic thread tests

The test relies on relative timing of a busy wait loop. The loop has to take at least 1 ms so that the round-robin test thread is preempted at least once before it prints+increments again.

Make Tick message slightly more informative by including the thread id.

Signed-off-by: Gerwin Klein <[email protected]>
@lsf37 lsf37 self-assigned this Jul 24, 2024
@lsf37 lsf37 added the hw-test set to run sel4test hardware test for this PR label Jul 24, 2024
@lsf37 lsf37 force-pushed the round-robin branch 2 times, most recently from 0f18a9e to e283a63 Compare July 25, 2024 09:15
if (0 < id) {
test_eq(counters[id], counters[id - 1]);
}
printf("Tick (%d)\n", id);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a explicit note here that the blocking printing is not interfering with the test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not sure it isn’t actually, I’m still experimenting. A good 30% of the configs where this test is enabled are still failing. But of course not when I’m trying to run it manually..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will interfere with the test, as it messes up all timing and hence scheduling.

delay_loop(5000000);
}

test_eq(counters[id], (unsigned long) SCHED0023_NUM_ROUNDS);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a note there why this would ever fail. Is there any other cause besides memory corruption?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just making sure every thread is only writing to its own counter.

/* used in test_round_robin below */
static void round_robin_thread(int id, volatile unsigned long *counters)
{
delay_loop(id * 100);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this delay actually, is it to ensure counters is initialized and visible?

Copy link
Member Author

@lsf37 lsf37 Jul 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s to make sure that the writes to the counters are in order. When the threads are first started, this is not necessary, but when each thread gets scheduled the second time, we’d be relying on that each of them got the exact same number of cycles to make progress. If thread 0 got just 10 cycles less, the second counter increment in thread 0 might happen in the time slice after the second counter increment in thread 1. I don’t think the timer interrupts are that accurate for all platforms, so I’m adding some safety.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Please add this in a comment in the code.

printf("Tick (%d)\n", id);
/* busy wait so we can be preempted at least once before we increment
the counter again */
delay_loop(5000000);
Copy link
Member

@axel-h axel-h Jul 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a note here why this magic value seem quite safe based on the assumption about time slice and core frequency/IPS.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it isn’t safe yet on all machines, esp when printing is off. I might have to actually try to compute a value here instead. Just wanted to avoid making any syscalls in these threads that could affect scheduling.

Add test SCHED0023 to check that round robin threads are indeed
scheduled round robin.

Signed-off-by: Gerwin Klein <[email protected]>
Copy link
Contributor

@Indanz Indanz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole test seems incredibly fragile.

The test should be inherently stable, so the busy looping at the start should be unnecessary. But if you do loop, at least add a nop or something, in case empty asm strings are optimised away.

To start the test I wouldn't lower the priority of the main thread, but instead let the main thread block till all tests are done. But if you do, please increase it again at the end of the test. I'd let the main thread start and block on a timer so the whole test has a fixed runtime (or use MCS with period = test period and budget -1 us + yield as a timer for the main thread).

Why not increment and check the counters in a busy loop? Check that the counter values are always lower than the previous task's one.

To avoid the problem of small jitter making tasks overtaking each other, you can give them progressively slightly shorter periods, but with differences small enough that a non-round-robin schedule is detected. You probably want to increase the period for this, as that reduces the jitter and scheduling overhead relatively.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hw-test set to run sel4test hardware test for this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants