1.8.9.1 end-of-story comprehension questions; original topic: READ often seems not to hear miscues -- why? #357
By miscues, you mean poor reading by the children?
|
I meant a misread word, false start, regression (rereading), or long hesitation. |
Actually, I'll let Judith respond if she has any comments, because on
reflection I didn't observe many students attempting reading.
|
|
I don't think we can lower the threshold any further. My personal sense, based on my testing, is that the Swahili ASR is not reliable enough to assess kids with. We should definitely leave the activities in so the kids get reading and speaking practice... but I'd rather we rely on multiple-choice answer options @ 90% and writing @ 70% to assess kids for all tutors... For reading... as long as they make it to the end, promote them up one.... if they back out, they should repeat or something. |
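For concreteness, the promotion rule proposed in this comment could be sketched as follows. This is a hypothetical illustration, not RoboTutor code; the activity names are made up, and the 90%/70% thresholds and promote-on-completion rule for reading are taken from the proposal above.

```python
# Hypothetical sketch of the proposed promotion policy. Activity names
# ("multiple_choice", "writing", "reading") are assumptions for illustration.

def should_promote(activity, score, finished):
    """Return True if the student should move up one step in the matrix."""
    if activity == "multiple_choice":   # assessable by machine: require 90%
        return score is not None and score >= 0.90
    if activity == "writing":           # assessable by machine: require 70%
        return score is not None and score >= 0.70
    if activity == "reading":           # ASR unreliable: promote on completion
        return finished                 # backing out -> repeat
    return False
```

Under this rule, reading never holds a student back in the matrix because of recognizer issues; only the machine-gradable activities gate on a score.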
d. If they pass, proceed to the next story. |
1. I have tested the reading tutor extensively during our internal QA testing, and I have also watched the kids use the tutor in previous videos. With the current threshold, sometimes RoboTutor will accept a word even if I tap beside the tablet... I don't actually have to say any words. Other times, I say a word correctly over and over and it just doesn't take it. I think you are trying to find the magic combination of lowering the threshold and a reliable pass rate (and the best we can do is guess a number). If you lower the threshold even further AND lower the pass rate, at that point you are just assessing whether the kid produced a sound, regardless of whether it's the right word or not. I think the current threshold is fine... the user experience is not painful... but I don't think there is a reliable pass rate that we can use based on speech... 70% is definitely too high... so if you insist on promoting them based on a value, lowering it is a good way to go. It would be a shame if we held students back in the matrix and cost them time advancing in the curriculum because of issues related to the speech recognizer.
2. I cannot chime in on cloze or the picture-matching questions. Based on discussions we have had, I think the generated cloze questions have to be thoroughly vetted by a native speaker before we can come up with a reliable threshold. I also think that picture matching using the pictures from the African Storybook Project needs to be vetted thoroughly as well before we can come up with a reliable threshold. Overall, I *very very strongly* advocate that Filipo/Leonora/Maureen/anyone else create a 10-question multiple-choice comprehension quiz per story, presented to the kids at the end of each story, similar to the EGRA questions... I have requested this repeatedly in the past but you have mostly ignored my requests. We can interleave cloze/picture-matching/oral-response questions between pages, but we need comprehension questions similar to the EGRA's that we can reliably assess (hence multiple choice). If we do this, we might not need to worry so much about vetting the other question types, because they mostly provide practice opportunities and don't have to be 100% right, and we can then rely on vetted story-level questions from native speakers for promotion purposes.
3. The whole group can decide on a promotion policy in the Wednesday meeting... I don't disagree with anything you said as a viable option, but more feedback would be nice.
|
@judithodili - Good answers! Re 2.: Leonora is here and says that kids are used to open-ended questions, not multiple choice, so it's important to kid-test that question format first at our beta sites before generating lots of them.
a. Quickest is to send Fortunatus and Mwita multiple-choice questions to kid-test. Leonora says 2-choice either-or questions would be familiar to kids, e.g. "Who fed the dog -- the rabbit or the cat?"
b. Next quickest might be a stand-alone app to present questions on a tablet.
c. Finally, we can insert a prototype multiple-choice test at the end of a story to test the UI/UX as well.
Authentic practice for EGRA requires the open-ended format. The generic wh- questions are open-ended. It's fine to include some story-specific questions. At least one of the new stories already contains end-of-story questions, so let's use them! A 10-item test is too long; EGRA only asks 3. Leonora suggests a max of 5 or 6. Back to narrating.... |
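As a concrete illustration of the 2-choice, end-of-story quiz format discussed here (e.g. "Who fed the dog -- the rabbit or the cat?"), a minimal sketch follows. The field names and the fraction-correct grading rule are assumptions for illustration, not anything in RoboTutor.

```python
# Hypothetical data shape for a 2-choice end-of-story comprehension quiz.
from dataclasses import dataclass

@dataclass
class TwoChoiceQuestion:
    prompt: str           # e.g. "Who fed the dog?"
    choices: tuple        # exactly two options, e.g. ("the rabbit", "the cat")
    answer: int           # index of the correct choice: 0 or 1

def grade(questions, responses):
    """Fraction correct over a short (5-6 item) end-of-story quiz."""
    correct = sum(q.answer == r for q, r in zip(questions, responses))
    return correct / len(questions)
```

A quiz of 5 or 6 such items, vetted by a native speaker, would give a machine-gradable score to drive promotion, while the open-ended questions remain for practice.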
I agree 100% that we should have open-ended questions so the kids can practice... but unlike human graders, a machine finds them difficult to grade, which leaves us with multiple choice for grading accuracy.
The quickest way is to send Mwita and Fortunatus some questions and answers via a Google Doc for them to kid-test.
2 choices seem fine - 5/6 questions per story seem fine as well.
--
Regards,
Judith Odili Uchidiuno
www.judithu.com
|
Just a quick question without weighing in on everything - isn't Bubble Pop essentially multiple choice, and therefore we know kids can at least do that?
|
It's multiple choice and can be configured with 2 choices, but I believe would need to be modified to ask a different question on each screen. |
In my comment, I meant to imply that I don't think we have to kid-test multiple choice as a concept, because we have evidence (in Bubble Pop) that kids can choose between items.
|
Ah. Plausible but assumes that what works for simple current tasks will also work for comprehension. |
@judithodili - Questions/suggestions for your quick experiment, with apologies for any that are obvious:
|
1. READ often fails to respond to oral reading miscues. Why?
a. The speech is not loud enough to pass the detection threshold, perhaps due to a noisy environment.
b. The ASR recognizes something but RoboTutor doesn't respond.
(How) does the VERBOSE log show the ASR output?
2. What can we do about this problem?
a. Lower the threshold.
?: to what?
-: might hallucinate speech if the environment is noisy
b. Supply the word after [5] seconds, on the assumption that the kid attempted it.
+: easy to implement
+: solves the UX problem without having to fix the ASR problem
-: might... keep... reading... even if the kid does nothing.
3. If we do 2b, when should it stop?
a. At the end of the sentence.
b. At the end of the story.
At what point should it time out?
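Option 2b (supply the word after [5] seconds, stopping at the end of the sentence) could be prototyped roughly as below. This is a sketch under stated assumptions: `asr_heard` and the polling loop are hypothetical stand-ins for RoboTutor's actual recognizer interface, and the timeout is parameterized so it can be tuned.

```python
import time

HESITATION_TIMEOUT = 5.0  # seconds -- the [5] proposed above

def follow_reading(sentence_words, asr_heard, timeout=HESITATION_TIMEOUT):
    """Walk one sentence; return the words the tutor had to supply itself.

    asr_heard(word) -> bool is a hypothetical stand-in for polling the ASR
    to see whether it has credited the current word.
    """
    supplied = []
    for word in sentence_words:
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if asr_heard(word):       # ASR credited the word -> move on
                break
            time.sleep(0.005)         # keep polling the recognizer
        else:                         # timed out: assume an attempt, supply it
            supplied.append(word)
    # Stopping at end of sentence (option 3a): the caller invokes this once
    # per sentence, so auto-advance never runs past the sentence boundary.
    return supplied
```

The returned list also addresses the "might keep reading even if the kid does nothing" concern: if every word in a sentence had to be supplied, the caller can refuse to advance to the next sentence.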
@judithodili and @amyogan - Did you observe this problem? Is it ok to auto-advance after 5 seconds?
@kevindeland - Thoughts?
Thanks. - Jack