1.8.9.1 ECHOWORDS mode to encourage fluent reading #348
Actually we can dispense with REREAD and just use ECHO, provided we modify ECHO to reread the entire sentence if the kid read it disfluently, in which case rereading it scaffolds comprehension.
@octavpo - can you implement ECHO by August 28 when I return?
@octavpo - Let me rephrase that request more emphatically: XPRIZE weights reading 60%, ECHO tries to address severe problems in RoboTutor's current ASR-driven assisted oral reading, and is important to kid-test ASAP. So please implement ECHO in time for me to try out on August 29. If you have any questions before then, please email them to me. I'll be offline Friday and Saturday PST, but will reply ASAP. I'll be travelling all day Tuesday, August 28. Thanks. - Jack
Octav email of Saturday, September 01, 2018 6:29 PM: ... I'll try to put more details on the GitHub issue, but in short I found that in my tests most cases when a word is marked red are due to the fact that words are recognized and lost, rather than because they're not recognized correctly. So I'm trying to fix that first. Jack email of Sat 9/1/2018 9:47 PM: “Words are recognized and lost” is very interesting. Based just on your own testing, or can you tell from the verbose logs from use in the field? How? Please send a real example. From your brief description, I suspect a bug in the MultiMatch process that aligns ASR output against the text to read – not in MultiMatch itself, but perhaps something like omitting utterances from the speech aligned against the text if newer utterances have come in.
@octavpo - #357 said: Note that a word turns red only if RoboTutor detected speech, i.e. not in case a.
@judithodili suggested that lowering the speech detection threshold would make ASR hallucinate more. Options:
a. Don't supply the word automatically, but cue the kid to tap, e.g. have RoboFinger tap the audio icon.
b. Treat the word as misread, i.e. turn it red.
c. If RoboTutor still detects no speech, invoke the 2-strike rule, i.e. supply the word.
I favor option a. Do you concur?
Yes, I like keeping the kids actively involved!
@octavpo - Please implement option a.
The problem is not the MultiMatch, which is pretty much disabled, but rather a combination of factors, the main one being that after each word the listening process is restarted, and any extra words or parts that might have been recognized before are erased. So for instance in a sequence like 'na mitsu' it might send only 'na' (because of a requirement that words be stable for 300ms), and then 'mitsu' is skipped. I tried to fix the issue by removing the restart but I wasn't able to make it work yet. So I've switched and now I'm working on implementing the features requested in this issue.
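To make that failure mode concrete, here is a minimal sketch of the restart behavior described above. The class and method names are hypothetical, not RoboTutor's actual Listener code; it only illustrates how a restart after each emitted word can drop a trailing word that hasn't yet met the 300 ms stability requirement.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal sketch of the restart bug described above; names are hypothetical,
// not RoboTutor's actual Listener code.
class ListenerRestartSketch {
    static final long STABLE_MS = 300;  // a word must persist this long to be sent
    private final Map<String, Long> firstSeen = new HashMap<>();

    // Called with each partial hypothesis, e.g. ["na"], then ["na", "mitsu"].
    void onPartialResult(List<String> hypothesis, long nowMs) {
        for (String w : hypothesis) firstSeen.putIfAbsent(w, nowMs);
        for (String w : hypothesis) {
            if (nowMs - firstSeen.get(w) >= STABLE_MS) {
                emitWord(w);         // "na" has been stable for 300 ms, so send it
                restartListening();  // BUG: "mitsu" was already in the hypothesis
                return;              // but not yet stable, and the restart below
            }                        // silently discards it
        }
    }

    private void restartListening() {
        firstSeen.clear();  // anything recognized but not yet emitted is erased
    }

    private void emitWord(String w) { /* forward to word tracking */ }
}
```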
@octavpo - Discussing how Listener works would have been much more timely months ago, and might already be too late. The above description is frustratingly vague -- e.g. that MultiMatch is "pretty much disabled." A "real example" would be well-chosen to illustrate an issue and include:
Unless Listener has changed drastically since the Reading Tutor, you may (or may not) be confusing two requirements:
Also, it's possible that ASR output is supposed to be concatenated over multiple utterances as input to MultiMatch, rather than input to MultiMatch one utterance at a time to track the reader through the text. Now that you're back, we should meet ASAP to look at a real example and trace the code involved, but I won't be in till 9/26. For right now, please post a real "real example" as I requested on 9/1/2018, with a detailed explanation including links to the specific relevant locations in source files.
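For concreteness, the two alternatives look roughly like this (a sketch with hypothetical interfaces; alignAgainstText stands in for whatever MultiMatch actually exposes):

```java
import java.util.ArrayList;
import java.util.List;

// Two ways to feed ASR output to the text aligner -- a sketch, not the real API.
class AlignmentInputSketch {
    // Alternative 1: align each utterance in isolation. The aligner sees only
    // a fragment, so it can lose track of the reader's position in the text.
    void perUtterance(List<List<String>> utterances) {
        for (List<String> utt : utterances)
            alignAgainstText(utt);
    }

    // Alternative 2: concatenate all utterances so far, so the aligner tracks
    // the reader through the text using the full history.
    void concatenated(List<List<String>> utterances) {
        List<String> history = new ArrayList<>();
        for (List<String> utt : utterances) {
            history.addAll(utt);
            alignAgainstText(history);
        }
    }

    void alignAgainstText(List<String> words) { /* MultiMatch stand-in */ }
}
```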
From: Octav Popescu [email protected] - I returned from vacation on 9/17, but to LA; I'm coming back to Pittsburgh on 10/16.
So are these the old -> new behavior changes of the modes that listen to students, where "sentence" means a text segment as long as a sentence or as short as one word?
No, I'm not sure why you think it would work like this. If it's related to my previous message, I was trying to say that the word-skipping behavior described in the first two points under changes in the first message is common to all modes, not just the new echo mode.
Thanks for clarifying. I misunderstood 1-4 as applying to all modes, not just 1-2 as you in fact said.
@octavpo (@kevindeland, @judithodili, @eyarzebinski, @amyogan take note) -
@octavpo - When will your update be ready? - Jack
I'm trying to finish it but it's not easy. The skipping code works fine. The echo mode still has a few issues; not sure how long I'll need to finish it.
Btw, one issue I found that I can't solve on my own, probably familiar to you, is that the segmentation timing is not very accurate. In my testing it seems to cut off the last word, so I added another 100ms after it. That seems fine for the few examples I was running, but it's probably not a good general solution.
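The workaround amounts to something like this (a sketch with illustrative names, not RoboTutor's segmentation code):

```java
// Sketch of the 100 ms padding workaround described above; names are
// illustrative only.
class SegmentPadSketch {
    static final long SEGMENT_END_PAD_MS = 100;

    // Extend each word's end timestamp so imprecise segmentation doesn't
    // clip the final word, clamped to the length of the audio clip.
    static long paddedEndMs(long segmentEndMs, long clipLengthMs) {
        return Math.min(segmentEndMs + SEGMENT_END_PAD_MS, clipLengthMs);
    }
}
```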
@octavpo - Can you TODAY (Wed. 10/3 your time) make a stable branch with just the skipping code, so we can try it ASAP? @kevindeland needs all code by Thursday night 10/4 or Friday (I forget which) to prepare to deploy Monday. Please point me to an APK. I assume you know how to generate them in general, but ask @kevindeland if you need RoboTutor-specific help (e.g. regarding assets).
It took a while to merge it with the current development branch, but I think it's fine now. PR #386 includes the previous change about having different reading modes for successive sentences (which hadn't been merged into development for some reason), since they're in the same tutor. Not sure about APKs -- where do you guys put them? Not sure where I can find a tutor with a 1-word sentence to test; I haven't tried that. There is a switch in the code to enable/disable skipping, but not in configurations; I'll see about that. It does accept skipped words but leaves them black, as requested. Not sure if it fixes most false rejections; it can still happen that the next word is again wrong, and then it's rejected.
@octavpo - I wish you had just put the apk somewhere instead of incurring further delay by waiting for me to say where. Please put it in Downloads from any team member, and post a link to it here so that I'll get notified. Any of the word reading activities have lists of individual words to read. @kevindeland - Any requests/suggestions to Octav for changes to his test apk, e.g. where or how to include it under the activity matrix or CUSTOM menu in the debugger menu if it doesn't simply affect all story reading activities with modes that listen to the kid (PARROT, ECHO, READ, REVEAL)? Thanks. - Jack
@octavpo story_4 as it appears in apk 2.3.0.1 (note that there was a story resort, so this might not be story_4 in old apks) has a 1-word sentence:
@octavpo Looks like something went wrong with the build process; you're missing an important class in the APK. Here's the full stack trace:
In order to avoid this, please follow the directions found in the link Kevin provided (https://www.quora.com/How-do-I-build-an-APK-in-Android-studio). If you have the latest version of Android Studio installed, there are a few slight differences: go to Build > Generate Signed Bundle / APK, select the APK radio option, and click Next. From there follow the directions in the link to generate a signing key and the APK. @kevindeland @JackMostow if there is a branch that @octavpo has been working off of to create this, I might be able to generate a quick APK if y'all provide the branch name/link.
@ealanhill - Judging from https://github.com/RoboTutorLLC/RoboTutor/branches/active:
@JackMostow I uploaded a debug build to the same folder: https://drive.google.com/open?id=1ZUIHwtaGq44rrgrfF3loIcBX0PErcurS It has all the classes, but I'm concerned it won't work because it's a debug version signed by my debug cert, so it may throw a fit when you install it on a device. If it doesn't, great! The version code looks to be off from the one @octavpo created, but it is built from the latest on the branch.
Just saw your post. Will try immediately!
Good news: it installs, opens, and loads assets. More good news: not the fault of the apk building!
@ealanhill - Thanks for cracking the apk issue. @octavpo - story.read::story_22 doesn't have the bug above, but it has as much trouble as ever recognizing my voice. RoboTutor wrote the recording as RoboTutor_debug.2.0.0.0_2018.10.05.13.59.56_6105001158.wav, but it sounds severely sped-up.
@JackMostow yes, it says that it's version 2.0.0.0, and I think RoboTutor is much higher than that.
The code on reading modes is 2.3.0.1, so I don't know what build you guys are using, but it's not from the current branch code. I was trying to build an apk myself, but since @ealanhill was saying I should be on the latest version, and I had ignored Studio's message to update for the last week, I thought it would be a good idea to do so. But then the update went wrong, so I had to reinstall it. That worked, but now when I try to build I get a message saying I need to select an Android SDK. No idea how to do that. Any help? I'm on Android Studio 3.2.
@JackMostow Before trying to update Studio I had run story.hear::story_54 and story.read::story_22, and they both work fine. 'parrot' and 'echo' tutors are not available on the matrix generated by this branch, as they're not on development either. That shows again that you're not running my code.
OK, it looks like I needed to do some syncing; now it compiles. I'll build an apk soon.
@octavpo - 5pm today is the deadline to submit code to Kevin after our testing shows that it works properly. You were off-line for 8 hours. Meanwhile, @ealanhill was kind enough to find and fix the bug in your code and build an apk I could finally run.
What you're saying above is not true: @ealanhill hasn't fixed anything, and gave you an apk that's not the current state of the reading modes branch. And up to now it wasn't my job to create apks, so no surprise I didn't know how to do it. I do need to sleep from time to time. Anyway, I've built an apk with the code from the branch and put it in the download area if you still want to test it, as you wish. It's still 2.3 at this point; I'll work on merging the latest development code next. I don't have a problem if you don't want to include it, but the speech recognition is definitely working much better in my testing.
I'm recording Leonora till ~5 but will then try your apk.
I'm sure it wasn't the current version on the branch, since it was showing 2.0 instead of 2.3 and it had a different activity matrix. I've rebased my branch to the latest version on development, 2.5.0.0, and put a new apk in the download area. I don't get an activity selector matrix since the merge, though. I thought it was because I had chosen a release version, but now I chose debug and I still don't see it.
It seems the missing activity selector matrix is because Kevin has just pushed the external configuration facility to the development branch, so the code now defaults to a release version unless you have a configuration file that says otherwise. To get it back, push the config.json file I just put in the Google Drive download area to /sdcard/Download.
octav_robotutor.release.2.3.0.1.apk had the same repeated-sentence problem and no debugger menu, so I couldn't test ASR. I just installed octav_robotutor.debug.2.5.0.0.apk.
Despite "debug" in the name, it has no debugger menu. Is there a configuration file to enable it?
See my message above.
octav_robotutor.debug.2.5.0.0.apk also takes a long time to display a story and then has the repeated-sentence bug.
I don't see the issues you're seeing; it works fine on my tablet. Maybe I need to build it for an older API? What Android version are you running on your tablet?
The configuration file enabled the debugger menu, thanks.
I'm running Android 7.1.2 on a Google Pixel C. You?
Now I do see an error when running the signed version, which I'm not getting when running from Studio. And it does slow down tutor starts; that's probably what you're getting. It's something about class access; I'll see if I can do anything about it. It might be an issue with gradle build files, which I don't know much about. I don't see the high false rejection rate; in my tests I'd say it has a high false acceptance rate. That's due to the very constrained language model (basically just the current sentence), which I can't do much about. For me it very rarely rejects a word. Maybe it has something to do with your American accent. I did see that sometimes it gets stuck on the last word and I needed to repeat it a few times. It looks like that's something with the recognizer itself; I don't see it returning any words. The last word can be skipped, but only if the recognizer returns something. Not sure I can fix that; I'll see.
I'm running Android 8.1.0. Now that I've seen the error, I'm guessing it has nothing to do with this.
I fixed the error that was slowing tutor load down. It was an issue left over from the reading mode changes; very strange that it wasn't showing when running from Studio. So I put a new 2.5 apk in downloads if you want to try it. About the high rejection rate: is it possible to have it tested by a native speaker? Or at least a non-native English speaker? I definitely see a big improvement in my testing.
I probably don't need to say this, but of course it also matters whether you're testing in a quiet environment or with a headset. It still picks up other noises and voices and tries to match them, and then, because of the very restricted language model, it will probably succeed in returning a word that's not the right one.
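A toy illustration of why a sentence-only language model inflates false acceptance (not PocketSphinx or RoboTutor code; acousticScore is a stand-in): with no garbage or reject model, the decoder must pick some word from the sentence, so background noise still comes back as a "recognized" word.

```java
// Toy decoder over a sentence-only language model; illustrative, not real ASR.
class ConstrainedLmSketch {
    // With only the sentence's words in the model and no reject path,
    // every input -- including noise -- maps to the best-scoring word.
    static String decode(double[] audioFeatures, String[] sentenceWords) {
        String best = sentenceWords[0];
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String w : sentenceWords) {
            double s = acousticScore(audioFeatures, w); // noise still scores
            if (s > bestScore) { bestScore = s; best = w; }
        }
        return best; // never returns "no match" -> false acceptance
    }

    static double acousticScore(double[] features, String word) {
        return 0; // stand-in for a real acoustic model score
    }
}
```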
XPRIZE doesn't allow headsets or any peripheral devices. Were you using a headset mic?
@octavpo - The slowdown error was in the reading_modes branch not yet absorbed into the development branch, right? Your changes were too late to test much, so I'm skeptical that they'll make it into the beta.
I was just wondering if ambient noise can explain why you're getting a different experience than me. I'm using a headset when there's noise around, but I've tested without one too and I'm getting similar results. I'm not doing any silence compression. I just retested and I don't see any playback speed-up on my tablet. Could it be some setting on your tablet? Might help to restart it. All changes are on the same branch.
This issue was originally part of Revise READ UI/UX tracking #157:
Problem: keep READ's "blame the kid" strict left-to-right policy from encouraging kids to, read, one, word, at, a, time, unlike Project LISTEN's Reading Tutor, which used "chase the kid" to track the kid through the text.
Solution:
Currently RoboTutor underlines the word it expects next. It turns words green if accepted and red if not.
Requiring acceptance of each word before advancing to the next word discourages fluent reading due to false rejections.
Change this behavior as follows:
1. Modify position tracking to tolerate an individual missed word but not 2 in a row (see the sketch after this list).
a. MultiMatch aligns ASR output to text to minimize total penalties for mismatches and jumps. It may or may not be the right mechanism to tolerate isolated missed words, e.g. by aligning only to i, i+1, or i+2.
b. The policy means that the expected next word can skip from word i to i+2 but not further.
c. Underline the expected next word to indicate the current position i.
d. The current position should not move backwards from word i to j < i because it would be confusing.
e. The current position can skip from i to i+2 if the ASR output ends with ... word i-1 word i+1.
2. Color words green when accepted, red when rejected, and (still) black when skipped.
a. Or should skipped words turn red?
b. It's ok to visibly accept and credit a previously rejected word, but not to rescind credit once granted.
3. When the reader pauses (0.5 sec) before reading word i, echo word(s) j..i-1, where word j-1 was the last echoed word, and turn each word green after echoing it. When RoboTutor reads just one word, use its isolated narration for clarity. Otherwise use the appropriate subsequence of its human narration. (See the echo sketch after the Notes.) Note that:
a. Reading, word, by, word will echo a word 0.5 sec after it's spoken, but wait patiently for the next word.
b. Reading a phrase at a time will echo the phrase 0.5 sec after it's spoken.
4. Treat a tap almost like reading the current word:
a. A tap after a hesitation will just read (and credit) the current word i.
b. A tap with no hesitation before word i will reread the word(s) j..i. I don't expect this case to occur often.
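A minimal sketch of items 1 and 2, under stated assumptions: class and method names are hypothetical, the ASR is reduced to a stream of heard words, and matching is plain string comparison rather than MultiMatch.

```java
// Hypothetical sketch of items 1-2; names are illustrative, not RoboTutor's code.
class PositionTrackerSketch {
    private final String[] words;  // the sentence being read
    private int expected = 0;      // index i of the underlined word (1c)

    PositionTrackerSketch(String[] words) { this.words = words; }

    // Called with each word the ASR hears, in order.
    void onHeardWord(String heard) {
        if (expected >= words.length) return;              // sentence done
        if (heard.equalsIgnoreCase(words[expected])) {
            markGreen(expected);                           // accepted (2)
            expected++;                                    // i -> i+1
        } else if (expected + 1 < words.length
                && heard.equalsIgnoreCase(words[expected + 1])) {
            // ASR output ends with ... word i-1, word i+1 (1e): tolerate
            // the single missed word but never skip further than i+2 (1b).
            leaveBlack(expected);                          // skipped word stays black (2)
            markGreen(expected + 1);
            expected += 2;                                 // i -> i+2
        }
        // Otherwise: two misses in a row are not tolerated (1), and the
        // position never moves backwards (1d); stay at word i.
        if (expected < words.length) underline(expected);  // show position (1c)
    }

    void markGreen(int i)  { /* color word i green */ }
    void leaveBlack(int i) { /* word i keeps its black color */ }
    void underline(int i)  { /* underline the expected next word */ }
}
```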
Notes:
Project LISTEN's Reading Tutor changed the background color of each word to yellow while reading it, to show its position. This would be a nice feature if there's time, to make clear which word is being read.
Project LISTEN's Reading Tutor deferred crediting words (turning them green) rather than turning them green right away. We can implement that behavior if the behavior above causes kids to wait for each word to turn green before reading the next word.
Replacing underlining with RoboFinger would be cute but is not worth the effort, especially if it requires representing the position of each text word on the screen.
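And a sketch of the echo and tap behavior in items 3 and 4, again with hypothetical names: the 0.5 sec hesitation is modeled as a resettable timer, and playNarrationAndTurnGreen stands in for playing the isolated-word or subsequence narration.

```java
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical sketch of items 3-4; not RoboTutor's actual echo code.
class EchoSketch {
    static final long HESITATION_MS = 500;  // the 0.5 sec pause in item 3
    private int lastEchoed = -1;            // index j-1 of the last echoed word
    private int lastHeard  = -1;            // index i-1 of the last word read
    private final Timer timer = new Timer();
    private TimerTask pending;

    // Called whenever the reader finishes a word; restarts the pause timer.
    void onWordRead(int wordIndex) {
        lastHeard = wordIndex;
        restartTimer();
    }

    // Item 3: after a 0.5 sec pause, echo words j..i-1 and turn them green.
    private void restartTimer() {
        if (pending != null) pending.cancel();
        pending = new TimerTask() {
            @Override public void run() { echo(lastEchoed + 1, lastHeard); }
        };
        timer.schedule(pending, HESITATION_MS);
    }

    // Item 4: a tap behaves almost like reading the current word i.
    void onTap(int currentWord, boolean hesitated) {
        if (hesitated) readWord(currentWord);        // 4a: just read word i
        else echo(lastEchoed + 1, currentWord);      // 4b: reread words j..i
    }

    private void echo(int from, int to) {
        if (from > to) return;
        // Play isolated narration for a single word, else the narration
        // subsequence for words from..to (item 3), turning each green.
        for (int w = from; w <= to; w++) playNarrationAndTurnGreen(w);
        lastEchoed = to;
    }

    private void readWord(int i) { /* play and credit word i (4a) */ }
    private void playNarrationAndTurnGreen(int w) { /* stub */ }
}
```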