Annotation fixes and enhancement #4
Comments
I would like to expand on the first point. After using the app more, the way annotations are grouped into segments seems very inconsistent, and I could not identify a pattern. Sometimes it creates one segment for each annotation, sometimes it doesn't. |
@bpadovese, is this problem happening only when importing annotations to create the segments? |
Yes, only when importing annotations to create the segments. |
The algorithm for importing annotations is designed like this:
@fsfrazao @bpadovese The algorithm can be adjusted, but you guys need to let me know what exactly is the final result you expect. |
I think a better way would be to separate the "import annotations" action from the creation of segments (I believe this is what Bruno was suggesting above). The workflow could be something like this:
In this case, when importing annotations the annotator field would be set to the developer who is importing them. Does that make sense? |
I guess the main point is that the model developer would like more control over what the segments look like. I think what I suggested above would provide a flexible option, but other ways to achieve the same result could be fine too. |
@fsfrazao I hope you can explain a little more, as I do not quite understand the role of the segment. Why does the model developer need to control the creation of segments in the import process? It seems to me that segments just provide an intermediate layer between files and annotations, mainly to help ketos processing. Does it matter to the annotator how segments are generated? What is the difference for the annotator whether each annotation is mapped exclusively to a single segment, several annotations are mapped to a shared segment, or any other combination? Wouldn't the latter be more effective? The current algorithm is very effective at automatically generating the necessary segments, so what is the difference compared to segments created manually by the model developer? Maybe the developer could customize the length of the segments, but does this have any particular purpose? |
Well, the segment is also a way for the model developer to control what they want to expose to the annotator in terms of temporal context. In a validation scenario where the developer wants the annotator to verify existing annotations (generated by a model or another annotator), the model developer might ask for different tasks from the annotators, such as:
All of these can use the same imported annotations, but the second and third tasks would be impossible if the segment does not extend beyond the annotation. This is the kind of thing Bruno is trying to do. His detector-generated outputs are always 3 seconds long, but he might want to show 1-minute segments to the annotators so they can not only confirm whether the model detection was correct but also whether it missed something in the vicinity of each detection. |
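To make the "temporal context" idea concrete, here is a minimal sketch of placing a fixed-length window around a short detection. It is not HALLO's code: the function name `context_window`, its parameters, and the choice to center the window on the detection are all assumptions made for illustration.

```python
def context_window(det_start, det_end, segment_length, file_duration):
    """Return (start, end) of a segment of `segment_length` seconds
    centered on a detection, clamped to the file boundaries.

    Hypothetical helper; centering is an assumption, not HALLO's behavior.
    """
    center = (det_start + det_end) / 2
    start = center - segment_length / 2
    # Shift the window so it stays inside [0, file_duration].
    start = max(0.0, min(start, file_duration - segment_length))
    end = min(start + segment_length, file_duration)
    return start, end


# A 3 s detection at 100-103 s in a 15 min file, shown with 60 s of context.
print(context_window(100, 103, 60, 900))  # (71.5, 131.5)
```

Near the start or end of a file the window is shifted rather than shortened, so the annotator still sees the full requested length whenever the file is long enough.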
@fsfrazao Right now, it is relatively easy to provide some controls for generating segments in the import annotation interface, such as the length of the segment and whether to map multiple annotations to a single segment or to generate a single segment per annotation. There will be some edge cases, e.g. some annotations may cross the boundary of a segment, so it is also possible to control whether there is some overlap between segments. How these segments are generated depends on how the user wants to use the feature eventually. It is possible to implement such controls in HALLO with the current model. However, implementing the scenario you described earlier, for example importing annotations and segments separately and then associating them, would require major changes to the database and backend. In the current database design, annotation, batch, and segment are in a one-to-many relationship, so each annotation must exist in a batch and must be associated with a segment. The scenario you mentioned requires a many-to-many association in the database, and separate tables to track the association between segment, batch, and annotation. This design would increase the overall complexity of data processing, and the user interface would need to be adjusted accordingly. I feel this can be developed as a requirement for MAPIL, or HALLO could be redeveloped for a 2.0 version. |
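For reference, the many-to-many association mentioned above could look roughly like the sketch below, using an in-memory SQLite database. The table and column names (`segment`, `annotation`, `segment_annotation`, `start_s`, `end_s`) are hypothetical and do not reflect HALLO's actual schema; batches are left out to keep the example short.

```python
import sqlite3


def build_demo_db():
    """Create a toy schema where segments and annotations are independent
    and linked through a separate association table (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE segment (
            id INTEGER PRIMARY KEY, file TEXT, start_s REAL, end_s REAL);
        CREATE TABLE annotation (
            id INTEGER PRIMARY KEY, file TEXT, start_s REAL, end_s REAL,
            label TEXT);
        CREATE TABLE segment_annotation (
            segment_id INTEGER NOT NULL REFERENCES segment(id),
            annotation_id INTEGER NOT NULL REFERENCES annotation(id),
            PRIMARY KEY (segment_id, annotation_id));
    """)
    # Two overlapping segments and one annotation, imported independently.
    conn.executemany(
        "INSERT INTO segment (file, start_s, end_s) VALUES (?, ?, ?)",
        [("a.wav", 0, 60), ("a.wav", 30, 90)])
    conn.execute(
        "INSERT INTO annotation (file, start_s, end_s, label) "
        "VALUES ('a.wav', 40, 43, 'call')")
    # Associate every annotation with every segment it overlaps in time.
    conn.execute("""
        INSERT INTO segment_annotation (segment_id, annotation_id)
        SELECT s.id, a.id FROM segment s JOIN annotation a
          ON s.file = a.file AND a.start_s < s.end_s AND a.end_s > s.start_s
    """)
    return conn


conn = build_demo_db()
links = conn.execute(
    "SELECT segment_id, annotation_id FROM segment_annotation "
    "ORDER BY segment_id").fetchall()
print(links)  # [(1, 1), (2, 1)]
```

Here one annotation belongs to two segments at once, which the one-to-many design described above cannot express.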
I think what you suggested in the first paragraph is just fine for now. So putting it into bullet points, would it be reasonably easy to change the algorithm to this?:
However, I think for MAIPL, what you described in the second paragraph would be ideal. For the user, it wouldn't make a difference; the functionality would be the same. But I believe that treating annotations, batches, and segments independently and associating them in a many-to-many relation would add flexibility for any future feature we would like to add to MAIPL. How does this sound? |
Yes, I can adjust the algorithm to achieve this requirement. I don't know if I understand this accurately: For example, if the user specifies that each segment is 60 seconds long, then the file will be segmented into 60s long segments and annotations will be mapped to the corresponding segments. If the end time of an annotation exceeds the end time of a segment, the end time of the segment needs to be extended accordingly. No padding means the last segment can be ended at the end of the file and does not need to be extended to the 60s length, right? |
Yes, that is what I mean. Does this sound good to you as well? @fsfrazao |
That sounds good to me too. Just to confirm that I understand it correctly: if there's an annotation table listing only one 3s annotation for one file that is 15min long and the user specifies 60s segments, will the batch contain 15 segments with a duration of 60s each, and one of them will have the annotation? |
Yes, this is the updated algorithm. |
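The behavior agreed on in the last few comments can be sketched roughly as follows. This is an illustration under stated assumptions, not HALLO's implementation: `Segment` and `build_segments` are hypothetical names, and mapping each annotation to the segment that contains its start time is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    start: float
    end: float
    annotations: list = field(default_factory=list)


def build_segments(file_duration, segment_length, annotations):
    """Cut a file into fixed-length segments and map annotations onto them.

    `annotations` is a list of (start, end) tuples in seconds.
    Hypothetical sketch, not the actual import algorithm.
    """
    if file_duration <= 0 or segment_length <= 0:
        raise ValueError("durations must be positive")
    segments = []
    start = 0.0
    while start < file_duration:
        # No padding: the last segment simply ends at the end of the file.
        segments.append(Segment(start, min(start + segment_length, file_duration)))
        start += segment_length
    for ann_start, ann_end in annotations:
        # Assumption: an annotation belongs to the segment holding its start.
        index = min(int(ann_start // segment_length), len(segments) - 1)
        seg = segments[index]
        seg.annotations.append((ann_start, ann_end))
        # Extend the segment if the annotation runs past its end.
        seg.end = max(seg.end, min(ann_end, file_duration))
    return segments


# One 3 s annotation in a 15 min file, with 60 s segments.
segments = build_segments(900, 60, [(100, 103)])
print(len(segments))                                # 15
print(sum(1 for s in segments if s.annotations))    # 1
```

With an annotation at 58-62 s, the first segment would be extended to end at 62 s, so it overlaps the second segment by 2 s.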
Opening this issue again. The algorithm and interface are, I think, much better and easier to understand now. I have just one request. Currently the algorithm is working more or less in the following way:
I think this is good as is if the model developer wants the annotator to check for additional false positives in the file. However, would it be possible to add an additional option (a checkbox) that would delete segments that do not contain any annotations? The end result would then be a batch where each segment of length X contains an annotation. The algorithm would essentially be the same with one extra step:
IF checkbox is checked
The use case for this would be similar to mine, where I just want to validate my model's detections but I don't care about false positives. Does this make sense to you guys? |
That makes sense, Bruno. |
Yup, it's possible. We just need to filter the segments after the import and delete the ones that don't have any annotations on them. |
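The extra filtering step could be as small as the sketch below. The name `drop_empty_segments`, the `enabled` flag standing in for the checkbox, and the dictionary layout of a segment are all assumptions for illustration.

```python
def drop_empty_segments(segments, enabled=True):
    """Keep only segments that hold at least one annotation.

    `segments` is a list of dicts with an "annotations" list (hypothetical
    layout). `enabled` mirrors the proposed checkbox: when False, nothing
    is removed.
    """
    if not enabled:
        return list(segments)
    return [seg for seg in segments if seg["annotations"]]


batch = [
    {"start": 0, "end": 60, "annotations": []},
    {"start": 60, "end": 120, "annotations": [(100, 103)]},
    {"start": 120, "end": 180, "annotations": []},
]
print(drop_empty_segments(batch))
# [{'start': 60, 'end': 120, 'annotations': [(100, 103)]}]
```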
There is an issue when importing the annotation and creating the segments.
The annotations themselves are imported correctly; however, the created segments are not. There are two issues here.
See images below for the issues:
We could consider adding the segments separate from the annotation.