Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scatter plots for grid views #343

Open
K-Meech opened this issue Jul 8, 2021 · 33 comments
Open

Scatter plots for grid views #343

K-Meech opened this issue Jul 8, 2021 · 33 comments

Comments

@K-Meech
Copy link
Collaborator

K-Meech commented Jul 8, 2021

Grid views currently do not support scatter plots. This would be very useful for data with a lot of whole-image related measurements.

@tischi
Copy link
Contributor

tischi commented Jul 8, 2021

@K-Meech @constantinpape

I had a look and I could hack something together but I think it would be better if we spent some time on this to make it proper.

The point is that I think that the GridSourceTransformer and the SegmentationSourceDisplay will have a lot of common fields (both in the java code and in the JSON spec). Essentially everything that has to do with the table, the scatterplot and "segements/grid position" selection and coloring modes and so on will be identical.

Thus, I feel on the Java side those two classes should probably both extend the same class, maybe something like ImageRegionDisplay, which could contain all the common fields.

@constantinpape how are things looking on the JSON spec side in this regard? Is there a way to "inherit specifications" or will the GridSourceTransformer and SegmentationSourceDisplay simply have a lot of common fields?

@constantinpape
Copy link
Contributor

Having a common base class sounds like the way to go.

@constantinpape how are things looking on the JSON spec side in this regard? Is there a way to "inherit specifications" or will the GridSourceTransformer and SegmentationSourceDisplay simply have a lot of common fields?

jsonschema does not directly support inheritance, but it's possible to have references. So we could have external definitions of all the shared attributes and then just reference them in the grid transformation and segmentation display spec to avoid duplication.

@tischi
Copy link
Contributor

tischi commented Jul 8, 2021

Writing this, both in Java and in the spec it feels that the GridSourceTransformer is doing a bit too much right now, since it is both transforming the sources to land at the right grid positions, but is also does all the table stuff. I wonder whether we could separate these two funtionalities on the spec side? I think it comes back to the discussion that we had in zoom: Maybe we can somehow have a GridSourceDisplay in the JSON, which can have the table and scatterplot stuff and would live in parallel to the GridSourceTransformer?

@tischi
Copy link
Contributor

tischi commented Jul 8, 2021

...I feel if we find a way to do this it would probably also fix the issue that we discussed with the many image tables for a plate view...

@K-Meech
Copy link
Collaborator Author

K-Meech commented Jul 8, 2021

I agree, having a GridSourceDisplay makes a lot of sense. In the end, it's making a new row in the MoBIE GUI for the grid, so it seems a lot more consistent to have this listed as a display (rather than only a transform)

I guess you could, rather than a sources field, have a gridTransform field where you give the name of the grid transform? Then the transform could stay where it is currently?

@constantinpape
Copy link
Contributor

constantinpape commented Jul 8, 2021

I agree that it makes sense to separate the gridTransform and gridTable / gridDisplay functionality more, but I am not so sure what the best option is:

  • Right now gridTransform is part of the sourceTransforms and I am not sure it would be a good idea to move transform specific things into a gridDisplay (because that creates dependencies in the view). That's more a concern on the java side, it would be fine in the spec, so I don't mind doing this if you think it's ok. I think this is what Kimberly suggests, where we would add gridDisplay (or so) and have the gridTransform(s?) field in there instead of sources.
  • Alternatively we could also think about promoting grid to a source by adding the option {"grid": {...}} as a new source type, and then having gridSourceTransform as part of the view. I would like this more from the spec side, because it removes a lot of the duplication issues that motivated Composite views / meta views mobie.github.io#59.
    But I am not sure how feasible it is on the java side.

@tischi
Copy link
Contributor

tischi commented Jul 8, 2021

I have a slightly a different proposal, which is more flexible as it does not tie it to a GridTransformer.

public class SourceAnnotationDisplay
{
	// Serialization
	protected String name;
	protected List< List< String > > sources; // DIFFERENT
	protected Map< TableDataFormat, StorageLocation > tableData; // DIFFERENT
	protected List< String > tables;
	protected String lut = ColoringLuts.GLASBEY;
	protected String colorByColumn;
	protected double opacity = 1.0;
	protected Double[] valueLimits = new Double[]{ null, null };
	protected List< String > selectedSegmentIds; // this would correspond to the "region names", I guess...
	protected boolean showScatterPlot = false;
	protected String[] scatterPlotAxes; // DIFFERENT (no default)

Where (I think as in the GridTransfromer),

  1. the table rows correspond to the indices of the List< List< String > > sources
  2. the annotation interval is drawn such that is covers the union of the sources at each "position", i.e. the inner list
  3. the tables must have one column called name (corresponding to the label_id).

This is almost identical to the SegmentationSourceDisplay, apart from the places that I marked as different:

public class SegmentationSourceDisplay
{
	// Serialization
	protected String name;
	protected double opacity = 1.0;
	protected List< String > sources;  // DIFFERENT
        protected List< String > tables;
	protected String lut = ColoringLuts.GLASBEY;
	protected String colorByColumn;
	protected Double[] valueLimits = new Double[]{ null, null };
	protected List< String > selectedSegmentIds;
	protected boolean showSelectedSegmentsIn3d = false; // DIFFERENT
	protected boolean showScatterPlot = false;
	protected String[] scatterPlotAxes = new String[]{ Constants.ANCHOR_X, Constants.ANCHOR_Y };

I am not sure if the name selectedSegmentIds is still the best if we are using it for both cases. Maybe something like selectedTableRowIds would be better?!

@tischi
Copy link
Contributor

tischi commented Jul 8, 2021

...oh, and I already know a beautiful use-case for this: the tomograms in the yeast-CLEM project, which are not in a grid but it could be still very nice to browse them via a interactive table 😃

@tischi
Copy link
Contributor

tischi commented Jul 8, 2021

Alternatively we could also think about promoting grid to a source by adding the option {"grid": {...}} as a new source type

We could do this in addition to my proposal; as it would not change anything but just be another way to specify the protected List< List< String > > sources in the proposed SourceAnnotationDisplay.

@tischi
Copy link
Contributor

tischi commented Jul 8, 2021

Regarding the selectedTableRowIds: maybe we should be more explicit and require that there actually is a column with that string? Right now we are creating the ID on the fly in Java combining multiple columns. This works, but I think it may actually complicate things both on the spec as well as on the Java side.

@constantinpape
Copy link
Contributor

Ok, we decided to implement #343 (comment). This way the SourceAnnotationDisplay is independent from the grid transform, the grid transform just happens to arrange the same sources.
SegmentationSourceDisplay and SourceAnnotationDisplay can share the same base class, but we decided to have selectedSegmentIds and selectedSourceAnnotationIds as separate fields in the inhereriting classes, so that we don't have to change the current logic for the segmentation tables.

Also, we will not introduce grid sources (for now, we can consider doing this later; I will open another issue about this).

I will update the spec and prepare an example in the zebrafish project.

@K-Meech
Copy link
Collaborator Author

K-Meech commented Jul 8, 2021

Ok - great! Making it more generic is a great plan, as there are plenty of use cases for annotating / navigating sources that don't necessarily need to be arranged in a grid :)

I'm a bit unsure how these tables will look. Would they have a column for annotationId? Or they are just being matched based on the order of the rows being equal to order of the items in List< List< String > > sources?

@constantinpape
Copy link
Contributor

@K-Meech that's a good point, we didn't fully discuss this.
I would just add a column sourceAnnotationId (to be consistent with the name) and than indeed have the same order as in the outer list of sources.

We would also need to discuss which format selectedSourceAnnotationId has. I guess timepoint-sourceAnnotationId should be sufficient for uniqueness?!

What do you think, @tischi?

@tischi
Copy link
Contributor

tischi commented Jul 8, 2021

  1. I think it should be source_annotation_id to be consistent with the label_id?
  2. yes, I think timepoint - source_annotation_id should do the job
  3. @K-Meech currently, the table rows MUST be in the same order as the List< List< String > > sources, as the indexing would be used for the matching. We could think about changing this to a Map< String, List< String > > sourceAnnotationIdToSources to make things more explicit and not rely on the ordering. I think I'd like this (even though if an annotation position just contains single source it feels a bit annoying to have to give it another name, which probably would be just the same name as the source name itself).

@tischi
Copy link
Contributor

tischi commented Jul 8, 2021

We could also consider this for the GridTransformer: LinkedHashMap< String, List< String > > gridIdToSources, as this would give each grid position a unique name that we could use if we ever want to refer to a grid position as a "source". I am writing LinkedHashMap here, because the order to the items in the Map would still be important here as they would determine the Grid position if no specific grid position is given (if you just say Map in Java you cannot be sure in which order you will get the keys). For the case of providing specific grid positions we would then also have to change this to Map< String, int[] > gridIdToGrindIndices.

@K-Meech
Copy link
Collaborator Author

K-Meech commented Jul 8, 2021

@tischi I think it'd be nice to switch to Map<String, List<String>> - I think this is easier to understand + means assembling tables is a lot easier (no row order constraints!)

source_annotation_id sounds good

For the GridTransformer - maybe wait until there is a specific use case for this? I don't think it's super vital for now to change it.

@constantinpape
Copy link
Contributor

  • I think it should be source_annotation_id to be consistent with the label_id?

  • yes, I think timepoint - source_annotation_id should do the job

Ok!

LinkedHashMap< String, List< String > > gridIdToSources

If you prefer a (linked) map for all this (sources in sourceAnnotationDisplay, gridIdToSources, gridIdToGridIndices)
we should change this all in one go, because I need to change the spec for the grid view now anyways.
Would be fine with me, I don't have any preferences here.

@tischi
Copy link
Contributor

tischi commented Jul 8, 2021

we should change this all in one go

I agree.

Would be fine with me, I don't have any preferences here.

I don't have a very strong preference for the GridTransformer either... However, I think I would vote for making it a Map as well, because then for the HTM we could give names such as wellA01 as gridIds which may make human inspection of the JSON easier.

=> All Maps, please (my opinion).

@constantinpape
Copy link
Contributor

=> All Maps, please (my opinion).

Ok, I will change it for all three then, will ping you when the example is there.

@tischi
Copy link
Contributor

tischi commented Jul 9, 2021

@constantinpape
How should we name the fields?

public class SourceAnnotationDisplay extends TableDisplay< AnnotatedIntervalTableRow >
{
	// Serialization
	protected Map< String, List< String > > annotatedSources; // or just "sources"?
	protected List< String > selectedSourceAnnotationIds;

@tischi
Copy link
Contributor

tischi commented Jul 9, 2021

I am thinking now that just sources is probably better.

@constantinpape
Copy link
Contributor

I am thinking now that just sources is probably better.

Yes, let's just keep it as sources

@tischi
Copy link
Contributor

tischi commented Jul 9, 2021

@constantinpape

Would it be OK for you to simply call it selectedAnnotationIds instead of selectedSourceAnnotationIds?

protected Map< String, List< String > > sources; // annotationId to sources
protected List< String > selectedAnnotationIds;
protected Map< TableDataFormat, StorageLocation > tableData;

@constantinpape
Copy link
Contributor

Would it be OK for you to simply call it selectedAnnotationIds instead of selectedSourceAnnotationIds?

Yes, no problem.

I have the changes to the spec almost ready, I only need to create the example data; will do that later.

@constantinpape
Copy link
Contributor

LinkedHashMap< String, List< String > > gridIdToSources

@tischi I just realized that you should not rely on the sorting here!
I save the jsons with sort_keys=True, because this makes them more readable. However, this means that the keys of the grid sources will be string sorted and not (as would be correct) int sorted.

@constantinpape
Copy link
Contributor

@tischi
Copy link
Contributor

tischi commented Jul 10, 2021

Thanks! I guess the sourceNamesAfterTransform should then also be a Map? I don't think you have it in the example...

@tischi
Copy link
Contributor

tischi commented Jul 10, 2021

@constantinpape the column names in tables for the sourceAnnotationDisplay are still grid_id and membrane; I think there should be columns for annotation_id and timepoint, right?
(we originally said source_annotation_id but I think just annotation_id is better...)

@constantinpape
Copy link
Contributor

the column names in tables for the sourceAnnotationDisplay are still grid_id

Yes, I forgot to update the tables. The first column is now called annotation_id; I pushed the change to the branch already.

and timepoint, right?

I am not sure if there should be a timepoint column: for the segmentation tables we also don't have them in the table and you add this column when loading the table in MoBIE.

If possible I would keep the same loging for the source annotation tables; but if you need the timepoint column for technical reasons here I can also add it.

@tischi
Copy link
Contributor

tischi commented Jul 12, 2021

in other words: the timepoint column is optional, right?

@constantinpape
Copy link
Contributor

in other words: the timepoint column is optional, right?

Yes, indeed it's optional and only there if we actually have multiple timepoints.
I just checked and that's how we also handle it for the segmentation tables, see https://raw.githubusercontent.com/mobie/arabidopsis-root-lm-datasets/spec-v2/data/arabidopsis-root/tables/lm-cells/default.tsv.

@constantinpape
Copy link
Contributor

I think this one is fairly high priority because we can use it for some of the planned figures. @K-Meech is this something you could look into?

@K-Meech
Copy link
Collaborator Author

K-Meech commented Aug 12, 2021

Yes - I can take a look :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants