
Diagram - Fix the problem of Correct Positioning / Ordering and Spacing of the Bands on the Chromosome/Ideogram #407

Open
sprintell opened this issue Sep 25, 2024 · 2 comments
@sprintell
Member

sprintell commented Sep 25, 2024

The bands in the newly created scalable diagram were previously positioned randomly on the chromosome, which is not acceptable and needs to be fixed.

WE NEED TO ENSURE THEY ARE ORGANIZED:

  1. According to the expected sequential order (not random):

ch12-pic4.png

  2. In the exact position where they are meant to be on each chromosome, with the exact spacing between them

SOLVING PROBLEM 1:
An algorithm has been developed to solve Problem 1 while dynamically generating the trait dots and constructing the SVG diagram.
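The algorithm itself is not shown in this issue. As a minimal sketch of one way to obtain the expected sequential order, the snippet below sorts cytogenetic band names from the tip of the short arm (p), through the centromere, to the tip of the long arm (q). The function name `band_sort_key`, the regular expression, and the sample band names are assumptions for illustration only, not the code used in the diagram.

```python
import re

# Matches names such as "3p12.3", "1q44" or "Xq28":
# chromosome, arm (p or q), then the band number.
BAND_RE = re.compile(r"^(\d+|X|Y)([pq])(\d+(?:\.\d+)?)$")

# Hypothetical ordering for the sex chromosomes, placed after the autosomes.
SEX_CHROMOSOMES = {"X": 23, "Y": 24}


def band_sort_key(name):
    """Return a key that sorts bands in top-to-bottom ideogram order."""
    match = BAND_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised band name: {name!r}")
    chrom, arm, band = match.groups()
    chrom_index = SEX_CHROMOSOMES[chrom] if chrom in SEX_CHROMOSOMES else int(chrom)
    value = float(band)
    if arm == "p":
        # Band numbers on the p arm count outwards from the centromere,
        # so the highest number sits at the top of the chromosome.
        return (chrom_index, 0, -value)
    # Band numbers on the q arm increase away from the centromere.
    return (chrom_index, 1, value)


bands = ["1q21.1", "1p36.33", "1q44", "1p11.2", "1p36.32"]
ordered = sorted(bands, key=band_sort_key)
print(ordered)
```

Comparing the band number as a float relies on each digit after the dot being one sub-band level, which holds for standard band names but would need revisiting if the data contains other naming schemes.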

SOLVING PROBLEM 2:
For this to happen, we need to know exactly where on each chromosome the region lines / bands must be positioned; otherwise the developer will have to guess.

There are two possible options for solving this problem:

A. Extract the knowledge from the existing diagram (this is going to be very difficult)

An instance of the code that specifies a region line and plots it on the chromosome in the old diagram is shown below:

<g id='http://rdf.ebi.ac.uk/dataset/gwas/Trackable/12856' transform='translate(0.0,0.0)' class='gwas-trait'>
<path d='m 38.795875,164.11187499999997 11.382812500000002,0.0 22.765625000000004,0.0' style='fill:none;stroke:#211c1d;stroke-width:1.1;stroke-linecap:butt;stroke-linejoin:miter;stroke-miterlimit:4;stroke-opacity:1;stroke-dasharray:none' />
</g>

There are over a thousand of these, and they are responsible for the region lines and their existence on the plot.

The coordinate that holds the actual position of the band on the chromosome is hardcoded in the path tag above, in the SVG attribute "d", as in the example below extracted from the old diagram:

d='m 38.795875,164.11187499999997 11.382812500000002,0.0 22.765625000000004,0.0'

The value 164.11187499999997 is what places the given band line on the chromosome, and there is no identifier in that tag that tells which region line it represents.

Hence, to extract knowledge from this, each of the tags has to be plotted individually and visualized one after the other to work out which region it represents on the plot, after which we can make use of the coordinate. This is possible, but the manual extraction of these coordinates will take some time.
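Reading the coordinates out of the old SVG can at least be automated, even though matching each one to a region remains a manual step. The sketch below is a rough illustration: it collects the starting point of every path inside a `gwas-trait` group and sorts the results by the y value. The helper names are hypothetical, the `style` attribute in the sample is shortened, and it assumes that the group transform is the identity, as in the instance above.

```python
import re
import xml.etree.ElementTree as ET

# The instance quoted above, with the style attribute shortened.
SVG_SNIPPET = """<g id='http://rdf.ebi.ac.uk/dataset/gwas/Trackable/12856' transform='translate(0.0,0.0)' class='gwas-trait'>
<path d='m 38.795875,164.11187499999997 11.382812500000002,0.0 22.765625000000004,0.0' style='fill:none' />
</g>"""

NUMBER = r"(-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
MOVETO_RE = re.compile(r"\s*[mM]\s*" + NUMBER + r"[\s,]+" + NUMBER)


def path_start(d):
    """Return (x, y) of the first moveto in a path `d` string.

    A leading lowercase `m` is treated as absolute by SVG, so the
    first coordinate pair is the start point in either case.
    """
    match = MOVETO_RE.match(d)
    if match is None:
        raise ValueError(f"path does not start with a moveto: {d!r}")
    return float(match.group(1)), float(match.group(2))


def local_name(tag):
    """Strip an XML namespace such as '{http://www.w3.org/2000/svg}'."""
    return tag.rsplit("}", 1)[-1]


def extract_band_lines(svg_text):
    """Return (group id, x, y) for each band line, sorted top to bottom."""
    root = ET.fromstring(svg_text)
    rows = []
    for group in root.iter():
        if local_name(group.tag) != "g" or group.get("class") != "gwas-trait":
            continue
        for child in group:
            if local_name(child.tag) == "path" and child.get("d"):
                x, y = path_start(child.get("d"))
                rows.append((group.get("id"), x, y))
    return sorted(rows, key=lambda row: row[2])


rows = extract_band_lines(SVG_SNIPPET)
print(rows)
```

Sorting by y gives the top-to-bottom order of the lines, but lines from different chromosomes would still have to be separated, for example by their x value, before any region names could be assigned.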

B. Alternatively, extract the knowledge from a static image

Such as the ones found on Wikipedia for each of the chromosomes

Human_chromosome_1_ideogram_vertical.svg.png

This is possible as well, and will require less time than option A.

CONCLUSION:
Implement and deploy the solution to Problem 1. The diagram has so many traits that they shift the angle of inclination of the region lines, to the point where the actual spacing between the lines might not be of much significance.

We deploy this, and implement the solution to Problem 2 later.

@sprintell sprintell self-assigned this Sep 25, 2024
@sprintell
Member Author

The new diagram has 3 services and uses Solr:

1. THE DIAGRAM DATA RELEASE:
a. This is similar to the previous pussycat, which did the data loading during data release
b. It doesn’t build the diagram anymore; it only prepares the data to be retrieved in JSON format
c. It retrieves data from Oracle and loads/organizes it into Solr (making it easy to retrieve for the diagram UI)

2. GWAS DIAGRAM BACK END API:
a. This service provides different endpoints to be called by the UI to dynamically build the diagram on the fly
b. It queries the prebuilt Solr index to get chromosomes, regions, associations, etc.

Example - Chromosome Endpoints:
http://gwas-snoopy.ebi.ac.uk:8685/chromosomes/1
http://gwas-snoopy.ebi.ac.uk:8685/chromosomes/2
...
http://gwas-snoopy.ebi.ac.uk:8685/chromosomes/22
Association Endpoint For the UI Pop Up:
http://gwas-snoopy.ebi.ac.uk:8685/associations?region=3p12.3&efo=self%20reported%20educational%20attainment
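For anyone calling these endpoints from a script, the sketch below shows how the URLs above can be built with the query parameters percent-encoded. The helper names are hypothetical, and the base URL is the test deployment listed in this comment, so it is expected to change. The snippet only builds the URLs and makes no request, since the response format is not described here.

```python
from urllib.parse import quote, urlencode

# Test deployment taken from this comment; not a production address.
BASE_URL = "http://gwas-snoopy.ebi.ac.uk:8685"


def chromosome_url(chromosome):
    """URL of the chromosome endpoint, e.g. for chromosome 1."""
    return f"{BASE_URL}/chromosomes/{chromosome}"


def associations_url(region, efo):
    """URL of the association endpoint used by the UI pop-up."""
    # quote_via=quote encodes spaces as %20 rather than '+',
    # matching the example URL above.
    query = urlencode({"region": region, "efo": efo}, quote_via=quote)
    return f"{BASE_URL}/associations?{query}"


chromosome_urls = [chromosome_url(n) for n in range(1, 23)]
pop_up_url = associations_url("3p12.3", "self reported educational attainment")
print(pop_up_url)
```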

3. GWAS DIAGRAM INTEGRATED IN THE CATALOG UI:
a. This can be accessed on the catalog UI at the URL /v2/diagram
b. I’ve also implemented zoom, drag and pan features

IMPORTANT NOTE FOR DEVS:

  1. To avoid merging to dev (so that other UI features can be released before we release the diagram into production) while still allowing curators to test and give feedback, I’ve deployed all of the services manually as embedded JARs on snoopy
  2. The diagram is not using our dev Solr at the moment:
    I was facing some marshalling problems when retrieving data after loading it into Solr in dev, even though everything was working properly locally (I remember Sajo had similar problems in the past). I discovered this was because our Solr instances are old (Solr 5.X.X)
    So, I temporarily deployed a new, up-to-date Solr 8.11.4 here (/nfs/gwas/data/dev1/tools/solr-2024), which is where the diagram data is loaded

SUMMARY:

  1. Test Diagram API: http://gwas-snoopy.ebi.ac.uk:8685/chromosomes/1
  2. Test GWAS UI for Testing Diagram: http://gwas-snoopy.ebi.ac.uk:8082/v2/diagram
  3. Test Diagram Solr: http://gwas-snoopy.ebi.ac.uk:8983/solr
