Implement plan for creating ground truth YAML files, converting to Dhall, and deploying new ICEES instances #29

Open
karafecho opened this issue Oct 5, 2021 · 4 comments
@karafecho
Collaborator

karafecho commented Oct 5, 2021

(1) QC Priya's identifiers YAML (fix errors, adjust mappings, update search terms in all_features, and add identifiers to the identifiers file). (2) Add the SeptRelayFix identifiers, fix the spelling error (https://github.com/NCATS-Tangerine/icees-api-config/pull/28/files), and add the identifiers for methylprednisone listed below. (3) Check variable names in the identifiers YAML against the ICEES tables for all use cases. (4) Check ICEES COVID for correct dexamethasone; if present, register with ARS.

{
  "PUBCHEM.COMPOUND:124653": {
    "id": {
      "identifier": "PUBCHEM.COMPOUND:124653",
      "label": "6alpha-Methylprednisone"
    },
    "equivalent_identifiers": [
      {
        "identifier": "PUBCHEM.COMPOUND:124653",
        "label": "6alpha-Methylprednisone"
      },
      {
        "identifier": "UNII:621XR2W6OP",
        "label": "6.ALPHA.-METHYLPREDNISONE"
      },
      {
        "identifier": "DRUGBANK:DB12952"
      },
      {
        "identifier": "MESH:C050574",
        "label": "6-methylprednisone"
      },
      {
        "identifier": "CAS:91523-05-6"
      },
      {
        "identifier": "INCHIKEY:SVYCRJXQZUCUND-PQXSVQADSA-N"
      }
    ],
    "type": [
      "biolink:SmallMolecule",
      "biolink:MolecularEntity",
      "biolink:ChemicalEntity",
      "biolink:NamedThing",
      "biolink:Entity",
      "biolink:PhysicalEssence",
      "biolink:PhysicalEssenceOrOccurrent",
      "biolink:ChemicalOrDrugOrTreatment",
      "biolink:ChemicalEntityOrGeneOrGeneProduct",
      "biolink:ChemicalEntityOrProteinOrPolypeptide"
    ]
  },
  "PUBCHEM.COMPOUND:5877": {
    "id": {
      "identifier": "PUBCHEM.COMPOUND:5877",
      "label": "Methylprednisolone acetate"
    },
    "equivalent_identifiers": [
      {
        "identifier": "PUBCHEM.COMPOUND:5877",
        "label": "Methylprednisolone acetate"
      },
      {
        "identifier": "CHEMBL.COMPOUND:CHEMBL1364144",
        "label": "METHYLPREDNISOLONE ACETATE"
      },
      {
        "identifier": "UNII:43502P7F0P",
        "label": "METHYLPREDNISOLONE ACETATE"
      },
      {
        "identifier": "CHEBI:6889",
        "label": "methylprednisolone acetate"
      },
      {
        "identifier": "MESH:C000873",
        "label": "[OBSOLETE] methylprednisolone acetate"
      },
      {
        "identifier": "MESH:D000077555",
        "label": "Methylprednisolone Acetate"
      },
      {
        "identifier": "CAS:53-36-1"
      },
      {
        "identifier": "DrugCentral:1770",
        "label": "methylprednisolone acetate"
      },
      {
        "identifier": "KEGG.COMPOUND:C08179",
        "label": "Methylprednisolone acetate"
      },
      {
        "identifier": "INCHIKEY:PLBHSZGDDKCEHR-LFYFAGGJSA-N"
      }
    ],
    "type": [
      "biolink:SmallMolecule",
      "biolink:MolecularEntity",
      "biolink:ChemicalEntity",
      "biolink:NamedThing",
      "biolink:Entity",
      "biolink:PhysicalEssence",
      "biolink:PhysicalEssenceOrOccurrent",
      "biolink:ChemicalOrDrugOrTreatment",
      "biolink:ChemicalEntityOrGeneOrGeneProduct",
      "biolink:ChemicalEntityOrProteinOrPolypeptide"
    ]
  },
  "PUBCHEM.COMPOUND:6741": {
    "id": {
      "identifier": "PUBCHEM.COMPOUND:6741",
      "label": "Methylprednisolone"
    },
    "equivalent_identifiers": [
      {
        "identifier": "PUBCHEM.COMPOUND:6741",
        "label": "Methylprednisolone"
      },
      {
        "identifier": "CHEMBL.COMPOUND:CHEMBL650",
        "label": "METHYLPREDNISOLONE"
      },
      {
        "identifier": "UNII:X4W7ZR7023",
        "label": "METHYLPREDNISOLONE"
      },
      {
        "identifier": "CHEBI:6888",
        "label": "6alpha-methylprednisolone"
      },
      {
        "identifier": "DRUGBANK:DB00959"
      },
      {
        "identifier": "MESH:D008775",
        "label": "Methylprednisolone"
      },
      {
        "identifier": "CAS:83-43-2"
      },
      {
        "identifier": "DrugCentral:1768",
        "label": "methylprednisolone"
      },
      {
        "identifier": "GTOPDB:7088",
        "label": "methylprednisolone"
      },
      {
        "identifier": "HMDB:HMDB0015094",
        "label": "Methylprednisolone"
      },
      {
        "identifier": "INCHIKEY:VHRSUDSXCMQTMA-PJHHCJLFSA-N"
      }
    ],
    "type": [
      "biolink:SmallMolecule",
      "biolink:MolecularEntity",
      "biolink:ChemicalEntity",
      "biolink:NamedThing",
      "biolink:Entity",
      "biolink:PhysicalEssence",
      "biolink:PhysicalEssenceOrOccurrent",
      "biolink:ChemicalOrDrugOrTreatment",
      "biolink:ChemicalEntityOrGeneOrGeneProduct",
      "biolink:ChemicalEntityOrProteinOrPolypeptide"
    ]
  }
}
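
As a rough sketch of how a payload like the one above could be turned into an identifiers YAML entry: the snippet below collects every CURIE from the `equivalent_identifiers` lists and prints them as a YAML list. The layout of the identifiers file is not shown in this issue, so the output shape (variable name mapped to a list of CURIEs) and the variable name `MethylprednisoloneExample` are assumptions for illustration only.

```python
import json

# Trimmed excerpt of the payload above (one node, three identifiers).
SAMPLE = json.loads("""
{
  "PUBCHEM.COMPOUND:6741": {
    "id": {"identifier": "PUBCHEM.COMPOUND:6741", "label": "Methylprednisolone"},
    "equivalent_identifiers": [
      {"identifier": "PUBCHEM.COMPOUND:6741", "label": "Methylprednisolone"},
      {"identifier": "CHEBI:6888", "label": "6alpha-methylprednisolone"},
      {"identifier": "MESH:D008775", "label": "Methylprednisolone"}
    ]
  }
}
""")


def collect_curies(normalized):
    """Return all CURIEs in the payload, in order, without duplicates."""
    seen = []
    for preferred, node in normalized.items():
        equivalents = [e["identifier"] for e in node.get("equivalent_identifiers", [])]
        for curie in [preferred] + equivalents:
            if curie not in seen:
                seen.append(curie)
    return seen


def to_yaml_entry(variable_name, curies):
    """Render one entry as YAML text (assumed layout: name -> list of CURIEs)."""
    lines = [variable_name + ":"]
    lines += ['  - "' + curie + '"' for curie in curies]
    return "\n".join(lines)


if __name__ == "__main__":
    # "MethylprednisoloneExample" is a hypothetical variable name.
    print(to_yaml_entry("MethylprednisoloneExample", collect_curies(SAMPLE)))
```

The CURIEs are quoted in the output so that the colon inside each identifier can never be misread by a YAML parser.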
@karafecho karafecho self-assigned this Oct 5, 2021
@karafecho karafecho changed the title Add identifiers for methylprednisone to updated identifiers.yml Implement plan for updating all_features and identifiers YAML files Oct 5, 2021
@karafecho
Collaborator Author

karafecho commented Oct 5, 2021

(6) Change IN_ICEES to IN_UNC_Health, all config files.
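
One possible way to apply a rename like this across a checkout is sketched below. The file suffixes and the idea of running it from the repository root are assumptions, not something stated in this thread. The match is restricted to whole tokens so that longer names containing `IN_ICEES` are left alone.

```python
import re
from pathlib import Path


def rename_token(root, old="IN_ICEES", new="IN_UNC_Health",
                 suffixes=(".yaml", ".yml", ".dhall")):
    """Replace whole-token occurrences of `old` with `new` in config files.

    Returns the list of files that were rewritten. The suffix list is an
    assumption about which files count as config files.
    """
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8")
        # A function replacement avoids backslash handling in `new`.
        updated, count = pattern.subn(lambda match: new, text)
        if count:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed


if __name__ == "__main__":
    for path in rename_token("."):
        print("updated", path)
```

Running it on a clean working tree and reviewing the resulting diff before committing keeps the change easy to audit.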

@karafecho
Collaborator Author

karafecho commented Oct 11, 2021

Update, 10.27.21:

Below is a quick update on our efforts to correct mapping issues with the ICEES YAML files and redeploy the ICEES instances. (1), (2), and (6) above have been completed.

  1. We now have two "ground truth" YAML files, all_features_version_04.yaml and identifiers_version_03.yaml. Note that I QC'd only the variables for the patient tables, not the visit tables, because we will use one set of variables for both tables once we fully migrate from the YAML-based config system to the Dhall-based one.
  • I hid the visit tables (# visit) in the main files and deleted those variables in a second set of files; the plan is to merge both sets of files with main.

COMPLETE

  2. Note that while I consider these files to be "ground truth", we will likely need to refine some of the mappings over time. The labs, for instance, are somewhat inconsistent, in part because Translator does not yet handle labs well.

NO ACTION REQUIRED

  3. We have not yet created a "ground truth" FHIR_mappings_version_01.yaml file. However, because this file is needed only to map to FHIR data elements and generate new ICEES tables via FHIR PIT, this step is not high priority.

ACTION: Hong is investigating non-Athena approaches to do this. After this file has been finalized, we can create a new set of Dhall files and hopefully consider the migration complete.

  4. We have several sets of ICEES+ integrated feature tables on ebcr0.edc.renci.org or (in the case of COVID) covid-db-dev.edc.renci.org. These need to be QC'd so that the headers in the data files align with the variable names in the two ground truth YAML files.
  • Complete for DILI
  • Complete for Asthma
    ACTION: A plan is to be considered for COVID, which isn't really part of Translator but is being called by Translator ARAs, so a decision is needed as to whether this is a priority.
  5. Several variable values within the data files for the DILI dataset need to be binned to match the variable enumeration.
  • I plan to complete this and upload to ebcr0.edc.renci.org
    COMPLETE
  6. Redeploy the ICEES asthma instance and ICEES DILI instance.

  7. Deploy an initial ICEES PCD instance, using the initial set of tables currently on Rockfish.

  8. Deploy a new ICEES APR2021 COVID instance after generating a new set of tables.

  9. Test adjustments to the default P value by way of direct ICEES query.

  • Consider whether this is needed before the Dec demo
  10. Adjust the formatting of edge attributes that are returned to users.
  • Align with COHD and Clinical Risk Provider's formatting
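
For the header QC described in the list above (data file headers versus variable names in the ground truth YAML files), a check along the following lines may be useful. It is only a sketch: it assumes the tables are CSV files and that the variable names have already been read from the YAML files into a list. The column names in the example are hypothetical.

```python
import csv
import io


def header_mismatches(csv_file, yaml_variables):
    """Compare a table's header row with the expected variable names.

    `csv_file` is any open text file object; `yaml_variables` is an
    iterable of variable names taken from the ground truth YAML files.
    """
    header = next(csv.reader(csv_file), [])
    columns = {name.strip() for name in header}
    expected = set(yaml_variables)
    return {
        "unknown_columns": sorted(columns - expected),  # in table, not in YAML
        "missing_columns": sorted(expected - columns),  # in YAML, not in table
    }


if __name__ == "__main__":
    # Hypothetical header and variable names, for illustration only.
    table = io.StringIO("PatientId,AgeStudyStart,Sex2\n1,0-2,Male\n")
    print(header_mismatches(table, ["PatientId", "AgeStudyStart", "Race"]))
```

An empty result in both lists would indicate that the table and the YAML files agree on variable names.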

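The binning step mentioned for the DILI data files could look roughly like the sketch below: raw numeric values are mapped onto the labels of a variable's enumeration, and any value that is still outside the enumeration is reported. The bin edges and labels are invented for illustration; the actual DILI enumerations are not given in this thread.

```python
from bisect import bisect_right

# Hypothetical enumeration for an age-like variable (not from the ICEES config).
EDGES = [18, 45, 65]                        # lower bounds of bins 2..4
LABELS = ["0-17", "18-44", "45-64", "65+"]  # one more label than edges


def bin_value(value, edges=EDGES, labels=LABELS):
    """Map a numeric value to the label of the bin that contains it."""
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    return labels[bisect_right(edges, value)]


def unexpected_values(values, enumeration):
    """Return the distinct values that are not part of the enumeration."""
    allowed = set(enumeration)
    return sorted({v for v in values if v not in allowed})


if __name__ == "__main__":
    binned = [bin_value(age) for age in (3, 18, 64, 90)]
    print(binned)
    print(unexpected_values(binned + ["unknown"], LABELS))
```

Running `unexpected_values` over each column after binning gives a quick check that every value matches the enumeration before the files are uploaded.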
@karafecho karafecho changed the title Implement plan for updating all_features and identifiers YAML files Implement plan for creating ground truth YAML files, converting to Dhall, and deploying new ICEES instances Oct 11, 2021
@karafecho
Collaborator Author

Clean repo and add config file schema doc.

@karafecho
Collaborator Author

FYI: I'll break this ticket down into individual tickets after we agree on the plan.
