Merge pull request #35 from UW-GAC/test_data
Add pheno test data
amstilp authored May 21, 2024
2 parents 16e1f95 + a9bdc3f commit 6835393
Showing 17 changed files with 652 additions and 0 deletions.
21 changes: 21 additions & 0 deletions test_data/cancer_breast.tsv
@@ -0,0 +1,21 @@
subject_id age_at_obs visit breast_cancer_status_emerge_1 breast_cancer_status_registry_1 breast_cancer_status_survey_1 age_at_diagnosis_1 year_at_diagnosis_1 breast_cancer_type_1 cancer_behavior_1 her2_1 pr_1 er_1 T_stage_clinical_1 T_stage_pathological_1 T_stage_uknown_1 T_stage_clinical_2 T_stage_pathological_2 T_stage_unknown_2 nodal_involvement_1 distant_metastasis_1 stage_system_1 grade_clinical_1 grade_pathological_1 grade_unknown_1 screening_history_1 recurrence_1 surgery_1 radiotherapy_1 chemotherapy_1 hormone_therapy_1 NSAID_1 age_at_natural_menopause_1 post_menopausal_hormone_use_1 parity_1 age_at_first_birth_1 age_at_menarche_1 deceased_1 cause_of_death_breast_cancer_1 age_at_death_1
subject1 59 visit_1 0 0 0 63 2017 unilateral benign negative unknown positive unstaged stage 4 stage 3 unknown unknown unstaged N2 M1 NA grade 1 grade 1 grade 3 0 recurrence_second_primary 0 0 1 pharmaceutical 1 64 1 0 26 15 0 0 64
subject2 46 visit_1 0 1 0 46 2013 bilateral borderline unknown unknown negative unstaged stage 4 stage 3 localized distant in_situ NX MX NA grade 1 grade 3 grade 1 1 none 0 0 0 surgical 0 58 0 2 30 9 1 1 48
subject3 55 visit_1 0 0 0 62 2011 unilateral invasive positive negative positive stage 3 unstaged stage 1 unstaged unstaged regional N2 MX NA grade 2 grade 2 grade 1 0 unknown 1 0 0 surgical 1 62 1 2 32 18 0 0 72
subject4 55 visit_1 0 0 0 64 2017 bilateral borderline unknown negative unknown stage 3 stage 2 stage 4 localized localized regional N0 MX NA grade 2 grade 3 grade 3 0 unknown 1 1 0 pharmaceutical 1 59 0 2 30 14 0 0 78
subject5 62 visit_1 0 0 0 63 2016 unilateral benign positive positive negative stage 3 unstaged unknown regional regional unknown NX MX NA grade 2 grade 3 grade 2 0 recurrence_second_primary 0 0 0 unknown 0 63 1 2 30 13 0 0 77
subject6 55 visit_1 0 0 0 59 2002 bilateral in_situ negative positive positive stage 1 stage 4 stage 1 localized localized unknown N3 MX NA grade 2 grade 3 grade 2 1 none 1 1 1 both 0 57 0 0 23 20 0 0 73
subject7 61 visit_1 0 0 0 61 2002 bilateral invasive negative positive unknown stage 3 stage 4 stage 3 localized unstaged unknown NX MX NA grade 1 grade 1 grade 3 1 unknown 1 1 0 none 0 61 0 2 17 12 0 0 73
subject8 65 visit_1 0 0 0 68 2021 unilateral benign unknown positive positive stage 3 stage 3 unstaged in_situ localized unknown NX M0 NA grade 1 grade 3 grade 3 0 unknown 1 1 0 none 1 81 1 0 31 13 0 0 86
subject9 66 visit_1 0 0 0 72 2006 bilateral in_situ unknown positive unknown stage 4 stage 3 unknown regional regional regional N3 M1 NA grade 3 grade 2 grade 1 1 none 0 0 1 none 0 68 1 0 25 14 0 0 81
subject10 51 visit_1 0 0 0 62 2017 bilateral borderline negative unknown unknown stage 1 unknown stage 2 unknown distant unknown N2 MX NA grade 1 grade 1 grade 1 1 recurrence_primary 1 0 0 surgical 0 59 1 0 27 16 0 0 74
subject11 61 visit_1 0 0 0 63 2006 unilateral borderline negative positive negative stage 1 stage 4 unknown unknown regional unstaged NX M0 NA grade 1 grade 1 grade 3 0 recurrence_primary 0 1 1 unknown 0 77 0 1 30 17 0 0 67
subject12 55 visit_1 0 0 0 59 2004 unilateral benign positive negative unknown unknown unknown stage 1 unknown localized unstaged N1 MX NA grade 1 grade 1 grade 2 0 none 0 1 1 pharmaceutical 1 74 1 1 26 15 0 0 59
subject13 52 visit_1 0 0 0 55 2010 unilateral borderline negative positive unknown unknown stage 2 stage 2 unstaged localized regional N1 MX NA grade 2 grade 1 grade 1 0 unknown 0 1 0 surgical 1 53 0 1 25 12 0 0 66
subject14 66 visit_1 1 0 0 68 2007 unilateral benign unknown negative positive stage 4 stage 2 stage 3 in_situ distant unknown N1 M0 NA grade 2 grade 3 grade 1 1 recurrence_primary 1 1 1 both 0 82 0 2 22 11 0 0 72
subject15 57 visit_1 0 0 0 64 2008 bilateral invasive positive positive positive unstaged stage 2 stage 4 distant in_situ regional N0 M1 NA grade 3 grade 3 grade 3 0 recurrence_second_primary 0 1 1 both 1 66 0 1 22 12 0 0 80
subject16 57 visit_1 0 1 0 62 2003 bilateral invasive unknown positive unknown unstaged stage 4 stage 4 unknown unknown unstaged N0 M0 NA grade 2 grade 1 grade 1 1 recurrence_second_primary 0 0 0 both 0 57 1 1 26 19 1 0 70
subject17 67 visit_1 1 0 0 69 2014 unilateral in_situ positive negative unknown stage 1 unknown stage 2 localized regional localized N1 MX NA grade 2 grade 1 grade 3 1 none 0 1 1 none 0 74 0 1 32 14 0 0 85
subject18 59 visit_1 0 0 0 62 2009 bilateral borderline positive positive negative stage 2 stage 2 unknown unknown regional distant N3 M0 NA grade 3 grade 2 grade 2 0 recurrence_primary 0 1 0 pharmaceutical 0 59 0 1 23 12 0 0 74
subject19 67 visit_1 1 0 0 69 2001 bilateral borderline negative negative negative stage 1 stage 2 stage 1 unknown unstaged unknown N0 MX NA grade 2 grade 3 grade 2 0 recurrence_second_primary 1 1 1 both 0 78 0 1 32 18 0 0 73
subject20 62 visit_1 0 0 0 63 2013 unilateral benign unknown unknown positive stage 4 stage 2 unknown in_situ regional distant N0 M0 NA grade 1 grade 2 grade 3 0 recurrence_second_primary 0 0 0 surgical 0 75 0 2 28 16 1 0 64
21 changes: 21 additions & 0 deletions test_data/cancer_prostate.tsv
@@ -0,0 +1,21 @@
subject_id age_at_obs visit prostate_cancer_status_emerge_1 prostate_cancer_status_registry_1 prostate_cancer_status_survey_1 age_at_diagnosis_1 year_at_diagnosis_1 cancer_behavior_1 T_stage_clinical_1 T_stage_pathological_1 T_stage_uknown_1 T_stage_clinical_2 T_stage_pathological_2 T_stage_unknown_2 nodal_involvement_1 distant_metastasis_1 stage_system_1 gleason_score_clinical_1 gleason_score_pathological_1 gleason_score_unknown_1 psa_1 psa_at_diagnosis_1 screening_history_1 recurrence_1 surgery_1 radiotherapy_1 chemotherapy_1 hormone_therapy_1 NSAID_1 deceased_1 cause_of_death_prostate_cancer_1 age_at_death_1
subject1 59 visit_1 1 1 1 63 2017 borderline stage 1 stage 4 unknown localized unstaged in_situ N1 M1 NA 9 9 7 3.3273850651366184 0.6910954617090586 0 none 0 0 0 0 1 0 0 68
subject2 46 visit_1 0 1 0 46 2013 in_situ stage 3 unstaged unknown in_situ unstaged in_situ N0 MX NA 5 6 3 0.1117048051118088 1.114277099266359 1 recurrence_second_primary 1 1 0 0 0 0 0 70
subject3 55 visit_1 0 1 1 62 2011 borderline unknown stage 2 stage 3 regional distant unstaged NX M1 NA 8 3 8 1.1467807022659444 1.650565194078115 0 recurrence_primary 0 1 1 0 0 0 0 67
subject4 55 visit_1 0 0 0 64 2017 in_situ stage 2 unknown stage 3 unknown distant regional N2 MX NA 2 3 5 0.6244635030052077 1.385998555747178 1 none 0 1 1 1 0 0 1 62
subject5 62 visit_1 1 1 1 63 2016 borderline stage 1 stage 2 stage 2 in_situ distant unstaged N3 MX NA 3 10 3 1.0444848894466576 1.9458136918517794 1 recurrence_second_primary 0 0 0 0 0 1 0 63
subject6 55 visit_1 0 1 0 59 2002 in_situ unstaged stage 3 stage 1 regional localized in_situ NX MX NA 2 10 9 0.959323014536773 1.2909907535132685 1 recurrence_second_primary 1 1 1 1 1 0 0 59
subject7 61 visit_1 1 0 1 61 2002 in_situ unknown stage 4 stage 1 unstaged distant in_situ N0 MX NA 8 10 2 0.08122257203704342 0.302979697472334 0 none 1 1 1 1 0 0 0 62
subject8 65 visit_1 1 1 1 68 2021 benign stage 2 unknown stage 1 regional distant distant N2 M0 NA 2 10 2 1.6165848336901831 0.3081623612021196 1 none 0 1 1 1 0 0 0 86
subject9 66 visit_1 1 1 1 72 2006 invasive stage 4 unstaged stage 1 unstaged in_situ distant N3 MX NA 4 4 9 3.215384566425846 3.4306225959208776 0 recurrence_primary 1 1 0 0 1 0 0 70
subject10 51 visit_1 0 1 1 62 2017 invasive stage 3 stage 3 unstaged unknown localized unknown N0 M1 NA 5 9 8 0.21256352824871372 1.1323419133081138 0 recurrence_primary 1 0 1 0 0 0 0 66
subject11 61 visit_1 1 1 0 63 2006 borderline stage 3 stage 3 stage 1 distant localized in_situ N3 M1 NA 4 8 2 1.6517128367933704 0.9405233862074993 0 none 0 0 0 1 1 1 0 85
subject12 55 visit_1 0 0 1 59 2004 borderline stage 2 stage 1 stage 4 unknown unknown unknown NX M1 NA 2 8 6 0.2398326754427973 0.7250777214867976 0 unknown 0 1 0 1 1 0 0 85
subject13 52 visit_1 0 1 1 55 2010 borderline stage 3 stage 4 stage 2 unknown unknown regional NX M1 NA 3 3 6 1.1806309466433595 1.9446144610218248 0 recurrence_primary 0 1 0 1 0 0 0 54
subject14 66 visit_1 1 1 1 68 2007 borderline stage 2 unknown stage 4 regional in_situ regional N0 M0 NA 6 9 6 1.6857904625831341 1.52517841165081 1 recurrence_primary 1 0 1 1 1 0 0 86
subject15 57 visit_1 0 0 1 64 2008 invasive unstaged stage 2 stage 1 regional unstaged regional N2 M0 NA 7 4 3 2.1816372213400363 2.002552414007943 1 unknown 0 0 0 1 1 0 0 73
subject16 57 visit_1 0 1 0 62 2003 in_situ unknown unknown stage 1 unstaged unstaged in_situ N2 M1 NA 10 8 4 1.8889410806023983 1.223594159764647 0 recurrence_primary 1 0 0 0 0 0 0 57
subject17 67 visit_1 1 0 0 69 2014 borderline unstaged stage 2 stage 4 unknown localized unknown N0 MX NA 4 2 5 0.8706631343253838 1.8269750432423142 0 none 1 1 0 1 1 0 0 78
subject18 59 visit_1 1 1 0 62 2009 invasive stage 2 stage 2 stage 2 distant regional regional N3 M1 NA 4 5 10 1.5219466926525085 2.223313852420131 0 unknown 0 0 0 1 0 0 0 60
subject19 67 visit_1 1 0 1 69 2001 invasive stage 3 stage 3 stage 3 distant localized regional NX M1 NA 9 9 5 0.61399603928097 1.4170698016606862 1 unknown 0 0 1 1 1 0 0 82
subject20 62 visit_1 1 0 1 63 2013 benign stage 1 unstaged unstaged localized in_situ regional N3 M0 NA 3 5 4 1.211037802492477 0.8747108557841603 1 none 0 0 0 0 0 0 0 81
21 changes: 21 additions & 0 deletions test_data/cmqt_anthropometry.tsv
@@ -0,0 +1,21 @@
subject_id age_at_obs visit height_1 weight_1 bmi_1 waist_hip_ratio_1
subject1 59 visit_1 166.51728404004504 87.70407490435488 31.630141478991852 0.9074966900959498
subject2 46 visit_1 161.202551994156 80.8258450985496 31.103297082655853 0.8145228307678352
subject3 55 visit_1 171.23801251550628 86.53811180127285 29.51257188546359 0.9034009869133715
subject4 55 visit_1 169.17186404031233 86.4412843896079 30.204033404764687 0.6649561139308164
subject5 62 visit_1 176.4493260077908 82.96448470324013 26.647238715038643 0.7343205137895237
subject6 55 visit_1 169.82492809343805 78.58528157838867 27.248232409006288 0.7310283084720187
subject7 61 visit_1 156.0312735892919 86.27942012799926 35.439200098235276 0.8079074951310364
subject8 65 visit_1 163.50798836505146 84.54919575614312 31.62507251261148 0.7699475884641312
subject9 66 visit_1 178.2757791033649 75.35985947462825 23.711301552730852 0.8579123324227408
subject10 51 visit_1 177.43804249577906 86.20090419001572 27.379048176451334 0.656209438514197
subject11 61 visit_1 168.9662314872622 80.76732089777315 28.290188511041826 0.7469005486739022
subject12 55 visit_1 165.110036178032 85.25966289494805 31.274953024607136 0.7501018809016656
subject13 52 visit_1 167.68140136962006 76.22894393591241 27.111285687273472 0.7936294054529471
subject14 66 visit_1 164.68404018860684 72.58905440174385 26.765053327547104 0.8348499810263527
subject15 57 visit_1 165.24046335177894 84.30565936248858 30.876205386787987 0.9576720775779393
subject16 57 visit_1 166.18318741952814 77.97740084621502 28.23544170233956 0.75225930619927
subject17 67 visit_1 173.1551878731937 78.86297291363192 26.30280885590887 0.7557994230712831
subject18 59 visit_1 164.6905720192354 84.67048085431246 31.217243374076368 0.8556773306960885
subject19 67 visit_1 164.29742090189868 77.67052060104524 28.773634591122615 0.7875468828313612
subject20 62 visit_1 163.01588801788486 76.81228250714892 28.90485548023756 0.9079118556158186
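
The `bmi_1` column above appears to be derived from `height_1` and `weight_1`. The following is a minimal sketch of a consistency check, assuming height is in centimetres and weight in kilograms (the file does not state units). The helper names `parse_rows`, `bmi_from`, and `check_bmi` are hypothetical and are not part of the repository. Splitting on whitespace works here only because no value in this file contains a space; the original file is tab-separated.

```python
# Rows copied from test_data/cmqt_anthropometry.tsv (header plus first three subjects).
ROWS = """\
subject_id age_at_obs visit height_1 weight_1 bmi_1 waist_hip_ratio_1
subject1 59 visit_1 166.51728404004504 87.70407490435488 31.630141478991852 0.9074966900959498
subject2 46 visit_1 161.202551994156 80.8258450985496 31.103297082655853 0.8145228307678352
subject3 55 visit_1 171.23801251550628 86.53811180127285 29.51257188546359 0.9034009869133715
"""


def parse_rows(text):
    """Turn whitespace-separated text with a header line into a list of dicts."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


def bmi_from(height_cm, weight_kg):
    """BMI = weight (kg) / height (m) squared. Units are an assumption."""
    return weight_kg / (height_cm / 100.0) ** 2


def check_bmi(rows, tol=0.01):
    """Return subject_ids whose recorded bmi_1 differs from the recomputed value."""
    mismatches = []
    for row in rows:
        expected = bmi_from(float(row["height_1"]), float(row["weight_1"]))
        if abs(expected - float(row["bmi_1"])) > tol:
            mismatches.append(row["subject_id"])
    return mismatches


records = parse_rows(ROWS)
bad = check_bmi(records)
print("rows checked:", len(records), "mismatches:", bad)
```

For the real file, reading with a tab delimiter (for example via the standard `csv` module) would be the safer choice, since other files in this commit contain values with embedded spaces such as `stage 4` and `data not collected`.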
21 changes: 21 additions & 0 deletions test_data/cmqt_blood_pressure.tsv
@@ -0,0 +1,21 @@
subject_id age_at_obs visit systolic_bp_1 diastolic_bp_1 hypertension_1
subject1 59 visit_1 124.33509725727151 95.40814980870978 0
subject2 46 visit_1 109.15014855473149 81.65169019709921 0
subject3 55 visit_1 137.82289290144655 93.0762236025457 0
subject4 55 visit_1 131.91961154374948 92.88256877921579 0
subject5 62 visit_1 152.71236002225947 85.92896940648028 0
subject6 55 visit_1 133.78550883839446 77.17056315677733 0
subject7 61 visit_1 94.37506739797686 92.5588402559985 0
subject8 65 visit_1 115.73710961443275 89.09839151228624 0
subject9 66 visit_1 157.93079743818544 70.7197189492565 0
subject10 51 visit_1 155.53726427365447 92.40180838003143 1
subject11 61 visit_1 131.33208996360634 81.53464179554629 0
subject12 55 visit_1 120.3143890800914 90.5193257898961 0
subject13 52 visit_1 127.66114677034301 72.45788787182482 0
subject14 66 visit_1 119.09725768173382 65.17810880348769 0
subject15 57 visit_1 120.68703814793986 88.61131872497717 0
subject16 57 visit_1 123.38053548436613 75.95480169243004 0
subject17 67 visit_1 143.30053678055347 77.72594582726384 0
subject18 59 visit_1 119.11592005495825 89.34096170862492 0
subject19 67 visit_1 117.9926311482819 75.3410412020905 0
subject20 62 visit_1 114.33110862252818 73.62456501429784 0
21 changes: 21 additions & 0 deletions test_data/cmqt_flags.tsv
@@ -0,0 +1,21 @@
subject_id age_at_obs visit flag_pregnancy_1 flag_acute_illness_1 flag_bld_1 flag_anemia_1 flag_hiv_1 flag_eskd_1 flag_splenectomy_1 flag_cirrhosis_1 flag_fasting_1 flag_lipids_med_1 flag_bp_med_1 flag_cvd_1 flag_t2d_1 flag_t1d_1 flag_diabetes_other_1
subject1 59 visit_1 unknown unknown data not collected unknown data not collected yes no unknown data not collected no data not collected unknown yes data not collected data not collected
subject2 46 visit_1 no data not collected no no unknown unknown yes unknown data not collected unknown data not collected unknown yes no yes
subject3 55 visit_1 yes unknown unknown data not collected unknown yes data not collected yes yes no yes data not collected no unknown data not collected
subject4 55 visit_1 yes yes no data not collected data not collected unknown yes data not collected yes data not collected yes yes unknown no no
subject5 62 visit_1 data not collected unknown data not collected no data not collected yes no no yes unknown yes data not collected data not collected no no
subject6 55 visit_1 yes data not collected no unknown no unknown unknown yes no no no unknown no no no
subject7 61 visit_1 unknown yes data not collected no no unknown data not collected unknown no unknown yes unknown yes no unknown
subject8 65 visit_1 data not collected data not collected data not collected unknown data not collected no no data not collected no yes yes yes unknown unknown no
subject9 66 visit_1 data not collected unknown unknown data not collected no data not collected unknown data not collected no data not collected unknown yes data not collected yes no
subject10 51 visit_1 no unknown unknown data not collected data not collected data not collected yes yes data not collected data not collected no data not collected yes data not collected yes
subject11 61 visit_1 data not collected unknown yes yes no yes yes yes no yes no unknown data not collected data not collected yes
subject12 55 visit_1 yes no unknown unknown no yes no no unknown data not collected data not collected data not collected no data not collected no
subject13 52 visit_1 no data not collected data not collected yes unknown yes yes unknown no data not collected data not collected no no unknown no
subject14 66 visit_1 data not collected unknown data not collected yes yes yes no data not collected unknown no unknown no yes unknown yes
subject15 57 visit_1 yes yes data not collected data not collected yes data not collected data not collected yes no yes data not collected yes unknown yes unknown
subject16 57 visit_1 yes data not collected yes unknown no unknown data not collected data not collected no unknown unknown unknown unknown data not collected data not collected
subject17 67 visit_1 data not collected yes no yes data not collected yes unknown no unknown data not collected no data not collected yes no yes
subject18 59 visit_1 unknown unknown no yes yes data not collected yes yes no yes no yes data not collected data not collected yes
subject19 67 visit_1 data not collected yes data not collected yes no data not collected yes yes yes yes no no no data not collected data not collected
subject20 62 visit_1 data not collected no unknown no unknown no no data not collected unknown no unknown yes data not collected unknown no
21 changes: 21 additions & 0 deletions test_data/cmqt_glycemic.tsv
@@ -0,0 +1,21 @@
subject_id age_at_obs visit fasting_glucose_plasma_1 fasting_glucose_serum_1 fasting_insulin_1 hba1c_1
subject1 59 visit_1 54.335097257271514 100.40814980870978 44.062251757196236 0.8629538590264505
subject2 46 visit_1 39.150148554731494 86.65169019709921 37.08921230758763 5.12890149360826
subject3 55 visit_1 67.82289290144655 98.0762236025457 43.75507401850285 0.3745564709250373
subject4 55 visit_1 61.9196115437495 97.88256877921579 25.871708544811234 7.12738940450381
subject5 62 visit_1 82.71236002225946 90.92896940648028 31.074038534214274 3.262766021325356
subject6 55 visit_1 63.78550883839445 82.17056315677733 30.827123135401397 2.5366231021701244
subject7 61 visit_1 24.37506739797687 97.5588402559985 36.59306213482773 2.2052889540530853
subject8 65 visit_1 45.73710961443275 94.09839151228624 33.746069134809844 4.778864164562747
subject9 66 visit_1 87.93079743818544 75.7197189492565 40.343424931705556 4.052338078993457
subject10 51 visit_1 85.53726427365447 97.40180838003143 25.215707888564772 2.657453514074762
subject11 61 visit_1 61.332089963606336 86.53464179554629 32.01754115054266 3.31735379488651
subject12 55 visit_1 50.3143890800914 95.5193257898961 32.25764106762492 2.0286698676551476
subject13 52 visit_1 57.66114677034302 77.45788787182482 35.52220540897103 1.0821878500015871
subject14 66 visit_1 49.09725768173383 70.17810880348769 38.61374857697645 3.36103458421411
subject15 57 visit_1 50.68703814793985 93.61131872497717 47.825405818345445 4.443468565637837
subject16 57 visit_1 53.38053548436613 80.95480169243004 32.41944796494524 2.2609190437584035
subject17 67 visit_1 73.30053678055346 82.72594582726384 32.68495673034623 3.475076625079314
subject18 59 visit_1 49.11592005495825 94.34096170862492 40.17579980220664 1.6681557751701703
subject19 67 visit_1 47.99263114828191 80.3410412020905 35.06601621235209 1.4063849803144863
subject20 62 visit_1 44.33110862252818 78.62456501429784 44.09338917118639 2.8966061373772334