Parses the UMLS source files.
Using the UMLS requires a license. For more information, see https://uts.nlm.nih.gov/home.html and follow "Request a License".
This tool requires the full UMLS release, so please download the Full UMLS Release Files.
TODO: MAKE SCRIPT AND CHANGE PATHS IN PARSER ACCORDINGLY
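# Create the target directories
mkdir -p umls-extract/META umls-extract/NET

# Unpack the release archive, then the nested archives it contains
unzip umls-2022AB-full.zip
rm umls-2022AB-full.zip
unzip 2022AB-full/2022ab-1-meta.nlm
unzip 2022AB-full/2022ab-otherks.nlm

# MRCONSO ships as several gzip parts; concatenate them into one file
gunzip -c 2022AB/META/MRCONSO.RRF.*.gz > umls-extract/META/MRCONSO.RRF

# Decompress and move the remaining Metathesaurus files
gunzip 2022AB/META/MRDEF.RRF.gz
mv 2022AB/META/MRDEF.RRF umls-extract/META/
gunzip 2022AB/META/MRSTY.RRF.gz
mv 2022AB/META/MRSTY.RRF umls-extract/META/

# Move the Semantic Network files
mv 2022AB/NET/SRDEF umls-extract/NET/
mv 2022AB/NET/SRSTRE1 umls-extract/NET/

# Remove the unpacked release archive
rm -rf 2022AB-full/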
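The `gunzip -c ... > ...` step above joins the split MRCONSO parts into a single file. As a rough sketch of how that one step might look inside a Python script, using only the standard library (the helper name `concat_gzip_parts` is an assumption made for illustration and is not part of this tool):

```python
import glob
import gzip
import shutil
from pathlib import Path


def concat_gzip_parts(pattern, destination):
    """Decompress every file matching `pattern`, in sorted order, into `destination`.

    Hypothetical helper, roughly equivalent to `gunzip -c <parts> > <destination>`.
    Returns the list of part files that were read.
    """
    parts = sorted(glob.glob(pattern))
    if not parts:
        raise FileNotFoundError("no files match {!r}".format(pattern))
    destination = Path(destination)
    # Create the target directory if it does not exist yet.
    destination.parent.mkdir(parents=True, exist_ok=True)
    with open(destination, "wb") as out:
        for part in parts:
            # Stream each part so large files are never held in memory at once.
            with gzip.open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    return parts
```

With the paths from the commands above, a call might read `concat_gzip_parts("2022AB/META/MRCONSO.RRF.*.gz", "umls-extract/META/MRCONSO.RRF")`. Sorting the matches mirrors the alphabetical order in which the shell expands the glob.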
TODO WRITE ME
from umlsparser import UMLSParser

umls = UMLSParser('/home/toberhauser/DEV/Data/UMLS/2017AA-full/2017AA')

# Print the ICD10CM IDs and the first preferred English name of each concept that has ICD10CM IDs
for cui, concept in umls.get_concepts().items():
    if 'ICD10CM' in concept.get_source_ids().keys():
        icd10ids = concept.get_source_ids().get('ICD10CM')
        print(icd10ids, concept.get_preferred_names_for_language('ENG')[0])
from umlsparser import UMLSParser
import collections

umls = UMLSParser('/home/toberhauser/DEV/Data/UMLS/2017AA-full/2017AA')

# Count how many concepts carry at least one ID from each source
sources_counter = collections.defaultdict(int)
for cui, concept in umls.get_concepts().items():
    sources = concept.get_source_ids().keys()
    for source in sources:
        sources_counter[source] += 1

# Print the counts as a Markdown table, highest count first
print('|SOURCE|COUNT|\n|------|-----|')
for source, count in sorted(sources_counter.items(), key=lambda t: t[1], reverse=True):
    print('|{}|{}|'.format(source, count))
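The same tally can also be expressed with `collections.Counter`, whose `most_common()` method returns entries ordered by descending count. The sketch below uses a small hand-made mapping in place of `umls.get_concepts()` so that it runs without UMLS data. The function names `count_sources` and `render_table`, the sample CUIs, and the assumption that each concept's source IDs form a mapping from source name to a list of IDs are all illustrative.

```python
from collections import Counter


def count_sources(source_ids_by_cui):
    """Count, per source, how many concepts carry at least one ID from it."""
    counter = Counter()
    for source_ids in source_ids_by_cui.values():
        counter.update(source_ids.keys())
    return counter


def render_table(counter):
    """Render the counts as a Markdown table, highest count first."""
    lines = ["|SOURCE|COUNT|", "|------|-----|"]
    lines += ["|{}|{}|".format(source, count) for source, count in counter.most_common()]
    return "\n".join(lines)


# Made-up stand-in for {cui: concept.get_source_ids()}; not real UMLS content.
sample = {
    "C0000001": {"MSH": ["D000001"], "ICD10CM": ["A00"]},
    "C0000002": {"MSH": ["D000002"]},
}
print(render_table(count_sources(sample)))
```

Separating the counting from the rendering keeps each piece easy to check on a small input before running it over the full concept set.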
from umlsparser import UMLSParser

umls = UMLSParser('/home/toberhauser/DEV/Data/UMLS/2017AA-full/2017AA')

# Print every English name of each concept together with its semantic type
for cui, concept in umls.get_concepts().items():
    tui = concept.get_tui()
    name_of_semantic_type = umls.get_semantic_types()[tui].get_name()
    for name in concept.get_names_for_language('ENG'):
        print(cui, name, tui, name_of_semantic_type)
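For orientation, the Metathesaurus files prepared earlier (for example MRCONSO.RRF) are pipe-delimited text with one record per line and a trailing `|`. The following minimal sketch shows how a single MRCONSO line could be split into named fields. It is not the implementation used by `UMLSParser`; the helper name `parse_mrconso_line` is hypothetical and the sample line is synthetic.

```python
# Column order of MRCONSO.RRF as documented for the UMLS Metathesaurus.
MRCONSO_COLUMNS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]


def parse_mrconso_line(line):
    """Split one MRCONSO.RRF line into a dict keyed by column name (hypothetical helper)."""
    fields = line.rstrip("\n").split("|")
    # Each record ends with a trailing '|', which yields one extra empty field.
    if fields and fields[-1] == "":
        fields = fields[:-1]
    if len(fields) != len(MRCONSO_COLUMNS):
        raise ValueError(
            "expected {} fields, got {}".format(len(MRCONSO_COLUMNS), len(fields))
        )
    return dict(zip(MRCONSO_COLUMNS, fields))


# Synthetic example line (not real UMLS data).
line = "C0000001|ENG|P|L0000001|PF|S0000001|Y|A0000001||||MSH|MH|D000001|Example term|0|N||\n"
record = parse_mrconso_line(line)
print(record["CUI"], record["SAB"], record["STR"])
```

The field-count check is there so that a truncated or otherwise malformed line fails loudly instead of silently shifting values into the wrong columns.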
We use SemVer for versioning. For the versions available, see the tags on this repository.
- Tom Oberhauser - Initial work - GitHub