UMLS Entity Linker throws BadZipFile error #534

Open
markediger opened this issue Nov 22, 2024 · 0 comments

I am trying to run a basic example of the UMLS Entity Linker:

import spacy
import scispacy
from scispacy.umls_linking import UmlsEntityLinker
nlp = spacy.load('en_core_sci_md')
linker = UmlsEntityLinker()

nlp.add_pipe(linker)
doc = nlp("Spinal and bulbar muscular atrophy (SBMA) is an \
           inherited motor neuron disease caused by the expansion \
           of a polyglutamine tract within the androgen receptor (AR). \
           SBMA can be caused by this easily.")

entity = doc.ents[1]
print("Name: ", entity)

for umls_ent in entity._.umls_ents:
    print(linker.umls.cui_to_entity[umls_ent[0]])

Constructing the linker fails with an error that seems to imply scispacy cannot load the UMLS dictionaries:

Traceback (most recent call last):
  File "H:\integrated_evidence\indication_coding\indication-master\src\scispacy_test.py", line 5, in <module>
    linker = UmlsEntityLinker()
             ^^^^^^^^^^^^^^^^^^
  File "H:\integrated_evidence\indication_coding\indication-master\.venv\Lib\site-packages\scispacy\linking.py", line 85, in __init__
    self.candidate_generator = candidate_generator or CandidateGenerator(
                                                      ^^^^^^^^^^^^^^^^^^^
  File "H:\integrated_evidence\indication_coding\indication-master\.venv\Lib\site-packages\scispacy\candidate_generation.py", line 222, in __init__
    self.ann_index = ann_index or load_approximate_nearest_neighbours_index(
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "H:\integrated_evidence\indication_coding\indication-master\.venv\Lib\site-packages\scispacy\candidate_generation.py", line 133, in load_approximate_nearest_neighbours_index
    concept_alias_tfidfs = scipy.sparse.load_npz(
                           ^^^^^^^^^^^^^^^^^^^^^^
  File "H:\integrated_evidence\indication_coding\indication-master\.venv\Lib\site-packages\scipy\sparse\_matrix_io.py", line 134, in load_npz
    with np.load(file, **PICKLE_KWARGS) as loaded:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "H:\integrated_evidence\indication_coding\indication-master\.venv\Lib\site-packages\numpy\lib\npyio.py", line 444, in load
    ret = NpzFile(fid, own_fid=own_fid, allow_pickle=allow_pickle,
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "H:\integrated_evidence\indication_coding\indication-master\.venv\Lib\site-packages\numpy\lib\npyio.py", line 190, in __init__
    _zip = zipfile_factory(fid)
           ^^^^^^^^^^^^^^^^^^^^
  File "H:\integrated_evidence\indication_coding\indication-master\.venv\Lib\site-packages\numpy\lib\npyio.py", line 103, in zipfile_factory
    return zipfile.ZipFile(file, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\apps\python\Lib\zipfile.py", line 1301, in __init__
    self._RealGetContents()
  File "D:\apps\python\Lib\zipfile.py", line 1368, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file

I am using scispacy version 0.5.5 and en_core_sci_md version 0.5.4.
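
The traceback ends inside `scipy.sparse.load_npz`, which opens its input as a zip archive, so the file it was given is apparently not a valid zip; one possible cause is a truncated or otherwise corrupted download. Below is a minimal sketch of a check for that. The helper name `find_bad_npz_files` is hypothetical, and the cache location (`~/.scispacy/datasets`, overridable through a `SCISPACY_CACHE` environment variable) is an assumption about where scispacy stores downloaded linker files, not something confirmed by the traceback.

```python
import os
import zipfile
from pathlib import Path


def find_bad_npz_files(cache_dir):
    """Return files in cache_dir named *.npz that are not valid zip archives.

    Hypothetical helper for illustration; it only inspects files and
    never modifies or deletes anything.
    """
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []
    return sorted(
        path
        for path in cache_dir.iterdir()
        if path.is_file()
        and path.name.endswith(".npz")
        and not zipfile.is_zipfile(str(path))
    )


if __name__ == "__main__":
    # Assumption: scispacy's default cache root; adjust if yours differs.
    cache_root = Path(os.environ.get("SCISPACY_CACHE", str(Path.home() / ".scispacy")))
    for bad in find_bad_npz_files(cache_root / "datasets"):
        print("Not a valid zip archive:", bad)
```

If this reports a file, deleting it so that it is fetched again on the next run might resolve the error, though I have not confirmed that this is the cause here.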
