FileNotFoundError when running build_powerplants #367

Closed
2 tasks done
ekatef opened this issue Jun 3, 2022 · 12 comments · Fixed by #423
Labels
bug Something isn't working

Comments

@ekatef
Member

ekatef commented Jun 3, 2022

Checklist

  • I am using the current main branch or the latest release. Please indicate.
  • I am running on an up-to-date pypsa-africa environment. Update via conda env update -f envs/environment.yaml.

Describe the Bug

An attempt to run build_powerplants on a new machine resulted in a FileNotFoundError, with an error message referencing a "linkfile.txt" in a temporary folder. It looked very similar to the problem reported in the Workflow-check of the PyPSA-Earth-Sec model (@energyLS, many thanks for reporting it!) and was fixed, following @Hazem-IEG's suggestion, by installing Java:

conda install -c bioconda java-jdk

or

conda install openjdk

As a nice side effect, installing Java may also have resolved some troubles with glpk which I initially experienced in a fresh workspace. (But I am not sure whether this is reproducible.)
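
For anyone hitting the same error, a quick way to confirm whether Java is visible from the active environment is sketched below. It is a minimal check, not part of pypsa-africa or powerplantmatching: find_java and java_version are hypothetical helper names, and the only assumption is that the Java launcher is called java and is expected on PATH.

```python
import shutil
import subprocess


def find_java(path=None):
    """Return the full path of the `java` executable, or None if it is not found."""
    # With path=None, shutil.which searches the PATH of the current process.
    return shutil.which("java", path=path)


def java_version(java_path):
    """Return the text printed by `java -version` (Java writes it to stderr)."""
    result = subprocess.run(
        [java_path, "-version"], capture_output=True, text=True, check=False
    )
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    java = find_java()
    if java is None:
        print("No `java` found on PATH; installing openjdk or java-jdk may help.")
    else:
        print(java)
        print(java_version(java))
```

If this prints no path when run inside the pypsa-africa environment, that would be consistent with the fix described above.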

Error Message

The whole error message looked like:

INFO:snakemake.logging:
[Tue May 31 22:48:06 2022]
INFO:snakemake.logging:[Tue May 31 22:48:06 2022]
rule build_powerplants:
    input: networks/base.nc, configs/powerplantmatching_config.yaml, data/custom_powerplants.csv, data/clean/africa_all_generators.csv
    output: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    log: logs/build_powerplants.log
    jobid: 0
    reason: Missing output files: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    resources: tmpdir=/var/folders/qn/vpndfm21795ckkq89np1ckp40000gn/T, mem=500
INFO:snakemake.logging:rule build_powerplants:
    input: networks/base.nc, configs/powerplantmatching_config.yaml, data/custom_powerplants.csv, data/clean/africa_all_generators.csv
    output: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    log: logs/build_powerplants.log
    jobid: 0
    reason: Missing output files: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    resources: tmpdir=/var/folders/qn/vpndfm21795ckkq89np1ckp40000gn/T, mem=500

INFO:snakemake.logging:
INFO:pypsa.io:Imported network base.nc has buses, lines, links
INFO:powerplantmatching.collection:Create combined dataset for GEO, GPD
INFO:powerplantmatching.cleaning:Aggregating blocks in data source 'GEO'.
Traceback (most recent call last):
  File "~/Documents/_github_/pypsa-africa/.snakemake/scripts/tmpikj0kau5.build_powerplants.py", line 260, in <module>
    pm.powerplants(from_url=False, update=True, config=config)
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/collection.py", line 223, in matched_data
    matched = collect(matching_sources, config=config, **collection_kwargs)
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/collection.py", line 96, in collect
    dfs = parmap(df_by_name, datasets)
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/utils.py", line 378, in parmap
    return list(map(f, arg_list))
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/collection.py", line 74, in df_by_name
    return aggregate_units(df, dataset_name=name, config=config)
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/cleaning.py", line 447, in aggregate_units
    duplicates = pd.concat([duke(df.query("Country == @c")) for c in countries])
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/cleaning.py", line 447, in <listcomp>
    duplicates = pd.concat([duke(df.query("Country == @c")) for c in countries])
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/site-packages/powerplantmatching/duke.py", line 152, in duke
    return pd.read_csv(
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 680, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 575, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 933, in __init__
    self._engine = self._make_engine(f, self.engine)
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1217, in _make_engine
    self.handles = get_handle(  # type: ignore[call-overload]
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/site-packages/pandas/io/common.py", line 789, in get_handle
    handle = open(
FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/qn/vpndfm21795ckkq89np1ckp40000gn/T/tmpi9oa1sgi/linkfile.txt'
[Tue May 31 22:48:13 2022]
INFO:snakemake.logging:[Tue May 31 22:48:13 2022]
Error in rule build_powerplants:
    jobid: 0
    output: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    log: logs/build_powerplants.log (check log file(s) for error message)

ERROR:snakemake.logging:Error in rule build_powerplants:
    jobid: 0
    output: resources/powerplants.csv, resources/powerplants_osm2pm.csv
    log: logs/build_powerplants.log (check log file(s) for error message)

RuleException:
CalledProcessError in line 333 of ~/Documents/_github_/pypsa-africa/Snakefile:
Command 'set -euo pipefail;  ~/opt/miniconda3/envs/pypsa-africa/bin/python3.10 ~/Documents/_github_/pypsa-africa/.snakemake/scripts/tmpikj0kau5.build_powerplants.py' returned non-zero exit status 1.
  File "~/Documents/_github_/pypsa-africa/Snakefile", line 333, in __rule_build_powerplants
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/concurrent/futures/thread.py", line 58, in run
ERROR:snakemake.logging:RuleException:
CalledProcessError in line 333 of ~/Documents/_github_/pypsa-africa/Snakefile:
Command 'set -euo pipefail;  ~/opt/miniconda3/envs/pypsa-africa/bin/python3.10 ~/Documents/_github_/pypsa-africa/.snakemake/scripts/tmpikj0kau5.build_powerplants.py' returned non-zero exit status 1.
  File "~/Documents/_github_/pypsa-africa/Snakefile", line 333, in __rule_build_powerplants
  File "~/opt/miniconda3/envs/pypsa-africa/lib/python3.10/concurrent/futures/thread.py", line 58, in run
Removing output files of failed job build_powerplants since they might be corrupted:
resources/powerplants_osm2pm.csv
WARNING:snakemake.logging:Removing output files of failed job build_powerplants since they might be corrupted:
resources/powerplants_osm2pm.csv
Shutting down, this might take some time.
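
The traceback ends with pandas failing to open linkfile.txt in a temporary directory, which suggests that the step expected to write that file did not produce it. The sketch below is a generic illustration of that failure pattern, not powerplantmatching's actual code: run_and_collect, its arguments, and the idea of checking for the output file before reading it are all assumptions made for illustration.

```python
import subprocess
import tempfile
from pathlib import Path


def run_and_collect(command, output_name="linkfile.txt"):
    """Run `command` in a temporary directory and return the file it should write.

    Raises RuntimeError with the tool's stderr if the file is missing, instead
    of letting a later reader fail with a bare FileNotFoundError.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        try:
            result = subprocess.run(
                command, cwd=tmpdir, capture_output=True, text=True, check=False
            )
        except FileNotFoundError as exc:
            # The executable itself (e.g. a missing runtime) could not be started.
            raise RuntimeError(f"could not start {command[0]!r}") from exc

        output = Path(tmpdir) / output_name
        if not output.is_file():
            raise RuntimeError(
                f"{command[0]!r} exited with code {result.returncode} and "
                f"wrote no {output_name}; stderr: {result.stderr.strip()!r}"
            )
        return output.read_text()
```

Surfacing the external tool's exit code and stderr at this point would make a missing runtime easier to spot than the pandas error shown in the log.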
@ekatef ekatef added the bug Something isn't working label Jun 3, 2022
@davide-f
Member

davide-f commented Jun 3, 2022

Awesome @ekatef !
I think this issue is actually related to powerplantmatching, so you could create a PR there to highlight it.
I can implement that easily on our fork, in my GitHub.
If you wish, you could open a PR there as well :)

In both cases, the solution means changing the environment files so that this dependency is also installed.
The environment file would need to specify the new channel (bioconda) and the new package to install (java-jdk), though in both cases the environment file should be fairly self-explanatory in this regard.

@davide-f
Member

davide-f commented Jun 3, 2022

An alternative solution is also shown in PyPSA/powerplantmatching#61 (comment)
Using the conda-forge channel may be better, as it is the main source for the other packages.

@ekatef
Member Author

ekatef commented Jun 4, 2022

@davide-f, thank you for guiding me through the context and adding the dependency to your powerplantmatching fork. I see that this issue has some pre-history :)

I can try to create a PR to the powerplantmatching repo. But it would be great to understand first whether openjdk is the best option, as the Java Development Kit seems to be available in quite a few variations. I absolutely agree that the availability of openjdk via conda-forge is a serious advantage, but perhaps there are some additional important details?

Besides, it seems that installation of JDK on Windows may require some additional work with environment variables as @oayana has reported. It would be great to know whether that is fixed automatically when installing via conda.
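
To look into the environment-variable question, one could inspect JAVA_HOME after installing via conda. The sketch below only reports the current state; it does not answer whether conda sets the variable automatically. java_home_status is a hypothetical helper, and the assumption that a valid JAVA_HOME contains bin/java (or bin/java.exe on Windows) reflects the usual JDK layout rather than anything confirmed in this thread.

```python
import os
from pathlib import Path


def java_home_status(environ=None):
    """Describe whether JAVA_HOME is set and contains a java executable."""
    env = os.environ if environ is None else environ
    java_home = env.get("JAVA_HOME")
    if not java_home:
        return "JAVA_HOME is not set"

    bin_dir = Path(java_home) / "bin"
    # The launcher is java.exe on Windows and java elsewhere.
    if any((bin_dir / name).is_file() for name in ("java", "java.exe")):
        return f"JAVA_HOME looks valid: {java_home}"
    return f"JAVA_HOME is set to {java_home}, but no java executable is in {bin_dir}"


if __name__ == "__main__":
    print(java_home_status())
```

Running it before and after `conda install openjdk` on a Windows machine would show whether any manual setup is still needed.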

I can try to investigate both questions and, if there are no obvious pitfalls, propose openjdk as a solution to powerplantmatching/main.

@davide-f
Member

davide-f commented Jun 4, 2022

Awesome! Feel free to investigate that.
Personally, I feel that the hvdc has priority over this issue.
On our fork we have implemented openjdk, and we can see whether any issues arise in the future.
I think we could go for either of the two (maybe openjdk?), and if other issues are reported, we can investigate them further.
What do you think?

@ekatef
Member Author

ekatef commented Jun 4, 2022

Absolutely agree :)

@pz-max
Member

pz-max commented Jun 4, 2022

Hi guys, we install @davide-f's fork of powerplantmatching with pip. I just added install-jdk to its setup.py.
This should now make sure that Java is installed along with our pip installation.

PR: https://github.com/davide-f/powerplantmatching/pull/3

@davide-f
Member

davide-f commented Jun 4, 2022

As we have solved the issue in our workflow, we can close it.
The problem has also been reported to powerplantmatching; only the PR is missing there, so this issue can now be closed.

@ekatef
Member Author

ekatef commented Jul 28, 2022

It seems that the install-jdk magic doesn't work for me for some reason. Each time I remove and re-install the environment, the error above appears, even though install-jdk is among the installed packages.

@pz-max, could you please have a look?

@davide-f
Member

Thanks for letting us know @ekatef !
Just to make sure: if you remove install-jdk and reinstall it, does it work?

If not, we could use java-jdk, which seems to be working.

@ekatef
Member Author

ekatef commented Jul 29, 2022

@davide-f, thank you for looking into this!

When trying to remove install-jdk, I get an error:

pypsa-africa install-jdk 
Collecting package metadata (repodata.json): done
Solving environment: failed

PackagesNotFoundError: The following packages are missing from the target environment:
  - install-jdk

It seems that something gets silently broken during installation...

The issue may be resolved with conda install -c bioconda java-jdk or more elegantly (as you suggested earlier) with conda install openjdk.
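
The PackagesNotFoundError above is what conda reports when it is asked to remove a package it does not track in the target environment. One possible explanation, an assumption not verified in this thread, is that install-jdk came in through pip (as a dependency of the powerplantmatching fork) rather than through conda. The sketch below uses only the standard library to read which installer a distribution records; installer_of is a hypothetical helper name.

```python
from importlib import metadata


def installer_of(dist_name):
    """Return the installer recorded for a distribution, or None if unknown.

    The value comes from the INSTALLER file in the distribution's metadata,
    which is typically written by the tool that installed it.
    """
    try:
        dist = metadata.distribution(dist_name)
    except metadata.PackageNotFoundError:
        return None
    installer = dist.read_text("INSTALLER")
    return installer.strip() if installer else None


if __name__ == "__main__":
    print("install-jdk installed by:", installer_of("install-jdk"))
```

If this reports pip, the package would have to be removed with pip rather than conda. Either way, the presence of the install-jdk package says nothing by itself about whether a java executable is available.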

@pz-max
Member

pz-max commented Jul 30, 2022

@ekatef how did you solve this?
Just faced the same error.
Think we should fix it

@ekatef
Member Author

ekatef commented Jul 31, 2022

@pz-max I just installed openjdk as a quick fix and it worked. I should have reported the issue earlier, but I wasn't sure whether it was reproducible and not too system-dependent...
