Start converting 'scrape' to 'process' (#1)
* Start converting 'scrape' to 'process'

* Finish converting 'scrape' to 'process'

---------

Co-authored-by: Brian Mirletz <[email protected]>
mikebannis and brtietz authored Aug 17, 2023
1 parent a994510 commit b40db96
Showing 8 changed files with 158 additions and 52 deletions.
104 changes: 104 additions & 0 deletions .gitignore
@@ -0,0 +1,104 @@
# Custom list
.DS_Store

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# dotenv
.env

# virtualenv
.venv
venv/
ENV/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
12 changes: 6 additions & 6 deletions README.md
@@ -3,7 +3,7 @@
Python files and Jupyter notebooks for processing the Annual Technology Baseline (ATB) electricity data and determining LCOE and other metrics. All documentation and data for the ATB is available at the [ATB website](https://atb.nrel.gov).

## Installation and Requirements
-The pipeline requires [Python](https://www.python.org) 3.10 or newer. Dependancies can be installed using `pip`:
+The pipeline requires [Python](https://www.python.org) 3.10 or newer. Dependancies can be installed using `pip`:

```
$ pip install -r requirements.txt
@@ -30,27 +30,27 @@ is the path and filename to the ATB electricity data workbook `xlsx` file.
Process all techs and export to a flat file named `flat_file.csv`:

```
-$ python -m lcoe_calculator.full_scrape --save-flat flat_file.csv {PATH-TO-DATA-WORKBOOK}
+$ python -m lcoe_calculator.process_all --save-flat flat_file.csv {PATH-TO-DATA-WORKBOOK}
```

Process only land-based wind and export pivoted data and meta data:

```
-$ python -m lcoe_calculator.full_scrape --tech LandBasedWindProc \
+$ python -m lcoe_calculator.process_all --tech LandBasedWindProc \
--save-pivoted pivoted_file.csv --save-meta meta_file.csv {PATH-TO-DATA-WORKBOOK}
```

Process only pumped storage hydropower and copy data to the clipboard so it may be pasted into a spreadsheet:

```
-$ python -m lcoe_calculator.full_scrape --tech PumpedStorageHydroProc \
+$ python -m lcoe_calculator.process_all --tech PumpedStorageHydroProc \
--clipboard {PATH-TO-DATA-WORKBOOK}
```

-Help for the scraper and the names of available technologies can be viewed by running:
+Help for the processor and the names of available technologies can be viewed by running:

```
-$ python -m lcoe_calculator.full_scrape --help
+$ python -m lcoe_calculator.process_all --help
```
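The flat versus pivoted output formats mentioned in the CLI examples can be sketched with pandas. This is an illustrative reshape only: the column names and values below are made-up placeholders, not real ATB data or the pipeline's actual schema.

```python
import pandas as pd

# Hypothetical flat-format rows: one row per (technology, parameter, year).
# Values are placeholders, not real ATB numbers.
flat = pd.DataFrame({
    "technology": ["LandBasedWind"] * 4,
    "parameter": ["LCOE", "LCOE", "CAPEX", "CAPEX"],
    "year": [2030, 2050, 2030, 2050],
    "value": [32.0, 24.0, 1100.0, 900.0],
})

# Pivoted format: one column per parameter, indexed by technology and year.
pivoted = flat.pivot_table(index=["technology", "year"],
                           columns="parameter", values="value")
print(pivoted)
```

The same frame round-trips back to flat form with `pivoted.stack().reset_index()`.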

## Debt Fraction Calculator
3 changes: 1 addition & 2 deletions debt_fraction_calculator/debt_fraction_calc.py
@@ -5,15 +5,14 @@
# (see https://github.com/NREL/ATB-calc).
#
"""
-Workflow to calculate debt fractions based on scraped data
+Workflow to calculate debt fractions based on ATB data
Developed against PySAM 4.0.0
"""
from typing import TypedDict, List, Dict, Type
import pandas as pd
import click


import PySAM.Levpartflip as levpartflip

from lcoe_calculator.extractor import Extractor
25 changes: 14 additions & 11 deletions example_notebooks/Full work flow.ipynb
@@ -31,7 +31,7 @@
"from datetime import datetime as dt\n",
"\n",
"sys.path.insert(0, os.path.dirname(os.getcwd()))\n",
-"from lcoe_calculator.full_scrape import FullScrape\n",
+"from lcoe_calculator.process_all import ProcessAll\n",
"from lcoe_calculator.tech_processors import (ALL_TECHS,\n",
" OffShoreWindProc, LandBasedWindProc, DistributedWindProc,\n",
" UtilityPvProc, CommPvProc, ResPvProc, UtilityPvPlusBatteryProc,\n",
@@ -79,8 +79,8 @@
"# Or process a single technology\n",
"techs = LandBasedWindProc\n",
"\n",
-"# Initiate the scraper with the workbook location and desired technologies\n",
-"scraper = FullScrape(atb_electricity_workbook, techs)"
+"# Initiate the processor with the workbook location and desired technologies\n",
+"processor = ProcessAll(atb_electricity_workbook, techs)"
]
},
{
@@ -90,7 +90,10 @@
"metadata": {},
"source": [
"## Run the pipeline\n",
-"Now that the scraper knows where the data workbook is and which technologies were interested in we can kick it off. Depending on the number of requested technologies, this can take a couple minutes. Note that calculated LCOE and CAPEX is automatically compared to the values in the workbook. Not all technologies have LCOE and CAPEX."
+"Now that the processor knows where the data workbook is and which technologies we are interested in, we\n",
+"can kick it off. Depending on the number of requested technologies, this can take a couple minutes.\n",
+"Note that calculated LCOE and CAPEX is automatically compared to the values in the workbook. Not all\n",
+"technologies have LCOE and CAPEX."
]
},
{
@@ -103,7 +106,7 @@
"outputs": [],
"source": [
"start = dt.now()\n",
-"scraper.scrape()\n",
+"processor.process()\n",
"print('Processing completed in ', dt.now() - start)"
]
},
@@ -124,16 +127,16 @@
"outputs": [],
"source": [
"# Save data to as a CSV\n",
-"scraper.to_csv('atb_data.csv')\n",
+"processor.to_csv('atb_data.csv')\n",
"\n",
"# Save flattened data to as a CSV\n",
-"scraper.flat_to_csv('atb_data_flat.csv')\n",
+"processor.flat_to_csv('atb_data_flat.csv')\n",
"\n",
"# Save meta data to as a CSV\n",
-"scraper.meta_data_to_csv('atb_meta_data.csv')\n",
+"processor.meta_data_to_csv('atb_meta_data.csv')\n",
"\n",
"# Copy data to the clipboard so it can be pasted in a spreadsheet \n",
-"scraper.data.to_clipboard()"
+"processor.data.to_clipboard()"
]
},
{
@@ -152,7 +155,7 @@
"metadata": {},
"outputs": [],
"source": [
-"data = scraper.data\n",
+"data = processor.data\n",
"\n",
"# Show available parameters\n",
"print('Available parameters')\n",
@@ -184,7 +187,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.11.4"
+"version": "3.10.12"
},
"vscode": {
"interpreter": {
1 change: 0 additions & 1 deletion example_notebooks/Process ATB electricity technology.ipynb
@@ -31,7 +31,6 @@
"from datetime import datetime as dt\n",
"\n",
"sys.path.insert(0, os.path.dirname(os.getcwd()))\n",
-"from lcoe_calculator.full_scrape import FullScrape\n",
"\n",
"# Electricity technology processors\n",
"from lcoe_calculator.tech_processors import (\n",
11 changes: 6 additions & 5 deletions lcoe_calculator/README.md
@@ -1,14 +1,15 @@
# ATB Calculator
-This code scrapes the Excel ATB data workbook, calculates LCOE and CAPEX
-for all technologies as needed, and exports data in flat or flat + pivoted formats.
+This code extracts data from the ATB Excel workbook, then calculates LCOE and CAPEX for all
+technologies as needed, and exports data in flat or flat and pivoted formats.

-**Note:** You will likely have to give Python access to interact with Excel. A window will automatically ask for permission the first time this script is run.
+**Note:** You will likely have to give Python access to interact with Excel. A window will
+automatically ask for permission the first time this script is run.

## Files
Files are listed in roughly descending order of importance and approachability.

-- `full_scrape.py` - Class that performs full scrape with built in command line interface. See the README in the root of this repo for CLI examples.
-- `tech_processors.py` - Classes to scrape and process individual technologies. Any new ATB technologies should be added to this file.
+- `process_all.py` - Class that performs processing for all ATB technologies with a built-in command line interface. See the README in the root of this repo for CLI examples.
+- `tech_processors.py` - Classes to process individual technologies. Any new ATB technologies should be added to this file.
- `base_processor.py` - Base processor class that is subclassed to process individual technologies.
- `config.py` - Constant definitions including the base year and scenario names
- `extractor.py` - Code to pull values from the workbook
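The file layout above — a base processor subclassed once per technology and driven by a top-level runner — can be sketched as follows. The class names mirror the repo's (`LandBasedWindProc`, `ALL_TECHS`), but the method bodies and return values are placeholder assumptions, not the real implementation.

```python
# Minimal sketch of the subclass-per-technology pattern described above.
# Real processors extract workbook values and compute metrics; here the
# bodies are placeholders, so only the structure mirrors the repo.

class BaseProcessor:
    """Shared processing steps for a single ATB technology."""
    tech_name = "base"

    def process(self):
        # Subclasses supply the technology-specific calculation.
        return {"technology": self.tech_name, "lcoe": self.calc_lcoe()}

    def calc_lcoe(self):
        raise NotImplementedError


class LandBasedWindProc(BaseProcessor):
    tech_name = "LandBasedWind"

    def calc_lcoe(self):
        return 32.0  # placeholder, not a real ATB value


# The runner iterates over every registered technology class.
ALL_TECHS = [LandBasedWindProc]
results = [proc().process() for proc in ALL_TECHS]
print(results)
```

Adding a new technology then means adding one subclass and registering it in the technology list, which matches the note that new ATB technologies go in `tech_processors.py`.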