
Min max scaling for observation space #508

Open · wants to merge 11 commits into main
Conversation

@kim-mskw (Contributor) commented Dec 3, 2024

Pull Request

Description

To include more robust observation space scaling, min-max scaling is proposed in place of the formerly introduced max scaling.

Changes Proposed

  • min/max scaling in learning strategies and advanced learning strategies
  • not in tutorials, as we want to stick to an easy scaling introduction and not over-engineer it there
  • changes of actual scaling values are minor
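
The proposed change can be illustrated with a minimal sketch; the actual helper lives in assume/reinforcement_learning/learning_utils.py and may differ in details such as signature and clipping behavior:

```python
import numpy as np

def min_max_scale(x, min_val, max_val):
    # Map x linearly into [0, 1] using known lower and upper bounds,
    # instead of dividing by the maximum alone as in the former max scaling.
    return (np.asarray(x, dtype=float) - min_val) / (max_val - min_val)

# e.g. a price forecast bounded by the observed min and max market price
prices = np.array([-20.0, 0.0, 50.0, 100.0])
scaled = min_max_scale(prices, min_val=-20.0, max_val=100.0)  # values in [0, 1]
```

Unlike plain max scaling, this keeps negative values (e.g. negative prices) inside the [0, 1] range rather than producing negative observations.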

Testing

With example 02a tiny

Checklist

Please check all applicable items:

  • Code changes are sufficiently documented (docstrings, inline comments, doc folder updates)
  • New unit tests added for new features or bug fixes
  • Existing tests pass with the changes
  • Reinforcement learning examples are operational (for DRL-related changes)
  • Code tested with both local and Docker databases
  • Code follows project style guidelines and best practices
  • Changes are backwards compatible, or deprecation notices added
  • New dependencies added to pyproject.toml
  • A note for the release notes doc/release_notes.rst of the upcoming release is included
  • Consent to release this PR's code under the GNU Affero General Public License v3.0

@kim-mskw (Contributor Author) commented Dec 3, 2024

@AndreasEppler I made a couple of quick fixes and pushed them. The scaling of the action space (from the hyperparameters) and the observation space were somewhat mixed up. Could you take care of the remaining points? Specifically, testing with an example that uses both normal and advanced orders.

scaled_res_load_forecast = min_max_scale(
unit.forecaster[f"residual_load_{market_id}"].loc[start:],
lower_scaling_factor_res_load,
upper_scaling_factor_res_load,
)
scaled_res_load_forecast = np.concatenate(
@kim-mskw (Contributor Author):
Why no scaling in this part?

@kim-mskw (Contributor Author):

I mistakenly marked these lines; I meant the ones below.

Member:

oh, this is a mistake, this should be fixed

codecov bot commented Dec 3, 2024

Codecov Report

Attention: Patch coverage is 93.10345% with 4 lines in your changes missing coverage. Please review.

Project coverage is 76.66%. Comparing base (7457a21) to head (e63cbbe).

Files with missing lines Patch % Lines
assume/strategies/learning_advanced_orders.py 90.47% 2 Missing ⚠️
assume/strategies/learning_strategies.py 94.28% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #508      +/-   ##
==========================================
+ Coverage   76.55%   76.66%   +0.11%     
==========================================
  Files          51       51              
  Lines        6871     6896      +25     
==========================================
+ Hits         5260     5287      +27     
+ Misses       1611     1609       -2     
Flag Coverage Δ
pytest 76.66% <93.10%> (+0.11%) ⬆️

Flags with carried forward coverage won't be shown.

@nick-harder (Member) left a comment:

Good suggestion, and a nice catch on the mistake of not scaling some values. I have left some comments on improving the performance and how things are handled.

# stays here as it is unit specific, and different forecasts might apply to different units
# different handling would require an extra unit loop at learning role initialization and unit-specific max/min values
# furthermore, forecasts might change during the simulation if advanced forecasting is used
self.max_market_price = max(unit.forecaster[f"price_{market_id}"])
Member:

this can be done in the init so it runs only once rather than during every call of the function

@kim-mskw (Contributor Author):

I disagree for the reasons stated in the lengthy comment.

Member:

@kim-mskw what lengthy comment? Don't see it
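
For illustration, the two options under discussion (recomputing the bounds on every call vs. fixing them once at initialization) could look like this; class and attribute names are hypothetical, not the actual strategy code:

```python
class StrategyPerCall:
    # bounds recomputed on every observation; robust if forecasts
    # change during the simulation (e.g. with advanced forecasting)
    def create_observation(self, forecast):
        lo, hi = min(forecast), max(forecast)
        return [(p - lo) / (hi - lo) for p in forecast]

class StrategyInInit:
    # bounds fixed once at initialization; cheaper per call,
    # but stale if the forecast series is updated later
    def __init__(self, forecast):
        self.lo = min(forecast)
        self.hi = max(forecast)

    def create_observation(self, forecast):
        span = self.hi - self.lo
        return [(p - self.lo) / span for p in forecast]
```

Both produce the same result as long as the forecast does not change after initialization; the trade-off only matters when forecasts are updated mid-simulation.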

# price forecast
scaling_factor_price = self.max_bid_price
upper_scaling_factor_price = self.max_market_price
Member:

why assign a new variable here if we can use the self. values directly?

@kim-mskw (Contributor Author):

We did the same beforehand to make this crucial part of the learning easily changeable for new users; compare the RL tutorial. We could change it if you think it would be easier to understand.
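
The two styles being compared, sketched with hypothetical names:

```python
def min_max_scale(x, low, high):
    return (x - low) / (high - low)

class PriceScaler:
    # hypothetical container for the scaling bounds
    def __init__(self, min_market_price, max_market_price):
        self.min_market_price = min_market_price
        self.max_market_price = max_market_price

    def scale_with_aliases(self, price):
        # Style used in the RL tutorial: local aliases make the chosen
        # bounds explicit and easy for new users to swap out
        lower_scaling_factor_price = self.min_market_price
        upper_scaling_factor_price = self.max_market_price
        return min_max_scale(price, lower_scaling_factor_price, upper_scaling_factor_price)

    def scale_direct(self, price):
        # Alternative: use the attributes directly, one assignment fewer
        return min_max_scale(price, self.min_market_price, self.max_market_price)
```

Functionally identical; the choice is purely about readability for newcomers versus brevity.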

@@ -320,18 +342,30 @@ def create_observation(
current_costs = unit.calculate_marginal_cost(start, current_volume)

# scale unit outputs
scaled_max_power = current_volume / scaling_factor_total_capacity
scaled_marginal_cost = current_costs / scaling_factor_marginal_cost
scaled_max_power = min_max_scale(
Member:

same here, should be calculated only once in init

@kim-mskw (Contributor Author):

I disagree for the reasons stated in the lengthy comment.

@@ -36,6 +36,8 @@ Upcoming Release
- **Outputs Role Performance Optimization:** Output role handles dict data directly and only converts to DataFrame on Database write.
- **Overall Performance Optimization:** The overall performance of the framework has been improved by a factor of 5x to 12x
depending on the size of the simulation (number of units, markets, and time steps).
- **Learning Opservation Space Scaling:** Instead of the formerly used max sclaing of the observation space, we added a min-max scaling to the observation space.
Member:

Suggested change
- **Learning Opservation Space Scaling:** Instead of the formerly used max sclaing of the observation space, we added a min-max scaling to the observation space.
- **Learning Observation Space Scaling:** Instead of the formerly used max scaling of the observation space, we added a min-max scaling to the observation space.

@kim-mskw (Contributor Author):

I thought the new pre-commit would fix something like this?


# stays here as it is unit specific, and different forecasts might apply to different units
# different handling would require an extra unit loop at learning role initialization and unit-specific max/min values
# furthermore, forecasts might change during the simulation if advanced forecasting is used
self.max_market_price = max(unit.forecaster[f"price_{market_id}"])
Member:

we should also compute these things in the init directly to save time. Also, there is no need to assign values from self, like upper_scaling_factor_res_load = self.max_residual, since we can use the self. values directly

@kim-mskw (Contributor Author):

I disagree for the reasons stated in the lengthy comment.

@@ -10,6 +10,7 @@
from assume.common.base import SupportsMinMax
from assume.common.market_objects import MarketConfig, Orderbook, Product
from assume.common.utils import get_products_index
from assume.reinforcement_learning.learning_utils import min_max_scale
Member:

I don't see this as really a learning util, but just a simple util
