Is your feature request related to a problem? Please describe.
Describe the solution you'd like
The project aims to develop a reinforcement learning (RL) agent to optimize waste collection in a simulated environment, minimizing overflow events and improving efficiency.
Environment and State Representation:
The state is represented by four features:
- Waste Level: the current waste level (0 to 1)
- Time of Day: a random value representing the time (0 to 24 hours)
- Weather Condition: a random value (0 to 1) indicating the weather
- Distance to Collection Point: a random value (0 to 10) representing the distance to the waste collection point
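A minimal sketch of how such a state vector might be sampled. The feature ranges come from the list above; the function name and the use of NumPy are illustrative assumptions, not the project's actual code:

```python
import numpy as np

def sample_state(rng: np.random.Generator) -> np.ndarray:
    """Draw one state; ranges follow the feature list above."""
    waste_level = rng.uniform(0.0, 1.0)   # current fill fraction
    time_of_day = rng.uniform(0.0, 24.0)  # hour of day
    weather = rng.uniform(0.0, 1.0)       # weather condition score
    distance = rng.uniform(0.0, 10.0)     # distance to collection point
    return np.array([waste_level, time_of_day, weather, distance])

state = sample_state(np.random.default_rng(0))
```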
Action Space:
The agent can choose between two actions:
- Wait (0): do not collect waste.
- Collect Waste (1): proceed with waste collection.
Reward Structure:
The reward system is designed to encourage efficient waste collection:
- +10 for timely collection when the waste level exceeds the threshold
- -5 for premature collection when the waste level is below the threshold
- -1 per time step, to penalize waiting
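A hedged sketch of this reward scheme and action encoding. The threshold value of 0.8 is an assumption; the issue does not specify it:

```python
WAIT, COLLECT = 0, 1   # action encoding from the action space above
THRESHOLD = 0.8        # assumed collection threshold; not specified in the issue

def reward(action: int, waste_level: float) -> float:
    """Reward scheme described above."""
    if action == COLLECT:
        # +10 for timely collection, -5 for premature collection
        return 10.0 if waste_level >= THRESHOLD else -5.0
    return -1.0  # -1 per time step for waiting
```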
Training Process:
The agent is trained over 100 episodes. Each episode simulates up to 20 time steps, at each of which the agent makes a decision based on the current state. The agent learns from experience using a replay memory and updates its policy through Q-learning.
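A simplified sketch of this training loop, reusing the reward() function and action constants from above. The waste level is discretized into bins so a tabular Q-table can stand in for whatever function approximator the real implementation uses; the hyperparameters (learning rate, discount, epsilon schedule, batch size) and the waste dynamics are all assumptions:

```python
import random
from collections import deque
import numpy as np

N_BINS, N_ACTIONS = 10, 2            # waste level discretized into 10 bins
ALPHA, GAMMA = 0.1, 0.99             # assumed learning rate and discount
EPS, EPS_MIN, EPS_DECAY = 1.0, 0.05, 0.98

q_table = np.zeros((N_BINS, N_ACTIONS))
memory = deque(maxlen=2000)          # replay memory
rng = np.random.default_rng(0)

def bin_of(waste_level: float) -> int:
    return min(int(waste_level * N_BINS), N_BINS - 1)

reward_history, epsilon_history, overflow_history = [], [], []

for episode in range(100):           # 100 episodes, as described above
    waste = rng.uniform(0.0, 0.5)    # assumed initial fill level
    total_reward, overflows = 0.0, 0
    for t in range(20):              # up to 20 time steps per episode
        s = bin_of(waste)
        # epsilon-greedy action selection
        a = int(rng.integers(N_ACTIONS)) if rng.random() < EPS else int(np.argmax(q_table[s]))
        r = reward(a, waste)
        # assumed dynamics: collection empties the bin, waiting lets waste grow
        waste = 0.0 if a == COLLECT else min(waste + rng.uniform(0.0, 0.15), 1.0)
        if waste >= 1.0:             # count overflow events
            overflows += 1
        memory.append((s, a, r, bin_of(waste)))
        # Q-learning update on a random replay batch
        if len(memory) >= 32:
            for ms, ma, mr, ms2 in random.sample(memory, 32):
                target = mr + GAMMA * np.max(q_table[ms2])
                q_table[ms, ma] += ALPHA * (target - q_table[ms, ma])
        total_reward += r
    EPS = max(EPS_MIN, EPS * EPS_DECAY)  # decay exploration once per episode
    reward_history.append(total_reward)
    epsilon_history.append(EPS)
    overflow_history.append(overflows)
```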
Evaluation Metrics:
Performance is evaluated using:
- Average Reward per Episode: measures the effectiveness of the agent's actions.
- Epsilon Decay: tracks the exploration rate, indicating how the agent balances exploration and exploitation.
- Overflow Events: counts occurrences where the waste level exceeds the maximum capacity.
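For instance, the histories logged in the training-loop sketch above could be summarized like this (the smoothing window is an arbitrary choice):

```python
import numpy as np

window = 10  # assumed smoothing window for the running average
running_avg = np.convolve(reward_history, np.ones(window) / window, mode="valid")
print(f"reward, last {window}-episode average: {running_avg[-1]:.2f}")
print(f"total overflow events across training: {sum(overflow_history)}")
```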
Visualization:
The results are visualized using Matplotlib:
- Average reward per episode, showing the agent's learning progression.
- Epsilon decay over episodes, illustrating the shift from exploration to exploitation.
- Overflow events per episode, highlighting improvements in waste management.
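A sketch of the three plots, again drawing on the histories from the training-loop sketch above:

```python
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 3, figsize=(15, 4))
axes[0].plot(reward_history)
axes[0].set(title="Reward per episode", xlabel="Episode", ylabel="Total reward")
axes[1].plot(epsilon_history)
axes[1].set(title="Epsilon decay", xlabel="Episode", ylabel="Epsilon")
axes[2].plot(overflow_history)
axes[2].set(title="Overflow events per episode", xlabel="Episode", ylabel="Count")
plt.tight_layout()
plt.show()
```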
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Approach to be followed (optional)
A clear and concise description of the approach to be followed.
Additional context
Add any other context or screenshots about the feature request here.
Thanks for creating the issue in ML-Nexus! 🎉
Before you start working on your PR, please make sure to:
⭐ Star the repository if you haven't already.
Pull the latest changes to avoid any merge conflicts.
Attach before & after screenshots in your PR for clarity.
Include the issue number in your PR description for better tracking.
Don't forget to follow @UppuluriKalyani – Project Admin – for more updates!
Tag @Neilblaze, @SaiNivedh26 to have the issue assigned to you.
Happy open-source contributing! ☺️