-
Notifications
You must be signed in to change notification settings - Fork 929
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Native support for multiple agent types #1894
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1894 +/- ##
==========================================
+ Coverage 77.35% 77.53% +0.17%
==========================================
Files 15 15
Lines 1007 1015 +8
Branches 220 221 +1
==========================================
+ Hits 779 787 +8
Misses 197 197
Partials 31 31 ☔ View full report in Codecov by Sentry. |
074024d
to
1531ea7
Compare
mesa/model.py
Outdated
@@ -41,6 +41,29 @@ def __init__(self, *args: Any, **kwargs: Any) -> None: | |||
self.running = True | |||
self.schedule = None | |||
self.current_id = 0 | |||
self.agents_by_type = {} | |||
|
|||
def create_agent(self, agent_type, *args, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rht Could we type hint that agent_type
should be an instance of Agent? How would that work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I am not mistaken it should be the same as in the agent class where you pass in the model model: Model
so it would be agent:Agent
you will need to import the Agent module though.
I'd say a less drastic change would be to modify the schedulers. Currently, there needs to be a permutation of [random activation, staged activation, simultaneous activation, discrete event, ...] x [1 agent, multiple agent]. This needs to be reduced to 1D of [random activation, staged activation, simultaneous activation, discrete event, ...] that is multiple type by default. There are 2 problems:
|
This is not the case given that you said the scheduler will access |
Actually, it could still be modular even in the model class. There could be an argument Note this all can be implemented without breaking compatibility by modifying the schedulers to by multiple type by default, so can happen without having to wait until Mesa 3.0. The remaining issue of whether to put the |
Another problem is that this is limited to grouping based on agent classes. Sometimes you want to "groupby" agents with various other metrics (inside/outside a bar in the El Farol model, the condition in Epstein civil violence model, or solvent/insolvent bank). This is the same issue I raised in the multi-agent data collection thread #348 (comment). See also https://github.com/tpike3/multilevel_mesa. |
Thanks for the extensive replies!
Sorry, I don’t understand the problem fully. Could you maybe give an example?
Is there a fundamental reason we couldn’t keep everything in the model and let both the scheduler and the grid access if from there.
For me, model feels like the right place. It’s the big overarching construct, from which other modules can access.
I think you’re spot on here: this is another problem. I’m suggesting native support for different kind (classes) of agents throughout the model, like NetLogo has breeds. You’re suggesting a universal way to filter agents on current state values. State values are parametric differences, while different classes are structural. So I think that are two different problems. |
I already replied that it is not a problem, because you could pass a custom agent container constructor to the class.
The grid/space is its own agent container that stores spatial relations between the agents. Needed for getting the neighbors.
If I were to summarize. #348 (comment) does |
I suppose I need to describe some concrete examples. I replaced # staged activation by type and condition
# Both wolf and sheep
model.agents.do_each("move")
model.agents.get_by(class=Sheep).do_each("eat_or_die")
model.agents.get_by(class=Wolf).do_each("hunt_or_die")
# Both wolf and sheep
model.agents.get_by(condition=lambda a: a.check_is_ready()).do_each("reproduce")
# RiskAverseTrader and AdaptiveTrader
# This is a string because a group could be a class that both classes inherit from, or for some conditional.
# The group argument is for when it is final, i.e. unchanged after the model initialization. For dynamic grouping, use the condition argument.
model.agents.get_by(group="trader").do_each("trade")
|
This is a great example and I really like the cleanness. It reads well from left to right, like proper Python. It's also nice to see the step not being one solid thing, but being modular and flexible. The Staged and ByType schedulers are also not that necessary anymore, and could maybe be replaced. I have another idea, I will try to work it out tomorrow. But I already like this discussion and talking about the different options! |
I was using system 1 (in this context) to write that. But deliberating on system 2 is better, and need some time. |
I started typing out this long detailed reply, only come to the conclusion that I basically ended up with the same solution you did.
I have still a few questions however:
I would really like to see an implementation for your approach, since I think it's the right direction! We might still encounter some minor things but generally I really like it. Okay, is my idea. I don't necessarily know if it's better, but maybe we can reach best of both.
It's based these premises:
Then, like I said before, I see this as two different problems:
(note that in Python, a property and a state are both modelled as an agent attribute. How ever, the convention is to call things that don't change during the model run properties (they are static) and things that do change states (they are dynamic). Now, the solution for problem 1) I have offered in this PR. I think that still is the best way to do it, just keep a dictionary of types. Is conceptually easy, straightforward to use and aligns well with PEP 20 in my opinion.
(I know I'm cheating since I'm Dutch) Then, the solution for problem 2. I would propose to have an Edit: Started working on it and we basically have the same solution. |
@EwoutH So this would be my take based on this conversation Agents are python objects that are stored in a dictionary in
*This leads to the question of whether the agents by type should be more basic (e.g. part of the |
Thanks for getting back!
Wait, so you first get all the agent Originally, I imagined a rich_traders = self.select_agents(class=Trader, attributes=["wealth">=5])
total_wealth_rich_traders = sum(a.wealth for a in rich_traders) However, I liked rth's proposal just as much, especially the
So in this proposal and in my opinion, definitely yes. And even more general than only in time.py, it should be in the model by default so you can always trust in any Mesa Module, Class and method that the type is available and you can iterate over it.
This PR proposes to change that aspect (see the code diff and commit/PR messages). The agent types should be accessible from both the model ( I'm now really curious for a full (draft) implementation of rht's proposal. @rht are you willing to make one? |
Apologies for the delay, This is a pretty big architectural change, but merits discussion. And it makes me realize our current architectural drawing is out of date (below). My thought, would be to discuss the architecture as I think it will reduce the cognitive load on the discussion. The idea would be I update the architecture design and then we can discuss around that. Let me know what you think. |
@tpike3 looking forward to your new conceptualisation! AgentPy also uses a construct like this in almost all their example models: class VirusModel(ap.Model):
def setup(self):
# Create agents
self.agents = ap.AgentList(self, self.p.population, Person) class WealthModel(ap.Model):
def setup(self):
# Create agents
self.agents = ap.AgentList(self, self.p.agents, WealthAgent) So the concept of having one single variable that contains all the agents - in our case stored by type - in not new in Python ABM modelling. |
Todo: Edit: Both are not relevant anymore, the new implementation already takes care of both. |
@quaquel thanks for you feedback. I looked at it, and have the following thoughts:
Thanks again for you suggestions, I'm learning a lot diving deeper in those Python constructs and thinking critically about how they would apply in our situation. So if you have the time, keep it coming! @rht, @Corvince and @wang-boy, I'm also really curious on your takes on this! |
Just some thoughts without diving too deeply into MESA itself but coming at it as a generic devs/abm modeler with ample Python experience who also happens to teach ABM with MESA to MSc. students (and has build a pPython connector to NetLogo without ever having built a single NetLogo model). I agree that having an agent_by_type overview at the model level is a good idea. Similarly, keeping track of agents within the model should not be a concern of the user. This means using weakrefs everywhere (so in all schedulers and spaces) except in one place. You need to use hard refs in one place because, otherwise, agent instances are garbage collected once all references have been removed. What I don't like is burdening the user with additional api calls when creating and killing agent instances (i.e., When creating a new Agent instance, this should handle the creation of the identifier (do you even really need this....?) as well as adding the instance to As an aside, I think looking at NetLogo for how you want to handle anything is strange. NetLogo is based on a now 50-year-old language. Why confine yourself to choices made 50 years ago if you can access a much more modern language with very sophisticated OO features? |
Right, I forgot that Agents have access to their model instance from very start, since they’re initiated with it. I’m my head is was a problem, because I thought some things needed to happen from the model scope (so in a model method like Thanks, I’m going to try to move the create and kill logic into the agent. Edit: Not having to move people to start using |
24bca6a
to
c29ab08
Compare
I reimplemented this PR fully in c29ab08 based on the discussions and updated the PR description. In short:
With these changes, there is now no backwards incompatible behavior and users don't need to modify existing models. |
Tracks agents in the model with a defaultdict. This PR adds a new `agents` dictionary to the Mesa `Model` class, enabling native support for handling multiple agent types within models. This way all modules can know which agents and agents type are in the model at any given time, by calling `model.agents`. NetLogo has had agent types, called [`breeds`](https://ccl.northwestern.edu/netlogo/docs/dict/breed.html), built-in from the start. It works perfectly in all NetLogo components, because it's a first class citizen and all components need to be designed to consider different breeds. In Mesa, agent types are an afterthought at best. Almost nothing is currently designed with multiple agent types in mind. That has caused several issues and limitations over the years, including: - projectmesa#348 - projectmesa#1142 - projectmesa#1162 Especially in scheduling, space and datacollection, lack of a native, consistent construct for agent types severely limits the possibilities. With the discussion about patches and "empty" this discussion done again. You might want empty to refer to all agents or only a subset of types or single type. That's currently cumbersome to implement. Basically, by always having dictionary available of which agents of which types are in the model, you can always trust on a consistent construct to iterate over agents and agent types. - The `Model` class now uses a `defaultdict` to store agents, ensuring a set is automatically created for each new agent type. - The `Agent` class has been updated to leverage this feature, simplifying the registration process when an agent is created. - The `remove` method in the `Agent` class now uses `discard`, providing a safer way to remove agents from the model.
In the tests the MockModel currently isn't properly initialized, because super().__init__() isn't called, and thus the Model init isn't ran. This commit fixes that.
f36f20d
to
a74b120
Compare
Final review checklist
@tpike3 and @jackiekazil please do final review. Let's get this merged! |
I am sorry for putting on brakes on this, but I think one detail needs to be changed. This stores agents in sets. While this appears to be the right datatype it has one big disadvantage: In Python the order of items in a set is non-deterministic. So in contrast to dictionaries they don't preserve insertion order. Since mesa is used for research we shouldn't introduce non-deterministic behavior to allow completely reproducible results. Since dictionaries work very similar, you can probably (ab)use dictionary keys instead of using a set. Otherwise this really looks very clean and since it has no breaking changes very mergeable. |
lists keeps insertion order.
Fair point. Changed from a defaultdict of sets to a defaultdict of lists (see a8bb15f). |
You are taking a massive performance hit by going to lists when you want to remove agents. I also fail to understand why, for research purposes, the order needs to be deterministic. The primary purpose of this is to keep track of agents per type. If you want deterministic behavior, you can always do a sort on ID and return a list. This would solve the performance loss while also ensuring deterministic behavior. Also, there are more sophisticated data structures that could be created that have the retrieval speed for removal of a set while providing deterministic behavior (e.g., ordered sets or some more involved linked list kind of thing such as a dict of dicts: [type]:{[id]:instance}). So the question becomes: what is done more frequently: removing agents from the dict or retrieving the collection of agents for a specific agent type? |
I did a benchmark to see the impact. Since most models never remove agents, it tool the Wolf-Sheep model from mesa-examples and ran 5 times with the default settings, 125 times. Here are the results: The t-test is insignificant. from scipy import stats
stats.ttest_ind(list_results, set_results)
TtestResult(statistic=-1.2073182676004544, pvalue=0.22845980311728672, df=248.0) mesa_with_wolfsheep_benchmark.zip Given that with a model that frequently deletes agents, the performance impact is unnoticeable, I would say lets go with the lists. If we find some weird edge-case later we can always patch it. Perfect is the enemy of done. |
@quaquel did some benchmarks (thanks!): self.agents = defaultdict(list)
self.agents[type(agent)].append(agent) Performance for adding and removing 1000 agents: 2.06 ms ± 3.11 µs self.agents = defaultdict(set)
self.agents[type(agent)].add(agent) Performance for adding and removing 1000 agents: 439 µs ± 1.3 µs self.agents = defaultdict(dict)
self.agents[type(agent)][agent] = None Performance for adding and removing 1000 agents: 460 µs ± 2.11 µs All options are imperfect:
What we need in an OrderedSet: https://discuss.python.org/t/add-orderedset-to-stdlib/12730/85 Unfortunately, that doesn't exist in the standard library (yet). There are some packages, the most popular being ordered-set: https://pypi.org/project/ordered-set/ from ordered_set import OrderedSet
self.agents: defaultdict[type, OrderedSet] = defaultdict(OrderedSet) But we have to add that to the requirements. Personally, I'm happy with:
The dict with all values None I find simply too ugly from and UX point of view. Teaching Mesa is complicated enough as it's already. You wouldn't have to specify |
Benchmarked ordered-set, it's extremely slow (22.9 ms ± 1.09 ms, written in pure Python). There's another package called Ordered-set-37 and it's a lot faster, almost as fast as the other set/dict options (615 µs ± 2.34 µs). So, let's vote:
|
Thanks for the performance benchmarks! I think for large lists, list should even perform much worse. I voted for dict. It's a bit ugly, but it's also what happens for Ordered Set 37 and I think explicitly handling it in mesa instead of a hidden in a dependency is more explicit |
This is really great @EwoutH! Substantive discussion and forward progress. I am very much leaning to option 1 but instead of the
This will get rid of the |
I agree and tried to think of a middle ground, let me know what you think |
Ah I didn't catch the last part of your comment. Dict should only be used internally and we should return a list via a getter. Turning the keys into a list should be negligible in terms of performance |
Thanks for both your responses. My biggest concern is that with the dict you can't simply do model.agents[SomeType] anlymore. Now you have to add You can indeed solve that by hiding everything behind utility function, and while I'm a big fan of utility functions, in this case I think it just shouldn't be needed. It doesn't follow the spirit of PEP 20.
Having agents be a dict is just so much more explainable than having it be a nested dict, even if the values are just |
I registered my vote, but this is a great discussion and I am excited to see how it ends up. My two cents for what is worth are very much in line with @Corvince
However, I do appreciate @EwoutH concerns having my own students struggle with dictionaries. Regardless, great job @EwoutH however this ends up it is going to be an awesome move forward. |
Okay, it seems the majority is for a I propose to handle all agent selection utility in the proposed |
This commit introduces dics nested into a single defaultdict for agent storage, using agent instances as inner keys and None as inner values. The decision is based on: 1. Determinism: A dictionary ensures deterministic behavior for reproducible research outcomes. 2. Performance: Benchmarks showed that dictionaries offer a good balance between performance and functionality, especially in models with frequent agent updates. 3. Usability and Explicitness: While the use of None values is unconventional, this approach is practical and avoids the complexity of nested structures or external dependencies. It aligns with clear and explicit coding practices, making the framework more accessible and maintainable. The choice of a defaultdict with None values addresses the need for deterministic behavior, performance efficiency, and clarity in implementation. Additional utility functions like `select_agents()` will be added to enhance usability. Discussion for historical reference: projectmesa#1894 (comment)
Implemented in 2154cb1. Let's get this merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM--- Thanks @EwoutH
And I always appreciate you improving the docstrings as you go!
Thanks everyone for all the feedback and critical thinking. We've made a great step here! Next up: Selecting subsets of agents: #1905. |
Yes, this should absolutely be the right syntax and I think we should make this work. I think we can implement our own OrderedSet (light) class by simply creating a class that inherits from dict and setting getitem. And we would use this to set the agents internally as See also the second part of this answer I think this should work. But for now lets celebrate the merge! 🚀 |
This commit introduces a new feature to the Mesa Agent-Based Modeling framework: a select_agents method. This enhancement is inspired by NetLogo's n-of functions and a smart idea from @rht as discussed in projectmesa#1894. It aims to provide a flexible and efficient way to select and filter agents in a model, based on a wide range of criteria. #### Features The `select_agents` method offers several key functionalities: 1. **Selective Quantity (`n`):** Specify the number of agents to select. If `n` is omitted, all matching agents are chosen. 2. **Criteria-Based Sorting (`sort` and `direction`):** Sort agents based on one or more attributes. The sorting order can be set individually for each attribute. 3. **Functional Filtering (`filter`):** Use a lambda function or a callable to apply complex filter conditions. 4. **Type-Based Filtering (`agent_type`):** Filter agents by their class type, allowing for selection among specific subclasses. 5. **Flexible Size Handling (`up_to`):** When `True`, the method returns up to `n` agents, which is useful when the available agent count is less than `n`.
This commit introduces a new feature to the Mesa Agent-Based Modeling framework: a select_agents method. This enhancement is inspired by NetLogo's n-of functions and a smart idea from @rht as discussed in projectmesa#1894. It aims to provide a flexible and efficient way to select and filter agents in a model, based on a wide range of criteria. #### Features The `select_agents` method offers several key functionalities: 1. **Selective Quantity (`n`):** Specify the number of agents to select. If `n` is omitted, all matching agents are chosen. 2. **Criteria-Based Sorting (`sort` and `direction`):** Sort agents based on one or more attributes. The sorting order can be set individually for each attribute. 3. **Functional Filtering (`filter`):** Use a lambda function or a callable to apply complex filter conditions. 4. **Type-Based Filtering (`agent_type`):** Filter agents by their class type, allowing for selection among specific subclasses. 5. **Flexible Size Handling (`up_to`):** When `True`, the method returns up to `n` agents, which is useful when the available agent count is less than `n`.
On 2023-12-17 this PR was fully updated based on all feedback.
This PR adds a new
agents
dictionary to the MesaModel
class, enabling native support for handling multiple agent types within models. This way all modules can know which agents and agents type are in the model at any given time, by callingmodel.agents
.Motivation
NetLogo has had agent types, called
breeds
, built-in from the start. It works perfectly in all NetLogo components, because it's a first class citizen and all components need to be designed to consider different breeds.In Mesa, agent types are an afterthought at best. Almost nothing is currently designed with multiple agent types in mind. That has caused several issues and limitations over the years, including:
Especially in scheduling, space and datacollection, lack of a native, consistent construct for agent types severely limits the possibilities. With the discussion about patches and "empty" this discussion done again. You might want empty to refer to all agents or only a subset of types or single type. That's currently cumbersome to implement.
Basically, by always having dictionary available of which agents of which types are in the model, you can always trust on a consistent construct to iterate over agents and agent types.
Conceptualisation
@tpike3 has made a really nice conceptual overview how Mesa now looks with this construct:
Implementation Details
Model
class now uses adefaultdict(list)
to manage agents. This eliminates the need for explicit checks when adding new agent types.Agent
class to automatically register itself with the model upon creation, leveraging thedefaultdict
functionality.remove
method in theAgent
class, ensuring safe removal of agents.Advantages
In the agent initialization, the agent gets added to the Model's
agents
dictionary. Thereby, you can always, from everywhere:model.agent_types
property)model.agents_by_type[AgentType]
This will make proper multiple agent support in all Mesa components (the scheduler, grid, datacollector and visualisation) easier, because there is now a consistent model variable they can access.
Backwards compatibility
This PR itself breaks nothing, it just adds a dictionary to the model.
If you have a single agent, looping over the agents will just mean that only one agent type is iterated over (since there is always at least one agent type). So this work well with both model with a single and multiple agent types.
Future implications
While this PR itself won't break current models, if we modify the scheduler, datacollector or grids to use this construct, it might start breaking models. By using weak references (
weakref
), we might be able to limit backwards breaking changes.Review
Since this is a very important change to the core of Mesa, I recommend really critically reviewing this PR, both conceptually (what are the implications for users) and implementation wise (is this the most efficient, scalable and flexible solution).
Any feedback is appreciated!