[stdlib] Add Dict._resize_down and _under_load_factor() #3133
base: nightly
stdlib/src/collections/dict.mojo
Outdated
@@ -790,7 +790,16 @@ struct Dict[K: KeyElement, V: CollectionElement](
 var entry_value = entry[].unsafe_take()
 entry[] = None
 self.size -= 1
-return entry_value.value^
+var tmp = entry_value.value^ #TODO: not to merge with current PR
+#It is necessary for def test_pop_default()
Maybe this comment should be part of pull request, not code itself?
Why not just return entry_value.value^ directly?
👍 Thanks @bethebunny, @gryznar.
It is related to this issue:
The test test_pop_default() would fail.
I think it has to do with values that need to be fully initialized to be ready for __del__, but we move the value out anyway, so maybe it has to do with something else?
614feb7 to ade41f9
✅ Changes for the review by @bethebunny:
✅ Changes for the review by @gryznar:
☑️ Still need to add these back into the new logic:
⏱️ Benchmarks: it is slower on pop, but we're good on insertion.
Hi, yes, it would be very interesting to be able to use the std Dict in a memory-constrained environment like a microcontroller. Maybe something like a
Sorry for leaving this one hanging, @rd4com. The PR LGTM, do you mind rebasing so we can land this?
I'd hold off on the DictConfig idea or separate dict type for now.
Signed-off-by: rd4com <[email protected]>
Signed-off-by: rd4com <[email protected]>
ade41f9 to b410ae7
Hello, all good 👍. Added more tests, and one with the test_utils for a small benchmark (1.7x):

    from time import now
    from random import *
    from collections import Dict

    alias iteration_size = 1024

    def main():
        var result: Int = 0
        var start = now()
        var stop = now()
        small = Dict[Int, Int]()
        start = now()
        # Repeatedly overwrite and read back the same 1024 keys.
        for x in range(100):
            for i in range(iteration_size):
                small[i] = i
            for i in range(iteration_size):
                result += small[i]
        stop = now()
        _ = small  # keep the dict alive until timing is done
        print(stop - start, result)

🔥 6.5x speedup:

    from time import now
    from random import *
    from sys.param_env import is_defined
    from collections import Dict

    def main():
        var dict_size = 1 << 12
        var result = 0
        var start = now()
        var stop = now()
        var x = Dict[String, String]()
        # Fill the dict before starting the timer.
        for i in range(dict_size):
            var val = str(i)
            x[val] = val
        start = now()
        # Time a lookup of every key.
        for i in x.keys():
            result += len(x[i[]])
        stop = now()
        print(result, stop - start)

2x for
Hello,
this is a PR to solve a problem mentioned by @bethebunny (in #3128).
It is a good little feature to have, but it probably should not be the default and should be parametrized, because halving the memory has a performance cost.
💡 For the dictionary that does not resize down:
I did an implementation of it, maybe you'll like it!
It really grows and ungrows dynamically: at each pop, if self.size < 1/3 of self._reserved(), the Dict halves its memory by switching to a smaller list capacity (so the RAM usage is dynamic and changes with the size).
When pop is used on all elements of a dict of 256 elements, the capacity goes through 512, 256, 128, 64, 32, 16, 8, 4.
That way, it doubles at 2/3 and halves at 1/3.
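The transition above can be reproduced with a small model. This is a Python sketch, not the Mojo implementation from the PR: `simulate_pops`, `initial_capacity` and `min_capacity` are hypothetical names, the starting capacity of 8 and the floor of 4 are assumptions (the floor is inferred from the sequence ending at 4), and only the 1/3 and 2/3 thresholds come from the description.

```python
def over_load_factor(size: int, reserved: int) -> bool:
    # More than 2/3 full: time to double.
    return 3 * size > 2 * reserved


def under_load_factor(size: int, reserved: int) -> bool:
    # Less than 1/3 full: time to halve.
    return 3 * size < reserved


def simulate_pops(n: int, initial_capacity: int = 8, min_capacity: int = 4) -> list[int]:
    """Insert n items, pop them all, and return the capacities seen while popping."""
    reserved = initial_capacity  # assumed starting capacity
    size = 0
    for _ in range(n):
        size += 1
        while over_load_factor(size, reserved):
            reserved *= 2
    seen = [reserved]
    while size > 0:
        size -= 1
        # Assumed floor: never shrink below min_capacity.
        if reserved > min_capacity and under_load_factor(size, reserved):
            reserved //= 2
            seen.append(reserved)
    return seen


print(simulate_pops(256))  # [512, 256, 128, 64, 32, 16, 8, 4]
```

Using integer comparisons (`3 * size < reserved`) instead of multiplying by 0.33 avoids floating-point rounding in the threshold check.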
The ratio is quite simple and it seems to do the job (it mirrors the one used for resizing up: resize down below 1/3 of self._reserved(), resize up above 2/3):
    0.33 * 256 == 84.48  <- we are below (we can resize down)
    0.66 * 128 == 84.48  <- we are below (no need to resize up again)
    0.66 * 256 == 168.96 <- we are above (we can resize up)
    0.33 * 512 == 168.96 <- we are above (no need to resize down again)
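The point of these four lines is that a resize in one direction never immediately triggers a resize in the other. A minimal Python check of that property follows, assuming exact thirds rather than the rounded 0.33 and 0.66, and using the hypothetical helper name `resize_is_stable`; it is a model of the ratio, not code from the PR.

```python
def over_load_factor(size: int, reserved: int) -> bool:
    # More than 2/3 full.
    return 3 * size > 2 * reserved


def under_load_factor(size: int, reserved: int) -> bool:
    # Less than 1/3 full.
    return 3 * size < reserved


def resize_is_stable(reserved: int) -> bool:
    """True if no size makes a halving or doubling at this capacity bounce straight back."""
    for size in range(reserved + 1):
        # Halving must not leave the smaller table over the load factor.
        if under_load_factor(size, reserved) and over_load_factor(size, reserved // 2):
            return False
        # Doubling must not leave the larger table under the load factor.
        if over_load_factor(size, reserved) and under_load_factor(size, reserved * 2):
            return False
    return True


# Check every power-of-two capacity from 8 to 4096.
print(all(resize_is_stable(1 << k) for k in range(3, 13)))  # True
```

With exact thirds, the first size that triggers a resize up at capacity 256 is 171 (3 * 171 = 513 > 512), and the last size that triggers a resize down at 256 is 85 (3 * 85 = 255 < 256).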
Example of a Dict cycling through grow and ungrow:
The result is correct (65280):
Maybe that kind of feature needs to be parametrized, so users can choose?
(Resizing down reduces performance, so it may not be a good default.)
- A DictConfig struct, as a default Dict parameter, is a possibility.
- A separate Dict type is also a possibility (BaloonDictionary-like)?
I suggest the second one would be better, but I struggle with naming here.
If reviewers like it, maybe we can bring it to life with more reviews?
(There are more methods where the logic needs to be added, and tests.)
Note that the hard work has been done by the people working on Dict; this PR just adds a halving part, which uses the same ratio as the doubling.