
[stdlib] Add Dict._resize_down and _under_load_factor() #3133

Open · wants to merge 2 commits into base: nightly from dict_feature_resize_down
Conversation

@rd4com (Contributor) commented Jun 28, 2024

Hello,

this is a PR to solve a problem mentioned by @bethebunny (in #3128).

It is a good little feature to have, but it probably should not be the default and should instead be parametrized, because of the performance cost of halving memory.

💡 The idea: a Dict that can resize down, not only up.

I did an implementation of it, maybe you'll like it!

It really grows and ungrows dynamically:

at each pop, if self.size < 1/3 of self._reserved(),
the Dict halves its memory by moving to a smaller list capacity.

(So the RAM usage is dynamic and changes with the size.)

When pop is used on all elements of a dict of 256 elements,
the reserved capacity transitions through 512, 256, 128, 64, 32, 16, 8, 4.

That way, it doubles at 2/3 and halves at 1/3.

 

The ratio is quite simple and it seems to do the job
(it mirrors the existing resize-up rule, just triggered below 1/3 of self._reserved() instead of above 2/3):

0.33 * 256 == 84.48  <- we are below (we can resize down)
0.66 * 128 == 84.48  <- we are below (no need to resize up again)

0.66 * 256 == 168.96 <- we are above (we can resize up)
0.33 * 512 == 168.96 <- we are above (no need to resize down again)
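
For concreteness, here is a minimal, hypothetical sketch of the two checks as free functions
(the real methods live on Dict and may be written differently; the exact integer form of the
predicates below is only an illustration of the 2/3 and 1/3 thresholds):

fn over_load_factor(size: Int, reserved: Int) -> Bool:
    # grow when more than 2/3 of the reserved slots are used
    return 3 * size > 2 * reserved

fn under_load_factor(size: Int, reserved: Int) -> Bool:
    # shrink when fewer than 1/3 of the reserved slots are used
    return 3 * size < reserved

def main():
    print(under_load_factor(84, 256))   # True:  84 entries in 256 slots -> can halve to 128
    print(over_load_factor(84, 128))    # False: after halving, no immediate re-grow
    print(over_load_factor(171, 256))   # True:  171 entries in 256 slots -> can double to 512
    print(under_load_factor(171, 512))  # False: after doubling, no immediate re-shrink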

Example of a Dict cycling through grow and ungrow:

from time import now
from collections import Dict
def main():
    var result: Int=0
    var dict_size = 256
    var start = now()
    var stop = now()
    
    result = 0

    small2 = Dict[Int,Int]()
    start = now()
    for y in range(2):
        for i in range(dict_size):
            # grow the dictionary up to its full size
            print("before",small2._reserved(), len(small2))
            small2[i]=i
            print("after",small2._reserved(), len(small2))

        for i in range(dict_size):
            # ungrow the dictionary
            print("before",small2._reserved(), len(small2))
            result += small2.pop(i)
            print("after",small2._reserved(), len(small2))
    stop = now()
    print(stop-start, result)

The result is correct (65280):

def main():
    x = 0
    for y in range(2):
        for i in range(256):
            x += i
    print(x)  # 65280

 

Maybe this kind of feature needs to be parametrized, so users can choose?
(Resizing down reduces performance, so it may not be a good default.)

  • A DictConfig struct as a default Dict parameter is a possibility
    • but we might end up in a parametrization challenge later
  • Another Dict type is also a possibility (something like a BalloonDictionary)?

I'd suggest the second option, but I'm struggling with naming here.
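
To make the first option concrete, here is a hypothetical sketch of gating the shrinking
logic behind a compile-time flag (the function and parameter names are made up for
illustration; nothing like this exists on the real Dict yet):

fn maybe_shrink[shrink_enabled: Bool](size: Int, reserved: Int) -> Int:
    # returns the reserved capacity to use after a pop
    @parameter
    if shrink_enabled:
        if 3 * size < reserved and reserved > 8:
            return reserved // 2  # under 1/3 full: halve, keeping a minimum capacity
    return reserved               # default build: capacity never shrinks

def main():
    print(maybe_shrink[True](10, 64))   # 32
    print(maybe_shrink[False](10, 64))  # 64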

 

If reviewers like it, maybe we can bring it to life with more reviews?

(There are more methods where the logic needs to be applied, and tests to add.)

Note that the hard work has been done by the people working on Dict;
this PR just adds a halving part, which uses the same ratio as the doubling.

@rd4com rd4com requested a review from a team as a code owner June 28, 2024 10:23
@@ -790,7 +790,16 @@ struct Dict[K: KeyElement, V: CollectionElement](
var entry_value = entry[].unsafe_take()
entry[] = None
self.size -= 1
return entry_value.value^
var tmp = entry_value.value^ #TODO: not to merge with current PR
#It is necessary for def test_pop_default()
@gryznar (Contributor) commented Jun 28, 2024
Maybe this comment should be part of the pull request, not the code itself?

Another contributor commented:
Why not just return entry_value.value^ directly?

@rd4com (Contributor, Author) replied:
👍 Thanks @bethebunny, @gryznar,

it is related to this issue: the test test_pop_default() would fail otherwise.

I think it has to do with values needing to be fully initialized to be ready for __del__,
but we move the value out anyway, so maybe it has to do with something else?

@rd4com force-pushed the dict_feature_resize_down branch 2 times, most recently from 614feb7 to ade41f9 (July 1, 2024 06:26)
@rd4com (Contributor, Author) commented Jul 1, 2024

✅ Changes addressing the review by @bethebunny:

  • Unify _resize_down, _resize_up and _compact in _maybe_resize (a rough sketch of the combined check appears below, just before the benchmarks)

✅ Changes addressing the review by @gryznar:

#TODO: not to merge with current PR
#It is necessary for def test_pop_default()
# See PR #2796 (unmerged, but ready, by helelex), 
# [stdlib] mark dict entry as destroyed in Dict.pop()
#
# _resize_down __del__'s the current self._entries to replace them with a smaller one
# is that why the test fails?

☑️ Still need to add these back into the new logic:

  • debug_assert(right < self._reserved(), "Invalid dict state")
  • debug_assert(entry[].__bool__(), "Logic error")
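
As a rough illustration of the unified check mentioned above (hypothetical code: the struct,
its fields, and the minimum capacity are made up, and the real _maybe_resize also has to
rehash and compact the entries rather than just track a number):

struct ToyTable:
    var size: Int
    var reserved: Int

    fn __init__(inout self):
        self.size = 0
        self.reserved = 8

    fn _maybe_resize(inout self):
        if 3 * self.size > 2 * self.reserved:
            self.reserved *= 2   # over 2/3 full: double
        elif 3 * self.size < self.reserved and self.reserved > 8:
            self.reserved //= 2  # under 1/3 full: halve, down to a minimum capacity

def main():
    var t = ToyTable()
    for i in range(256):
        t.size += 1
        t._maybe_resize()
    print(t.reserved)  # grew by repeated doubling
    for i in range(256):
        t.size -= 1
        t._maybe_resize()
    print(t.reserved)  # shrank back by repeated halving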

⏱️ Benchmarks:

It is slower on pop, but we're good on insertion.

insert (each line shows: result, elapsed time from now() in ns)

A (PR): 8380416 755676
B:      8380416 913786

from time import now
from collections import Dict
def main():
    var dict_size = 1<<10
    var x = Dict[Int,Int]()
    var result = 0
    var start = now()
    var stop = now()
    start = now()
    for n in range(16):
        x = Dict[Int,Int]()
        for i in range(dict_size):
            x[i] = i
            result += x[i]
    stop = now()
    
    print(result, stop-start)

pop (each line shows: result, elapsed time from now() in ns)

A (PR): 523776 79933
B:      523776 28685

A (PR): 549755289600 251504430
B:      549755289600 84026254

from time import now
from collections import Dict
def main():
    var dict_size = 1<<10
    var result = 0

    var start = now()
    var stop = now()
    
    var x = Dict[Int,Int]()
    for i in range(dict_size):
        x[i] = i
        
    start = now()
    for i in range(dict_size):
        result += x.pop(i)
    stop = now() 
    
    print(result, stop-start)

@martinvuyk (Contributor) commented, quoting the PR description:

"Maybe this kind of feature needs to be parametrized, so users can choose?"

Hi, yes, it would be very interesting to be able to use the std Dict in a memory-constrained environment like a microcontroller. Maybe something like a memory_constrained: Bool = False parameter. But it could also be that we end up developing a specialized version for such environments as an extension to the stdlib 🤷‍♂️ (see proposal #3232)

@JoeLoser (Collaborator) left a comment

Sorry for leaving this one hanging, @rd4com. The PR LGTM, do you mind rebasing so we can land this?

I'd hold off on the DictConfig idea or separate dict type for now.

@JoeLoser added the "waiting for response" label (Needs action/response from contributor before a PR can proceed) on Nov 22, 2024
@rd4com force-pushed the dict_feature_resize_down branch from ade41f9 to b410ae7 (November 22, 2024 19:53)
@rd4com (Contributor, Author) commented Nov 22, 2024

Hello, all good 👍,
we also get an extra speedup by revisiting this! 🥳

Added more tests, including one using the test_utils for __del__ counts.
Also fixed a small thing in _find_slot by using a Pointer (it was doing a __copyinit__ of the dict entry);
that results in a 2.27x speedup, depending on the type, for all methods (insert, pop, __getitem__, ...),
so it should now be even faster than before.
(There was no Pointer/Reference when Dict got created, so that's not a bug, just integrating the feature.)
(Shout out to clattner, who put us on the path to also focus on integration in the last community meeting 👍)
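
To illustrate the kind of per-lookup cost that avoiding the __copyinit__ removes, here is a
hypothetical example (the Noisy struct is made up and is not the real DictEntry; the actual
change is inside Dict._find_slot, which now reads the entry through a pointer instead of
copying it):

struct Noisy:
    var id: Int

    fn __init__(inout self, id: Int):
        self.id = id

    fn __copyinit__(inout self, existing: Self):
        print("copyinit")  # the extra work a copy performs on every lookup
        self.id = existing.id

fn read_borrowed(x: Noisy) -> Int:
    # fn arguments are borrowed by default: reading x.id does not copy the struct
    return x.id

def main():
    var a = Noisy(1)
    var b = a                # explicit copy: __copyinit__ runs and prints
    print(read_borrowed(a))  # borrowed access: no copy, nothing extra printed
    print(b.id)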

small benchmark (1.7x):

from time import now
from random import *
from collections import Dict

alias iteration_size = 1024
def main():
    var result: Int=0
    var start = now()
    var stop = now()

    
    small = Dict[Int,Int]()
    start = now()
    for x in range(100):
        for i in range(iteration_size):
            small[i]=i
        for i in range(iteration_size): 
            result += small[i]
    stop = now()
    _ = small
    print(stop-start, result)

🔥 6.5x speedup

from time import now
from random import *
from sys.param_env import is_defined
from collections import Dict

def main():
    var dict_size = 1<<12
    var result = 0

    var start = now()
    var stop = now()
    
    var x = Dict[String, String]()
     
    for i in range(dict_size):
        var val = str(i)
        x[val] = val

    start = now()
    for i in x.keys():
        result += len(x[i[]])
    stop = now() 
    
    print(result, stop-start)

2x for **kwargs

from time import now

fn sum_args(**kwargs: Int) raises ->Int:
    var res = 0
    for k in kwargs:
        res+=(kwargs[k[]])
    return res

def main():
    var iterations = 1<<10
    var result = 0

    var start = now()
    var stop = now()
    
        
    start = now()
    for i in range(iterations):
        result += sum_args(a=i, b=i+1, c=i+2)
    stop = now() 
    
    print(result, stop-start)

It is difficult to measure, but Pointer should be faster anyway;
please do some benchmarks if needed 👍

There is also another idea to speed up **kwargs by using Dict[UInt, V]
and just rebind[UInt](int("arg1".unsafe_ptr()))!
(but I'm not sure whether it is a good idea to have a pointer as K or not)
