Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename memory method #1124

Closed
vigna opened this issue Nov 2, 2023 · 6 comments
Closed

Rename memory method #1124

vigna opened this issue Nov 2, 2023 · 6 comments

Comments

@vigna
Copy link
Contributor

vigna commented Nov 2, 2023

I suggest that the Process::memory method is renamed resident_memory, or anything that makes it clear that is not the method returning all the memory used by the process.

It is just an ergonomics problem—having found memory with autocompletion I thought that would returned all the memory, virtual or not, swapped or not, owned by the process. The documentation doesn't explain what is returned, either. Finally, after reading the blog entry, I understood that memory returns just the RSS. So, essentially memory returns a subset of virtual_memory.

Alternatively, one can just explain in the documentation that memory returns only the resident set, but I think people will still get confused.

My use case is that was doing a half-terabyte allocation under Linux and I couldn't see it in memory.

I can open a PR with a resident_memory method that deprecates the previous memory method if you're interested.

@GuillaumeGomez
Copy link
Owner

I think the best solution here would to actually improve the documentation. Meaning that first we should tell that memory returns the resident memory and then explain what the resident memory is and also recommend to take a look at the virtual_method method where we would explain what the virtual memory is.

I don't think renaming memory to resident_memory is a good idea simply people most people using this crate don't know what it is, making it harder for them to use the crate.

@vigna
Copy link
Contributor Author

vigna commented Nov 2, 2023

Well, if they don't know what RSS is presently they're calling a method without any idea of what it returns. More precisely, they make up their minds using the method name (like: this is the memory allocated by the process), and get something completely different.

I mean, I know what a RSS is, but I would have never thought that virtual_memory is more general than memory. If you add adjectives in an API, you're being more specific. Here you add adjectives and you're less specific.

@GuillaumeGomez
Copy link
Owner

I think it's pretty rare for people to be interested into virtual memory. The memory naming is correct in this regard because it's the memory effectively used by your process.

@vigna
Copy link
Contributor Author

vigna commented Nov 2, 2023

I disagree.

If you perform a large allocation (my case: 500GB) it does not show in the RSS until you actually write to those pages. This can be extremely misleading because using memory you will think that the process is "effectively using" little memory. But there are different definition of "using", and "pages that have been physically mapped into memory" is not really what the average user has in mind.

If I do a calloc for 500GB I'm definitely using that memory. But it won't show up in Linux with memory.

I realize this is a mess—for example, on MacOS the virtual memory is ridiculously large (1/2TB) even for the smallest process, and in fact the RSS info is more relatable to the memory "effectively used". But this is another proof that "effectively used" is very hard to define cross-OS, so a technically more precise name would be preferrable.

But it's your crate (BTW: thanks, incredibly useful crate!), so you call the shots.

@GuillaumeGomez
Copy link
Owner

If I do a calloc for 500GB I'm definitely using that memory. But it won't show up in Linux with memory.

I didn't know that. I thought the memory was counted as RSS as soon as you wrote into the pages.

I realize this is a mess—for example, on MacOS the virtual memory is ridiculously large (1/2TB) even for the smallest process, and in fact the RSS info is more relatable to the memory "effectively used". But this is another proof that "effectively used" is very hard to define cross-OS, so a technically more precise name would be preferrable.

That's why I suggested to update documentation rather than updating the method name. If the number seems off, people will likely come back to the documentation trying to see if there is more information. I prefer to make it easy to find something, even if not exactly accurate. Documentation is there to provide more context (which it currently doesn't, which is very bad and I'm quite happy you opened an issue about this!).

So although your arguments make perfect sense, I hope my position and logic make sense to you as to why I'd rather keep the method name as is.

But it's your crate (BTW: thanks, incredibly useful crate!), so you call the shots.

Glad it's helpful to you! In any case, a PR to improve documentation would be very welcome!

@GuillaumeGomez
Copy link
Owner

Since #1125 got merged, I think we can close this issue. Thanks a lot for the doc improvement! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants