Memory usage of Bash isn't comparable to rest #80

Open
hansbogert opened this issue Apr 12, 2016 · 1 comment

Comments

@hansbogert
Contributor

I think the memory usage shown on the overview page for Bash is not comparable to the figures for the other languages.

The Bash script forks many separate processes by piping output to other binaries. I think the memory used by those separate processes is not accounted for.
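To illustrate the concern, here is a minimal sketch of reading the memory of the processes a shell pipeline spawns, rather than of the shell alone. How the benchmark actually measures memory is not shown in this thread, so the function name `largest_descendant_rss` and the example pipeline are hypothetical. Note that `RUSAGE_CHILDREN` reports the peak resident set size of the largest waited-for descendant, not the sum over the whole pipeline, and that the unit is kilobytes on Linux but bytes on macOS.

```python
import resource
import subprocess


def largest_descendant_rss(shell_pipeline: str) -> int:
    """Run a shell pipeline and return the peak RSS of the largest descendant.

    Hypothetical helper: the value covers child processes that have been
    waited for (the shell and the binaries it piped to), which a measurement
    of the shell process alone would miss.
    """
    subprocess.run(
        ["sh", "-c", shell_pipeline],
        check=True,
        stdout=subprocess.DEVNULL,
    )
    # ru_maxrss is the maximum over terminated, waited-for descendants.
    return resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss


if __name__ == "__main__":
    print(largest_descendant_rss("printf 'b\\na\\n' | sort | uniq"))
```

A fair comparison with the single-process implementations would need something closer to the sum across all stages running concurrently, which this sketch does not compute.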

@svanoort
Contributor

Perhaps, but the larger problem there is that the sort command is actually implemented using an external sort (partitioning the data and sorting in temp files) rather than a fully in-memory sort as with all other implementations. This enables it to run on very large data sets with minimal memory use (just an index), but makes it extremely slow in comparison and highly dependent on I/O performance.
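As a rough sketch of the external-sort idea described above (sort bounded chunks, spill each to a temp file, then merge the sorted files), assuming newline-terminated text lines. This is not the `sort` command's actual implementation; `external_sort` and `chunk_size` are illustrative names.

```python
import heapq
import os
import tempfile


def external_sort(lines, chunk_size=1000):
    """Yield newline-terminated lines in sorted order using bounded memory.

    At most `chunk_size` lines are held in memory at once; sorted chunks are
    written to temporary files and merged lazily with heapq.merge.
    """
    chunk_paths = []
    chunk = []

    def flush():
        # Sort the in-memory chunk and spill it to its own temp file.
        if not chunk:
            return
        chunk.sort()
        fd, path = tempfile.mkstemp(text=True)
        with os.fdopen(fd, "w") as handle:
            handle.writelines(chunk)
        chunk_paths.append(path)
        chunk.clear()

    for line in lines:
        chunk.append(line)
        if len(chunk) >= chunk_size:
            flush()
    flush()

    files = [open(path) for path in chunk_paths]
    try:
        # Each file is already sorted, so a k-way merge reads one line per file.
        yield from heapq.merge(*files)
    finally:
        for handle in files:
            handle.close()
        for path in chunk_paths:
            os.remove(path)


if __name__ == "__main__":
    data = ["c\n", "a\n", "b\n", "d\n"]
    print(list(external_sort(data, chunk_size=2)))
```

The trade-off matches the point made here: memory stays small regardless of input size, but every line is written to and read back from disk, so runtime depends heavily on I/O.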

All other operations in that pipeline can operate on a line at a time with minimal memory use.

I have a work-in-progress PR (#75) that partially addresses this by passing sort the -S 40% option, which sets the buffer used for sorting to a percentage of physical memory.
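For reference, a minimal sketch of invoking `sort` with a buffer size, assuming GNU coreutils `sort` is on the PATH. The wrapper `sort_with_buffer` is hypothetical and is not taken from the PR; `LC_ALL=C` is set here only to make the ordering byte-wise and predictable.

```python
import os
import subprocess


def sort_with_buffer(lines, buffer_size="40%"):
    """Sort newline-terminated lines with `sort -S <buffer_size>`.

    Assumes GNU sort, where -S (--buffer-size) accepts a percentage of
    physical memory such as "40%".
    """
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(
        ["sort", "-S", buffer_size],
        input="".join(lines),
        text=True,
        capture_output=True,
        check=True,
        env=env,
    )
    return result.stdout.splitlines(keepends=True)


if __name__ == "__main__":
    print(sort_with_buffer(["b\n", "a\n", "c\n"]))
```

With a larger buffer, more of the input can be sorted in memory before `sort` falls back to temp files.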
