Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSVIngesterSocial.java reads entire sample file into memory #87

Open
zekizeki opened this issue Sep 21, 2018 · 2 comments
Open

CSVIngesterSocial.java reads entire sample file into memory #87

zekizeki opened this issue Sep 21, 2018 · 2 comments

Comments

@zekizeki
Copy link
Contributor

In
https://github.com/ONSdigital/rm-sample-service/blob/master/src/main/java/uk/gov/ons/ctp/response/sample/ingest/CsvIngesterSocial.java

A SampleUnit object is held in memory for each row in the CSV read. Larger sample files will blow the heap limits set and cause a crash.

@83naff
Copy link
Contributor

83naff commented Sep 24, 2018

From what I have read in the code, for each line in the file it is getting the attributes then returning them within each iteration of the for loop. Although the local variables in parseLine are destroyed, we are copying them, which would increase memory. What we can do at least is to reduce the copious amount of copying incurred.

@mfcrocker
Copy link
Contributor

Hi Rob!

Though this is definitely something that will happen over a certain size, we've tested loading the biggest sample we've got (111k for ASHE) as part of the performance team's work and it loaded fine.

This is something that will want a refactor one day, but as it stands there's no business need to go >110k that I'm aware of.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants