Hi, thanks for the message - it's great to see people using the library and contributing back :). I can totally understand how large sharedStrings files could cause an issue here. Do we know whether it's the XML parsing that causes the huge memory consumption, or whether it's a result of storing the shared strings in memory? If it's the former, could we stream the shared strings file when parsing (as we do with the main XML content)? I'm not sure I see the need for the interface to change here. If it's the latter, I'm a bit confused about how your suggestion would resolve the issue.
Thanks for the nice library. I'm currently using it to parse large spreadsheets (500k rows and more) that contain data for mailing. At the moment, shared strings are fully loaded into memory during `XlsxFile` initialization (`io.ReadAll` + `xml.Unmarshal` + `getPopulatedValues`). When nearly all strings in a spreadsheet are unique, this technique leads to huge memory consumption. My idea is:
NewReader([]byte, ...Option)
By default the parser can use a simple implementation based on `[]string` or `map[int]string`.
If there are no concerns about the ideas above, I'll try to make a pull request.
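As a rough illustration of the default implementations mentioned above, here is one possible reading of the idea. Everything in it is an assumption: the `SharedStrings` interface, its `Put`/`Get` methods, and the `sliceStore`/`mapStore` types are hypothetical names and do not exist in the library.

```go
package main

import "fmt"

// SharedStrings is a hypothetical storage abstraction for shared strings.
// It is an assumption made for this sketch, not the library's API.
type SharedStrings interface {
	Put(index int, value string) error
	Get(index int) (string, bool)
}

// sliceStore is a []string-backed store. It expects values to arrive in
// index order, which matches how a sharedStrings file is laid out.
type sliceStore struct{ values []string }

func (s *sliceStore) Put(index int, value string) error {
	if index != len(s.values) {
		return fmt.Errorf("sliceStore: expected index %d, got %d", len(s.values), index)
	}
	s.values = append(s.values, value)
	return nil
}

func (s *sliceStore) Get(index int) (string, bool) {
	if index < 0 || index >= len(s.values) {
		return "", false
	}
	return s.values[index], true
}

// mapStore is a map[int]string-backed store. It accepts indexes in any
// order at the cost of per-entry map overhead.
type mapStore struct{ values map[int]string }

func newMapStore() *mapStore { return &mapStore{values: make(map[int]string)} }

func (m *mapStore) Put(index int, value string) error {
	m.values[index] = value
	return nil
}

func (m *mapStore) Get(index int) (string, bool) {
	v, ok := m.values[index]
	return v, ok
}

func main() {
	// Both stores satisfy the same interface, so a parser written against
	// SharedStrings would not need to know which one it was given.
	stores := []SharedStrings{&sliceStore{}, newMapStore()}
	for _, st := range stores {
		if err := st.Put(0, "alice@example.com"); err != nil {
			fmt.Println("put failed:", err)
			continue
		}
		v, ok := st.Get(0)
		fmt.Println(v, ok)
	}
}
```

Both in-memory variants still keep every string resident, so on their own they would not reduce memory use. Under this reading, any saving would have to come from a caller supplying a different implementation of the same interface.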