[DISCUSS] Enable compression by default #601
Conversation
(force-pushed from 1f01e85 to 5de0c98)
It means that compression will be enabled for all events, including small ones, which can make it less efficient. Not sure how this helps #600
Compression is only triggered if the payload is bigger than 1500 bytes. Should this number be higher? Configurable? 1500 bytes is a number picked based on best practices, IIRC. No, this is not related to #600, but for the vast majority of log collection situations, compression is usually enabled.
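For illustration only, here is a minimal Go sketch of the kind of size guard described above; the constant and function names are hypothetical and not taken from this codebase:

```go
package main

import "fmt"

// minCompressionSize is the 1500-byte threshold discussed above (hypothetical
// name): payloads at or below this size are sent uncompressed, since very
// small payloads can actually grow after gzip and just burn CPU.
const minCompressionSize = 1500

// shouldCompress reports whether a payload is large enough to be worth gzipping.
func shouldCompress(payload []byte) bool {
	return len(payload) > minCompressionSize
}

func main() {
	fmt.Println(shouldCompress(make([]byte, 200)))  // false: below the threshold
	fmt.Println(shouldCompress(make([]byte, 4096))) // true: above the threshold
}
```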
@harshit-splunk PTAL. I don't know why it was disabled for the Splunk Platform endpoint.
(force-pushed from f5fd5cb to cc1118e)
(force-pushed from bff7ffa to 3e7e8a6)
Asking @omrozowicz-splunk and colleagues for review.
I consulted with the team and Harshit, and we think it might slightly increase CPU usage, but many users already enable it anyway, so we can make it the default. As long as we clearly document in the release notes that the behavior has changed, it's fine 👍
LGTM!
OK, thanks, I will rebase and merge ASAP.
I did some test runs under high load and saw that enabling compression underperforms in terms of throughput and resource utilization, keeping the other default configurations as-is. The first run on the chart is with compression enabled, the second with it disabled. The batch size is about 2k log records. Something is up with that. I don't think we should merge this until we prove the performance improvement.
OK, we can prove this out a bit more. We need to have this option if customers send logs larger than the max content length allowed. With other changes on recombination, we should be OK in general.
I send 2k-record batches in my tests; that must be much bigger than the max content length, which is 1.5 KB. The EPS decrease is about 30%, as you can see on the chart, and memory allocation is a bit higher.
In this situation, I'm referring specifically to events where one log message is longer than the max content length. If you send a batch of 2k messages of, say, 500 bytes each, we know to batch them together into a single POST HTTP request to stay under the limit of 2 MiB, the default max content length. If you have a log message of 3 or 4 MiB, then it is over the max content length and we need to compress it.
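To make the distinction concrete, here is a rough Go sketch of that behavior as described above; it is not the exporter's actual code, and all names (`buildBodies`, `gzipBytes`) are illustrative. Small events are packed into a shared request body while it stays under the limit, and an event that is by itself over the limit gets gzipped in the hope that the compressed form fits.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// maxContentLength mirrors the 2 MiB default limit mentioned above.
const maxContentLength = 2 << 20

// gzipBytes compresses a single payload and returns the compressed bytes.
func gzipBytes(p []byte) ([]byte, error) {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	if _, err := w.Write(p); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// buildBodies packs small events into shared request bodies under the limit
// and compresses any single event that is itself larger than the limit.
func buildBodies(events [][]byte) ([][]byte, error) {
	var bodies [][]byte
	var current bytes.Buffer
	flush := func() {
		if current.Len() > 0 {
			bodies = append(bodies, append([]byte(nil), current.Bytes()...))
			current.Reset()
		}
	}
	for _, ev := range events {
		if len(ev) > maxContentLength {
			flush()
			gz, err := gzipBytes(ev)
			if err != nil {
				return nil, err
			}
			bodies = append(bodies, gz)
			continue
		}
		if current.Len()+len(ev) > maxContentLength {
			flush()
		}
		current.Write(ev)
	}
	flush()
	return bodies, nil
}

func main() {
	events := [][]byte{
		bytes.Repeat([]byte("a"), 500),   // small event: goes into a shared body
		bytes.Repeat([]byte("b"), 3<<20), // 3 MiB event: over the limit, gets gzipped
	}
	bodies, err := buildBodies(events)
	if err != nil {
		panic(err)
	}
	for i, b := range bodies {
		fmt.Printf("body %d: %d bytes\n", i, len(b))
	}
}
```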
Maybe we could ditch this approach and decide to compress an individual event only if it's over the limit. I always assumed compressing was the right move to send more data to Splunk. We can talk about it sometime; I appreciate you taking the time to test the actual performance of this mechanism.
Not sure I understand why we take individual events into consideration when making a decision about compression if we are sending events in batches.
I revisited the splunkhec exporter's code for some references and observed that we are writing one marshaled event at a time. If compression is enabled, it will create and write a gzip header for each event. In general, this leads to more CPU cycles and a worse compression ratio. I tried a simple script to compare compressing the whole content of a file at once vs. line by line, and on my machine, compressing the whole content at once clearly came out ahead.
This makes it obvious that compressing one event at a time is not optimal for CPU usage or compression ratio. In my opinion, we should revisit the compression logic.
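The original comparison script isn't shown above, so here is an equivalent sketch in Go (using the standard `compress/gzip` package) of the kind of test described: it gzips the whole file in one stream versus opening a new gzip stream per line, which pays for a fresh header and dictionary on every event. The file path argument and the exact output format are assumptions.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"os"
)

// compressWhole gzips the entire content in a single stream (one gzip header).
func compressWhole(data []byte) int {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	w.Write(data)
	w.Close()
	return buf.Len()
}

// compressPerLine gzips each line in its own stream (one gzip header per
// line), mimicking per-event compression.
func compressPerLine(data []byte) int {
	total := 0
	sc := bufio.NewScanner(bytes.NewReader(data))
	for sc.Scan() {
		var buf bytes.Buffer
		w := gzip.NewWriter(&buf)
		w.Write(sc.Bytes())
		w.Close()
		total += buf.Len()
	}
	return total
}

func main() {
	data, err := os.ReadFile(os.Args[1]) // path to a sample log file
	if err != nil {
		panic(err)
	}
	fmt.Printf("original:            %d bytes\n", len(data))
	fmt.Printf("whole-file gzip:     %d bytes\n", compressWhole(data))
	fmt.Printf("per-line gzip total: %d bytes\n", compressPerLine(data))
}
```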
The compression logic has been revised quite a bit. I will rebase this PR. We need another round of performance checks before we merge it.
(force-pushed from 3e7e8a6 to bd7f8a3)
(force-pushed from fa68f6e to a3a681b)
(force-pushed from a3a681b to ce4b0fb)
Spreadsheet with tests for the new version:
See #600 for context.
By default, we don't enable gzip compression when sending HEC data. This change would enable compression by default.