
[DISCUSS] Enable compression by default #601

Closed · atoulme wants to merge 1 commit from the enable_compression_by_default branch

Conversation

@atoulme (Contributor) commented Dec 9, 2022

See #600 for context.

By default, we don't enable gzip compression when sending HEC data. This change enables compression by default.

@atoulme requested review from a team as code owners on December 9, 2022 23:31
@atoulme force-pushed the enable_compression_by_default branch from 1f01e85 to 5de0c98 on December 9, 2022 23:35
@dmitryax (Contributor)

It means that compression will be enabled for all events, including small ones, which can make it less efficient. Not sure how this helps #600.

@atoulme (Contributor, Author) commented Dec 13, 2022

Compression is only triggered if the payload is bigger than 1500 bytes. Should this number be higher? Configurable? 1500 bytes was picked based on best practices, iirc.

No, this is not related to #600, but in the vast majority of log collection situations, compression is usually enabled.
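
For illustration, a minimal Go sketch of the threshold-gated compression described above (this is hypothetical, not the exporter's actual code; the constant name is made up):

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// minCompressionLen mirrors the 1500-byte threshold mentioned above.
const minCompressionLen = 1500

// maybeCompress gzips the payload only when it exceeds the threshold;
// small payloads go out uncompressed, since gzip overhead can outweigh
// the savings on short bodies.
func maybeCompress(payload []byte) ([]byte, bool, error) {
	if len(payload) <= minCompressionLen {
		return payload, false, nil
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(payload); err != nil {
		return nil, false, err
	}
	if err := zw.Close(); err != nil {
		return nil, false, err
	}
	return buf.Bytes(), true, nil
}

func main() {
	payload := bytes.Repeat([]byte(`{"event":"hello"}`), 200) // ~3.4 kB, above the threshold
	body, compressed, err := maybeCompress(payload)
	if err != nil {
		panic(err)
	}
	fmt.Printf("sending %d bytes, compressed=%v\n", len(body), compressed)
}
```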

@dmitryax (Contributor)

@harshit-splunk PTAL. I don't know why it was disabled for the Splunk Platform endpoint.

@atoulme force-pushed the enable_compression_by_default branch 3 times, most recently from f5fd5cb to cc1118e on January 6, 2023 19:39
@atoulme force-pushed the enable_compression_by_default branch 2 times, most recently from bff7ffa to 3e7e8a6 on April 14, 2023 01:14
@atoulme (Contributor, Author) commented Apr 14, 2023

Asking @omrozowicz-splunk and colleagues for review.

@omrozowicz-splunk (Contributor)

> Asking @omrozowicz-splunk and colleagues for review.

I consulted with the team and Harshit, and we think it might slightly increase CPU usage, but many users already enable it anyway, so we can make it the default. As long as we clearly document in the release notes that the behavior has changed, it's fine 👍

@VihasMakwana (Contributor)

LGTM!

@atoulme (Contributor, Author) commented Apr 19, 2023

OK, thanks. I will rebase and merge ASAP.

@dmitryax (Contributor) commented May 9, 2023

I did some test runs under high load and saw that enabling compression underperforms in terms of throughput and resource utilization, keeping the other default configurations as is.

[Chart: throughput and resource utilization comparison, compression enabled vs. disabled]

The first run on the chart has compression enabled; the second has it disabled. The batch size is about 2k log records.

Something is up with that. I don't think we should merge this until we prove the performance improvement.

@atoulme (Contributor, Author) commented May 9, 2023

OK, we can prove this out a bit more. We need to have this option if customers send logs larger than the max content length allowed. With other changes on recombination, we should be ok in general.

@dmitryax (Contributor) commented May 9, 2023

> We need to have this option if customers send logs larger than the max content length allowed.

I send batches of about 2k records in my tests, which must be much bigger than the compression threshold of 1.5kB. The EPS decrease is about 30%, as you can see on the chart, and memory allocation is a bit higher.

@atoulme (Contributor, Author) commented May 9, 2023

In this situation, I'm referring specifically to events where one log message is longer than the max content length. If you send a batch of 2k messages of, say, 500 bytes each, we know to batch them together into one POST HTTP request to stay under the limit of 2MiB, the default max content length.

If you have a log message of 3 or 4 MiB, then it is over the max content length and we need to compress it.
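
To make the distinction concrete, here is a hedged Go sketch of that batching decision (names and logic are hypothetical, not the exporter's real implementation):

```go
package main

import "fmt"

// maxContentLength is the 2 MiB default discussed above.
const maxContentLength = 2 * 1024 * 1024

// planBatches groups events into POST bodies that stay under the limit and
// sets aside events that can never fit uncompressed on their own; those
// must be compressed (or rejected) before sending.
func planBatches(events [][]byte) (batches [][]byte, oversized [][]byte) {
	var current []byte
	for _, ev := range events {
		if len(ev) > maxContentLength {
			oversized = append(oversized, ev)
			continue
		}
		if len(current)+len(ev) > maxContentLength {
			batches = append(batches, current)
			current = nil
		}
		current = append(current, ev...)
	}
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches, oversized
}

func main() {
	small := make([]byte, 500)
	huge := make([]byte, 3*1024*1024) // over the limit on its own
	batches, oversized := planBatches([][]byte{small, small, huge})
	fmt.Printf("%d batch(es), %d oversized event(s)\n", len(batches), len(oversized))
}
```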

@atoulme (Contributor, Author) commented May 9, 2023

Maybe we could ditch this approach and decide to compress an individual event only if it's over the limit. I always assumed compressing was the right move to send more data to Splunk. We can talk about it sometime; I appreciate you testing the actual performance of this mechanism.

@dmitryax (Contributor) commented May 9, 2023

Not sure I understand why we take individual events into consideration to make a decision about compression if we are sending events in batches.

@hvaghani221 (Contributor)

I revisited the splunkhec exporter's code for reference and observed that we write one marshaled event at a time. With compression enabled, it creates and writes a gzip header for each event. In general, this leads to increased CPU cycles and a decreased compression ratio.

I tried a simple script to compare compressing the whole content of a file at once vs. line by line. On my machine, the output is:

```
StreamAll time:  92.896702ms compression ratio: 3.4027578104256406
StreamLine time:  528.035309ms compression ratio: 1.7993467237980227
```

This makes it obvious that compressing one event at a time is not optimal for CPU usage and compression ratio.
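
The original script wasn't shared, but a minimal Go reconstruction of the comparison could look like the following (the input file name and the function names are assumptions):

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"os"
	"time"
)

// compressAll gzips the whole payload in one stream.
func compressAll(data []byte) int {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(data)
	zw.Close()
	return buf.Len()
}

// compressPerLine opens a fresh gzip stream (and header) for every line,
// mimicking the one-event-at-a-time behavior described above.
func compressPerLine(data []byte) int {
	total := 0
	sc := bufio.NewScanner(bytes.NewReader(data))
	for sc.Scan() {
		var buf bytes.Buffer
		zw := gzip.NewWriter(&buf)
		zw.Write(sc.Bytes())
		zw.Close()
		total += buf.Len()
	}
	return total
}

func main() {
	data, err := os.ReadFile("sample.log") // hypothetical input file
	if err != nil {
		panic(err)
	}

	start := time.Now()
	size := compressAll(data)
	fmt.Println("StreamAll time: ", time.Since(start),
		"compression ratio:", float64(len(data))/float64(size))

	start = time.Now()
	size = compressPerLine(data)
	fmt.Println("StreamLine time: ", time.Since(start),
		"compression ratio:", float64(len(data))/float64(size))
}
```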

In my opinion, we should revisit the cancellableGzipWriter and delay compressing as much as possible by buffering multiple events before compressing them.
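
A hedged sketch of that buffering idea (this is not the actual cancellableGzipWriter; the type and its methods are hypothetical):

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// batchingCompressor accumulates marshaled events uncompressed and gzips
// them all at once on Flush, producing a single gzip stream/header per
// request instead of one per event.
type batchingCompressor struct {
	raw bytes.Buffer
}

func (b *batchingCompressor) WriteEvent(ev []byte) {
	b.raw.Write(ev)
}

// Flush compresses everything buffered so far in one gzip stream.
func (b *batchingCompressor) Flush() ([]byte, error) {
	var out bytes.Buffer
	zw := gzip.NewWriter(&out)
	if _, err := zw.Write(b.raw.Bytes()); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	b.raw.Reset()
	return out.Bytes(), nil
}

func main() {
	var bc batchingCompressor
	for i := 0; i < 1000; i++ {
		bc.WriteEvent([]byte(`{"event":"hello"}`))
	}
	body, err := bc.Flush()
	if err != nil {
		panic(err)
	}
	fmt.Printf("one gzip stream, %d bytes\n", len(body))
}
```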

@atoulme (Contributor, Author) commented Aug 9, 2023

Compression logic has been revised quite a bit. I will rebase this PR. We need another round of performance checks before we merge it.

@atoulme force-pushed the enable_compression_by_default branch from 3e7e8a6 to bd7f8a3 on August 9, 2023 23:50
@atoulme force-pushed the enable_compression_by_default branch 4 times, most recently from fa68f6e to a3a681b on October 18, 2023 00:24
@atoulme force-pushed the enable_compression_by_default branch from a3a681b to ce4b0fb on January 18, 2024 21:03
@atoulme closed this on Sep 26, 2024
The github-actions bot locked and limited the conversation to collaborators on Sep 26, 2024