Reduce memory consumption for data slicing into objects #2719

Merged
merged 2 commits into from
Jan 30, 2024


cthulhu-rider
Contributor

@cthulhu-rider cthulhu-rider force-pushed the experiment-slicer-optimizaitons branch from 074a921 to fae34e1 Compare January 15, 2024 15:01

codecov bot commented Jan 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (8027ca3) 28.87% compared to head (153d56a) 28.87%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2719   +/-   ##
=======================================
  Coverage   28.87%   28.87%           
=======================================
  Files         415      415           
  Lines       32450    32450           
=======================================
  Hits         9370     9370           
  Misses      22237    22237           
  Partials      843      843           


@cthulhu-rider cthulhu-rider marked this pull request as ready for review January 15, 2024 15:43
@carpawell
Member

There are conflicts, and I do not expect this to be merged before nspcc-dev/neofs-sdk-go#539 (this PR is not a draft).

@cthulhu-rider cthulhu-rider marked this pull request as draft January 23, 2024 16:58
@cthulhu-rider cthulhu-rider force-pushed the experiment-slicer-optimizaitons branch from fae34e1 to 2e40379 Compare January 23, 2024 17:04
@cthulhu-rider
Contributor Author

After the SDK PR update, the test cases still pass.

The NeoFS API recommends specifying the known size of a streamed object
so that the server can prepare to handle it. NeoFS CLI creates objects
with payloads read from files, which have a fixed size, so nothing
prevents the size from being given.

Signed-off-by: Leonard Lyubich <[email protected]>
Previously, storage nodes slicing a user data stream into NeoFS objects
ignored the payload size even when it was specified in the request. Since
every NeoFS cluster has a system parameter that limits the maximum size
of a physically stored object, the node allocated a buffer equal to that
limit. For "large" objects this was acceptable and even optimal. However,
the smaller the amount of sliced data, the more memory was allocated
needlessly. According to the protocol, the size of the stored data may be
unknown, so we cannot always rely on knowing the size of the streamed
data. But when this parameter is specified in the request, nothing
prevents the node from using its value to allocate a buffer no larger
than needed. In particular, this significantly reduces memory
consumption during mass streaming of "small" data.
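
The buffer sizing described above can be sketched roughly as follows. This is not the node's actual code: `payloadBufferSize`, its parameters and the 64 MiB limit are assumptions made for illustration.

```go
package main

import "fmt"

// payloadBufferSize returns how many bytes to allocate for slicing one
// data stream. maxObjectSize stands for the cluster-wide limit on a
// physically stored object, declaredSize is the payload size from the
// request, and known reports whether the client specified it.
// All names are hypothetical.
func payloadBufferSize(maxObjectSize, declaredSize uint64, known bool) uint64 {
	// A known size below the limit lets us allocate exactly that much.
	if known && declaredSize < maxObjectSize {
		return declaredSize
	}
	// Unknown or larger-than-limit payloads still need the full buffer,
	// because each stored object can hold at most maxObjectSize bytes.
	return maxObjectSize
}

func main() {
	const maxObjectSize = 64 << 20 // assumed limit, for illustration only

	fmt.Println(payloadBufferSize(maxObjectSize, 0, false))    // size unknown: full limit
	fmt.Println(payloadBufferSize(maxObjectSize, 4<<10, true)) // small payload: small buffer
	fmt.Println(payloadBufferSize(maxObjectSize, 1<<30, true)) // huge payload: capped at limit
}
```

With these assumed numbers, a 4 KiB payload needs a 4 KiB buffer instead of a 64 MiB one, which is where the saving for "small" data comes from.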

The updated version of the NeoFS SDK library supports processing an
amount of data known in advance out of the box. All the application
needs to do is set the `SetPayloadSize` slicing option.

It's hard to imagine a simpler approach, but there are some nuances.
According to the NeoFS API protocol, a zero value in the header field
explicitly denotes an object without a payload, while an unspecified
size is transmitted as all bits set in the field. However, NeoFS nodes
have long treated any value, including zero, as a dynamic size. Since
zero may also mean a "forgotten" field, we cannot suddenly start
treating it as an object without a payload. Therefore, to keep the
update safe, only non-zero values other than the maximum possible one
are processed as the size set by the user.

Overall, the applied optimization does not cover all cases, but for
some of them it significantly improves performance and efficiency.

Refs #2686.

Signed-off-by: Leonard Lyubich <[email protected]>
@cthulhu-rider cthulhu-rider force-pushed the experiment-slicer-optimizaitons branch from 71b5df0 to 153d56a Compare January 30, 2024 07:48
@cthulhu-rider cthulhu-rider marked this pull request as ready for review January 30, 2024 07:49
@cthulhu-rider cthulhu-rider changed the title Optimize fixed-length data slicing Reduce memory consumption for data slicing into objects Jan 30, 2024
@roman-khimov roman-khimov merged commit c27830f into master Jan 30, 2024
10 of 11 checks passed
@roman-khimov roman-khimov deleted the experiment-slicer-optimizaitons branch January 30, 2024 13:04
3 participants