-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ZSTD_compressSequencesAndLiterals #4217
base: dev
Are you sure you want to change the base?
Conversation
ebc6485
to
5895903
Compare
Updated implementation, with slightly different prototype (does no longer request |
877796c
to
6b28993
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed everything except the actual implementation in zstd_compress.c
and zstd_compress_internal.h
.
lib/zstd.h
Outdated
* - Does not write the content size in frame header | ||
* - If any block is incompressible, will fail and return an error | ||
* - @litSize must be == sum of all @.litLength fields in @inSeqs. Any discrepancy will generate an error. | ||
* - the buffer @literals must be larger than @litSize by at least 8 bytes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we take another parameter litCapacity
here? This seems like an easy constraint to break, and it would be good to force users to pass in a value here and check it directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parameter litCapacity
added
@@ -232,7 +232,7 @@ static size_t roundTripTest(void* result, size_t resultCapacity, | |||
const void* src, size_t srcSize, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should test ZSTD_compressSequencesAndLiterals()
here as well. We could just test both ZSTD_compressSequences()
and ZSTD_compressSequencesAndLiterals()
lib/compress/zstd_compress.c
Outdated
compressedSeqsSize = ZSTD_entropyCompressSeqStore(&cctx->seqStore, | ||
&cctx->blockState.prevCBlock->entropy, &cctx->blockState.nextCBlock->entropy, | ||
&cctx->appliedParams, | ||
compressedSeqsSize = ZSTD_entropyCompressSeqStore_wExtLitBuffer( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is the same as ZSTD_entropyCompressSeqStore()
, can we call that instead? Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ZSTD_entropyCompressSeqStore()
is invoking ZSTD_entropyCompressSeqStore_wExtLitBuffer()
,
so they both converge to the same implementation.
The difference is that ZSTD_entropyCompressSeqStore()
forces the literals
buffer to be the one reserved into cctx
, while _wExtLitBuffer()
let the caller select where the buffer of literals
is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cyan4973 I understand that, but in this case the literals are in the seqStore, right?
There are so many parameters that it is hard to read, but it looks like this could just invoke ZSTD_entropyCompressSeqStore()
instead, because the literals are cctx->seqStore.litStart, (size_t)(cctx->seqStore.lit - cctx->seqStore.litStart)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, you meant this specific instance.
And yes, you are right, it could be replaced by ZSTD_entropyCompressSeqStore()
.
This modification was made early during development, when I was trying to have a common pipeline for the 2 variants. This is a left over from this period. Since I gave up this approach, we now have 2 separate pipelines, so it's possible to restore the call to ZSTD_entropyCompressSeqStore()
, especially if it helps readability.
lib/compress/zstd_compress.c
Outdated
/* | ||
* Note: Sequence validation functionality has been disabled (removed). | ||
* This is helpful to find back simplicity, leading to performance. | ||
* It may be re-inserted later. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we error if sequence validation was requested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope, at this point, it's just ignored.
It could be a good idea to add such check and error out clearly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check added
lib/compress/zstd_compress.c
Outdated
RETURN_ERROR_IF(dstCapacity<4, dstSize_tooSmall, "No room for empty frame block header"); | ||
MEM_writeLE32(op, cBlockHeader24); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have MEM_writeLE24
right? Can we use it here to support dstCapacity == 3
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
RETURN_ERROR_IF(dstCapacity < ZSTD_blockHeaderSize, dstSize_tooSmall, "not enough dstCapacity to write a new compressed block"); | ||
|
||
compressedSeqsSize = ZSTD_entropyCompressSeqStore_internal( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to call ZSTD_entropyCompressSeqStore_internal()
or can we call ZSTD_entropyCompressSeqStore_wExtLitBuffer()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The initial code was invoking ZSTD_entropyCompressSeqStore_wExtLitBuffer()
.
But because ZSTD_entropyCompressSeqStore_wExtLitBuffer()
is essentially ZSTD_entropyCompressSeqStore()
, just with a selectable litBuffer
,
it adds a few constraints, which favor the creation of uncompressed blocks.
In the normal case, it's fine (an uncompressed block is preferable to a compressed one with only 1 byte of saving),
but in the case of ZSTD_compressSequencesAndLiterals()
, it is catastrophic, because this variant is unable to generate an uncompressed block (since it doesn't see the original data). This would result in an error.
So we prefer skipping these additional checks, by invoking the next stage directly, ZSTD_entropyCompressSeqStore_internal()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good once we add the fuzzer
lib/compress/zstd_compress.c
Outdated
compressedSeqsSize = ZSTD_entropyCompressSeqStore(&cctx->seqStore, | ||
&cctx->blockState.prevCBlock->entropy, &cctx->blockState.nextCBlock->entropy, | ||
&cctx->appliedParams, | ||
compressedSeqsSize = ZSTD_entropyCompressSeqStore_wExtLitBuffer( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Cyan4973 I understand that, but in this case the literals are in the seqStore, right?
There are so many parameters that it is hard to read, but it looks like this could just invoke ZSTD_entropyCompressSeqStore()
instead, because the literals are cctx->seqStore.litStart, (size_t)(cctx->seqStore.lit - cctx->seqStore.litStart)
.
makes it possible to register a sequence without copying its literals.
they should not be in common/zstd_internal.h, since these definitions are not shared beyond lib/compress/.
SeqDef is a type name, so it should start with a Capital letter. It's an internal symbol, no impact on public API.
same idea, SeqStore_t is a type name, it should start with a Capital letter.
…ned literals buffer
can receive externally defined buffer of literals
since it's a type name. Note: in contrast with previous names, this one is on the Public API side. So there is a #define, so that existing programs using ZSTD_sequenceFormat_e still work.
no need to publish them outside of this unit.
does not need to track and update internal `litPtr`. note: does not measurably impact performance.
…s is enabled note: very minor saving, no performance impact
…ng code results in +4% compression speed, thanks to removal of branches in the hot loop.
and updated its documentation. Note: older name ZSTD_c_searchForExternalRepcodes remains supported via #define
check that ZSTD_compressAndLiterals() also controls that the `srcSize` field is exact.
this makes it possible to adjust windowSize to its tightest.
checks that srcSize is present in the frame header and bounds the window size.
ZSTD_compressSequencesAndLiterals() cannot produce an uncompressed block
since ZSTD_compressSequencesAndLiterals() doesn't support it.
to ZSTD_compressSequencesAndLiterals() to enforce the litCapacity >= litSize+8 condition.
so that an empty frame needs only 3 bytes of dstCapacity.
in the ZSTD_compressSequences() pipeline
9bcee5f
to
47cbfc8
Compare
This PR introduces a variant of
ZSTD_compressSequences()
,named
ZSTD_compressSequencesAndLiterals()
,which accepts a buffer of literals instead of the source buffer.
This is meant to match specific limitations present in some transcoder operations,
in which it's easy to present Sequences and Literals coming from an external generator,
but it would be more complex or costly to restore the corresponding source.
As an example, one such use case happens within the Linux kernel,
where the source comes as scattered pages which do not form a continuous buffer.
This variant is a bit faster on its own than regular
ZSTD_compressSequences()
, by almost ~+20%,as measured below using
fullbench
on ai7-9700k
:compressSequences
compressSequencesAndLiterals
The difference is pretty consistent, driven primarily by a reduction in overhead during the sequence transcoding operation.
In addition, the goal is to cumulate the speed benefits of this variant with potential other benefits on the caller side, by no longer requiring presentation of the original data.
That being said,
ZSTD_compressSequencesAndLiterals()
also features some limitations:For this reason, it's best to keep this method for compression of "small" data (typically <= 128 KB), where data is typically stored uncompressed when it doesn't compress well enough. This setup corresponds to typical kernel use cases like
zram
,zswap
orBtrFS
.