ZSTD_compressSequencesAndLiterals #4217

Cyan4973 · 2024-12-17T05:01:11Z

This PR introduces a variant of ZSTD_compressSequences(),
named ZSTD_compressSequencesAndLiterals(),
which accepts a buffer of literals instead of the source buffer.

This is meant to match specific limitations present in some transcoder operations,
in which it's easy to present Sequences and Literals coming from an external generator,
but it would be more complex or costly to restore the corresponding source.
As an example, one such use case happens within the Linux kernel,
where the source comes as scattered pages which do not form a continuous buffer.

This variant is also faster on its own than regular ZSTD_compressSequences(), by roughly 20%,
as measured below using fullbench on an i7-9700k:

| source | parser | `compressSequences` | `compressSequencesAndLiterals` | difference |
| --- | --- | --- | --- | --- |
| calgary.tar | level 19 | 805 MB/s | 960 MB/s | +19% |
| enwik6 | level 4 | 761 MB/s | 917 MB/s | +20% |

The difference is pretty consistent, driven primarily by a reduction in overhead during the sequence transcoding operation.

In addition, the goal is to combine the speed benefits of this variant with other potential benefits on the caller side, which no longer needs to present the original data.

That being said, ZSTD_compressSequencesAndLiterals() also features some limitations:

- Only supports Explicit Block Delimiter mode. While this could be extended later on, it's unclear if it would be useful.
- Does not support Frame Checksum, since it doesn't see the original data. Requesting this checksum will result in an error.
- Does not write the content size within the frame header.
- Cannot generate uncompressed blocks, since it doesn't have the original data. While this could be partially mitigated in some favorable cases, at the cost of complexity, there are many more complex scenarios for which it cannot be completely eliminated. Thus, if the compression process tries to generate an uncompressed block, it results in a compression error.

For this reason, it's best to keep this method for compression of "small" data (typically <= 128 KB), where data is typically stored uncompressed when it doesn't compress well enough. This setup corresponds to typical kernel use cases like zram, zswap or BtrFS.
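To illustrate the caller-side pattern described above (store the data uncompressed when compression fails or does not help), here is a minimal sketch. Everything in it is hypothetical: `CompressFn`, `StoredPage`, `store_page()` and the two stub compressors are illustration-only names, not zstd symbols, and the contiguous `page` buffer is a simplification of however the caller actually keeps its raw data.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical compressor callback: returns the compressed size,
 * or 0 on any failure (including "block is incompressible"). */
typedef size_t (*CompressFn)(void* dst, size_t dstCapacity,
                             const void* src, size_t srcSize);

typedef struct {
    int isCompressed;   /* 1 if the slot holds compressed data */
    size_t storedSize;  /* bytes written into the slot; 0 means failure */
} StoredPage;

/* Try to compress; on failure or no gain, fall back to a raw copy. */
static StoredPage store_page(void* slot, size_t slotCapacity,
                             const void* page, size_t pageSize,
                             CompressFn compress)
{
    StoredPage r = { 0, 0 };
    size_t cSize;
    assert(slot != NULL && page != NULL && compress != NULL);
    cSize = compress(slot, slotCapacity, page, pageSize);
    if (cSize != 0 && cSize < pageSize) {
        r.isCompressed = 1;
        r.storedSize = cSize;
        return r;
    }
    if (slotCapacity < pageSize) return r;  /* nothing stored */
    memcpy(slot, page, pageSize);           /* uncompressed fallback */
    r.storedSize = pageSize;
    return r;
}

/* Stub standing in for a compressor that hits an incompressible block. */
static size_t stub_incompressible(void* dst, size_t dstCapacity,
                                  const void* src, size_t srcSize)
{
    (void)dst; (void)dstCapacity; (void)src; (void)srcSize;
    return 0;
}

/* Stub standing in for a compressor that halves the input size. */
static size_t stub_halves(void* dst, size_t dstCapacity,
                          const void* src, size_t srcSize)
{
    size_t const out = srcSize / 2;
    (void)src;
    if (dstCapacity < out) return 0;
    memset(dst, 0, out);
    return out;
}
```

With this shape, an error coming from the compressor is just one more reason to take the uncompressed path, which is why the "cannot generate uncompressed blocks" limitation is tolerable for small inputs.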

Cyan4973 · 2024-12-18T05:49:22Z

Updated implementation, with a slightly different prototype (it no longer requires srcSize) and improved compression speed.

terrelln

Reviewed everything except the actual implementation in zstd_compress.c and zstd_compress_internal.h.

terrelln · 2024-12-18T20:39:44Z

lib/zstd.h

+ * - Does not write the content size in frame header
+ * - If any block is incompressible, will fail and return an error
+ * - @litSize must be == sum of all @.litLength fields in @inSeqs. Any discrepancy will generate an error.
+ * - the buffer @literals must be larger than @litSize by at least 8 bytes.


Can we take another parameter litCapacity here? This seems like an easy constraint to break, and it would be good to force users to pass in a value here and check it directly.

parameter litCapacity added
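For reference, the two documented constraints on the literals arguments can be checked on the caller side before invoking the function. The sketch below is standalone: `Seq` is a local stand-in for the sequence descriptor and `literals_args_ok()` is a hypothetical helper, neither is part of zstd.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the sequence descriptor (assumption: only
 * litLength matters for this check). */
typedef struct {
    unsigned offset;
    unsigned litLength;
    unsigned matchLength;
    unsigned rep;
} Seq;

/* From the documented rule: buffer must exceed litSize by >= 8 bytes. */
#define LIT_BUFFER_MARGIN 8

/* Returns 1 when both documented constraints hold, 0 otherwise:
 *  - litSize == sum of all litLength fields
 *  - litCapacity >= litSize + LIT_BUFFER_MARGIN */
static int literals_args_ok(const Seq* seqs, size_t nbSeqs,
                            size_t litSize, size_t litCapacity)
{
    size_t sum = 0;
    size_t n;
    assert(seqs != NULL || nbSeqs == 0);
    for (n = 0; n < nbSeqs; n++) sum += seqs[n].litLength;
    if (sum != litSize) return 0;
    if (litCapacity < litSize) return 0;  /* avoids underflow below */
    return (litCapacity - litSize) >= LIT_BUFFER_MARGIN;
}
```

Passing `litCapacity` explicitly lets the library perform the second check itself instead of trusting the caller's buffer.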

terrelln · 2024-12-18T20:43:03Z

tests/fuzz/sequence_compression_api.c

@@ -232,7 +232,7 @@ static size_t roundTripTest(void* result, size_t resultCapacity,
                            const void* src, size_t srcSize,


We should test ZSTD_compressSequencesAndLiterals() here as well. We could just test both ZSTD_compressSequences() and ZSTD_compressSequencesAndLiterals()

terrelln · 2024-12-18T21:04:51Z

lib/compress/zstd_compress.c

-        compressedSeqsSize = ZSTD_entropyCompressSeqStore(&cctx->seqStore,
-                                &cctx->blockState.prevCBlock->entropy, &cctx->blockState.nextCBlock->entropy,
-                                &cctx->appliedParams,
+        compressedSeqsSize = ZSTD_entropyCompressSeqStore_wExtLitBuffer(


It looks like this is the same as ZSTD_entropyCompressSeqStore(), can we call that instead? Or am I missing something?

ZSTD_entropyCompressSeqStore() is invoking ZSTD_entropyCompressSeqStore_wExtLitBuffer(),
so they both converge to the same implementation.

The difference is that ZSTD_entropyCompressSeqStore() forces the literals buffer to be the one reserved in cctx, while _wExtLitBuffer() lets the caller select where the buffer of literals is.

@Cyan4973 I understand that, but in this case the literals are in the seqStore, right?

There are so many parameters that it is hard to read, but it looks like this could just invoke ZSTD_entropyCompressSeqStore() instead, because the literals are cctx->seqStore.litStart, (size_t)(cctx->seqStore.lit - cctx->seqStore.litStart).

Ah, you meant this specific instance.
And yes, you are right, it could be replaced by ZSTD_entropyCompressSeqStore().

This modification was made early during development, when I was trying to have a common pipeline for the 2 variants. It is a leftover from that period. Since I gave up on this approach, we now have 2 separate pipelines, so it's possible to restore the call to ZSTD_entropyCompressSeqStore(), especially if it helps readability.

terrelln · 2024-12-18T21:06:05Z

lib/compress/zstd_compress.c

+/*
+ * Note: Sequence validation functionality has been disabled (removed).
+ * This is helpful to find back simplicity, leading to performance.
+ * It may be re-inserted later.
+ */


Do we error if sequence validation was requested?

Nope, at this point it's just ignored.
It would be a good idea to add such a check and error out clearly.

check added

terrelln · 2024-12-18T21:11:58Z

lib/compress/zstd_compress.c

+        RETURN_ERROR_IF(dstCapacity<4, dstSize_tooSmall, "No room for empty frame block header");
+        MEM_writeLE32(op, cBlockHeader24);


We have MEM_writeLE24 right? Can we use it here to support dstCapacity == 3?
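To make the 3-byte point concrete: a Zstandard block header is 24 bits, little-endian, with the last-block flag in bit 0, the block type in bits 1-2 and the block size in bits 3-23. The sketch below is standalone; `write_le24()`, `make_block_header()` and `write_empty_last_block()` are hypothetical helpers written for illustration, not the library's MEM_writeLE24().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the low 24 bits of `value` in little-endian order.
 * Touches exactly 3 bytes of dst. */
static void write_le24(uint8_t* dst, uint32_t value)
{
    assert(value < (1u << 24));
    dst[0] = (uint8_t)(value & 0xFF);
    dst[1] = (uint8_t)((value >> 8) & 0xFF);
    dst[2] = (uint8_t)((value >> 16) & 0xFF);
}

/* bit 0: last-block flag, bits 1-2: block type, bits 3-23: block size */
static uint32_t make_block_header(unsigned lastBlock, unsigned blockType,
                                  uint32_t blockSize)
{
    return (uint32_t)(lastBlock & 1u)
         | ((uint32_t)(blockType & 3u) << 1)
         | (blockSize << 3);
}

/* Writes an empty, last, raw block header.
 * Returns the number of bytes written (3), or 0 if dst is too small. */
static size_t write_empty_last_block(uint8_t* dst, size_t dstCapacity)
{
    if (dstCapacity < 3) return 0;
    write_le24(dst, make_block_header(1, 0 /* raw block */, 0));
    return 3;
}
```

A 32-bit write needs a fourth byte of capacity that the header never uses, which is the motivation for the 24-bit write.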

terrelln · 2024-12-18T21:13:29Z

lib/compress/zstd_compress.c

+
+        RETURN_ERROR_IF(dstCapacity < ZSTD_blockHeaderSize, dstSize_tooSmall, "not enough dstCapacity to write a new compressed block");
+
+        compressedSeqsSize = ZSTD_entropyCompressSeqStore_internal(


Do we need to call ZSTD_entropyCompressSeqStore_internal() or can we call ZSTD_entropyCompressSeqStore_wExtLitBuffer()?

The initial code was invoking ZSTD_entropyCompressSeqStore_wExtLitBuffer().
But because ZSTD_entropyCompressSeqStore_wExtLitBuffer() is essentially ZSTD_entropyCompressSeqStore(), just with a selectable litBuffer,
it adds a few constraints, which favor the creation of uncompressed blocks.
In the normal case, it's fine (an uncompressed block is preferable to a compressed one with only 1 byte of saving),
but in the case of ZSTD_compressSequencesAndLiterals(), it is catastrophic, because this variant is unable to generate an uncompressed block (since it doesn't see the original data). This would result in an error.
So we prefer skipping these additional checks, by invoking the next stage directly, ZSTD_entropyCompressSeqStore_internal().
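The trade-off described above can be modeled with a deliberately simplified sketch. It is not zstd's actual logic: `decide_block()`, `BlockDecision` and the `min_gain()` threshold are placeholders chosen for illustration, and the exact conditions in the library may differ.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { BLOCK_COMPRESSED, BLOCK_RAW, BLOCK_ERROR } BlockDecision;

/* Placeholder for "minimum saving required to keep a compressed block".
 * The formula is an assumption made for this sketch only. */
static size_t min_gain(size_t srcSize) { return (srcSize >> 6) + 2; }

/* haveSource != 0 models the regular pipeline, which can emit a raw block.
 * haveSource == 0 models the literals-only pipeline, which cannot. */
static BlockDecision decide_block(size_t srcSize, size_t cSize, int haveSource)
{
    if (haveSource) {
        /* With the source at hand, a marginal saving is not worth it:
         * prefer an uncompressed block below the threshold. */
        if (cSize + min_gain(srcSize) > srcSize) return BLOCK_RAW;
        return BLOCK_COMPRESSED;
    }
    /* Without the source, a raw block is impossible, so accept any
     * saving at all and fail only when there is none. */
    if (cSize < srcSize) return BLOCK_COMPRESSED;
    return BLOCK_ERROR;
}
```

In this model, a block that saves a single byte becomes a raw block in the regular pipeline but stays compressed in the literals-only pipeline, where the alternative would be an error.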

terrelln

Looks good once we add the fuzzer

Cyan4973 self-assigned this Dec 17, 2024

facebook-github-bot added the CLA Signed label Dec 17, 2024

Cyan4973 force-pushed the ZSTD_compressSequencesAndLiterals branch from ebc6485 to 5895903 Compare December 17, 2024 06:00

Cyan4973 force-pushed the ZSTD_compressSequencesAndLiterals branch from 877796c to 6b28993 Compare December 18, 2024 06:12

terrelln reviewed Dec 18, 2024

terrelln reviewed Dec 20, 2024

Cyan4973 added 22 commits December 20, 2024 10:36

publish new symbol ZSTD_compressSequencesAndLiterals()

125f052

created ZSTD_storeSeqOnly()

a00f45a

makes it possible to register a sequence without copying its literals.

move Sequences definition to zstd_compress_internal.h

b4a40a8

they should not be in common/zstd_internal.h, since these definitions are not shared beyond lib/compress/.

codemod: seqDef -> SeqDef

9671813

SeqDef is a type name, so it should start with a Capital letter. It's an internal symbol, no impact on public API.

codemod: seqStore_t -> SeqStore_t

a224572

same idea, SeqStore_t is a type name, it should start with a Capital letter.

codemod: ZSTD_sequenceLength -> ZSTD_SequenceLength

8d4506b

codemod: symbolEncodingType_e -> SymbolEncodingType_e

477a010

codemod: ZSTD_defaultPolicy_e -> ZSTD_DefaultPolicy_e

0442e43

ZSTD_entropyCompressSeqStore_internal() can accept an externally defined literals buffer

e9f8a11

created ZSTD_entropyCompressSeqStore_wExtLitBuffer()

0165eeb

can receive externally defined buffer of literals

codemod: ZSTD_sequenceFormat_e -> ZSTD_SequenceFormat_e

c97522f

since it's a type name. Note: in contrast with previous names, this one is on the Public API side. So there is a #define, so that existing programs using ZSTD_sequenceFormat_e still work.

codemod: ZSTD_sequenceCopier -> ZSTD_SequenceCopier_f

894ea31

minor: simplify ZSTD_selectSequenceCopier

1ac79ba

scope: ZSTD_copySequencesToSeqStore*() are private to ZSTD_compress.c

76dd3a9

no need to publish them outside of this unit.

fix proper type for .forceNonContiguous

03d95f9

enable proper type

5359d16

codemod: ZSTD_sequencePosition -> ZSTD_SequencePosition

30671d7

codemod: ZSTD_buildSeqStore_e -> ZSTD_BuildSeqStore_e

fa46894

codemod: ZSTD_matchState_t -> ZSTD_MatchState_t

5df80ac

codemod: repcodes_t -> Repcodes_t

41c667c

codemod: rawSeqStore_t -> RawSeqStore_t

25bef24

codemod: ZSTD_blockCompressor -> ZSTD_BlockCompressor_f

08edecb

Cyan4973 and others added 29 commits December 20, 2024 10:36

fullbench: new scenario: compressSequencesAndLiterals()

f281497

minor conversion warning fix

0a5c080

minor optimization for ZSTD_compressSequencesAndLiterals()

1c8f5b0

does not need to track and update internal `litPtr`. note: does not measurably impact performance.

minor doc update

f176514

minor optimization: only track seqPos->posInSrc when validateSequences is enabled

a288751

note: very minor saving, no performance impact

optimization: instantiate specialized version without Sequence checking code

1f6d681

results in +4% compression speed, thanks to removal of branches in the hot loop.

updated documentation on validateSequence

d2d0fda

minor: cleaner function parameter repcodeResolution

ca8bd83

change advanced parameter name: ZSTD_c_repcodeResolution

5164d44

and updated its documentation. Note: older name ZSTD_c_searchForExternalRepcodes remains supported via #define

ZSTD_compressSequencesAndLiterals() now supports multi-blocks frames.

31b5ef2

added tests

f0d0d95

check that ZSTD_compressSequencesAndLiterals() also verifies that the `srcSize` field is exact.

update Visual Studio solutions

6f8c104

removed fullbench-dll project from visual solutions

47edd0a

fixed incorrect assert

f617e86

attempt to silence Visual Studio warning about fopen()

61ac831

change name to ZSTD_convertSequences*()

d48e330

added benchmark for ZSTD_convertBlockSequences_wBlockDelim()

95ad9e4

improved speed of the Sequences converter

12c47d3

fixed minor conversion warning

b7b4e86

fixed minor error in one benchmark scenario

ad023b3

ZSTD_compressSequencesAndLiterals requires srcSize as parameter

0a54f6f

this makes it possible to adjust windowSize to its tightest.

added a test for ZSTD_compressSequencesAndLiterals

a80f55f

checks that srcSize is present in the frame header and bounds the window size.

add dedicated error code for special case

b339eff

ZSTD_compressSequencesAndLiterals() cannot produce an uncompressed block

ensure that srcSize is controlled

ab0f179

fixed minor error in preparation of one fullbench scenario

52a9bc6

add a check, to return an error if Sequence validation is enabled

76445bb

since ZSTD_compressSequencesAndLiterals() doesn't support it.

added parameter litCapacity

b7a9e69

to ZSTD_compressSequencesAndLiterals() to enforce the litCapacity >= litSize+8 condition.

minor: use MEM_writeLE24()

522adc3

so that an empty frame needs only 3 bytes of dstCapacity.

restore invocation of ZSTD_entropyCompressSeqStore()

47cbfc8

in the ZSTD_compressSequences() pipeline

Cyan4973 force-pushed the ZSTD_compressSequencesAndLiterals branch from 9bcee5f to 47cbfc8 Compare December 20, 2024 18:37
