Merge pull request #4174 from facebook/bench_loadOnce
Modify benchmark to load sources only once
Cyan4973 authored Oct 23, 2024
2 parents b880f20 + f34bc9c commit dfaf5fa
Showing 6 changed files with 225 additions and 350 deletions.
190 changes: 116 additions & 74 deletions programs/README.md
@@ -129,89 +129,131 @@ CLI includes in-memory compression benchmark module for zstd.
The benchmark is conducted on the given files, which are read into memory and joined together.
This makes the benchmark more precise, as it eliminates I/O overhead.
Multiple filenames can be supplied, either as separate parameters or with wildcards,
and directory names can be used with the `-r` option.
If no file is provided, the benchmark will use procedurally generated "lorem ipsum" content.
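
The loading step described above can be sketched as follows. This is a minimal Python illustration, not the code in zstd's benchmark module: `load_sources`, `fallback_content`, and `benchmark_input` are hypothetical names, and the repeated filler text only stands in for the generated "lorem ipsum" content.

```python
from pathlib import Path


def load_sources(paths):
    """Read each file fully into memory, once, and join the contents."""
    return b"".join(Path(p).read_bytes() for p in paths)


def fallback_content(size=1 << 16):
    """Hypothetical stand-in for the generated "lorem ipsum" sample."""
    words = b"lorem ipsum dolor sit amet consectetur adipiscing elit "
    repeats = size // len(words) + 1
    return (words * repeats)[:size]


def benchmark_input(paths):
    """Return the in-memory buffer that all timed runs would share."""
    return load_sources(paths) if paths else fallback_content()
```

Because the buffer is built before any timing starts, later measurements would cover compression work only, not disk reads.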

The benchmark measures ratio, compressed size, and compression and decompression speed.
Compression levels can be selected as a range, starting at `-b` and ending at `-e`.
The `-i` parameter sets the minimum time spent on each tested level.
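
To make the roles of the level range and the minimum time concrete, here is a rough Python sketch of such a measurement loop. It uses `zlib` from the Python standard library as a stand-in compressor (so levels are limited to zlib's range), and the names `time_best` and `bench_levels` are hypothetical. Keeping the fastest run of each loop is an assumption of this sketch, not a statement about how zstd aggregates its timings.

```python
import time
import zlib


def time_best(func, min_seconds):
    """Call func repeatedly for at least min_seconds; return (result, fastest run)."""
    deadline = time.perf_counter() + min_seconds
    best = float("inf")
    while True:
        start = time.perf_counter()
        result = func()
        best = min(best, time.perf_counter() - start)
        if time.perf_counter() >= deadline:
            # Clamp to avoid dividing by zero on very coarse clocks.
            return result, max(best, 1e-9)


def bench_levels(data, first, last, min_seconds=3.0):
    """Benchmark every level from first to last on one in-memory buffer."""
    rows = []
    for level in range(first, last + 1):
        compressed, c_time = time_best(lambda: zlib.compress(data, level), min_seconds)
        _, d_time = time_best(lambda: zlib.decompress(compressed), min_seconds)
        rows.append({
            "level": level,
            "compressed_size": len(compressed),
            "ratio": len(data) / len(compressed),
            "compress_MBps": len(data) / c_time / 1e6,
            "decompress_MBps": len(data) / d_time / 1e6,
        })
    return rows
```

In this sketch, `bench_levels(data, 1, 5, min_seconds=3)` plays the role that `-b1 -e5 -i3` plays on the command line.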

The benchmark can also be used to test specific parameters,
such as the number of threads (`-T#`), advanced parameters (`--zstd=#`), dictionary compression (`-D DICTIONARY`),
and many others available on the command line for regular compression and decompression.
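
As an illustration of how these options combine, the hypothetical helper below assembles a benchmark command line from the flags mentioned above (`-b#`, `-e#`, `-i#`, `-T#`, `-D`). The function name and its parameters are invented for this sketch; only the flags themselves come from the CLI.

```python
def benchmark_argv(files, first_level=3, last_level=None, seconds=None,
                   threads=None, dictionary=None):
    """Build an argument list for a zstd benchmark run (hypothetical helper)."""
    argv = ["zstd", f"-b{first_level}"]
    if last_level is not None:
        argv.append(f"-e{last_level}")   # last level of the tested range
    if seconds is not None:
        argv.append(f"-i{seconds}")      # minimum evaluation time per level
    if threads is not None:
        argv.append(f"-T{threads}")      # number of compression threads
    if dictionary is not None:
        argv += ["-D", dictionary]       # dictionary compression
    return argv + list(files)
```

The resulting list could be passed to `subprocess.run` on a machine where the `zstd` binary is installed.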


### Usage of Command Line Interface
The full list of options can be obtained with the `-h` or `-H` parameter:
```
Usage :
zstd [args] [FILE(s)] [-o file]
FILE : a filename
with no FILE, or when FILE is - , read standard input
Arguments :
-# : # compression level (1-19, default: 3)
-d : decompression
-D DICT: use DICT as Dictionary for compression or decompression
-o file: result stored into `file` (only 1 output file)
-f : overwrite output without prompting, also (de)compress links
--rm : remove source file(s) after successful de/compression
-k : preserve source file(s) (default)
-h/-H : display help/long help and exit
Advanced arguments :
-V : display Version number and exit
-c : write to standard output (even if it is the console)
-v : verbose mode; specify multiple times to increase verbosity
-q : suppress warnings; specify twice to suppress errors too
--no-progress : do not display the progress counter
-r : operate recursively on directories
--filelist FILE : read list of files to operate upon from FILE
--output-dir-flat DIR : processed files are stored into DIR
--output-dir-mirror DIR : processed files are stored into DIR respecting original directory structure
--[no-]asyncio : use asynchronous IO (default: enabled)
--[no-]check : during compression, add XXH64 integrity checksum to frame (default: enabled). If specified with -d, decompressor will ignore/validate checksums in compressed frame (default: validate).
-- : All arguments after "--" are treated as files
Advanced compression arguments :
--ultra : enable levels beyond 19, up to 22 (requires more memory)
--long[=#]: enable long distance matching with given window log (default: 27)
--fast[=#]: switch to very fast compression levels (default: 1)
--adapt : dynamically adapt compression level to I/O conditions
--patch-from=FILE : specify the file to be used as a reference point for zstd's diff engine
-T# : spawns # compression threads (default: 1, 0==# cores)
-B# : select size of each job (default: 0==automatic)
--single-thread : use a single thread for both I/O and compression (result slightly different than -T1)
--rsyncable : compress using a rsync-friendly method (-B sets block size)
--exclude-compressed: only compress files that are not already compressed
--stream-size=# : specify size of streaming input from `stdin`
--size-hint=# optimize compression parameters for streaming input of approximately this size
--target-compressed-block-size=# : generate compressed block of approximately targeted size
--no-dictID : don't write dictID into header (dictionary compression only)
--[no-]compress-literals : force (un)compressed literals
--format=zstd : compress files to the .zst format (default)
--format=gzip : compress files to the .gz format
--format=xz : compress files to the .xz format
--format=lzma : compress files to the .lzma format
--format=lz4 : compress files to the .lz4 format
Advanced decompression arguments :
-l : print information about zstd compressed files
--test : test compressed file integrity
-M# : Set a memory usage limit for decompression
--[no-]sparse : sparse mode (default: disabled)
Dictionary builder :
--train ## : create a dictionary from a training set of files
--train-cover[=k=#,d=#,steps=#,split=#,shrink[=#]] : use the cover algorithm with optional args
--train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#,shrink[=#]] : use the fast cover algorithm with optional args
--train-legacy[=s=#] : use the legacy algorithm with selectivity (default: 9)
-o DICT : DICT is dictionary name (default: dictionary)
--maxdict=# : limit dictionary to specified size (default: 112640)
--dictID=# : force dictionary ID to specified value (default: random)
Benchmark arguments :
-b# : benchmark file(s), using # compression level (default: 3)
-e# : test all compression levels successively from -b# to -e# (default: 1)
-i# : minimum evaluation time in seconds (default: 3s)
-B# : cut file into independent chunks of size # (default: no chunking)
-S : output one benchmark result per input file (default: consolidated result)
--priority=rt : set process priority to real-time
*** Zstandard CLI (64-bit) v1.5.6, by Yann Collet ***
Compress or decompress the INPUT file(s); reads from STDIN if INPUT is `-` or not provided.
Usage: zstd [OPTIONS...] [INPUT... | -] [-o OUTPUT]
Options:
-o OUTPUT Write output to a single file, OUTPUT.
-k, --keep Preserve INPUT file(s). [Default]
--rm Remove INPUT file(s) after successful (de)compression.
-# Desired compression level, where `#` is a number between 1 and 19;
lower numbers provide faster compression, higher numbers yield
better compression ratios. [Default: 3]
-d, --decompress Perform decompression.
-D DICT Use DICT as the dictionary for compression or decompression.
-f, --force Disable input and output checks. Allows overwriting existing files,
receiving input from the console, printing output to STDOUT, and
operating on links, block devices, etc. Unrecognized formats will be
passed through as-is.
-h Display short usage and exit.
-H, --help Display full help and exit.
-V, --version Display the program version and exit.
Advanced options:
-c, --stdout Write to STDOUT (even if it is a console) and keep the INPUT file(s).
-v, --verbose Enable verbose output; pass multiple times to increase verbosity.
-q, --quiet Suppress warnings; pass twice to suppress errors.
--trace LOG Log tracing information to LOG.
--[no-]progress Forcibly show/hide the progress counter. NOTE: Any (de)compressed
output to terminal will mix with progress counter text.
-r Operate recursively on directories.
--filelist LIST Read a list of files to operate on from LIST.
--output-dir-flat DIR Store processed files in DIR.
--output-dir-mirror DIR Store processed files in DIR, respecting original directory structure.
--[no-]asyncio Use asynchronous IO. [Default: Enabled]
--[no-]check Add XXH64 integrity checksums during compression. [Default: Add, Validate]
If `-d` is present, ignore/validate checksums during decompression.
-- Treat remaining arguments after `--` as files.
Advanced compression options:
--ultra Enable levels beyond 19, up to 22; requires more memory.
--fast[=#] Switch to very fast compression levels. [Default: 1]
--adapt Dynamically adapt compression level to I/O conditions.
--long[=#] Enable long distance matching with window log #. [Default: 27]
--patch-from=REF Use REF as the reference point for Zstandard's diff engine.
-T# Spawn # compression threads. [Default: 1; pass 0 for core count.]
--single-thread Share a single thread for I/O and compression (slightly different than `-T1`).
--auto-threads={physical|logical}
Use physical/logical cores when using `-T0`. [Default: Physical]
-B# Set job size to #. [Default: 0 (automatic)]
--rsyncable Compress using a rsync-friendly method (`-B` sets block size).
--exclude-compressed Only compress files that are not already compressed.
--stream-size=# Specify size of streaming input from STDIN.
--size-hint=# Optimize compression parameters for streaming input of approximately size #.
--target-compressed-block-size=#
Generate compressed blocks of approximately # size.
--no-dictID Don't write `dictID` into the header (dictionary compression only).
--[no-]compress-literals Force (un)compressed literals.
--[no-]row-match-finder Explicitly enable/disable the fast, row-based matchfinder for
the 'greedy', 'lazy', and 'lazy2' strategies.
--format=zstd Compress files to the `.zst` format. [Default]
--[no-]mmap-dict Memory-map dictionary file rather than mallocing and loading all at once
--format=gzip Compress files to the `.gz` format.
--format=xz Compress files to the `.xz` format.
--format=lzma Compress files to the `.lzma` format.
--format=lz4 Compress files to the `.lz4` format.
Advanced decompression options:
-l Print information about Zstandard-compressed files.
--test Test compressed file integrity.
-M# Set the memory usage limit to # megabytes.
--[no-]sparse Enable sparse mode. [Default: Enabled for files, disabled for STDOUT.]
--[no-]pass-through Pass through uncompressed files as-is. [Default: Disabled]
Dictionary builder:
--train Create a dictionary from a training set of files.
--train-cover[=k=#,d=#,steps=#,split=#,shrink[=#]]
Use the cover algorithm (with optional arguments).
--train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#,shrink[=#]]
Use the fast cover algorithm (with optional arguments).
--train-legacy[=s=#] Use the legacy algorithm with selectivity #. [Default: 9]
-o NAME Use NAME as dictionary name. [Default: dictionary]
--maxdict=# Limit dictionary to specified size #. [Default: 112640]
--dictID=# Force dictionary ID to #. [Default: Random]
Benchmark options:
-b# Perform benchmarking with compression level #. [Default: 3]
-e# Test all compression levels up to #; starting level is `-b#`. [Default: 1]
-i# Set the minimum evaluation time to # seconds. [Default: 3]
-B# Cut file into independent chunks of size #. [Default: No chunking]
-S Output one benchmark result per input file. [Default: Consolidated result]
-D dictionary Benchmark using dictionary
--priority=rt Set process priority to real-time.
```

### Passing parameters through Environment Variables