Merge pull request #808 from facebook/dev
Zstandard v1.3.1
Cyan4973 authored Aug 20, 2017
2 parents b72808a + 4912fc2 commit aecf3b4
Showing 134 changed files with 7,318 additions and 2,229 deletions.
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -39,4 +39,4 @@ outlined on that page and do not file a public issue.

## License
By contributing to Zstandard, you agree that your contributions will be licensed
under the [LICENSE](LICENSE) file in the root directory of this source tree.
under both the [LICENSE](LICENSE) file and the [COPYING](COPYING) file in the root directory of this source tree.
339 changes: 339 additions & 0 deletions COPYING

Large diffs are not rendered by default.

16 changes: 9 additions & 7 deletions Makefile
@@ -74,12 +74,9 @@ zstdmt:
zlibwrapper:
$(MAKE) -C $(ZWRAPDIR) test

.PHONY: shortest
shortest:
$(MAKE) -C $(TESTDIR) $@

.PHONY: test
test:
.PHONY: test shortest
test shortest:
$(MAKE) -C $(PRGDIR) allVariants
$(MAKE) -C $(TESTDIR) $@

.PHONY: examples
@@ -146,6 +143,11 @@ gcc6build: clean
gcc-6 -v
CC=gcc-6 $(MAKE) all MOREFLAGS="-Werror"

.PHONY: gcc7build
gcc7build: clean
gcc-7 -v
CC=gcc-7 $(MAKE) all MOREFLAGS="-Werror"

.PHONY: clangbuild
clangbuild: clean
clang -v
@@ -180,7 +182,7 @@ ppc64fuzz: clean
CC=powerpc-linux-gnu-gcc QEMU_SYS=qemu-ppc64-static MOREFLAGS="-m64 -static" FUZZER_FLAGS=--no-big-tests $(MAKE) -C $(TESTDIR) fuzztest

gpptest: clean
CC=g++ $(MAKE) -C $(PRGDIR) all CFLAGS="-O3 -Wall -Wextra -Wundef -Wshadow -Wcast-align -Werror"
CC=$(CXX) $(MAKE) -C $(PRGDIR) all CFLAGS="-O3 -Wall -Wextra -Wundef -Wshadow -Wcast-align -Werror"

gcc5test: clean
gcc-5 -v
15 changes: 15 additions & 0 deletions NEWS
@@ -1,3 +1,18 @@
v1.3.1
New license : BSD + GPLv2
perf: substantially decreased memory usage in Multi-threading mode, thanks to reports by Tino Reichardt (@mcmilk)
perf: Multi-threading supports up to 256 threads. Cap at 256 when more are requested (#760)
cli : improved and fixed --list command, by @ib (#772)
cli : command -vV to list supported formats, by @ib (#771)
build : fixed binary variants, reported by @svenha (#788)
build : fix Visual compilation for non x86/x64 targets, reported by Greg Slazinski (@GregSlazinski) (#718)
API exp : breaking change : ZSTD_getframeHeader() provides more information
API exp : breaking change : pinned down values of error codes
doc : fixed huffman example, by Ulrich Kunitz (@ulikunitz)
new : contrib/adaptive-compression, I/O driven compression strength, by Paul Cruz (@paulcruz74)
new : contrib/long_distance_matching, statistics by Stella Lau (@stellamplau)
updated : contrib/linux-kernel, by Nick Terrell (@terrelln)

v1.3.0
cli : new : `--list` command, by Paul Cruz
cli : changed : xz/lzma support enabled by default
33 changes: 0 additions & 33 deletions PATENTS

This file was deleted.

4 changes: 2 additions & 2 deletions README.md
@@ -134,12 +134,12 @@ Going into `build` directory, you will find additional possibilities :

### Status

Zstandard is currently deployed within Facebook. It is used daily to compress and decompress very large amounts of data in multiple formats and use cases.
Zstandard is currently deployed within Facebook. It is used continuously to compress large amounts of data in multiple formats and use cases.
Zstandard is considered safe for production environments.

### License

Zstandard is [BSD-licensed](LICENSE). We also provide an [additional patent grant](PATENTS).
Zstandard is dual-licensed under [BSD](LICENSE) and [GPLv2](COPYING).

### Contributing

3 changes: 3 additions & 0 deletions build/cmake/programs/CMakeLists.txt
@@ -31,6 +31,9 @@ ENDIF (MSVC)

ADD_EXECUTABLE(zstd ${PROGRAMS_DIR}/zstdcli.c ${PROGRAMS_DIR}/fileio.c ${PROGRAMS_DIR}/bench.c ${PROGRAMS_DIR}/datagen.c ${PROGRAMS_DIR}/dibio.c ${PlatformDependResources})
TARGET_LINK_LIBRARIES(zstd libzstd_static)
IF (CMAKE_SYSTEM_NAME MATCHES "(Solaris|SunOS)")
TARGET_LINK_LIBRARIES(zstd rt)
ENDIF (CMAKE_SYSTEM_NAME MATCHES "(Solaris|SunOS)")
INSTALL(TARGETS zstd RUNTIME DESTINATION "bin")

IF (UNIX)
6 changes: 3 additions & 3 deletions circle.yml
@@ -3,7 +3,7 @@ dependencies:
- sudo dpkg --add-architecture i386
- sudo add-apt-repository -y ppa:ubuntu-toolchain-r/test; sudo apt-get -y -qq update
- sudo apt-get -y install gcc-powerpc-linux-gnu gcc-arm-linux-gnueabi libc6-dev-armel-cross gcc-aarch64-linux-gnu libc6-dev-arm64-cross
- sudo apt-get -y install libstdc++-6-dev clang gcc g++ gcc-5 gcc-6 zlib1g-dev liblzma-dev
- sudo apt-get -y install libstdc++-7-dev clang gcc g++ gcc-5 gcc-6 gcc-7 zlib1g-dev liblzma-dev
- sudo apt-get -y install linux-libc-dev:i386 libc6-dev-i386

test:
@@ -45,7 +45,7 @@ test:
parallel: true
- ? |
if [[ "$CIRCLE_NODE_INDEX" == "0" ]] ; then make ppc64build && make clean; fi &&
if [[ "$CIRCLE_NODE_TOTAL" < "2" ]] || [[ "$CIRCLE_NODE_INDEX" == "1" ]]; then true && make clean; fi #could add another test here
if [[ "$CIRCLE_NODE_TOTAL" < "2" ]] || [[ "$CIRCLE_NODE_INDEX" == "1" ]]; then make gcc7build && make clean; fi #could add another test here
:
parallel: true
- ? |
@@ -64,7 +64,7 @@ test:
#- gcc -v; make -C tests test32 MOREFLAGS="-I/usr/include/x86_64-linux-gnu" && make clean
#- make uasan && make clean
#- make asan32 && make clean
#- make -C tests test32 CC=clang MOREFLAGS="-g -fsanitize=address -I/usr/include/x86_64-linux-gnu"
#- make -C tests test32 CC=clang MOREFLAGS="-g -fsanitize=address -I/usr/include/x86_64-linux-gnu"
# Valgrind tests
#- CFLAGS="-O1 -g" make -C zlibWrapper valgrindTest && make clean
#- make -C tests valgrindTest && make clean
76 changes: 76 additions & 0 deletions contrib/adaptive-compression/Makefile
@@ -0,0 +1,76 @@

ZSTDDIR = ../../lib
PRGDIR = ../../programs
ZSTDCOMMON_FILES := $(ZSTDDIR)/common/*.c
ZSTDCOMP_FILES := $(ZSTDDIR)/compress/*.c
ZSTDDECOMP_FILES := $(ZSTDDIR)/decompress/*.c
ZSTD_FILES := $(ZSTDDECOMP_FILES) $(ZSTDCOMMON_FILES) $(ZSTDCOMP_FILES)

MULTITHREAD_LDFLAGS = -pthread
DEBUGFLAGS= -g -DZSTD_DEBUG=1
CPPFLAGS += -I$(ZSTDDIR) -I$(ZSTDDIR)/common -I$(ZSTDDIR)/compress \
-I$(ZSTDDIR)/dictBuilder -I$(ZSTDDIR)/deprecated -I$(PRGDIR)
CFLAGS ?= -O3
CFLAGS += -Wall -Wextra -Wcast-qual -Wcast-align -Wshadow \
-Wstrict-aliasing=1 -Wswitch-enum -Wdeclaration-after-statement \
-Wstrict-prototypes -Wundef -Wformat-security \
-Wvla -Wformat=2 -Winit-self -Wfloat-equal -Wwrite-strings \
-Wredundant-decls
CFLAGS += $(DEBUGFLAGS)
CFLAGS += $(MOREFLAGS)
FLAGS = $(CPPFLAGS) $(CFLAGS) $(LDFLAGS) $(MULTITHREAD_LDFLAGS)

all: adapt datagen

adapt: $(ZSTD_FILES) adapt.c
$(CC) $(FLAGS) $^ -o $@

adapt-debug: $(ZSTD_FILES) adapt.c
$(CC) $(FLAGS) -DDEBUG_MODE=2 $^ -o adapt

datagen : $(PRGDIR)/datagen.c datagencli.c
$(CC) $(FLAGS) $^ -o $@

test-adapt-correctness: datagen adapt
@./test-correctness.sh
@echo "test correctness complete"

test-adapt-performance: datagen adapt
@./test-performance.sh
@echo "test performance complete"

clean:
@$(RM) -f adapt datagen
@$(RM) -rf *.dSYM
@$(RM) -f tmp*
@$(RM) -f tests/*.zst
@$(RM) -f tests/tmp*
@echo "finished cleaning"

#-----------------------------------------------------------------------------
# make install is validated only for Linux, OSX, BSD, Hurd and Solaris targets
#-----------------------------------------------------------------------------
ifneq (,$(filter $(shell uname),Linux Darwin GNU/kFreeBSD GNU OpenBSD FreeBSD NetBSD DragonFly SunOS))

ifneq (,$(filter $(shell uname),SunOS))
INSTALL ?= ginstall
else
INSTALL ?= install
endif

PREFIX ?= /usr/local
DESTDIR ?=
BINDIR ?= $(PREFIX)/bin

INSTALL_PROGRAM ?= $(INSTALL) -m 755

install: adapt
@echo Installing binaries
@$(INSTALL) -d -m 755 $(DESTDIR)$(BINDIR)/
@$(INSTALL_PROGRAM) adapt $(DESTDIR)$(BINDIR)/zstd-adaptive
@echo zstd-adaptive installation completed

uninstall:
@$(RM) $(DESTDIR)$(BINDIR)/zstd-adaptive
@echo zstd-adaptive programs successfully uninstalled
endif
91 changes: 91 additions & 0 deletions contrib/adaptive-compression/README.md
@@ -0,0 +1,91 @@
### Summary

`adapt` is a new compression tool targeted at optimizing performance across network connections and pipelines. The tool senses network or pipe speed and adapts the compression level accordingly.
In situations where the compression level does not appropriately match the network/pipe speed, compression may bottleneck the entire pipeline, or files may not be compressed as much as they could be, losing efficiency. It also becomes quite impractical to manually measure and set an optimal compression level (which could change over time).

### Using `adapt`

In order to build and use the tool, you can simply run `make adapt` in the `adaptive-compression` directory under `contrib`. This will generate an executable available for use. Another possible method of installation is running `make install`, which will create and install the binary as the command `zstd-adaptive`.
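As a rough illustration, a build-and-install session might look like the sketch below (the install step defaults to `/usr/local/bin`, per the Makefile's `PREFIX`, and may require elevated privileges depending on your prefix):

```sh
# From the root of the zstd repository
cd contrib/adaptive-compression

# Build the `adapt` executable in the current directory
make adapt

# Or install it system-wide as the `zstd-adaptive` command
# (PREFIX defaults to /usr/local; sudo may be needed)
sudo make install
```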

Similar to many other compression utilities, `zstd-adaptive` can be invoked by using the following format:

`zstd-adaptive [options] [file(s)]`

Supported options for the above format are described below.

`zstd-adaptive` also supports reading from `stdin` and writing to `stdout`, which is potentially more useful. By default, if no files are given, `zstd-adaptive` reads from and writes to standard I/O. Therefore, you can simply insert it within a pipeline like so:

`cat FILE | zstd-adaptive | ssh "cat - > tmp.zst"`

If a file is provided, it is also possible to force writing to stdout using the `-c` flag like so:

`zstd-adaptive -c FILE | ssh "cat - > tmp.zst"`

Several options described below can be used to control the behavior of `zstd-adaptive`. More specifically, the `-l#` and `-u#` flags set lower and upper bounds so that the compression level always stays within that range. The `-i#` flag can also be used to change the initial compression level. If no initial compression level is provided, it is chosen so that it falls within the appropriate range (it becomes equal to the lower bound).
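For example (flag values chosen arbitrarily for illustration), the following pipeline keeps the adaptive compression level between 3 and 12 and starts it at level 5:

```sh
# Illustrative bounds: level stays in [3, 12], starting at 5
cat FILE | zstd-adaptive -l3 -u12 -i5 | ssh dev "cat - > tmp.zst"
```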

### Options
`-oFILE` : write output to `FILE`

`-i#` : provide initial compression level (must be within the appropriate bounds)

`-h` : display help/information

`-f` : force the compression level to stay constant

`-c` : force write to `stdout`

`-p` : hide progress bar

`-q` : quiet mode -- do not show progress bar or other information

`-l#` : set a lower bound on the compression level (default is 1)

`-u#` : set an upper bound on the compression level (default is 22)
### Benchmarking / Test results
#### Artificial Tests
These artificial tests were run using the `pv` command-line utility to limit pipe speeds (25 MB/s read and 5 MB/s write limits were chosen to mimic severe throughput constraints). A 40 GB backup file was sent through a pipeline, compressed, and written out to a file. Compression time, size, and ratio were computed. Data for `zstd -15` was excluded from these tests because the runs take too long.

<table>
<tr><th> 25 MB/s read limit </th></tr>
<tr><td>

| Compressor Name | Ratio | Compressed Size | Compression Time |
|:----------------|------:|----------------:|-----------------:|
| zstd -3 | 2.108 | 20.718 GB | 29m 48.530s |
| zstd-adaptive | 2.230 | 19.581 GB | 29m 48.798s |

</td></tr>
</table>

<table>
<tr><th> 5 MB/s write limit </th></tr>
<tr><td>

| Compressor Name | Ratio | Compressed Size | Compression Time |
|:----------------|------:|----------------:|-----------------:|
| zstd -3 | 2.108 | 20.718 GB | 1h 10m 43.076s |
| zstd-adaptive | 2.249 | 19.412 GB | 1h 06m 15.577s |

</td></tr>
</table>

The commands used for this test generally followed the form:

`cat FILE | pv -L 25m -q | COMPRESSION | pv -q > tmp.zst # impose 25 MB/s read limit`

`cat FILE | pv -q | COMPRESSION | pv -L 5m -q > tmp.zst # impose 5 MB/s write limit`

#### SSH Tests

The following tests were performed by piping a relatively large backup file (approximately 80 GB) through compression and over SSH to be stored on a server. The test data includes statistics for time and compressed size for `zstd` at several compression levels, as well as for `zstd-adaptive`. The data highlights the potential advantages that `zstd-adaptive` has over a low static compression level, and the negative impacts that an excessively high static compression level can have on pipe throughput.

| Compressor Name | Ratio | Compressed Size | Compression Time |
|:----------------|------:|----------------:|-----------------:|
| zstd -3 | 2.212 | 32.426 GB | 1h 17m 59.756s |
| zstd -15 | 2.374 | 30.213 GB | 2h 56m 59.441s |
| zstd-adaptive | 2.315 | 30.993 GB | 1h 18m 52.860s |

The commands used for this test generally followed the form:

`cat FILE | COMPRESSION | ssh dev "cat - > tmp.zst"`