Releases · Xilinx/inference-server · GitHub

07 Sep 16:50

v0.4.0 Latest

What's Changed

Refactor globals by @varunsh-xilinx in #125
Add MLPerf app by @varunsh-xilinx in #129
Use benchmark library for C++ benchmarks by @varunsh-xilinx in #147
Use a global memory pool by @varunsh-xilinx in #149
Resolve requests before the batcher by @varunsh-xilinx in #164
Support using custom classes for memory in the pool by @varunsh-xilinx in #166
Enable parallel tests in CI by @varunsh-xilinx in #168
Use VAI 3.0 XRT and XRM by @varunsh-xilinx in #169
Update CMake build by @varunsh-xilinx in #170
Add C++ worker by @varunsh-xilinx in #172
Add QuickStart for ZenDNN CPU Inference by @rradjabi in #174
Add model chaining by @varunsh-xilinx in #176
Support model chaining in modelLoad by @varunsh-xilinx in #178
Update downloading test models by @varunsh-xilinx in #180
Update documentation by @varunsh-xilinx in #182
Pin python packages by @varunsh-xilinx in #183
Update benchmarks by @varunsh-xilinx in #184
Close dynamically opened libraries by @varunsh-xilinx in #186
Replace Jaeger exporter with OTLP by @varunsh-xilinx in #187
Use vcpkg for dependencies by @varunsh-xilinx in #188
Add FP16 test by @varunsh-xilinx in #189
Improve KServe compatibility by @varunsh-xilinx in #190
Fix wheel generation by @varunsh-xilinx in #191
Allow passing git credentials to Docker build by @varunsh-xilinx in #194
Fix loading models at startup by @varunsh-xilinx in #195
Use MLCommons app for benchmarks by @varunsh-xilinx in #197
Add performance graphs by @varunsh-xilinx in #198
Update documentation by @varunsh-xilinx in #200
Bug fix in migraphx filename mangling by @bpickrel in #202
Add FP16 test in Python by @varunsh-xilinx in #203
Fix FP16 binding by @varunsh-xilinx in #208
Fix quickstart script for 0.3.0 by @varunsh-xilinx in #210
Include tensor name in model metadata for xmodel worker by @ZchiPitt in #207
Add resize to Python preprocess binding by @varunsh-xilinx in #213
Add a test for the VCK5000 by @varunsh-xilinx in #214
UIF 1.2 release by @varunsh-xilinx in #215

New Contributors

@rradjabi made their first contribution in #174
@ZchiPitt made their first contribution in #207

Full Changelog: v0.3.0...v0.4.0

Contributors

ZchiPitt, rradjabi, and 2 other contributors

01 Feb 15:34

v0.3.0

Added

Allow building Debian package (#59)
Add modelInferAsync to the API (@2f4a6c2)
Add inferAsyncOrdered as a client operator for making inferences in parallel (#66)
Support building Python wheels with cibuildwheel (#71)
Support XModels with multiple output tensors (#74)
Add FP16 support (#76)
Add more documentation (#85, #90)
Add Python bindings for gRPC and Native clients (#88)
Add tests with KServe (#90)
Add batch size flag to examples (#94)
Add Kubernetes test for KServe (#95)
Use exhale to generate Python API documentation (#95)
OpenAPI spec for REST protocol (#100)
Use a timer for simpler time measurement (#104)
Allow building containers with custom backend versions (#107)

Changed

Refactor pre- and post-processing functions in C++ (@42cf748)
Templatize Dockerfile for different base images (#71)
Use multiple HTTP clients internally for parallel HTTP requests (#66)
Update test asset downloading (#81)
Reimplement and align examples across platforms (#85)
Reorganize Python library (#88)
Rename 'proteus' to 'amdinfer' (#91)
Use Ubuntu 20.04 by default for Docker (#97)
Bump up to ROCm 5.4.1 (#99)
Rename some functions for style consistency (#102)
Bump up to ZenDNN 4.0 (#113)

Deprecated

ALL_CAPS style enums for the DataType (#102)

Removed

Mappings between XIR data types and inference server data types from public API (#102)
Web GUI (#110)

Fixed

Use input tensors in requests correctly (#61)
Fix bug with multiple input tensors (#74)
Align gRPC responses using non-gRPC-native data types with other input protocols (#81)
Fix the Manager’s destructor (#88)
Fix using --no-user-config with proteus run (#89)
Handle assigning user permissions if the host UID is the same as the UID in the container (#101)
Fix test discovery if some test assets are missing (#105)
Fix gRPC queue shutdown race condition (#111)

Full Changelog: v0.2.0...v0.3.0

05 Aug 15:54

v0.2.0

Added

HTTP/REST C++ client (@cbf33b8)
gRPC API based on KServe v2 API (@37a6aad and others)
TensorFlow/PyTorch + ZenDNN backend (#17 and #21)
‘ServerMetadata’ endpoint to the API (@7747911)
‘modelList’ endpoint to the API (@7477b7d)
Parse JSON data as string in HTTP body (@694800e)
Directory monitoring for model loading (@6459797)
‘ModelMetadata’ endpoint to the API (@22b9d1a)
MIGraphX backend (#34)
Pre-commit for style verification (@048bdd7)

Changed

Use Pybind11 to create Python API (#20)
Two logs are created now: server and client
Logging macro is now PROTEUS_LOG_*
Loading workers is now case-insensitive (@14ed4ef and @90a51ae)
Build AKS from source (@e04890f)
Use consistent custom exceptions (#30)
Update Docker build commands to opt-in to all backends (#43)
Renamed 'modelLoad' to 'workerLoad' and changed the behavior for 'modelLoad' (#27)

Fixed

Get the right request size in the batcher when enqueuing with the C++ API (@d1ad81d)
Construct responses correctly in the XModel worker if there are multiple input buffers (@d1ad81d)
Populate the right number of offsets in the hard batcher (@6666142)
Calculate offset values correctly during batching (@8c7534b)
Get correct library dependencies for production container (@14ed4ef)
Correctly throw an exception if a worker gets an error during initialization (#29)
Detect errors in HTTP client during loading (@99ffc33)
Construct batches with the right sizes (#57)

New Contributors

@amuralee-amd made their first contribution in #17
@dependabot made their first contribution in #24
@bpickrel made their first contribution in #34

Full Changelog: v0.1.0...v0.2.0

Contributors

dependabot, bpickrel, and amuralee-amd

09 Feb 01:30

v0.1.0

Initial open-source release
