Releases: Xilinx/inference-server
v0.4.0
What's Changed
- Refactor globals by @varunsh-xilinx in #125
- Add MLPerf app by @varunsh-xilinx in #129
- Use benchmark library for C++ benchmarks by @varunsh-xilinx in #147
- Use a global memory pool by @varunsh-xilinx in #149
- Resolve requests before the batcher by @varunsh-xilinx in #164
- Support using custom classes for memory in the pool by @varunsh-xilinx in #166
- Enable parallel tests in CI by @varunsh-xilinx in #168
- Use VAI 3.0 XRT and XRM by @varunsh-xilinx in #169
- Update CMake build by @varunsh-xilinx in #170
- Add C++ worker by @varunsh-xilinx in #172
- Add QuickStart for ZenDNN CPU Inference by @rradjabi in #174
- Add model chaining by @varunsh-xilinx in #176
- Support model chaining in modelLoad by @varunsh-xilinx in #178
- Update downloading test models by @varunsh-xilinx in #180
- Update documentation by @varunsh-xilinx in #182
- Pin python packages by @varunsh-xilinx in #183
- Update benchmarks by @varunsh-xilinx in #184
- Close dynamically opened libraries by @varunsh-xilinx in #186
- Replace Jaeger exporter with OTLP by @varunsh-xilinx in #187
- Use vcpkg for dependencies by @varunsh-xilinx in #188
- Add FP16 test by @varunsh-xilinx in #189
- Improve KServe compatibility by @varunsh-xilinx in #190
- Fix wheel generation by @varunsh-xilinx in #191
- Allow passing git credentials to Docker build by @varunsh-xilinx in #194
- Fix loading models at startup by @varunsh-xilinx in #195
- Use MLCommons app for benchmarks by @varunsh-xilinx in #197
- Add performance graphs by @varunsh-xilinx in #198
- Update documentation by @varunsh-xilinx in #200
- Fix bug in MIGraphX filename mangling by @bpickrel in #202
- Add FP16 test in Python by @varunsh-xilinx in #203
- Fix FP16 binding by @varunsh-xilinx in #208
- Fix quickstart script for 0.3.0 by @varunsh-xilinx in #210
- Include tensor name in model metadata for xmodel worker by @ZchiPitt in #207
- Add resize to Python preprocess binding by @varunsh-xilinx in #213
- Add a test for the VCK5000 by @varunsh-xilinx in #214
- UIF 1.2 release by @varunsh-xilinx in #215
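Among the changes above, #176 and #178 add model chaining, where one model's output feeds directly into the next. A minimal, library-independent sketch of the idea (the `chain` helper and stage names below are illustrative only, not the server's API):

```python
# Conceptual sketch of model chaining: each stage's output feeds the next.
# The functions and names below are illustrative, not the amdinfer API.

from typing import Callable, List


def chain(stages: List[Callable]) -> Callable:
    """Compose a list of model callables into a single pipeline."""
    def pipeline(data):
        for stage in stages:
            data = stage(data)
        return data
    return pipeline


# Example: a "preprocess" stage followed by a "classifier" stage.
preprocess = lambda xs: [x / 255.0 for x in xs]
classify = lambda xs: [1 if x > 0.5 else 0 for x in xs]

model = chain([preprocess, classify])
print(model([51, 204]))  # -> [0, 1]
```

With `modelLoad` chaining (#178), the server performs this composition internally, so a client still issues a single inference request.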
New Contributors
- @rradjabi made their first contribution in #174
- @ZchiPitt made their first contribution in #207
Full Changelog: v0.3.0...v0.4.0
v0.3.0
Added
- Allow building Debian package (#59)
- Add modelInferAsync to the API (@2f4a6c2)
- Add inferAsyncOrdered as a client operator for making inferences in parallel (#66)
- Support building Python wheels with cibuildwheel (#71)
- Support XModels with multiple output tensors (#74)
- Add FP16 support (#76)
- Add more documentation (#85, #90)
- Add Python bindings for gRPC and Native clients (#88)
- Add tests with KServe (#90)
- Add batch size flag to examples (#94)
- Add Kubernetes test for KServe (#95)
- Use exhale to generate Python API documentation (#95)
- OpenAPI spec for REST protocol (#100)
- Use a timer for simpler time measurement (#104)
- Allow building containers with custom backend versions (#107)
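The `inferAsyncOrdered` operator (#66) issues many requests in parallel while still returning responses in request order. A rough sketch of that behavior using a stub in place of a real client call (the names here are illustrative, not the server's client API):

```python
# Sketch of an "infer in parallel, return in order" operator, in the spirit
# of inferAsyncOrdered (#66). The infer() stub stands in for a real client call.

from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, List


def infer_async_ordered(infer: Callable[[Any], Any], requests: List[Any]) -> List[Any]:
    """Run infer() over all requests concurrently, preserving request order."""
    with ThreadPoolExecutor() as pool:
        # Executor.map yields results in the order of its inputs, even if
        # individual calls finish out of order.
        return list(pool.map(infer, requests))


# Stub inference call: square the input.
responses = infer_async_ordered(lambda x: x * x, [1, 2, 3, 4])
print(responses)  # -> [1, 4, 9, 16]
```

The ordering guarantee is what distinguishes this from a plain fire-and-collect pattern, where responses arrive in completion order.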
Changed
- Refactor pre- and post-processing functions in C++ (@42cf748)
- Templatize Dockerfile for different base images (#71)
- Use multiple HTTP clients internally for parallel HTTP requests (#66)
- Update test asset downloading (#81)
- Reimplement and align examples across platforms (#85)
- Reorganize Python library (#88)
- Rename 'proteus' to 'amdinfer' (#91)
- Use Ubuntu 20.04 by default for Docker (#97)
- Bump up to ROCm 5.4.1 (#99)
- Some function names changed for style (#102)
- Bump up to ZenDNN 4.0 (#113)
Deprecated
- ALL_CAPS style enums for the DataType (#102)
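The deprecation in #102 keeps the old ALL_CAPS spellings usable while steering callers to the new names. One common way to implement such a transition, sketched in Python (the member names and helper below are illustrative, not the server's actual `DataType` API):

```python
# Sketch of deprecating ALL_CAPS enum aliases while keeping them working,
# in the spirit of #102. Names are illustrative, not amdinfer's DataType.

import enum
import warnings


class DataType(enum.Enum):
    Fp32 = "fp32"
    Int8 = "int8"


# Old ALL_CAPS spellings map to the new members but warn on lookup.
_DEPRECATED = {"FP32": DataType.Fp32, "INT8": DataType.Int8}


def datatype(name: str) -> DataType:
    """Resolve a DataType by name, accepting deprecated ALL_CAPS aliases."""
    if name in _DEPRECATED:
        warnings.warn(
            f"'{name}' is deprecated; use '{_DEPRECATED[name].name}'",
            DeprecationWarning,
            stacklevel=2,
        )
        return _DEPRECATED[name]
    return DataType[name]


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    dt = datatype("FP32")

print(dt is DataType.Fp32)  # -> True
```

Existing code keeps working through one release cycle, and the warning points at the replacement name before the aliases are removed.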
Removed
- Mappings between XIR data types <-> inference server data types from public API (#102)
- Web GUI (#110)
Fixed
- Use input tensors in requests correctly (#61)
- Fix bug with multiple input tensors (#74)
- Align gRPC responses using non-gRPC-native data types with other input protocols (#81)
- Fix the Manager’s destructor (#88)
- Fix using --no-user-config with proteus run (#89)
- Handle assigning user permissions if the host UID is the same as a UID in the container (#101)
- Fix test discovery if some test assets are missing (#105)
- Fix gRPC queue shutdown race condition (#111)
Full Changelog: v0.2.0...v0.3.0
v0.2.0
Added
- HTTP/REST C++ client (@cbf33b8)
- gRPC API based on KServe v2 API (@37a6aad and others)
- TensorFlow/PyTorch + ZenDNN backends (#17 and #21)
- 'ServerMetadata' endpoint to the API (@7747911)
- 'modelList' endpoint to the API (@7477b7d)
- Parse JSON data as string in HTTP body (@694800e)
- Directory monitoring for model loading (@6459797)
- 'ModelMetadata' endpoint to the API (@22b9d1a)
- MIGraphX backend (#34)
- Pre-commit for style verification (@048bdd7)
Changed
- Use Pybind11 to create Python API (#20)
- Two logs are now created: one for the server and one for the client
- Logging macro is now PROTEUS_LOG_*
- Loading workers is now case-insensitive (@14ed4ef and @90a51ae)
- Build AKS from source (@e04890f)
- Use consistent custom exceptions (#30)
- Update Docker build commands to opt-in to all backends (#43)
- Renamed 'modelLoad' to 'workerLoad' and changed the behavior for 'modelLoad' (#27)
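The rename in #27 separates two operations that the old 'modelLoad' conflated: bringing up a worker (a named serving endpoint) versus registering a model that some worker serves. A toy registry sketch of that split (all names here are illustrative, not the amdinfer API):

```python
# Toy sketch of the workerLoad/modelLoad split from #27: workerLoad brings up
# a named worker, while modelLoad registers a model served by a worker.
# All names here are illustrative, not the amdinfer API.

class Server:
    def __init__(self):
        self.workers = {}   # worker name -> worker config
        self.models = {}    # model name  -> worker that serves it

    def worker_load(self, worker: str, config: dict) -> None:
        """Start a worker endpoint under its own name."""
        self.workers[worker] = config

    def model_load(self, model: str, worker: str) -> None:
        """Register a model, delegating inference to an existing worker."""
        if worker not in self.workers:
            raise KeyError(f"worker '{worker}' is not loaded")
        self.models[model] = worker


server = Server()
server.worker_load("xmodel", {"device": "fpga"})
server.model_load("resnet50", "xmodel")
print(server.models["resnet50"])  # -> xmodel
```

Under this split, clients that only care about a model name never need to know which worker implements it.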
Fixed
- Get the right request size in the batcher when enqueuing with the C++ API (@d1ad81d)
- Construct responses correctly in the XModel worker if there are multiple input buffers (@d1ad81d)
- Populate the right number of offsets in the hard batcher (@6666142)
- Calculate offset values correctly during batching (@8c7534b)
- Get correct library dependencies for production container (@14ed4ef)
- Correctly throw an exception if a worker gets an error during initialization (#29)
- Detect errors in HTTP client during loading (@99ffc33)
- Construct batches with the right sizes (#57)
New Contributors
- @amuralee-amd made their first contribution in #17
- @dependabot made their first contribution in #24
- @bpickrel made their first contribution in #34
Full Changelog: v0.1.0...v0.2.0
v0.1.0
Initial open-source release