Releases: Xilinx/inference-server

v0.4.0

07 Sep 16:50
9fde9bb

Full Changelog: v0.3.0...v0.4.0

v0.3.0

01 Feb 15:34
8b0d632

Added

  • Allow building Debian package (#59)
  • Add modelInferAsync to the API (@2f4a6c2)
  • Add inferAsyncOrdered as a client operator for making inferences in parallel (#66)
  • Support building Python wheels with cibuildwheel (#71)
  • Support XModels with multiple output tensors (#74)
  • Add FP16 support (#76)
  • Add more documentation (#85, #90)
  • Add Python bindings for gRPC and Native clients (#88)
  • Add tests with KServe (#90)
  • Add batch size flag to examples (#94)
  • Add Kubernetes test for KServe (#95)
  • Use exhale to generate Python API documentation (#95)
  • OpenAPI spec for REST protocol (#100)
  • Use a timer for simpler time measurement (#104)
  • Allow building containers with custom backend versions (#107)
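
The inferAsyncOrdered entry above describes making inferences in parallel while keeping responses in request order. The following is a minimal sketch of that general pattern using only the Python standard library. The names `fake_infer` and `infer_async_ordered` are hypothetical stand-ins for illustration and are not the amdinfer API.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fake_infer(request: int) -> int:
    """Hypothetical stand-in for one blocking inference call."""
    # Higher-numbered requests sleep less, so they tend to finish first.
    time.sleep(0.01 * (5 - request))
    return request * 2


def infer_async_ordered(requests, infer=fake_infer, max_workers=4):
    """Submit all requests in parallel and return responses in request order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(infer, r) for r in requests]
        # Reading the futures in submission order keeps the responses
        # aligned with the requests, whichever call completes first.
        return [f.result() for f in futures]


responses = infer_async_ordered([0, 1, 2, 3, 4])
print(responses)
```

Collecting results by iterating the futures list, rather than in completion order, is what makes the output ordered; the actual client may implement this differently.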

Changed

  • Refactor pre- and post-processing functions in C++ (@42cf748)
  • Templatize Dockerfile for different base images (#71)
  • Use multiple HTTP clients internally for parallel HTTP requests (#66)
  • Update test asset downloading (#81)
  • Reimplement and align examples across platforms (#85)
  • Reorganize Python library (#88)
  • Rename 'proteus' to 'amdinfer' (#91)
  • Use Ubuntu 20.04 by default for Docker (#97)
  • Bump up to ROCm 5.4.1 (#99)
  • Rename some functions for style consistency (#102)
  • Bump up to ZenDNN 4.0 (#113)

Deprecated

  • ALL_CAPS style enums for the DataType (#102)

Removed

  • Mappings between XIR data types and inference server data types from the public API (#102)
  • Web GUI (#110)

Fixed

  • Use input tensors in requests correctly (#61)
  • Fix bug with multiple input tensors (#74)
  • Align gRPC responses using non-gRPC-native data types with other input protocols (#81)
  • Fix the Manager’s destructor (#88)
  • Fix using --no-user-config with proteus run (#89)
  • Handle assigning user permissions if the host UID is the same as the UID in the container (#101)
  • Fix test discovery if some test assets are missing (#105)
  • Fix gRPC queue shutdown race condition (#111)

Full Changelog: v0.2.0...v0.3.0

v0.2.0

05 Aug 15:54
4090747

Added

  • HTTP/REST C++ client (@cbf33b8)
  • gRPC API based on KServe v2 API (@37a6aad and others)
  • TensorFlow/PyTorch + ZenDNN backend (#17 and #21)
  • ‘ServerMetadata’ endpoint to the API (@7747911)
  • ‘modelList’ endpoint to the API (@7477b7d)
  • Parse JSON data as string in HTTP body (@694800e)
  • Directory monitoring for model loading (@6459797)
  • ‘ModelMetadata’ endpoint to the API (@22b9d1a)
  • MIGraphX backend (#34)
  • Pre-commit for style verification (@048bdd7)

Changed

  • Use Pybind11 to create Python API (#20)
  • Two logs are created now: server and client
  • Logging macro is now PROTEUS_LOG_*
  • Loading workers is now case-insensitive (@14ed4ef and @90a51ae)
  • Build AKS from source (@e04890f)
  • Use consistent custom exceptions (#30)
  • Update Docker build commands to opt-in to all backends (#43)
  • Rename 'modelLoad' to 'workerLoad' and change the behavior of 'modelLoad' (#27)

Fixed

  • Get the right request size in the batcher when enqueuing with the C++ API (@d1ad81d)
  • Construct responses correctly in the XModel worker if there are multiple input buffers (@d1ad81d)
  • Populate the right number of offsets in the hard batcher (@6666142)
  • Calculate offset values correctly during batching (@8c7534b)
  • Get correct library dependencies for production container (@14ed4ef)
  • Correctly throw an exception if a worker gets an error during initialization (#29)
  • Detect errors in HTTP client during loading (@99ffc33)
  • Construct batches with the right sizes (#57)
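
Several of the fixes above concern batch sizes and per-request offsets. As a rough illustration of what an offset means when requests are packed into fixed-size batches, here is a hypothetical sketch. It assumes each request carries a known number of input tensors and is never split across batches; the server's actual batchers may behave differently, and `build_batches` is not a function from this project.

```python
def build_batches(request_sizes, batch_size):
    """Pack requests into batches of at most batch_size tensors.

    Returns a list of batches, each a list of (request_index, offset)
    pairs, where offset is the request's starting slot in its batch.
    """
    batches = []
    current = []
    used = 0
    for index, size in enumerate(request_sizes):
        if size > batch_size:
            # Assumption of this sketch: a request must fit in one batch.
            raise ValueError(f"request {index} exceeds the batch size")
        if used + size > batch_size:
            # The request does not fit, so close this batch and start a new one.
            batches.append(current)
            current, used = [], 0
        current.append((index, used))
        used += size
    if current:
        batches.append(current)
    return batches


batches = build_batches([2, 1, 3, 2], batch_size=4)
print(batches)
```

The offset of each request is the running total of the sizes placed before it in the same batch, which is why it resets to zero whenever a new batch starts.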

Full Changelog: v0.1.0...v0.2.0

v0.1.0

09 Feb 01:30
65088cc

Initial open-source release