Releases: triton-inference-server/pytriton

PyTriton 0.2.4

11 Aug 16:17
  • new: Introduced the strict flag in Triton.bind, which enables validation of the data types and shapes of inference
    callable outputs against the model config
  • new: Added AsyncioModelClient, which works in FastAPI and other async frameworks
  • fix: FuturesModelClient no longer raises gevent.exceptions.InvalidThreadUseError
  • fix: TimeoutError is no longer thrown if the client could not connect to the server during model verification
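
To make the effect of the strict flag concrete, here is a minimal, standard-library-only sketch of the kind of check it describes: comparing each output of an inference callable against a declared data type and shape. It does not use PyTriton. The names OutputSpec, shape_of, flatten and validate_outputs, and the convention that -1 marks a dimension of any size, are assumptions made for illustration and are not PyTriton's implementation.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, Tuple


@dataclass(frozen=True)
class OutputSpec:
    """Hypothetical stand-in for one declared model output."""

    dtype: type
    shape: Tuple[int, ...]  # -1 means "any size" (assumed convention)


def shape_of(value: Any) -> Tuple[int, ...]:
    """Shape of a nested list, assuming every level is rectangular."""
    dims = []
    while isinstance(value, list):
        dims.append(len(value))
        value = value[0] if value else None
    return tuple(dims)


def flatten(value: Any) -> Iterator[Any]:
    """Yield the scalar leaves of a nested list."""
    if isinstance(value, list):
        for item in value:
            yield from flatten(item)
    else:
        yield value


def validate_outputs(outputs: Dict[str, Any], specs: Dict[str, OutputSpec]) -> None:
    """Raise if an output is undeclared or has the wrong shape or dtype."""
    for name, value in outputs.items():
        if name not in specs:
            raise ValueError(f"unexpected output {name!r}")
        spec = specs[name]
        actual = shape_of(value)
        same_rank = len(actual) == len(spec.shape)
        if not same_rank or any(e not in (-1, a) for e, a in zip(spec.shape, actual)):
            raise ValueError(f"{name!r}: shape {actual} does not match {spec.shape}")
        if not all(isinstance(leaf, spec.dtype) for leaf in flatten(value)):
            raise TypeError(f"{name!r}: expected elements of type {spec.dtype.__name__}")


# One output named "scores": any number of rows, exactly two float columns.
specs = {"scores": OutputSpec(dtype=float, shape=(-1, 2))}
validate_outputs({"scores": [[0.1, 0.9], [0.4, 0.6]]}, specs)  # passes silently
```

Failing early with a descriptive error, as in this sketch, is the general motivation for such a flag: a mismatch is reported where the callable returns rather than surfacing later as a harder-to-trace failure.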

PyTriton 0.2.3

21 Jul 12:45
  • Improved verification of the Proxy Backend environment when running under the same Python interpreter
  • Fixed pytriton.version to represent the currently installed version

PyTriton 0.2.2

20 Jul 11:19
  • Added the inference_timeout_s parameter to client classes
  • Renamed PyTritonClientUrlParseError to PyTritonClientInvalidUrlError
  • ModelClient and FuturesModelClient methods raise PyTritonClientClosedError when used after the client is closed
  • Pinned tritonclient dependency due to issues with tritonclient >= 2.34 on systems with glibc version lower than 2.34
  • Added a warning after Triton Server setup and teardown when the logging level is too verbose, as it may cause a significant performance drop in model inference

PyTriton 0.2.1

29 Jun 07:30
  • Fixed handling TritonConfig.cache_directory option - the directory was always overwritten with the default value.
  • Fixed tritonclient dependency - PyTriton needs a tritonclient that supports HTTP headers and parameters
  • Improved shared memory usage to fit the 64 MB limit (the default value for Docker and Kubernetes) by reducing the initial size for the PyTriton Proxy Backend.

PyTriton 0.2.0

31 May 05:08
  • Added support for using custom HTTP/gRPC request headers and parameters.
    See docs/custom_params.md for further information.

    This change breaks backward compatibility of the inference function signature.
    The undecorated inference function now accepts a list of Request instances instead
    of a list of dictionaries. The Request class contains data for the inputs and parameters
    for the combined parameters and headers. For details see docs/inference_callable.md.

  • Added FuturesModelClient, which enables sending inference requests in parallel.

  • Added display of the documentation link after models are loaded.

  • Version of Triton Inference Server embedded in wheel: 2.33.0
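
The signature change introduced in 0.2.0 can be sketched as follows. This is an illustration only: the Request class below is a local stand-in with the data and parameters attributes described in the release note, not the class shipped with PyTriton, and the "values" input, the "scale" parameter and both function names are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Request:
    """Local stand-in for illustration; not imported from PyTriton."""

    data: Dict[str, Any]  # input name -> input value
    parameters: Dict[str, Any] = field(default_factory=dict)  # parameters and headers


def infer_before(requests: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Pre-0.2.0 style: every request is a plain dictionary of inputs."""
    return [{"scaled": [2 * x for x in request["values"]]} for request in requests]


def infer_after(requests: List[Request]) -> List[Dict[str, Any]]:
    """0.2.0 style: inputs are read from request.data, extras from request.parameters."""
    responses = []
    for request in requests:
        scale = request.parameters.get("scale", 2)  # hypothetical custom parameter
        responses.append({"scaled": [scale * x for x in request.data["values"]]})
    return responses


old_style = infer_before([{"values": [1, 2]}])
new_style = infer_after([Request(data={"values": [1, 2]}, parameters={"scale": 3})])
```

When migrating an undecorated inference function, the main change in this sketch is replacing direct dictionary access on the request with access through request.data.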

PyTriton 0.1.5

12 May 14:54
  • Improved pytriton.decorators.group_by_values function
    • Modified the function to avoid calling the inference callable on each individual sample when grouping by string/bytes input
    • Added pad_fn argument for easy padding and combining of the inference results
  • Fixed Triton binaries search
  • Improved Workspace management (remove workspace on shutdown)
  • Version of external components used during testing:
    • Triton Inference Server: 2.29.0
    • Other component versions depend on the framework used and on the Triton Inference Server container version.
      Refer to its support matrix for a detailed summary.
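
The idea behind the group_by_values improvement in 0.1.5, calling the inference callable once per group of equal values rather than once per sample, can be sketched in plain Python. This is not the decorator's implementation: call_grouped, fake_infer and the sample layout are hypothetical, and the pad_fn argument is not covered here.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Sample = Dict[str, Any]


def call_grouped(
    samples: List[Sample],
    key: str,
    infer: Callable[[List[Sample]], List[Any]],
) -> List[Any]:
    """Call infer once per distinct value of samples[i][key], keeping input order."""
    groups = defaultdict(list)
    for index, sample in enumerate(samples):
        groups[sample[key]].append(index)

    results: List[Any] = [None] * len(samples)
    for indices in groups.values():
        outputs = infer([samples[i] for i in indices])
        # Scatter each group's outputs back to the original positions.
        for i, output in zip(indices, outputs):
            results[i] = output
    return results


calls = []  # records the size of every group passed to fake_infer


def fake_infer(group: List[Sample]) -> List[str]:
    calls.append(len(group))
    return [sample["text"].upper() for sample in group]


batch = [
    {"lang": b"en", "text": "a"},
    {"lang": b"de", "text": "b"},
    {"lang": b"en", "text": "c"},
]
results = call_grouped(batch, "lang", fake_infer)
```

With three samples and two distinct bytes values, fake_infer runs twice instead of three times, and the results come back in the original sample order.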

PyTriton v0.1.4

16 Mar 16:28

0.1.4 (2023-03-16)

  • Added validation of the model name passed to the Triton bind method.
  • Added monkey patching of the InferenceServerClient.__del__ method to prevent unhandled exceptions.
  • Version of external components used during testing:
    • Triton Inference Server: 2.29.0
    • Other component versions depend on the framework used and on the Triton Inference Server container version.
      Refer to its support matrix for a detailed summary.

PyTriton v0.1.3

21 Feb 14:56

0.1.3 (2023-02-20)

  • Fixed getting model config in fill_optionals decorator.
  • Version of external components used during testing:
    • Triton Inference Server: 2.29.0
    • Other component versions depend on the framework used and on the Triton Inference Server container version.
      Refer to its support matrix for a detailed summary.

PyTriton v0.1.2

14 Feb 16:01
  • Fixed wheel build to support installations on operating systems with glibc version 2.31 or higher.
  • Updated the documentation on custom builds of the package.
  • Change: TritonContext instance is shared across bound models and contains model_configs dictionary.
  • Fixed support for binding multiple models that use methods of the same class.
  • Version of external components used during testing:
    • Triton Inference Server: 2.29.0
    • Other component versions depend on the framework used and on the Triton Inference Server container version.
      Refer to its support matrix for a detailed summary.

PyTriton v0.1.1

01 Feb 09:50
  • Change: The @first_value decorator has been updated with new features:
    • Renamed from @first_values to @first_value
    • Added a strict flag to toggle the checking of equality of values on a single selected input of the request. Default is True
    • Added a squeeze_single_values flag to toggle the squeezing of single value ND arrays to scalars. Default is True
  • Fix: @fill_optionals now supports non-batching models
  • Fix: @first_value fixed to work with optional inputs
  • Fix: @group_by_values fixed to work with string inputs
  • Fix: @group_by_values fixed to work sample-wise
  • Version of external components used during testing:
    • Triton Inference Server: 2.29.0
    • Other component versions depend on the framework used and on the Triton Inference Server container version.
      Refer to its support matrix for a detailed summary.
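
The two @first_value flags described for 0.1.1 can be illustrated with a small plain-Python sketch. It models a batched input as a list of per-sample values and is only an approximation of the documented behaviour: the function below is a hypothetical helper that shares the decorator's name for readability, not the decorator itself.

```python
from typing import Any, List


def first_value(batch: List[Any], strict: bool = True, squeeze_single_values: bool = True) -> Any:
    """Return the first per-sample value of a batched input.

    strict: raise if the samples in the batch do not all carry the same value.
    squeeze_single_values: unwrap single-element nested lists down to a scalar.
    """
    if not batch:
        raise ValueError("batch must contain at least one sample")
    first = batch[0]
    if strict and any(item != first for item in batch[1:]):
        raise ValueError("values differ across the batch")
    if squeeze_single_values:
        while isinstance(first, list) and len(first) == 1:
            first = first[0]
    return first


temperature = first_value([[0.7], [0.7], [0.7]])  # same value in every sample
unsqueezed = first_value([[0.7], [0.7]], squeeze_single_values=False)
lenient = first_value([[1], [2]], strict=False)  # mismatch is ignored
```

In this sketch, strict guards against silently dropping per-sample differences, while squeeze_single_values saves the caller from unwrapping a single-element value by hand.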