Releases: triton-inference-server/pytriton
PyTriton 0.5.3
- New: Relaxed wheel dependencies to avoid forced downgrading of protobuf and other packages in the NVIDIA 24.02 Docker containers for PyTorch and other frameworks.
- Version of Triton Inference Server embedded in wheel: 2.43.0
PyTriton 0.5.2
- Add: Added the TritonLifecyclePolicy parameter to the Triton class to control the lifecycle of the Triton Inference Server. The server can be started when entering the context (default behavior) or when the run or serve method is called; a second flag in this parameter indicates whether model configs are created in the local filesystem or passed to the Triton Inference Server and managed by it.
- Fix: ModelManager no longer raises tritonclient.grpc.InferenceServerException from the stop method when the HTTP endpoint is disabled in the Triton configuration.
- Fix: Methods can be used as the inference callable (see the sketch below).
- Version of Triton Inference Server embedded in wheel: 2.42.0
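A minimal sketch of the fix above, binding a bound method as the inference callable. The wrapper class, model name, tensor names, and shapes are illustrative assumptions, not part of the release notes.

```python
import numpy as np

from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import Triton


class Doubler:
    @batch
    def infer(self, data):
        # Inputs arrive as batched numpy arrays thanks to the @batch decorator.
        return {"result": data * 2}


with Triton() as triton:
    triton.bind(
        model_name="Doubler",
        infer_func=Doubler().infer,  # a bound method used as the inference callable
        inputs=[Tensor(name="data", dtype=np.float32, shape=(-1,))],
        outputs=[Tensor(name="result", dtype=np.float32, shape=(-1,))],
        config=ModelConfig(max_batch_size=16),
    )
    triton.serve()
```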
PyTriton 0.5.1
- Fix: ModelClient no longer raises gevent.exceptions.InvalidThreadUseError when destroyed in a different thread.
- Version of Triton Inference Server embedded in wheel: 2.42.0
PyTriton 0.5.0
- New: Decoupled models support
- New: AsyncioDecoupledModelClient, which works in async frameworks with decoupled Triton models such as some Large Language Models (see the sketch after this list).
- Fix: Fixed a bug that prevented getting the log level when the HTTP endpoint was disabled. Thanks @catwell
- Version of Triton Inference Server embedded in wheel: 2.41.0
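A hedged sketch of streaming partial results from a decoupled model with AsyncioDecoupledModelClient. The gRPC URL, model name, input layout, and the exact iteration pattern are assumptions for illustration, not taken from the release notes.

```python
import asyncio

import numpy as np

from pytriton.client import AsyncioDecoupledModelClient


async def main():
    # Decoupled models are served over gRPC; URL and model name are placeholders.
    async with AsyncioDecoupledModelClient("grpc://localhost:8001", "streaming_model") as client:
        # Each iteration is assumed to yield one partial response from the decoupled model.
        async for partial in client.infer_sample(np.array([b"a prompt"])):
            print(partial)


asyncio.run(main())
```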
PyTriton 0.4.2
- New: You can create a client from an existing client instance or model configuration to avoid loading model configuration from the server.
- New: Introduced a warning system using the warnings module.
- Fix: The experimental client for decoupled models prevents sending another request while responses from the previous request are not consumed, and blocks close until the stream is stopped.
- Fix: Leak of ModelClient during Triton creation.
- Fix: Fixed undeclared project dependencies (removed them from the code or added them to the package dependencies).
- Fix: Remote model is unloaded from Triton when RemoteTriton is closed.
- Version of Triton Inference Server embedded in wheel: 2.39.0
PyTriton 0.4.1
- New: The location of workspaces with temporary Triton model repositories and communication socket files can be configured via the $PYTRITON_HOME environment variable (see the sketch after this list).
- Fix: Restored handling of KeyboardInterrupt in triton.serve().
- Fix: Remove limit for handling bytes dtype tensors
- Build scripts update
- Added support for arm64 platform builds
- Version of Triton Inference Server embedded in wheel: 2.39.0
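A minimal sketch of the workspace override described above; the directory path is an arbitrary example.

```python
import os

# Point PyTriton's workspaces (temporary model repositories and socket files)
# at a custom directory; set the variable before the server is started.
os.environ["PYTRITON_HOME"] = "/var/tmp/pytriton_workspaces"

from pytriton.triton import Triton

with Triton() as triton:
    ...  # bind models and serve as usual
```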
PyTriton 0.4.0
- New: Remote Mode - PyTriton can be used to connect to a remote Triton Inference Server (see the sketch after this list).
  - Introduced the RemoteTriton class, which connects to a remote Triton Inference Server running on the same machine by passing the Triton URL.
  - Changed the Triton lifecycle: the Triton Inference Server is now started when entering the context. This allows models to be loaded dynamically into the running server when the bind method is called. It is still possible to create a Triton instance without entering the context and bind models before starting the server (in that case the models are lazy-loaded when the run or serve method is called, as before).
  - In the RemoteTriton class, calling the enter or connect method connects to the Triton server, so models can be safely loaded while binding inference functions (if RemoteTriton is used without a context manager, models are lazy-loaded when the connect or serve method is called).
- Change: The "batch" decorator raises a ValueError if any of the outputs have a different batch size than expected.
- Fix: gevent resource leak in FuturesModelClient.
- Version of Triton Inference Server embedded in wheel: 2.36.0
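A hedged sketch of Remote Mode: attaching to an already running Triton Inference Server and binding a @batch inference callable to it. The URL, model name, tensor names, and shapes are illustrative assumptions.

```python
import numpy as np

from pytriton.decorators import batch
from pytriton.model_config import ModelConfig, Tensor
from pytriton.triton import RemoteTriton


@batch
def add_one(data):
    # Returning an output whose batch size differs from the input now raises ValueError.
    return {"result": data + 1}


# Entering the context connects to the running server, so the model is loaded
# dynamically as soon as bind() is called.
with RemoteTriton(url="localhost") as triton:
    triton.bind(
        model_name="AddOne",
        infer_func=add_one,
        inputs=[Tensor(name="data", dtype=np.float32, shape=(-1,))],
        outputs=[Tensor(name="result", dtype=np.float32, shape=(-1,))],
        config=ModelConfig(max_batch_size=32),
    )
    triton.serve()
```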
PyTriton 0.3.1
- Fix: Addressed potential instability in shared memory management.
- Change: KeyboardInterrupt is now handled in triton.serve(). PyTriton hosting scripts return an exit code of 0 instead of 130 when they receive a SIGINT signal.
- Version of Triton Inference Server embedded in wheel: 2.36.0
PyTriton 0.3.0
- New: Support for multiple Python versions starting from 3.8
- New: Added support for decoupled models, enabling results streaming from models (alpha state)
- Change: Upgraded Triton Inference Server binaries to version 2.36.0. Note that this version of Triton Inference Server requires glibc 2.35 or newer.
- Version of Triton Inference Server embedded in wheel: 2.36.0
PyTriton 0.2.5
- New: Allow executing multiple PyTriton instances in the same process and/or host
- Fix: Invalid flags for the Proxy Backend configuration passed to Triton
- Version of Triton Inference Server embedded in wheel: 2.33.0