forked from SeldonIO/seldon-core
Commit
New documentation site for Seldon Core v2 (SeldonIO#5760)
New format compatible with GitBook

* moved docs out of the source directory and removed Sphinx-related files
* APIs section completed
* changing the configuration section in the getting started guide
* getting started section completed
* rearranged models directory and enhanced different docs
* added most images in the docs to the images directory
* moved outliers and drift docs to their own files in the root directory
* deleted servers directory and moved servers.md to the root directory with enhancements
* deleted pipelines dir and moved pipelines.md to the root directory
* deleted inference dir and moved inference.md to the root directory
* deleted explainers dir and moved explainers.md to the root directory
* deleted performance-tests dir and moved .md to the root directory
* deleted experiments dir and moved .md to the root directory
* updated about section to match GitBook's expected format
* updated FAQs section to match GitBook's expected format
* updated pandas query section with choice1.yaml
* mostly moved and renamed files and directories
* updated SUMMARY.md for GitBook
* adding additional images
* restructured development dir
* restructured and reformatted examples dir to match GitBook's md flavor
* added GitBook format to metrics dir
* restructured k8s directory to match GitBook's expected md flavor
* reformatted cli dir
* typos and links fixed
* typos and links fixed
* tentative structure added to the root of the docs
* fixed names in kubernetes section
* GITBOOK-1: changed hard-coded reference to scheduler.proto
* added reference to chainer.proto instead of hard-coded version
* removed hard-coded references and added GitHub Gist pointing to v2 branch
* fixed format and broken links

feat(docs): adding a mention of per component labels and annotations to the docs (SeldonIO#5931)

feat(docs): add documentation for HPA-based autoscaling (SeldonIO#5935)

This describes a solution for scaling both Models and Servers based on HPA for the case of single-model serving. In the example described in the docs, the scaling is done based on Model RPS metrics fetched from Prometheus.
Showing 384 changed files with 50,426 additions and 4,597 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,63 @@ | ||
# About | ||
|
||
Seldon V2 APIs provide a state-of-the-art solution for machine learning inference that | ||
can run locally on a laptop as well as on Kubernetes for production. | ||
|
||
{% embed url="https://www.youtube.com/watch?v=ar5lSG_idh4" %} | ||
|
||
## Features | ||
|
||
* A single platform for inference of a wide range of standard and custom artifacts. | ||
* Deploy locally in Docker during development and testing of models. | ||
* Deploy at scale on Kubernetes for production. | ||
* Deploy anything from single models to multi-step pipelines. | ||
* Save infrastructure costs by deploying multiple models transparently in inference servers. | ||
* Overcommit on resources to deploy more models than available memory. | ||
* Dynamically extend models with pipelines, using a data-centric perspective backed by Kafka. | ||
* Explain individual models and pipelines with state-of-the-art explanation techniques. | ||
* Deploy drift and outlier detectors alongside models. | ||
* Kubernetes service mesh agnostic: use the service mesh of your choice. | ||
|
||
|
||
## Core features and comparison to Seldon Core V1 APIs | ||
|
||
Our V2 APIs split core tasks into separate resources, allowing users to get started quickly | ||
by deploying a Model and then progressing to more complex Pipelines, Explanations and Experiments. | ||
|
||
![intro](images/intro.png) | ||
|
||
## Multi-model serving | ||
|
||
Seldon transparently provisions your model onto the correct inference server. | ||
|
||
![mms1](images/multimodel1.png) | ||
|
||
By packing multiple models onto a smaller set of servers, users can save infrastructure costs and | ||
serve their models efficiently. | ||
|
||
![mms2](images/mms.png) | ||
|
||
By allowing overcommit, users can provision more models than the available memory resources can hold, | ||
because Seldon transparently unloads models that are not in use. | ||
|
||
![mms3](images/overcommit.png) | ||
|
||
## Inference Servers | ||
|
||
Seldon V2 supports any V2 protocol inference server. At present, Seldon's MLServer and NVIDIA's Triton inference server are included automatically on install. These servers cover a wide range of artifacts, including custom Python models. | ||
|
||
![servers](images/servers.png) | ||
|
||
## Service Mesh Agnostic | ||
|
||
Seldon Core v2 can be integrated with any Kubernetes service mesh. There are currently examples with Istio, Ambassador and Traefik. | ||
|
||
![mesh](images/mesh.png) | ||
|
||
## Publication | ||
|
||
These features are influenced by our position paper on the next generation of ML model serving frameworks: | ||
|
||
*Title*: [Desiderata for next generation of ML model serving](http://arxiv.org/abs/2210.14665) | ||
|
||
*Workshop*: Challenges in deploying and monitoring ML systems workshop - NeurIPS 2022 |
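
As a rough illustration of the overcommit idea described in the Multi-model serving section of this file, here is a minimal Python sketch of a server that registers more models than fit in memory and unloads idle ones on demand. The class `OvercommitServer`, its methods, the unit-less memory sizes, and the least-recently-used eviction policy are all assumptions made for illustration. The source only says that models not in use are unloaded transparently, so this should not be read as Seldon's actual implementation.

```python
from collections import OrderedDict


class OvercommitServer:
    """Toy inference server (hypothetical) that overcommits its memory."""

    def __init__(self, capacity):
        self.capacity = capacity      # total memory units available
        self.registered = {}          # model name -> memory units required
        self.loaded = OrderedDict()   # loaded models, least recently used first

    def register(self, name, size):
        """Accept a model even if the sum of all registered sizes exceeds capacity."""
        if size > self.capacity:
            raise ValueError(f"model {name!r} can never fit on this server")
        self.registered[name] = size

    def infer(self, name):
        """Serve a request, reloading the model first if it was evicted."""
        if name not in self.registered:
            raise KeyError(name)
        if name in self.loaded:
            self.loaded.move_to_end(name)   # mark as most recently used
            return "hit"
        size = self.registered[name]
        # Assumed policy: evict least recently used models until the new one fits.
        while sum(self.loaded.values()) + size > self.capacity:
            self.loaded.popitem(last=False)
        self.loaded[name] = size
        return "loaded"


if __name__ == "__main__":
    server = OvercommitServer(capacity=2)
    for model in ("a", "b", "c"):           # three models, room for two
        server.register(model, 1)
    for model in ("a", "b", "c", "b", "a"):
        print(model, server.infer(model), list(server.loaded))
```

In this sketch, registration never fails because of other models, and the cost of overcommitting shows up only as a reload when a request arrives for a model that was evicted.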
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,127 @@ | ||
# Table of contents | ||
|
||
* [Home](README.md) | ||
* [Getting Started](getting-started/README.md) | ||
* [Docker Installation](getting-started/docker-installation.md) | ||
* [Kubernetes Installation](getting-started/kubernetes-installation/README.md) | ||
* [Ansible](getting-started/kubernetes-installation/ansible.md) | ||
* [Helm](getting-started/kubernetes-installation/helm.md) | ||
* [Security](getting-started/kubernetes-installation/security/README.md) | ||
* [AWS MSK mTLS](getting-started/kubernetes-installation/security/aws-msk-mtls.md) | ||
* [AWS MSK SASL](getting-started/kubernetes-installation/security/aws-msk-sasl.md) | ||
* [Azure Event Hub SASL Example](getting-started/kubernetes-installation/security/azure-event-hub-sasl.md) | ||
* [Confluent Cloud OAuth 2.0 Example](getting-started/kubernetes-installation/security/confluent-oauth.md) | ||
* [Confluent Cloud SASL Example](getting-started/kubernetes-installation/security/confluent-sasl.md) | ||
* [Strimzi mTLS Example](getting-started/kubernetes-installation/security/strimzi-mtls.md) | ||
* [Strimzi SASL Example](getting-started/kubernetes-installation/security/strimzi-sasl.md) | ||
* [Reference](getting-started/kubernetes-installation/security/reference.md) | ||
* [Configuration](getting-started/configuration.md) | ||
* [Seldon CLI](getting-started/cli.md) | ||
* [APIs](apis/README.md) | ||
* [Internal](apis/internal/README.md) | ||
* [Chainer](apis/internal/chainer.md) | ||
* [Agent](apis/internal/agent.md) | ||
* [Inference](apis/inference/README.md) | ||
* [Open Inference Protocol](apis/inference/v2.md) | ||
* [Scheduler](apis/scheduler.md) | ||
* [Architecture](architecture/README.md) | ||
* [DataFlow](architecture/dataflow.md) | ||
* [Examples](examples/README.md) | ||
* [Local examples](examples/local-examples.md) | ||
* [Kubernetes examples](examples/k8s-examples.md) | ||
* [Huggingface models](examples/huggingface.md) | ||
* [Model zoo](examples/model-zoo.md) | ||
* [Artifact versions](examples/multi-version.md) | ||
* [Pipeline examples](examples/pipeline-examples.md) | ||
* [Pipeline to pipeline examples](examples/pipeline-to-pipeline.md) | ||
* [Explainer examples](examples/explainer-examples.md) | ||
* [Custom Servers](examples/custom-servers.md) | ||
* [Local experiments](examples/local-experiments.md) | ||
* [Experiment version examples](examples/experiment-versions.md) | ||
* [Inference examples](examples/inference.md) | ||
* [Tritonclient examples](examples/tritonclient-examples.md) | ||
* [Batch Inference examples (kubernetes)](examples/batch-examples-k8s.md) | ||
* [Batch Inference examples (local)](examples/batch-examples-local.md) | ||
* [Checking Pipeline readiness](examples/pipeline-ready-and-metadata.md) | ||
* [Multi-Namespace Kubernetes](examples/k8s-clusterwide.md) | ||
* [Huggingface speech to sentiment with explanations pipeline](examples/speech-to-sentiment.md) | ||
* [Production image classifier with drift and outlier monitoring](examples/cifar10.md) | ||
* [Production income classifier with drift, outlier and explanations](examples/income.md) | ||
* [Conditional pipeline with pandas query model](examples/pandasquery.md) | ||
* [Kubernetes Server with PVC](examples/k8s-pvc.md) | ||
* [Local Overcommit](examples/k8s-pvc.md) | ||
* [Kubernetes](kubernetes/README.md) | ||
* [Scaling](kubernetes/scaling.md) | ||
* [Autoscaling](kubernetes/autoscaling.md) | ||
* [HPA Autoscaling in single-model serving](kubernetes/hpa-rps-autoscaling.md) | ||
* [Tracing](kubernetes/tracing.md) | ||
* [Storage Secrets](kubernetes/storage-secrets.md) | ||
* [Kafka](kubernetes/kafka.md) | ||
* [Metrics](kubernetes/metrics.md) | ||
* [Resources](kubernetes/resources/README.md) | ||
* [Model](kubernetes/resources/model.md) | ||
* [Experiment](kubernetes/resources/experiment.md) | ||
* [Pipeline](kubernetes/resources/pipeline.md) | ||
* [Server](kubernetes/resources/server.md) | ||
* [Server Config](kubernetes/resources/serverconfig.md) | ||
* [Seldon Runtime](kubernetes/resources/seldonruntime.md) | ||
* [Seldon Config](kubernetes/resources/seldonconfig.md) | ||
* [Service Meshes](kubernetes/service-meshes/README.md) | ||
* [Ambassador](kubernetes/service-meshes/ambassador.md) | ||
* [Istio](kubernetes/service-meshes/istio.md) | ||
* [Traefik](kubernetes/service-meshes/traefik.md) | ||
* [Resource allocation](resource-allocation/README.md) | ||
* [Example: Serving models on dedicated GPU nodes](resource-allocation/example-serving-models-on-dedicated-gpu-nodes.md) | ||
* [Models](models/README.md) | ||
* [Multi-Model Serving](models/mms.md) | ||
* [Inference Artifacts](models/inference-artifacts.md) | ||
* [rClone](models/rclone.md) | ||
* [Parameterized Models](models/parameterized-models/README.md) | ||
* [Pandas Query](models/parameterized-models/pandasquery.md) | ||
* [Metrics](metrics/README.md) | ||
* [Usage](metrics/usage.md) | ||
* [Operational](metrics/operational.md) | ||
* [Local Metrics](metrics/local-metrics-test.md) | ||
* [Development](development/README.md) | ||
* [License](development/licenses.md) | ||
* [Release](development/release.md) | ||
* [CLI](cli/README.md) | ||
* [Seldon](cli/seldon.md) | ||
* [Config](cli/seldon\_config.md) | ||
* [Config Activate](cli/seldon\_config\_activate.md) | ||
* [Config Deactivate](cli/seldon\_config\_deactivate.md) | ||
* [Config Add](cli/seldon\_config\_add.md) | ||
* [Config List](cli/seldon\_config\_list.md) | ||
* [Config Remove](cli/seldon\_config\_remove.md) | ||
* [Experiment](cli/seldon\_experiment.md) | ||
* [Experiment Start](cli/seldon\_experiment\_start.md) | ||
* [Experiment Status](cli/seldon\_experiment\_status.md) | ||
* [Experiment List](cli/seldon\_experiment\_list.md) | ||
* [Experiment Stop](cli/seldon\_experiment\_stop.md) | ||
* [Model](cli/seldon\_model.md) | ||
* [Model Status](cli/seldon\_model\_status.md) | ||
* [Model Load](cli/seldon\_model\_load.md) | ||
* [Model List](cli/seldon\_model\_list.md) | ||
* [Model Infer](cli/seldon\_model\_infer.md) | ||
* [Model Metadata](cli/seldon\_model\_metadata.md) | ||
* [Model Unload](cli/seldon\_model\_unload.md) | ||
* [Pipeline](cli/seldon\_pipeline.md) | ||
* [Pipeline Load](cli/seldon\_pipeline\_load.md) | ||
* [Pipeline Status](cli/seldon\_pipeline\_status.md) | ||
* [Pipeline List](cli/seldon\_pipeline\_list.md) | ||
* [Pipeline Inspect](cli/seldon\_pipeline\_inspect.md) | ||
* [Pipeline Infer](cli/seldon\_pipeline\_infer.md) | ||
* [Pipeline Unload](cli/seldon\_pipeline\_unload.md) | ||
* [Server](cli/seldon\_server.md) | ||
* [Server List](cli/seldon\_server\_list.md) | ||
* [Server Status](cli/seldon\_server\_status.md) | ||
* [Pipelines](pipelines.md) | ||
* [Experiments](experiments.md) | ||
* [Servers](servers.md) | ||
* [Inference](inference.md) | ||
* [Outlier Detection](outlier.md) | ||
* [Drift Detection](drift.md) | ||
* [Explainers](explainers.md) | ||
* [Performance Tests](performance-tests.md) | ||
* [Upgrading](upgrading.md) | ||
* [FAQ](faqs.md) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# APIs | ||
|
||
Seldon provides APIs for management and inference. | ||
|
||
* [API for inference](./inference/README.md) | ||
* [Scheduler API for management](./scheduler/README.md) (Advanced) | ||
* [Internal APIs](./internal/README.md) (Reference) |
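
To make the inference API entry above more concrete, the sketch below builds a REST request in the shape used by the Open Inference Protocol (V2), which the table of contents lists under Inference. The `/v2/models/{name}/infer` path and the `name`, `shape`, `datatype` and `data` fields come from that protocol. The helper `build_infer_request`, the model name `iris`, and the tensor name `predict` are hypothetical. The sketch only constructs the request and does not send it, so no host or port is assumed.

```python
import json


def build_infer_request(model_name, tensor_name, rows, datatype="FP32"):
    """Return (path, body) for an Open Inference Protocol REST inference call.

    `rows` is a non-empty list of equal-length lists of numbers.
    """
    if not rows or any(len(r) != len(rows[0]) for r in rows):
        raise ValueError("rows must be a non-empty, rectangular list of lists")
    body = {
        "inputs": [
            {
                "name": tensor_name,
                "shape": [len(rows), len(rows[0])],
                "datatype": datatype,
                # Tensor contents flattened in row-major order.
                "data": [value for row in rows for value in row],
            }
        ]
    }
    return f"/v2/models/{model_name}/infer", body


if __name__ == "__main__":
    # "iris" and "predict" are placeholder names, not taken from the docs above.
    path, body = build_infer_request("iris", "predict", [[5.1, 3.5, 1.4, 0.2]])
    print("POST", path)
    print(json.dumps(body, indent=2))
```

The resulting body can be posted as JSON to whichever endpoint exposes the inference server in a given installation.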