
[WIP] feat(serving): light weighted traffic control for inference #184

Open
wants to merge 1 commit into master from feat/traffic-control
Conversation

@ccchenjiahuan (Contributor) commented Sep 26, 2021

I. Describe what this PR does

Functions currently implemented:

  • istio-less ingress gateway
  • kubedl serving reconcile logic modification

I have done some testing on the ingress gateway part, but haven't had a chance to test the changes with KubeDL yet, so I think we can review the gateway part first.

Functions known to be needed but not yet implemented:

  • The ingress gateway currently watches all ingresses; it needs to be configured to watch only the ingresses related to a specific Inference CR
  • A dedicated service account for the ingress gateway pod still needs to be created during serving reconciliation

II. Does this pull request fix one issue?

resolves #160

@ccchenjiahuan force-pushed the feat/traffic-control branch 2 times, most recently from 1f6a65f to 99e0999 on September 26, 2021 16:12
@codecov-commenter commented Sep 26, 2021

Codecov Report

Merging #184 (1bdb54e) into master (f408cc8) will increase coverage by 0.88%.
The diff coverage is 0.00%.


@@            Coverage Diff             @@
##           master     #184      +/-   ##
==========================================
+ Coverage   21.74%   22.62%   +0.88%     
==========================================
  Files          73       75       +2     
  Lines        4374     4557     +183     
==========================================
+ Hits          951     1031      +80     
- Misses       3292     3392     +100     
- Partials      131      134       +3     
Flag Coverage Δ
unittests 22.62% <0.00%> (+0.88%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
controllers/serving/inference_controller.go 0.00% <0.00%> (ø)
controllers/serving/utils.go 0.00% <0.00%> (ø)
controllers/tensorflow/tfjob_controller.go 44.77% <0.00%> (-2.24%) ⬇️
pkg/job_controller/job.go 13.61% <0.00%> (ø)
controllers/mars/marsjob_controller.go 0.00% <0.00%> (ø)
controllers/xgboost/xgboostjob_controller.go 0.00% <0.00%> (ø)
pkg/gang_schedule/batch_scheduler/scheduler.go 68.88% <0.00%> (ø)
pkg/gang_schedule/volcano_scheduler/scheduler.go 68.88% <0.00%> (ø)
pkg/gang_schedule/coscheduler/scheduler.go 64.70% <0.00%> (+61.58%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f408cc8...1bdb54e. Read the comment docs.

feat(serving): add kubedl serving modifications

Signed-off-by: ccchenjiahuan <[email protected]>
@ccchenjiahuan changed the title from feat(serving): light weighted traffic control for inference to [WIP] feat(serving): light weighted traffic control for inference on Sep 26, 2021
@ccchenjiahuan (Contributor, Author): /cc @SimonCqk @jian-he

// 3) If inference serves multiple model version simultaneously and canary policy has been set,
//// 4) If inference serves multiple model version simultaneously and canary policy has been set,
//// serving traffic will be distributed with different ratio and routes to backend service.
//if len(inference.Spec.Predictors) > 1 {
Collaborator: remove if it is not needed anymore?

Contributor Author: ok

Namespace: inf.Namespace,
},
Spec: networkingv1beta1.IngressSpec{
Rules: []networkingv1beta1.IngressRule{
Collaborator: what's the entry host name of this ingress?

Contributor Author (Oct 8, 2021): I don't think we need a host name. The default is *, and we distinguish the traffic by path.
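For illustration, a minimal sketch of such a host-less, path-based rule (the helper name, port, and path scheme are assumptions for this sketch, not code from the PR):

package serving

import (
    networkingv1beta1 "k8s.io/api/networking/v1beta1"
    "k8s.io/apimachinery/pkg/util/intstr"
)

// ingressRuleFor builds a rule with no Host set (so it matches "*")
// and routes by path to the predictor's backing service.
func ingressRuleFor(serviceName string, port int) networkingv1beta1.IngressRule {
    return networkingv1beta1.IngressRule{
        // Host omitted: the rule applies to any host.
        IngressRuleValue: networkingv1beta1.IngressRuleValue{
            HTTP: &networkingv1beta1.HTTPIngressRuleValue{
                Paths: []networkingv1beta1.HTTPIngressPath{{
                    Path: "/" + serviceName,
                    Backend: networkingv1beta1.IngressBackend{
                        ServiceName: serviceName,
                        ServicePort: intstr.FromInt(port),
                    },
                }},
            },
        },
    }
}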

return nil
}

func (tc TrafficControl) ServeHTTP(w http.ResponseWriter, r *http.Request, next caddyhttp.Handler) error {
Collaborator: is the receiver (tc) of the ServeHTTP method intended to be a non-pointer receiver?

Contributor Author: yes, I wrote this plugin according to https://caddyserver.com/docs/extending-caddy
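For reference, the Caddy v2 extending-caddy guide does register HTTP middleware with value receivers; a minimal sketch in that style (the package name and module ID here are illustrative):

package trafficcontrol

import (
    "net/http"

    "github.com/caddyserver/caddy/v2"
    "github.com/caddyserver/caddy/v2/modules/caddyhttp"
)

func init() {
    caddy.RegisterModule(TrafficControl{})
}

// TrafficControl is an HTTP middleware module.
type TrafficControl struct{}

// CaddyModule returns the Caddy module information.
func (TrafficControl) CaddyModule() caddy.ModuleInfo {
    return caddy.ModuleInfo{
        ID:  "http.handlers.traffic_control",
        New: func() caddy.Module { return new(TrafficControl) },
    }
}

// ServeHTTP uses a value receiver, matching the style of the docs.
func (tc TrafficControl) ServeHTTP(w http.ResponseWriter, r *http.Request, next caddyhttp.Handler) error {
    // ...traffic-splitting logic would run here...
    return next.ServeHTTP(w, r)
}

// Interface guard.
var _ caddyhttp.MiddlewareHandler = (*TrafficControl)(nil)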

@@ -0,0 +1,135 @@
package istio_less_ingress_controller
Collaborator: looks good at first glance, and it would be nicer if you added more comments :)

}

// parseCaddyfile unmarshals tokens from h into a new Middleware.
func parseCaddyfile(h httpcaddyfile.Helper) (caddyhttp.MiddlewareHandler, error) {
Collaborator: it seems to actually be a no-op?

Contributor Author: Yes, nothing is done right now, but I kept it so that parameters can be added later.
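A sketch of how the currently empty parser could later consume arguments through the Helper's embedded Dispenser (the option shown is hypothetical):

import (
    "github.com/caddyserver/caddy/v2/caddyconfig/httpcaddyfile"
    "github.com/caddyserver/caddy/v2/modules/caddyhttp"
)

// parseCaddyfile unmarshals tokens from h into a new Middleware.
func parseCaddyfile(h httpcaddyfile.Helper) (caddyhttp.MiddlewareHandler, error) {
    var tc TrafficControl
    // No arguments are read today; a future parameter could be
    // consumed like this:
    // for h.Next() {
    //     if h.NextArg() {
    //         tc.DefaultBackend = h.Val() // hypothetical field
    //     }
    // }
    return tc, nil
}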

type ingressCache struct {
mutex sync.Mutex

hostToIngressEntry map[string][]ingressEntry
Collaborator: hostToIngressEntry caches all potential in-cluster hosts and maps each to a list of ingress entries (one per predictor), so the istio-less ingress controller seems to be a cluster-scoped deployment? However, each Inference service creates one; is that expected?

Contributor Author: I listed this point in the todo list above; it needs to be configured to watch only the ingresses related to each predictor.
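To make the cache's shape concrete, a sketch around the snippet above (only the two cache fields come from the PR; ingressEntry's fields are assumptions):

package istio_less_ingress_controller

import "sync"

// ingressEntry is one routing target for a host (fields illustrative).
type ingressEntry struct {
    path        string
    serviceName string
    weight      int // canary weight in percent
}

type ingressCache struct {
    mutex sync.Mutex

    hostToIngressEntry map[string][]ingressEntry
}

// update replaces a host's entries under the lock.
func (c *ingressCache) update(host string, entries []ingressEntry) {
    c.mutex.Lock()
    defer c.mutex.Unlock()
    c.hostToIngressEntry[host] = entries
}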

@@ -335,6 +430,72 @@ func (ir *InferenceReconciler) syncServiceForInference(inf *servingv1alpha1.Infe
return nil
}

// syncIstioLessIngressGateway sync the istio-less ingress gateway for inference service.
func (ir *InferenceReconciler) syncIstioLessIngressGateway(inf *servingv1alpha1.Inference) error {
Collaborator: it's better to rename it to caddyIngressGateway

return err
}
svcExists := true
if err != nil && errors.IsNotFound(err) {
Collaborator: these two if conditions can be combined into a single if err != nil { ... } block
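A sketch of the combined form, using the names from the snippet above; it would drop into the existing function in place of the two separate checks:

// One err != nil block covers both the unexpected-error and the
// not-found outcomes of the preceding Get call.
svcExists := true
if err != nil {
    if !errors.IsNotFound(err) {
        return err
    }
    svcExists = false
}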

return ir.client.Create(context.Background(), &gateway)
}

if !reflect.DeepEqual(gateway.Spec, gatewayInCluster.Spec) {
Collaborator: I think we should not reset the spec every time, in case we need to manually change something.

Contributor Author: I don't quite understand what you mean here; the trigger condition here is a manual change to the spec.

igExists := true
igName := genPredictorName(inf, predictor)
err := ir.client.Get(context.Background(), types.NamespacedName{Namespace: inf.Namespace, Name: igName}, &igInCluster)
if err != nil && !errors.IsNotFound(err) {
Collaborator: similarly, these two if conditions can be combined into a single if err != nil { ... } block

@@ -22,6 +22,12 @@ import (
"github.com/alibaba/kubedl/apis/serving/v1alpha1"
)

const (
CANARY_WEIGHT = "kubedl.kubernetes.io/canary-weight"
Collaborator: name it kubedl.io to be consistent
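That is, the annotation key would move to the kubedl.io domain (a one-line sketch; the final key is the maintainers' call):

// Consistent with the kubedl.io naming the reviewer suggests.
const CANARY_WEIGHT = "kubedl.io/canary-weight"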

// |---request---> VirtualService
// |--- 90% ---> Deploy-Of-Model-A.1
// |--- 10% ---> Deploy-Of-Model-B.1
func (ir *InferenceReconciler) syncPredictorTrafficDistribution(inf *servingv1alpha1.Inference, index int, predictor *servingv1alpha1.PredictorSpec) error {
Collaborator: can we make it pluggable so that both Istio-based and Caddy-based traffic control can be configured? @SimonCqk

Contributor Author: yes, I think that would be better; it may require a global configuration.
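A hypothetical sketch of that pluggable design: one interface, Istio- and Caddy-backed implementations, and a constructor driven by a global configuration value (all names are illustrative, not from this PR):

package trafficcontrol

import (
    servingv1alpha1 "github.com/alibaba/kubedl/apis/serving/v1alpha1"
)

// TrafficDistributor abstracts how canary traffic is split across predictors.
type TrafficDistributor interface {
    SyncTrafficDistribution(inf *servingv1alpha1.Inference, index int, predictor *servingv1alpha1.PredictorSpec) error
}

type istioDistributor struct{}

// SyncTrafficDistribution would create or update Istio VirtualService
// weights, as the existing syncPredictorTrafficDistribution path does.
func (d *istioDistributor) SyncTrafficDistribution(inf *servingv1alpha1.Inference, index int, predictor *servingv1alpha1.PredictorSpec) error {
    return nil
}

type caddyDistributor struct{}

// SyncTrafficDistribution would write canary-weight annotations onto
// the ingresses consumed by the caddy ingress gateway.
func (d *caddyDistributor) SyncTrafficDistribution(inf *servingv1alpha1.Inference, index int, predictor *servingv1alpha1.PredictorSpec) error {
    return nil
}

// NewDistributor selects a backend from a global configuration value.
func NewDistributor(backend string) TrafficDistributor {
    if backend == "istio" {
        return &istioDistributor{}
    }
    return &caddyDistributor{}
}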

package istio_less_ingress_controller

const (
CANARY_WEIGHT = "kubedl.kubernetes.io/canary-weight"
Collaborator: this const is duplicated in utils.go; this file can be removed

k8s.io/apimachinery v0.20.7
k8s.io/client-go v0.20.7
k8s.io/klog v1.0.0
k8s.io/utils v0.0.0-20210820185131-d34e5cb4466e // indirect
Collaborator: we shouldn't have another go.mod file in a sub-package, should we?

Contributor Author: Actually, I think it would be better placed in a separate repo. But even if it stays under kubedl, I think it's OK, because this service is more like a plugin and should not pollute the main go.mod.


go 1.15

require (
Collaborator: should the traffic_control package be located under pkg/ instead of the root? @SimonCqk

@@ -110,7 +113,13 @@ func (ir *InferenceReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error)
return ctrl.Result{}, err
}

// 2) Sync each predictor to deploy containers mounted with specific model.
// 2) Sync istio-less ingress gateway
if err = ir.syncIstioLessIngressGateway(&inference); err != nil {
Collaborator: don't call it "istio-less"; call it caddyIngress to be explicit, and rename the Docker image accordingly.

Collaborator: +1

Contributor Author: ok

Development

Successfully merging this pull request may close these issues.

[summer of code] light-weighted traffic control for inference
4 participants