[Serve] multiple submodels serve #429

Triggered via pull request January 23, 2025 16:48
Status Failure
Total duration 9m 21s

all-tests.yml

on: pull_request
Matrix: functional-tests-serve
megatron-report-clean / clean-report
12s
flagscale-report-clean / clean-report
Matrix: megatron-unit-tests
Matrix: flagscale-unit-tests
Waiting for pending jobs
Matrix: functional-tests-train
Waiting for pending jobs
Matrix: functional-tests-hetero
Waiting for pending jobs
flagscale-coverage-test / test-coverage
megatron-coverage-test / test-coverage
all-tests
0s

Annotations

15 errors
serve-build_dag / functional-test
Process completed with exit code 1.
megatron-dist_checkpointing / unit-test
Process completed with exit code 1.
megatron-root / unit-test
FailFast: cancelling since parallel instance has failed
megatron-distributed / unit-test
FailFast: cancelling since parallel instance has failed
megatron-export / unit-test
FailFast: cancelling since parallel instance has failed
megatron-fusions / unit-test
FailFast: cancelling since parallel instance has failed
megatron-inference / unit-test
FailFast: cancelling since parallel instance has failed
megatron-pipeline_parallel / unit-test
FailFast: cancelling since parallel instance has failed
megatron-ssm / unit-test
FailFast: cancelling since parallel instance has failed
megatron-tensor_parallel / unit-test
FailFast: cancelling since parallel instance has failed
megatron-models / unit-test
FailFast: cancelling since parallel instance has failed
megatron-transformer / unit-test
FailFast: cancelling since parallel instance has failed
megatron-transformer/moe / unit-test
FailFast: cancelling since parallel instance has failed
megatron-data / unit-test
FailFast: cancelling since parallel instance has failed
megatron-data / unit-test
The operation was canceled.