Trial Component: Failed to retrieve model package details. #170

Open
mouhannadali opened this issue Nov 2, 2022 · 4 comments

Comments

@mouhannadali

Describe the bug
After a model is trained and registered, I navigate to the model registry and select the model group name -> model version -> settings. The "Trial Component" row shows "Failed to retrieve model package details".
This issue appears only for approved model versions.
To Reproduce
Steps to reproduce the behavior:
Train and register a model, then approve the model version.
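
The approval step can also be done programmatically rather than in the Studio UI. The following is a minimal sketch, assuming a boto3 SageMaker client and a model package ARN from an earlier registration; the helper name `approve_model_package` and the default note are hypothetical and not part of the original report.

```python
def approve_model_package(sm_client, model_package_arn, note="Approved for testing"):
    """Mark a registered model version as Approved.

    sm_client is assumed to be boto3.client("sagemaker"); the helper name
    and the default note are illustrative only.
    """
    return sm_client.update_model_package(
        ModelPackageArn=model_package_arn,
        ModelApprovalStatus="Approved",
        ApprovalDescription=note,
    )


# Example call (requires AWS credentials and an existing model package):
#   import boto3
#   approve_model_package(boto3.client("sagemaker"), model_package_arn)
```

Whether approving through the API or through the UI makes any difference to the reported error is not established in this thread.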

Expected behavior
A link to the corresponding "Trial Component" should be shown.

Screenshots
image

Environment:
Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans):
Framework Version:
Python Version:
CPU or GPU:
Python SDK Version:
Are you using a custom image:


@helinmik

helinmik commented Nov 3, 2022

I have the same issue when registering the model using SageMaker Pipelines.

@mouhannadali
Author

Any update here?

@tmbluth

tmbluth commented Feb 22, 2023

Environment: SageMaker Studio
Framework: SageMaker LinearLearner
Framework Version: latest as of 2/22/2023
Python Version: 3.7.10
CPU or GPU: CPU
Python SDK Version: 2.131.0
Are you using a custom image: No

I'm seeing the same thing. To create the model, I'm using some generic code:

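from sagemaker.estimator import Estimator
# Assumption: Experiment and Trial come from the sagemaker-experiments package
from smexperiments.experiment import Experiment
from smexperiments.trial import Trial

# Generic estimator; all arguments are defined elsewhere in the notebook
estimator = Estimator(
    image_uri=image_uri,
    role=role,
    output_path=output_path,
    sagemaker_session=sagemaker_session,
    instance_type=instance_type,
    instance_count=instance_count,
    enable_sagemaker_metrics=True,
    volume_kms_key=use_case_kms_key,
    output_kms_key=use_case_kms_key,
    subnets=subnets,
    security_group_ids=security_group,
    enable_network_isolation=enable_network_isolation,
    encrypt_inter_container_traffic=encrypt_inter_container_traffic,
    tags=tags,
)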

estimator.set_hyperparameters(
    epochs=epochs,
    l1=l1,
    learning_rate=learning_rate,
    predictor_type=predictor_type
)

Experiment.load(experiment_name=experiment_name)
linear_trial = Trial.create(
    trial_name=trial_name,
    experiment_name=experiment_name,
    sagemaker_boto_client=sm_client,
    tags=tags
)
  
estimator.fit(
    inputs={
        'train': train_input,
        'validation': validation_input,
        'test': test_input,
    },
    job_name=f'{base_job_name}-{mlops_id}',
    # Attach the training job to the trial created above
    experiment_config={
        'TrialName': linear_trial.trial_name,
        'TrialComponentDisplayName': 'training',
    },
    wait=True,
    logs=False,
)

Then the path to the model is saved:
model_uri = f'{output_path}/{estimator.latest_training_job.job_name}/output/model.tar.gz'
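
The path convention above can be wrapped in a small helper so that a trailing slash on `output_path` does not produce a double slash in the URI. This is a minimal sketch; the function name `model_artifact_uri` is hypothetical, and it only mirrors the f-string shown in the comment.

```python
def model_artifact_uri(output_path: str, job_name: str) -> str:
    """Build the S3 URI of the model artifact for a training job.

    Follows the layout used above:
    <output_path>/<job_name>/output/model.tar.gz
    """
    return f"{output_path.rstrip('/')}/{job_name}/output/model.tar.gz"


# Example with made-up values:
print(model_artifact_uri("s3://my-bucket/models/", "linear-job-001"))
```

After `fit()` completes, the SDK also exposes the artifact location as `estimator.model_data`, which avoids rebuilding the path by hand.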

Then to register this model I run this code:

response = sm_client.create_model_package(
    ModelPackageGroupName=model_package_group_name,
    ModelPackageDescription='Model registration testing',
    ModelApprovalStatus='PendingManualApproval',
    InferenceSpecification={
        'Containers': [
            {
                'Image': image_uri,
                'ModelDataUrl': model_uri,
                'NearestModelName': model_name
            },
        ],
        'SupportedTransformInstanceTypes': [inference_instance_type],
        'SupportedContentTypes': ['text/csv'],
        'SupportedResponseMIMETypes': ['text/csv']
    },
    CustomerMetadataProperties={
        'train': training_path,
        'validation': validation_path,
        'test': testing_path,
        'experiment_name': experiment_name,
    },
)

The model is registered successfully and its version increments, but I get the same error as the others.

Is this a bug or are we misusing the Model Registry?

@tmbluth

tmbluth commented Feb 24, 2023

After playing around quite a bit, I found that registering the model through boto3, as I did in my comment above, did not automatically link the TrialComponent. When I registered the model through the SageMaker SDK instead, I was able to see the TrialComponent link:

model_package = linear_learner.register(
    model_package_group_name=model_package_group_name,
    model_name='linear-learner',
    image_uri=image_uri,
    transform_instances=[instance_type],
    content_types=['text/csv'],
    response_types=['text/csv'],
    approval_status='PendingManualApproval',
    customer_metadata_properties={
        'train': training_path,
        'validation': validation_path,
        'test': testing_path,
        'experiment_name': experiment_name,
    },
)
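
When comparing the two registration paths, it can help to first confirm that a trial component exists for the training job at all. The following is a rough diagnostic sketch, assuming a boto3 SageMaker client; the helper name `trial_components_for_job` is hypothetical. It only lists trial components whose source is the given training job and does not explain why the registry link is missing.

```python
def trial_components_for_job(sm_client, training_job_arn):
    """Return the names of trial components whose source is the training job.

    sm_client is assumed to be boto3.client("sagemaker"). Pagination is
    followed through NextToken until all pages are read.
    """
    names = []
    kwargs = {"SourceArn": training_job_arn}
    while True:
        page = sm_client.list_trial_components(**kwargs)
        names.extend(
            summary["TrialComponentName"]
            for summary in page.get("TrialComponentSummaries", [])
        )
        token = page.get("NextToken")
        if not token:
            return names
        kwargs["NextToken"] = token


# Example call (requires AWS credentials and a finished training job):
#   import boto3
#   sm = boto3.client("sagemaker")
#   arn = sm.describe_training_job(TrainingJobName=job_name)["TrainingJobArn"]
#   print(trial_components_for_job(sm, arn))
```

If this returns a component for the job in both cases, the difference is more likely in how the model package was created than in the experiment tracking itself.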
