glue-client: get_job_run response missing the Arguments key #3506

Closed
andhrelja opened this issue Nov 24, 2022 · 14 comments
Labels
glue service-api This issue is caused by the service API, not the SDK implementation.

Comments

@andhrelja

andhrelja commented Nov 24, 2022

Describe the bug

AWS Glue job type: Python shell
AWS Glue version: 1.0
Python version: 3.6
botocore==1.29.16
boto3==1.26.16

I am using the get_job_run function to retrieve a Glue Job's Run ID.

I use the Arguments dict from the function's response to uniquely identify a Glue Job run.
The Arguments dict is not available in get_job_run and get_job_runs response objects.

Expected Behavior

client = boto3.client('glue')
response = client.get_job_run(JobName=job_name, RunId=job_run_id)
response_dict = response['JobRun']
type(response_dict.get('Arguments'))
>>> <class 'dict'>

Current Behavior

client = boto3.client('glue')
response = client.get_job_run(JobName=job_name, RunId=job_run_id)
response_dict = response['JobRun']
type(response_dict.get('Arguments'))
>>> <class 'NoneType'>

Reproduction Steps

# AWS Glue (v1.0) Python Shell job
import sys

import boto3
from awsglue.utils import getResolvedOptions

args = getResolvedOptions(sys.argv, ['JOB_NAME'])
job_name = args['JOB_NAME']

client = boto3.client('glue')

def get_job_run_id(job_name):
    response = client.get_job_runs(JobName=job_name)
    return response['JobRuns'][0]['Id']

job_run_id = get_job_run_id(job_name)

response = client.get_job_run(JobName=job_name, RunId=job_run_id)
assert response['JobRun'].get('Arguments') is not None, 'Arguments key is not available in get_job_run response'

Possible Solution

No response

Additional Information/Context

Tried to downgrade multiple boto3 versions (through 1.22.6), but the behavior was the same

SDK version used

1.26.16

Environment details (OS name and version, etc.)

Python shell; Glue 1.0

@andhrelja andhrelja added bug This issue is a confirmed bug. needs-triage This issue or PR still needs to be triaged. labels Nov 24, 2022
@andhrelja
Author

andhrelja commented Nov 28, 2022

Arguments key is always missing for manual Glue job runs with default Job arguments set.
Resolution: The Arguments dict is available in get_job_run and get_job_runs response objects if the job was triggered by another glue_client.start_job_run() call.
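
One possible workaround for runs that carry no Arguments of their own, sketched here under assumptions rather than taken from the thread: merge the job's DefaultArguments (returned by get_job) with whatever run-level Arguments exist. The helper name effective_arguments is hypothetical, and `client` is assumed to be a boto3.client('glue') instance. Run-level values override defaults, following the documented behaviour that run arguments replace the defaults. NonOverridableArguments are ignored in this sketch.

```python
def effective_arguments(client, job_name, run_id):
    """Return the job's default arguments overlaid with run-level arguments.

    `client` is assumed to behave like boto3.client('glue').
    """
    job = client.get_job(JobName=job_name)["Job"]
    run = client.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]

    # Start from the defaults set in the job definition (may be absent).
    merged = dict(job.get("DefaultArguments", {}))
    # "Arguments" can be missing from the response, as reported in this
    # issue for manual runs, so fall back to an empty dict.
    merged.update(run.get("Arguments", {}))
    return merged

# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   args = effective_arguments(boto3.client("glue"), job_name, job_run_id)
```

This does not change what the service returns; it only gives the caller a dict to work with when the Arguments key is absent.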

@simonB2020

@andhrelja
But the point remains: the key is missing!
The answer you've provided only resolves the issue for a specific circumstance. The problem remains if you are not using the Glue client to start a job.

Could you please re-open this case?

@andhrelja andhrelja reopened this Nov 19, 2023
@andhrelja
Author

Reopened this issue because my proposed resolution does not apply to the general case, as noted by @simonB2020.

@tim-finnigan tim-finnigan self-assigned this Nov 20, 2023
@tim-finnigan
Contributor

Hello and thanks for reaching out. The get_job_run Boto3 command corresponds to the GetJobRun Glue API. Therefore, any issues with the API would need to be escalated to the Glue team. (We recommend reaching out through AWS Support for issues involving service APIs if you have a support plan, but we can also reach out internally on your behalf.)

I think we need a little more information in order to better understand the issue here. It looks like there are many nuances involved with how Arguments could be returned in the response:

Arguments (dict) –

The job arguments associated with this run. For this job run, they replace the default arguments set in the job definition itself.

You can specify arguments here that your own job-execution script consumes, as well as arguments that Glue itself consumes.

Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets from a Glue Connection, Secrets Manager or other secret management mechanism if you intend to keep them within the Job.

For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide.

For information about the arguments you can provide to this field when configuring Spark jobs, see the Special Parameters Used by Glue topic in the developer guide.

For information about the arguments you can provide to this field when configuring Ray jobs, see Using job parameters in Ray jobs in the developer guide.

Could you provide more details regarding the circumstances under which this issue occurs? For example, a minimal reproducible code snippet, and the Boto3 version installed. If you can share your debug logs (with any sensitive info redacted) by adding boto3.set_stream_logger('') to your script, that may also help provide more insight into the issue.

@tim-finnigan tim-finnigan added response-requested Waiting on additional information or feedback. service-api This issue is caused by the service API, not the SDK implementation. glue and removed bug This issue is a confirmed bug. needs-triage This issue or PR still needs to be triaged. labels Nov 20, 2023
@andhrelja
Author

Hey @tim-finnigan

Not sure if you saw the issue description above, it shows the installed boto3 version and the reproduction code snippet.

Circumstances:

  • a Glue job is created using the provided code snippet
  • the created Glue job is run via console manually
  • the created Glue job fails because of an assertion error

I hope the code snippet explains why an assertion error happens. I would expect the "Arguments" response (dict) to contain the arguments that were used to run the created Glue job, but that is not the case.

@simonB2020 please expand if needed

@simonB2020

I think the reference to Boto3 is a red herring here.

The first thing the example script does is use Boto3 to get the run ID from the job name.
In my experience, this is not a reliable way to get the right ID.
For example, if multiple instances of the same job are executing at the same time, how is the API to know which run ID you want when you pass only the job name?
Hence I feel that a call to Boto3 is not helpful in the first place.

Could the run ID not be provided by default, as it is for Spark jobs?
args = getResolvedOptions(sys.argv, ['JOB_RUN_ID'])
job_run_id = args['JOB_RUN_ID']

This would resolve both issues:
(1) JobId is available when executing manually
(2) Correct ID is guaranteed when multiple instances executing in parallel.
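
As a rough illustration of the proposal above, the script could look for a `--JOB_RUN_ID` argument and fall back gracefully when it is absent. This is a sketch under assumptions: it relies on Glue passing `--JOB_RUN_ID` on the command line (which the comment above reports for Spark jobs, and which this thread reports as missing for Python shell jobs), and the helper name find_job_run_id is hypothetical. It uses only the standard library, not awsglue.

```python
import argparse

def find_job_run_id(argv):
    """Return the value of --JOB_RUN_ID from argv, or None if it is absent."""
    # allow_abbrev=False so that e.g. "--JOB" is not treated as --JOB_RUN_ID.
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("--JOB_RUN_ID", default=None)
    # parse_known_args ignores every other argument the job received.
    known, _unknown = parser.parse_known_args(argv[1:])
    return known.JOB_RUN_ID

# Usage inside a job script:
#   import sys
#   job_run_id = find_job_run_id(sys.argv)
#   if job_run_id is None:
#       ...  # the run ID was not passed to this job
```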

Thanks

@bikboktech

bikboktech commented Nov 22, 2023

@simonB2020 the API does not know which JobId is the one I want. In my scenario, I collect all available job runs and then I would use the Arguments response dictionary (combined with [latest] execution date) to retrieve the JobId I need. I agree this is not a reliable method of retrieving a job's ID.
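
The approach described in this comment (collect all runs, match on Arguments, take the latest) might look roughly like the sketch below. The helper name latest_matching_run and the example data are hypothetical. The dicts are shaped like the entries of get_job_runs()["JobRuns"], with the Id, StartedOn and Arguments keys. As the comment says, this is not a reliable way to identify a run, and it cannot match runs whose Arguments key is missing.

```python
from datetime import datetime

def latest_matching_run(job_runs, expected_args):
    """Return the Id of the most recently started run whose Arguments
    contain every key/value pair in expected_args, or None."""
    matches = [
        run for run in job_runs
        # Runs without an "Arguments" key are treated as having no arguments.
        if all(run.get("Arguments", {}).get(key) == value
               for key, value in expected_args.items())
    ]
    if not matches:
        return None
    return max(matches, key=lambda run: run["StartedOn"])["Id"]

# Hand-made example data (not real output).
example_runs = [
    {"Id": "jr_1", "StartedOn": datetime(2023, 11, 20, 9, 0), "Arguments": {"--batch": "A"}},
    {"Id": "jr_2", "StartedOn": datetime(2023, 11, 21, 9, 0), "Arguments": {"--batch": "A"}},
    {"Id": "jr_3", "StartedOn": datetime(2023, 11, 22, 9, 0)},  # no Arguments key
]
print(latest_matching_run(example_runs, {"--batch": "A"}))
```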

"@andhrelja But the point remains, the key is missing! The answer you've provided only resolves the issue for a specific circumstance - the problem remains if you are not using the Glue Client to start a job.

Please can you re-open this case?"

Based on this reply, I assumed you have a similar use-case and could elaborate on it. Your latest reply suggests a feature that should really be a separate issue.

@simonB2020

"feature that should really be a separate issue"
This is where I get confused about how we classify things.
You are able to get RunID in a Spark Job but not Python Job - so is that a "feature" (to add it) or a "bug" (because it is missing)?
If you fixed the above issue, then the Boto3 issue becomes obsolete - so "separate, but related & dependent" issue.

Not sure my use case is much different to the original.
We have scripts in our jobs that send actions/events to other services.
Currently we have to match on job name and start/end dates, and when jobs run in parallel it is impossible to identify the exact run ID behind an action/event.
Hence we cannot create an audit log to map cause to effect.
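
For runs that are started through the API (not from the console), one way to make them identifiable is to pass a unique token as a job argument when starting the run. The sketch below is an assumption-laden illustration, not something proposed in the thread: the helper name start_tagged_run and the `--run_token` argument key are hypothetical, and `client` is assumed to be a boto3.client('glue') instance. It does not help with manual console runs.

```python
import uuid

def start_tagged_run(client, job_name, token_key="--run_token"):
    """Start a job run with a unique token argument.

    Returns (run_id, token). The job script can read the token from its
    arguments and include it in any events it emits for auditing.
    """
    token = uuid.uuid4().hex
    response = client.start_job_run(
        JobName=job_name,
        Arguments={token_key: token},
    )
    return response["JobRunId"], token

# Usage (assumes boto3 is installed and credentials are configured):
#   import boto3
#   run_id, token = start_tagged_run(boto3.client("glue"), "my-job")
```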

@andhrelja
Author

andhrelja commented Nov 22, 2023

Well, my proposed resolution is not a fix per se, but it did explain why the Arguments dict is missing. It sounds like our use cases are similar, but I still feel that adding the JOB_RUN_ID variable to the Glue (Python Shell) job context should be a feature request. @tim-finnigan what are your thoughts on making JOB_RUN_ID available in the Glue (Python Shell) job context?

@github-actions github-actions bot removed the response-requested Waiting on additional information or feedback. label Nov 23, 2023
@tim-finnigan
Contributor

Hi @andhrelja - get_job_run maps to the AWS Glue GetJobRun API, and already accepts RunId as a parameter. Are you asking for an update to the Glue API parameters/functionality? We recommend reaching out through AWS Support for service API feature requests, otherwise we can forward them to the Glue team.

I think I'm missing some context here, I'm not entirely clear on your use case. The AWS Glue User Guide has a section on working with Spark jobs that we can refer to.

@tim-finnigan tim-finnigan added the response-requested Waiting on additional information or feedback. label Nov 29, 2023

github-actions bot commented Dec 5, 2023

Greetings! It looks like this issue hasn’t been active in longer than five days. We encourage you to check if this is still an issue in the latest release. In the absence of more information, we will be closing this issue soon. If you find that this is still a problem, please feel free to provide a comment or upvote with a reaction on the initial post to prevent automatic closure. If the issue is already closed, please feel free to open a new one.

@github-actions github-actions bot added the closing-soon This issue will automatically close in 4 days unless further comments are made. label Dec 5, 2023
@simonB2020

@tim-finnigan
The use case is:
When running a Python Glue job, the run_id should be available as a default job argument.

(We cannot use the Boto3 API, as it will not give the correct run_id when executions are concurrent or follow each other in rapid succession.)

@github-actions github-actions bot removed closing-soon This issue will automatically close in 4 days unless further comments are made. response-requested Waiting on additional information or feedback. labels Dec 6, 2023
@tim-finnigan
Contributor

Thanks for following up, and for your patience here. As I mentioned, the Boto3 Python SDK uses the Glue service APIs, so any requests directly involving those APIs would need to be escalated to the Glue team. You can reach out through AWS Support for issues like this, or if you don't have a support plan, please create an issue in our cross-SDK repository and someone can reach out internally to the Glue team on your behalf regarding this issue.


⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please either tag a team member or open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.


4 participants