-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Infinite loop while paginating paws.management::cloudwatchlogs()$get_log_events() #721
Comments
By the way, is there a completely straightforward and reliable way to get the CloudWatch log group given a job ID, job name, or even job ARN? I can't find it in the output of describe_jobs() or describe_job_definitions(), so for the sake of wlandau/crew.aws.batch#2, I am having users create their own job definitions and supply an explicit log group name with default "/aws/batch/job". Not ideal if someone's sys admins use a custom logging scheme which users don't know about, so I'd be interested in a better approach if there is one. |
Sorry about that @wlandau I will try and get this in for paws.common 0.6.5. Off the top of my head I am not sure but I will have a little look into it :) |
Hmmm it looks like From digging a little more, there seems to be some issues raised on botocore and aws-sdk around this: |
Not sure what to do on this one as I would like to keep align with the other aws-sdks however it doesn't feel difficult to implement. From my understanding it should be checking the |
Here is a solution that aws js v3 uses. I believe we can implement something similar. |
@wlandau I believe I have a solution to this based off the aws sdk* js v3: remotes::install_github("dyfanjones/paws/paws.common", ref = "paginator_empty_result")
library(paws)
client = cloudwatchlogs(config(credentials(profile = "paws")))
pages <- paws.common::paginate(
client$get_log_events(
logGroupName = "/aws/sagemaker/NotebookInstances",
logStreamName = "paws-demo/jupyter.log",
startFromHead = TRUE
),
StopOnSameToken = TRUE
)
length(pages)
#> [1] 2 Created on 2023-12-19 with reprex v2.0.2 Please try it out :) |
For completeness, here is the typescript smithy reference: https://github.com/smithy-lang/smithy-typescript/blob/main/packages/core/src/pagination/createPaginator.ts#L20-L58 export function createPaginator<
PaginationConfigType extends PaginationConfiguration,
InputType extends object,
OutputType extends object
>(
ClientCtor: any,
CommandCtor: any,
inputTokenName: string,
outputTokenName: string,
pageSizeTokenName?: string
): (config: PaginationConfigType, input: InputType, ...additionalArguments: any[]) => Paginator<OutputType> {
return async function* paginateOperation(
config: PaginationConfigType,
input: InputType,
...additionalArguments: any[]
): Paginator<OutputType> {
let token: any = config.startingToken || undefined;
let hasNext = true;
let page: OutputType;
while (hasNext) {
(input as any)[inputTokenName] = token;
if (pageSizeTokenName) {
(input as any)[pageSizeTokenName] = (input as any)[pageSizeTokenName] ?? config.pageSize;
}
if (config.client instanceof ClientCtor) {
page = await makePagedClientRequest(CommandCtor, config.client, input, ...additionalArguments);
} else {
throw new Error(`Invalid client, expected instance of ${ClientCtor.name}`);
}
yield page;
const prevToken = token;
token = (page as any)[outputTokenName];
hasNext = !!(token && (!config.stopOnSameToken || token !== prevToken));
}
// @ts-ignore
return undefined;
};
} |
Thanks for the patch, @DyfanJones! Unfortunately, I won’t be able to test until January when I get back from vacation. If #722 works on your end, please feel free to merge etc. |
No worries, I won't be pushing this to the cran before the new year. I am just going to go over the logic before finally merging :) |
Hi @DyfanJones, I hope you had a nice holiday. I tested your work on this issue, both using #721 (comment) and https://github.com/wlandau/crew.aws.batch/blob/9714d49d32d94b1e044cd06f2fde0956f6ea556e/tests/monitor/test-jobs.R#L88-L130, and By the way, I noticed the default argument of |
That is great news 👍 The implementation mimics how the JavaScript AWS SDK V3 paginates with the default set as false or missing For knowing what service to set
It might be worth having this documentation in paws as well. Sadly this isn't a solid answer. If anyone knows a better answer other than documentation I am happy add it in 😄 For completeness ====Mini example for testing out aws sdk js v3 paginator logic. let tokens = ["a", "b", "c", "c", "d"];
export interface PaginationConfiguration {
stopOnSameToken?: boolean;
};
export function createPaginator(config: PaginationConfiguration): void {
let hasNext: boolean = true;
let token: string = undefined;
let i: number = 0;
while (hasNext) {
const prevToken = token;
token = tokens[i];
hasNext = !!(token && (!config.stopOnSameToken || token !== prevToken));
console.log(hasNext)
i++;
};
};
const paginatorConfig = { stopOnSameToken: true };
// const paginatorConfig = { };
// const paginatorConfig = { stopOnSameToken: false };
createPaginator(paginatorConfig);
// paginatorConfig.stopOnSameToken = true
// true
// true
// true
// false
// paginatorConfig.stopOnSameToken = false or
// paginatorConfig = {}
// true
// true
// true
// true
// true
// false |
Note: paws.common 0.7.0 has been released to the cran |
I am working on wlandau/crew.aws.batch#2, and things are mostly coming along well. However, I notice that when I try to use pagination to get the full log of an AWS Batch job, the call to
get_log_events()
hangs. Turning on verbose logging withoptions(paws.log_level = 3)
, it appears that the function call churns in an infinite loop even when the log is very small. I am usingpaws.common
0.6.4 andpaws.management
0.4.0 (both current CRAN). Here is my reprex:On my end, the R console output looks like this (although I censored the sensitive information). It never stops printing messages like these.
Session info:
The text was updated successfully, but these errors were encountered: