logs_reader: use get_log_events() #6

Merged
merged 1 commit into from
Sep 1, 2021
19 changes: 13 additions & 6 deletions utils/logs_reader.py
@@ -80,20 +80,27 @@ def retrieve_log_stream_names(log_group_name, prefixes=None):
 def read_log_messages(log_group_name, log_stream_name):
     """ Retrieves all events from the specified log group/stream, formats the timestamp
         as an ISO-8601 string, and sorts them by timestamp.
+
+        Note: filter_log_events() takes an excessive amount of time if there are a large
+        number of streams, even though we're only selecting from one, so instead we use
+        get_log_events(). However, Boto doesn't provide a paginator for it, so we have to
+        handle the pagination ourselves. Fun!
         """
     events = []
-    paginator = client.get_paginator('filter_log_events')
-    for page in paginator.paginate(logGroupName=log_group_name, logStreamNames=[log_stream_name]):
+    request = {'logGroupName': log_group_name, 'logStreamName': log_stream_name}
+    while True:
+        page = client.get_log_events(**request)
+        if page['nextForwardToken'] == request.get('nextToken'):
+            break;
+        request['nextToken'] = page['nextForwardToken']
         for event in page['events']:
             ts = event['timestamp']
             event['originalTimestamp'] = ts
-            event['timestamp'] = datetime.fromtimestamp(ts / 1000.0, timezone.utc).isoformat()
+            event['timestamp'] = datetime.fromtimestamp(ts / 1000, timezone.utc).isoformat()
             events.append(event)
-    # this step may be unnecessary, but I don't see any ordering guarantees
-    events.sort(key=lambda x: x['timestamp'])
+    events.sort(key=lambda x: x['timestamp'])   # API doesn't guarantee order
     return events


 if __name__ == "__main__":
     if len(sys.argv) < 2:
         print(__doc__)
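For readers who want to try the manual pagination described in the docstring outside this repository, here is a minimal standalone sketch. It is not the code from this commit: `read_all_events` and `FakeLogsClient` are hypothetical names introduced for illustration, and the client is passed in as a parameter so the loop can be exercised without AWS credentials. Two choices are assumptions on my part. First, the sketch sets `startFromHead=True`, because as I read the Boto documentation, `get_log_events()` returns the latest events first by default and expects `startFromHead` to be true when following `nextForwardToken`. Second, it sorts on the numeric `originalTimestamp` rather than on the formatted string. The end-of-stream test is the same one the diff uses: the call hands back the token that was sent.

```python
from datetime import datetime, timezone


def read_all_events(client, log_group_name, log_stream_name):
    """Collect every event from one stream by following nextForwardToken."""
    request = {
        'logGroupName': log_group_name,
        'logStreamName': log_stream_name,
        # Assumption: read oldest-first so forward tokens walk the whole stream.
        'startFromHead': True,
    }
    events = []
    while True:
        page = client.get_log_events(**request)
        for event in page['events']:
            ts = event['timestamp']  # milliseconds since the Unix epoch
            events.append({
                **event,
                'originalTimestamp': ts,
                'timestamp': datetime.fromtimestamp(ts / 1000, timezone.utc).isoformat(),
            })
        # End of stream: the service echoes back the token we sent.
        if page['nextForwardToken'] == request.get('nextToken'):
            break
        request['nextToken'] = page['nextForwardToken']
    # Sort on the numeric value; ordering across pages is not assumed.
    events.sort(key=lambda e: e['originalTimestamp'])
    return events


class FakeLogsClient:
    """Hypothetical in-memory stand-in for the Boto client (illustration only)."""

    def __init__(self, pages):
        self._pages = pages

    def get_log_events(self, **kwargs):
        token = kwargs.get('nextToken', '0')
        index = int(token)
        if index >= len(self._pages):
            # Past the last page: no events, and the caller's token is echoed.
            return {'events': [], 'nextForwardToken': token}
        return {'events': self._pages[index], 'nextForwardToken': str(index + 1)}
```

With a real client the call would look like `read_all_events(boto3.client('logs'), group, stream)`; the fake client only mimics the token echo that ends the loop.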