Recover from potential panic when doing map to JSON serialization #161

Merged (13 commits) on Oct 16, 2023

Contributor

@zbud-msft zbud-msft commented Oct 11, 2023

Why I did it

ADO: 25341563

In some edge cases, json.Marshal can panic while serializing a map to JSON. This change adds logging at a higher log level and lets the function recover from the panic through a deferred recover function.

How I did it

Add a deferred recover function around the JSON serialization, and drop the query when no JSON value can be provided for the gnmi TypedValue. Add logging that gives more context on the state of the map and on the data retrieved from Redis.
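
For readers following along, the approach described above can be sketched roughly as below. This is a minimal illustration under assumptions, not the code from this PR: `marshalWithRecover` and `panicky` are hypothetical names, and the standard `log` package stands in for the project's logger.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// panicky is a hypothetical type whose MarshalJSON always panics.
// It exists only to simulate a panic inside json.Marshal.
type panicky struct{}

func (panicky) MarshalJSON() ([]byte, error) {
	panic("simulated panic")
}

// marshalWithRecover is a hypothetical stand-in for emitJSON.
// If json.Marshal panics, the deferred function recovers and logs the map.
// The unnamed results were never assigned, so the call returns (nil, nil).
func marshalWithRecover(m map[string]interface{}) ([]byte, error) {
	defer func() {
		if r := recover(); r != nil {
			log.Printf("Recovered from panic: %v", r)
			log.Printf("Current state of map to be serialized is: %v", m)
		}
	}()
	return json.Marshal(m)
}

func main() {
	// Normal case: JSON bytes and a nil error.
	out, err := marshalWithRecover(map[string]interface{}{"SAI_QUEUE_STAT_PACKETS": "95"})
	fmt.Println(string(out), err)

	// Simulated panic: the process keeps running, and both results are nil.
	out, err = marshalWithRecover(map[string]interface{}{"k": panicky{}})
	fmt.Println(out == nil, err == nil)
}
```

Since a recovered call returns nil for both values, the caller has to treat a nil JSON value with a nil error as a failure and drop the query.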

How to verify it

Unit tests (UT)

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

jv, err := emitJSON(&msi)
if err != nil {
	log.V(2).Infof("emitJSON err %s for %v", err, msi)
	return nil, fmt.Errorf("emitJSON err %s for %v", err, msi)
}
if jv == nil { // jv is nil when emitJSON recovered from a panic
	return nil, fmt.Errorf("emitJSON failed due to panic")
}
Collaborator

@qiluo-msft qiluo-msft Oct 14, 2023


panic

A panic may not be the only reason. Could the error message be broader or more general, like "emitJSON failed due to panic, err=..."? #Closed

Contributor Author


The idea here is that on line 823, both the JSON value and the err value would be nil: if emitJSON has panicked and we have recovered, jv and err hold their zero values, which are nil. I can instead say "emitJSON was unable to grab json value of map due to potential panic". In the other cases where emitJSON fails to produce a JSON value, we would see err != nil.
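
The three outcomes described in this reply can be told apart on the caller side roughly as sketched below. This is an illustration under assumptions, not the merged code: `emitJSONSketch`, `toJSONValue`, and `panicky` are hypothetical names, and the error text reuses the wording proposed above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// panicky is a hypothetical type that simulates a panic during serialization.
type panicky struct{}

func (panicky) MarshalJSON() ([]byte, error) { panic("simulated panic") }

// emitJSONSketch is a hypothetical stand-in for emitJSON. A recovered panic
// leaves the unnamed results at their zero values, so it returns (nil, nil).
func emitJSONSketch(v interface{}) ([]byte, error) {
	defer func() { _ = recover() }()
	return json.Marshal(v)
}

// toJSONValue mirrors the caller-side checks quoted in the review:
// err != nil is an ordinary failure, and jv == nil with err == nil
// means a panic was recovered.
func toJSONValue(v interface{}) ([]byte, error) {
	jv, err := emitJSONSketch(v)
	if err != nil {
		return nil, fmt.Errorf("emitJSON err %s for %v", err, v)
	}
	if jv == nil {
		return nil, errors.New("emitJSON was unable to grab json value of map due to potential panic")
	}
	return jv, nil
}

func main() {
	inputs := []interface{}{
		map[string]string{"PFC_WD_STATUS": "operational"}, // serializes normally
		make(chan int), // json.Marshal returns an unsupported-type error
		panicky{},      // json.Marshal panics; the panic is recovered
	}
	for _, v := range inputs {
		jv, err := toJSONValue(v)
		fmt.Printf("json=%s err=%v\n", jv, err)
	}
}
```

The channel input shows the ordinary failure path (err != nil), while the `panicky` input reaches the jv == nil branch with a nil error.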

@zbud-msft
Contributor Author

Tried it on a lab device; we can see the recovery from the panic:

Oct 16 21:40:41.260697 str-msn2700-20 INFO telemetry#supervisord: telemetry I1016 21:40:41.257009      19 db_client.go:660] Recovered from panic: json Marshal reflect Value IsNil simulation panic
Oct 16 21:40:41.281701 str-msn2700-20 INFO telemetry#supervisord: telemetry I1016 21:40:41.257064      19 db_client.go:661] Current state of map to be serialized is: map[etp10:0:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp10:1:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp10:2:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp10:3:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp10:4:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp10:5:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp10:6:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11a:0:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11a:1:map[SAI_QUEUE_STAT_BYTES:8550 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:95 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11a:2:map[SAI_QUEUE_STAT_BYTES:0 
SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11a:3:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11a:4:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11a:5:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11a:6:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11b:0:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11b:1:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11b:2:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11b:3:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11b:4:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11b:5:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 
SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp11b:6:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp12a:0:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp12a:1:map[SAI_QUEUE_STAT_BYTES:8550 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:95 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp12a:2:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp12a:3:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp12a:4:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp12a:5:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp12a:6:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp12b:0:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp12b:1:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] 
etp12b:2:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp12b:3:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp12b:4:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp12b:5:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp12b:6:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp13a:0:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp13a:1:map[SAI_QUEUE_STAT_BYTES:8550 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:95 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp13a:2:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp13a:3:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp13a:4:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp13a:5:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 
SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp13a:6:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI
Oct 16 21:40:41.282420 str-msn2700-20 INFO telemetry#supervisord: telemetry ] etp7:0:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp7:1:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp7:2:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp7:3:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp7:4:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp7:5:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp7:6:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp8:0:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp8:1:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp8:2:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] 
etp8:3:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp8:4:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp8:5:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp8:6:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp9:0:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp9:1:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp9:2:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp9:3:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp9:4:map[PFC_WD_ACTION:drop PFC_WD_DETECTION_TIME:400000 
PFC_WD_DETECTION_TIME_LEFT:400000 PFC_WD_QUEUE_STATS_DEADLOCK_DETECTED:0 PFC_WD_QUEUE_STATS_DEADLOCK_RESTORED:0 PFC_WD_RESTORATION_TIME:400000 PFC_WD_STATUS:operational SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_CURR_OCCUPANCY_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_PACKETS_last:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp9:5:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0] etp9:6:map[SAI_QUEUE_STAT_BYTES:0 SAI_QUEUE_STAT_DROPPED_PACKETS:0 SAI_QUEUE_STAT_PACKETS:0 SAI_QUEUE_STAT_SHARED_WATERMARK_BYTES:0]]
Oct 16 21:40:41.282756 str-msn2700-20 INFO telemetry#supervisord: telemetry I1016 21:40:41.282371      19 db_client.go:272] Unable to create gnmi TypedValue due to err: emit JSON failed to due panic

@qiluo-msft qiluo-msft merged commit 07e0b36 into sonic-net:master Oct 16, 2023
4 checks passed
zbud-msft added a commit to zbud-msft/sonic-gnmi that referenced this pull request Oct 18, 2023
…nic-net#161)

yxieca added a commit that referenced this pull request Oct 25, 2023
Recover from potential panic when doing map to JSON serialization (#161)
zbud-msft added a commit to zbud-msft/sonic-gnmi that referenced this pull request Oct 26, 2023
…nic-net#161)
