Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Aggregate L3 errors with L2 discards for interfaces with RIF #325

Merged
merged 1 commit into from
Aug 12, 2024

Conversation

stepanblyschak
Copy link
Contributor

@stepanblyschak stepanblyschak commented Aug 6, 2024

Fixes sonic-net/sonic-buildimage#19715

- What I did

Today we sum L2 and L3 errors on RX/TX in ifInErrors/ifOutErrors counters that are part of RFC1213 MIB.
It appears that the monitoring system is using ifInErrors as a way to detect CRC errors on physical interfaces.

The RFC vaguely defines ifInErrors as:

 ifInErrors OBJECT-TYPE
              SYNTAX  Counter
              ACCESS  read-only
              STATUS  mandatory
              DESCRIPTION
                      "The number of inbound packets that contained
                      errors preventing them from being deliverable to a
                      higher-layer protocol."

instead, aggregate errors on L3 as part of ifInDiscard and ifOutDiscard.

- How I did it

Removed aggregation logic and updated UT.

- How to verify it

Manual snmpwalk on interfaces - trigger L3 error - e.g. packet with dst map != router mac, check counter by snmpwalk.

Generate RIF RX errors:

admin@sonic:~$ show interface counters rif
    IFACE    RX_OK    RX_BPS    RX_PPS    RX_ERR    TX_OK    TX_BPS    TX_PPS    TX_ERR
---------  -------  --------  --------  --------  -------  --------  --------  --------
Ethernet0        0  0.00 B/s    0.00/s       100        0  0.00 B/s    0.00/s         0
  Vlan100        1  0.00 B/s    0.00/s         0        0  0.00 B/s    0.00/s         0

Before this change (14 - error, 13 - discard):

admin@sonic:~$ docker exec -it snmp snmpwalk -v2c -c public localhost 1.3.6.1.2.1.2.2.1.14.1
iso.3.6.1.2.1.2.2.1.14.1 = Counter32: 100
admin@sonic:~$ docker exec -it snmp snmpwalk -v2c -c public localhost 1.3.6.1.2.1.2.2.1.13.1
iso.3.6.1.2.1.2.2.1.13.1 = Counter32: 0

After this change:

admin@sonic:~$ docker exec -it snmp snmpwalk -v2c -c public localhost 1.3.6.1.2.1.2.2.1.14.1
iso.3.6.1.2.1.2.2.1.14.1 = Counter32: 0
admin@sonic:~$ docker exec -it snmp snmpwalk -v2c -c public localhost 1.3.6.1.2.1.2.2.1.13.1
iso.3.6.1.2.1.2.2.1.13.1 = Counter32: 100

- Description for the changelog

@stepanblyschak
Copy link
Contributor Author

@bingwang-ms @SuvarnaMeenakshi Please review

@SuvarnaMeenakshi
Copy link
Contributor

@stepanblyschak thank you for fixing this.
Per discussion, can we have RIF errors included as a part of ifInDiscards/ifOutDiscards instead completely removing it from being aggregated. Currently it is aggregated as a part of ifInError/ifOutErrors
RIF counters are not supported on all hwsku. On the hwskus that do not support them, a packet with a RIF error is dropped and counted as part of the discard counter.

@stepanblyschak
Copy link
Contributor Author

@SuvarnaMeenakshi ifInDiscard is port buffer drops. Do we know that no monitor app is treating ifInDiscard as port buffer drops? As we are talking about breaking change for established release, a simple revert is less destructive than remap.

@stepanblyschak stepanblyschak changed the title Don't aggregate L2 errors with L3 errors for physical ports with RIF Aggregate L3 errors with L2 discards for interfaces with RIF Aug 7, 2024
@stepanblyschak
Copy link
Contributor Author

@SuvarnaMeenakshi Discussed offline, your comment is addressed, please review

Copy link
Contributor

@SuvarnaMeenakshi SuvarnaMeenakshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@bingwang-ms bingwang-ms merged commit deb7b7c into sonic-net:master Aug 12, 2024
5 checks passed
mssonicbld pushed a commit to mssonicbld/sonic-snmpagent that referenced this pull request Aug 12, 2024
@mssonicbld
Copy link
Collaborator

Cherry-pick PR to 202405: #326

@bingwang-ms
Copy link
Contributor

@yxieca Can you please approve the cherry-pick to 202311 branch? Thanks

@mssonicbld
Copy link
Collaborator

Cherry-pick PR to 202311: #327

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ethernet RIF counters being reported as ifInErrors in SNMP
5 participants