Skip to content

Commit

Permalink
Treat OF groups and meters the same when initializing flows (#5613)
Browse files Browse the repository at this point in the history
* Treat OF groups and meters the same when initializing flows

Both meter and group entries need to be deleted before initializing /
replaying flows, to avoid possible conflicts with existing objects.

Until now, meter entries were deleted by Antrea in the OFClient's
Initialize method, while group entries were deleted by the ofnet library
for every switch connection / re-connection.

We now treat both meter and group entries the same:
* ofnet is no longer in charge of deleting group entries
* all meter and group entries are deleted by Antrea before replaying
  flows (in the OFClient's initialize method)

We also remove a stale comment in the Agent initialization code.

Note that this deletion operation is mostly needed in the following
scenario: the antrea-agent container restarts while the antrea-ovs
container keeps running. In that case, all OF entries are replayed, but
previous entries (group & meter) can still be present because OVS
daemons were not restarted. The code takes a more general approach of
deleting entries every time flows are replayed, including in the
following situations:
* antrea-ovs container / OVS daemons are restarted: in this case group
  and meter tables are empty, so the deletion is a no-op.
* spurious re-connection without any process dying: not sure this can
  happen in practice.

* Replace max_len=128 with max_len=65535 in expected OF flows in tests

This is because of a change in the ofnet library.

65535 means that there is no buffering and that the full packet is sent
to the controller. This is actually the only value supported by OVS, and
the ofnet library now defaults to it (since v0.10).

Signed-off-by: Antonin Bas <[email protected]>
  • Loading branch information
antoninbas authored Nov 4, 2023
1 parent d921767 commit ec2f6f8
Show file tree
Hide file tree
Showing 13 changed files with 91 additions and 69 deletions.
4 changes: 2 additions & 2 deletions go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -3,8 +3,8 @@ module antrea.io/antrea
go 1.21

require (
antrea.io/libOpenflow v0.12.1
antrea.io/ofnet v0.9.0
antrea.io/libOpenflow v0.13.0
antrea.io/ofnet v0.12.0
github.com/ClickHouse/clickhouse-go/v2 v2.6.1
github.com/DATA-DOG/go-sqlmock v1.5.0
github.com/Mellanox/sriovnet v1.1.0
Expand Down
8 changes: 4 additions & 4 deletions go.sum
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
antrea.io/libOpenflow v0.12.1 h1:5l+/SHEfJuuDbjkqF+YuZmk3oGjsQ1QS21K50EfqqOA=
antrea.io/libOpenflow v0.12.1/go.mod h1:WNcqu1927fBdc4G9wocmi58+bfxaLmZOTaTGROEcM8I=
antrea.io/ofnet v0.9.0 h1:Fcpi32GXcp4XCV/KQYDUtQ8b3HNqYxOCjTlcL0GIJ44=
antrea.io/ofnet v0.9.0/go.mod h1:0HhW28bemQec5xzwvdJIboqEs+884fU2cWDahQskVjs=
antrea.io/libOpenflow v0.13.0 h1:CelrRAMUk2wZyLI7KrYy2+cbFsJrWVIQp3xlP/pnNu4=
antrea.io/libOpenflow v0.13.0/go.mod h1:U8YR0ithWrjwLdUUhyeCsYnlKg/mEFjG5CbPNt1V+j0=
antrea.io/ofnet v0.12.0 h1:pgygAsEZJUPK/kGmeuIesDh5hoGLpYeavSLdG/Dp8Ao=
antrea.io/ofnet v0.12.0/go.mod h1:MB3qaInEimj+T8qtpBcIQK+EqeNiY1S/WbUdGk+TzNg=
bazil.org/fuse v0.0.0-20160811212531-371fbbdaa898/go.mod h1:Xbm+BRKSBEpa4q4hTSxohYNQpsxXPbPry4JJWOB3LB8=
cloud.google.com/go v0.26.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw=
cloud.google.com/go v0.34.0/go.mod h1:aQUYkXzVsufM+DwF1aE+0xfcU+56JwCaLick0ClmMTw=
Expand Down
6 changes: 0 additions & 6 deletions pkg/agent/agent.go
Original file line number Diff line number Diff line change
Expand Up @@ -548,12 +548,6 @@ func persistRoundNum(num uint64, bridgeClient ovsconfig.OVSBridgeClient, interva
// agent restarts (with the agent crashing before step 4 can be completed). With the sequence
// described above, We guarantee that at most two rounds of flows exist in the switch at any given
// time.
// Note that at the moment we assume that all OpenFlow groups are deleted every time there is an
// Antrea Agent restart. This allows us to add the necessary groups without having to worry about
// the operation failing because a (stale) group with the same ID already exists in OVS. This
// assumption is currently guaranteed by the ofnet implementation:
// https://github.com/wenyingd/ofnet/blob/14a78b27ef8762e45a0cfc858c4d07a4572a99d5/ofctrl/fgraphSwitch.go#L57-L62
// All previous groups have been deleted by the time the call to i.ofClient.Initialize returns.
func (i *Initializer) initOpenFlowPipeline() error {
roundInfo := getRoundInfo(i.ovsBridgeClient)

Expand Down
41 changes: 24 additions & 17 deletions pkg/agent/openflow/client.go
Original file line number Diff line number Diff line change
Expand Up @@ -801,28 +801,46 @@ func (c *client) GetServiceFlowKeys(svcIP net.IP, svcPort uint16, protocol bindi
}

func (c *client) initialize() error {
// After a connection or re-connection, delete all existing group and meter entries, to
// avoid "already exist" errors. This will typically happen if the antrea-agent container is
// restarted (but not the antrea-ovs one). We do this in initialize(), and not directly in
// Initialize(), to ensure that the deletion happen on every re-connection (when
// ReplayFlows() is called), even though we typically only see reconnections when the OVS
// daemons are restarted. When ovs-vswitchd restarts, group and meter entries are empty by
// default and these calls are not required.
// This is specific to groups and meters. Flows are replayed with a different cookie number
// and conflicts are not possible.
if err := c.bridge.DeleteGroupAll(); err != nil {
return fmt.Errorf("error when deleting all group entries: %w", err)
}
if c.ovsMetersAreSupported {
if err := c.bridge.DeleteMeterAll(); err != nil {
return fmt.Errorf("error when deleting all meter entries: %w", err)
}
}

if err := c.ofEntryOperations.AddAll(c.defaultFlows()); err != nil {
return fmt.Errorf("failed to install default flows: %v", err)
return fmt.Errorf("failed to install default flows: %w", err)
}

if c.ovsMetersAreSupported {
if err := c.genOFMeter(PacketInMeterIDNP, ofctrl.MeterBurst|ofctrl.MeterPktps, uint32(c.packetInRate), uint32(2*c.packetInRate)).Add(); err != nil {
return fmt.Errorf("failed to install OpenFlow meter entry (meterID:%d, rate:%d) for NetworkPolicy packet-in rate limiting: %v", PacketInMeterIDNP, c.packetInRate, err)
return fmt.Errorf("failed to install OpenFlow meter entry (meterID:%d, rate:%d) for NetworkPolicy packet-in rate limiting: %w", PacketInMeterIDNP, c.packetInRate, err)
}
if err := c.genOFMeter(PacketInMeterIDTF, ofctrl.MeterBurst|ofctrl.MeterPktps, uint32(c.packetInRate), uint32(2*c.packetInRate)).Add(); err != nil {
return fmt.Errorf("failed to install OpenFlow meter entry (meterID:%d, rate:%d) for TraceFlow packet-in rate limiting: %v", PacketInMeterIDTF, c.packetInRate, err)
return fmt.Errorf("failed to install OpenFlow meter entry (meterID:%d, rate:%d) for TraceFlow packet-in rate limiting: %w", PacketInMeterIDTF, c.packetInRate, err)
}
if err := c.genOFMeter(PacketInMeterIDDNS, ofctrl.MeterBurst|ofctrl.MeterPktps, uint32(c.packetInRate), uint32(2*c.packetInRate)).Add(); err != nil {
return fmt.Errorf("failed to install OpenFlow meter entry (meterID:%d, rate:%d) for DNS interception packet-in rate limiting: %v", PacketInMeterIDDNS, c.packetInRate, err)
return fmt.Errorf("failed to install OpenFlow meter entry (meterID:%d, rate:%d) for DNS interception packet-in rate limiting: %w", PacketInMeterIDDNS, c.packetInRate, err)
}
}

for _, activeFeature := range c.activatedFeatures {
if err := c.ofEntryOperations.AddOFEntries(activeFeature.initGroups()); err != nil {
return fmt.Errorf("failed to install feature %v initial groups: %v", activeFeature.getFeatureName(), err)
return fmt.Errorf("failed to install feature %s initial groups: %w", activeFeature.getFeatureName(), err)
}
if err := c.ofEntryOperations.AddAll(activeFeature.initFlows()); err != nil {
return fmt.Errorf("failed to install feature %v initial flows: %v", activeFeature.getFeatureName(), err)
return fmt.Errorf("failed to install feature %s initial flows: %w", activeFeature.getFeatureName(), err)
}
}

Expand Down Expand Up @@ -870,17 +888,6 @@ func (c *client) Initialize(roundInfo types.RoundInfo,
return nil, fmt.Errorf("error when deleting exiting flows for current round number: %v", err)
}

// In the normal case, there should be no existing meter entries. This is needed in case the
// antrea-agent container is restarted (but not the antrea-ovs one), which will add meter
// entries during initialization, but the meter entries added during the previous
// initialization still exist. Trying to add an existing meter entry will cause an
// OFPMMFC_METER_EXISTS error.
if c.ovsMetersAreSupported {
if err := c.bridge.DeleteMeterAll(); err != nil {
return nil, fmt.Errorf("error when deleting all meter entries: %v", err)
}
}

return connCh, c.initialize()
}

Expand Down
3 changes: 3 additions & 0 deletions pkg/agent/openflow/client_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -2781,6 +2781,9 @@ func Test_client_ReplayFlows(t *testing.T) {

expectedFlows = append(expectedFlows, replayedFlows...)

bridge.EXPECT().DeleteGroupAll().Return(nil).Times(1)
bridge.EXPECT().DeleteMeterAll().Return(nil).Times(1)

actualGroups := make([]string, 0)
m.EXPECT().AddOFEntries(gomock.Any()).Do(func(ofEntries []binding.OFEntry) {
for _, entry := range ofEntries {
Expand Down
6 changes: 3 additions & 3 deletions pkg/agent/openflow/multicast_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -27,8 +27,8 @@ func multicastInitFlows(isEncap bool) []string {
return []string{
"cookie=0x1050000000000, table=MulticastEgressRule, priority=64990,igmp,reg0=0x3/0xf actions=goto_table:MulticastRouting",
"cookie=0x1050000000000, table=MulticastEgressPodMetric, priority=210,igmp actions=goto_table:MulticastRouting",
"cookie=0x1050000000000, table=MulticastRouting, priority=210,igmp,reg0=0x3/0xf actions=controller(id=32776,reason=no_match,userdata=03,max_len=128)",
"cookie=0x1050000000000, table=MulticastRouting, priority=210,igmp,reg0=0x1/0xf actions=controller(id=32776,reason=no_match,userdata=03,max_len=128)",
"cookie=0x1050000000000, table=MulticastRouting, priority=210,igmp,reg0=0x3/0xf actions=controller(id=32776,reason=no_match,userdata=03,max_len=65535)",
"cookie=0x1050000000000, table=MulticastRouting, priority=210,igmp,reg0=0x1/0xf actions=controller(id=32776,reason=no_match,userdata=03,max_len=65535)",
"cookie=0x1050000000000, table=MulticastRouting, priority=190,ip actions=set_field:0x200000/0x600000->reg0,set_field:0x2->reg1,goto_table:MulticastOutput",
"cookie=0x1050000000000, table=MulticastIngressPodMetric, priority=210,igmp actions=goto_table:MulticastOutput",
"cookie=0x1050000000000, table=MulticastOutput, priority=210,reg0=0x200001/0x60000f,reg1=0x2 actions=drop",
Expand All @@ -38,7 +38,7 @@ func multicastInitFlows(isEncap bool) []string {
}
return []string{
"cookie=0x1050000000000, table=MulticastIngressPodMetric, priority=210,igmp actions=goto_table:MulticastOutput",
"cookie=0x1050000000000, table=MulticastRouting, priority=210,igmp,reg0=0x3/0xf actions=controller(id=32776,reason=no_match,userdata=03,max_len=128)",
"cookie=0x1050000000000, table=MulticastRouting, priority=210,igmp,reg0=0x3/0xf actions=controller(id=32776,reason=no_match,userdata=03,max_len=65535)",
"cookie=0x1050000000000, table=MulticastRouting, priority=190,ip actions=set_field:0x200000/0x600000->reg0,set_field:0x2->reg1,goto_table:MulticastOutput",
"cookie=0x1050000000000, table=MulticastEgressPodMetric, priority=210,igmp actions=goto_table:MulticastRouting",
"cookie=0x1050000000000, table=MulticastEgressRule, priority=64990,igmp,reg0=0x3/0xf actions=goto_table:MulticastRouting",
Expand Down
Loading

0 comments on commit ec2f6f8

Please sign in to comment.