Skip to content

Commit

Permalink
[Test] Make TestBatchCreatePods more robust (#5577)
Browse files Browse the repository at this point in the history
This e2e validates that when creating a batch of Pods, no file
descriptor is leaked by the antrea-agent process. In order to do that,
it compares the sets of FDs before creating the Pods and after creating
the Pods. However, in some rare cases, we observe that the test fails
because of some temporary FDs that have been allocated, e.g., because of
incoming CNI requests. We make the test more robust by asserting that
the new set of FDs (after batch creation) eventually becomes a subset of
the old set of FDs (before batch creation). The polling logic performs
the check every 100ms and only needs a small timeout (2s), as we never
expect the check to take more than a couple of iterations. In most
cases, it will succeed on the first iteration.

Fixes #5567

Signed-off-by: Antonin Bas <[email protected]>
  • Loading branch information
antoninbas authored Oct 16, 2023
1 parent bb59347 commit 682f7de
Showing 1 changed file with 20 additions and 6 deletions.
26 changes: 20 additions & 6 deletions test/e2e/batch_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,12 @@ import (
"fmt"
"strings"
"testing"
"time"

"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
"k8s.io/apimachinery/pkg/util/sets"
"k8s.io/apimachinery/pkg/util/wait"
)

// TestBatchCreatePods verifies there is no FD leak after batched Pod creation.
Expand All @@ -39,24 +43,34 @@ func TestBatchCreatePods(t *testing.T) {
podName, err := data.getAntreaPodOnNode(node1)
assert.NoError(t, err)

getFDs := func() string {
getFDs := func() sets.Set[string] {
// In case that antrea-agent is not running as Pid 1 in future.
cmds := []string{"pgrep", "-o", "antrea-agent"}
pid, _, err := data.RunCommandFromPod(antreaNamespace, podName, "antrea-agent", cmds)
assert.NoError(t, err)
require.NoError(t, err)

// Ignore the difference of modification time by specifying "--time-style +".
cmds = []string{"ls", "-l", "--time-style", "+", fmt.Sprintf("/proc/%s/fd/", strings.TrimSpace(pid))}
stdout, _, err := data.RunCommandFromPod(antreaNamespace, podName, "antrea-agent", cmds)
assert.NoError(t, err)
return stdout
require.NoError(t, err)

fds := strings.Split(stdout, "\n")
return sets.New(fds...)
}

oldFDs := getFDs()

_, _, cleanupFn := createTestBusyboxPods(t, data, batchNum, data.testNamespace, node1)
defer cleanupFn()

newFDs := getFDs()
assert.Equal(t, oldFDs, newFDs, "FDs were changed after batched Pod creation")
// It is possible for new FDs to be allocated temporarily by the process, for different
// reasons (health probes, CNI invocations, ...). In that case, the new set of FDs can
// contain additional entries compared to the old set of FDs. However, eventually, getFDs()
// should return a subset of oldFDs.
// Most of the time, wait.PollImmediate will return immediately, after the first call to the
// condition function.
assert.NoError(t, wait.PollImmediate(100*time.Millisecond, 2*time.Second, func() (bool, error) {
newFDs := getFDs()
return oldFDs.IsSuperset(newFDs), nil
}), "Batched Pod creation allocated new FDs")
}

0 comments on commit 682f7de

Please sign in to comment.