Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test lakefsfs with mock server #6646

Merged
merged 9 commits into from
Oct 3, 2023

Conversation

arielshaqed
Copy link
Contributor

Translate almost all tests in LakeFSFileSystemTest to use a Java MockClient. The result is called LakeFSFileSystemServerTest.

The existing Mockito-based code is horribly brittle; see the underlying issue #6554 for why this manages to block our move to a forwards-compatible Java SDK. So this is blocking #6424 .

Fixes #6554 for the most part. Some tests are still missing, but I do not want to block the Java client review on fixing them. I will open a new issue to uncomment these tests, and copy over any tests from LakeFSFileSystemTest that are still missing.

@arielshaqed arielshaqed added area/testing Improvements or additions to tests area/system-tests Improvements or additions to the system-tests exclude-changelog PR description should not be included in next release changelog area/client/hadoopfs labels Sep 24, 2023
LakeFSFileSystemServerTest will replace LakeFSFileSystemTest once all tests
are translated.  It fakes actual lakeFS responses using a MockServer.  That
is a more complete test: test lakeFSFS versus the on-the-wire interface not
versus the generated SDK.  In particular the generated SDK will change soon
(#6424) to have a completely different API.  And Mockito cannot easily mock
the new API.

Fixes #6554.
lakeFSFS relies on 404s ("not found") for things such as directory marker
creation.  Without this, its behaviour in various states cannot be
simulated.  Also always returning a 404 makes it nearly impossible to
distinguish that no mock request matched.
@arielshaqed arielshaqed force-pushed the chore/test-lakefsfs-with-mock-server branch 3 times, most recently from c932b61 to 83d2190 Compare September 27, 2023 15:47
Some testRename* tests still commented-out, we will do them separately so as
not to delay the new Java SDK.
@arielshaqed arielshaqed force-pushed the chore/test-lakefsfs-with-mock-server branch 2 times, most recently from 1f34231 to 6ed5dc7 Compare September 27, 2023 15:53
Copy link
Contributor

@johnnyaug johnnyaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing major.

@@ -41,6 +41,11 @@ public LakeFSClient(String scheme, Configuration conf) throws IOException {
basicAuth.setUsername(accessKey);
basicAuth.setPassword(secretKey);

String sessionId = FSConfiguration.get(conf, scheme, Constants.SESSION_ID);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommended by MockServer docs. You put a cookie that defines a session ID header on all requests. This lets you debug MockServer outputs.

Why? MockServer is a separate process, it logs a lot but doesn't even know which test is running. The session-ID cookie make it possible for a dev to correlate its output with the test run. I've found it very useful.

@@ -925,7 +925,7 @@ public void testRename_existingFileToExistingDirName() throws ApiException, IOEx
ObjectLocation fileObjLoc = fs.pathToObjectLocation(fileInDstDir);
Path dst = new Path("lakefs://repo/main/existing-dir2");
ObjectLocation dstObjLoc = fs.pathToObjectLocation(dst);
mockExistingDirPath(dstObjLoc, ImmutableList.of(fileObjLoc));
// mockExistingDirPath(dstObjLoc, ImmutableList.of(fileObjLoc));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently it was never needed(!), the code discovers that the file exists even before listing it 🤷🏽 .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's delete it then!

}

// Expect this statObject to be not found
protected void expectStatObjectNotFound(String repo, String ref, String path) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really an expectation: it's setting the mock to return NotFound. Same for the next ones.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hate the word "mock", but changing words back.

Comment on lines 1098 to 1099
String contents = "The quick brown fox jumps over the lazy dog.";
byte[] contentsBytes = contents.getBytes();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes. It's not! It wasn't before, but it still isn't used.

// Need a directory marker at the source because it's now empty!
expectUploadObject("repo", "main", "existing-dir1/");

Assert.assertTrue(fs.rename(src, dst));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many tests here (and their original counterparts) do not test much. Can we test the physical file at this point, similar to what we do on testCreate, for example?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are issues with this test, agreed. As they do not block 1.0, I do not want to perform them here.

Note however that using a mockserver tightens up these tests! The "smart nulls" behaviour that we used on Mockito means that many calls will fail silently and differently to how they fail on a real HTTP server.

Initial refactoring to allow LakeFSFileSystemPresignedModeTest and
LakeFSFileSystemSimpleModeTest to be clearly written.  This refactoring also
removes MinIO from _most_ tests in LakeFSFileSystemServerTest, which allows
these tests to be less brittle while running faster.

**MUST** bump Maven Surefire Plugin (the JUnit test runner) version to 3.x
from 2.x, to get rid of "No tests found matching Method..." failure messages
that appear for no apparent reason.
These are over MinIO, ensuring that presigned works at least there.  (MinIO
is similar enough to S3 on this, and indeed _probably_ any HTTP server
supporting GET and PUT would be enough to check presigned works...).
These now happen as parameters in LakeFSFileSystemServerS3Test.
Copy link
Contributor Author

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored S3 tests in f8372ad, then used that in e0b075d to test presigned and simple modes (they work!).

Other review comments are on lines that I've already clobbered, so responding on them there; sorry!

@@ -41,6 +41,11 @@ public LakeFSClient(String scheme, Configuration conf) throws IOException {
basicAuth.setUsername(accessKey);
basicAuth.setPassword(secretKey);

String sessionId = FSConfiguration.get(conf, scheme, Constants.SESSION_ID);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommended by MockServer docs. You put a cookie that defines a session ID header on all requests. This lets you debug MockServer outputs.

Why? MockServer is a separate process, it logs a lot but doesn't even know which test is running. The session-ID cookie make it possible for a dev to correlate its output with the test run. I've found it very useful.

Flavour of the day :-)
@arielshaqed arielshaqed requested a review from johnnyaug October 2, 2023 12:15
@johnnyaug
Copy link
Contributor

Tests that are no longer there:

testRename_existingDirToExistingNonEmptyDirName
testRename_fallbackStageAPI
testRename_nonExistingSrcFile
testRename_srcAndDstOnDifferentBranch
testRename_srcEqualsDst

Are we cool with that?

Copy link
Contributor

@johnnyaug johnnyaug left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See some minor comments

}

@Parameter(1)
public String unusedAddressCreatorType;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JUnit 😠 😡

Parametrized tests are horrible in JUnit. If I didn't explicitly name things, the tests would have named "[0]" and "[1]", and good luck figuring out whether a failure was in "simple" or in "presigned". So we give a name here, and this is the only way to do it. Note that the string defining the name is literally "name={1}", and this happens before compilation (Java attributes FTW).

So a parametrized test can only receive a static name, so it has to be a parameter. But... if I don't capture every element of the parameter in a variable then JUnit will complain and refuse to run the test.

Simple!


// Return a location under namespace for this getPhysicalAddress call.
//
// TODO(ariels): abstract, overload separately for direct and pre-signed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO is already done?

import java.util.Arrays;
import java.util.concurrent.TimeUnit;

@RunWith(Parameterized.class)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JUnit FTW. Enjoy it while it lasts, you're gonna hate the reply to the next comment!

Copy link
Contributor Author

@arielshaqed arielshaqed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think no major major changes in this, and you approved, so I will pull once tests pass. This does of course mean that you get to ask for more changes if I did something horribly wrong 🤷🏽 .

import java.util.Arrays;
import java.util.concurrent.TimeUnit;

@RunWith(Parameterized.class)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JUnit FTW. Enjoy it while it lasts, you're gonna hate the reply to the next comment!

}

@Parameter(1)
public String unusedAddressCreatorType;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JUnit 😠 😡

Parametrized tests are horrible in JUnit. If I didn't explicitly name things, the tests would have named "[0]" and "[1]", and good luck figuring out whether a failure was in "simple" or in "presigned". So we give a name here, and this is the only way to do it. Note that the string defining the name is literally "name={1}", and this happens before compilation (Java attributes FTW).

So a parametrized test can only receive a static name, so it has to be a parameter. But... if I don't capture every element of the parameter in a variable then JUnit will complain and refuse to run the test.

Simple!

@arielshaqed
Copy link
Contributor Author

Tests that are no longer there:

testRename_existingDirToExistingNonEmptyDirName
testRename_fallbackStageAPI
testRename_nonExistingSrcFile
testRename_srcAndDstOnDifferentBranch
testRename_srcEqualsDst

Are we cool with that?

Added them all back. Except for testRename_fallbackStageAPI, which IIUC @johnnyaug is getting rid of, so I left him a TODO.

@arielshaqed arielshaqed enabled auto-merge (squash) October 3, 2023 13:01
@arielshaqed
Copy link
Contributor Author

THANKS!

Sorry for the mega-review

If it's any consolation, #6656 is even worse!

Pulling once tests pass.

@arielshaqed arielshaqed merged commit 6f69b3e into master Oct 3, 2023
32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/client/hadoopfs area/system-tests Improvements or additions to the system-tests area/testing Improvements or additions to tests exclude-changelog PR description should not be included in next release changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Refactor LakeFSFileSystemTest to use a mock server rather than mockito
2 participants