Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: protect against phishing domain redirects in main/sub frames for http(s) requests #25153

Merged
merged 23 commits into from
Jul 9, 2024

Conversation

davidmurdoch
Copy link
Contributor

@davidmurdoch davidmurdoch commented Jun 7, 2024

Open in GitHub Codespaces

Manual Testing:

Easy: Verify non-redirect domains still work

  1. Install the extension (you should test mv2 and mv3)
  2. go through onboarding if necessary
  3. go to a phishing website, like claimblast.lol (these websites seem to go down all the time, so you might have to grab one from https://github.com/MetaMask/eth-phishing-detect/commits/main/ to test with)
  4. make sure it redirects to the phishing website

Tricky: Verify domains that redirect are now blocked:

  1. modify background.js by replacing !phishingTestResponse?.result with hostname !== "shorturl.at" (since we test blocking my test URL that is on a domain that isn't supposed to be blocked)
  2. run yarn start (test both mv2 and mv3)
  3. install the extension
  4. go through onboarding if necessary
  5. navigate to https://shorturl.at/6F2Zt
  6. you should see our phishing page. if you see evil.com it didn't work :-(

Some notes:

  • If MetaMask has not fully initialized yet (like on a browser restart) redirect blocking won't work.
  • The new phishing detection code kicks in before our current content_script-based blocking, so it works for redirect cases as well as normal phishing blocking.

Copy link
Contributor

github-actions bot commented Jun 7, 2024

CLA Signature Action: All authors have signed the CLA. You may need to manually re-run the blocking PR check if it doesn't pass in a few minutes.

@davidmurdoch davidmurdoch changed the title Burned tab id poc2 fix: burned tab id poc2 Jun 10, 2024
@davidmurdoch davidmurdoch force-pushed the burned-tab-id-poc2 branch 3 times, most recently from cc24b4b to 1635f68 Compare June 11, 2024 21:33
danjm and others added 8 commits June 12, 2024 15:35
This proof of concept has been updated to check origins against the
phishing list upon each message, rather than upon each header received.
This requires no additional permissions and no additional extension
API usage, and works on both MV2 and MV3.
This method is more reliable because it works immediately upon
pageload, and doesn't depend on us receiving any messages from the
page. It still works correctly across Chrome (MV2 and MV3) and Firefox,
and requires no additional permissions or manifest changes.
Copy link

codecov bot commented Jun 13, 2024

Codecov Report

Attention: Patch coverage is 18.51852% with 44 lines in your changes missing coverage. Please review.

Project coverage is 69.73%. Comparing base (5ee57a6) to head (fea21e5).
Report is 10 commits behind head on develop.

Files Patch % Lines
app/scripts/background.js 0.00% 44 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop   #25153      +/-   ##
===========================================
- Coverage    69.77%   69.73%   -0.04%     
===========================================
  Files         1376     1377       +1     
  Lines        48403    48477      +74     
  Branches     13348    13366      +18     
===========================================
+ Hits         33773    33804      +31     
- Misses       14630    14673      +43     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@metamaskbot
Copy link
Collaborator

Builds ready [6bd2514]
Page Load Metrics (48 ± 5 ms)
PlatformPageMetricMin (ms)Max (ms)Average (ms)StandardDeviation (ms)MarginOfError (ms)
ChromeHomefirstPaint6712181147
domContentLoaded9201132
load408948115
domInteractive9201132

@metamaskbot
Copy link
Collaborator

Builds ready [feaf0bc]
Page Load Metrics (53 ± 10 ms)
PlatformPageMetricMin (ms)Max (ms)Average (ms)StandardDeviation (ms)MarginOfError (ms)
ChromeHomefirstPaint7116388199
domContentLoaded9201221
load42137532010
domInteractive9201221

@davidmurdoch davidmurdoch changed the title fix: burned tab id poc2 fix: protect against phishing domain redirects in main/sub frames for http(s) requests Jun 13, 2024
Comment on lines +4787 to +4789
if (sender.url) {
if (this.onboardingController.store.getState().completedOnboarding) {
if (this.preferencesController.store.getState().usePhishDetect) {
Copy link
Contributor Author

@davidmurdoch davidmurdoch Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic changed at one point, was refactored, and finally restored to the original logic, but I kept some of the refactoring.

All the refactoring does is minimize the amount of "work" done in these pre-condition checks.

I don't mind restoring it to the original, as the refactoring's performance improvements here are certainly insignificant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this refactor looks fine

Comment on lines 205 to 226
if (details.tabId === browser.tabs.TAB_ID_NONE) {
return {};
}

const onboardState = theController.onboardingController.store.getState();
if (!onboardState.completedOnboarding) {
return {};
}

const prefState = theController.preferencesController.store.getState();
if (!prefState.usePhishDetect) {
return {};
}

// ignore requests that come from our phishing warning page
if (
details.initiator &&
// compare normalized URLs
new URL(details.initiator).toString() === phishingPageHref
) {
return {};
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section just short-circuits if the preconditions aren't met. Most of the conditions are the same as the checks in setupUntrustedCommunication, except for the addition of details.initiator.

app/scripts/background.js Show resolved Hide resolved
Comment on lines 256 to 271
// blocking is better than tab redirection, as blocking will prevent
// the browser from loading the page at all
if (blocking) {
if (details.type === 'sub_frame') {
// redirect the entire tab to the
// phishing warning page instead.
redirectTab(details.tabId, redirectHref);
// don't let the sub_frame load at all
return { cancel: true };
}
// redirect the whole tab, even if it's a sub_frame request
return { redirectUrl: redirectHref };
}
// redirect the whole tab, even if it's a sub_frame request
redirectTab(details.tabId, redirectHref);
return {};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If main frame:

  • mv2: immediately redirects to our Phishing Warning Page. no requests are made to the phishing server
  • mv3: a redirect to our phishing warning page is initiated and completed asynchronously. Any redirects from the phishing page may be followed, but ultimately end up being overridden by the redirect to our Phishing Warning Page.

If sub frame (iframe):

  • mv2: cancels the iframe's navigation. No requests are made to the phishing server. A redirect to our Phishing Warning Page is made asynchronously in the main frame (the tab).
  • mv3: a redirect to our phishing warning page is initiated in the main frame (the tab) and completed asynchronously. Any redirects from the phishing page may still be followed, but ultimately end up being overridden by the redirect to our Phishing Warning Page.

Comment on lines +382 to +391
<meta http-equiv="Refresh" content="0;url="${destination}"/>
<title>Phishing test</title>

<script>
// this script should not run.
// it is meant to test for regressions in our redirect
// protection due to changes in either MetaMask or browsers.
document.location.href = "${destination}";
alert("trying to prevent phishing protection");
while(true){}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random ideas to try to work around our phishing protections. They don't affect our protections and are actually not necessary for our testing. I figured I could keep them here for regression testing. Thoughts?

app/manifest/v2/_base.json Show resolved Hide resolved
Comment on lines -194 to -196
const normalizedUrl = phishingWarningPageUrl.endsWith('/')
? phishingWarningPageUrl
: `${phishingWarningPageUrl}/`;
Copy link
Contributor Author

@davidmurdoch davidmurdoch Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was error prone if the phishing page ends up ever being something like https://domain.com/page-with-trailing-slash.html. The logic is replaced with phishingWarningPageUrlObject.toString(); below, which adds a trailing slash to the end of the domain if it is missing and the URL has nothing after it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

// required to ensure MetaMask is fully started before running tests
// if we had a way of detecting when the offscreen/background were ready
// we could remove this
await unlockWallet(driver);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sad day.

I couldn't easily set up tests to wait until the background/service_worker was fully ready, so I had to fall back to the unlockWallet strategy.

Note: the wallet does NOT have to be unlocked for the protections to kick in.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the comment, now we know!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've figured out a way in a separate PR meant to fix a flaky test.

@HowardBraham figured out a way to set up a socket connection from the extension to the test runner in #25362, so we might be able to use that (which I think will be even better than my other solution)!

@@ -297,4 +330,151 @@ describe('Phishing Detection', function () {
},
);
});

describe('Phishing redirect protections', function () {
Copy link
Contributor Author

@davidmurdoch davidmurdoch Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is not structured like our other tests simply for "best practice" reasons: tests set up should be done in before/beforeAll, and teardown should be done in after/afterAll.

By making use of before/beforeAll this test sets up the browser fixtures only once for all of the tests. It still uses our withFixtures function, so it gets a bit awkward because this helper function requires an async callback and it returns a Promise (for different and valid reasons, but you already know that).

Anyway, I don't mind redoing it if you hate it this way. I don't even particularly love the code-gymnastics I had to do to make withFixtures work in this test flow. It does make it faster though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this lays the groundwork for others to be capable of writing tests more easily with less repetition

if (
details.initiator &&
// compare normalized URLs
new URL(details.initiator).toString() === phishingPageHref
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chrome's docs note:

The origin where the request was initiated. This does not change through redirects. If this is an opaque origin, the string 'null' will be used.

I believe the cases where we may run into this are things like sandbox'ed iFrame. Perhaps this could be a good case to test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh interesting. I'll test it.

IIRC, this check was necessary because the "Continue to site" link on our phishing warning page begins the redirect before our content_script can tell the phishingController that the domain is safe.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested this and while the sandbox property on the iframe did cause everything to stop working, it wasn't for the reason you'd expect.

So sandbox actually prevents our Continue to site links (and other links that want to open in _blank like "Open this warning in a new tab") from working at all.

This is an existing issue that I think is out of scope for this PR. I think the solution is for the iframe to send a message to background.js asking it to open the new tab for us (since the sandbox attribute without an allow-popups prevents it).

I encourage reviewers to attempt their own manual testing with <iframe sandbox>.

app/scripts/background.js Outdated Show resolved Hide resolved
// required to ensure MetaMask is fully started before running tests
// if we had a way of detecting when the offscreen/background were ready
// we could remove this
await unlockWallet(driver);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the comment, now we know!

app/scripts/background.js Show resolved Hide resolved
@@ -297,4 +330,151 @@ describe('Phishing Detection', function () {
},
);
});

describe('Phishing redirect protections', function () {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this lays the groundwork for others to be capable of writing tests more easily with less repetition

@davidmurdoch
Copy link
Contributor Author

QA findings (in progress)

So far looks good. I've encountered one minor change which might not be expected: whenever I click to the continue to the site link from the MM Phishing page, I see how I remain in the MM page. Only after I click again for a second time, or when I refresh, I am then directed to the phishing page. I've seen that this behaviour doesn't seem to happen in production, so I was wondering if might be due to some of the changes? 🤔

double-click-to-go-to-page.mp4

@seaona I've committed a fix and some proof of testing:

New.Tab.-.Brave.2024-06-27.17-40-04.mp4

@metamaskbot
Copy link
Collaborator

Builds ready [7f94f20]
Page Load Metrics (77 ± 23 ms)
PlatformPageMetricMin (ms)Max (ms)Average (ms)StandardDeviation (ms)MarginOfError (ms)
ChromeHomefirstPaint773671106029
domContentLoaded107627167
load44277774823
domInteractive107627167
Bundle size diffs [🚨 Warning! Bundle size has increased!]
  • background: 1.19 KiB (0.04%)
  • ui: 0 Bytes (0.00%)
  • common: 0 Bytes (0.00%)

}

theController.metaMetricsController.trackEvent({
// should we differentiate between background redirection and content script redirection?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I don't think it matters? I can't think of a reason to care about where we catch it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I wasn't sure. Maybe someone would want to know how often we're redirecting from the content script vs this earlier detection?

app/scripts/background.js Outdated Show resolved Hide resolved
await promise;
}
});
const deferredTestSuite = createDeferredPromise();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haha, wow that is an interesting solution to that problem.

It does make sense to enable using a single withFixtures call for multiple tests. Have you tried moving the tests inside the withFixtures callback instead? That might be a simpler way to do this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but then we'd be back to doing things The Wrong Way™️.

Test set up should be inside before or beforeAll and tear down in after or afterAll. Doing it the Right Way®️ will give us access to more granular test metrics and proper setup/tear down will prevent cascading test failures due to issue with open handles (when the test run kills a the test process for whatever reason, it always runs the after hooks).

Perhaps withFixtures should be refactored to be used in it's current form, as well as in a form that makes the patterns in this PR less chaotic; maybe a startFixtures function that returns something like Promise<{ driver, <... etc...>, end: () => Promise<void> }> where end could be called by the after/afterAll. Thoughts (and if yes, should it be done in this PR, before this PR, or as a follow up PR)?

Copy link
Member

@Gudahtt Gudahtt Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, our general preference in other test suites has been to use "setup functions" over hooks for setup steps, so that they can be configured on a per-test basis while still reusing the setup code. For example, that's what withFixtures does. Another example of this is the withController setup function we have in various controller unit test suites. Our unit test contributor docs suggest this pattern as well (see here, in the "Read more" section)

This would still include "proper setup/tear down" because we can tear down in a finally block in the setup function.

Placing multiple tests within a withFixtures callback does break test isolation, but so does what you're doing here. And that seems like a reasonable tradeoff to make in an e2e test context.

That is to say, maybe putting multiple tests in a withFixtures callback wouldn't be "The Wrong Way™️"?

I don't know what "granular test metrics" means in this context though, maybe there is some benefit to using hooks that I am missing there.

Copy link
Contributor Author

@davidmurdoch davidmurdoch Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can tear down in a finally block in the setup function.

There are cases where a test can be forcibly killed by the test runner (mocha, jest) and finally blocks are not run, but after will be.

Not doing test cleanup is not desirable (I know you know, but others might read this too, so I'm saying it anyway), as it can leave files behind or port-bound processes running, like ganache or other servers -- and the next test that attempts to spin up these servers will also fail. This makes understanding debug logs a bit more cumbersome.

I've thought of two cases I've seen happen in the extension code base.

When a test times out a finally won't run:

// name this file test.js then run `npx -y mocha test.js`
describe("test", () => {
  before(()=> {
    console.log("before");
  });

  after(()=> {
    console.log("after");
  });

  function runTest() {
    return new Promise(resolve => {
      setTimeout(resolve, 10000);
    })
  }

  it("works", async () => {
    try {
      await runTest()
    } finally {
      console.log("finally"); // this never runs
    }
  })
})

Another case where finally is not sufficient is when an unhandled exception/rejection bubbles up to the top process. Example:

// name this file test.js then run `npx -y mocha test.js`
describe("test", () => {
  before(()=> {
    console.log("before");
  });

  after(()=> {
    console.log("after");
  });

  function runTest() {

    setTimeout(() => {
      // pretend this is thrown by some listener we've got installed somewhere
      throw new Error("error");
    }, 100);


    return new Promise(resolve => {
      setTimeout(resolve, 1000);
    })
  }

  it("works", async () => {
    try {
      await runTest()
    } finally {
      console.log("finally"); // this will not be logged
    }
  })
})

Regarding granular timings: we can measure how long the actual tests take, vs the test setup and teardown. We can then more easily optimize for them. Not like it's useful right now, as we've got way bigger fish to fry, but if we wanted to start implementing upper bounds for test duration these timings could prove useful.

Copy link
Member

@Gudahtt Gudahtt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I have nearly finished review but I need to finish reading the tests, I didn't get through all of that. I will return to this next week.

@davidmurdoch davidmurdoch requested a review from Gudahtt July 2, 2024 19:46
@metamaskbot
Copy link
Collaborator

Builds ready [4c65041]
Page Load Metrics (81 ± 9 ms)
PlatformPageMetricMin (ms)Max (ms)Average (ms)StandardDeviation (ms)MarginOfError (ms)
ChromeHomefirstPaint86131108168
domContentLoaded116733157
load5411181199
domInteractive116733157
Bundle size diffs [🚨 Warning! Bundle size has increased!]
  • background: 1.19 KiB (0.04%)
  • ui: 0 Bytes (0.00%)
  • common: 0 Bytes (0.00%)

"https://chainid.network/chains.json",
"https://lattice.gridplus.io/*",
"http://*/*",
"https://*/*",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the goal of this PR be achieved without adding this allow-all to permissions in manifest?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, we already practically have these allow-all permissions via the content_scripts matches. However, content_script only works after redirects have settled, so they aren't useful.

Not sure it makes it any better, but the MV3 manifest does already include the same * permissions:

  "host_permissions": [
    "http://localhost:8545/",
    "file://*/*",
    "http://*/*",
    "https://*/*"
  ],

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Due to the existing contentscript permissions, these additions don't really grant us much in terms of added capabilities. The only differences I am aware of are that it lets us use the webRequest API, and it disables CORS checks.

I would like to avoid adding further permissions, but, in this case it does seem like we have a good reason.

@davidmurdoch davidmurdoch requested a review from legobeat July 3, 2024 15:32
Copy link
Member

@Gudahtt Gudahtt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, great work!

Lots of room to improve the readability of the test suite, but we can do that later

@davidmurdoch
Copy link
Contributor Author

LGTM, great work!

Lots of room to improve the readability of the test suite, but we can do that later

I've made a new issue: #25667 and assigned myself to it. I'll be OOO for a while, but I'll try to get it cleaned up when I return!

@NicholasEllul NicholasEllul requested a review from a team July 4, 2024 14:01
Copy link
Contributor

@danjm danjm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

sonarqubecloud bot commented Jul 9, 2024

@metamaskbot
Copy link
Collaborator

Builds ready [fea21e5]
Page Load Metrics (463 ± 366 ms)
PlatformPageMetricMin (ms)Max (ms)Average (ms)StandardDeviation (ms)MarginOfError (ms)
ChromeHomefirstPaint712991235426
domContentLoaded108730189
load422172463763366
domInteractive108730189
Bundle size diffs [🚨 Warning! Bundle size has increased!]
  • background: 1.19 KiB (0.04%)
  • ui: 0 Bytes (0.00%)
  • common: 0 Bytes (0.00%)

@Gudahtt Gudahtt merged commit 2ff6c0d into develop Jul 9, 2024
72 checks passed
@Gudahtt Gudahtt deleted the burned-tab-id-poc2 branch July 9, 2024 15:02
@github-actions github-actions bot locked and limited conversation to collaborators Jul 9, 2024
@metamaskbot metamaskbot added the release-12.2.0 Issue or pull request that will be included in release 12.2.0 label Jul 9, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
release-12.2.0 Issue or pull request that will be included in release 12.2.0 team-extension-platform team-security
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

7 participants