fix(rollback handler): convert to ts for safety #1044

kishore03109 · 2023-12-03T21:13:34Z

Now that commits are made to 2 different branches, there needs to be a rollback handler to cater for network failures that affect the commit to only 1 branch. Otherwise, we might have a scary situation whereby staging-lite updates but staging doesn't, which leads to a wrongly built production site. This effectively means that there should not be any use of the writeroutehandler, since any write can lead to deviation between staging and staging-lite.

Note that we don't need to check whether a site is whitelisted for quickie, since all sites already have the staging-lite branch infra set up. The whitelisting only dictates whether the staging-lite file gets updated.

Previously, we were not using the rollback properly. This PR converts the file to ts to allow for easier type checking.
There were also some weird errors that occurred which could be avoided with the retry. However, staging-lite needs to be pushed with a .push(gitOptions), which the existing retry mechanism does not do. As a bandage, retry twice with the options, and check in with @harishv7 on why there is a retry in GitFileSystemService in the first place.

Manual tests for rollback handler

[Testing for quickie whitelisted site that is GGs]

use an email login site, ensure that it is whitelisted for build time reduction in growthbook
create a file, ensure that no errors are observed
go into the create function of GitFileSystemService and short-circuit the function by adding an early return errAsync(new ConflictError("this is a test failure")), e.g.

create(
    repoName: string,
    userId: string,
    content: string,
    directoryName: string,
    fileName: string,
    encoding: "utf-8" | "base64" = "utf-8",
    branchName: string
  ): ResultAsync<
    GitCommitResult,
    ConflictError | GitFileSystemError | NotFoundError
  > {
    // short circuit for testing
    return errAsync(new ConflictError("this is a test failure"))
... // rest of code
}

re-run the create operation, notice that all rollbacks occur for both staging + staging-lite.

[Testing for non-quickie whitelisted site that is GGs]

do the same as above, but the site should not be whitelisted for quickie

[Testing for GH sites for rollback handler] (all gh sites are NOT whitelisted for quickie)

throw error in the create function for GithubService
notice that rollback works as intended -> it goes back to the last known state successfully.

[Testing normal crud operations]

Tests

In staging environment,
kishore-test is quickie + zhongjun-test-amplify is NON-quickie + kishore-test-dev-gh on github (non-quickie)

rename of folders with folder + pages inside
changes made to any page should be persisted

kishore03109 · 2023-12-03T21:13:47Z

Current dependencies on/for this PR:

develop
- PR hotfix(repair-form): set remote url correctly #1048
  - PR feat(quickie): only gitfile should have quickie #1042
    - PR chore(quickie): delete quickie for gh #1043
      - PR test(githubService): add tests #1045
        
        PR feat(gitFileSystem): safer api #1046
        
        PR test(gitCommitService): add test cases #1047
        
        PR fix(rollback handler): convert to ts for safety #1044 👈

This stack of pull requests is managed by Graphite.

src/middleware/routeHandler.ts

harishv7 · 2023-12-06T05:17:34Z

src/middleware/routeHandler.ts

+          )
+        )
+
+        await backOff(() =>


how many times does this retry?

max default 10

1st attempt: 200ms
2nd: 400ms
3rd: 800ms
4th: 1600ms
5th: 3200ms
6th: 6400ms

and so on right? is the above backoff sequence correct?

was just thinking in a worst case scenario the request will be open for quite long? will this timeout on client?

would need to dive into source code here to get the exact algo (https://github.com/coveooss/exponential-backoff), but based on their readme I would assume that would be the case
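
For a rough sense of the worst case asked about above, the delay schedule can be sketched as below. This is a minimal sketch, not the library's code: `backoffDelays` and `totalWaitMs` are hypothetical names, and it assumes each delay is the previous one multiplied by a constant factor with no jitter or cap, starting from the 200ms in the sequence above. The actual defaults should be confirmed against the exponential-backoff readme.

```typescript
// Hypothetical helper: lists the wait (in ms) before each retry, assuming
// plain exponential growth with no jitter and no maximum delay.
const backoffDelays = (
  startingDelayMs: number,
  timeMultiple: number,
  numOfRetries: number
): number[] =>
  Array.from(
    { length: numOfRetries },
    (_, i) => startingDelayMs * Math.pow(timeMultiple, i)
  )

// Total time spent waiting if every retry fails (excludes the calls themselves).
const totalWaitMs = (delays: number[]): number =>
  delays.reduce((sum, delay) => sum + delay, 0)

const delays = backoffDelays(200, 2, 6)
console.log(delays) // [200, 400, 800, 1600, 3200, 6400]
console.log(totalWaitMs(delays)) // 12600
```

Under these assumptions, six failed retries already add about 12.6 seconds of waiting, which is why capping the number of attempts matters for request duration.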

wait ah this backoff occurs when the call has an error bah, so the FE would have already gotten an error code that we try to fix via exponential backoff

hmm this is a handler middleware right? so it will complete the execution of all handlers before the request completes right?

follow up suggestion, option 1 + alarm if it fails for 5 times as failing 5 times signifies some deeper issue not resolvable with retry

sure option 1 implemented

hmm sure about alarm? worried about numerous false positives tho
What I can do is add logging for this, can review later if this occurs too freq with false positives. will add todo regardless in jira
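
The "cap at 5 attempts, log when all of them fail" idea discussed above could look roughly like this. It is a hedged sketch rather than the code in this PR: `retryWithCap`, `RetryOptions` and `onExhausted` are hypothetical names, and the doubling delay is an assumption, not the behaviour of the backOff call in the diff.

```typescript
// Hypothetical retry helper: tries `fn` up to `maxAttempts` times, doubling
// the wait between attempts, and reports once if every attempt fails.
type RetryOptions = {
  maxAttempts: number
  startingDelayMs: number
  // Called once when all attempts fail, e.g. a log line an alarm can key on later.
  onExhausted: (lastError: unknown) => void
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

const retryWithCap = async <T>(
  fn: () => Promise<T>,
  { maxAttempts, startingDelayMs, onExhausted }: RetryOptions
): Promise<T> => {
  let lastError: unknown
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    try {
      return await fn()
    } catch (err) {
      lastError = err
      // No point waiting after the final attempt.
      if (attempt < maxAttempts) {
        await sleep(startingDelayMs * Math.pow(2, attempt - 1))
      }
    }
  }
  onExhausted(lastError)
  throw lastError
}
```

Keeping the exhaustion hook separate from the retry loop means logging can be added now and swapped for an alarm later without touching the retry logic.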

@kishore03109 pls test once on staging/local

src/middleware/routeHandler.ts

src/routes/v2/authenticatedSites/contactUs.js

src/middleware/routeHandler.ts

kishore03109 · 2023-12-06T05:48:09Z

src/services/db/GitFileSystemService.ts

@@ -539,6 +539,29 @@ export default class GitFileSystemService {
        )
        .orElse(() =>
          // Retry push once
+          ResultAsync.fromPromise(


@harishv7 ps should have been clearer. specifically here, why do we have a retry here ah? + why is it that on the second retry we pass in different args?

Oh this is just cos in case the first attempt fails, we have a backup to retry once more.

regarding the options, I think there may be a bug - we should be passing in the same options both for first & 2nd attempts

do you know why it (i think consistently) fails on file rename?

making all retries use the same options. I am not too sure why we need retries anyway (kept hitting it during testing), so am keeping the verbose code as is for now, functionality first

do you know why it (i think consistently) fails on file rename?

hmm, consistent failure on 1st try is not an expected behaviour
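
To illustrate the "same options on both attempts" point from this thread, here is a minimal sketch. It uses plain promises instead of the ResultAsync chain in the diff, and `Push`, `buildPushArgs` and `pushWithOneRetry` are hypothetical names; `Push` is a stand-in for the git client's push call, not its real signature.

```typescript
// Hypothetical stand-in for the git client's push call.
type Push = (args: string[]) => Promise<void>

// Build the argument list once so that every attempt sends the same options.
const buildPushArgs = (gitOptions: string[], isForce: boolean): string[] =>
  isForce ? [...gitOptions, "--force"] : [...gitOptions]

const pushWithOneRetry = async (
  push: Push,
  gitOptions: string[],
  isForce: boolean
): Promise<void> => {
  const args = buildPushArgs(gitOptions, isForce)
  try {
    await push(args)
  } catch (firstErr) {
    // Retry once with exactly the same arguments as the first attempt.
    await push(args)
  }
}
```

Computing the arguments once, outside the attempts, makes it hard for the retry to drift from the first call the way the original second attempt did.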

seaerchin · 2023-12-06T08:41:30Z

src/middleware/routeHandler.ts

+) => {
+  const result = await gitFileSystemService.hasGitFileLock(repoName, true)
+  if (result.isErr()) {
+    next(result.error)


that bugfix :monkas:

this is copy-pasted code :sadge:

harishv7

@kishore03109 can add a follow up minor todo - add alarm if 5 retries all fail

seaerchin

focused mostly on ensuring no regression; can't be fully sure but kinda sure.

we should aim to fix low-hanging fruits here + checking the scope to see that the changes made on API are safe

seaerchin · 2023-12-06T08:44:55Z

src/middleware/routeHandler.ts

+  repoName: string,
+  next: (arg0: any) => void
+) => {
+  const result = await gitFileSystemService.hasGitFileLock(repoName, true)


i assume we add the true here because we wanna show the staging site right?

separately, i realised that earlier on in 1 of your PRs, we removed a default argument. the majority of callsites should then require the isStaging prop (probably set to true now); can i check if this has been done for all the call sites? (i think yes la but i lazy check)

also it's actually really difficult to tell when it's isStaging vs not, which opens us up to errors but that's out of scope.

//todo to self, check call sites for hasGitFileLock

seaerchin · 2023-12-06T08:46:08Z

src/middleware/routeHandler.ts

+    return false
+  }
+  return true


style - we can just return !!isGitLocked but noted that this was taken as-is from existing code.

seaerchin · 2023-12-06T08:47:59Z

src/middleware/routeHandler.ts

+}
+
+// Used when there are no write API calls to the repo on GitHub
+export const attachReadRouteHandlerWrapper = (routeHandler: any) => async (


we don't necessarily have to do it here (this just maintains parity) but express does export a RequestHandler type, and retaining the implicit any is harmful to our codebase as it encourages (or at least does not prevent) ppl from then using any as-is or having it bleed through our code.
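
A typed wrapper along these lines might look like the sketch below. To stay self-contained it declares local stand-in types (`Req`, `Res`, `Next`, `Handler`) instead of importing from express, and `wrapReadHandler` is a hypothetical name; in the real file the handler type would come from express's own typings.

```typescript
// Local stand-ins for the express request/response/next types.
type Req = { params: Record<string, string> }
type Res = { locals: Record<string, unknown> }
type Next = (err?: unknown) => void
type Handler = (req: Req, res: Res, next: Next) => Promise<void> | void

// Typed replacement for `(routeHandler: any) => ...`: callers must now pass
// something with the handler shape, and errors are forwarded to `next`.
const wrapReadHandler = (routeHandler: Handler): Handler => async (
  req,
  res,
  next
) => {
  try {
    await routeHandler(req, res, next)
  } catch (err) {
    next(err)
  }
}
```

With the parameter typed, passing a non-function or a handler with the wrong argument shape fails at compile time instead of at request time.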

seaerchin · 2023-12-06T08:55:15Z

src/middleware/routeHandler.ts

+    false
+  )
+
+  const isGitAvailable = await handleGitFileLock(siteName, next)


this is taken as-is but there's a difference in behaviour between this method and the earlier one.

notably, earlier on, we only check isGitAvailable if IS_GGS_ENABLED is set - in here, we always check regardless of the flag.

within the handleGitFileLock method itself, i think it just returns false if result.isErr() so this should be safe but we might want to simplify this process if isGitAvailable is computed using IS_GGS_ENABLED and handleGitFileLock.

seaerchin · 2023-12-06T08:58:10Z

src/middleware/routeHandler.ts

+  }
+
+  let originalStagingCommitSha: any
+  let originalStagingLiteCommitSha: any


i think we should set the type here (and above) - this is quite low effort as the method used to determine the sha is typed.

seaerchin · 2023-12-06T09:25:20Z

src/middleware/routeHandler.ts

+        )
+        await backOff(
+          () =>
+            revertCommit(
+              originalStagingLiteCommitSha,
+              siteName,
+              accessToken,
+              STAGING_LITE_BRANCH
+            ),
+          backoffOptions
+        )
+      }
+    } catch (retryErr) {
+      await unlock(siteName)


tbvh this whole chunk is very jank and confusing but this PR doesn't focus on code clarity but migration from js -> ts so ok with it. we should, however, look into probably refactoring this into a sensible form.
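
As a starting point for that refactor, the overall flow (record both SHAs, attempt the write, revert every branch on failure, always unlock) could be sketched as follows. This is an assumption-laden sketch and not the handler in this PR: `runWithRollback`, `RollbackDeps` and the branch names are hypothetical, and retries around the revert are left out for brevity.

```typescript
// Hypothetical branch names and dependencies; the real handler uses the
// services and constants shown in the diff.
type Branch = "staging" | "staging-lite"

interface RollbackDeps {
  getLatestSha: (branch: Branch) => Promise<string>
  revertTo: (sha: string, branch: Branch) => Promise<void>
  unlock: () => Promise<void>
}

const runWithRollback = async (
  write: () => Promise<void>,
  branches: Branch[],
  deps: RollbackDeps
): Promise<void> => {
  try {
    // Record the last known good commit on every branch before writing.
    const original: Array<{ branch: Branch; sha: string }> = []
    for (const branch of branches) {
      original.push({ branch, sha: await deps.getLatestSha(branch) })
    }
    try {
      await write()
    } catch (writeErr) {
      // A failure on one branch can leave the branches out of sync,
      // so every branch is moved back to its recorded commit.
      for (const { branch, sha } of original) {
        await deps.revertTo(sha, branch)
      }
      throw writeErr
    }
  } finally {
    // Release the lock whether the write succeeded, failed, or the revert threw.
    await deps.unlock()
  }
}
```

Typing the recorded SHAs as string here also covers the earlier comment about replacing the any on originalStagingCommitSha and originalStagingLiteCommitSha.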

seaerchin · 2023-12-06T09:25:56Z

src/routes/v2/authenticatedSites/settings.js

@@ -136,7 +135,7 @@ class SettingsRouter {
    router.post(
      "/repo-password",
      this.authorizationMiddleware.verifyIsEmailUser,
-      attachWriteRouteHandlerWrapper(this.updateRepoPassword)
+      attachReadRouteHandlerWrapper(this.updateRepoPassword)


shouldn't updating a repo's password be a write op?

seaerchin · 2023-12-06T09:26:52Z

src/services/db/GitFileSystemService.ts

@@ -542,10 +542,33 @@ export default class GitFileSystemService {
            isForce
              ? this.git
                  .cwd({ path: `${efsVolPath}/${repoName}`, root: false })
-                  .push(["--force"])
+                  .push([...gitOptions, "--force"])


why the change here?

pushes are supposed to have the required options

seaerchin · 2023-12-06T09:29:49Z

src/services/db/GitFileSystemService.ts

+            }
+          )
+        )
+        .orElse(() =>


in order for this to work, should we check that the previous failure was over the network (ie, github doesn't see the commit)? otherwise, we might have a failure from a non-github issue -> push -> dup commits or fail, isn't it?

after discussion, resolving this as an extreme edge case

seaerchin · 2023-12-06T09:31:02Z

src/utils/neverthrow.ts

+ * expect a .catch() method on the returned promise. This should not be used in most
+ * control flows as it removes the benefits that neverthrow provides.
+ */
+const convertNeverThrowToPromise = <T, E>(x: ResultAsync<T, E>): Promise<T> =>


does x._unsafeUnwrap work here?

just tried, the below snippet works!

const res = await x
return res._unsafeUnwrap()

done 097f1f4

kishore03109 · 2023-12-06T10:51:33Z

Merge activity

Dec 6, 5:51 AM: @kishore03109 started a stack merge that includes this pull request via Graphite.
Dec 6, 6:01 AM: Graphite rebased this pull request as part of a merge.

This was referenced Dec 3, 2023

feat(quickie): only gitfile should have quickie #1042

Merged

chore(quickie): delete quickie for gh #1043

Merged

This was referenced Dec 3, 2023

test(githubService): add tests #1045

Merged

feat(gitFileSystem): safer api #1046

Merged

test(gitCommitService): add test cases #1047

Merged

kishore03109 requested a review from a team December 3, 2023 22:52

kishore03109 marked this pull request as ready for review December 3, 2023 22:53

kishore03109 removed the request for review from a team December 3, 2023 23:05

kishore03109 marked this pull request as draft December 3, 2023 23:05

kishore03109 changed the base branch from 12-04-chore_quickie_delete_quickie_for_gh to 12-04-test_gitCommitService_add_test_cases December 4, 2023 04:13

kishore03109 force-pushed the 12-04-fix_rollback_handler_convert_to_ts_for_safety branch from 518dace to a257d11 Compare December 4, 2023 04:13

kishore03109 mentioned this pull request Dec 4, 2023

hotfix(repair-form): set remote url correctly #1048

Merged

kishore03109 force-pushed the 12-04-test_gitCommitService_add_test_cases branch from 3d72484 to 1bce332 Compare December 4, 2023 04:58

kishore03109 force-pushed the 12-04-fix_rollback_handler_convert_to_ts_for_safety branch from a257d11 to 480621f Compare December 4, 2023 04:58

kishore03109 marked this pull request as ready for review December 4, 2023 17:29

kishore03109 requested a review from a team December 4, 2023 17:29

kishore03109 force-pushed the 12-04-test_gitCommitService_add_test_cases branch from 1bce332 to 6f5c0d1 Compare December 6, 2023 01:20

kishore03109 force-pushed the 12-04-fix_rollback_handler_convert_to_ts_for_safety branch from b7cd529 to f601f8d Compare December 6, 2023 01:21

kishore03109 force-pushed the 12-04-test_gitCommitService_add_test_cases branch from 6f5c0d1 to 0b1d0b9 Compare December 6, 2023 02:07

kishore03109 force-pushed the 12-04-fix_rollback_handler_convert_to_ts_for_safety branch from f601f8d to 5ea51b9 Compare December 6, 2023 02:07

kishore03109 force-pushed the 12-04-test_gitCommitService_add_test_cases branch from 0b1d0b9 to 79ec119 Compare December 6, 2023 02:08

kishore03109 force-pushed the 12-04-fix_rollback_handler_convert_to_ts_for_safety branch from 5ea51b9 to 42c9594 Compare December 6, 2023 02:08

harishv7 reviewed Dec 6, 2023

kishore03109 commented Dec 6, 2023

kishore03109 requested a review from harishv7 December 6, 2023 05:48

kishore03109 force-pushed the 12-04-fix_rollback_handler_convert_to_ts_for_safety branch from 42c9594 to d27ba8a Compare December 6, 2023 07:14

kishore03109 force-pushed the 12-04-test_gitCommitService_add_test_cases branch from 33bff68 to 46c7528 Compare December 6, 2023 07:36

kishore03109 force-pushed the 12-04-fix_rollback_handler_convert_to_ts_for_safety branch from d27ba8a to 6ff9598 Compare December 6, 2023 07:37

kishore03109 force-pushed the 12-04-test_gitCommitService_add_test_cases branch from 46c7528 to 3e85ab2 Compare December 6, 2023 07:53

kishore03109 force-pushed the 12-04-fix_rollback_handler_convert_to_ts_for_safety branch 2 times, most recently from a8abd09 to 3df3e80 Compare December 6, 2023 08:12

seaerchin reviewed Dec 6, 2023

harishv7 approved these changes Dec 6, 2023

seaerchin reviewed Dec 6, 2023

kishore03109 requested a review from seaerchin December 6, 2023 10:45

kishore03109 force-pushed the 12-04-test_gitCommitService_add_test_cases branch from d2ab13c to b995436 Compare December 6, 2023 10:58

kishore03109 changed the base branch from 12-04-test_gitCommitService_add_test_cases to develop December 6, 2023 11:00

kishore03109 added 9 commits December 6, 2023 11:01

fix(rollback handler): convert to ts for safety

88df7c4

Untitled commit

d71577a

Untitled commit

5d5d865

Untitled commit

600cb59

Untitled commit

51ffe5a

fix(retry): add max 5 retries)

e2946e1

feat(loggin): add logging for possibility of alarm

7b0e17a

feat(style): use resultAsync for clarity + refactor

1617668

fix(privatisation): should be write handler

b96e0bc

kishore03109 force-pushed the 12-04-fix_rollback_handler_convert_to_ts_for_safety branch from 79cdf6a to b96e0bc Compare December 6, 2023 11:01

kishore03109 merged commit 00c532e into develop Dec 6, 2023
8 checks passed

mergify bot deleted the 12-04-fix_rollback_handler_convert_to_ts_for_safety branch December 6, 2023 11:03

This was referenced Dec 6, 2023

0.56.0 #1049

Merged

release(v0.56.0): merge to prod #1050

Merged
