Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mydumper integration #1962

Merged
merged 34 commits into from
Aug 14, 2024
Merged

Mydumper integration #1962

merged 34 commits into from
Aug 14, 2024

Conversation

abdullah-kasim
Copy link
Contributor

@abdullah-kasim abdullah-kasim commented Jul 26, 2024

Description

This PR adds support for mydumper-based SQL dumps for almost all commands that interact with SQL dumps.

Those commands are:

  • vip dev-env sync sql
  • vip dev-env import sql
  • vip search-replace

What's currently not supported is:

  • vip import sql - the post-processing involved is included in this PR, but this also requires a back-end change so that the endpoint it's communicating with can read a mydumper file. For now, it's best to use the datasync feature.

Pull request checklist

New release checklist

Steps to Test

For testing vip dev-env sync:

  1. Check out PR.
  2. Run npm run build:watch
  3. Ensure that you have a site ready that's part of the mydumper testing group. See p6jPRI-8tH shortlink on how to make a site be part of one.
  4. Create a new dev-env by running ./dist/bin/vip-dev-env-create.js @0000.production --slug=0000-production where 0000 is your testing site id. I also suggest that you add phpmyadmin for validation purposes
  5. Then, start the dev environment - ./dist/bin/vip-dev-env-start.js @0000.production --slug=0000-production
  6. Run ./dist/bin/vip-dev-env-sync-sql.js @0000.production --slug=0000-production and your local dev environment should now have its database ready - with all the search and replace working.

For testing vip dev-env import sql:

  1. Do steps 1 to 5 of testing vip dev-env sync.
  2. Run ./dist/bin/vip-dev-env-import-sql.js <path-to-mydumper-sql-file> --slug=0000-production
  3. You should see warnings about running search-replace, and the import will partially succeed with failures at creating the admin user.
  4. Re-run ./dist/bin/vip-dev-env-import-sql.js <path-to-mydumper-sql-file> --slug=0000-production --search-replace="from,to" as per its recommendation.
  5. Database should import properly this time and the dev-env at the slug 0000-production will work, just like a mysqldump-based file.

For testing vip search-replace:

  1. Do steps 1 to 5 of testing vip dev-env sync
  2. Run ./dist/bin/vip-search-replace.js <mydumper-file> --search-replace="some.from.domain,some.to.domain" --output="result.sql". The resulting file should have metadata headers i.e. -- some_db_wp_site.00000.sql -1 with a -1 at the back. It will also properly replace the values listed in search-replace.

Copy link
Contributor

github-actions bot commented Jul 26, 2024

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

OpenSSF Scorecard

PackageVersionScoreDetails

Scanned Manifest Files

src/lib/database.ts Outdated Show resolved Hide resolved
src/lib/database.ts Outdated Show resolved Hide resolved
src/lib/database.ts Outdated Show resolved Hide resolved
@abdullah-kasim abdullah-kasim marked this pull request as draft July 29, 2024 02:56
@luismulinari
Copy link
Contributor

This is awesome!

src/lib/database.ts Fixed Show fixed Hide fixed
src/lib/database.ts Fixed Show fixed Hide fixed
@abdullah-kasim abdullah-kasim changed the title WIP: Mydumper integration Mydumper integration Aug 2, 2024
@abdullah-kasim abdullah-kasim marked this pull request as ready for review August 2, 2024 07:22
@abdullah-kasim abdullah-kasim requested a review from a team August 2, 2024 07:23
Copy link
Contributor Author

@abdullah-kasim abdullah-kasim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the changes are admittedly the refactor.

The most important logic that has changed is the addition of the fixMyDumperTransform() to fix/workaround a mydumper dump's stream when it's passed over to the replace() stream, as well as the detection logic for checking if a file is a mysqldump dump or a mydumper dump.

const importArg = [ 'db', '--disable-auto-rehash' ].concat(
this.options.quiet ? '--silent' : []
);
const importArg = this.getImportArgs( dumpDetails );
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted to a function to satisfy eslint complexity limit - which I totally agree with.

this.options.quiet ? '--quiet' : []
);
await exec( lando, this.slug, cacheArg );
await flushCache( lando, this.slug, this.options.quiet );
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also more refactor by extraction here so that I could read this code more easily to understand what's going on.

*
* @return {Stream} A passthrough stream
*/
function getMockStream( eventArgs: { name: string; data?: string }[], timeout = 10 ) {
Copy link
Contributor Author

@abdullah-kasim abdullah-kasim Aug 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to refactor the tests by quite a bit - it's mocking too much and adjusting the mock is starting to take too much time. Instead, I don't mock it but instead, provide a small fixture that is also speedy.

Comment on lines +93 to +97
public get sqlFile(): string {
return `${ this.tmpDir }/sql-export.sql`;
}

private get gzFile(): string {
public get gzFile(): string {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made public to make tests easier.

sourceDb: string;
}

export const getSqlDumpDetails = async ( filePath: string ): Promise< SqlDumpDetails > => {
Copy link
Contributor Author

@abdullah-kasim abdullah-kasim Aug 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function figures out the type of sql dump that we're dealing with (either mydumper or mysqldump), and the source DB if it's a mydumper dump.

throw new Error( 'Not a supported compressed file' );
};

export const fixMyDumperTransform = () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This transform is a stream 'middleware' that modifies the metadata header in mydumper i.e. -- some_db.tbl 12345, and then replaces the 12345 with -1. This allows us to bypass the metadata bytes validation.

@@ -78,15 +88,14 @@ export class DevEnvImportSQLCommand {
const expectedDomain = `${ this.slug }.${ lando.config.domain }`;
await validateSQL( resolvedPath, {
isImport: false,
skipChecks: [],
skipChecks: isMyDumper ? [ 'dropTable', 'dropDB' ] : [],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dropTable and dropDB validations doesn't make sense in the mydumper context because isMyDumper uses --overwrite-table to perform any table/db dropping - hence we're ignoring it here.

@@ -23,7 +25,7 @@ import { getReadInterface } from '../lib/validations/line-by-line';
* @return Site home url. null if not found
*/
function findSiteHomeUrl( sql: string ): string | null {
const regex = "'(siteurl|home)',\\s?'(.*?)'";
const regex = `['"](siteurl|home)['"],\\s?['"](.*?)['"]`;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor regex change because mydumper uses double quotes while mysqldump uses single quote.

Using two regexes may be more accurate, but this change in the regex seems to be good enough from my testing.

src/lib/database.ts Outdated Show resolved Hide resolved
@abdullah-kasim abdullah-kasim changed the base branch from trunk to esp-72/readability-refactor August 2, 2024 09:03
@abdullah-kasim
Copy link
Contributor Author

abdullah-kasim commented Aug 2, 2024

For future ref, here's some minimal performance tests that I've run with importing a 3 GB dump:

# vip dev-env import - mydumper - 10 SRs with wp search-replace
real    20m55.033s
user    0m31.632s
sys     0m5.237s

# vip dev-env import mydumper - 0 SRs
real    3m55.207s
user    0m30.658s
sys     0m6.472s


# 17 minutes for 10 SRs
# 1.7 minutes per SR with wp search-replace


# vip dev-env import - mysqldump - 0 SRs:
real    10m15.278s
user    0m19.742s
sys     0m6.805s

# mysqldump - no streaming by copying file into docker container.
real    10m4.920s
user    0m11.425s
sys     0m2.730s

# vip search-replace - 10 SRs with go-search-replace.
real    0m32.049s
user    0m33.223s
sys     0m12.168s


# winner: go-search-replace with mydumper.

Base automatically changed from esp-72/readability-refactor to trunk August 2, 2024 09:18
Copy link
Member

@sanmai sanmai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Followed the testing steps, no issues 👍

Copy link

@abdullah-kasim abdullah-kasim merged commit 893df70 into trunk Aug 14, 2024
16 checks passed
@abdullah-kasim abdullah-kasim deleted the esp-72/mydumper-support branch August 14, 2024 05:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants