Support Dolt table diffs #94

Open
addisonklinke opened this issue Apr 7, 2022 · 4 comments
addisonklinke commented Apr 7, 2022

Dolt is a versionable MySQL database that can commit, branch, push, and pull just like a git repository. The diff output is similar enough to git's that PatchSet is able to parse it. However, the number of lines (or in Dolt's case the number of table rows) added/removed does not seem to get tracked correctly.

For a basic Dolt setup, see this minimal example I made. Taking the diff string of the objects table from my example, I tried unsuccessfully to parse the additions/deletions with unidiff:

from io import StringIO
from textwrap import dedent
from unidiff import PatchSet

dolt_diff = dedent("""
    diff --dolt a/objects b/objects
    --- a/objects @ 73hiqmiduef0sqtecba4fav7vuuvdk2l
    +++ b/objects @ 1hq161cev9kkt6eukvap0jmrfeedvt9j
    +-----+----+---------+------------------+
    |     | id | label   | bbox             |
    +-----+----+---------+------------------+
    |  <  | 1  | cat     | [1, 2, 3, 4]     |
    |  >  | 1  | cat     | [3, 4, 5, 6]     |
    |  <  | 2  | dog     | [10, 20, 30, 40] |
    |  >  | 2  | poodle  | [10, 20, 30, 40] |
    |  <  | 3  | dog     | [5, 6, 7, 8]     |
    |  >  | 3  | bulldog | [5, 6, 7, 8]     |
    +-----+----+---------+------------------+
""")

patch_set = PatchSet(StringIO(dolt_diff))
for t, table in enumerate(patch_set):
    table_name = table.path.split('@')[0].strip()
    print(f'Dolt table {t}={table_name}: {table.added} additions / {table.removed} deletions')

This outputs

Dolt table 0=objects: 0 additions / 0 deletions  

I expected 3 additions / 3 deletions, based on the < and > markers in the first column of the ASCII table. Is there a way to support Dolt's table diffs?

@addisonklinke
Author

@matiasb Curious if you have any update on this?

@matiasb
Owner

matiasb commented Aug 17, 2022

Hi! I think this goes beyond the original goal of the project, so I'm not really sure how it would work as part of unidiff, although I can see it would be useful to have something like that for your use case.
Having said that, it doesn't seem complex to implement using the existing code as a base. Maybe we can get a branch started and see how that looks? Alternatively, it could be a fork and become something independent.

@addisonklinke
Author

Is there a standard way you could see external projects writing a plugin to support their diff format? That seems like it could be a good solution. If so, I can take a look or raise the issue with the Dolt maintainers to get that written.

@matiasb
Owner

matiasb commented Aug 31, 2022

Right now there is no easy, pluggable way to specify a custom diff format (it wasn't previously considered either). That would require some work. I think the simpler path to get something working, given that this seems to be a specific scenario and scope, would be to fork and adapt the existing code as a separate project. I can try to help and answer questions as time permits.
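
For anyone landing here, below is a rough sketch of what the "separate thing" could look like without touching unidiff at all. It is not part of unidiff or Dolt; `TableDiff` and `parse_dolt_diff` are hypothetical names. It assumes each table starts with a `diff --dolt a/<name> b/<name>` header (as in the sample above) and that the first cell of every data row holds a single marker. Treating `<` as a removal and `>` as an addition follows the expectation in the original report; also accepting `-` and `+` markers for deleted and inserted rows is an assumption on my part.

```python
import re
from collections import namedtuple

# Hypothetical container; not a unidiff or Dolt type.
TableDiff = namedtuple("TableDiff", ["name", "added", "removed"])

# Header line that opens the diff for one table (format taken from the sample).
HEADER_RE = re.compile(r"^diff --dolt a/(\S+) b/(\S+)")
# Data row whose first cell holds a single marker, e.g. "|  <  | 1  | cat |".
# Border lines start with "+" and the column header has an empty first cell,
# so neither matches.
ROW_RE = re.compile(r"^\|\s*([<>+-])\s*\|")


def parse_dolt_diff(text):
    """Count changed rows per table in a Dolt tabular diff string."""
    tables = []
    name, added, removed = None, 0, 0
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = HEADER_RE.match(line)
        if header:
            # A new table starts: flush the counts of the previous one.
            if name is not None:
                tables.append(TableDiff(name, added, removed))
            name, added, removed = header.group(2), 0, 0
            continue
        row = ROW_RE.match(line)
        if row and name is not None:
            # Assumption: ">" and "+" count as additions, "<" and "-" as removals.
            if row.group(1) in (">", "+"):
                added += 1
            else:
                removed += 1
    if name is not None:
        tables.append(TableDiff(name, added, removed))
    return tables
```

Calling `parse_dolt_diff(dolt_diff)` on the string from the original report returns a single entry for the `objects` table with 3 additions and 3 removals. A modified row shows up as one `<` and one `>` line, so this counts it once on each side, which matches the expectation in the report but may not be what every caller wants.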
