Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor auto-commit Base images #66

Closed
josecelano opened this issue Jan 4, 2022 · 0 comments · Fixed by #81
Closed

Refactor auto-commit Base images #66

josecelano opened this issue Jan 4, 2022 · 0 comments · Fixed by #81

Comments

@josecelano
Copy link
Member

This is a subtask of this issue-47

We have been discussing how to improve the auto-commit step. In this issue, we are going to define the new specification for this step.

The Gold Image Processing command input is the DVC diff between two git commits:

{
  "added":[
    {
      "path":"data/000001/32/000001-32.600.2.tif"
    }
  ],
  "deleted":[
    {
      "path":"data/000002/32/000002-32.600.2.tif"
    }
  ],
  "modified":[
    {
      "path":"data/000003/32/000003-32.600.2.tif",
      "hash": {
        "old": "2b5e0fdd787b569b4fcb4bafa543d270",
        "new": "f237c73d43f87f84081796e3c0e22e37"
      }
    }
  ],
  "renamed": [
    {
      "path": {
        "old": "data/00000444/32/00000444-32.600.2.tif",
        "new": "data/000004/32/000004-32.600.2.tif"
      }
    }
  ],
}  

More info about the DVC diff command: https://dvc.org/doc/command-reference/diff

The Gold Image Processing command processes the changes in the DVC files (the changes only related to Gold images).

Added case

Current behaviour

When a new Gold image is added the command output should be:

  • A newly signed commit for each new image should be created.
  • We are merging PR with GitHub interface and "merge" option. That means a new merge commit will be created.

Proposal 1: custom merge commit message

After creating all commits (not only for added images) the final merge commit message has to have a defined format, something like this:

Example of auto-generated merge commit message:

commit 7e9a23ef4593bca0bcadade7ecf18d678f8d6ae8 (HEAD -> main, origin/main)
Merge: 39035bf 5316dad
Author: cgbosse <[email protected]>
Date:   Tue Jan 4 10:19:43 2022 +0000

    Merge pull request #65 from Nautilus-Cyberneering/issue-20-conventional-commits
    
    Add link to conventional commits site

New pre-defined merge message:

commit 7e9a23ef4593bca0bcadade7ecf18d678f8d6ae8 (HEAD -> main, origin/main)
Merge: 39035bf 5316dad
Author: cgbosse <[email protected]>
Date:   Tue Jan 4 10:19:43 2022 +0000

    Merge pull request #65 from Nautilus-Cyberneering/issue-20-rename-gold-image
    
    Gold Image Processing for `dvc diff f846060 47e84cd`

    {
        "added": [],
        "deleted": [],
        "modified": [],
        "renamed": [
            {"path": "folder/renamed_file.txt"},
        ],
    }

We will add the dvc diff we have processed.

Pros:

  • We have a log of all previous commits

Cons:

  • We can not create the merge commit automatically with the GitHub interface. It forces us to use a merge script like this. And that can make things harder for contributors, especially for non-developers contributors.

Proposal 2: add the information to all auto-commits

Instead of adding that information to the merge commit, we can add it to each commit. At the end of the day, the commit message should explain why that commit was created.

Pros:

  • We can continue using the GitHub interface to merge PRs.
  • If the workflow fails we already have the information in the previous commits.

Proposal 3: use git notes for commit "metadata"

If we want to add extra information to git about why some commits were created we could use git notes.

After merging a branch we could add a note to the merge commit.

Cons:

  • It seems it's no longer supported by GitHub.

Other alternatives

  • Initial commit. Create a previous empty commit with the dvc diff info. git commit --allow-empty -m "gold-image-processing". The message could contain the dvc diff json.
  • Final commit. For some cases, it could make sense to add an extra commit at the end of a process in order to trigger other events. The commit could be like an event/notification to other parts of the system. For example, when we update dvc library (https://github.com/Nautilus-Cyberneering/chinese-ideographs-website/blob/main/.github/workflows/import-base-images.yml) in a parent repo we could trigger more than one secondary workflow, on secondary branches. After finishing the secondary tasks on the secondary branches, we could automatically merge those secondary branches into the parent branch if they have the special "final commit".
  • Both, initial and final commit.

Deleted case

TDB

Modified case

TDB

Renamed case

TDB

@da2ce7 @yeraydavidrodriguez could you add your thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant