Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: GithubRepo is modeled incorrectly #1083

Open
begedin opened this issue Oct 18, 2017 · 4 comments
Open

RFC: GithubRepo is modeled incorrectly #1083

begedin opened this issue Oct 18, 2017 · 4 comments

Comments

@begedin
Copy link
Contributor

begedin commented Oct 18, 2017

Problem

#1080 added a unique index to the github_repos schema.

#1081 also changes behavior when syncing github repos to match by github_id

The model for a GithubRepo, however, looks like this (important bits only):

schema "github_repos" do
  field :github_id, :integer

  belongs_to :github_app_installation, CodeCorps.GithubAppInstallation
end

The problem here is, in a real world, a github repository is not uniquely identified by an installation, but by it's github_id, as we are aiming to change.

In our specific case, a belongs_to :github_app_installation simply tells us which installation provided us with the information about the repository.

I guess our initial goal was to model it so each installation stores it's own copy, even though, on github, it's the same record. With the addition of other records, this no longer works optimally.

However, with the way it all works, since in our model, a repo can only belong to a single installation, we will encounter A LOT of difficulties as we start dealing with multiple installations.

Proposed general solution

  • remove the "belongs_to" relationship
  • add a "link" table and change it to a many-to-many relationship
    • not sure about naming GithubAppInstallationGithubRepo, GithubInstallationRepo? GithubAppInstallationRepo, InstallationGithubRepo? GithubAppInstallationRepo sounds best to me.

Specific changes needed

In CodeCorps.GitHub.Event.Installation.Repos

  • the :delete_repos, :sync_repos steps are replaced with
    • :ensure_repos - find or create al repositories
    • :disassociate_repos - delete GithubAppInstallationRepo records no longer in the installation
    • :associate_repos - create GithubAppInstallationRepo records in the installation
@begedin begedin added this to the 🚀 GitHub integration MVP milestone Oct 18, 2017
@begedin
Copy link
Contributor Author

begedin commented Oct 18, 2017

@joshsmith I put this into the integration MVP milestone, but technically, we could "launch" without resolving this, since we'll only be dealing with one github app installation for now.

However, our data is definitely modeled incorrectly here, so we should deal with it as soon as we can.

@begedin begedin changed the title Discussion: GithubRepo, as it is right now, should not be considered unique by github_id Discussion: GithubRepo is modeled incorrectly Oct 18, 2017
@joshsmith
Copy link
Contributor

We had a conversation about this offline where I clarified that a GithubRepo can only ever correspond to a single GithubAppInstallation at one time. A GitHub user can uninstall our application from their repository and then reinstall it, but this would be an entirely new GithubAppInstallation. It's possible that we could receive events for installation repositories before we receive events for the installation, so we have to handle this possibility by allowing a GithubRepo to change installations.

We can also mitigate the problem of not having access to the installation by fetching it from the API if it does not currently exist.

This issue likely needs rewritten to deal with the above, but for now, it's low priority.

@joshsmith
Copy link
Contributor

@begedin would you mind clarifying the relevance of this issue for me now? I'm not sure it's relevant any longer at all, but I could be wrong.

@begedin
Copy link
Contributor Author

begedin commented Nov 20, 2017

@joshsmith I went through the code, and near as I can tell, this is still slightly relevant, but probably as a different issue altogether.

A repo is tied to a single installation either way, with the only possible case happening where an installation is removed, followed by a new one being added to the same account and the same repo.

Our installation_repositories "added" event syncs repos by doing the following for each repo payload

  • finding by {installation_id, github_id}
  • creating if not found - this will cause an error if a different installation owns the repo, due to unique constraint
  • just updating the name field if found - weird that we only update that one field. Can you think of why we would do this?

installation_repositories "removed" simply deletes each GithubRepo payload

Basically, there is some strangeness in the process. I would say, to make this as stable as possible we should

In the case of added

  • find by :github_id only. That's the unique identifier
  • update all basic attributes if found
  • associate with installation both on find and create in case of added event - this is so when a repo is added to a different installation, the process doesn't fail and the installation simply takes over the repo

In the case of removed

  • delete any repos not associated to a project
  • flag any repos associated to a project, so we can warn the user the integration broke and they need to fix it
  • once they re-add the repo, we can recover the integration
  • that means the added event also needs to unset the flag

Another thing we should do is handle the installation "deleted" event but there are details we need to consider here.

A potential solution would be

  • delete the installation (we wont get it back)
  • delete any repos not associated to a project (we don't get any benefit in keeping them)
  • flag any repos associated to a project, so the user could be warned their integration is broken and they need to reconnect
  • once they install our app again, and add one of their broken repos to it, we can recover

I'm not sure we want to do this as part of this issue, or by creating a new one, but I think at least parts of this are worth doing.

@joshsmith joshsmith added the RFC label Jan 1, 2018
@joshsmith joshsmith changed the title Discussion: GithubRepo is modeled incorrectly RFC: GithubRepo is modeled incorrectly Jan 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants