Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow rsync server and path to be overridden by .gitconfig #40

Open
KeithHanlan opened this issue Mar 17, 2015 · 25 comments
Open

allow rsync server and path to be overridden by .gitconfig #40

KeithHanlan opened this issue Mar 17, 2015 · 25 comments

Comments

@KeithHanlan
Copy link

In my development environment, there is no single server which is accessible to all potential users. However, every user has access to a server which mounts the git-fat object directory. Therefore, I would like to be able to override the default remote configuration value in .gitfat with one read from .gitconfig. That way, I could use a server accessible to jenkins in the .gitfat file but allow users to provide their own values as necessary.

@abraithwaite
Copy link

Hmm, you've actually touched on something I've been wanting to do which is to have any variable in the .gitfat file be overridden by the .gitconfig file.

What you can do in the meantime is alias git-fat="git-fat -c /path/to/custom/config".

@KeithHanlan
Copy link
Author

Thank you for the quick response @abraithwaite.

I am looking into what would be required to add this functionality and have a few design questions to answer before I go down a route that you might find unpalatable.

Which configuration file takes precedence?

I intend to make use of gitconfig_get() which will look, in order, through:

  • system,
  • ${XDG_CONFIG_HOME:-${HOME}}/git/config,
  • ${HOME}/.gitconfig, and finally,
  • ${GIT_DIR:-git rev-parse --show-toplevel}/config

With git-fat, we have what is essentially another config-file but one which may be a tracked file or another specified with the -c option. Given that -c overrides .git_fat, where in the heirarchy should we place the results of git config --get?

The problem is that -c is currently thought of as an override but I want to use get config --get as an override as well. Semantically, we probably prefer to search the repo .git_fat file first, followed by git config, and finally the argument of -c if specified. Such an approach would make it possible to deprecate the .git_fat file altogether. In conjunction with hooks preventing the push of binary files, that could be a desirable approach.

Regardless of what is chosen, the change is easily confined to _parse_config()

What do we call the git-fat configuration settings in a git config file?

In .git_fat, the scope is self-evident. In a git-config file, we need to label git-fat settings. For example:

git config --global --add git-fat.remote.rsync "a:b"

results in:

[git-fat "remote"]
        rsync = a:b

Should we use simply dot everything so it looks like the above or should we instead use sub-sections as documented in git-config(1). What if we want different settings for different repos but in the global config file?

Which configuration values do we support in git-config files?

With .git-fat, we simply read the file and parse everything using ConfigParser. With get-config, we have to pick which settings we wish to query.

I'm not interested in making this overly complicated and for my purposes, I only really want to be able to override rsync.remote. But obviously, the incremental cost for a more general solution is low.

Comments and suggestions?

Thank you again,
Keith

@abraithwaite
Copy link

I intend to make use of gitconfig_get() which will look, in order, through..

Yep, it already does that if None is passed in. The main problem is that we're always passing in a config file, so None is never an argument to that function.

Such an approach would make it possible to deprecate the .git_fat file altogether

No, we can't have this. Having the file in the repository is what allows others to access the same remote. I don't want every user to have to configure it themselves after the fact.

What do we call the git-fat configuration settings in a git config file?

To start, I would use the same sections we have in the .gitfat: (rsync, http, copy). I'm not opposed to namespacing them if they're going to be merged with git's configs, but I'm not sure that should be included in this patch.

@KeithHanlan
Copy link
Author

Such an approach would make it possible to deprecate the .git_fat file altogether

No, we can't have this. Having the file in the repository is what allows others to access the same remote. I don't want every user to have to configure it themselves after the fact.

Everything that is in .gitfat could be also stored in .git/config could it not?

But I take your point, .git/config does have to be configured as well.

@avicent
Copy link

avicent commented Mar 25, 2015

@abraithwaite

After talking to Keith, who is a colleague of mine, and doing some playing around, I have come up with two different working proposals and we would like to know if you would advocate either of them in the event of a pull request:

Proposal 1

Introduce support for multiple .gitfat config files, so that those would be looked for (and read) in the following order:

  • Site config file pointed to by the environment variable GIT_FAT_DEFAULTS
  • User .gitfat file ~/.gitfat
  • Repo .gitfat file git rev-parse --show-toplevel /.git (i.e. today's default file)

Files are parsed with ConfigParser.SafeConfigparser (as today) in the order above, so that the content of the repo .gitfat takes precedence.

We make use of ConfigParsers %()s interpolation to construct remote or other settings. For instance, a repo .gitconfig could look like:

[rsync]
remote = %(synchost)s:%(syncpath)s/repo-specific-store

While the site or the user configuration could look like:

[rsync]
synchost=aaa.bbb.ccc
syncpath=/var/tmp/gitfat/store

The change in Proposal 1 is restricted to parsing a list of configs instead of a single one.

Proposal 2

Introduce a gitfat-specific interpolation syntax that resolves keywords through git config --get
For instance, a repo .gitconfig could look like:

[rsync]
remote = [myorg.synchost||aaa.bbb.ccc]:[myorg.syncpath||/var/tmp/gitfat/store]/repo-specific-store

Meaning that for each [<setting-1>||<setting-2>||...||<default>] group it will use git-fat --get <setting-1>, if it is found it would use that, otherwise it would try <setting-2> and so on and if no setting is found it would use the <default> value.

The change in Proposal 2 is restricted to a function that does the interpolation, which is applied on the options returned by _parse_config.

Any opinions?

@justinclift
Copy link

Proposal 1 sounds simpler to implement + grok, and gives both SysAdmin + user control over repos.

What are the benefits of Proposal 2 compared to it?

Btw, they don't really sound mutually exclusive do they? They could both work together...?

@abraithwaite
Copy link

@justinclift beat me to it! 😄

Pretty much what he said. The only thing I would really change is the files. I think that git fat should read git's config. Lot's of other git plugins do this already and I don't want to pollute people's home directory with even more configs.

The order I would parse them is such (highest priority to lowest):

$GIT_CONFIG
$(git rev-parse --git-dir)/config
~/.gitconfig
$(git rev-parse --show-toplevel)/.gitfat

I don't really see value in Proposal 2 at the current time. Instead of interpolation, It would make more sense to have named remotes.

@avicent
Copy link

avicent commented Mar 25, 2015

Thanks for your quick replies.

Regarding Proposal 1 and ~/.gitfat vs ~/.gitconfig:

The problem I saw with the git config files is that ConfigParser didn't like indentation in the different section items, so when I tried that approach (which was my initial thought too), I had to unindent my ~/.gitconfig. The machine I was running on had Python 2.7.5 installed.

Forcing users to modify their .gitfiles because of this wouldn't be elegant, so that's where the ~/.gitfat came into the picture. Do you have any suggestions here?

Regarding benefits of Proposal 2

The intended benefit here was related to the fact that ConfigParser (on my machine) couldn't parse my ~/.gitconfig file because of indentation, and also related to the fact that the %()s interpolation on Python2.7 doesn't support foreign-section interpolation, which could allow for more flexibility in the configuration space. The downside as you know is that it resorts to a made-up syntax that can be considered alienating.

... and yes, they could work together.

@abraithwaite
Copy link

Hmm, I couldn't believe this until I saw it. Switching to ConfigParser was my doing. JedBrown's git-fat uses git config --get which now makes sense.

Of course INI files are not a well defined standard so I'll have to think about this a bit. Might be best just to backport or monkey patch it.

@abraithwaite
Copy link

Just checked python3 and leading spaces are still not supported. What do you think, use git config --get or monkey patch the regular expression?

@justinclift
Copy link

Monkey patching is kind of badly regarded isn't it?

@abraithwaite
Copy link

Generally, yes. The reason why it's bad though (I believe) is because the patch is hard to track down in large python projects. Keeping git-fat to one file should mitigate most of the issues associated with monkey patching. In fact, back-porting fixes (as we already do) is a form of monkey patching itself.

The advantage to monkey patching is that we don't have to execute an expensive subprocess call. The disadvantage is that we don't parse the config the same way git does.

@justinclift
Copy link

k. This is only called once (or a few times) at the start of a git-fat run isn't it? It's not a time critical piece of code which needs optimisation along those lines?

To my thinking 😉, if it's not, then parsing the config the same way git does sounds of more benefit. If this code is time critical though, the other monkey patching approach might win?

That being said... I'm only an onlooker here. 😀

@abraithwaite
Copy link

k. This is only called once (or a few times) at the start of a git-fat run isn't it? It's not a time critical piece of code which needs optimisation along those lines?

Launching a subprocess is probably pretty minor compared to the number of times that the python interpreter has to start or the time it takes to copy files around. I haven't done major profiling, but I suspect that those two things are where we spend the most time. I wouldn't be opposed to either way it gets done though, which is why I presented the question. 😏

@avicent
Copy link

avicent commented Mar 26, 2015

Are you sure it doesn't work in Python 3? I tried out the Python module configparser which I believe is a backport of the one by the same name in Python3's standard library and it does work for me. It also introduces ExtendedInterpolation which allows to natively do cross-section interpolation.

The one thing I realized when testing on my local git-fat is that when you parse all these config files, you can no longer assume that the git-fat backend is the first section of the config object, so my local branch, which is now using configparser if it exists in the system, is now supporting:

[gitfat]
    backend=rsync

[gitfat "rsync"]
    remote = ${myorg:synchost}:${myorg:syncroot}/repo-specific-store

So my local branch has now support for Git config file syntax and multi-file setup while keeping backwards compatibility. The advantage of this over git config --get is that configparser supports interpolation. The disadvantage is that it introduces a dependency.

@abraithwaite
Copy link

Are you sure it doesn't work in Python 3? I tried out the Python module configparser which I believe is a backport of the one by the same name in Python3's standard library and it does work for me.

I didn't actually try it but the regular expression used in python3's config parser looks to be about the same. If we were to monkey patch it, we wouldn't want to patch the whole module of course, but just the area which allows whitespace before the variable name.

@abraithwaite
Copy link

you can no longer assume that the git-fat backend is the first section of the config object.

Yeah, this certainly isn't very robust. It was a quick solution which allowed for a default when multiple are specified.

@avicent
Copy link

avicent commented Mar 30, 2015

I have now a version that fully utilizes git config instead of ConfigParser. It supports namespaces, it does interpolation (restricted to the .gitfat file only) and would also read site-wide git-fat config settings from $GIT_FAT_CONFIG ($GIT_CONFIG turned out not to be a good idea because it messed things up with Git) and the regular Git config files (site, user, local). It is backwards compatible and the existing test-suite is passing.

On the performance, I haven't actually measured the difference, but this version is caching the config, so the amount of subprocess'ed calls to git config is pretty much the same as before (depending on $GIT_FAT_CONFIG) as it spares subsequent subprocess'ing on gitconfig_get() calls once the initialization is done.

If you have no objections I am happy to make a pull request with these changes.

@abraithwaite
Copy link

Please do! :shipit:

@avicent
Copy link

avicent commented Mar 31, 2015

Corporate bureaucracy to get approval for sharing the change is slowing the process down. I hope to get clearance in a couple of days.

@abraithwaite
Copy link

No worries, take your time. I don't think anyone is expecting this tomorrow. 😄

@justinclift
Copy link

@avicent If it looks like bureaucracy or legal approval will stop you contributing, you're welcome to ping me ([email protected]).

It may be possible to reassure any fears with regards to liability when contributing to Open Source, if that's the problem. 😄

@avicent
Copy link

avicent commented Apr 2, 2015

Thanks @justinclift . Fear is not the problem, but the process that I am asked to follow by my company (Ericsson) to be able to share work I did on company time. I was informed today that it will probably take a couple of weeks instead of days. 😞

@justinclift
Copy link

Heh Heh Heh

Good luck. 😄

@avicent
Copy link

avicent commented Jun 26, 2015

Hi, I finally got the corporate approval. It has taken quite some time. I will be submitting a pull request from the account 'ecsavin' which is connected to my corporate e-mail.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants