Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: Allow for mixed case keys when configuring from environment variables #94

Open
arcanerr opened this issue Dec 15, 2023 · 4 comments

Comments

@arcanerr
Copy link

arcanerr commented Dec 15, 2023

Current Situation

A configuration setup for myproject, that contains uppercase letters like

ABC: "an uppercase value"
abc: "a lowercase value"

currently cannot be configured via environment variables, as the keys encoded into the environment variable names are always converted to lowercase.

Requested Improvement

Allow to configure myproject for above configuration structure via

export myproject_ABC="another uppercase value"
export myproject_abc="another lowercase value"

Implementation

Assuming that the (undocumented) env_prefix parameter of the class Config is rarely used in client code I suggest, that if it is explicitly set to a value, that contains lowercase letters:

from donfig import Config

config = Config('myproject', env_prefix='myproject')

the conversion of found environment variables to lowercase is skipped:

...
if any(char.islower() for char in prefix):
    varname = name[prefix_len:].replace("__", ".")
else:
    varname = name[prefix_len:].lower().replace("__", ".")
...

Analysis

The mentioned approach is not 100% backwards compatible: If any client code uses the following:

from donfig import Config

config = Config('myproject', env_prefix='myProject')

to be able to use mixed case environment Variables

export myProject_ABC="a value"

to read in a configuration equivalent to

abc: "a value"

the change would break this.

Alternative

Add a boolean env_case_sensitive keyword parameter to the class donfig.Config and use it:

class Config:
    def __init__(
        self,
        name: str,
        defaults: list[Mapping[str, Any]] | None = None,
        paths: list[str] | None = None,
        env: Mapping[str, str] | None = None,
        env_var: str | None = None,
        root_env_var: str | None = None,
        env_prefix: str | None = None,
        deprecations: Mapping[str, str | None] | None = None,
        env_case_sensitive: bool = False,
    ):
@djhoese
Copy link
Member

djhoese commented Dec 15, 2023

Could you provide more detail on your use case? Wouldn't having parameters that are only different by capitalization be confusing to users?

In general I'd like this project to remain in sync with the dask project as far as compatibility. This might mean we have more functionality (ex. schemas, etc), but this request "feels" different. I'm not a huge fan of the implicit env_prefix lowercase/uppercase behavior nor of adding yet another keyword argument.

It is pretty standard for environment variables to be all capital letters. I'm not sure providing direct support to go against that is a good idea. I'm willing to be convinced otherwise, but right now I don't see it.

@arcanerr
Copy link
Author

arcanerr commented Dec 19, 2023

A simple use case is: I have a naming convention in an application, that (also) uses uppercase letters in configuration keys, like:

MSG:
...
MTG:
...

In such a case I cannot express via environment variables what I can express via YAML.

Yes, writing environment variables in uppercase is pretty standard, but it's also only a convention, there is no technical restriction at least on UNIXes that prevents a user from defining lowercase or mixed case environment variables. I would say, a library should not technically enforce a convention, if the eco-system it lives in does not: Neither YAML nor environment variables nor Python prevents you from using the case you want.

And regarding staying consistent with DASK: What about suggesting the same enhancement to DASK? Ultimately, in my opinion, it was already a bad idea in DASK to carry out an implicit case conversion and thereby prevent valid use cases.

At the end of the day, the proposed improvement simply allows more than DASK, it would be fully backwards compatible (except for the unlikely use case that someone would want to use environment variables of the form MyProject_U_KEY__M_keY="value" to access {u_key.m_key: "value"}. Even MYPROJECT_U_key__M_keY="value" and MYPROJECT_U_KEY__M_KEY="value" would (debatably) both still be interpreted the same way.

By the way, I would submit a pull request if there was a chance that it would be integrated.

I'd be happy with both of the suggested options to control the behaviour or even a different one that didn't come to my mind, as long as I am not forced to write my configuration keys in lowercase to be able to overwrite settings via environment variables (or to switch to a more flexible configuration library, that may not even exist yet...).

@djhoese
Copy link
Member

djhoese commented Dec 19, 2023

And regarding staying consistent with DASK: What about suggesting the same enhancement to DASK? Ultimately, in my opinion, it was already a bad idea in DASK to carry out an implicit case conversion and thereby prevent valid use cases.

Dask did not build this configuration system to work for everyone else's use cases. It grew out of necessity for their own workflow and wasn't intended for anyone else. I extracted it because I liked it and hoped that in the future dask would depend on this package instead of keeping their configuration logic bundled internally.

I would say, a library should not technically enforce a convention, if the eco-system it lives in does not: Neither YAML nor environment variables nor Python prevents you from using the case you want.

I have to disagree, but I'm not sure how strongly. Libraries should do what is expected. Following conventions and standards is expected. There is also a burden on maintenance of features, especially when those features are only used by a small subset of users. You are the first to request this feature for donfig and as far as I know no one has requested it from dask.

It isn't clear to me why the YAML and python config keys can't be lowercase. I understand that these are satellite name acronyms, but surely users and developers can time config.get("mtg").

I'm almost leaning towards breaking backwards compatibility and forcing environment variable case-sensitivity always. My reasoning being that if conventions are so important to me and others, then we should be following those conventions and not depending on "magic" .lowercase() in the config handling to workaround our typos. Having said that, this would break "helping" users with typos and could break in any system/environment where environment variables are (for some reason) lowercased. I'm not aware of any system that does that.

@mraspaud any opinions to help me make a decision on this? @arcanerr what do you think? Did I just do a complete reversal and confuse you? If you'd like to make a PR for my suggestion please go ahead.

@djhoese
Copy link
Member

djhoese commented Dec 19, 2023

Wait...no...what am I talking about. This would make all environment variables that are following conventions (all uppercase) to not match the expected lowercase YAML/Python names. Ugh.

Ok, @arcanerr what about a PR for the extra keyword argument solution? Let me at least see what it looks like if it isn't too much work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants