Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strip trailing white spaces before saving the config #161

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

s-weigand
Copy link

@s-weigand s-weigand commented Sep 11, 2020

First of all thanks for the nice tool, I have used it for 3 years and together with a CD system, releasing a new version of your package is a breeze 😄

Recently I made my first setup.cfg only package (bare minimum setup.py) and when I wanted to bump the version with (commit = True and tag = True settings as usual), my pre-commit hook removing trailing white spaces failed, rendering me unable to make a commit.

After some investigating I found that bump2version generates the following unwanted diff:

--- a/setup.cfg
+++ b/setup.cfg
...
-classifiers =
+classifiers = 
        Development Status :: 4 - Beta
....
-console_scripts =
+console_scripts = 
        my_programm = route.to:cli

Since many people have editor settings and/or other measures to remove trailing white spaces, and I don't know of any reason to add them, I decided to make this ninja PR, which makes bump2version remove them by default.

Since the line spitting is closely related to the used newline character, I integrated the test into test_retain_newline.

Confusing black magic

Even so splitting lines which have \r as newline character, with \n shouldn't work in theory it somehow does.
Also, I found that the passed value of config_newlines in the tests is often None, which is why tests failed when I tried to split on config_newlines or simply use .splitlines().
This is also the reason why I had to manually add back the \n, instead of .writelines taking care of that.

The only way for config_newlines to be None, without any other issues, would be if the file wasn't read before retrieving newlines from the file descriptor.
But it is read before:

with open(config_file, "rt", encoding="utf-8") as config_fp:
config_content = config_fp.read()
config_newlines = config_fp.newlines

And the function where it is read is always called:

config, config_file_exists, config_newlines, part_configs, files = _load_configuration(
config_file, explicit_config, defaults,
)

So I'm very confused and have the 2nd biggest problem in programming, 'It works and I don't know why!'.
Maybe you can enlighten me, about what kind of black magic is going on here 😅

Implemented the test for this into test_retain_newline since it can affect the newline character.
Added test case for mac line ending '\r'.
@s-weigand s-weigand requested a review from ekohl September 27, 2020 10:53
@florisla
Copy link
Collaborator

Does this mean that all existing users will see spaces disappear in their config file?

That would introduce a lot of unwanted diffs...

@s-weigand
Copy link
Author

@florisla Yes, this change would remove trailing spaces from existing configs and thus create additional diffs.
That said I think those additional diffs are a one time thing since I don't know of other tools that add trailing whitespaces, but many that remove them.
While the current behavior possibly creates unwanted diffs after each version bumping when the setup.cfg gets changed.
e.g.: The following scenario

  • bump2version is run and possibly adds trailing whitespaces
  • Next time the file gets edited by a person the editor setting remove trailing whitespaces the whitespaces are removed again, unrelated to the actual change
  • bump2version is run and the cycle begins again

So IMHO the current behavior creates possibly more unwanted diffs than the changed one.
Also, if the commit = True and tag = True settings are used the diff created by this change most likely won't even be looked at, while the diffs of the scenario described above would be part of the normal reviewing process.

Another noteworthy point in respect to formatting only diffs is that bump2version will create those anyway since the whole setup.cfg is rewritten and new_config.getvalue() doesn't know about the previous formatting.

Lastly, this would allow using bump2version in conjunction with setup-cfg-fmt as a pre-commit hook, which was my initial problem.

@florisla
Copy link
Collaborator

I agree that stripping the trailing whitespace is better.

The question is, will all users be happy (or don't care) if we change this.

Also, we have to decide if we still want to re-write the config file at all, but that's another issue (#185).

I have a theory on why text with return characters (\r) is also splittable on newlines (\n)... Because \r is seldom used on its own, it's always combined with a newline like \r\n. So it's either \n or \r\n. See the table at https://en.wikipedia.org/wiki/Newline .

Do note that the current newline detection has a bug... If a file contains both \n and \r\n line endings, then config_newlines is a tuple. We should probably raise an error when that's the case.

@bsolomon1124
Copy link

Reason I am a big fan of this change:

Projects that use pre-commit's "Trim Trailing Whitespace" hook will see the version-bump commits fail because setup.cfg otherwise contains whitespace prior to the commit (after its rewrite).

$ bump2version pre
Check python ast.........................................................Passed
Check builtin type constructor use.......................................Passed
Check docstring is first.................................................Passed
Check JSON...........................................(no files to check)Skipped
Check for merge conflicts................................................Passed
Check Toml...........................................(no files to check)Skipped
Check Yaml...............................................................Passed
Debug Statements (Python)................................................Passed
Tests should end in _test.py.........................(no files to check)Skipped
Fix requirements.txt.................................(no files to check)Skipped
Trim Trailing Whitespace.................................................Failed
- hook id: trailing-whitespace
- exit code: 1
- files were modified by this hook

Fixing setup.cfg

flake8...................................................................Passed
yamllint.................................................................Passed
Failed to run ['git', 'commit', '-F', '/var/folders/28/rx5hw8hd3hl48jtj0c1vz9wr0000gn/T/tmpo99tyb6k']: return code 1, output: b''
Traceback (most recent call last):
  File "/Users/<redacted>/venv/lib/python3.9/site-packages/bumpversion/vcs.py", line 31, in commit
    subprocess.check_output(
  File "/usr/local/Cellar/[email protected]/3.9.1_6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/subprocess.py", line 420, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/local/Cellar/[email protected]/3.9.1_6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/subprocess.py", line 524, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['git', 'commit', '-F', '/var/folders/28/rx5hw8hd3hl48jtj0c1vz9wr0000gn/T/tmpo99tyb6k']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/Users/<redacted>/venv/bin/bump2version", line 8, in <module>
    sys.exit(main())
  File "/Users/<redacted>/venv/lib/python3.9/site-packages/bumpversion/cli.py", line 135, in main
    context = _commit_to_vcs(files, context, config_file, config_file_exists, vcs,
  File "/Users/<redacted>/venv/lib/python3.9/site-packages/bumpversion/cli.py", line 706, in _commit_to_vcs
    vcs.commit(
  File "/Users/<redacted>/venv/lib/python3.9/site-packages/bumpversion/vcs.py", line 39, in commit
    raise exc
  File "/Users/<redacted>/venv/lib/python3.9/site-packages/bumpversion/vcs.py", line 31, in commit
    subprocess.check_output(
  File "/usr/local/Cellar/[email protected]/3.9.1_6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/subprocess.py", line 420, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/usr/local/Cellar/[email protected]/3.9.1_6/Frameworks/Python.framework/Versions/3.9/lib/python3.9/subprocess.py", line 524, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['git', 'commit', '-F', '/var/folders/28/rx5hw8hd3hl48jtj0c1vz9wr0000gn/T/tmpo99tyb6k']' returned non-zero exit status 1.

@s-weigand
Copy link
Author

s-weigand commented Feb 5, 2021

I think I finally found the black magic that puzzled me back then 😄
It looks like python sanitizes line endings when reading a file in text mode.

In [1]: with open("dummy.txt", "wb") as dummy:
      :     dummy.write(b"foo\r\nbar")
      :

In [2]: with open("dummy.txt", "rb") as dummy:
      :     print(dummy.read())
      :
b'foo\r\nbar'

In [3]: with open("dummy.txt", "r") as dummy:
      :     print(dummy.read().encode())
      :
b'foo\nbar'

Same result on windows (gitbash, cmd) and Linux (WSL2)

@florisla
Copy link
Collaborator

florisla commented Feb 5, 2021

There is another complication: files which use multiple newline styles intermixed!

When calling open() in text mode on a file, you can reference newlines. Typically this is "\n" or "\r\n", but it could be a tuple with multiple styles.

bump2version already checks newlines and returns it from _load_configuration().

Edit:
Python's newline interpretation behavior is documented in open().

@s-weigand
Copy link
Author

There are some questions I have regarding raising an error on intermixed newline, first and foremost should it be done in this PR?

  • Should the error only be raised when writing back to the file?
  • What error would appropriate to raise? e.g. plain ValueError or rather a custom error MixedLineEndingError
  • Maybe a warning and normalizing with \n (if it is in the tuple) would be a better user experience since I can't think of a reason why someone would purposefully have mixed line endings.

@Pathfinder216
Copy link

Just want to hop in here and give my +1. I find the trailing whitespace to be annoying, and I would much rather have a one-time diff that removes trailing whitespace than continuous diffs that pop up every time I have to adjust the config file. I would love for this PR to get released!

Not that I'm qualified to answer your question, but I'll give my 2 cents while I'm here: I think fixing the line ending bug is outside the scope of this PR. You haven't touched any of the code that affects which newline(s) the file uses, and it's not really relevant to the conceptual change either.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants