Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: dockerfile2llb: handle escaping of backslashes correctly #5269

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

profnandaa
Copy link
Collaborator

@profnandaa profnandaa commented Aug 22, 2024

shlex in was treating single backslash "\" in string as escape sequence, hence stripping them off. Modify processCmdEnv to double escape \\.

This was in turn affecting the vertex Name, and mostly pronounced on Windows where backslashes are prevalent in path names. For example C:\hello\world\path was being rendered as C:helloworldpath, hence confusing users.

Fixes #5250

Build progress before fix:

    #5 [2/4] RUN echo C:helloworldpath
    #5 1.359 C:\hello\world\path
    #5 DONE 1.7s

    #6 [3/4] RUN echo C:\hello\escaped\path
    #6 1.734 C:\\hello\\escaped\\path
    #6 DONE 2.1s

    #7 [4/4] RUN echo "C:\hello\quoted\path"
    #7 1.765 "C:\hello\quoted\path"
    #7 DONE 2.1s

Build progress after fix:

    #5 [2/4] RUN echo C:\hello\world\path
    #5 1.458 C:\hello\world\path
    #5 DONE 1.8s
    
    #6 [3/4] RUN echo C:\\hello\\escaped\\path
    #6 1.730 C:\\hello\\escaped\\path
    #6 DONE 2.1s
    
    #7 [4/4] RUN echo "C:\hello\quoted\path"
    #7 1.702 "C:\hello\quoted\path"
    #7 DONE 2.1s

You can also see that now paths in the step/vertex names now match those in the log lines right after the name.

Copy link
Member

@tonistiigi tonistiigi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure this is correct. Don't we need something windows specific what would detect that if path is windows style then single backslashes should mean \\ instead of being escape characters?

@profnandaa
Copy link
Collaborator Author

profnandaa commented Aug 23, 2024

Not sure this is correct. Don't we need something windows specific what would detect that if path is windows style then single backslashes should mean \\ instead of being escape characters?

Initially I'd thought of just windows-only variant but there such cases like cross-platform builds that could often have the \\, as much as we do normalize them later on, etc. See comment here.

This only has the effect on the vertex.Name after being stored in the llb.customname key #549.

Also you can see that the log lines currently preserve the backslashes correctly, but the stage name:

    #5 [2/4] RUN echo C:helloworldpath
    #5 1.359 C:\hello\world\path
    ~~~~~~~~~~~^
    #5 DONE 1.7s

@tonistiigi
Copy link
Member

tonistiigi commented Aug 23, 2024

So what about this:

» cat Dockerfile
from alpine
env foo=aa\bb
run env && stop
 => ERROR [2/2] RUN env && stop                                                            0.2s
------
 > [2/2] RUN env && stop:
0.114 SHLVL=1
0.114 foo=aabb
0.114 HOME=/root

Is it incorrect or correct. Looks like ProcessWord should assume that the value is inside the string in the case of printing command name?

@profnandaa
Copy link
Collaborator Author

profnandaa commented Aug 23, 2024 via email

@profnandaa
Copy link
Collaborator Author

Let me revisit this right after my vctn, 9/3.

@profnandaa profnandaa marked this pull request as draft August 24, 2024 20:00
@profnandaa
Copy link
Collaborator Author

profnandaa commented Sep 2, 2024

@tonistiigi -- I think the log lines are writing out the bytes as received from the RUN call. Linux already strips off the \ itself:

So what about this:

» cat Dockerfile
from alpine
env foo=aa\bb
run env && stop
 => ERROR [2/2] RUN env && stop                                                            0.2s
------
 > [2/2] RUN env && stop:
0.114 SHLVL=1
0.114 foo=aabb
0.114 HOME=/root

Is it incorrect or correct. Looks like ProcessWord should assume that the value is inside the string in the case of printing command name?

Equivalent running in an alpine container shell:

/ # export foo=aa\bb
/ # env
HOSTNAME=3143e71fab36
SHLVL=1
HOME=/root
foo=aabb <--- \ stripped off.
TERM=xterm
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/

This is what is coming back the other way round from the lower layers. My fix was fixing only the stage name display, where backslashes are stripped off before they go down from the dockerfile to LLB...

@profnandaa profnandaa marked this pull request as ready for review September 2, 2024 15:39
@profnandaa
Copy link
Collaborator Author

PS:

The log lines (just below the stage name), are raw bytes:

fmt.Fprintf(p.w, "#%d %s", v.index, []byte(l))

Then here is where the stage name is printed, but coming from llb.customname:

fmt.Fprintf(p.w, "#%d %s\n", v.index, v.Name)

func (v *vertex) Name() string {
if name, ok := v.options.Description["llb.customname"]; ok {
return name
}
return v.name
}

`shlex` in was treating single backslash "\\" in string as
escape sequence, hence stripping them off. Modify `processCmdEnv`
to double escape `\\`.

This was in turn affecting the vertex `Name`, and mostly pronounced
on Windows where backslashes are prevalent in path names. For example
`C:\hello\world\path` was being rendered as `C:helloworldpath`, hence
confusing users.

Fixes moby#5250

Build progress before fix:
```
	moby#5 [2/4] RUN echo C:helloworldpath
	moby#5 1.359 C:\hello\world\path
	moby#5 DONE 1.7s

	moby#6 [3/4] RUN echo C:\hello\escaped\path
	moby#6 1.734 C:\\hello\\escaped\\path
	moby#6 DONE 2.1s

	moby#7 [4/4] RUN echo "C:\hello\quoted\path"
	moby#7 1.765 "C:\hello\quoted\path"
	moby#7 DONE 2.1s
```

Build progress after fix:
```
	moby#5 [2/4] RUN echo C:\hello\world\path
	moby#5 1.458 C:\hello\world\path
	moby#5 DONE 1.8s

	moby#6 [3/4] RUN echo C:\\hello\\escaped\\path
	moby#6 1.730 C:\\hello\\escaped\\path
	moby#6 DONE 2.1s

	moby#7 [4/4] RUN echo "C:\hello\quoted\path"
	moby#7 1.702 "C:\hello\quoted\path"
	moby#7 DONE 2.1s
```

You can also see that now paths in the step/vertex names now
match those in the log lines right after the name.

Signed-off-by: Anthony Nandaa <[email protected]>
@profnandaa
Copy link
Collaborator Author

rebased with the latest master.

@tonistiigi
Copy link
Member

tonistiigi commented Sep 11, 2024

Testing current (not this PR) in linux:

from alpine
run echo "C:\hello\escaped\path"
run echo C:\hello\escaped\path
#5 [2/3] RUN echo "C:\hello\escaped\path"
#5 0.167 C:\hello\escaped\path
#5 DONE 0.2s

#6 [3/3] RUN echo C:helloescapedpath
#6 0.153 C:helloescapedpath
#6 DONE 0.2s

and btw:

#escape=`
from alpine
run echo "C:\hello\escaped\path"
run echo C:\hello\escaped\path
#5 [2/3] RUN echo "C:\hello\escaped\path"
#5 0.143 C:\hello\escaped\path
#5 DONE 0.2s

#6 [3/3] RUN echo C:\hello\escaped\path
#6 0.146 C:helloescapedpath
#6 DONE 0.2s

Is the issue that Lex is using \ as escape token in windows when it is not actually the escape token?

@profnandaa
Copy link
Collaborator Author

Is the issue that Lex is using \ as escape token in windows when it is not actually the escape token?

That's true, and not only on Windows alone, but Linux too...
Can see that \ is only printed when you put them within quotes.

@profnandaa
Copy link
Collaborator Author

@tonistiigi -- PTAL again.

@tonistiigi
Copy link
Member

@profnandaa Has this PR changed since my last comment #5269 (comment) ?

@profnandaa
Copy link
Collaborator Author

@profnandaa Has this PR changed since my last comment #5269 (comment) ?

Relooking into this again. From your test cases, this is what had caught my eyes:

#6 [3/3] RUN echo C:\hello\escaped\path
#6 0.146 C:helloescapedpath <----- this!
#6 DONE 0.2s

Let me see the best way we can address this.

@profnandaa profnandaa marked this pull request as draft November 18, 2024 04:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

progress: if not escaped, backslashes \ not printed on the progress stream dockerfile steps
3 participants