Skip to content

Commit

Permalink
Don't use matcher from a .gitignore against includePatterns
Browse files Browse the repository at this point in the history
When attempting to determine whether a .gitignore should be respected,
we previously tried to see, whether any of the patterns in the
.gitignore match against the includePatterns.
As @fsoikin noticed [here](https://github.com/purescript/spago/pull/1296/files/b84981dd086219bc1157a440667aca92ea23efb4#r1815996597)
this does not really make any sense in case the includePatterns really
are actual patterns and not just paths.
Therefore, instead of matching the .gitignore patterns against the
includePatterns directly, we now only match them against the base of
each includePattern. The base of a pattern is always a path.
However, some includePatterns may not have a base, for example when
searching for `**/foo/bar`. What should one do here?
I think disrespecting .gitignores is not really feasible in such a case,
because how can we tell if something that the .gitignore excludes
could match the baseless pattern `**/foo/bar` without actually walking
the excluded directories?

So for example, if there was a file: `./abc/foo` and a .gitignore with
the pattern `abc` and we glob for `**/foo`, then we do not find
`./abc/foo`, because the .gitignore was respected.

I think, this is the behaviour we would want anyway, we only ignore a
.gitignore, when it is obvious that the glob target is necessarily
within an ignored directory. Like globbing for
`.spago/p/arrays-7.3.0/src/**/*.purs` when `.spago` is gitignored.

This was already the behaviour previously. In the above example, we
would have previously matched `abc` against `**/foo`, which would be
false and therefore no conflict would have been dectected, so the
.gitignore would also have been respected.

This PR only makes this behaviour explicit and removes the, incidentally
working, match of a pattern against another pattern.
The new test also passes without this change, but i think it's a nice
kind of documentation.

I wonder, whether the globbing code should even attempt to disrespect
.gitignore files, doesn't the calling code always know, whether it
should limit its search to the non-ignored files?
For example, when globbing for the source files of a package in .spago,
we know beforehand, that a nonGitignoringGlob (otherwise known as
`glob` :^D) would suffice.
  • Loading branch information
Blugatroff committed Oct 27, 2024
1 parent b13857d commit 6f847be
Show file tree
Hide file tree
Showing 2 changed files with 19 additions and 4 deletions.
18 changes: 14 additions & 4 deletions src/Spago/Glob.purs
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,7 @@ import Control.Monad.Maybe.Trans (runMaybeT)
import Control.Monad.Trans.Class (lift)
import Data.Array as Array
import Data.Filterable (filter)
import Data.Foldable (any, traverse_)
import Data.Foldable (all, any, traverse_)
import Data.String as String
import Data.String as String.CodePoint
import Effect.Aff as Aff
Expand Down Expand Up @@ -101,6 +101,8 @@ fsWalk cwd ignorePatterns includePatterns = Aff.makeAff \cb -> do
canceled <- Ref.new false

let
allIncludePatternHaveBase = all (not <<< String.null) includePatternBases

-- Update the ignoreMatcherRef with the patterns from a .gitignore file
updateIgnoreMatcherWithGitignore :: Entry -> Effect Unit
updateIgnoreMatcherWithGitignore entry = do
Expand All @@ -119,9 +121,17 @@ fsWalk cwd ignorePatterns includePatterns = Aff.makeAff \cb -> do
-- ex. if `includePatterns` is [".spago/p/aff-1.0.0/**/*.purs"],
-- and `gitignored` is ["node_modules", ".spago"],
-- then add "node_modules" to `ignoreMatcher` but not ".spago"
wouldConflictWithSearch matcher = any matcher includePatterns

newMatchers = or $ filter (not <<< wouldConflictWithSearch) gitignored
wouldConflictWithSearch matcher = any matcher includePatternBases

newMatchers = or case allIncludePatternHaveBase of
true -> filter (not <<< wouldConflictWithSearch) gitignored
false -> do
-- Some of the include patterns don't have a base,
-- e.g. there is an include pattern like "*/foo/bar" or "**/.spago".
-- In this case, do not attempt to determine whether the gitignore
-- file would exclude some of the target paths. Instead always respect
-- the .gitignore.
gitignored

-- Another possible approach could be to keep a growing array of patterns and
-- regenerate the matcher on every gitignore. We have tried that (see #1234),
Expand Down
5 changes: 5 additions & 0 deletions test/Spago/Glob.purs
Original file line number Diff line number Diff line change
Expand Up @@ -103,3 +103,8 @@ spec = Spec.around globTmpDir do
FS.writeTextFile (Path.concat [ p, "fruits", "right", ".gitignore" ]) hugeGitignore
a <- Glob.gitignoringGlob p [ "fruits/**/apple" ]
Array.sort a `Assert.shouldEqual` [ "fruits/left/apple", "fruits/right/apple" ]

Spec.it "does respect .gitignore even though it might conflict with a search path without base" $ \p -> do
FS.writeTextFile (Path.concat [ p, ".gitignore" ]) "fruits"
a <- Glob.gitignoringGlob p [ "**/apple" ]
Array.sort a `Assert.shouldEqual` []

0 comments on commit 6f847be

Please sign in to comment.