Path refactoring #2918

Merged
merged 7 commits into from
Dec 30, 2024
92 changes: 92 additions & 0 deletions core/src/Streamly/FileSystem/Path.hs
@@ -5,6 +5,98 @@
-- Maintainer : [email protected]
-- Portability : GHC
--
-- Well typed, flexible, extensible and efficient file system paths,
-- preserving the OS and filesystem encoding.
--
-- /Flexible/: you can choose the level of type safety you want. 'Path' is the
-- basic path type which can represent a file, directory, absolute or relative
-- path with no restrictions. Depending on how much type safety you want, you
-- can choose appropriate type wrappers or a combination of those to wrap the
-- 'Path' type.
--
-- = Rooted Paths vs Path Segments
--
-- For the safety of the path append operation we make the distinction of
-- rooted paths vs path segments. A path that starts from some implicit or
-- explicit root in the file system is a rooted path, for example, @\/usr\/bin@
-- is a rooted path starting from an explicit file system root directory @/@.
-- Similarly, @.\/bin@ is a path with an implicit root: it hangs from the
-- current directory. A path that is not rooted is called a path segment,
-- e.g. @local\/bin@ is a segment.
--
-- This distinction affords safety to the path append operation. We can always
-- append a segment to a rooted path or to another segment. However, it does
-- not make sense to append a rooted path to another rooted path. The default
-- append operation in the Path module checks for this and fails if the
-- operation is incorrect. However, the programmer can force it by using the
-- unsafe version of append operation. You can also drop the root explicitly
-- and use the safe append operation.
--
-- The "Streamly.FileSystem.Path.LocSeg" module provides explicit typing of
-- rooted paths vs path segments. Rooted paths are represented by the @Loc
-- Path@ type and path segments are represented by the @Seg Path@ type. If you
-- use the 'Path' type then append can fail if you try to append a rooted
-- location to another path, but if you use @Loc Path@ or @Seg Path@ types then
-- append can never fail at run time as the types would not allow it at compile
-- time.
--
-- = Absolute vs Relative Rooted Paths
--
-- Rooted paths can be absolute or relative. Absolute paths have an absolute
-- root e.g. @\/usr\/bin@. Relative paths have a dynamic or relative root e.g.
-- @.\/local\/bin@ or @.@; in these cases the root is the current directory,
-- which is not absolute and can change dynamically. Note that there is no
-- type level distinction between absolute and relative paths.
--
-- = File vs Directory Paths
--
-- Independently of the rooted or segment distinction you can also make the
-- distinction between files and directories using the
-- "Streamly.FileSystem.Path.FileDir" module. @File Path@ type represents a
-- file whereas @Dir Path@ represents a directory. This provides safety
-- against appending a path to a file: the append operation does not allow
-- appending to 'File' types.
--
-- By default a path with a trailing separator is implicitly considered a
-- directory path. However, the absence of a trailing separator does not convey
-- any information; the path could be either a directory or a file. Thus the
-- append operation allows appending even to paths that do not have a trailing
-- separator. However, when creating a typed path of 'File' type the conversion
-- fails unless we explicitly drop the trailing separator.
--
-- = Flexible Typing
--
-- You can use the 'Loc', 'Seg' or 'Dir', 'File' types independently of each
-- other by using only the required module. If you want both types of
-- distinctions then you can use them together as well using the
-- "Streamly.FileSystem.Path.Typed" module. For example, the @Loc (Dir Path)@
-- represents a rooted path which is a directory. You can only append to a path
-- that has 'Dir' in it and you can only append a 'Seg' type.
--
-- You can choose to use just the basic 'Path' type or any combination of safer
-- types. You can upgrade or downgrade the safety using the @adapt@ operation.
-- Whenever a less restrictive path type is converted to a more restrictive
-- path type, the conversion involves run-time checks and it may fail. However,
-- a more restrictive path type can be freely converted to a less restrictive
-- one.
--
-- = Extensibility
--
-- You can define your own newtype wrappers similar to 'File' or 'Dir' to
-- provide custom restrictions if you want.
--
-- = Compatibility
--
-- Any path type can be converted to the 'FilePath' type using the 'toString'
-- operation. Operations to convert to and from 'OsPath' type at zero cost are
-- provided in the @streamly-filepath@ package. This is possible because the
-- types use the same underlying representation as the 'OsPath' type.
--
-- = String Creation Quasiquoter
--
-- You may find the 'str' quasiquoter from "Streamly.Unicode.String" to be
-- useful in creating paths.
--
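The append rule described in the "Rooted Paths vs Path Segments" section can be sketched on plain POSIX-style strings. This is only a toy model: the names `isRooted`, `append`, `unsafeAppend` and `dropRoot` are hypothetical stand-ins chosen for illustration, not the functions exported by this module, and the real types preserve the OS encoding rather than using `String`.

```haskell
import Data.List (isPrefixOf)

-- | Toy model (assumption): a path is rooted if it starts at the file system
-- root ("/...") or at the current directory ("." or "./...").
isRooted :: String -> Bool
isRooted p = "/" `isPrefixOf` p || p == "." || "./" `isPrefixOf` p

-- | Unchecked append: joins the two paths with a single separator.
unsafeAppend :: String -> String -> String
unsafeAppend a b
    | null a = b
    | last a == '/' = a ++ b
    | otherwise = a ++ "/" ++ b

-- | Checked append: refuses to append a rooted path to another path.
append :: String -> String -> Either String String
append a b
    | isRooted b = Left ("cannot append a rooted path: " ++ b)
    | otherwise = Right (unsafeAppend a b)

-- | Drop the root so that the remainder can be appended safely.
dropRoot :: String -> String
dropRoot p
    | p == "." = ""
    | "./" `isPrefixOf` p = dropWhile (== '/') (drop 1 p)
    | otherwise = dropWhile (== '/') p

main :: IO ()
main = do
    print (append "/usr" "local/bin")       -- segment onto rooted path
    print (append "local" "bin")            -- segment onto segment
    print (append "/usr" "/bin")            -- rooted onto rooted: rejected
    print (append "/usr" (dropRoot "/bin")) -- root dropped explicitly
```

In this model the third call returns a `Left`, while the other three succeed, which mirrors the "fails unless forced or the root is dropped" behaviour the documentation describes.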

module Streamly.FileSystem.Path
(
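The "Flexible Typing" section says that converting to a more restrictive type is checked at run time, while the typed append itself cannot fail. A minimal sketch of that idea follows, assuming simplified, non-parameterised wrappers. `Loc`, `Seg` and `File` here are hypothetical local newtypes over a toy `Path`, and `toLoc`, `toSeg`, `toFile`, `fromLoc` and `appendLoc` are illustrative names only; the library's own types are parameterised (for example `Loc (Dir Path)`) and use an `adapt` operation whose signature is not shown in this diff.

```haskell
import Data.List (isPrefixOf)

-- Hypothetical stand-ins for the library types, modelled on String.
newtype Path = Path String deriving (Eq, Show)
newtype Loc = Loc Path deriving (Eq, Show)   -- rooted path
newtype Seg = Seg Path deriving (Eq, Show)   -- path segment
newtype File = File Path deriving (Eq, Show) -- path known to be a file

isRooted :: String -> Bool
isRooted p = "/" `isPrefixOf` p || p == "." || "./" `isPrefixOf` p

hasTrailingSep :: String -> Bool
hasTrailingSep s = not (null s) && last s == '/'

-- Upgrades to a more restrictive type are checked at run time.
toLoc :: Path -> Maybe Loc
toLoc p@(Path s)
    | isRooted s = Just (Loc p)
    | otherwise = Nothing

toSeg :: Path -> Maybe Seg
toSeg p@(Path s)
    | isRooted s = Nothing
    | otherwise = Just (Seg p)

-- A trailing separator marks a directory, so it cannot become a File.
toFile :: Path -> Maybe File
toFile p@(Path s)
    | hasTrailingSep s = Nothing
    | otherwise = Just (File p)

-- Downgrades to the less restrictive type are free.
fromLoc :: Loc -> Path
fromLoc (Loc p) = p

joinSep :: String -> String -> String
joinSep a b
    | null a = b
    | hasTrailingSep a = a ++ b
    | otherwise = a ++ "/" ++ b

-- Typed append is total: only a Seg can be appended to a Loc.
appendLoc :: Loc -> Seg -> Loc
appendLoc (Loc (Path a)) (Seg (Path b)) = Loc (Path (joinSep a b))

main :: IO ()
main = do
    print (appendLoc <$> toLoc (Path "/usr") <*> toSeg (Path "bin"))
    print (toSeg (Path "/bin"))    -- rooted path is not a segment
    print (toFile (Path "notes/")) -- trailing separator: not a file
    print (toFile (Path "notes"))
```

The design point is that all failure is pushed into the `Maybe`-returning constructors, so code that already holds a `Loc` and a `Seg` never needs to handle an append error.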
13 changes: 8 additions & 5 deletions core/src/Streamly/Internal/Data/Array.hs
@@ -57,7 +57,7 @@ module Streamly.Internal.Data.Array
-- , getSlice
, sliceIndexerFromLen
, slicerFromLen
, splitOn -- XXX slicesEndBy
, sliceEndBy_

-- * Streaming Operations
, streamTransform
@@ -101,6 +101,7 @@ module Streamly.Internal.Data.Array
, pinnedCompactLE
, compactOnByte
, compactOnByteSuffix
, splitOn
)
where

@@ -310,12 +311,14 @@ getSliceUnsafe index len (Array contents start e) =
-- matching the predicate is dropped.
--
-- /Pre-release/
{-# INLINE splitOn #-}
splitOn :: (Monad m, Unbox a) =>
{-# INLINE sliceEndBy_ #-}
sliceEndBy_, splitOn :: (Monad m, Unbox a) =>
(a -> Bool) -> Array a -> Stream m (Array a)
splitOn predicate arr =
sliceEndBy_ predicate arr =
fmap (\(i, len) -> getSliceUnsafe i len arr)
$ D.indexOnSuffix predicate (read arr)
$ D.indexEndBy_ predicate (read arr)

RENAME(splitOn,sliceEndBy_)

{-# INLINE sliceIndexerFromLen #-}
sliceIndexerFromLen :: forall m a. (Monad m, Unbox a)
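The renamed `sliceEndBy_` is defined by mapping the (index, length) pairs produced by `D.indexEndBy_` through `getSliceUnsafe`. The same composition can be modelled on ordinary lists, as a rough sketch. `indexEndByList`, `sliceList` and `sliceEndByList` are hypothetical names for this illustration and are not part of streamly; the handling of input with no final separator (the remainder becomes the last slice) is an assumption of the model.

```haskell
-- | List model of the (index, length) stream: one pair per slice, where each
-- slice ends at an element matching the predicate and that element is
-- excluded from the length.
indexEndByList :: (a -> Bool) -> [a] -> [(Int, Int)]
indexEndByList p = go 0
  where
    go _ [] = []
    go i xs =
        case break p xs of
            -- No separator left: the remainder is the final slice.
            (slice, []) -> [(i, length slice)]
            -- Record the slice, skip the separator, continue after it.
            (slice, _ : rest) ->
                let len = length slice
                 in (i, len) : go (i + len + 1) rest

-- | List stand-in for a slice operation: @len@ elements starting at @i@.
sliceList :: Int -> Int -> [a] -> [a]
sliceList i len = take len . drop i

-- | Slices are derived from the indices, as in the array version.
sliceEndByList :: (a -> Bool) -> [a] -> [[a]]
sliceEndByList p xs =
    fmap (\(i, len) -> sliceList i len xs) (indexEndByList p xs)

main :: IO ()
main = do
    print (indexEndByList (== '/') "/home/harendra")
    print (sliceEndByList (== '/') "/home/harendra")
```

For the input used in the `indexEndBy_` doctest in this PR, the model yields the same pairs, `[(0,0),(1,4),(6,8)]`, and the corresponding slices `["","home","harendra"]`.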
15 changes: 10 additions & 5 deletions core/src/Streamly/Internal/Data/MutArray/Type.hs
@@ -257,7 +257,7 @@ module Streamly.Internal.Data.MutArray.Type
-- | Split an array into slices.

-- , getSlicesFromLenN
, splitOn -- slicesEndBy
, sliceEndBy_
-- , slicesOf

-- *** Concat
@@ -359,6 +359,7 @@ module Streamly.Internal.Data.MutArray.Type
, pinnedFromList
, pinnedClone
, unsafePinnedCreateOf
, splitOn
)
where

@@ -2891,12 +2892,16 @@ spliceExp = spliceWith (\l1 l2 -> max (l1 * 2) (l1 + l2))
-- matching the predicate is dropped.
--
-- /Pre-release/
{-# INLINE splitOn #-}
splitOn :: (MonadIO m, Unbox a) =>
{-# INLINE sliceEndBy_ #-}
sliceEndBy_, splitOn :: (MonadIO m, Unbox a) =>
(a -> Bool) -> MutArray a -> Stream m (MutArray a)
splitOn predicate arr =
sliceEndBy_ predicate arr =
fmap (\(i, len) -> unsafeGetSlice i len arr)
$ D.indexOnSuffix predicate (read arr)
$ D.indexEndBy_ predicate (read arr)

RENAME(splitOn,sliceEndBy_)

-- XXX breakEndBy_?

-- | Drops the separator byte
{-# INLINE breakOn #-}
29 changes: 16 additions & 13 deletions core/src/Streamly/Internal/Data/Stream/Type.hs
@@ -143,7 +143,7 @@ module Streamly.Internal.Data.Stream.Type
, foldIterateBfs

-- * Splitting
, indexOnSuffix
, indexEndBy_

-- * Multi-stream folds
-- | These should probably be expressed using zipping operations.
@@ -153,6 +153,7 @@ module Streamly.Internal.Data.Stream.Type
-- * Deprecated
, sliceOnSuffix
, unfoldMany
, indexOnSuffix
)
where

@@ -2114,25 +2115,27 @@ indexerBy (Fold step1 initial1 extract1 _final) n =

extract (Tuple' i s) = (i,) <$> extract1 s

-- XXX rename to indicesEndBy

-- | Like 'splitEndBy' but generates a stream of (index, len) tuples marking
-- | Like 'splitEndBy_' but generates a stream of (index, len) tuples marking
-- the places where the predicate matches in the stream.
--
-- >>> Stream.toList $ Stream.indexEndBy_ (== '/') $ Stream.fromList "/home/harendra"
Review comment (Member):

/home/harendra
Now the world knows where your home directory is located.

-- [(0,0),(1,4),(6,8)]
--
-- /Pre-release/
{-# INLINE indexOnSuffix #-}
indexOnSuffix :: Monad m =>
{-# INLINE indexEndBy_ #-}
indexEndBy_, indexOnSuffix :: Monad m =>
(a -> Bool) -> Stream m a -> Stream m (Int, Int)
indexOnSuffix predicate =
-- Scan the stream with the given refold
indexEndBy_ predicate =
refoldIterateM
(indexerBy (FL.takeEndBy_ predicate FL.length) 1)
(return (-1, 0))

RENAME(indexOnSuffix,indexEndBy_)

-- Alternate implementation
{-# INLINE_NORMAL _indexOnSuffix #-}
_indexOnSuffix :: Monad m => (a -> Bool) -> Stream m a -> Stream m (Int, Int)
_indexOnSuffix p (Stream step1 state1) = Stream step (Just (state1, 0, 0))
{-# INLINE_NORMAL _indexEndBy_ #-}
_indexEndBy_ :: Monad m => (a -> Bool) -> Stream m a -> Stream m (Int, Int)
_indexEndBy_ p (Stream step1 state1) = Stream step (Just (state1, 0, 0))

where

@@ -2149,9 +2152,9 @@ _indexOnSuffix p (Stream step1 state1) = Stream step (Just (state1, 0, 0))
Stop -> if len == 0 then Stop else Yield (i, len) Nothing
step _ Nothing = return Stop

{-# DEPRECATED sliceOnSuffix "Please use indexOnSuffix instead." #-}
{-# DEPRECATED sliceOnSuffix "Please use indexEndBy_ instead." #-}
sliceOnSuffix :: Monad m => (a -> Bool) -> Stream m a -> Stream m (Int, Int)
sliceOnSuffix = indexOnSuffix
sliceOnSuffix = indexEndBy_

------------------------------------------------------------------------------
-- Stream with a cross product style monad instance
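The renames in this PR keep the old names available as deprecated aliases, either written out explicitly (as for `sliceOnSuffix`) or via the `RENAME(old,new)` macro. The macro's definition is not shown in this diff; the sketch below assumes it expands to roughly the same pattern as the explicit `sliceOnSuffix` case. `countEndBy` and `countOnSuffix` are hypothetical names used only to keep the example self-contained.

```haskell
-- | The new, preferred name (hypothetical example function).
countEndBy :: (a -> Bool) -> [a] -> Int
countEndBy p = length . filter p

-- The old name stays exported but warns at use sites in other modules.
{-# DEPRECATED countOnSuffix "Please use countEndBy instead." #-}
countOnSuffix :: (a -> Bool) -> [a] -> Int
countOnSuffix = countEndBy

main :: IO ()
main = do
    -- Both names refer to the same definition.
    print (countEndBy (== '/') "/home/user")
    print (countOnSuffix (== '/') "/home/user")
```

Defining the alias as a plain binding to the new name keeps the two from drifting apart, and the `DEPRECATED` pragma gives callers a compile-time pointer to the replacement.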
74 changes: 8 additions & 66 deletions core/src/Streamly/Internal/FileSystem/Path.hs
@@ -5,78 +5,20 @@
-- Maintainer : [email protected]
-- Portability : GHC
--
-- = User Notes
--
-- Well typed, flexible, extensible and efficient file systems paths,
-- preserving the OS and filesystem encoding.
--
-- /Flexible/: you can choose the level of type safety you want. 'Path' is the
-- basic path type which can represent a file, directory, absolute or relative
-- path with no restrictions. Depending on how much type safety you want, you
-- can choose appropriate type wrappers or a combination of those to wrap the
-- 'Path' type.
--
-- The basic type-safety is provided by the
-- "Streamly.Internal.FileSystem.PosixPath.LocSeg" module. We make a distinction
-- between two types of paths viz. locations and segments. Locations are
-- represented by the @Loc Path@ type and path segments are represented by the
-- @Seg Path@ type. Locations are paths pointing to specific objects in the
-- file system absolute or relative e.g. @\/usr\/bin@, @.\/local\/bin@, or @.@.
-- Segments are a sequence of path components without any reference to a
-- location e.g. @usr\/bin@, @local\/bin@, or @../bin@ are segments. This
-- distinction is for safe append operation on paths, you can only append
-- segments to any path and not a location. If you use the 'Path' type then
-- append can fail if you try to append a location to a path, but if you use
-- @Loc Path@ or @Seg Path@ types then append can never fail.
--
-- Independently of the location or segment distinction you can also make the
-- distinction between files and directories using the
-- "Streamly.Internal.FileSystem.PosixPath.FileDir" module. @File Path@ type
-- represents a file whereas @Dir Path@ represents a directory. It provides
-- safety against appending a path to a file. Append operation allows appending
-- to only 'Dir' types.
--
-- You can use the 'Loc', 'Seg' or 'Dir', 'File' types independent of each
-- other by using only the required module. If you want both types of
-- distinctions then you can use them together as well using the
-- "Streamly.Internal.FileSystem.PosixPath.Typed" module. For example, the
-- @Loc (Dir Path)@ represents a location which is a directory. You can only
-- append to a path that has 'Dir' in it and you can only append a 'Seg' type.
--
-- You can choose to use just the basic 'Path' type or any combination of safer
-- types. You can upgrade or downgrade the safety using the @adapt@ operation.
-- Whenever a less restrictive path type is converted to a more restrictive
-- path type, the conversion involves run-time checks and it may fail. However,
-- a more restrictive path type can be freely converted to a less restrictive
-- one.
--
-- Extensible, you can define your own newtype wrappers similar to 'File' or
-- 'Dir' to provide custom restrictions if you want.
--
-- Any path type can be converted to the 'FilePath' type using the 'toString'
-- operation. Operations to convert to and from 'OsPath' type at zero cost are
-- provided in the @streamly-filepath@ package. The types use the same
-- underlying representation as the 'OsPath' type.
--
-- = Developer Notes:
-- == References
--
-- * https://en.wikipedia.org/wiki/Path_(computing)
-- * https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file
-- * https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/62e862f4-2a51-452e-8eeb-dc4ff5ee33cc
--
-- == Windows and Posix Paths
--
-- We should be able to manipulate windows paths on posix and posix paths on
-- windows as well. Therefore, we have WindowsPath and PosixPath types which
-- are supported on both platforms. However, the Path module aliases Path to
-- WindowsPath on Windows and PosixPath on Posix.
--
-- Conventions: A trailing separator on a path indicates that it is a
-- directory. However, the absence of a trailing separator does not convey any
-- information, it could either be a directory or a file.
--
-- You may also find the 'str' quasiquoter from "Streamly.Unicode.String" to be
-- useful in creating paths.
--
-- * https://en.wikipedia.org/wiki/Path_(computing)
-- * https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file
-- * https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/62e862f4-2a51-452e-8eeb-dc4ff5ee33cc
--
-- == File System Tree
-- == File System as Tree vs Graph
--
-- A file system is a tree when there are no hard links or symbolic links. But
-- in the presence of symlinks it could be a DAG or a graph, because directory