In this chapter, we are going to bring our treatment of the higher-kinded interfaces in the Prelude to an end. In order to do so, we will continue developing the CSV reader we started implementing in chapter Functor and Friends. I moved some of the data types and interfaces from that chapter to their own modules, so we can import them here without the need to start from scratch.
Note that unlike in our original CSV reader, we will use
Validated
instead of Either
for handling exceptions,
since this will allow us to accumulate all errors
when reading a CSV file.
module Tutorial.Traverse
import Data.HList
import Data.IORef
import Data.List1
import Data.String
import Data.Validated
import Data.Vect
import Text.CSV
%default total
We stopped developing our CSV reader with function
hdecode
, which allows us to read a single line
in a CSV file and decode it to a heterogeneous list.
As a reminder, here is how to use hdecode
at the REPL:
Tutorial.Traverse> hdecode [Bool,String,Bits8] 1 "f,foo,12"
Valid [False, "foo", 12]
The next step will be to parse a whole CSV table, represented as a list of strings, where each string corresponds to one of the table's rows. We will go about this stepwise as there are several aspects about doing this properly. What we are looking for - eventually - is a function of the following type (we are going to implement several versions of this function, hence the numbering):
hreadTable1 : (0 ts : List Type)
-> CSVLine (HList ts)
=> List String
-> Validated CSVError (List $ HList ts)
In our first implementation, we are not going to care about line numbers:
hreadTable1 _ [] = pure []
hreadTable1 ts (s :: ss) = [| hdecode ts 0 s :: hreadTable1 ts ss |]
Note, how we can just use applicative syntax in the implementation
of hreadTable1
. To make this clearer, I used pure []
on the first
line instead of the more specific Valid []
. In fact, if we used
Either
or Maybe
instead of Validated
for error handling,
the implementation of hreadTable1
would look exactly the same.
The question is: Can we extract a pattern to abstract over
from this observation? What we do in hreadTable1
is running
an effectful computation of type String -> Validated CSVError (HList ts)
over a list of strings, so that the result is a list of HList ts
wrapped in a Validated CSVError
. The first step of abstraction
should be to use type parameters for the input and output:
Run a computation of type a -> Validated CSVError b
over a
list List a
:
traverseValidatedList : (a -> Validated CSVError b)
-> List a
-> Validated CSVError (List b)
traverseValidatedList _ [] = pure []
traverseValidatedList f (x :: xs) = [| f x :: traverseValidatedList f xs |]
hreadTable2 : (0 ts : List Type)
-> CSVLine (HList ts)
=> List String
-> Validated CSVError (List $ HList ts)
hreadTable2 ts = traverseValidatedList (hdecode ts 0)
But our observation was, that the implementation of hreadTable1
would be exactly the same if we used Either CSVError
or Maybe
as our effect types instead of Validated CSVError
.
So, the next step should be to abstract over the effect type.
We note, that we used applicative syntax (idiom brackets and
pure
) in our implementation, so we will need to write
a function with an Applicative
constraint
on the effect type:
traverseList : Applicative f => (a -> f b) -> List a -> f (List b)
traverseList _ [] = pure []
traverseList f (x :: xs) = [| f x :: traverseList f xs |]
hreadTable3 : (0 ts : List Type)
-> CSVLine (HList ts)
=> List String
-> Validated CSVError (List $ HList ts)
hreadTable3 ts = traverseList (hdecode ts 0)
Note, how the implementation of traverseList
is exactly the same
as the one of traverseValidatedList
, but the types are more general
and therefore, traverseList
is much more powerful.
Let's give this a go at the REPL:
Tutorial.Traverse> hreadTable3 [Bool,Bits8] ["f,12","t,0"]
Valid [[False, 12], [True, 0]]
Tutorial.Traverse> hreadTable3 [Bool,Bits8] ["f,12","t,1000"]
Invalid (FieldError 0 2 "1000")
Tutorial.Traverse> hreadTable3 [Bool,Bits8] ["1,12","t,1000"]
Invalid (Append (FieldError 0 1 "1") (FieldError 0 2 "1000"))
This works very well already, but note how our error messages do
not yet print the correct line numbers. That's not surprising,
as we are using a dummy constant in our call to hdecode
.
We will look at how we can come up with the line numbers on the
fly when we talk about stateful computations later in this chapter.
For now, we could just manually annotate the lines with their
numbers and pass a list of pairs to hreadTable
:
hreadTable4 : (0 ts : List Type)
-> CSVLine (HList ts)
=> List (Nat, String)
-> Validated CSVError (List $ HList ts)
hreadTable4 ts = traverseList (uncurry $ hdecode ts)
If this is the first time you came across function uncurry
,
make sure you have a look at its type and try to figure out why it is
used here. There are several utility functions like this
in the Prelude, such as curry
, uncurry
, flip
, or even
id
, all of which can be very useful when working with higher-order
functions.
While not perfect, this version at least allows us to verify at the REPL that the line numbers are passed to the error messages correctly:
Tutorial.Traverse> hreadTable4 [Bool,Bits8] [(1,"t,1000"),(2,"1,100")]
Invalid (Append (FieldError 1 2 "1000") (FieldError 2 1 "1"))
Now, here is an interesting observation: We can implement a function
like traverseList
for other container types as well. You might think that's
obvious, given that we can convert container types to lists via
function toList
from interface Foldable
. However, while going
via List
might be feasible in some occasions, it is undesirable in
general, as we loose typing information. For instance, here
is such a function for Vect
:
traverseVect' : Applicative f => (a -> f b) -> Vect n a -> f (List b)
traverseVect' fun = traverseList fun . toList
Note how we lost all information about the structure of the
original container type. What we are looking for is a function
like traverseVect'
, which keeps this type level information:
The result should be a vector of the same length as the input.
traverseVect : Applicative f => (a -> f b) -> Vect n a -> f (Vect n b)
traverseVect _ [] = pure []
traverseVect fun (x :: xs) = [| fun x :: traverseVect fun xs |]
That's much better! And as I wrote above, we can easily get the same
for other container types like List1
, SnocList
, Maybe
, and so on.
As usual, some derived functions will follow immediately from traverseXY
.
For instance:
sequenceList : Applicative f => List (f a) -> f (List a)
sequenceList = traverseList id
All of this calls for a new interface, which is called
Traversable
and is exported from the Prelude. Here is
its definition (with primes for disambiguation):
interface Functor t => Foldable t => Traversable' t where
traverse' : Applicative f => (a -> f b) -> t a -> f (t b)
Function traverse
is one of the most abstract and versatile
functions available from the Prelude. Just how powerful
it is will only become clear once you start using it
over and over again in your code. However, it will be the
goal of the remainder of this chapter to show you several
diverse and interesting use cases.
For now, we will quickly focus on the degree of abstraction.
Function traverse
is parameterized over no less than
four parameters: The container type t
(List
, Vect n
,
Maybe
, to just name a few), the effect type (Validated e
,
IO
, Maybe
, and so on), the input element type a
, and
the output element type b
. Considering that the libraries
bundled with the Idris project export more than 30 data types
with an implementation of Applicative
and more than ten
traversable container types, there are literally hundreds
of combinations for traversing a container with an effectful
computation. This number gets even larger once we realize
that traversable containers - like applicative functors -
are closed under composition (see the exercises and
the final section in this chapter).
There are two laws function traverse
must obey:
traverse (Id . f) = Id . map f
: Traversing over theIdentity
monad is just functormap
.traverse (MkComp . map f . g) = MkComp . map (traverse f) . traverse g
: Traversing with a composition of effects must be the same when being done in a single traversal (left hand side) or a sequence of two traversals (right hand side).
Since map id = id
(functor's identity law), we can derive
from the first law that traverse Id = Id
. This means, that
traverse
must not change the size or shape of the container
type, nor is it allowed to change the order of elements.
-
It is interesting that
Traversable
has aFunctor
constraint. Proof that everyTraversable
is automatically aFunctor
by implementingmap
in terms oftraverse
.Hint: Remember
Control.Monad.Identity
. -
Likewise, proof that every
Traversable
is aFoldable
by implementingfoldMap
in terms ofTraverse
.Hint: Remember
Control.Applicative.Const
. -
To gain some routine, implement
Traversable'
forList1
,Either e
, andMaybe
. -
Implement
Traversable
forList01 ne
:data List01 : (nonEmpty : Bool) -> Type -> Type where Nil : List01 False a (::) : a -> List01 False a -> List01 ne a
-
Implement
Traversable
for rose trees. Try to satisfy the totality checker without cheating.record Tree a where constructor Node value : a forest : List (Tree a)
-
Implement
Traversable
forCrud i
:data Crud : (i : Type) -> (a : Type) -> Type where Create : (value : a) -> Crud i a Update : (id : i) -> (value : a) -> Crud i a Read : (id : i) -> Crud i a Delete : (id : i) -> Crud i a
-
Implement
Traversable
forResponse e i
:data Response : (e, i, a : Type) -> Type where Created : (id : i) -> (value : a) -> Response e i a Updated : (id : i) -> (value : a) -> Response e i a Found : (values : List a) -> Response e i a Deleted : (id : i) -> Response e i a Error : (err : e) -> Response e i a
-
Like
Functor
,Applicative
andFoldable
,Traversable
is closed under composition. Proof this by implementingTraversable
forComp
andProduct
:record Comp (f,g : Type -> Type) (a : Type) where constructor MkComp unComp : f (g a) record Product (f,g : Type -> Type) (a : Type) where constructor MkProduct fst : f a snd : g a
Let's go back to our CSV reader. In order to get reasonable error messages, we'd like to tag each line with its index:
zipWithIndex : List a -> List (Nat, a)
It is, of course, very easy to come up with an ad hoc implementation for this:
zipWithIndex = go 1
where go : Nat -> List a -> List (Nat,a)
go _ [] = []
go n (x :: xs) = (n,x) :: go (S n) xs
While this is perfectly fine, we should still note that we might want to do the same thing with the elements of trees, vectors, non-empty lists and so on. And again, we are interested in whether there is some form of abstraction we can use to describe such computations.
Let us for a moment think about how we'd do such a thing
in an imperative language. There, we'd probably define
a local (mutable) variable to keep track of the current
index, which would then be increased while iterating over the list
in a for
- or while
-loop.
In Idris, there is no such thing as mutable state.
Or is there? Remember, how we used a mutable reference
to simulate a data base connection in an earlier
exercise. There, we actually used some truly mutable
state. However, since accessing or modifying a mutable
variable is not a referential transparent operation,
such actions have to be performed within IO
.
Other than that, nothing keeps us from using mutable
variables in our code. The necessary functionality is
available from module Data.IORef
from the base library.
As a quick exercise, try to implement a function, which -
given an IORef Nat
- pairs a value with the current
index and increases the index afterwards.
Here's how I would do this:
pairWithIndexIO : IORef Nat -> a -> IO (Nat,a)
pairWithIndexIO ref va = do
ix <- readIORef ref
writeIORef ref (S ix)
pure (ix,va)
Note, that every time we run pairWithIndexIO ref
, the
natural number stored in ref
is incremented by one.
Also, look at the type of pairWithIndexIO ref
: a -> IO (Nat,a)
.
We want to apply this effectful computation to each element
in a list, which should lead to a new list wrapped in IO
,
since all of this describes a single computation with side
effects. But this is exactly what function traverse
does: Our
input type is a
, our output type is (Nat,a)
, our
container type is List
, and the effect type is IO
!
zipListWithIndexIO : IORef Nat -> List a -> IO (List (Nat,a))
zipListWithIndexIO ref = traverse (pairWithIndexIO ref)
Now this is really powerful: We could apply the same function
to any traversable data structure. It therefore makes
absolutely no sense to specialize zipListWithIndexIO
to
lists only:
zipWithIndexIO : Traversable t => IORef Nat -> t a -> IO (t (Nat,a))
zipWithIndexIO ref = traverse (pairWithIndexIO ref)
To please our intellectual minds even more, here is the same function in point-free style:
zipWithIndexIO' : Traversable t => IORef Nat -> t a -> IO (t (Nat,a))
zipWithIndexIO' = traverse . pairWithIndexIO
All that's left to do now is to initialize a new mutable variable
before passing it to zipWithIndexIO
:
zipFromZeroIO : Traversable t => t a -> IO (t (Nat,a))
zipFromZeroIO ta = newIORef 0 >>= (`zipWithIndexIO` ta)
Quickly, let's give this a go at the REPL:
> :exec zipFromZeroIO {t = List} ["hello", "world"] >>= printLn
[(0, "hello"), (1, "world")]
> :exec zipFromZeroIO (Just 12) >>= printLn
Just (0, 12)
> :exec zipFromZeroIO {t = Vect 2} ["hello", "world"] >>= printLn
[(0, "hello"), (1, "world")]
Thus, we solved the problem of tagging each element with its index once and for all for all traversable container types.
Alas, while the solution presented above is elegant and
performs very well, it still carries its IO
stain, which
is fine if we are already in IO
land, but unacceptable
otherwise. We do not want to make our otherwise pure functions
much harder to test and reason about just for a simple
case of stateful element tagging.
Luckily, there is an alternative to using a mutable reference, which allows us to keep our computations pure and untainted. However, it is not easy to come upon this alternative on one's own, and it can be hard to figure out what's going on here, so I'll try to introduce this slowly. We first need to ask ourselves what the essence of a "stateful" but otherwise pure computation is. There are two essential ingredients:
- Access to the current state. In case of a pure function, this means that the function should take the current state as one of its arguments.
- Ability to communicate the updated state to later stateful computations. In case of a pure function this means, that the function will return a pair of values: The computation's result plus the updated state.
These two prerequisites lead to the following generic
type for a pure, stateful computation operating on state
type st
and producing values of type a
:
Stateful : (st : Type) -> (a : Type) -> Type
Stateful st a = st -> (st, a)
Our use case is pairing elements with indices, which can be implemented as a pure, stateful computation like so:
pairWithIndex' : a -> Stateful Nat (Nat,a)
pairWithIndex' v index = (S index, (index,v))
Note, how we at the same time increment the index, returning the incremented value as the new state, while pairing the first argument with the original index.
Now, here is an important thing to note: While Stateful
is
a useful type alias, Idris in general does not resolve
interface implementations for function types. If we want to
write a small library of utility functions around such a type,
it is therefore best to wrap it in a single-constructor data type and
use this as our building block for writing more complex
computations. We therefore introduce record State
as
a wrapper for pure, stateful computations:
record State st a where
constructor ST
runST : st -> (st,a)
We can now implement pairWithIndex
in terms of State
like so:
pairWithIndex : a -> State Nat (Nat,a)
pairWithIndex v = ST $ \index => (S index, (index, v))
In addition, we can define some more utility functions. Here's
one for getting the current state without modifying it
(this corresponds to readIORef
):
get : State st st
get = ST $ \s => (s,s)
Here are two others, for overwriting the current state. These
corresponds to writeIORef
and modifyIORef
:
put : st -> State st ()
put v = ST $ \_ => (v,())
modify : (st -> st) -> State st ()
modify f = ST $ \v => (f v,())
Finally, we can define three functions in addition to runST
for running stateful computations
runState : st -> State st a -> (st, a)
runState = flip runST
evalState : st -> State st a -> a
evalState s = snd . runState s
execState : st -> State st a -> st
execState s = fst . runState s
All of these are useful on their own, but the real power of
State s
comes from the observation that it is a monad.
Before you go on, please spend some time and try implementing
Functor
, Applicative
, and Monad
for State s
yourself.
Even if you don't succeed, you will have an easier time
understanding how the implementations below work.
Functor (State st) where
map f (ST run) = ST $ \s => let (s2,va) = run s in (s2, f va)
Applicative (State st) where
pure v = ST $ \s => (s,v)
ST fun <*> ST val = ST $ \s =>
let (s2, f) = fun s
(s3, va) = val s2
in (s3, f va)
Monad (State st) where
ST val >>= f = ST $ \s =>
let (s2, va) = val s
in runST (f va) s2
This may take some time to digest, so we come back to it in a
slightly advanced exercise. The most important thing to note is,
that we use every state value only ever once. We must make sure
that the updated state is passed to later computations, otherwise
the information about state updates is being lost. This can
best be seen in the implementation of Applicative
: The initial
state, s
, is used in the computation of the function value,
which will also return an updated state, s2
, which is then
used in the computation of the function argument. This will
again return an updated state, s3
, which is passed on to
later stateful computations together with the result of
applying f
to va
.
This sections consists of two extended exercise, the aim of which is to increase your understanding of the state monad. In the first exercise, we will look at random value generation, a classical application of stateful computations. In the second exercise, we will look at an indexed version of a state monad, which allows us to not only change the state's value but also its type during computations.
-
Below is the implementation of a simple pseudo-random number generator. We call this a pseudo-random number generator, because the numbers look pretty random but are generated predictably. If we initialize a series of such computations with a truly random seed, most users of our library will not be able to predict the outcome of our computations.
rnd : Bits64 -> Bits64 rnd seed = fromInteger $ (437799614237992725 * cast seed) `mod` 2305843009213693951
The idea here is that the next pseudo-random number gets calculated from the previous one. But once we think about how we can use these numbers as seeds for computing random values of other types, we realize that these are just stateful computations. We can therefore write down an alias for random value generators as stateful computations:
Gen : Type -> Type Gen = State Bits64
Before we begin, please note that
rnd
is not a very strong pseudo-random number generator. It will not generate values in the full 64bit range, nor is it safe to use in cryptographic applications. It is sufficient for our purposes in this chapter, however. Note also, that we could replacernd
with a stronger generator without any changes to the functions you will implement as part of this exercise.-
Implement
bits64
in terms ofrnd
. This should return the current state, updating it afterwards by invoking functionrnd
. Make sure the state is properly updated, otherwise this won't behave as expected.bits64 : Gen Bits64
This will be our only primitive generator, from which we will derived all the others. Therefore, before you continue, quickly test your implementation of
bits64
at the REPL:Solutions.Traverse> runState 100 bits64 (2274787257952781382, 100)
-
Implement
range64
for generating random values in the range[0,upper]
. Hint: Usebits64
andmod
in your implementation but make sure to deal with the fact thatmod x upper
produces values in the range[0,upper)
.range64 : (upper : Bits64) -> Gen Bits64
Likewise, implement
interval64
for generating values in the range[min a b, max a b]
:interval64 : (a,b : Bits64) -> Gen Bits64
Finally, implement
interval
for arbitrary integral types.interval : Num n => Cast n Bits64 => (a,b : n) -> Gen n
Note, that
interval
will not generate all possible values in the given interval but only such values with aBits64
representation in the the range[0,2305843009213693950]
. -
Implement a generator for random boolean values.
-
Implement a generator for
Fin n
. You'll have to think carefully about getting this one to typecheck and be accepted by the totality checker without cheating. Note: Have a look at functionData.Fin.natToFin
. -
Implement a generator for selecting a random element from a vector of values. Use the generator from exercise 4 in your implementation.
-
Implement
vect
andlist
. In case oflist
, the first argument should be used to randomly determine the length of the list.vect : {n : _} -> Gen a -> Gen (Vect n a) list : Gen Nat -> Gen a -> Gen (List a)
Use
vect
to implement utility functiontestGen
for testing your generators at the REPL:testGen : Bits64 -> Gen a -> Vect 10 a
-
Implement
choice
.choice : {n : _} -> Vect (S n) (Gen a) -> Gen a
-
Implement
either
.either : Gen a -> Gen b -> Gen (Either a b)
-
Implement a generator for printable ASCII characters. These are characters with ASCII codes in the interval
[32,126]
. Hint: Functionchr
from the Prelude will be useful here. -
Implement a generator for strings. Hint: Function
pack
from the Prelude might be useful for this.string : Gen Nat -> Gen Char -> Gen String
-
We shouldn't forget about our ability to encode interesting things in the types in Idris, so, for a challenge and without further ado, implement
hlist
(note the distinction betweenHListF
andHList
). If you are rather new to dependent types, this might take a moment to digest, so don't forget to use holes.data HListF : (f : Type -> Type) -> (ts : List Type) -> Type where Nil : HListF f [] (::) : (x : f t) -> (xs : HLift f ts) -> HListF f (t :: ts) hlist : HListF Gen ts -> Gen (HList ts)
-
Generalize
hlist
to work with any applicative functor, not justGen
.
If you arrived here, please realize how we can now generate pseudo-random values for most primitives, as well as regular sum- and product types. Here is an example REPL session:
> testGen 100 $ hlist [bool, printableAscii, interval 0 127] [[True, ';', 5], [True, '^', 39], [False, 'o', 106], [True, 'k', 127], [False, ' ', 11], [False, '~', 76], [True, 'M', 11], [False, 'P', 107], [True, '5', 67], [False, '8', 9]]
Final remarks: Pseudo-random value generators play an important role in property based testing libraries like QuickCheck or Hedgehog. The idea of property based testing is to test predefined properties of pure functions against a large number of randomly generated arguments, to get strong guarantees about these properties to hold for all possible arguments. One example would be a test for verifying that the result of reversing a list twice equals the original list. While it is possible to proof many of the simpler properties in Idris directly without the need for tests, this is no longer possible as soon as functions are involved, which don't reduce during unification such as foreign function calls or functions not publicly exported from other modules.
-
-
While
State s a
gives us a convenient way to talk about stateful computations, it only allows us to mutate the state's value but not its type. For instance, the following function cannot be encapsulated inState
because the type of the state changes:uncons : Vect (S n) a -> (Vect n a, a) uncons (x :: xs) = (xs, x)
Your task is to come up with a new state type allowing for such changes (sometimes referred to as an indexed state data type). The goal of this exercise is to also sharpen your skills in expressing things at the type level including derived function types and interfaces. Therefore, I will give only little guidance on how to go about this. If you get stuck, feel free to peek at the solutions but make sure to only look at the types at first.
-
Come up with a parameterized data type for encapsulating stateful computations where the input and output state type can differ. It must be possible to wrap
uncons
in a value of this type. -
Implement
Functor
for your indexed state type. -
It is not possible to implement
Applicative
for this indexed state type (but see also exercise 2.vii). Still, implement the necessary functions to use it with idom brackets. -
It is not possible to implement
Monad
for this indexed state type. Still, implement the necessary functions to use it in do blocks. -
Generalize the functions from exercises 3 and 4 with two new interfaces
IxApplicative
andIxMonad
and provide implementations of these for your indexed state data type. -
Implement functions
get
,put
,modify
,runState
,evalState
, andexecState
for the indexed state data type. Make sure to adjust the type parameters where necessary. -
Show that your indexed state type is strictly more powerful than
State
by implementingApplicative
andMonad
for it.Hint: Keep the input and output state identical. Note also, that you might need to implement
join
manually if Idris has trouble inferring the types correctly.
Indexed state types can be useful when we want to make sure that stateful computations are combined in the correct sequence, or that scarce resources get cleaned up properly. We might get back to such use cases in later examples.
-
After our excursion into the realms of stateful computations, we
will go back and combine mutable state with error accumulation
to tag and read CSV lines in a single traversal. We already
defined pairWithIndex
for tagging lines with their indices.
We also have uncurry $ hdecode ts
for decoding single tagged lines.
We can now combine the two effects in a single computation:
tagAndDecode : (0 ts : List Type)
-> CSVLine (HList ts)
=> String
-> State Nat (Validated CSVError (HList ts))
tagAndDecode ts s = uncurry (hdecode ts) <$> pairWithIndex s
Now, as we learned before, applicative functors are closed under
composition, and the result of tagAndDecode
is a nesting
of two applicatives: State Nat
and Validated CSVError
.
The Prelude exports a corresponding named interface implementation
(Prelude.Applicative.Compose
), which we can use for traversing
a list of strings with tagAndDecode
.
Remember, that we have to provide named implementations explicitly.
Since traverse
has the applicative functor as its
second constraint, we also need to provide the first
constraint (Traversable
) explicitly. But this
is going to be the unnamed default implementation! To get our hands on such
a value, we can use the %search
pragma:
readTable : (0 ts : List Type)
-> CSVLine (HList ts)
=> List String
-> Validated CSVError (List $ HList ts)
readTable ts = evalState 1 . traverse @{%search} @{Compose} (tagAndDecode ts)
This tells Idris to use the default implementation for the
Traversable
constraint, and Prelude.Applicatie.Compose
for the
Applicative
constraint.
While this syntax is not very nice, it doesn't come up too often, and
if it does, we can improve things by providing custom functions
for better readability:
traverseComp : Traversable t
=> Applicative f
=> Applicative g
=> (a -> f (g b))
-> t a
-> f (g (t b))
traverseComp = traverse @{%search} @{Compose}
readTable' : (0 ts : List Type)
-> CSVLine (HList ts)
=> List String
-> Validated CSVError (List $ HList ts)
readTable' ts = evalState 1 . traverseComp (tagAndDecode ts)
Note, how this allows us to combine two computational effects (mutable state and error accumulation) in a single list traversal.
But I am not yet done demonstrating the power of composition. As you showed
in one of the exercises, Traversable
is also closed under composition,
so a nesting of traversables is again a traversable. Consider the following
use case: When reading a CSV file, we'd like to allow lines to be
annotated with additional information. Such annotations could be
mere comments but also some formatting instructions or other
custom data tags might be feasible.
Annotations are supposed to be separated from the rest of the
content by a single hash character (#
).
We want to keep track of these optional annotations
so we come up with a custom data type encapsulating
this distinction:
data Line : Type -> Type where
Annotated : String -> a -> Line a
Clean : a -> Line a
This is just another container type and we can
easily implement Traversable
for Line
(do this yourself as
a quick exercise):
Functor Line where
map f (Annotated s x) = Annotated s $ f x
map f (Clean x) = Clean $ f x
Foldable Line where
foldr f acc (Annotated _ x) = f x acc
foldr f acc (Clean x) = f x acc
Traversable Line where
traverse f (Annotated s x) = Annotated s <$> f x
traverse f (Clean x) = Clean <$> f x
Below is a function for parsing a line and putting it in its correct category. For simplicity, we just split the line on hashes: If the result consists of exactly two strings, we treat the second part as an annotation, otherwise we treat the whole line as untagged CSV content.
readLine : String -> Line String
readLine s = case split ('#' ==) s of
h ::: [t] => Annotated t h
_ => Clean s
We are now going to implement a function for reading whole CSV tables, keeping track of line annotations:
readCSV : (0 ts : List Type)
-> CSVLine (HList ts)
=> String
-> Validated CSVError (List $ Line $ HList ts)
readCSV ts = evalState 1
. traverse @{Compose} @{Compose} (tagAndDecode ts)
. map readLine
. lines
Let's digest this monstrosity. This is written in point-free
style, so we have to read it from end to beginning. First, we
split the whole string at line breaks, getting a list of strings
(function Data.String.lines
). Next, we analyze each line,
keeping track of optional annotations (map readLine
).
This gives us a value of type List (Line String)
. Since
this is a nesting of traversables, we invoke traverse
with a named instance from the Prelude: Prelude.Traversable.Compose
.
Idris can disambiguate this based on the types, so we can
drop the namespace prefix. But the effectful computation
we run over the list of lines results in a composition
of applicative functors, so we also need the named implementation
for compositions of applicatives in the second
constraint (again without need of an explicit
prefix, which would be Prelude.Applicative
here).
Finally, we evaluate the stateful computation with evalState 1
.
Honestly, I wrote all of this without verifying if it works, so let's give it a go at the REPL. I'll provide two example strings for this, a valid one without errors, and an invalid one. I use multiline string literals here, about which I'll talk in more detail in a later chapter. For the moment, note that these allow us to conveniently enter string literals with line breaks:
validInput : String
validInput = """
f,12,-13.01#this is a comment
t,100,0.0017
t,1,100.8#color: red
f,255,0.0
f,24,1.12e17
"""
invalidInput : String
invalidInput = """
o,12,-13.01#another comment
t,100,0.0017
t,1,abc
f,256,0.0
f,24,1.12e17
"""
And here's how it goes at the REPL:
Tutorial.Traverse> readCSV [Bool,Bits8,Double] validInput
Valid [Annotated "this is a comment" [False, 12, -13.01],
Clean [True, 100, 0.0017],
Annotated "color: red" [True, 1, 100.8],
Clean [False, 255, 0.0],
Clean [False, 24, 1.12e17]]
Tutorial.Traverse> readCSV [Bool,Bits8,Double] invalidInput
Invalid (Append (FieldError 1 1 "o")
(Append (FieldError 3 3 "abc") (FieldError 4 2 "256")))
It is pretty amazing how we wrote dozens of lines of code, always being guided by the type- and totality checkers, arriving eventually at a function for parsing properly typed CSV tables with automatic line numbering and error accumulation, all of which just worked on first try.
The Prelude provides three additional interfaces for
container types parameterized over two type parameters
such as Either
or Pair
: Bifunctor
, Bifoldable
,
and Bitraversable
. In the following exercises we get
some hands-one experience working with these. You are
supposed to look up what functions they provide
and how to implement and use them yourself.
-
Assume we'd like to not only interpret CSV content but also the optional comment tags in our CSV files. For this, we could use a data type such as
Tagged
:data Tagged : (tag, value : Type) -> Type where Tag : tag -> value -> Tagged tag value Pure : value -> Tagged tag value
Implement interfaces
Functor
,Foldable
, andTraversable
but alsoBifunctor
,Bifoldable
, andBitraversable
forTagged
. -
Show that the composition of a bifunctor with two functors such as
Either (List a) (Maybe b)
is again a bifunctor by defining a dedicated wrapper type for such compositions and writing a corresponding implementation ofBifunctor
. Likewise forBifoldable
/Foldable
andBitraversable
/Traversable
. -
Show that the composition of a functor with a bifunctor such as
List (Either a b)
is again a bifunctor by defining a dedicated wrapper type for such compositions and writing a corresponding implementation ofBifunctor
. Likewise forBifoldable
/Foldable
andBitraversable
/Traversable
. -
We are now going to adjust
readCSV
in such a way that it decodes comment tags and CSV content in a single traversal. We need a new error type to include invalid tags for this:data TagError : Type where CE : CSVError -> TagError InvalidTag : (line : Nat) -> (tag : String) -> TagError Append : TagError -> TagError -> TagError Semigroup TagError where (<+>) = Append
For testing, we also define a simple data type for color tags:
data Color = Red | Green | Blue
You should now implement the following functions, but please note that while
readColor
will need to access the current line number in case of an error, it must not increase it, as otherwise line numbers will be wrong in the invocation oftagAndDecodeTE
.readColor : String -> State Nat (Validated TagError Color) readTaggedLine : String -> Tagged String String tagAndDecodeTE : (0 ts : List Type) -> CSVLine (HList ts) => String -> State Nat (Validated TagError (HList ts))
Finally, implement
readTagged
by using the wrapper type from exercise 3 as well asreadColor
andtagAndDecodeTE
in a call tobitraverse
. The implementation will look very similar toreadCSV
but with some additional wrapping and unwrapping at the right places.readTagged : (0 ts : List Type) -> CSVLine (HList ts) => String -> Validated TagError (List $ Tagged Color $ HList ts)
Test your implementation with some example strings at the REPL.
You can find more examples for functor/bifunctor compositions in Haskell's bifunctors package.
Interface Traversable
and its main function traverse
are incredibly
powerful forms of abstraction - even more so, because both Applicative
and Traversable
are closed under composition. If you are interested
in additional use cases, the publication, which
introduced Traversable
to Haskell, is a highly recommended read:
The Essence of the Iterator Pattern
The base library provides an extended version of the
state monad in module Control.Monad.State
. We will look
at this in more detail when we talk about monad transformers.
Please note also, that IO
itself is implemented as a
simple state monad
over an abstract, primitive state type: %World
.
Here's a short summary of what we learned in this chapter:
- Function
traverse
is used to run effectful computations over container types without affecting their size or shape. - We can use
IORef
as mutable references in stateful computations running inIO
. - For referentially transparent computations with "mutable"
state, the
State
monad is extremely useful. - Applicative functors are closed under composition, so we can run several effectful computations in a single traversal.
- Traversables are also closed under composition, so we can
use
traverse
to operate on a nesting of containers.
For now, this concludes our introduction of the Prelude's
higher-kinded interfaces, which started with the introduction of
Functor
, Applicative
, and Monad
, before moving on to Foldable
,
and - last but definitely not least - Traversable
.
There's one still missing - Alternative
- but this will
have to wait a bit longer, because we need to first make
our brains smoke with some more type-level wizardry.