Skip to content

encodings guide

JohanWiltink edited this page Mar 4, 2024 · 6 revisions

Lambda Calculus Encodings Reference Guide

Church encoding

Church encoding encodes values by encoding flow control. Applied operations and values are automatically applied to nested values. Short-circuiting is enabled by laziness, but is limited.

For non-recursive datatypes, this coincides ( up to argument ordering ) with Scott encoding.

Church numerals

zero = \ f x . x
succ = \ n . \ f x . f (n f x)

The resulting value is not just a number, it is an iterator ( for loop ). Applying an operation and a value results in the operation being applied n times to the value.

Eg. n tail xs can be interpreted directly as drop n xs, without intermediately decreasing the number and checking if it is zero or not.

In practice, pred is an expensive operation, but add, mul and pow are inexpensive ( full evaluation is still O(n) ).

Church booleans

True  = \ true _false . true
False = \ _true false . false

Church booleans choose between two options. By convention, Church booleans name their True value first.

p x y can be interpreted directly as if p then x else y, where the whole expression has the type of either x or y ( x and y having different types makes for complicated control flow ).

Church pairs

Pair = \ x y fn . fn x y
fst  = \ xy . xy \ x _y . x
snd  = \ xy . xy \ _x y . y
uncurry = \ fn xy . xy fn

A curried function can be applied to a pair postfix directly, but rewriting the expression for prefix application makes composition simpler.

Church lists

nil = \ _cons nil . nil
cons = \ x xs . \ cons nil . cons x (xs cons nil)
null = \ xs . xs ( \ _head _tail . False ) True

Church lists are represented by their right fold.

In practice, tail is an expensive operation, but snoc, append and concat are inexpensive ( full evaluation is still O(n) ).

Scott encoding

Scott encoding, rather than encoding flow control, encodes values directly by encoding pattern matching. For every data constructor, there is a corresponding function argument of the same arity in the lambda calculus constructor function.

Scott encoding allows applying different operations and values at different levels of recursive nesting. Wrapper operators have to be written to pass operations down the recursion, and thus encoding flow control.

Scott numerals

zero = \ zero _succ . zero
succ = \ n . \ _zero succ . succ n

This encodes Number = Zero | Succ Number.

Scott booleans

False = \ false _true . false
True  = \ _false true . true

This coincides with Church encoding, being an enumerated, non-recursive type ( Boolean = False | True ), but ordering is according to False < True.

Scott pairs

Pair = \ x y fn . fn x y

This coincides with Church encoding, being a non-recursive type ( (,) a b = (,) a b, or (a,b) = (a,b) with syntactic sugar ).

Scott lists

nil = \ nil _cons . nil
cons = \ x xs . \ _nil cons . cons x xs
head = \ xs . xs () \ x _xs . x
tail = \ xs . xs () \ _x xs . xs

This encodes [] a = Nil | Cons a ([] a), or, with syntactic sugar, [a] = [] | a : [a]. Again, note how arguments are ordered according to [] < [_].

Destructuring is natural because operations do not flow down the recursion.

Scott orderings

LT = \ lt _eq _gt . lt
EQ = \ _lt eq _gt . eq
GT = \ _lt _eq gt . gt

This implements Ordering = LT | EQ | GT, and coincides with a Church encoding.

Scott option types

Nothing = \ nothing _just . nothing
Just = \ x . \ _nothing just . just x
# maybe :: z -> (a -> z) -> Maybe a -> z
maybe = \ nothing just maybeX . maybeX nothing just

This implements Maybe a = Nothing | Just a. Note how Nothing is a nullary function but Just is unary. Ordering of arguments is according to Nothing < Just _.

The standard library names this Option a = None | Some a ( and option ).

A Church encoding would coincide ( but probably reorder arguments to \ just nothing .. sigh ).

Scott sum types

Left  = \ x . \ left _right . left x
Right = \ x . \ _left right . right x
# either :: (a -> z) -> (b -> z) -> Either a b -> z
either = \ left right eitherX . eitherX left right

This implements Either a b = Left a | Right b. Note how both constructors are unary. A Church encoding would coincide.

Scott binary encoded numerals

The conceptual datatype for this is Number = NoMoreBits | UnsetBit SmallerNumber | SetBit SmallerNumber, with constructors ordered for easy comparisons.

The implementation uses Little-Endian ordering of an arbitrary number of bits, and maintains an invariant that the most significant bit, if present, is set.

An unset bit means an even number ( from that point to the right ); a set bit means an odd number; the portion to the right is the encoded number integer-divided by 2. The structure is equivalent to a bitwise trie ( prefix tree ).

Because encoding is Little-Endian, a shift right ( and shifting in a bit on the left ) is a multiplication of a number, adding either 0 or 1 onto the doubling.

The invariant can be intermediarily broken ( trailing zero bits, cf. leading zeroes in Big-Endian representations, do not affect the encoded value ), though implementations may have to have special handling in place for denormal numbers for certain operations. A normal zero is represented by the empty string, without a mathematically spurious zero bit for display-to-human purposes. Unless disabled, the compiler enforces the invariant for numbers that are exported to JavaScript.

zero = \ end _even _odd . end
shiftR0 = \ n . \ _end even _odd . even n
shiftR1 = \ n . \ _end _even odd . odd n

Implementing shift-left, which does not use recursion. Note that a shift left is a division of a number in Little-Endian.

# shiftL :: Number -> Number
shiftL = \ n . n zero I I
# alternatively
shiftL = \ n . n n I I

Flow control ( times ) is implemented similar to a right fold over a list ( foldr = \ fn z xs . xs z \ x xs . fn x (foldr fn z xs); times argument ordering follows Number constructor order ), with one initial value ( end ) but two functions acting on even and odd numbers, performing some doubling and some doubling-plus-one, respectively. even and odd get the accumulator as an argument but not the actual bit, which is implicit. All arguments are explicitly applied to times recursively.

# times :: Number -> z -> (z -> z) -> (z -> z) -> z
times = \ n . \ end even odd . n end ( \ z . even (times z end even odd) ) ( \ z . odd (times z end even odd) )

Implementing to-Int with times. Note how recursion is implicit.

# to-Int :: Number -> Int
to-Int = \ n . times n 0 ( \ n-div-2 . 2 * n-div-2 ) ( \ n-div-2 . 1 + 2 * n-div-2 )

Short-circuiting the recursion in times is possible, so a predicate is-even can be written using times

# is-even :: Number -> Boolean
is-even = \ n . times n True (K True) (K False)

but inlining times gives a simpler result.

# is-even :: Number -> Boolean
is-even = \ n . n True (K True) (K False)