(Un)folding Programs

Boom Hierarchy

One framework to begin with, one which pulls together a lot of the interesting ideas here, is the Boom hierarchy.

It is a family of structures classified by the presence/absence of 4 properties (a unit element, associativity, commutativity, idempotency), giving us 2⁴ = 16 structures.

Write about how they can be classified as { monoids, semigroups, trees, non-empty trees }
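A minimal Haskell sketch of one common reading of this classification (the names here are mine, not standard): binary trees are the free structure with no laws imposed, quotienting by associativity and the unit laws gives lists, and further quotients by commutativity and idempotency would give bags and sets.

#+BEGIN_SRC haskell
-- Trees: the free structure on a binary operator with a unit (no laws).
data Tree a = Empty | Leaf a | Fork (Tree a) (Tree a)

-- Lists are trees quotiented by associativity and the unit laws;
-- flattening is the quotient map: it forgets the nesting of Forks.
flattenTree :: Tree a -> [a]
flattenTree Empty      = []
flattenTree (Leaf a)   = [a]
flattenTree (Fork l r) = flattenTree l ++ flattenTree r
#+END_SRC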

TODO: Find out what the best structures to characterize the trees and their variants (idempotent, non-empty) are.

TODO: Insert a diagram on this here.

Also, show how some are forgetful and some are free.

Talk about the idea of adjunctions.

Probably not the right place to expand on Galois theory, but a broad-strokes overview might help here. Just keep the bits salient to computation; I can then point to resources which elaborate on this.

Abstract Interpretation, Noson Yanofsky / Rohit Parikh work.

Discuss the free vs. forgetful distinction in category theory.

Talk about why ++ is a free structure while Last or +, * are not.

Discuss the practical implications of having a free structure in programming.
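A small illustration of the contrast, assuming GHC's base library (the name collapse is mine): lists under ++ satisfy only the monoid laws, so distinct sequences stay distinct, and any monoid homomorphism out of them is fixed by its action on singletons. (+) satisfies an extra equation, commutativity, so it cannot be free.

#+BEGIN_SRC haskell
import Data.Monoid (Sum (..))

-- The unique monoid homomorphism out of the free monoid [Int],
-- determined entirely by where it sends singletons.
collapse :: [Int] -> Sum Int
collapse = foldMap Sum

-- [1,2] /= [2,1] in the free monoid, yet collapse identifies them:
-- collapse [1,2] == collapse [2,1] == Sum 3, because (+) satisfies an
-- extra law (commutativity) that the free structure does not impose.
#+END_SRC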

Discuss the properties: associativity, commutativity, distributivity, idempotency, reflexivity, symmetry, transitivity.

TODO: Describe how logic can be used to capture the various enumerative combinatorics of the structures.

Quantifying structures using logical operators and another approach using generating functions.

TODO: Tie in groupoidification, combinatorial species, generatingfunctionology, orthogonal polynomials, analytic functors, and the twelvefold way once you investigate them more deeply. Maintain only the contextually relevant details when describing them: we are using them to understand computation and how they relate to program derivation.

Lattices

These are nice structures. They can be thought of as giving a principled way to track the evolution of an entity along multiple paths. By the nature of their axiomatization, each stage in this evolution is tracked so that the commonalities between the branches of evolution converge to common points and then diverge out from there. This is the process interpretation of lattices. In the structure interpretation, the best way to understand them is as closures and kernels of the problem domain.

TODO: I have to understand the idea of closures/kernels, implication/conjunction, and their counterparts in algebraic geometry, and how they connect up with the composition of the lattice.

There are at least two dualities in play here

Intersection / Union duality

Chain / Antichain duality

TODO: Describe how they are related to partial orders.

How you can do closures and kernels on these structures to generate transitive reduction and transitive closure. What does this mean in terms of logic?

What role does the lattice play?

How did the idea of lattices emerge?

Dualgruppen idea

Link with how the ideas of splitting fields and group extensions gave birth to Galois theory

TODO: Investigate why Peirce was after grounding some kind of duality using lattice-like structures.

TODO: Investigate how it connects with Schröder's work.

TODO: Link with the history papers here.

TODO: Link with the work of Dedekind and how he leveraged the idea of ideals/filters and identified the concept of a PID.

Ideals/Filters

Link to articles like that of Sowa and ISKO

TODO: Think about how lattices are connected to quantifiers and quantifier elimination.

TODO: Talk about how they are related to sheaves/fibres/sections/stacks etc.

Talk about how adjunctions between lattices give you closure/kernel operators, and how this is mirrored by monads and comonads.

How lattices connect up with the idea of eigenvalue decomposition and extended Euclidean algorithm

Briefly mention the connection between Fourier transforms, Bézout's theorem, the Euclidean GCD, Singular Value Decomposition, Krohn-Rhodes theory, Eigenvalue Decomposition, etc.

https://twitter.com/LdaFCosta/status/1454520031132459008 https://sridharramesh.github.io/HowSridharThinks/math/BezoutEtc.html

Formal Concept Analysis

With formal concept analysis, you get a way to partition the problem space so that the components are linearly independent. You attempt to coordinatize the space so that your axes, like eigenvectors, span the whole space; when you decompose it into subcomponents, the intertwining is reduced and what you get out are orthogonal components which you can linearly recombine to recover the original space. They are in some sense the necessary and sufficient concepts to describe the space parsimoniously: an equivalence class with maximal information, i.e. the least amount of redundancy (entropy?) needed to describe the original space. It also prevents a skewed coordinate system, which might cause two things to overlap on one idea or, conversely, one thing to appear as two ideas.

Mention the duality of structure and process from Anna Sfard's paper and how it connects up with the idea of right and left folds having time/space tradeoffs.

Eigenvalues describing the variance in a distribution

Covariance/correlation of distributions

Lattice homomorphisms

Sets ~ Distributive? Lattices
Ordered Sets ~ Matrices
Multisets ~ Modular? Lattices
Lists ~ Hypermatrices

One of my realizations lately has been that matrices model functions: https://twitter.com/prathyvsh/status/1459936336182452233 So I think matrices represent ordered sets in the Boom hierarchy because you have a certain sense of direction, with domain mapping to codomain. With lattices, what you are representing are the multisets/sets, which can have different overlaps/elisions within them. I will gladly elaborate on this if you have any doubts. But I think I

Monotonicity/continuity

Program calculation

Discuss the two schools: Eindhoven / Oxford

Talk about their differences

Talk about the general approach in a schematic manner once you investigate it.

TODO: Read that Backhouse paper and investigate how Eindhoven style relates to Oxford style

Stalactite vs. Stalagmite distinction in quotients vs. subgroups

Injectivity vs. Surjectivity

Two aspects of Homomorphism

How this is also reflected in the canonical decomposition of a function and in formal concept analysis style disjunction

Richard Bird Oxford, 16th and 17th December 1986

A calculus used for deriving efficient solutions to certain kinds of computational problems. A set of laws, lemmas, and theorems in the same sense as the integral calculus. It does not stand for a formal system with axioms and inference rules. Semantic and foundational issues are not touched upon.

The theory of expression trees is waiting to be organized.

Program specification / transformation

General laws which enable programs to be calculated.

Richard Bird has been dissatisfied with the lack of penetrating results, such as those of the integral calculus that scientists and engineers use daily.

Only two players: Richard Bird and Lambert Meertens

Dijkstra, Gries, Backhouse: program calculation with invariant assertions

The difference with the Eindhoven school is one of style, not of objective.

Eindhoven uses imperative notation, takes predicate calculus as the main tool, and presents many of the results using arrays.

Bird-Meertens uses functional notation and a specialized functional calculus, and considers lists a more basic data structure than arrays.

Summary:

First Lecture

Notation suggested by David Turner

Second Lecture

First example of a calculational proof and some further notation

Third Lecture

Problems about segments of lists

Fourth Lecture

Problems about partitions of lists

Fifth Lecture

More material and further examples

1 List Notation

Lists: Ordered and homogenous

[]   ['a']   ['a','p','p','l','e']

2 Convention

a, b, c, … elements of lists
x, y, z, … lists
xs, ys, zs, … lists of lists

3 Length

# [a₁, a₂, …, aₙ] = n

4 Concatenation

++ :: [α] × [α] → [α]
Associativity: (x ++ y) ++ z = x ++ (y ++ z)
Identity: x ++ [] = [] ++ x = x

# (x ++ y) = # x + # y { #, ++ distribution }

5 Map

* :: (α → β) × [α] → [β]

f * [a₁, a₂, …, aₙ] = [f a₁, f a₂, …, f aₙ]

f * ( x ++ y ) = ( f * x ) ++ ( f * y ) { *, ++ distributivity }

(f • g) * = (f * ) • ( g * ) { *, • dist }

( f * )⁻¹ = ( f⁻¹ * ) { *, ⁻¹ comm }

6 Notational interlude

Let ⊕ :: (α × β) → γ

(a ⊕) :: β → γ | (a ⊕) b = a ⊕ b

(⊕ b) :: α → γ | (⊕ b) a = a ⊕ b

(f *) is the f-map function
(+ 1) is the successor function
(++ [a]) is the append-a function

Function application is left-associative and has highest precedence

f x y + 3 = ((f x) y) + 3

7 Filter

◁ :: (α → Bool) × [α] → [α]
p ◁ x is the sublist of elements of x satisfying p

even ◁ [1 .. 10] = [2, 4, 6, 8, 10]

p ◁ ( x ++ y) = (p ◁ x) ++ (p ◁ y) {◁, ++ dist }

(p ◁) • (p ◁) = (p ◁) { ◁ idem }
(p ◁) • (q ◁) = (q ◁) • (p ◁) { ◁ comm } (for total functions)

(p ◁) • (f *) = (f *) • ((p • f) ◁) { ◁, • comm }
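The three filter laws, stated as runnable QuickCheck properties at one concrete instantiation (p = even, f = (*3)); this sketch assumes the QuickCheck package is available.

#+BEGIN_SRC haskell
import Test.QuickCheck

-- { ◁, ++ dist }: filter distributes over concatenation.
prop_dist :: [Int] -> [Int] -> Bool
prop_dist x y = filter even (x ++ y) == filter even x ++ filter even y

-- { ◁ idem }: filtering twice with the same predicate changes nothing.
prop_idem :: [Int] -> Bool
prop_idem x = (filter even . filter even) x == filter even x

-- { ◁, • comm }: filter p . map f == map f . filter (p . f).
prop_comm :: [Int] -> Bool
prop_comm x = (filter even . map (*3)) x == (map (*3) . filter (even . (*3))) x

main :: IO ()
main = mapM_ quickCheck [property prop_dist, property prop_idem, property prop_comm]
#+END_SRC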

8 Reduce

Borrowed from APL

An operator which takes an operator:
/ :: (α × α → α) × [α] → α
⊕ / [a₁, a₂, …, aₙ] = a₁ ⊕ a₂ ⊕ … ⊕ aₙ
Only defined if ⊕ is associative.

Laws

⊕ / [a] = a { / singletons }
⊕ / (x ++ y) = (⊕ / x) ⊕ (⊕ / y) { / dist }

If ⊕ has an identity element e, then ⊕ / [] = e otherwise, ⊕ / [] is not defined

⊕ / y = ⊕ / ([] ++ y) = (⊕ / []) ⊕ (⊕ / y) = e ⊕ (⊕ / y)
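A sketch of reduce in Haskell: with an identity element it is an ordinary fold, and without one it is only defined on non-empty lists.

#+BEGIN_SRC haskell
-- ⊕ / when ⊕ has identity e; assumes ⊕ is associative with unit e.
reduce :: (a -> a -> a) -> a -> [a] -> a
reduce op e = foldr op e

-- ⊕ / when ⊕ has no identity; partial: undefined on the empty list.
reduce1 :: (a -> a -> a) -> [a] -> a
reduce1 = foldr1
#+END_SRC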

9 Examples

sum = + /
product = × /

n! = × / [1 .. n]

flatten = ++ /

flatten [[1, 2], [], [2, 3]] = [1, 2, 2, 3]

min = ↓ /
max = ↑ /

head = << /
last = >> /

all p = (∧ /) • (p *)
some p = (∨ /) • (p *)
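The same examples transcribed into Haskell as a sketch; the R-suffixed names are mine, and << and >> are renamed firstOf/secondOf to avoid clashing with Prelude operators.

#+BEGIN_SRC haskell
-- << and >> of the notes, under other names.
firstOf, secondOf :: a -> a -> a
firstOf  a _ = a
secondOf _ b = b

sumR, productR :: Num a => [a] -> a
sumR     = foldr (+) 0        -- + /
productR = foldr (*) 1        -- × /

factR :: Integer -> Integer
factR n = productR [1 .. n]   -- n! = × / [1 .. n]

flattenR :: [[a]] -> [a]
flattenR = foldr (++) []      -- ++ /

minR, maxR :: Ord a => [a] -> a
minR = foldr1 min             -- ↓ /
maxR = foldr1 max             -- ↑ /

headR, lastR :: [a] -> a      -- partial: non-empty lists only
headR = foldr1 firstOf        -- << /
lastR = foldr1 secondOf       -- >> /

allR, someR :: (a -> Bool) -> [a] -> Bool
allR p = and . map p          -- (∧ /) • (p *)
someR p = or  . map p         -- (∨ /) • (p *)
#+END_SRC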

10 Promotion Lemmas

Generalize the distribution laws of map, filter, reduce

(f *) • (++ /) = (++ /) • ((f *) *) { * promotion }

Mapping f over a flattened list is the same as flattening after mapping (f *) over the component lists, as the calculation below shows.

f * (++ / [x₁, x₂, …, xₙ])
= f * (x₁ ++ x₂ ++ … ++ xₙ)
= (f * x₁) ++ (f * x₂) ++ … ++ (f * xₙ)
= ++ / [f * x₁, f * x₂, …, f * xₙ]
= ++ / ((f *) * [x₁, x₂, …, xₙ])

Rather than flattening first, promote the map into each component list and then flatten the result.

(p ◁) • (++ /) = (++ /) • ((p ◁) *) { ◁ promotion }

(⊕ /) • (++ /) = (⊕ /) • ((⊕ /) *) { / promotion }
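The three promotion lemmas as checkable properties at concrete instantiations (f = (+1), p = even, ⊕ = (+)); flatten is spelled with the Prelude's concat.

#+BEGIN_SRC haskell
-- { * promotion }: map after flatten == flatten after mapping inside.
prop_mapProm :: [[Int]] -> Bool
prop_mapProm xss = (map (+1) . concat) xss == (concat . map (map (+1))) xss

-- { ◁ promotion }: filter after flatten == flatten after filtering inside.
prop_filterProm :: [[Int]] -> Bool
prop_filterProm xss = (filter even . concat) xss == (concat . map (filter even)) xss

-- { / promotion }: reduce of a flatten == reduce of the componentwise reduces.
prop_redProm :: [[Int]] -> Bool
prop_redProm xss = sum (concat xss) == sum (map sum xss)

main :: IO ()
main = print (all ($ [[1,2],[],[3,4,5]]) [prop_mapProm, prop_filterProm, prop_redProm])
#+END_SRC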

11 Homomorphisms

A function h that maps concatenation to an associative operator ⊕ and the empty list to its identity e:

h [] = e
h (x ++ y) = h x ⊕ h y

Equivalently, if h • ( ++ / ) = (⊕ / ) • (h *)

12 Homomorphism Lemma

h is a homomorphism iff h = (⊕ / ) • (f *) for some ⊕ and f.

Proof

Suppose h = (⊕ /) • (f *). Then
h • (++ /)
= (⊕ /) • (f *) • (++ /) { hypothesis }
= (⊕ /) • (++ /) • ((f *) *) { * promotion }
= (⊕ /) • ((⊕ /) *) • ((f *) *) { / promotion }
= (⊕ /) • (((⊕ /) • (f *)) *) { *, • dist }
= (⊕ /) • (h *) { hypothesis }

Second, define □ a = [a]

so (++ /) • (□ *) = id

Now
h
= h • (++ /) • (□ *) { definition of □ }
= (⊕ /) • (h *) • (□ *) { h is a homomorphism }
= (⊕ /) • (f *) { *, • dist }

where f = h • □. Hence h = (⊕ /) • (f *) for suitable ⊕ and f.
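A concrete instance of the lemma, as a sketch: length is a homomorphism (length (x ++ y) = length x + length y, with e = 0), and the factorization gives f = length • □ = const 1.

#+BEGIN_SRC haskell
-- □ of the proof: the singleton former.
box :: a -> [a]
box a = [a]

-- length as (⊕ /) • (f *), with ⊕ = (+), e = 0, f = length . box = const 1.
lengthHom :: [a] -> Int
lengthHom = foldr (+) 0 . map (const 1)

-- lengthHom "apple" == 5 == length "apple"
#+END_SRC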

13 Examples of homomorphisms

Filter is a homomorphism:
(p ◁) = (++ /) • (f_p *)
where f_p a = [a], if p a
            = [], otherwise

K is the K combinator for combinatory calculus

sort = (merge /) • (□ *)

reverse = (++~ /) • (□ *)
where a ⊕~ b = b ⊕ a
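Runnable sketches of these two examples. Note that this foldr-based sort is quadratic; the familiar mergesort efficiency needs a balanced reduction tree, which a flat right fold does not provide.

#+BEGIN_SRC haskell
merge :: Ord a => [a] -> [a] -> [a]
merge [] ys = ys
merge xs [] = xs
merge (x:xs) (y:ys)
  | x <= y    = x : merge xs (y:ys)
  | otherwise = y : merge (x:xs) ys

-- sort = (merge /) • (□ *)
sortH :: Ord a => [a] -> [a]
sortH = foldr merge [] . map (\a -> [a])

-- reverse = (++~ /) • (□ *), with ++~ the converse of ++.
reverseH :: [a] -> [a]
reverseH = foldr (flip (++)) [] . map (\a -> [a])
#+END_SRC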

14 Lemma

Define a ⊕ b = h (h⁻¹ a ++ h⁻¹ b)

Then h (x ++ y) = h x ⊕ h y

Try to solve a problem by looking for a homomorphism.

15 Text Processing

Text = [Char]
Line = [Char \ {NL}]

unlines :: [Line]⁺ → Text
unlines = ⊕ / where x ⊕ y = x ++ [NL] ++ y

lines is an injective function

lines :: Text → [Line]
lines • unlines = id

Problem: give a constructive definition of lines

Since lines is an injective function (intuitively at least)

lines = (⊗ / ) • (f *)

Direct calculation yields:

f a = [[], []], if a = NL
    = [[a]], otherwise

(xs ++ [x]) ⊗ ([y] ++ ys) = xs ++ [x ++ y] ++ ys
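A runnable sketch of the derived lines, with ⊗ spelled glue (names mine): f sends NL to [[], []] and any other character to [[a]], and glue joins the last line of its left argument to the first line of its right.

#+BEGIN_SRC haskell
-- ⊗ as in the derivation: (xs ++ [x]) ⊗ ([y] ++ ys) = xs ++ [x ++ y] ++ ys.
-- Partial on empty lists of lines, but f below never produces one.
glue :: [String] -> [String] -> [String]
glue xs ys = init xs ++ [last xs ++ head ys] ++ tail ys

-- lines = (⊗ /) • (f *); [""] is the unit of glue and covers the empty text.
-- Note linesH "" == [""], matching lines • unlines = id rather than the
-- Prelude's lines.
linesH :: String -> [String]
linesH = foldr glue [""] . map f
  where
    f '\n' = ["", ""]
    f c    = [[c]]

-- linesH "ab\ncd" == ["ab", "cd"]
#+END_SRC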

16 More text-processing

Word = [Char \ {SP, NL}]⁺
Para = [Line⁺]⁺

unwords :: [Word]⁺ → Line
unwords = ⊕_SP / where x ⊕_SP y = x ++ [SP] ++ y

words :: Line → [Word]
words = ((≠ []) ◁) • (⊗ /) • (f_SP *)

unparas :: [Para]⁺ → [Line]
unparas = ⊕_[] / where x ⊕_[] y = x ++ [[]] ++ y

paras :: [Line] → [Para]
paras = ((≠ []) ◁) • (⊗ /) • (f_[] *)

17 Examples of use

countlines = # • lines

countwords = # • (++ / ) • (words * ) • lines

countparas = # • paras • lines
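Prelude transcriptions of the first two counts (countparas has no Prelude analogue, since paras over blank-line-separated blocks would have to be defined first; it is omitted here):

#+BEGIN_SRC haskell
countLines, countWords :: String -> Int
countLines = length . lines                    -- # • lines
countWords = length . concatMap words . lines  -- # • (++ /) • (words *) • lines
#+END_SRC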

normalize :: Text → Text
normalize = unparse • parse

parse = ((words *) *) • paras • lines
unparse = unlines • unparas • ((unwords *) *)

Unparse is correct because

(f • g)⁻¹ = g⁻¹ • f⁻¹
(f *)⁻¹ = (f⁻¹ *)

20 Directed Reduction

←/- right-reduce
-/→ left-reduce

(⊕ ←/- e) [a₁, a₂, …, aₙ] = a₁ ⊕ (a₂ ⊕ (… (aₙ ⊕ e)))
The ⊕ need not be associative and doesn't need to have a unit.

←/- :: ((α × β → β) × β) → [α] → β
-/→ :: ((β × α → β) × β) → [α] → β

(⊕ -/→ e) [a₁, a₂, …, aₙ] = ((e ⊕ a₁) ⊕ a₂) ⊕ … ⊕ aₙ

Why do we need more reductions? Because they are implementations of the fold function and are closer to what can be achieved by machines.

(f *) = (⊕ ←/- []) where a ⊕ x = [f a] ++ x
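In Haskell the directed reductions are exactly the Prelude's folds, and the map-as-right-reduce remark above becomes:

#+BEGIN_SRC haskell
-- (⊕ ←/- e) is foldr (⊕) e, and (⊕ -/→ e) is foldl (⊕) e.
-- Map as a right-reduce, with a ⊕ x = [f a] ++ x and e = []:
mapR :: (a -> b) -> [a] -> [b]
mapR f = foldr (\a x -> [f a] ++ x) []

-- mapR (+1) [1,2,3] == [2,3,4]
#+END_SRC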

21 Duality Lemma

(⊕ -/→ e) = (⊕~ ←/- e) • reverse

Example: fact n = × / [1 … n]
fact n = (× -/→ 1) [1 … n] “going up”

fact n = (× ←/- 1) [1 … n]
= (×~ -/→ 1) [n, n-1, … 1]
= (× -/→ 1) [n, n-1, … 1] “going down”

22 Specialization Lemma

Every homomorphism can be written as a left-reduction or as a right-reduction.

(⊙ /) • (f *) = (⊕ ←/- e) = (⊗ -/→ e)
where a ⊕ b = f a ⊙ b
      a ⊗ b = a ⊙ f b

and e is the identity element of ⊙.
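A sketch of the lemma in Haskell, fusing (⊙ /) • (f *) into a single fold in either direction (the names are mine):

#+BEGIN_SRC haskell
-- (⊙ /) • (f *) fused into a right-reduction: a ⊕ b = f a ⊙ b.
homAsFoldr :: (b -> b -> b) -> b -> (a -> b) -> [a] -> b
homAsFoldr dot e f = foldr (\a b -> f a `dot` b) e

-- ... and into a left-reduction: a ⊗ b = a ⊙ f b.
homAsFoldl :: (b -> b -> b) -> b -> (a -> b) -> [a] -> b
homAsFoldl dot e f = foldl (\b a -> b `dot` f a) e

-- Sum of squares, both ways:
-- homAsFoldr (+) 0 (^2) [1,2,3] == homAsFoldl (+) 0 (^2) [1,2,3] == 14
#+END_SRC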

22.5 Cons

(:) :: α × [α] → [α]
a : x = [a] ++ x
x ++ y = (: ←/- y) x

Either : or ++ can be taken as primitive, but unlike ++, every list can be expressed/constructed in terms of [] and : in exactly one way.

Cons has a canonical form, ++ has many ways in which the same thing can be expressed.

23 Recursive characterisation

(⊕ ←/- e) [] = e

(⊕ ←/- e) ( [a] ++ x ) = a ⊕ (⊕ ←/- e) x

We can say f = (⊕ ←/- e) is the solution of:

f [] = e
f ([a] ++ x) = a ⊕ f x

Progress of computation is recursive.

Similarly,

(⊕ -/→ e) [] = e
(⊕ -/→ e) (x ++ [a]) = (⊕ -/→ e) x ⊕ a

But also
(⊕ -/→ e) [] = e
(⊕ -/→ e) ([a] ++ x) = (⊕ -/→ (e ⊕ a)) x

So f = (⊕ -/→ e) is the solution of:
f x = g e x
g e [] = e
g e ([a] ++ x) = g (e ⊕ a) x

So left-reductions have the advantage that they give an iterative notion of programming. They are immediately expressible in terms of imperative notions:

y := (⊕ -/→ e) x ⟹ y := e; for a ← x do y := y ⊕ a
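The transliteration into Haskell, as a sketch: Data.List's foldl' is the left-reduction that forces the accumulator at each step, matching the destructive update of y in the loop.

#+BEGIN_SRC haskell
import Data.List (foldl')

-- y := e; for a ← x do y := y ⊕ a
loopFold :: (b -> a -> b) -> b -> [a] -> b
loopFold op e = foldl' op e

-- loopFold (+) 0 [1 .. 10] == 55
#+END_SRC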

24 Efficiency Consideration

In a functional programming language, ←/- can be more time-efficient than -/→, and -/→ can be more space-efficient than ←/-.

Recall a << b = a and a >> b = b.

(⊕ ←/- e) ([a] ++ x) = a ⊕ (⊕ ←/- e) x
(⊕ -/→ e) ([a] ++ x) = (⊕ -/→ (e ⊕ a)) x

(<< ←/- e) [1, 2, 3]
= 1 << (<< ←/- e) [2, 3] { ←/- .2 }
= 1 { <<.1 }
This can terminate after one step.

(>> -/→ e) [3, 2, 1]
= (>> -/→ (e >> 3)) [2, 1] { -/→ .2 }
= (>> -/→ 3) [2, 1] { >>.1 }
= (>> -/→ (3 >> 2)) [1] { -/→ .2 }
= (>> -/→ 2) [1] { >>.1 }
= (>> -/→ (2 >> 1)) [] { -/→ .2 }
= (>> -/→ 1) [] { >>.1 }
= 1 { -/→ .1 }

The whole of the list must be traversed.

(+ ←/- 0) [1, 2, 3]
= 1 + (+ ←/- 0) [2, 3]
= 1 + (2 + (+ ←/- 0) [3])
= 1 + (2 + (3 + (+ ←/- 0) []))
= 1 + (2 + (3 + 0))
= 6

Linear space. Same size as the list that we started with.

(+ -/→ 0) [1, 2, 3]
= (+ -/→ (0 + 1)) [2, 3] { we can evaluate the accumulator now, keeping the expression small }
= (+ -/→ 1) [2, 3]
= (+ -/→ (1 + 2)) [3]
= (+ -/→ 3) [3]
= (+ -/→ (3 + 3)) []
= (+ -/→ 6) []
= 6
Constant space.

Conclusion

Use (⊕ -/→ e) when ⊕ is strict, in the sense that it requires the evaluation of both arguments to return a result. E.g.: +, ×, ↑, ↓

Use (⊕ ←/- e) when ⊕ is non-strict, that is, it does not always demand the complete evaluation of both arguments to return a result.

E.g.: ∧, ∨, <<
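The same contrast is directly observable in GHCi (a sketch; firstOf is << under another name):

#+BEGIN_SRC haskell
import Data.List (foldl')

firstOf :: a -> b -> a
firstOf a _ = a

-- Non-strict operator with a right-reduce: terminates after one step,
-- even on an infinite list (the tail is never demanded).
demoLazy :: Int
demoLazy = foldr firstOf undefined [1 ..]   -- == 1

-- Strict operator with a strict left-reduce: runs in constant space.
demoStrict :: Int
demoStrict = foldl' (+) 0 [1 .. 1000000]    -- == 500000500000
#+END_SRC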