preliminaries.tex

\chapter{Type theory}
\label{cha:typetheory}

\section{Type theory versus set theory}
\label{sec:types-vs-sets}
\label{sec:axioms}

\index{type theory}
Homotopy type theory is (among other things) a foundational language for mathematics, i.e., an alternative to Zermelo--Fraenkel\index{set theory!Zermelo--Fraenkel} set theory.
However, it behaves differently from set theory in several important ways, and that can take some getting used to.
Explaining these differences carefully requires us to be more formal here than we will be in the rest of the book.
As stated in the introduction, our goal is to write type theory \emph{informally}; but for a mathematician accustomed to set theory, more precision at the beginning can help avoid some common misconceptions and mistakes.

We note that a set-theoretic foundation has two ``layers'': the deductive system of first-order logic,\index{first-order!logic} and, formulated inside this system, the axioms of a particular theory, such as ZFC.
Thus, set theory is not only about sets, but rather about the interplay between sets (the objects of the second layer) and propositions (the objects of the first layer).

By contrast, type theory is its own deductive system: it need not be formulated inside any superstructure, such as first-order logic.
Instead of the two basic notions of set theory, sets and propositions, type theory has one basic notion: \emph{types}.
Propositions (statements which we can prove, disprove, assume, negate, and so on\footnote{Confusingly, it is also a common practice (dating 
back to Euclid) to use the word ``proposition'' synonymously with ``theorem''.
  We will confine ourselves to the logician's usage, according to which a \emph{proposition} is a statement \emph{susceptible to} proof, whereas a \emph{theorem}\indexfoot{theorem} (or ``lemma''\indexfoot{lemma} or ``corollary''\indexfoot{corollary}) is such a statement that \emph{has been} proven.
Thus ``$0=1$'' and its negation ``$\neg(0=1)$'' are both propositions, but only the latter is a theorem.}) are identified with particular types, via the correspondence shown in \cref{tab:pov} on page~\pageref{tab:pov}.
Thus, the mathematical activity of \emph{proving a theorem} is identified with a special case of the mathematical activity of \emph{constructing an object}---in this case, an inhabitant of a type that represents a proposition.

\index{deductive system}%
This leads us to another difference between type theory and set theory, but to explain it we must say a little about deductive systems in general.
Informally, a deductive system is a collection of \define{rules}
\indexdef{rule}%
for deriving things called \define{judgments}.
\indexdef{judgment}%
If we think of a deductive system as a formal game,
\index{game!deductive system as}%
then the judgments are the ``positions'' in the game which we reach by following the game rules.
We can also think of a deductive system as a sort of algebraic theory, in which case the judgments are the elements (like the elements of a group) and the deductive rules are the operations (like the group multiplication).
From a logical point of view, the judgments can be considered to be the ``external'' statements, living in the metatheory, as opposed to the ``internal'' statements of the theory itself.

In the deductive system of first-order logic (on which set theory is based), there is only one kind of judgment: that a given proposition has a proof.
That is, each proposition $A$ gives rise to a judgment ``$A$ has a proof'', and all judgments are of this form.
A rule of first-order logic such as ``from $A$ and $B$ infer $A\wedge B$'' is actually a rule of ``proof construction'' which says that given the judgments ``$A$ has a proof'' and ``$B$ has a proof'', we may deduce that ``$A\wedge B$ has a proof''.
Note that the judgment ``$A$ has a proof'' exists at a different level from the \emph{proposition} $A$ itself, which is an internal statement of the theory.
% In particular, we cannot manipulate it to construct propositions such as ``if $A$ has a proof, then $B$ does not have a proof''---unless we are using our set-theoretic foundation as a meta-theory with which to talk about some other axiomatic system.

The basic judgment of type theory, analogous to ``$A$ has a proof'', is written ``$a:A$'' and pronounced as ``the term $a$ has type $A$'', or more loosely ``$a$ is an element of $A$'' (or, in homotopy type theory, ``$a$ is a point of $A$'').
\indexdef{term}%
\indexdef{element}%
\indexdef{point!of a type}%
When $A$ is a type representing a proposition, then $a$ may be called a \emph{witness}\index{witness!to the truth of a proposition} to the provability of $A$, or \emph{evidence}\index{evidence, of the truth of a proposition} of the truth of $A$ (or even a \emph{proof}\index{proof} of $A$, but we will try to avoid this confusing terminology).
In this case, the judgment $a:A$ is derivable in type theory (for some $a$) precisely when the analogous judgment ``$A$ has a proof'' is derivable in first-order logic (modulo differences in the axioms assumed and in the encoding of mathematics, as we will discuss throughout the book).
 
On the other hand, if the type $A$ is being treated more like a set than like a proposition (although as we will see, the distinction can become blurry), then ``$a:A$'' may be regarded as analogous to the set-theoretic statement ``$a\in A$''.
However, there is an essential difference in that ``$a:A$'' is a \emph{judgment} whereas ``$a\in A$'' is a \emph{proposition}.
In particular, when working internally in type theory, we cannot make statements such as ``if $a:A$ then it is not the case that $b:B$'', nor can we ``disprove'' the judgment ``$a:A$''.

A good way to think about this is that in set theory, ``membership'' is a relation which may or may not hold between two pre-existing objects ``$a$'' and ``$A$'', while in type theory we cannot talk about an element ``$a$'' in isolation: every element \emph{by its very nature} is an element of some type, and that type is (generally speaking) uniquely determined.
Thus, when we say informally ``let $x$ be a natural number'', in set theory this is shorthand for ``let $x$ be a thing and assume that $x\in\nat$'', whereas in type theory ``let $x:\nat$'' is an atomic statement: we cannot introduce a variable without specifying its type.\index{membership}


At first glance, this may seem an uncomfortable restriction, but it is arguably closer to the intuitive mathematical meaning of ``let $x$ be a natural number''.
In practice, it seems that whenever we actually \emph{need} ``$a\in A$'' to be a proposition rather than a judgment, there is always an ambient set $B$ of which $a$ is known to be an element and $A$ is known to be a subset.
This situation is also easy to represent in type theory, by taking $a$ to be an element of the type $B$, and $A$ to be a predicate on $B$; see \cref{subsec:prop-subsets}.

A last difference between type theory and set theory is the treatment of equality.
The familiar notion of equality in mathematics is a proposition: e.g.\ we can disprove an equality or assume an equality as a hypothesis.
Since in type theory, propositions are types, this means that equality is a type: for elements $a,b:A$ (that is, both $a:A$ and $b:A$) we have a type ``$\id[A]ab$''.
(In \emph{homotopy} type theory, of course, this equality proposition can behave in unfamiliar ways: see \cref{sec:identity-types,cha:basics}, and the rest of the book).
When $\id[A]ab$ is inhabited, we say that $a$ and $b$ are \define{(propositionally) equal}.
\index{propositional!equality}%
\index{equality!propositional}%

However, in type theory there is also a need for an equality \emph{judgment}, existing at the same level as the judgment ``$x:A$''.\index{judgment}
\symlabel{defn:judgmental-equality}%
This is called \define{judgmental equality}
\indexdef{equality!judgmental}%
\indexdef{judgmental equality}%
or \define{definitional equality},
\indexdef{equality!definitional}%
\indexsee{definitional equality}{equality, definitional}%
and we write it as $a\jdeq b : A$ or simply $a \jdeq b$.
It is helpful to think of this as meaning ``equal by definition''.
For instance, if we define a function $f:\nat\to\nat$ by the equation $f(x)=x^2$, then the expression $f(3)$ is equal to $3^2$ \emph{by definition}.
Inside the theory, it does not make sense to negate or assume an equality-by-definition; we cannot say ``if $x$ is equal to $y$ by definition, then $z$ is not equal to $w$ by definition''.
Whether or not two expressions are equal by definition is just a matter of expanding out the definitions; in particular, it is algorithmically\index{algorithm} decidable (though the algorithm is necessarily meta-theoretic, not internal to the theory).\index{decidable!definitional equality}

As type theory becomes more complicated, judgmental equality can get more subtle than this, but it is a good intuition to start from.
Alternatively, if we regard a deductive system as an algebraic theory, then judgmental equality is simply the equality in that theory, analogous to the equality between elements of a group---the only potential for confusion is that there is \emph{also} an object \emph{inside} the deductive system of type theory (namely the type ``$a=b$'') which behaves internally as a notion of ``equality''.

The reason we \emph{want} a judgmental notion of equality is so that it can control the other form of judgment, ``$a:A$''.
For instance, suppose we have given a proof that $3^2=9$, i.e.\ we have derived the judgment $p:(3^2=9)$ for some $p$.
Then the same witness $p$ ought to count as a proof that $f(3)=9$, since $f(3)$ is $3^2$ \emph{by definition}.
The best way to represent this is with a rule saying that given the judgments $a:A$ and $A\jdeq B$, we may derive the judgment $a:B$.

Thus, for us, type theory will be a deductive system based on two forms of judgment:
\begin{center}
\medskip
\begin{tabular}{cl}
  \toprule
  Judgment & Meaning\\
  \midrule
  $a : A$       & ``$a$ is an object of type $A$''\\
  $a \jdeq b : A$ & ``$a$ and $b$ are definitionally equal objects of type $A$''\\
  \bottomrule
\end{tabular}
\medskip
\end{center}
%
\symlabel{defn:defeq}%
When introducing a definitional equality, i.e., defining one thing to be equal to another, we will use the symbol ``$\defeq$''.
Thus, the above definition of the function $f$ would be written as $f(x)\defeq x^2$.

Because judgments cannot be put together into more complicated statements, the symbols ``$:$'' and ``$\jdeq$'' bind more loosely than anything else.%
\footnote{In formalized\indexfoot{mathematics!formalized} type theory, commas and turnstiles can bind even more loosely.
  For instance, $x:A,y:B\vdash c:C$ is parsed as $((x:A),(y:B))\vdash (c:C)$.
  However, in this book we refrain from such notation until \cref{cha:rules}.}
Thus, for instance, ``$p:\id{x}{y}$'' should be parsed as ``$p:(\id{x}{y})$'', which makes sense since ``$\id{x}{y}$'' is a type, and not as ``$\id{(p:x)}{y}$'', which is senseless since ``$p:x$'' is a judgment and cannot be equal to anything.
Similarly, ``$A\jdeq \id{x}{y}$'' can only be parsed as ``$A\jdeq(\id{x}{y})$'', although in extreme cases such as this, one ought to add parentheses anyway to aid reading comprehension.
Moreover, later on we will fall into the common notation of chaining together equalities --- e.g.\ writing $a=b=c=d$ to mean ``$a=b$ and $b=c$ and $c=d$, hence $a=d$'' --- and we will also include judgmental equalities in such chains.
Context usually suffices to make the intent clear.

This is perhaps also an appropriate place to mention that the common mathematical notation ``$f:A\to B$'', expressing the fact that $f$ is a function from $A$ to $B$, can be regarded as a typing judgment, since we use ``$A\to B$'' as notation for the type of functions from $A$ to $B$ (as is standard practice in type theory; see \cref{sec:pi-types}).

\index{assumption|(defstyle}%
Judgments may depend on \emph{assumptions} of the form $x:A$, where $x$ is a variable
\indexdef{variable}%
and $A$ is a type.
For example, we may construct an object $m + n : \nat$ under the assumptions that $m,n : \nat$.
Another example is that assuming $A$ is a type, $x,y : A$, and $p : \id[A]{x}{y}$, we may construct an element $p^{-1} : \id[A]{y}{x}$.
The collection of all such assumptions is called the \define{context};%
\index{context}
from a topological point of view it may be thought of as a ``parameter\index{parameter!space} space''.
In fact, technically the context must be an ordered list of assumptions, since later assumptions may depend on previous ones: the assumption $x:A$ can only be made \emph{after} the assumptions of any variables appearing in the type $A$.

If the type $A$ in an assumption $x:A$ represents a proposition, then the assumption is a type-theoretic version of a \emph{hypothesis}:
\indexdef{hypothesis}%
we assume that the proposition $A$ holds.
When types are regarded as propositions, we may omit the names of their proofs.
Thus, in the second example above we may instead say that assuming $\id[A]{x}{y}$, we can prove $\id[A]{y}{x}$.
However, since we are doing ``proof-relevant'' mathematics,
\index{mathematics!proof-relevant}%
we will frequently refer back to proofs as objects.
In the example above, for instance, we may want to establish that $p^{-1}$ together with the proofs of transitivity and reflexivity behave like a groupoid; see \cref{cha:basics}.

Note that under this meaning of the word \emph{assumption}, we can assume a propositional equality (by assuming a variable $p:x=y$), but we cannot assume a judgmental equality $x\jdeq y$, since it is not a type that can have an element.
However, we can do something else which looks kind of like assuming a judgmental equality: if we have a type or an element which involves a variable $x:A$, then we can \emph{substitute} any particular element $a:A$ for $x$ to obtain a more specific type or element.
We will sometimes use language like ``now assume $x\jdeq a$'' to refer to this process of substitution, even though it is not an \emph{assumption} in the technical sense introduced above.
\index{assumption|)}%

By the same token, we cannot \emph{prove} a judgmental equality either, since it is not a type in which we can exhibit a witness.
Nevertheless, we will sometimes state judgmental equalities as part of a theorem, e.g.\ ``there exists $f:A\to B$ such that $f(x)\jdeq y$''.
This should be regarded as the making of two separate judgments: first we make the judgment $f:A\to B$ for some element $f$, then we make the additional judgment that $f(x)\jdeq y$.

In the rest of this chapter, we attempt to give an informal presentation of type theory, sufficient for the purposes of this book; we give a more formal account in \cref{cha:rules}.
Aside from some fairly obvious rules (such as the fact that judgmentally equal things can always be substituted\index{substitution} for each other), the rules of type theory can be grouped into \emph{type formers}.
Each type former consists of a way to construct types (possibly making use of previously constructed types), together with rules for the construction and behavior of elements of that type.
In most cases, these rules follow a fairly predictable pattern, but we will not attempt to make this precise here; see however the beginning of \cref{sec:finite-product-types} and also \cref{cha:induction}.\index{type theory!informal}


\index{axiom!versus rules}%
\index{rule!versus axioms}%
An important aspect of the type theory presented in this chapter is that it consists entirely of \emph{rules}, without any \emph{axioms}.
In the description of deductive systems in terms of judgments, the \emph{rules} are what allow us to conclude one judgment from a collection of others, while the \emph{axioms} are the judgments we are given at the outset.
If we think of a deductive system as a formal game, then the rules are the rules of the game, while the axioms are the starting position.
And if we think of a deductive system as an algebraic theory, then the rules are the operations of the theory, while the axioms are the \emph{generators} for some particular free model of that theory.

In set theory, the only rules are the rules of first-order logic (such as the rule allowing us to deduce ``$A\wedge B$ has a proof'' from ``$A$ has a proof'' and ``$B$ has a proof''): all the information about the behavior of sets is contained in the axioms.
By contrast, in type theory, it is usually the \emph{rules} which contain all the information, with no axioms being necessary.
For instance, in \cref{sec:finite-product-types} we will see that there is a rule allowing us to deduce the judgment ``$(a,b):A\times B$'' from ``$a:A$'' and ``$b:B$'', whereas in set theory the analogous statement would be (a consequence of) the pairing axiom.

The advantage of formulating type theory using only rules is that rules are ``procedural''.
In particular, this property is what makes possible (though it does not automatically ensure) the good computational properties of type theory, such as ``canonicity''.\index{canonicity}
However, while this style works for traditional type theories, we do not yet understand how to formulate everything we need for \emph{homotopy} type theory in this way.
In particular, in \cref{sec:compute-pi,sec:compute-universe,cha:hits} we will have to augment the rules of type theory presented in this chapter by introducing additional axioms, notably the \emph{univalence axiom}.
In this chapter, however, we confine ourselves to a traditional rule-based type theory.


\section{Function types}
\label{sec:function-types}

\index{type!function|(defstyle}%
\indexsee{function type}{type, function}%
Given types $A$ and $B$, we can construct the type $A \to B$ of \define{functions}
\index{function|(defstyle}%
\indexsee{map}{function}%
\indexsee{mapping}{function}%
with domain $A$ and codomain $B$.
We also sometimes refer to functions as \define{maps}.
\index{domain!of a function}%
\index{codomain, of a function}%
\index{function!domain of}%
\index{function!codomain of}%
\index{functional relation}%
Unlike in set theory, functions are not defined as
functional relations; rather they are a primitive concept in type theory.
We explain the function type by prescribing what we can do with functions, 
how to construct them and what equalities they induce.

Given a function $f : A \to B$ and an element of the domain $a : A$, we
can \define{apply}
\indexdef{application!of function}%
\indexdef{function!application}%
\indexsee{evaluation}{application, of a function}
the function to obtain an element of the codomain $B$,
denoted $f(a)$ and called the \define{value} of $f$ at $a$.
\indexdef{value!of a function}%
It is common in type theory to omit the parentheses\index{parentheses} and denote $f(a)$ simply by $f\,a$, and we will sometimes do this as well.

But how can we construct elements of $A \to B$? There are two equivalent ways:
either by direct definition or by using
$\lambda$-abstraction. Introducing a function by definition
\indexdef{definition!of function, direct}%
means that
we introduce a function by giving it a name --- let's say, $f$ --- and saying
we define $f : A \to B$ by giving an equation
\begin{equation}
  \label{eq:expldef}
  f(x) \defeq \Phi
\end{equation}
where $x$ is a variable
\index{variable}%
and $\Phi$ is an expression which may use $x$.
In order for this to be valid, we have to check that $\Phi : B$ assuming $x:A$.

Now we can compute $f(a)$ by replacing the variable $x$ in $\Phi$ with
$a$. As an example, consider the function $f : \nat \to \nat$ which is
defined by $f(x) \defeq x+x$.  (We will define $\nat$ and $+$ in \cref{sec:inductive-types}.)
Then $f(2)$ is judgmentally equal to $2+2$.

If we don't want to introduce a name for the function, we can use
\define{$\lambda$-abstraction}.
\index{lambda abstraction@$\lambda$-abstraction|defstyle}%
\indexsee{function!lambda abstraction@$\lambda$-abstraction}{$\lambda$-abstraction}%
\indexsee{abstraction!lambda-@$\lambda$-}{$\lambda$-abstraction}%
Given an expression $\Phi$ of type $B$ which may use $x:A$, as above, we write $\lam{x:A} \Phi$ to indicate the same function defined by~\eqref{eq:expldef}.
Thus, we have
\[ (\lamt{x:A}\Phi) : A \to B. \]
For the example in the previous paragraph, we have the typing judgment
\[ (\lam{x:\nat}x+x) : \nat \to \nat. \]
As another example, for any types $A$ and $B$ and any element $y:B$, we have a \define{constant function}
\indexdef{constant!function}%
\indexdef{function!constant}%
$(\lam{x:A} y): A\to B$.

We generally omit the type of the variable $x$ in a $\lambda$-abstraction and write $\lam{x}\Phi$, since the typing $x:A$ is inferable from the judgment that the function $\lam x \Phi$ has type $A\to B$.
By convention, the ``scope''
\indexdef{variable!scope of}%
\indexdef{scope}%
of the variable binding ``$\lam{x}$'' is the entire rest of the expression, unless delimited with parentheses\index{parentheses}.
Thus, for instance, $\lam{x} x+x$ should be parsed as $\lam{x} (x+x)$, not as $(\lam{x}x)+x$ (which would, in this case, be ill-typed anyway).

Another equivalent notation is
\symlabel{mapsto}%
\[ (x \mapsto \Phi) : A \to B. \]
\symlabel{blank}%
We may also sometimes use a blank ``$\blank$'' in the expression $\Phi$ in place of a variable, to denote an implicit $\lambda$-abstraction.
For instance, $g(x,\blank)$ is another way to write $\lam{y} g(x,y)$.

Now a $\lambda$-abstraction is a function, so we can apply it to an argument $a:A$.
We then have the following \define{computation rule}\indexdef{computation rule!for function types}\footnote{Use of this equality is often referred to as \define{$\beta$-conversion}
\indexsee{beta-conversion@$\beta $-conversion}{$\beta$-reduction}%
\indexsee{conversion!beta@$\beta $-}{$\beta$-reduction}%
or \define{$\beta$-reduction}.%
\index{beta-reduction@$\beta $-reduction|footstyle}%
\indexsee{reduction!beta@$\beta $-}{$\beta$-reduction}%
}, which is a definitional equality:
\[(\lamu{x:A}\Phi)(a) \jdeq \Phi'\]
where $\Phi'$ is the
expression $\Phi$ in which all occurrences of $x$ have been replaced by $a$.
Continuing the above example, we have
%
\[ (\lamu{x:\nat}x+x)(2) \jdeq 2+2. \]
%
Note that from any function $f:A\to B$, we can construct a lambda abstraction function $\lam{x} f(x)$.
Since this is by definition ``the function that applies $f$ to its argument'' we consider it to be definitionally equal to $f$:\footnote{Use of this equality is often referred to as \define{$\eta$-conversion}
\indexsee{eta-conversion@$\eta $-conversion}{$\eta$-expansion}%
\indexsee{conversion!eta@$\eta $-}{$\eta$-expansion}%
or \define{$\eta$-expansion.
\index{eta-expansion@$\eta $-expansion|footstyle}%
\indexsee{expansion, eta-@expansion, $\eta $-}{$\eta$-expansion}%
}}
\[ f \jdeq (\lam{x} f(x)). \]
This equality is the \define{uniqueness principle for function types}\indexdef{uniqueness!principle!for function types}, because it shows that $f$ is uniquely determined by its values.

The introduction of functions by definitions with explicit parameters can be reduced
to simple definitions by using $\lambda$-abstraction: i.e., we can read 
a definition of $f: A\to B$ by
\[ f(x) \defeq \Phi \]
as 
\[ f \defeq \lamu{x:A}\Phi.\]

When doing calculations involving variables, we have to be 
careful when replacing a variable with an expression that also involves
variables, because we want to preserve the binding structure of
expressions. By the \emph{binding structure}\indexdef{binding structure} we mean the
invisible link generated by binders such as $\lambda$, $\Pi$ and
$\Sigma$ (the latter we are going to meet soon) between the place where the variable is introduced and where it is used. As an example, consider $f : \nat \to (\nat \to \nat)$
defined as 
\[ f(x) \defeq \lamu{y:\nat} x + y. \] 
Now if we have assumed somewhere that $y : \nat$, then what is $f(y)$? It would be wrong to just naively replace $x$ by $y$ everywhere in the expression ``$\lam{y}x+y$'' defining $f(x)$, obtaining $\lamu{y:\nat} y + y$, because this means that $y$ gets \define{captured}.
\indexdef{capture, of a variable}%
\indexdef{variable!captured}%
Previously, the substituted\index{substitution} $y$ was referring to our assumption, but now it is referring to the argument of the $\lambda$-abstraction. Hence, this naive substitution would destroy the binding structure, allowing us to perform calculations which are semantically unsound.

But what \emph{is} $f(y)$ in this example? Note that bound (or ``dummy'')
variables
\indexdef{variable!bound}%
\indexdef{variable!dummy}%
\indexsee{bound variable}{variable, bound}%
\indexsee{dummy variable}{variable, bound}%
such as $y$ in the expression $\lamu{y:\nat} x + y$
have only a local meaning, and can be consistently replaced by any
other variable, preserving the binding structure. Indeed, $\lamu{y:\nat} x + y$ is declared to be judgmentally equal\footnote{Use of this equality is often referred to as \define{$\alpha$-conversion.
\indexfoot{alpha-conversion@$\alpha $-conversion}
\indexsee{conversion!alpha@$\alpha$-}{$\alpha$-conversion}
}} to
$\lamu{z:\nat} x + z$.  It follows that 
$f(y)$ is judgmentally equal to  $\lamu{z:\nat} y + z$, and that answers our question.  (Instead of $z$,
any variable distinct from $y$ could have been used, yielding an equal result.)

Of course, this should all be familiar to any mathematician: it is the same phenomenon as the fact that if $f(x) \defeq \int_1^2 \frac{dt}{x-t}$, then $f(t)$ is not $\int_1^2 \frac{dt}{t-t}$ but rather $\int_1^2 \frac{ds}{t-s}$.
A $\lambda$-abstraction binds a dummy variable in exactly the same way that an integral does.

We have seen how to define functions in one variable. One
way to define functions in several variables would be to use the
cartesian product, which will be introduced later; a function with
parameters $A$ and $B$ and results in $C$ would be given the type 
$f : A \times B \to C$. However, there is another choice that avoids
using product types, which is called \define{currying}
\indexdef{currying}%
\indexdef{function!currying of}%
(after the mathematician Haskell Curry).
\index{programming}%

The idea of currying is to represent a function of two inputs $a:A$ and $b:B$ as a function which takes \emph{one} input $a:A$ and returns \emph{another function}, which then takes a second input $b:B$ and returns the result.
That is, we consider two-variable functions to belong to an iterated function type, $f : A \to (B \to C)$.
We may also write this without the parentheses\index{parentheses}, as $f : A \to B \to C$, with
associativity\index{associativity!of function types} to the right as the default convention.  Then given $a : A$ and $b : B$,
we can apply $f$ to $a$ and then apply the result to $b$, obtaining
$f(a)(b) : C$. To avoid the proliferation of parentheses, we allow ourselves to
write $f(a)(b)$ as $f(a,b)$ even though there are no products
involved.
When omitting parentheses around function arguments entirely, we write $f\,a\,b$ for $(f\,a)\,b$, with the default associativity now being to the left so that $f$ is applied to its arguments in the correct order.

Our notation for definitions with explicit parameters extends to
this situation: we can define a named function $f : A \to B \to C$ by
giving an equation
\[ f(x,y) \defeq \Phi\]
where $\Phi:C$ assuming $x:A$ and $y:B$. Using $\lambda$-abstraction\index{lambda abstraction@$\lambda$-abstraction} this
corresponds to
\[ f \defeq \lamu{x:A}{y:B} \Phi, \]
which may also be written as 
\[ f \defeq x \mapsto y \mapsto \Phi. \] 
We can also implicitly abstract over multiple variables by writing multiple blanks, e.g.\ $g(\blank,\blank)$ means $\lam{x}{y} g(x,y)$.
Currying a function of three or more arguments is a straightforward extension of what we have just described.
 
\index{type!function|)}%
\index{function|)}%


\section{Universes and families}
\label{sec:universes}

So far, we have been using the expression ``$A$ is a type'' informally. We
are going to make this more precise by introducing \define{universes}.
\index{type!universe|(defstyle}%
\indexsee{universe}{type, universe}%
A universe is a type whose elements are types. As in naive set theory,
we might wish for a universe of all types $\UU_\infty$ including itself
(that is, with $\UU_\infty : \UU_\infty$).
However, as in set
theory, this is unsound, i.e.\ we can deduce from it that every type,
including the empty type representing the proposition False (see \cref{sec:coproduct-types}), is inhabited.
For instance, using a
representation of sets as trees, we can directly encode Russell's
paradox\index{paradox} \cite{coquand:paradox}.
%  or alternatively, in order to avoid the use of
% inductive types to define trees, we can follow Girard \cite{girard:paradox} and encode the Burali-Forti paradox,
% which shows that the collection of all ordinals cannot be an ordinal.

To avoid the paradox we introduce a hierarchy of universes
\indexsee{hierarchy!of universes}{type, universe}%
\[ \UU_0 : \UU_1 : \UU_2 : \cdots \]
where every universe $\UU_i$ is an element of the next universe
$\UU_{i+1}$. Moreover, we assume that our universes are
\define{cumulative},
\indexdef{type!universe!cumulative}%
\indexdef{cumulative!universes}%
that is that all the elements of the $i^{\mathrm{th}}$
universe are also elements of the $(i+1)^{\mathrm{st}}$ universe, i.e.\ if
$A:\UU_i$ then also $A:\UU_{i+1}$.
This is convenient, but has the slightly unpleasant consequence that elements no longer have unique types, and is a bit tricky in other ways that need not concern us here; see the Notes.

When we say that $A$ is a type, we mean that it inhabits some universe
$\UU_i$. We usually want to avoid mentioning the level
\indexdef{universe level}%
\indexsee{level}{universe level or $n$-type}%
\indexsee{type!universe!level}{universe level}%
$i$ explicitly,
and just assume that levels can be assigned in a consistent way; thus we
may write $A:\UU$ omitting the level. This way we can even write
$\UU:\UU$, which can be read as $\UU_i:\UU_{i+1}$, having left the
indices implicit.  Writing universes in this style is referred to as
\define{typical ambiguity}.
\indexdef{typical ambiguity}%
It is convenient but a bit dangerous, since it allows us to write valid-looking proofs that reproduce the paradoxes of self-reference.
If there is any doubt about whether an argument is correct, the way to check it is to try to assign levels consistently to all universes appearing in it.
When some universe \UU is assumed, we may refer to types belonging to \UU as \define{small types}.
\indexdef{small!type}%
\indexdef{type!small}%

To model a collection of types varying over a given type $A$, we use functions $B : A \to \UU$  whose
codomain is a universe. These functions are called
\define{families of types} (or sometimes \emph{dependent types});
\indexsee{family!of types}{type, family of}%
\indexdef{type!family of}%
\indexsee{type!dependent}{type, family of}%
\indexsee{dependent!type}{type, family of}%
they correspond to families of sets as used in
set theory.

\symlabel{fin}%
An example of a type family is the family of finite sets $\Fin
: \nat \to \UU$, where $\Fin(n)$ is a type with exactly $n$ elements.
(We cannot \emph{define} the family $\Fin$ yet --- indeed, we have not even introduced its domain $\nat$ yet --- but we will be able to soon; see \cref{ex:fin}.)
We may denote the elements of $\Fin(n)$ by $0_n,1_n,\dots,(n-1)_n$, with subscripts to emphasize that the elements of $\Fin(n)$ are different from those of $\Fin(m)$ if $n$ is different from $m$, and all are different from the ordinary natural numbers (which we will introduce in \cref{sec:inductive-types}).
\index{finite!sets, family of}%

A more trivial (but very important) example of a type family is the \define{constant} type family
\indexdef{constant!type family}%
\indexdef{type!family of!constant}%
at a type $B:\UU$, which is of course the constant function $(\lam{x:A} B):A\to\UU$.

As a \emph{non}-example, in our version of type theory there is no type family ``$\lam{i:\nat} \UU_i$''.
Indeed, there is no universe large enough to be its codomain.
Moreover, we do not even identify the indices $i$ of the universes $\UU_i$ with the natural numbers \nat of type theory (the latter to be introduced in \cref{sec:inductive-types}).

\index{type!universe|)}%

\section{Dependent function types (\texorpdfstring{$\Pi$}{Π}-types)}
\label{sec:pi-types}

\index{type!dependent function|(defstyle}%
\index{function!dependent|(defstyle}%
\indexsee{dependent!function}{function, dependent}%
\indexsee{type!Pi-@$\Pi$-}{type, dependent function}%
\indexsee{Pi-type@$\Pi$-type}{type, dependent function}%
In type theory we often use a more general version of function
types, called a \define{$\Pi$-type} or \define{dependent function type}. The elements of
a $\Pi$-type are functions
whose codomain type can vary depending on the
element of the domain to which the function is applied, called \define{dependent functions}. The name ``$\Pi$-type''
is used because this type can also be regarded as the  cartesian
product over a given type.

Given a type $A:\UU$ and a family $B:A \to \UU$, we may construct
the type of dependent functions $\prd{x:A}B(x) : \UU$.
There are many alternative notations for this type, such as
\[ \tprd{x:A} B(x) \qquad \dprd{x:A}B(x) \qquad \lprd{x:A} B(x). \]
If $B$ is a constant family, then the dependent product type is the ordinary function type:
\[\tprd{x:A} B \jdeq (A \to B).\]
Indeed, all the constructions of $\Pi$-types are generalizations of the corresponding constructions on ordinary function types.

\indexdef{definition!of function, direct}%
We can introduce dependent functions by explicit definitions: to
define $f : \prd{x:A}B(x)$, where $f$ is the name of a dependent function to be
defined, we need an expression $\Phi : B(x)$ possibly involving the variable $x:A$,
\index{variable}%
and we write
\[ f(x) \defeq \Phi \qquad \mbox{for $x:A$}.\]
Alternatively, we can use \define{$\lambda$-abstraction}%
\index{lambda abstraction@$\lambda$-abstraction|defstyle}%
\begin{equation}
  \label{eq:lambda-abstraction}
  \lamu{x:A} \Phi \ :\ \prd{x:A} B(x).
\end{equation}
\indexdef{application!of dependent function}%
\indexdef{function!dependent!application}%
As with non-dependent functions, we can \define{apply} a dependent function $f : \prd{x:A}B(x)$ to an argument $a:A$ to obtain an element $f(a):B(a)$.
The equalities are the same as for the ordinary function type, i.e.\ we have the computation rule
\index{computation rule!for dependent function types}%
given $a:A$ we have $f(a) \jdeq \Phi'$ and  
$(\lamu{x:A} \Phi)(a) \jdeq \Phi'$, where $\Phi' $ is obtained by replacing all
occurrences of $x$ in $\Phi$ by $a$ (avoiding variable capture, as always).
Similarly, we have the uniqueness principle $f\jdeq (\lam{x} f(x))$ for any $f:\prd{x:A} B(x)$.
\index{uniqueness!principle!for dependent function types}%

As an example, recall from \cref{sec:universes} that there is a type family $\Fin:\nat\to\UU$ whose values are the standard finite sets, with elements $0_n,1_n,\dots,(n-1)_n : \Fin(n)$.
There is then a dependent function $\fmax : \prd{n:\nat} \Fin(n+1)$
which returns the ``largest'' element of each nonempty finite type, $\fmax(n) \defeq n_{n+1}$.
\index{finite!sets, family of}%
As was the case for $\Fin$ itself, we cannot define $\fmax$ yet, but we will be able to soon; see \cref{ex:fin}.

Another important class of dependent function types, which we can define now, are functions which are \define{polymorphic}
\indexdef{function!polymorphic}%
\indexdef{polymorphic function}%
over a given universe.
A polymorphic function is one which takes a type as one of its arguments, and then acts on elements of that type (or other types constructed from it).
\symlabel{idfunc}%
\indexdef{function!identity}%
\indexdef{identity!function}%
An example is the polymorphic identity function $\idfunc : \prd{A:\UU} A \to A$, which we define by $\idfunc{} \defeq \lam{A:\type}{x:A} x$.

We sometimes write some arguments of a dependent function as subscripts.
For instance, we might equivalently define the polymorphic identity function by $\idfunc[A](x) \defeq x$.
Moreover, if an argument can be inferred from context, we may omit it altogether.
For instance, if $a:A$, then writing $\idfunc(a)$ is unambiguous, since $\idfunc$ must mean $\idfunc[A]$ in order for it to be applicable to $a$.

Another, less trivial, example of a polymorphic function is the ``swap'' operation that switches the order of the arguments of a (curried) two-argument function:
\[ \mathsf{swap} : \prd{A:\UU}{B:\UU}{C:\UU} (A\to B\to C) \to (B\to A \to C)
\]
We can define this by
\[ \mathsf{swap}(A,B,C,g) \defeq \lam{b}{a} g(a)(b). \]
We might also equivalently write the type arguments as subscripts:
\[ \mathsf{swap}_{A,B,C}(g)(b,a) \defeq g(a,b). \]

Note that as we did for ordinary functions, we use currying to define dependent functions with
several arguments (such as $\mathsf{swap}$). However, in the dependent case the second domain may
depend on the first one, and the codomain may depend on both. That is,
given $A:\UU$ and type families $B : A \to \UU$ and $C : \prd{x:A}B(x) \to \UU$, we may construct
the type $\prd{x:A}{y : B(x)} C(x,y)$ of functions with two
arguments.
(Like $\lambda$-abstractions, $\Pi$s automatically scope\index{scope} over the rest of the expression unless delimited; thus $C : \prd{x:A}B(x) \to \UU$ means $C : \prd{x:A}(B(x) \to \UU)$.)
In the case when $B$ is constant and equal to $A$, we may condense the notation and write $\prd{x,y:A}$; for instance, the type of $\mathsf{swap}$ could also be written as
\[ \mathsf{swap} : \prd{A,B,C:\UU} (A\to B\to C) \to (B\to A \to C).
\]
Finally, given $f:\prd{x:A}{y : B(x)} C(x,y)$ and arguments $a:A$ and $b:B(a)$, we have $f(a)(b) : C(a,b)$, which,
as before, we write as $f(a,b) : C(a,b)$.

\index{type!dependent function|)}%
\index{function!dependent|)}%


\section{Product types}
\label{sec:finite-product-types}

Given types $A,B:\UU$ we introduce the type $A\times B:\UU$, which we call their \define{cartesian product}.
\indexsee{cartesian product}{type, product}%
\indexsee{type!cartesian product}{type, product}%
\index{type!product|(defstyle}%
\indexsee{product!of types}{type, product}%
We also introduce a nullary product type, called the \define{unit type} $\unit : \UU$.
\indexsee{nullary!product}{type, unit}%
\indexsee{unit!type}{type, unit}%
\index{type!unit|(defstyle}%
We intend the elements of $A\times B$ to be pairs $\tup{a}{b} : A \times B$, where $a:A$ and $b:B$, and the only element of $\unit$ to be some particular object $\ttt : \unit$.
\indexdef{pair!ordered}%
However, unlike in set theory, where we define ordered pairs to be particular sets and then collect them all together into the cartesian product, in type theory, ordered pairs are a primitive concept, as are functions.   

\begin{rmk}\label{rmk:introducing-new-concepts}
  There is a general pattern for introduction of a new kind of type in type theory, and because products are our second example following this pattern,\footnote{The description of universes above is an exception.} it is worth emphasizing the general form:
  To specify a type, we specify:
  \begin{enumerate}
  \item how to form new types of this kind, via \define{formation rules}.
    \indexdef{formation rule}%
    \index{rule!formation}%  
(For example, we can form the function type $A \to B$ when $A$ is a type and when $B$ is a type. We can form the dependent function type $\prd{x:A} B(x)$ when $A$ is a type and $B(x)$ is a type for $x:A$.)

  \item how to construct elements of that type.  
    These are called the type's \define{constructors} or \define{introduction rules}.
    \indexdef{constructor}%
    \indexdef{rule!introduction}%
    \indexdef{introduction rule}%
    (For example, a function type has one constructor, $\lambda$-abstraction.
    Recall that a direct definition like $f(x)\defeq 2x$ can equivalently be phrased
    as a $\lambda$-abstraction $f\defeq \lam{x} 2x$.)

  \item how to use elements of that type.  
    These are called the type's \define{eliminators} or \define{elimination rules}.
    \indexsee{rule!elimination}{eliminator}%
    \indexsee{elimination rule}{eliminator}%
    \indexdef{eliminator}%
    (For example, the function type has one eliminator, namely function application.)

  \item 
    a \define{computation rule}\indexdef{computation rule}\footnote{also referred to as \define{$\beta$-reduction}\index{beta-reduction@$\beta $-reduction|footstyle}}, which expresses how an eliminator acts on a constructor.
(For example, for functions, the computation rule states that $(\lamu{x:A}\Phi)(a)$ is judgmentally equal to the substitution of $a$ for $x$ in $\Phi$.)

  \item 
    an optional \define{uniqueness principle}\indexdef{uniqueness!principle}\footnote{also referred to as \define{$\eta$-expansion}\index{eta-expansion@$\eta $-expansion|footstyle}}, which expresses
uniqueness of maps into or out of that type.  
For some types, the uniqueness principle characterizes maps into the type, by stating that 
every element of the type is uniquely determined by the results of applying eliminators to it, and can be reconstructed from those results by applying a constructor---thus expressing how constructors act on eliminators, dually to the computation rule.  
(For example, for functions, the uniqueness principle says that any function $f$ is judgmentally equal to the ``expanded'' function $\lamu{x} f(x)$, and thus is uniquely determined by its values.)
For other types, the uniqueness principle says that every map (function) \emph{from} that type is uniquely determined by some data. (An example is the coproduct type introduced in \cref{sec:coproduct-types}, whose uniqueness principle is mentioned in \cref{sec:universal-properties}.)  
    
    When the uniqueness principle is not taken as a rule of judgmental equality, it is often nevertheless provable as a \emph{propositional} equality from the other rules for the type.
    In this case we call it a \define{propositional uniqueness principle}.
    \indexdef{uniqueness!principle, propositional}%
    \indexsee{propositional!uniqueness principle}{uniqueness principle, propositional}%
    (In later chapters we will also occasionally encounter \emph{propositional computation rules}.)
    \indexdef{computation rule!propositional}%
  \end{enumerate}
The inference rules in \cref{sec:syntax-more-formally} are organized and named accordingly; see, for example, \cref{sec:more-formal-pi}, where each possibility is realized.
\end{rmk}

The way to construct pairs is obvious: given $a:A$ and $b:B$, we may form $(a,b):A\times B$.
Similarly, there is a unique way to construct elements of $\unit$, namely we have $\ttt:\unit$.
We expect that ``every element of $A\times B$ is a pair'', which is the uniqueness principle for products; we do not assert this as a rule of type theory, but we will prove it later on as a propositional equality.

Now, how can we \emph{use} pairs, i.e.\ how can we define functions out of a product type?
Let us first consider the definition of a non-dependent function $f : A\times B \to C$.
Since we intend the only elements of $A\times B$ to be pairs, we expect to be able to define such a function by prescribing the result
when $f$ is applied to a pair $\tup{a}{b}$.
We can prescribe these results by providing a function $g : A \to B \to C$.
Thus, we introduce a new rule (the elimination rule for products), which says that for any such $g$, we can define a function $f : A\times B \to C$ by
\[ f(\tup{a}{b}) \defeq g(a)(b). \]
We avoid writing $g(a,b)$ here, in order to emphasize that $g$ is not a function on a product.
(However, later on in the book we will often write $g(a,b)$ both for functions on a product and for curried functions of two variables.)
This defining equation is the computation rule for product types\index{computation rule!for product types}.

Note that in set theory, we would justify the above definition of $f$ by the fact that every element of $A\times B$ is a pair, so that it suffices to define $f$ on pairs.
By contrast, type theory reverses the situation: we assume that a function on $A\times B$ is well-defined as soon as we specify its values on tuples, and from this (or more precisely, from its more general version for dependent functions, below) we will be able to \emph{prove} that every element of $A\times B$ is a pair.
From a category-theoretic perspective, we can say that we define the product $A\times B$ to be left adjoint to the ``exponential'' $B\to C$, which we have already introduced.

As an example, we can derive the \define{projection}
\indexsee{function!projection}{projection}%
\indexsee{component, of a pair}{projection}%
\indexdef{projection!from cartesian product type}%
functions
\symlabel{defn:proj}%
\begin{align*}
  \fst & :  A \times B \to A \\
  \snd & :  A \times B \to B
\end{align*}
with the defining equations 
\begin{align*}
  \fst(\tup{a}{b}) & \defeq  a \\
  \snd(\tup{a}{b}) & \defeq  b.
\end{align*}
%
\symlabel{defn:recursor-times}%
Rather than invoking this principle of function definition every time we want to define a function, an alternative approach is to invoke it once, in a universal case, and then simply apply the resulting function in all other cases.
That is, we may define a function of type
\begin{equation}
  \rec{A\times B} : \prd{C:\UU}(A \to B \to C) \to A \times B \to C
\end{equation}
with the defining equation
\[\rec{A\times B}(C,g,\tup{a}{b}) \defeq g(a)(b). \]
Then instead of defining functions such as $\fst$ and $\snd$ directly by a defining equation, we could  define
\begin{align*}
  \fst &\defeq \rec{A\times B}(A, \lam{a}{b} a)\\
  \snd &\defeq \rec{A\times B}(B, \lam{a}{b} b).
\end{align*}
We refer to the function $\rec{A\times B}$ as the \define{recursor}
\indexsee{recursor}{recursion principle}%
for product types.  The name ``recursor'' is a bit unfortunate here, since no recursion is taking place.  It comes from the fact that product types are a degenerate example of a general framework for inductive types, and for types such as the natural numbers, the recursor will actually be recursive.  We may also speak of the \define{recursion principle} for cartesian products, meaning the fact that we can define a function $f:A\times B\to C$ as above by giving its value on pairs.
\index{recursion principle!for cartesian product}%

We leave it as a simple exercise to show that the recursor can be
derived from the projections and vice versa.
% Ex: Derive from projections

\symlabel{defn:recursor-unit}%
We also have a recursor for the unit type:
\[\rec{\unit} : \prd{C:\UU}C \to \unit \to C\]
with the defining equation
\[ \rec{\unit}(C,c,\ttt) \defeq c. \]
Although we include it to maintain the pattern of type definitions, the recursor for $\unit$ is completely useless,
because we could have defined such a function directly
by simply ignoring the argument of type $\unit$.

To be able to define \emph{dependent} functions over the product type, we have
to generalize the recursor. Given $C: A \times B \to \UU$, we may
define a function $f : \prd{x : A \times B} C(x)$ by providing a
function
\narrowequation{
 g : \prd{x:A}\prd{y:B} C(\tup{x}{y})
}
with defining equation
\[ f(\tup x y) \defeq g(x)(y). \] 
For example, in this way we can prove the propositional uniqueness principle, which says that every element of $A\times B$ is equal to a pair.
\index{uniqueness!principle, propositional!for product types}%
Specifically, we can construct a function
\[ \uppt : \prd{x:A \times B} (\id[A\times B]{\tup{\fst {(x)}}{\snd {(x)}}}{x}). \]
Here we are using the identity type, which we are going to introduce below in \cref{sec:identity-types}.
However, all we need to know now is that there is a reflexivity element $\refl{x} : \id[A]{x}{x}$ for any $x:A$.
Given this, we can define
\[ \uppt(\tup{a}{b}) \defeq \refl{\tup{a}{b}}. \]
This construction works, because in the case that $x \defeq \tup{a}{b}$ we can 
calculate 
\[ \tup{\fst(\tup{a}{b})}{\snd{(\tup{a}{b})}} \jdeq \tup{a}{b} \]
using the defining equations for the projections. Therefore,
\[ \refl{\tup{a}{b}} : \id{\tup{\fst(\tup{a}{b})}{\snd{(\tup{a}{b})}}}{\tup{a}{b}} \]
is well-typed, since both sides of the equality are judgmentally equal.

More generally, the ability to define dependent functions in this way means that to prove a property for all elements of a product, it is enough 
to prove it for its canonical elements, the tuples.
When we come to inductive types such as the natural numbers, the analogous property will be the ability to write proofs by induction.
Thus, if we do as we did above and apply this principle once in the universal case, we call the resulting function \define{induction} for product types: given $A,B : \UU$ we have
\symlabel{defn:induction-times}%
\[ \ind{A\times B} : \prd{C:A \times B \to \UU}
\Parens{\prd{x:A}{y:B} C(\tup{x}{y})} \to \prd{x:A \times B} C(x) \]
with the defining equation 
\[ \ind{A\times B}(C,g,\tup{a}{b}) \defeq g(a)(b). \]
Similarly, we may speak of a dependent function defined on pairs being obtained from the \define{induction principle}
\index{induction principle}%
\index{induction principle!for product}%
of the cartesian product.
It is easy to see that the recursor is just the special case of induction
in the case that the family $C$ is constant.
Because induction describes how to use an element of the product type, induction is also called the \define{(dependent) eliminator},
\indexsee{eliminator!of inductive type!dependent}{induction principle}%
and recursion the \define{non-dependent eliminator}.
\indexsee{eliminator!of inductive type!non-dependent}{recursion principle}%
\indexsee{non-dependent eliminator}{recursion principle}%
\indexsee{dependent eliminator}{induction principle}%

% We can read induction propositionally as saying that a property which
% is true for all pairs holds for all elements of the product type.

Induction for the unit type turns out to be more useful than the
recursor: 
\symlabel{defn:induction-unit}%
\[ \ind{\unit} : \prd{C:\unit \to \UU} C(\ttt) \to \prd{x:\unit}C(x)\]
with the defining equation
\[ \ind{\unit}(C,c,\ttt) \defeq c. \]
Induction enables us to prove the propositional uniqueness principle for $\unit$, which asserts that its only inhabitant is $\ttt$.
That is, we can construct
\[\un : \prd{x:\unit} \id{x}{\ttt} \]
by using the defining equations
\[\un(\ttt) \defeq \refl{\ttt} \]
or equivalently by using induction:
\[\un \defeq \ind{\unit}(\lamu{x:\unit} \id{x}{\ttt},\refl{\ttt}). \]

\index{type!product|)}%
\index{type!unit|)}%

\section{Dependent pair types (\texorpdfstring{$\Sigma$}{Σ}-types)}
\label{sec:sigma-types}

\index{type!dependent pair|(defstyle}%
\indexsee{type!dependent sum}{type, dependent pair}%
\indexsee{type!Sigma-@$\Sigma$-}{type, dependent pair}%
\indexsee{Sigma-type@$\Sigma$-type}{type, dependent pair}%
\indexsee{sum!dependent}{type, dependent pair}%

Just as we generalized function types (\cref{sec:function-types}) to dependent function types (\cref{sec:pi-types}), it is often useful to generalize the product types from \cref{sec:finite-product-types} to allow the type of
the second component of a pair to vary depending on the choice of the first
component. This is called a \define{dependent pair type}, or \define{$\Sigma$-type}, because in set theory it
corresponds to an indexed sum (in the sense of coproduct or
disjoint union) over a given type.

Given a type $A:\UU$ and a family $B : A \to \UU$, the dependent
pair type is written as $\sm{x:A} B(x) : \UU$.
Alternative notations are 
\[ \tsm{x:A} B(x) \hspace{2cm} \dsm{x:A}B(x) \hspace{2cm} \lsm{x:A} B(x). \]
Like other binding constructs such as $\lambda$-abstractions and $\Pi$s, $\Sigma$s automatically scope\index{scope} over the rest of the expression unless delimited, so e.g.\ $\sm{x:A} B(x) \times C(x)$ means $\sm{x:A} (B(x) \times C(x))$.

\symlabel{defn:dependent-pair}%
\indexdef{pair!dependent}%
The way to construct elements of a dependent pair type is by pairing: we have
$\tup{a}{b} : \sm{x:A} B(x)$ given $a:A$ and $b:B(a)$.
If $B$ is constant, then the dependent pair type is the
ordinary cartesian product type:
\[ \Parens{\sm{x:A} B} \jdeq (A \times B).\]
All the constructions on $\Sigma$-types arise as straightforward generalizations of the ones for product types, with dependent functions often replacing non-dependent ones.

For instance, the recursion principle%
\index{recursion principle!for dependent pair type}
says that to define a non-dependent function out of a $\Sigma$-type
$f : (\sm{x:A} B(x)) \to C$, we provide a function 
$g : \prd{x:A} B(x) \to C$, and then we can define $f$ via the defining
equation
\[ f(\tup{a}{b}) \defeq g(a)(b). \]
\indexdef{projection!from dependent pair type}%
For instance, we can derive the first projection from a $\Sigma$-type:
\symlabel{defn:dependent-proj1}%
\begin{equation*}
  \fst : \Parens{\sm{x : A}B(x)} \to A.
\end{equation*}
by the defining equation
\begin{equation*}
  \fst(\tup{a}{b}) \defeq a.
\end{equation*}
However, since the type of the second component of a pair
\narrowequation{
  (a,b):\sm{x:A} B(x)
}
is $B(a)$, the second projection must be a \emph{dependent} function, whose type involves the first projection function:
\symlabel{defn:dependent-proj2}%
\[ \snd : \prd{p:\sm{x : A}B(x)}B(\fst(p)). \]
Thus we need the \emph{induction} principle%
\index{induction principle!for dependent pair type}
for $\Sigma$-types (the ``dependent eliminator'').
This says that to construct a dependent function out of a $\Sigma$-type into a family $C : (\sm{x:A} B(x)) \to \UU$, we need a function
\[ g : \prd{a:A}{b:B(a)} C(\tup{a}{b}). \]
We can then derive a function 
\[ f : \prd{p : \sm{x:A}B(x)} C(p) \]
with  defining equation\index{computation rule!for dependent pair type}
\[ f(\tup{a}{b}) \defeq g(a)(b).\]
Applying this with $C(p)\defeq B(\fst(p))$, we can define
\narrowequation{
\snd : \prd{p:\sm{x : A}B(x)}B(\fst(p))
}
with the obvious equation
\[ \snd(\tup{a}{b})  \defeq  b. \]
To convince ourselves that this is correct, we note that $B (\fst(\tup{a}{b})) \jdeq B(a)$, using the defining equation for $\fst$, and
indeed $b : B(a)$.

We can package the recursion and induction principles into the recursor for $\Sigma$:
\symlabel{defn:recursor-sm}%
\[ \rec{\sm{x:A}B(x)} : \dprd{C:\UU}\Parens{\tprd{x:A} B(x) \to C} \to
\Parens{\tsm{x:A}B(x)} \to C \]
with the defining equation
\[ \rec{\sm{x:A}B(x)}(C,g,\tup{a}{b}) \defeq g(a)(b) \]
and the corresponding induction operator:
\symlabel{defn:induction-sm}%
\begin{narrowmultline*}
  \ind{\sm{x:A}B(x)} : \narrowbreak
    \dprd{C:(\sm{x:A} B(x)) \to \UU}
    \Parens{\tprd{a:A}{b:B(a)} C(\tup{a}{b})}
    \to \dprd{p : \sm{x:A}B(x)} C(p)
\end{narrowmultline*}
with the defining equation 
\[ \ind{\sm{x:A}B(x)}(C,g,\tup{a}{b}) \defeq g(a)(b). \]
As before, the recursor is the special case of induction
when the family $C$ is constant.

As a further example, consider the following principle, where $A$ and $B$ are types and $R:A\to B\to \UU$.
\[ \ac : \Parens{\tprd{x:A} \tsm{y :B} R(x,y)} \to
\Parens{\tsm{f:A\to B} \tprd{x:A} R(x,f(x))}
\]
We may regard $R$ as a ``proof-relevant relation''
\index{mathematics!proof-relevant}%
between $A$ and $B$, with $R(a,b)$ the type of witnesses for relatedness of $a:A$ and $b:B$.
Then $\ac$ says intuitively that if we have a dependent function $g$ assigning to every $a:A$ a dependent pair $(b,r)$ where $b:B$ and $r:R(a,b)$, then we have a function $f:A\to B$ and a dependent function assigning to every $a:A$ a witness that $R(a,f(a))$.
Our intuition tells us that we can just split up the values of $g$ into their components.
Indeed, using the projections we have just defined, we can define:
\[ \ac(g) \defeq \Parens{\lamu{x:A} \fst(g(x)),\, \lamu{x:A} \snd(g(x))}. \]
To verify that this is well-typed, note that if $g:\prd{x:A} \sm{y :B} R(x,y)$, we have
\begin{align*}
\lamu{x:A} \fst(g(x)) &: A \to  B, \\
\lamu{x:A} \snd(g(x)) &: \tprd{x:A} R(x,\fst(g(x))).
\end{align*}
Moreover, the type $\prd{x:A} R(x,\fst(g(x)))$ is the result of substituting the function $\lamu{x:A} \fst(g(x))$ for $f$ in the family being summed over in the co\-do\-main of \ac:
\[ \tprd{x:A} R(x,\fst(g(x))) \jdeq
\Parens{\lamu{f:A\to B} \tprd{x:A} R(x,f(x))}\big(\lamu{x:A} \fst(g(x))\big). \]
Thus, we have
\[ \Parens{\lamu{x:A} \fst(g(x)),\, \lamu{x:A} \snd(g(x))} : \tsm{f:A\to B} \tprd{x:A} R(x,f(x))\]
as required.

If we read $\Pi$ as ``for all'' and $\Sigma$ as ``there exists'', then the type of the function $\ac$ expresses:
\emph{if for all $x:A$ there is a $y:B$ such that $R(x,y)$, then there is a function $f : A \to B$ such that for all $x:A$ we have $R(x,f(x))$}.
Since this sounds like a version of the axiom of choice, the function \ac has traditionally been called the \define{type-theoretic axiom of choice}, and as we have just shown, it can be proved directly from the rules of type theory, rather than having to be taken as an axiom.
\index{axiom!of choice!type-theoretic}%
However, note that no choice is actually involved, since the choices have already been given to us in the premise: all we have to do is take it apart into two functions: one representing the choice and the other its correctness.
In \cref{sec:axiom-choice} we will give another formulation of an ``axiom of choice'' which is closer to the usual one.

Dependent pair types are often used to define types of mathematical structures, which commonly consist of several dependent pieces of data.
To take a simple example, suppose we want to define a \define{magma}\indexdef{magma} to be a type $A$ together with a binary operation $m:A\to A\to A$.
The precise meaning of the phrase ``together with''\index{together with} (and the synonymous ``equipped with''\index{equipped with}) is that ``a magma'' is a \emph{pair} $(A,m)$ consisting of a type $A:\UU$ and an operation $m:A\to A\to A$.
Since the type $A\to A\to A$ of the second component $m$ of this pair depends on its first component $A$, such pairs belong to a dependent pair type.
Thus, the definition ``a magma is a type $A$ together with a binary operation $m:A\to A\to A$'' should be read as defining \emph{the type of magmas} to be
\[ \mathsf{Magma} \defeq \sm{A:\UU} (A\to A\to A). \]
Given a magma, we extract its underlying type (its ``carrier''\index{carrier}) with the first projection $\proj1$, and its operation with the second projection $\proj2$.
Of course, structures built from more than two pieces of data require iterated pair types, which may be only partially dependent; for instance the type of pointed magmas (magmas $(A,m)$ equipped with a basepoint $e:A$) is
\[ \mathsf{PointedMagma} \defeq \sm{A:\UU} (A\to A\to A) \times A. \]
We generally also want to impose axioms on such a structure, e.g.\ to make a pointed magma into a monoid or a group.
This can also be done using $\Sigma$-types; see \cref{sec:pat}.

In the rest of the book, we will sometimes make definitions of this sort explicit, but eventually we trust the reader to translate them from English into $\Sigma$-types.
We also generally follow the common mathematical practice of using the same letter for a structure of this sort and for its carrier (which amounts to leaving the appropriate projection function implicit in the notation): that is, we will speak of a magma $A$ with its operation $m:A\to A\to A$.

Note that the canonical elements of $\mathsf{PointedMagma}$ are of the form $(A,(m,e))$ where $A:\UU$, $m:A\to A\to A$, and $e:A$.
Because of the frequency with which iterated $\Sigma$-types of this sort arise, we use the usual notation of ordered triples, quadruples and so on to stand for nested pairs (possibly dependent) associating to the right.
That is, we have $(x,y,z) \defeq (x,(y,z))$ and $(x,y,z,w)\defeq (x,(y,(z,w)))$, etc.

\index{type!dependent pair|)}%

\section{Coproduct types}
\label{sec:coproduct-types}

Given $A,B:\UU$, we introduce their \define{coproduct} type $A+B:\UU$.
\indexsee{coproduct}{type, coproduct}%
\index{type!coproduct|(defstyle}%
\indexsee{disjoint!sum}{type, coproduct}%
\indexsee{disjoint!union}{type, coproduct}%
\indexsee{sum!disjoint}{type, coproduct}%
\indexsee{union!disjoint}{type, coproduct}%
This corresponds to the \emph{disjoint union} in set theory, and we may also use that name for it.
In type theory, as was the case with functions and products, the coproduct must be a fundamental construction, since there is no previously given notion of ``union of types''.
We also introduce a
nullary version: the \define{empty type $\emptyt:\UU$}.
\indexsee{nullary!coproduct}{type, empty}%
\indexsee{empty type}{type, empty}%
\index{type!empty|(defstyle}%

There are two ways to construct elements of $A+B$, either as $\inl(a) : A+B$ for $a:A$, or as
$\inr(b):A+B$ for $b:B$.
(The names $\inl$ and $\inr$ are short for ``left injection'' and ``right injection''.)
There are no ways to construct elements of the empty type. 

\index{recursion principle!for coproduct}
To construct a non-dependent function $f : A+B \to C$, we need 
functions $g_0 : A \to C$ and $g_1 : B \to C$. Then $f$ is defined
via the defining equations
\begin{align*}
  f(\inl(a)) &\defeq g_0(a), \\
  f(\inr(b)) &\defeq g_1(b).
\end{align*}
That is, the function $f$ is defined by \define{case analysis}.
\indexdef{case analysis}%
As before, we can derive the recursor:
\symlabel{defn:recursor-plus}%
\[ \rec{A+B} : \dprd{C:\UU}(A \to C) \to (B\to C) \to A+B \to C\]
with the defining equations
\begin{align*}
\rec{A+B}(C,g_0,g_1,\inl(a)) &\defeq g_0(a), \\
\rec{A+B}(C,g_0,g_1,\inr(b)) &\defeq g_1(b).
\end{align*}

\index{recursion principle!for empty type}
We can always construct a function $f : \emptyt \to C$ without
having to give any defining equations, because there are no elements of \emptyt on which to define $f$.
Thus, the recursor for $\emptyt$ is
\symlabel{defn:recursor-emptyt}%
\[\rec{\emptyt} : \tprd{C:\UU} \emptyt \to C,\]
which constructs the canonical function from the empty type to any other type.
Logically, it corresponds to the principle \textit{ex falso quodlibet}.
\index{ex falso quodlibet@\textit{ex falso quodlibet}}

\index{induction principle!for coproduct}
To construct a dependent function $f:\prd{x:A+B}C(x)$ out of a coproduct, we assume as given the family 
$C: (A + B) \to \UU$, and 
require 
\begin{align*}
  g_0 &: \prd{a:A} C(\inl(a)), \\
  g_1 &: \prd{b:B} C(\inr(b)).
\end{align*}
This yields $f$ with the defining equations:\index{computation rule!for coproduct type}
\begin{align*}
  f(\inl(a)) &\defeq g_0(a), \\
  f(\inr(b)) &\defeq g_1(b).
\end{align*}
We package this scheme into the induction principle for coproducts:
\symlabel{defn:induction-plus}%
\begin{narrowmultline*}
  \ind{A+B} :
  \dprd{C: (A + B) \to \UU}
  \Parens{\tprd{a:A} C(\inl(a))} \to \narrowbreak
  \Parens{\tprd{b:B} C(\inr(b))} \to \tprd{x:A+B}C(x). 
\end{narrowmultline*}
As before, the recursor arises in the case that the family $C$ is
constant. 

\index{induction principle!for empty type}
The induction principle for the empty type
\symlabel{defn:induction-emptyt}%
\[ \ind{\emptyt} : \prd{C:\emptyt \to \UU}{z:\emptyt} C(z) \]
gives us a way to define a trivial dependent function out of the
empty type. % In the presence of $\eta$-equality it is derivable
% from the recursor.
% ex

\index{type!coproduct|)}%
\index{type!empty|)}%


\section{The type of booleans}
\label{sec:type-booleans}

\indexsee{boolean!type of}{type of booleans}%
\index{type!of booleans|(defstyle}%
The type of booleans $\bool:\UU$ is intended to have exactly two elements 
$\bfalse,\btrue : \bool$. It is clear that we could construct this
type out of coproduct
% this one results in a warning message just because it's on the same page as the previous entry 
% for {type!coproduct}, so it's not our fault
\index{type!coproduct}%
and unit\index{type!unit} types as $\unit + \unit$. However,
since it is used frequently, we give the explicit rules here.
Indeed, we are going to observe that we can also go the other way
and derive binary coproducts from $\Sigma$-types and $\bool$.

\index{recursion principle!for type of booleans}
To derive a function $f : \bool \to C$ we need $c_0,c_1 : C$ and
add the defining equations
\begin{align*}
  f(\bfalse) &\defeq c_0, \\
  f(\btrue)  &\defeq c_1.
\end{align*}
The recursor corresponds to the if-then-else construct in
functional programming:
\symlabel{defn:recursor-bool}%
\[ \rec{\bool} : \prd{C:\UU}  C \to C \to \bool \to C \]
with the defining equations
\begin{align*}
  \rec{\bool}(C,c_0,c_1,\bfalse) &\defeq c_0, \\
  \rec{\bool}(C,c_0,c_1,\btrue)  &\defeq c_1.
\end{align*}

\index{induction principle!for type of booleans}
Given $C : \bool \to \UU$, to derive a dependent function 
$f : \prd{x:\bool}C(x)$ we need $c_0:C(\bfalse)$ and $c_1 : C(\btrue)$, in which case we can give the defining equations
\begin{align*}
  f(\bfalse) &\defeq c_0, \\
  f(\btrue)  &\defeq c_1.
\end{align*}
We package this up into the induction principle
\symlabel{defn:induction-bool}%
\[ \ind{\bool} : \dprd{C:\bool \to \UU}  C(\bfalse) \to C(\btrue)
\to \tprd{x:\bool} C(x) \]
with the defining equations
\begin{align*}
  \ind{\bool}(C,c_0,c_1,\bfalse) &\defeq c_0, \\
  \ind{\bool}(C,c_0,c_1,\btrue)  &\defeq c_1.
\end{align*}

As an example, using the induction principle we can deduce that, as we expect, every element of $\bool$ is either $\btrue$ or $\bfalse$.
As before, in order to state this we use the equality types which we have not yet introduced, but we need only the fact that everything is equal to itself by $\refl{x}:x=x$.
Thus, we construct an element of
\begin{equation}\label{thm:allbool-trueorfalse}
  \prd{x:\bool}(x=\bfalse)+(x=\btrue),
\end{equation}
i.e.\ a function assigning to each $x:\bool$ either an equality $x=\bfalse$ or an equality $x=\btrue$.
We define this element using the induction principle for \bool, with $C(x) \defeq (x=\bfalse)+(x=\btrue)$;
the two inputs are $\inl(\refl{\bfalse}) : C(\bfalse)$ and $\inr(\refl{\btrue}):C(\btrue)$.
In other words, our element of~\eqref{thm:allbool-trueorfalse} is
\[ \ind{\bool}\big(\lam{x}(x=\bfalse)+(x=\btrue),\, \inl(\refl{\bfalse}),\, \inr(\refl{\btrue})\big). \]

We have remarked that $\Sigma$-types can be regarded as analogous to indexed disjoint unions, while coproducts are binary disjoint unions.
It is natural to expect that a binary disjoint union $A+B$ could be constructed as an indexed one over the two-element type \bool.
For this we need a type family $P:\bool\to\type$ such that $P(\bfalse)\jdeq A$ and $P(\btrue)\jdeq B$.
Indeed, we can obtain such a family precisely by the recursion principle for $\bool$.
\index{type!family of}%
(The ability to define \emph{type families} by induction and recursion, using the fact that the universe $\UU$ is itself a type, is a subtle and important aspect of type theory.)
Thus, we could have defined
\index{type!coproduct}%
\[ A + B \defeq \sm{x:\bool} \rec{\bool}(\UU,A,B,x). \]
with
\begin{align*}
  \inl(a) &\defeq \tup{\bfalse}{a}, \\
  \inr(b) &\defeq \tup{\btrue}{b}.
\end{align*}
We leave it as an exercise to derive the induction principle of a coproduct type from this definition.
(See also \cref{ex:sum-via-bool,sec:appetizer-univalence}.)

We can apply the same idea to products and $\Pi$-types: we could have defined
\[ A \times B \defeq \prd{x:\bool}\rec{\bool}(\UU,A,B,x) \]
Pairs could then be constructed using induction for \bool:
\[ \tup{a}{b} \defeq \ind{\bool}(\rec{\bool}(\UU,A,B),a,b) \]
while the projections are straightforward applications
\begin{align*}
  \fst(p) &\defeq p(\bfalse), \\
  \snd(p) &\defeq p(\btrue).
\end{align*}
The derivation of the induction principle for binary products defined in this way is a bit more involved, and requires function extensionality, which we will introduce in \cref{sec:compute-pi}.
Moreover, we do not get the same judgmental equalities; see \cref{ex:prod-via-bool}.
This is a recurrent issue when encoding one type as another; we will return to it in \cref{sec:htpy-inductive}. 

We may occasionally refer to the elements $\bfalse$ and $\btrue$ of $\bool$ as ``false'' and ``true'' respectively.
However, note that unlike in classical\index{mathematics!classical} mathematics, we do not use elements of $\bool$ as truth values
\index{value!truth}%
or as propositions.
(Instead we identify propositions with types; see \cref{sec:pat}.)
In particular, the type $A \to \bool$ is not generally the power set\index{power set} of $A$; it represents only the ``decidable'' subsets of $A$ (see \cref{cha:logic}).
\index{decidable!subset}%

\index{type!of booleans|)}%


\section{The natural numbers}
\label{sec:inductive-types}

\indexsee{type!of natural numbers}{natural numbers}%
\index{natural numbers|(defstyle}%
\indexsee{number!natural}{natural numbers}%
The rules we have introduced so far do not allow us to construct any infinite types.
The simplest infinite type we can think of (and one which is of course also extremely useful) is the type $\nat : \UU$ of natural numbers.
The elements of $\nat$ are constructed using $0 : \nat$\indexdef{zero} and the successor\indexdef{successor} operation $\suc : \nat \to \nat$.
When denoting natural numbers, we adopt the usual decimal notation $1 \defeq \suc(0)$, $2 \defeq \suc(1)$, $3 \defeq \suc(2)$, \dots.

The essential property of the natural numbers is that we can define functions by recursion and perform proofs by induction --- where now the words ``recursion'' and ``induction'' have a more familiar meaning.
\index{recursion principle!for natural numbers}%
To construct a non-dependent function $f : \nat \to C$ out of the natural numbers by recursion, it is enough to provide a starting point $c_0 : C$ and a ``next step'' function $c_s : \nat \to C \to C$.
This gives rise to $f$ with the defining equations\index{computation rule!for natural numbers}
\begin{align*}
  f(0) &\defeq c_0, \\
  f(\suc(n)) &\defeq c_s(n,f(n)).
\end{align*}
We say that $f$ is defined by \define{primitive recursion}.
\indexdef{primitive!recursion}%
\indexdef{recursion!primitive}%

As an example, we look at how to define a function on natural numbers which doubles its argument.
In this case we have $C\defeq \nat$.
We first need to supply the value of $\dbl(0)$, which is easy: we put $c_0 \defeq 0$.
Next, to compute the value of $\dbl(\suc(n))$ for a natural number $n$, we first compute the value of $\dbl(n)$ and then perform the successor operation twice.
This is captured by the recurrence\index{recurrence} $c_s(n,y) \defeq \suc(\suc(y))$.
Note that the second argument $y$ of $c_s$ stands for the result of the \emph{recursive call}\index{recursive call} $\dbl(n)$.

Defining $\dbl:\nat\to\nat$ by primitive recursion in this way, therefore, we obtain the defining equations:
\begin{align*}
  \dbl(0) &\defeq 0\\
  \dbl(\suc(n)) &\defeq \suc(\suc(\dbl(n))).
\end{align*}
This indeed has the correct computational behavior: for example, we have 
\begin{align*}
  \dbl(2) &\jdeq \dbl(\suc(\suc(0)))\\
  & \jdeq c_s(\suc(0), \dbl(\suc(0))) \\
                 & \jdeq \suc(\suc(\dbl(\suc(0)))) \\
                 & \jdeq \suc(\suc(c_s(0,\dbl(0)))) \\
                 & \jdeq \suc(\suc(\suc(\suc(\dbl(0))))) \\
                 & \jdeq \suc(\suc(\suc(\suc(c_0)))) \\
                 & \jdeq \suc(\suc(\suc(\suc(0))))\\
                 &\jdeq 4.
\end{align*}
We can define multi-variable functions by primitive recursion as well, by currying and allowing $C$ to be a function type.
\indexdef{addition!of natural numbers}
For example, we define addition $\add : \nat \to \nat \to \nat$ with $C \defeq \nat \to \nat$ and the following ``starting point'' and ``next step'' data:
\begin{align*}
  c_0 & : \nat \to \nat \\
  c_0 (n) & \defeq n \\
  c_s & : \nat \to (\nat \to \nat) \to (\nat \to \nat) \\
  c_s(m,g)(n) & \defeq \suc(g(n)).
\end{align*}
We thus obtain $\add : \nat \to \nat \to \nat$ satisfying the definitional equalities
\begin{align*}
  \add(0,n) &\jdeq n \\
  \add(\suc(m),n) &\jdeq \suc(\add(m,n)). 
\end{align*}
As usual, we write $\add(m,n)$ as $m+n$.
The reader is invited to verify that $2+2\jdeq 4$.
% ex: define multiplication and exponentiation.

As in previous cases, we can package the principle of primitive recursion into a recursor:
\[\rec{\nat}  : \dprd{C:\UU} C \to (\nat \to C \to C) \to \nat \to C \]
with the defining equations
\symlabel{defn:recursor-nat}%
\begin{align*}
\rec{\nat}(C,c_0,c_s,0)  &\defeq c_0, \\
\rec{\nat}(C,c_0,c_s,\suc(n)) &\defeq c_s(n,\rec{\nat}(C,c_0,c_s,n)).
\end{align*}
%ex derive rec from it
Using $\rec{\nat}$ we can present $\dbl$ and $\add$ as follows:
\begin{align}
\dbl &\defeq \rec\nat\big(\nat,\, 0,\, \lamu{n:\nat}{y:\nat} \suc(\suc(y))\big) \label{eq:dbl-as-rec}\\
\add &\defeq \rec{\nat}\big(\nat \to \nat,\, \lamu{n:\nat} n,\, \lamu{n:\nat}{g:\nat \to \nat}{m :\nat} \suc(g(m))\big).
\end{align}
Of course, all functions definable only using the primitive recursion principle will be \emph{computable}.
(The presence of higher function types --- that is, functions with other functions as arguments --- does, however, mean we can define more than the usual primitive recursive functions; see e.g.~\cref{ex:ackermann}.)
This is appropriate in constructive mathematics;
\index{mathematics!constructive}%
in \cref{sec:intuitionism,sec:axiom-choice} we will see how to augment type theory so that we can define more general mathematical functions.

\index{induction principle!for natural numbers}
We now follow the same approach as for other types, generalizing primitive recursion to dependent functions to obtain an \emph{induction principle}.
Thus, assume as given a family $C : \nat \to \UU$, an element $c_0 : C(0)$, and a function $c_s : \prd{n:\nat} C(n) \to C(\suc(n))$; then we can construct $f : \prd{n:\nat} C(n)$ with the defining equations:\index{computation rule!for natural numbers}
\begin{align*}
  f(0) &\defeq c_0, \\
  f(\suc(n)) &\defeq c_s(n,f(n)).
\end{align*}
We can also package this into a single function
\symlabel{defn:induction-nat}%
\[\ind{\nat}  : \dprd{C:\nat\to \UU} C(0) \to \Parens{\tprd{n : \nat} C(n) \to C(\suc(n))} \to \tprd{n : \nat} C(n) \]
with the defining equations
\begin{align*}
\ind{\nat}(C,c_0,c_s,0)  &\defeq c_0, \\
\ind{\nat}(C,c_0,c_s,\suc(n)) &\defeq c_s(n,\ind{\nat}(C,c_0,c_s,n)).
\end{align*}
Here we finally see the connection to the classical notion of proof by induction.
Recall that in type theory we represent propositions by types, and proving a proposition by inhabiting the corresponding type.
In particular, a \emph{property} of natural numbers is represented by a family of types $P:\nat\to\type$.
From this point of view, the above induction principle says that if we can prove $P(0)$, and if for any $n$ we can prove $P(\suc(n))$ assuming $P(n)$, then we have $P(n)$ for all $n$.
This is, of course, exactly the usual principle of proof by induction on natural numbers.

\index{associativity!of addition!of natural numbers}
As an example, consider how we might represent an explicit proof that $+$ is associative.
(We will not actually write out proofs in this style, but it serves as a useful example for understanding how induction is represented formally in type theory.)
To derive
\[\assoc : \prd{i,j,k:\nat} \id{i + (j + k)}{(i + j) + k}, \]
it is sufficient to supply
\[ \assoc_0 :  \prd{j,k:\nat} \id{0 + (j + k)}{(0+ j) + k} \]
and
\begin{narrowmultline*}
  \assoc_s  : \prd{i:\nat} \left(\prd{j,k:\nat} \id{i + (j + k)}{(i + j) + k}\right)
   \narrowbreak
   \to \prd{j,k:\nat} \id{\suc(i) + (j + k)}{(\suc(i) + j) + k}.
\end{narrowmultline*}
To derive $\assoc_0$, recall that $0+n \jdeq n$, and hence  $0 + (j + k) \jdeq j+k \jdeq (0+ j) + k$.
Hence we can just set
\[ \assoc_0(j,k) \defeq \refl{j+k}. \]
For $\assoc_s$, recall that the definition of $+$ gives $\suc(m)+n \jdeq \suc(m+n)$, and hence 
\begin{align*}
   \suc(i) + (j + k)  &\jdeq \suc(i+(j+k)) \qquad\text{and}\\
   (\suc(i)+j)+k &\jdeq \suc((i+j)+k).
\end{align*}
Thus, the output type of $\assoc_s$ is equivalently $\id{\suc(i+(j+k))}{\suc((i+j)+k)}$.
But its input (the ``inductive hypothesis'')
\index{hypothesis!inductive}%
\index{inductive!hypothesis}%
yields $\id{i+(j+k)}{(i+j)+k}$, so it suffices to invoke the fact that if two natural numbers are equal, then so are their successors.
(We will prove this obvious fact in \cref{lem:map}, using the induction principle of identity types.)
We call this latter fact
$\apfunc{\suc} : %\prd{m,n:\nat}
(\id[\nat]{m}{n}) \to (\id[\nat]{\suc(m)}{\suc(n)})$, so we can define
\[\assoc_s(i,h,j,k) \defeq \apfunc{\suc}( %n+(j+k),(n+j)+k,
h(j,k)). \]
Putting these together with $\ind{\nat}$, we obtain a proof of associativity.

\index{natural numbers|)}%


\section{Pattern matching and recursion}
\label{sec:pattern-matching}

\index{pattern matching|(defstyle}%
\indexsee{matching}{pattern matching}%
\index{definition!by pattern matching|(}%
The natural numbers introduce an additional subtlety over the types considered up until now.
In the case of coproducts, for instance, we could define a function $f:A+B\to C$ either with the recursor:
\[ f \defeq \rec{A+B}(C, g_0, g_1) \]
or by giving the defining equations:
\begin{align*}
  f(\inl(a)) &\defeq g_0(a)\\
  f(\inr(b)) &\defeq g_1(b).
\end{align*}
To go from the former expression of $f$ to the latter, we simply use the computation rules for the recursor.
Conversely, given any defining equations
\begin{align*}
  f(\inl(a)) &\defeq \Phi_0\\
  f(\inr(b)) &\defeq \Phi_1
\end{align*}
where $\Phi_0$ and $\Phi_1$ are expressions that may involve the variables
\index{variable}%
$a$ and $b$ respectively, we can express these equations equivalently in terms of the recursor by using $\lambda$-abstraction\index{lambda abstraction@$\lambda$-abstraction}:
\[ f\defeq \rec{A+B}(C, \lam{a} \Phi_0, \lam{b} \Phi_1).\]
In the case of the natural numbers, however, the ``defining equations'' of a function such as $\dbl$:
\begin{align}
  \dbl(0) &\defeq 0 \label{eq:dbl0}\\
  \dbl(\suc(n)) &\defeq \suc(\suc(\dbl(n)))\label{eq:dblsuc}
\end{align}
involve \emph{the function $\dbl$ itself} on the right-hand side.
However, we would still like to be able to give these equations, rather than~\eqref{eq:dbl-as-rec}, as the definition of \dbl, since they are much more convenient and readable.
The solution is to read the expression ``$\dbl(n)$'' on the right-hand side of~\eqref{eq:dblsuc} as standing in for the result of the recursive call, which in a definition of the form $\dbl\defeq \rec{\nat}(\nat,c_0,c_s)$ would be the second argument of $c_s$.

More generally, if we have a ``definition'' of a function $f:\nat\to C$ such as
\begin{align*}
  f(0) &\defeq \Phi_0\\
  f(\suc(n)) &\defeq \Phi_s
\end{align*}
where $\Phi_0$ is an expression of type $C$, and $\Phi_s$ is an expression of type $C$ which may involve the variable $n$ and also the symbol ``$f(n)$'', we may translate it to a definition
\[ f \defeq \rec{\nat}(C,\,\Phi_0,\,\lam{n}{r} \Phi_s') \]
where $\Phi_s'$ is obtained from $\Phi_s$ by replacing all occurrences of ``$f(n)$'' by the new variable $r$.

This style of defining functions by recursion (or, more generally, dependent functions by induction) is so convenient that we frequently adopt it.
It is called definition by \define{pattern matching}.
Of course, it is very similar to how a computer programmer may define a recursive function with a body that literally contains recursive calls to itself.
However, unlike the programmer, we are restricted in what sort of recursive calls we can make: in order for such a definition to be re-expressible using the recursion principle, the function $f$ being defined can only appear in the body of $f(\suc(n))$ as part of the composite symbol ``$f(n)$''.
Otherwise, we could write nonsense functions such as
\begin{align*}
  f(0)&\defeq 0\\
  f(\suc(n)) &\defeq f(\suc(\suc(n))).
\end{align*}
If a programmer wrote such a function, it would simply call itself forever on any positive input, going into an infinite loop and never returning a value.
In mathematics, however, to be worthy of the name, a \emph{function} must always associate a unique output value to every input value, so this would be unacceptable.

This point will be even more important when we introduce more complicated inductive types in \cref{cha:induction,cha:hits,cha:real-numbers}.
Whenever we introduce a new kind of inductive definition, we always begin by deriving its induction principle.
Only then do we introduce an appropriate sort of ``pattern matching'' which can be justified as a shorthand for the induction principle.

\index{pattern matching|)}%
\index{definition!by pattern matching|)}%

\section{Propositions as types}
\label{sec:pat}

\index{proposition!as types|(defstyle}%
\index{logic!propositions as types|(}%
As mentioned in the introduction, to show that a proposition is true in type theory corresponds to exhibiting an element of the type corresponding to that proposition.
\index{evidence, of the truth of a proposition}%
\index{witness!to the truth of a proposition}%
\index{proof|(}
We regard the elements of this type as \emph{evidence} or \emph{witnesses} that the proposition is true. (They are sometimes even called \emph{proofs}, but this terminology can be misleading, so we generally avoid it.)
In general, however, we will not construct witnesses explicitly; instead we present the proofs in ordinary mathematical prose, in such a way that they could be translated into an element of a type.
This is no different from reasoning in classical set theory, where we don't expect to see an explicit derivation using the rules of predicate logic and the axioms of set theory.

However, the type-theoretic perspective on proofs is nevertheless different in important ways.
The basic principle of the logic of type theory is that a proposition is not merely true or false, but rather can be seen as the collection of all possible witnesses of its truth.
Under this conception, proofs are not just the means by which mathematics is communicated, but rather are mathematical objects in their own right, on a par with more familiar objects such as numbers, mappings, groups, and so on.
Thus, since types classify the available mathematical objects and govern how they interact, propositions are nothing but special  types --- namely, types whose elements are proofs.

\index{propositional!logic}%
\index{logic!propositional}%
The basic observation which makes this identification feasible is that we have the following natural correspondence between \emph{logical} operations on propositions, expressed in English, and \emph{type-theoretic} operations on their corresponding types of witnesses.
\index{false}%
\index{true}%
\index{conjunction}%
\index{disjunction}%
\index{implication}%
\begin{center}
\medskip
\begin{tabular}{ll}
  \toprule
  English & Type Theory\\
  \midrule
  True & $\unit$ \\
  False & $\emptyt$ \\
  $A$ and $B$ & $A \times B$ \\
  $A$ or $B$ & $A + B$ \\
  If $A$ then $B$ & $A \to B$ \\
  $A$ if and only if $B$ & $(A \to B) \times (B \to A)$ \\
  Not $A$ &  $A \to \emptyt$ \\
  \bottomrule
\end{tabular}
\medskip
\end{center}

The point of the correspondence is that in each case, the rules for constructing and using elements of the type on the right correspond to the rules for reasoning about the proposition on the left.
For instance, the basic way to prove a statement of the form ``$A$ and $B$'' is to prove $A$ and also prove $B$, while the basic way to construct an element of $A\times B$ is as a pair $(a,b)$, where $a$ is an element (or witness) of $A$ and $b$ is an element (or witness) of $B$.
And if we want to use ``$A$ and $B$'' to prove something else, we are free to use both $A$ and $B$ in doing so, analogously to how the induction principle for $A\times B$ allows us to construct a function out of it by using elements of $A$ and of $B$.

Similarly, the basic way to prove an implication\index{implication} ``if $A$ then $B$'' is to assume $A$ and prove $B$, while the basic way to construct an element of $A\to B$ is to give an expression which denotes an element (witness) of $B$ which may involve an unspecified variable element (witness) of type $A$.
And the basic way to use an implication ``if $A$ then $B$'' is deduce $B$ if we know $A$, analogously to how we can apply a function $f:A\to B$ to an element of $A$ to produce an element of $B$.
We strongly encourage the reader to do the exercise of verifying that the rules governing the other type constructors translate sensibly into logic.

Of special note is that the empty type $\emptyt$ corresponds to falsity.\index{false}
When speaking logically, we refer to an inhabitant of $\emptyt$ as a \define{contradiction}:
\indexdef{contradiction}%
thus there is no way to prove a contradiction,%
\footnote{More precisely, there is no \emph{basic} way to prove a contradiction, i.e.\ \emptyt has no constructors.
If our type theory were inconsistent, then there would be some more complicated way to construct an element of \emptyt.}
while from a contradiction anything can be derived.
We also define the \define{negation}
\indexdef{negation}%
of a type $A$ as
%
\begin{equation*}
  \neg A \ \defeq\ A \to \emptyt.
\end{equation*}
%
Thus, a witness of $\neg A$ is a function $A \to \emptyt$, which we may construct by assuming $x : A$ and deriving an element of~$\emptyt$.
\index{proof!by contradiction}%
\index{logic!constructive vs classical}
Note that although the logic we obtain is ``constructive'', as discussed in the introduction, this sort of ``proof by contradiction'' (assume $A$ and derive a contradiction, concluding $\neg A$) is perfectly valid constructively: it is simply invoking the \emph{meaning} of ``negation''.
The sort of ``proof by contradiction'' which is disallowed is to assume $\neg A$ and derive a contradiction as a way of proving $A$.
Constructively, such an argument would only allow us to conclude $\neg\neg A$, and the reader can verify that there is no obvious way to get from $\neg\neg A$ (that is, from $(A\to \emptyt)\to\emptyt$) to $A$.

\mentalpause

The above translation of logical connectives into type-forming operations is referred to as \define{propositions as types}: it gives us a way to translate propositions and their proofs, written in English, into types and their elements.
For example, suppose we want to prove the following tautology (one of ``de Morgan's laws''):
\index{law!de Morgan's|(}%
\index{de Morgan's laws|(}%
\begin{equation}\label{eq:tautology1}
  \text{\emph{``If not $A$ and not $B$, then not ($A$ or $B$)''}.}
\end{equation}
An ordinary English proof of this fact might go as follows.
\begin{quote}
  Suppose not $A$ and not $B$, and also suppose $A$ or $B$; we will derive a contradiction.
  There are two cases.
  If $A$ holds, then since not $A$, we have a contradiction.
  Similarly, if $B$ holds, then since not $B$, we also have a contradiction.
  Thus we have a contradiction in either case, so not ($A$ or $B$).
\end{quote}
Now, the type corresponding to our tautology~\eqref{eq:tautology1}, according to the rules given above, is
\begin{equation}\label{eq:tautology2}
  (A\to \emptyt) \times (B\to\emptyt) \to (A+B\to\emptyt)
\end{equation}
so we should be able to translate the above proof into an element of this type.

As an example of how such a translation works, let us describe how a mathematician reading the above English proof might simultaneously construct, in his or her head, an element of~\eqref{eq:tautology2}.
The introductory phrase ``Suppose not $A$ and not $B$'' translates into defining a function, with an implicit application of the recursion principle for the cartesian product in its domain $(A\to\emptyt)\times (B\to\emptyt)$.
This introduces unnamed variables
\index{variable}%
(hypotheses)
\index{hypothesis}%
of types $A\to\emptyt$ and $B\to\emptyt$.
When translating into type theory, we have to give these variables names; let us call them $x$ and $y$.
At this point our partial definition of an element of~\eqref{eq:tautology2} can be written as
\[ f((x,y)) \defeq\; \Box\;:A+B\to\emptyt \]
with a ``hole'' $\Box$ of type $A+B\to\emptyt$ indicating what remains to be done.
(We could equivalently write $f \defeq \rec{(A\to\emptyt)\times (B\to\emptyt)}(A+B\to\emptyt,\lam{x}{y} \Box)$, using the recursor instead of pattern matching.)
The next phrase ``also suppose $A$ or $B$; we will derive a contradiction'' indicates filling this hole by a function definition, introducing another unnamed hypothesis $z:A+B$, leading to the proof state:
\[ f((x,y))(z) \defeq \;\Box\; :\emptyt \]
Now saying ``there are two cases'' indicates a case split, i.e.\ an application of the recursion principle for the coproduct $A+B$.
If we write this using the recursor, it would be
\[ f((x,y))(z) \defeq \rec{A+B}(\emptyt,\lam{a} \Box,\lam{b}\Box,z) \]
while if we write it using pattern matching, it would be
\begin{align*}
  f((x,y))(\inl(a)) &\defeq \;\Box\;:\emptyt\\
  f((x,y))(\inr(b)) &\defeq \;\Box\;:\emptyt.
\end{align*}
Note that in both cases we now have two ``holes'' of type $\emptyt$ to fill in, corresponding to the two cases where we have to derive a contradiction.
Finally, the conclusion of a contradiction from $a:A$ and $x:A\to\emptyt$ is simply application of the function $x$ to $a$, and similarly in the other case.
\index{application!of hypothesis or theorem}%
(Note the convenient coincidence of the phrase ``applying a function'' with that of ``applying a hypothesis'' or theorem.)
Thus our eventual definition is
\begin{align*}
  f((x,y))(\inl(a)) &\defeq x(a)\\
  f((x,y))(\inr(b)) &\defeq y(b).
\end{align*}

As an exercise, you should verify 
the converse tautology \emph{``If not ($A$ or $B$), then  (not $A$) and (not $B$)}'' by exhibiting an element of 
\[ ((A + B) \to \emptyt) \to (A \to \emptyt) \times (B \to \emptyt), \]
for any types $A$ and $B$, using the rules we have just introduced.

\index{logic!classical vs constructive|(}
However, not all classical\index{mathematics!classical} tautologies hold under this interpretation.
For example, the rule 
\emph{``If not ($A$ and $B$), then (not $A$) or (not $B$)''} is not valid: we cannot, in general, construct an element of the corresponding type
\[ ((A \times B) \to \emptyt) \to (A \to \emptyt) + (B \to \emptyt).\]
This reflects the fact that the ``natural'' propositions-as-types logic of type theory is \emph{constructive}.
This means that it does not include certain classical principles, such as the law of excluded middle (\LEM{})\index{excluded middle}
or proof by contradiction,\index{proof!by contradiction}
and others which depend on them, such as this instance of de Morgan's law.
\index{law!de Morgan's|)}%
\index{de Morgan's laws|)}%

Philosophically, constructive logic is so-called because it confines itself to constructions that can be carried out \emph{effectively}, which is to say those with a computational meaning.
Without being too precise, this means there is some sort of algorithm\index{algorithm} specifying, step-by-step, how to build an object (and, as a special case, how to see that a theorem is true).
This requires omission of \LEM{}, since there is no \emph{effective}\index{effective!procedure} procedure for deciding whether a proposition is true or false.

The constructivity of type-theoretic logic means it has an intrinsic computational meaning, which is of interest to computer scientists.
It also means that type theory provides \emph{axiomatic freedom}.\index{axiomatic freedom}
For example, while by default there is no construction witnessing \LEM{}, the logic is still compatible with the existence of one (see \cref{sec:intuitionism}).
Thus, because type theory does not \emph{deny} \LEM{}, we may consistently add it as an assumption, and work conventionally without restriction.
In this respect, type theory enriches, rather than constrains, conventional mathematical practice.

We encourage the reader who is unfamiliar with constructive logic to work through some more examples as a means of getting familiar with it.
See \cref{ex:tautologies,ex:not-not-lem} for some suggestions.
\index{logic!classical vs constructive|)}

\mentalpause

So far we have discussed only propositional logic.
\index{quantifier}%
\index{quantifier!existential}%
\index{quantifier!universal}%
\index{predicate!logic}%
\index{logic!predicate}%
Now we consider \emph{predicate} logic, where in addition to logical connectives like ``and'' and ``or'' we have quantifiers ``there exists'' and ``for all''.
In this case, types play a dual role: they serve as propositions and also as types in the conventional sense, i.e., domains we quantify over.
A predicate over a type $A$ is represented as a family $P : A \to \UU$, assigning to every element $a : A$ a type $P(a)$ corresponding to the proposition that $P$ holds for $a$. We now extend the above translation with an explanation of the quantifiers:
\begin{center}
  \medskip
  \begin{tabular}{ll}
    \toprule
    English & Type Theory\\
    \midrule
    For all $x:A$, $P(x)$ holds & $\prd{x:A} P(x)$ \\
    There exists $x:A$ such that $P(x)$ & $\sm{x:A}$ $P(x)$ \\
    \bottomrule
  \end{tabular}
  \medskip
\end{center}
As before, we can show that tautologies of (constructive) predicate logic translate into inhabited types.
For example, \emph{If for all $x:A$, $P(x)$ and $Q(x)$ then (for all $x:A$, $P(x)$) and (for all $x:A$, $Q(x)$)} translates to
\[ (\tprd{x:A} P(x) \times Q(x)) \to (\tprd{x:A} P(x)) \times (\tprd{x:A} Q(x)). \]
An informal proof of this tautology might go as follows:
\begin{quote}
  Suppose for all $x$, $P(x)$ and $Q(x)$.
  First, we suppose given $x$ and prove $P(x)$.
  By assumption, we have $P(x)$ and $Q(x)$, and hence we have $P(x)$.
  Second, we suppose given $x$ and prove $Q(x)$.
  Again by assumption, we have $P(x)$ and $Q(x)$, and hence we have $Q(x)$.
\end{quote}
The first sentence begins defining an implication as a function, by introducing a witness for its hypothesis:\index{hypothesis}
\[ f(p) \defeq \;\Box\; : (\tprd{x:A} P(x)) \times (\tprd{x:A} Q(x)). \]
At this point there is an implicit use of the pairing constructor to produce an element of a product type, which is somewhat signposted in this example by the words ``first'' and ``second'':
\[ f(p) \defeq \Big( \;\Box\; : \tprd{x:A} P(x) \;,\; \Box\; : \tprd{x:A}Q(x) \;\Big). \]
The phrase ``we suppose given $x$ and prove $P(x)$'' now indicates defining a \emph{dependent} function in the usual way, introducing a variable
\index{variable}%
for its input.
Since this is inside a pairing constructor, it is natural to write it as a $\lambda$-abstraction\index{lambda abstraction@$\lambda$-abstraction}:
\[ f(p) \defeq \Big( \; \lam{x} \;\big(\Box\; : P(x)\big) \;,\; \Box\; : \tprd{x:A}Q(x) \;\Big). \]
Now ``we have $P(x)$ and $Q(x)$'' invokes the hypothesis, obtaining $p(x) : P(x)\times Q(x)$, and ``hence we have $P(x)$'' implicitly applies the appropriate projection:
\[ f(p) \defeq \Big( \; \lam{x} \proj1(p(x))  \;,\; \Box\; : \tprd{x:A}Q(x) \;\Big). \]
The next two sentences fill the other hole in the obvious way:
\[ f(p) \defeq \Big( \; \lam{x} \proj1(p(x))  \;,\; \lam{x} \proj2(p(x)) \; \Big). \]
Of course, the English proofs we have been using as examples are much more verbose than those that mathematicians usually use in practice; they are more like the sort of language one uses in an ``introduction to proofs'' class.
The practicing mathematician has learned to fill in the gaps, so in practice we can omit plenty of details, and we will generally do so.
The criterion of validity for proofs, however, is always that they can be translated back into the construction of an element of the corresponding type.

\symlabel{leq-nat}%
As a more concrete example, consider how to define inequalities of natural numbers.
One natural definition is that $n\le m$ if there exists a $k:\nat$ such that $n+k=m$.
(This uses again the identity types that we will introduce in the next section, but we will not need very much about them.)
Under the propositions-as-types translation, this would yield:
\[ (n\le m) \defeq \sm{k:\nat} (\id{n+k}{m}). \]
The reader is invited to prove the familiar properties of $\le$ from this definition.
For strict inequality, there are a couple of natural choices, such as
\[ (n<m) \defeq \sm{k:\nat} (\id{n+\suc(k)}{m}) \]
or
\[ (n<m) \defeq (n\le m) \times \neg(\id{n}{m}). \]
The former is more natural in constructive mathematics, but in this case it is actually equivalent to the latter, since $\nat$ has ``decidable equality'' (see \cref{sec:intuitionism,prop:nat-is-set}).
\index{decidable!equality}%

There is also another interpretation of the type $\sm{x:A} P(x)$.
Since an inhabitant of it is an element $x:A$ together with a witness that $P(x)$ holds, instead of regarding $\sm{x:A} P(x)$ as the proposition ``there exists an $x:A$ such that $P(x)$'', we can regard it as ``the type of all elements $x:A$ such that $P(x)$'', i.e.\ as a ``subtype'' of $A$.
\index{subtype}%

We will return to this interpretation in \cref{subsec:prop-subsets}.
For now, we note that it allows us to incorporate axioms into the definition of types as mathematical structures which we discussed in \cref{sec:sigma-types}.
For example, suppose we want to define a \define{semigroup}\index{semigroup} to be a type $A$ equipped with a binary operation $m:A\to A\to A$ (that is, a magma\index{magma}) and such that for all $x,y,z:A$ we have $m(x,m(y,z)) = m(m(x,y),z)$.
This latter proposition is represented by the type
\[\prd{x,y,z:A} m(x,m(y,z)) = m(m(x,y),z),\]
so the type of semigroups is
\[ \semigroup \defeq \sm{A:\UU}{m:A\to A\to A} \prd{x,y,z:A} m(x,m(y,z)) = m(m(x,y),z), \]
i.e.\ the subtype of $\mathsf{Magma}$ consisting of the semigroups.
From an inhabitant of $\semigroup$ we can extract the carrier $A$, the operation $m$, and a witness of the axiom, by applying appropriate projections.
We will return to this example in \cref{sec:equality-of-structures}.

Note also that we can use the universes in type theory to represent ``higher order logic'' --- that is, we can quantify over all propositions or over all predicates.
For example, we can represent the proposition \emph{for all properties $P : A \to \UU$, if $P(a)$ then $P(b)$} as
\[ \prd{P : A \to \UU} P(a) \to P(b) \]
where $A : \UU$ and $a,b : A$.
However, \emph{a priori} this proposition lives in a different, higher, universe than the
propositions we are quantifying over; that is
\[ \Parens{\prd{P : A \to \UU_i} P(a) \to P(b)} : \UU_{i+1}. \]
We will return to this issue in \cref{subsec:prop-subsets}.

\mentalpause

We have described here a ``proof-relevant''
\index{mathematics!proof-relevant}%
translation of propositions, where the proofs of disjunctions and existential statements carry some information.
For instance, if we have an inhabitant of $A+B$, regarded as a witness of ``$A$ or $B$'', then we know whether it came from $A$ or from $B$.
Similarly, if we have an inhabitant of $\sm{x:A} P(x)$, regarded as a witness of ``there exists $x:A$ such that $P(x)$'', then we know what the element $x$ is (it is the first projection of the given inhabitant).

As a consequence of the proof-relevant nature of this logic, we may have ``$A$ if and only if $B$'' (which, recall, means $(A\to B)\times (B\to A)$), and yet the types $A$ and $B$ exhibit different behavior.
For instance, it is easy to verify that ``$\mathbb{N}$ if and only if $\unit$'', and yet clearly $\mathbb{N}$ and $\unit$ differ in important ways.
The statement ``$\mathbb{N}$ if and only if $\unit$'' tells us only that \emph{when regarded as a mere proposition}, the type $\mathbb{N}$ represents the same proposition as $\unit$ (in this case, the true proposition).
We sometimes express ``$A$ if and only if $B$'' by saying that $A$ and $B$ are \define{logically equivalent}.
\indexdef{logical equivalence}%
\indexdef{equivalence!logical}%
This is to be distinguished from the stronger notion of \emph{equivalence of types} to be introduced in \cref{sec:basics-equivalences,cha:equivalences}:
although $\mathbb{N}$ and $\unit$ are logically equivalent, they are not equivalent types.

In \cref{cha:logic} we will introduce a class of types called ``mere propositions'' for which equivalence and logical equivalence coincide.
Using these types, we will introduce a modification to the above-described logic that is sometimes appropriate, in which the additional information contained in disjunctions and existentials is discarded.

Finally, we note that the propositions-as-types correspondence can be viewed in reverse, allowing us to regard any type $A$ as a proposition, which we prove by exhibiting an element of $A$.
Sometimes we will state this proposition as ``$A$ is \define{inhabited}''.
\indexdef{inhabited type}%
\indexsee{type!inhabited}{inhabited type}%
That is, when we say that $A$ is inhabited, we mean that we have given a (particular) element of $A$, but that we are choosing not to give a name to that element.
Similarly, to say that $A$ is \emph{not inhabited} is the same as to give an element of $\neg A$.
In particular, the empty type $\emptyt$ is obviously not inhabited, since $\neg \emptyt \jdeq (\emptyt \to \emptyt)$ is inhabited by $\idfunc[\emptyt]$.\footnote{This should not be confused with the statement that type theory is consistent, which is the \emph{meta-theoretic} claim that it is not possible to obtain an element of $\emptyt$ by following the rules of type theory.\indexfoot{consistency}}

\index{proof|)}%
\index{proposition!as types|)}%
\index{logic!propositions as types|)}%

\section{Identity types}
\label{sec:identity-types}

\index{type!identity|(defstyle}%
\indexsee{identity!type}{type, identity}%
\indexsee{type!equality}{type, identity}%
\indexsee{equality!type}{type, identity}%
While the previous constructions can be seen as generalizations of
standard set theoretic constructions, our way of handling identity  seems to be
specific to type theory.
According to the propositions-as-types conception, the \emph{proposition} that two elements of the same type $a,b:A$ are equal must correspond to some \emph{type}.
Since this proposition depends on what $a$ and $b$ are, these \define{equality types} or \define{identity types} must be type families dependent on two copies of $A$.

We may write the family as $\idtypevar{A}:A\to A\to\type$, so that $\idtype[A]ab$ is the type representing the proposition of equality between $a$ and $b$.
Once we are familiar with propositions-as-types, however, it is convenient to also use the standard equality symbol for this; thus ``$\id{a}{b}$'' will also be a notation for the \emph{type} $\idtype[A]ab$ corresponding to the proposition that $a$ equals $b$.
For clarity, we may also write ``$\id[A]{a}{b}$'' to specify the type $A$.
If we have an element of $\id[A]{a}{b}$, we may say that $a$ and $b$ are \define{equal}, or sometimes \define{propositionally equal} if we want to emphasize that this is different from the judgmental equality $a\jdeq b$ discussed in \cref{sec:types-vs-sets}.
\indexdef{equality!propositional}%
\indexdef{propositional!equality}%

Just as we remarked in \cref{sec:pat} that the propositions-as-types versions of ``or'' and ``there exists'' can include more information than just the fact that the proposition is true, nothing prevents  the type $\id{a}{b}$ from also including more information.
Indeed, this is the cornerstone of the homotopical interpretation, where we regard witnesses of $\id{a}{b}$ as \emph{paths}\indexdef{path} or \emph{equivalences} between $a$ and $b$ in the space $A$.  Just as there can be more than one path between two points of a space, there can be more than one witness that two objects are equal.  Put differently, we may regard $\id{a}{b}$ as the type of \emph{identifications}\indexdef{identification} of $a$ and $b$, and there may be many different ways in which $a$ and $b$ can be identified.
We will return to the interpretation in \cref{cha:basics}; for now we focus on the basic rules for the identity type.
Just like all the other types considered in this chapter, it will have rules for formation, introduction, elimination, and computation, which behave formally in exactly the same way.

The formation rule says that given a type $A:\UU$ and two elements $a,b:A$, we can form the type $(\id[A]{a}{b}):\UU$ in the same universe.
The basic way to construct an element of $\id{a}{b}$ is to know that $a$ and $b$ are the same.
Thus, the introduction rule is a dependent function
\[\refl{} : \prd{a:A} (\id[A]{a}{a})\]
called \define{reflexivity},
\indexdef{reflexivity!of equality}%
which says that every element of $A$ is equal to itself (in a specified way).  We regard $\refl{a}$ as being the
constant path\indexdef{path!constant}\indexsee{loop!constant}{path, constant}
at the point $a$.

In particular, this means that if $a$ and $b$ are \emph{judgmentally} equal, $a\jdeq b$, then we also have an element $\refl{a} : \id[A]{a}{b}$.
This is well-typed because $a\jdeq b$ means that also the type $\id[A]{a}{b}$ is judgmentally equal to $\id[A]{a}{a}$, which is the type of $\refl{a}$.

The induction principle (i.e.\ the elimination rule) for the identity types is one of the most subtle parts of type theory, and crucial to the homotopy interpretation.
We begin by considering an important consequence of it, the principle that ``equals may be substituted for equals'', as expressed by the following:
\index{indiscernability of identicals}%
\index{equals may be substituted for equals}%
\begin{description}
\item[Indiscernability of identicals:]
For every family 
\[
C : A \to \UU
\]
there is a function
\[
f : \prd{x,y:A}{p:\id[A] x y} C(x) \to C(y)
\]
such that
\[
f(x,x,\refl{x}) \defeq \idfunc[C(x)].
\]
\end{description}
This says that every family of types $C$ respects equality, in the sense that applying $C$ to \emph{equal} elements of $A$ also results in a function between the resulting types. The displayed equality states that the function associated to reflexivity is the identity function (and we shall see that, in general, the function $f(x,y,p): C(x) \to C(y)$ is always an equivalence of types).

Indiscernability of identicals can be regarded as a recursion principle for the identity type, analogous to those given for booleans and natural numbers above.
Just as $\rec{\nat}$ gives a specified map $\nat\to C$ for any other type $C$ of a certain sort, indiscernability of identicals gives a specified map from $\id[A] x y$ to certain other reflexive, binary relations on $A$, namely those of the form $C(x) \to C(y)$ for some unary predicate $C(x)$.
We could also formulate a more general recursion principle with respect to reflexive relations of the more general form $C(x,y)$.
However, in order to fully characterize the identity type, we must generalize this recursion principle to an induction principle, which not only considers maps out of $\id[A] x y$ but also families over it.
Put differently, we consider not only allowing equals to be substituted for equals, but also taking into account the evidence $p$ for the equality.
    
\subsection{Path induction}

\index{generation!of a type, inductive|(}
The induction principle for the identity type is called \define{path induction},
\index{path!induction|(}%
\index{induction principle!for identity type|(}%
in view of the homotopical interpretation to be explained in  the introduction to \cref{cha:basics}.  It can be seen as stating that the family of identity types is freely generated by the elements of the form $\refl{x}: \id{x}{x}$.

\begin{description}
\item[Path induction:] 
  Given a family 
  \[ C : \prd{x,y:A} (\id[A]{x}{y}) \to \UU \]
  and a function
  \[ c :  \prd{x:A} C(x,x,\refl{x}),\]
  there is a function
  \[ f : \prd{x,y:A}{p:\id[A]{x}{y}} C(x,y,p) \]
  such that 
  \[ f(x,x,\refl{x}) \defeq c(x). \]
\end{description}

Note that just like the induction principles for products, coproducts, natural numbers, and so on, path induction allows us to define \emph{specified} functions which exhibit appropriate computational behavior.
Just as we have \emph{the} function $f:\nat\to C$ defined by recursion from $c_0:C$ and $c_s:\nat \to C \to C$, which moreover satisfies $f(0)\jdeq c_0$ and $f(\suc(n))\jdeq c_s(n,f(n))$, we have \emph{the} function $f : \dprd{x,y:A}{p:\id[A]{x}{y}} C(x,y,p)$ defined by path induction from $c :  \prd{x:A} C(x,x,\refl{x})$, which moreover satisfies $f(x,x,\refl{x}) \jdeq c(x)$.

To understand the meaning of this principle, consider first the simpler case when $C$
does not depend on $p$.  Then we have $C:A\to A\to \UU$, which we may
regard as a predicate depending on two elements of $A$.  We are
interested in knowing when the proposition $C(x,y)$ holds for some pair
of elements $x,y:A$.  In this case, the hypothesis of path induction
says that we know $C(x,x)$ holds for all $x:A$, i.e.\ that if we
evaluate $C$ at the pair $x, x$, we get a true proposition --- so $C$ is
a reflexive relation.  The conclusion then tells us that $C(x,y)$ holds
whenever $\id{x}{y}$.  This is exactly the more general recursion principle
for reflexive relations mentioned above.

The general, inductive form of the rule allows $C$ to also depend on the witness $p:\id{x}{y}$ to the identity between $x$ and $y$.  In the premise, we not only replace $x, y$ by $x,x$, but also simultaneously replace $p$ by reflexivity: to prove a property for all elements $x,y$ and paths $p : \id{x}{y}$ between them, it suffices to consider all the cases where the elements are $x,x$ and the path is $\refl{x}: \id{x}{x}$.  If we were viewing types just as sets, it would be unclear what this buys us, but since there may be many different identifications $p : \id{x}{y}$ between $x$ and $y$, it makes sense to keep track of them in considering families over the type $\id[A]{x}{y}$.
In \cref{cha:basics} we will see that this is very important to the homotopy interpretation.

If we package up path induction into a single function, it takes the form:
\symlabel{defn:induction-ML-id}%
\begin{narrowmultline*}
  \indid{A} :  \dprd{C : \prd{x,y:A} (\id[A]{x}{y}) \to \UU}
  \Parens{\tprd{x:A} C(x,x,\refl{x})} \to 
  \narrowbreak
  \dprd{x,y:A}{p:\id[A]{x}{y}}   C(x,y,p)
\end{narrowmultline*}
with the equality\index{computation rule!for identity types}
\[ \indid{A}(C,c,x,x,\refl{x}) \defeq c(x). \]
The function $ \indid{A}$ is traditionally called $J$.
\indexsee{J@$J$}{induction principle for identity type}%
We leave it as an easy exercise to show that indiscernability of identicals follows from path induction.  

\mentalpause

Given a proof $p : \id{a}{b}$,
path induction requires us to replace \emph{both} $a$ and $b$ with the same unknown element $x$; thus in order to define an element of a family
$C$, for all pairs of elements of $A$, it suffices to define it on the diagonal.
In some proofs, however, it is simpler to make use of an equation $p : \id{a}{b}$ by replacing all occurrences of $b$ with $a$ (or vice versa), because it is sometimes easier to do the remainder of the proof for the specific element $a$ mentioned in the equality than for a general unknown $x$.  This motivates a second induction principle for identity types, which says that the family of types $\id[A]{a}{x}$ is generated by the element $\refl{a} : \id{a}{a}$.  As we show below, this second principle is equivalent to the first; it is just sometimes a more convenient formulation.

\index{path!induction based}%
\index{induction principle!for identity type!based}%
\begin{description}
\item[Based path induction:] 
  Fix an element $a:A$, and suppose given a family
  \[ C : \prd{x:A} (\id[A]{a}{x}) \to \UU \]
  and an element
  \[ c : C(a,\refl{a}). \]
  Then we obtain a function
  \[ f : \prd{x:A}{p:\id{a}{x}} C(x,p) \]
  such that
  \[ f(a,\refl{a}) \defeq c.\]
\end{description}

Here, $C(x,p)$ is a family of types, where $x$ is an element of $A$ and $p$ is an element of the identity type $\id[A]{a}{x}$, for fixed $a$ in $A$. The based path induction principle says that to define an element of this family for all $x$ and $p$, it suffices to consider
just the case where $x$ is $a$ and $p$ is $\refl{a} : \id{a}{a}$.

Packaged as a function, based path induction becomes:
\symlabel{defn:induction-PM-id}%
\begin{align*}
  \indidb{A} :  \dprd{a:A}{C : \prd{x:A} (\id[A]{a}{x}) \to \UU}
  C(a,\refl{a}) \to \dprd{x:A}{p : \id[A]{a}{x}} C(x,p) 
\end{align*}
with the equality
\[ \indidb{A}(a,C,c,a,\refl{a}) \defeq c. \]
%\[ g(x)(x,\refl{x}) \defeq d(x) \]

Below, we show that path induction and based path induction are equivalent.  Because of this, we will sometimes be sloppy and also refer to based path induction simply as ``path induction'', relying on the reader to infer which principle is meant from the form of the proof.

\begin{rmk}\label{rmk:the-only-path-is-refl}
Intuitively, the induction principle for the natural numbers expresses the fact that the only natural numbers are $0$ and $\suc(n)$, so if we prove a property for these cases, then we have proved it for all natural numbers.  Applying this same reading to path induction, we might loosely say that path induction expresses the fact that the only path is \refl{}, so if we prove a property for reflexivity, then we have proved it for all paths.  However, this reading is quite confusing in the context of the homotopy interpretation of paths, where there may be many different ways in which two elements $a$ and $b$ can be identified, and therefore many different elements of the identity type!  How can there be many different paths, but at the same time we have an induction principle asserting that the only path is reflexivity?

The key observation is that it is not the identity \emph{type} that is inductively defined, but the identity \emph{family}.
In particular, path induction says that the \emph{family} of types $(\id[A]{x}{y})$, as $x,y$ vary over all elements of $A$, is inductively defined by the elements of the form $\refl{x}$.
This means that to give an element of any other family $C(x,y,p)$ dependent on a \emph{generic} element $(x,y,p)$ of the identity family, it suffices to consider the cases of the form $(x,x,\refl{x})$.
In the homotopy interpretation, this says that the type of triples $(x,y,p)$, where $x$ and $y$ are the endpoints of the path $p$ (in other words, the $\Sigma$-type $\sm{x,y:A}(\id{x}{y})$), is inductively generated by the constant loops\index{path!constant} at each point $x$.
As we will see in \cref{cha:basics}, in homotopy theory the space corresponding to $\sm{x,y:A}(\id{x}{y})$ is the \emph{free path space} --- the space of paths in $A$ whose endpoints may vary --- and it is in fact the case that any point of this space is homotopic to the constant loop at some point, since we can simply retract one of its endpoints along the given path.
The analogous fact is also true in type theory: we can prove by path induction on $p:x=y$ that $\id[\sm{x,y:A}(\id{x}{y})]{(x,y,p)}{(x,x,\refl{x})}$.

Similarly, based path induction says that for a fixed $a:A$, the \emph{family} of types $(\id[A]{a}{y})$, as $y$ varies over all elements of $A$, is inductively defined by the element $\refl{a}$.
Thus, to give an element of any other family $C(y,p)$ dependent on a generic element $(y,p)$ of this family, it suffices to consider the case $(a,\refl{a})$.
Homotopically, this expresses the fact that the space of paths starting at some chosen point (the \emph{based path space} at that point, which type-theoretically is $\sm{y:A} (\id{a}{y})$) is contractible to the constant loop on the chosen point.
Again, the corresponding fact is also true in type theory: we can prove by based path induction on $p:a=y$ that $\id[\sm{y:A}(\id{a}{y})]{(y,p)}{(a,\refl{a})}$.
Note also that according to the interpretation of $\Sigma$-types as subtypes mentioned in \cref{sec:pat}, the type $\sm{y:A}(\id{a}{y})$ can be regarded as ``the type of all elements of $A$ which are equal to $a$'', a type-theoretic version of the ``singleton\index{type!singleton} subset'' $\{a\}$.

Neither path induction nor based path induction provides a way to give an element of a family $C(p)$ where $p$ has \emph{two fixed endpoints} $a$ and $b$.
In particular, for a family $C: (\id[A]{a}{a}) \to \UU$ dependent on a loop, we \emph{cannot} apply path induction and consider only the case for $C(\refl{a})$, and consequently, we cannot prove that all loops are reflexivity.
Thus, inductively defining the identity family does not prohibit non-reflexivity paths in specific instances of the identity type.
In other words, a path $p:\id{x}{x}$ may  be not equal to reflexivity as an element of $(\id{x}{x})$, but the pair $(x,p)$ will nevertheless be equal to the pair $(x,\refl{x})$ as elements of $\sm{y:A}(\id{x}{y})$.
\end{rmk}

\index{path!induction|)}%
\index{induction principle!for identity type|)}%
\index{generation!of a type, inductive|)}

\subsection{Equivalence of path induction and based path induction}

The two induction principles for the identity type introduced above are equivalent.
It is easy to see that path induction follows from the based path induction principle.
Indeed, let us assume the premises of path induction:
\begin{align*}
C &: \prd{x,y:A}(\id[A]{x}{y}) \to \UU,\\
c &: \prd{x:A} C(x,x,\refl{x}).
\end{align*}
Now, given an element $x:A$, we can instantiate both of the above, obtaining
\begin{align*}
C' &: \prd{y:A} (\id[A]{x}{y}) \to \UU,  \\
C' &\defeq C(x), \\
c' &: C'(x,\refl{x}), \\
c' &\defeq c(x).
\end{align*}
Clearly, $C'$ and $c'$ match the premises of based path induction and hence we can construct 
\begin{equation*}
  g : \prd{y:A}{p : \id{x}{y}} C'(y,p)
\end{equation*}
with the defining equality
\[ g(x,\refl{x}) \defeq c'.\]
Now we observe that $g$'s codomain is equal to $C(x,y,p)$.
Thus, discharging our assumption $x:A$, we can derive a function 
\[ f : \prd{x,y:A}{p : \id[A]{x}{y}} C(x,y,p) \]
with the required judgmental equality $f(x,x,\refl{x}) \judgeq g(x,\refl{x}) \defeq c' \defeq c(x)$.

Another proof of this fact is to observe that any such $f$ can be obtained as an instance of $\indidb{A}$
so it suffices to define $\indid{A}$ in terms of $\indidb{A}$ as
\[ \indid{A}(C,c,x,y,p) \defeq \indidb{A}(x,C(x),c(x),y,p). \]

The other direction is a bit trickier; it is not clear how we can use a particular instance of path induction to derive a particular instance of
based path induction. What we can do instead is to construct one instance of path induction which shows 
all possible instantiations of based path induction at once.
Define
\begin{align*}
D &: \prd{x,y:A} (\id[A]{x}{y}) \to \UU, \\
D(x,y,p) &\defeq \prd{C : \prd{z:A} (\id[A]{x}{z}) \to \UU} C(x,\refl{x}) \to C(y,p).
\end{align*}
Then we can construct the function
\begin{align*}
d &: \prd{x : A} D(x,x,\refl{x}), \\
d &\defeq \lamu{x:A}\lamu{C:\prd{z:A}{p : \id[A]{x}{z}} \UU}\lam{c:C(x,\refl{x})} c
\end{align*}
and hence using path induction obtain
\[ f : \prd{x,y:A}{p:\id[A]{x}{y}} D(x,y,p) \]
with $f(x,x,\refl{x}) \defeq d(x)$. Unfolding the definition of $D$, we can expand the type of $f$:
\[ f : \prd{x,y:A}{p:\id[A]{x}{y}}{C : \prd{z:A} (\id[A]{x}{z}) \to \UU} C(x,\refl{x}) \to C(y,p). \]
Now given $x:A$ and $p:\id[A]{a}{x}$, we can derive the conclusion of based path induction:
\[ f(a,x,p,C,c) : C(x,p). \]
Notice that we also obtain the correct definitional equality.

Another proof is to observe that any use of based path induction is an instance of $\indidb{A}$  and to define
\begin{narrowmultline*}
\indidb{A}(a,C,c,x,p) \defeq \narrowbreak
\indid{A}
  \begin{aligned}[t]
    \big(
    &\big(\lamu{x,y:A}{p:\id[A]{x}{y}} \tprd{C : \prd{z:A} (\id[A]{x}{z}) \to \UU} C(x,\refl{x}) \to C(y,p) \big),\\
    &(\lamu{x:A}{C:\prd{z:A} (\id[A]{x}{z}) \to \UU}{d:C(x,\refl{x})} d),
     a, x, p, C, c \big) 
   \end{aligned}
\end{narrowmultline*}


Note that the construction given above uses universes. That is, if we want to
model $\indidb{A}$ with $C : \prd{x:A} (\id[A]{a}{x}) \to \UU_i$, we need
to use $\indid{A}$ with 
%
\[ D:\prd{x,y:A} (\id[A]{x}{y}) \to \UU_{i+1} \]
%
since $D$ quantifies over all $C$ of the given type. While this is
compatible with our definition of universes, it is also possible to
derive $\indidb{A}$ without using universes: we can show that $\indid{A}$ entails \cref{lem:transport,thm:contr-paths}, and that these two principles imply $\indidb{A}$ directly.
We leave the details to the reader as \cref{ex:pm-to-ml}.

We can use either of the foregoing formulations of identity types
to establish that equality is an equivalence relation, that every function preserves equality and that every family respects equality. We leave the details to the next chapter, where this will be derived  and explained in the context of homotopy type theory.

\begin{rmk}\label{rmk:propeq-vs-jdeq}
  We emphasize that despite having some unfamiliar features, propositional equality is \emph{the} equality of mathematics in homotopy type theory.
  This distinction does not belong to judgmental equality, which is rather a metatheoretic feature of the rules of type theory.
  For instance, the associativity of addition for natural numbers proven in \cref{sec:inductive-types} is a \emph{propositional} equality, not a judgmental one.
  The same is true of the commutative law (\cref{ex:add-nat-commutative}).
  Even the very simple commutativity $n+1=1+n$ is not a judgmental equality for a generic $n$ (though it is judgmental for any specific $n$, e.g. $3+1\jdeq 1+3$, since both are judgmentally equal to $4$ by the computation rules defining $+$).
  We can only prove such facts by using the identity type, since we can only apply the induction principle for \nat with a type as output (not a judgment).
\end{rmk}

\subsection{Disequality}
\label{sec:disequality}

Finally, let us also say something about \define{disequality},
\indexdef{disequality}%
which is negation of equality:%
\footnote{We use ``inequality''
  to refer to $<$ and $\leq$. Also, note that this is negation of the \emph{propositional} identity type.
Of course, it makes no sense to negate judgmental equality $\jdeq$, because judgments are not subject to logical operations.}
%
\begin{equation*}
  (x \neq_A y) \ \defeq\ \lnot (\id[A]{x}{y}).
\end{equation*}
If $x\neq y$, we say that $x$ and $y$ are \define{unequal}
\indexdef{unequal}%
or \define{not equal}.
%
Just like negation, disequality plays a less important role here than it does in classical\index{mathematics!classical}
mathematics. For example, we cannot prove that two things are equal by proving that they
are not unequal: that would be an application of the classical law of double negation, see \cref{sec:intuitionism}.

Sometimes it is useful to phrase disequality in a positive way. For example,
in~\cref{RD-inverse-apart-0} we shall prove that a real number $x$ has an inverse if,
and only if, its distance from~$0$ is positive, which is a stronger requirement than $x
\neq 0$.

\index{type!identity|)}%

\sectionNotes

The type theory presented here is a version of Martin-L\"{o}f's intuitionistic type 
theory~\cite{Martin-Lof-1972,Martin-Lof-1973,Martin-Lof-1979,martin-lof:bibliopolis}, which itself is based on and influenced 
by the foundational work of Brouwer \cite{beeson}, Heyting~\cite{heyting1966intuitionism}, Scott~\cite{scott70}, de 
Bruijn~\cite{deBruijn-1973}, Howard~\cite{howard:pat}, Tait~\cite{Tait-1966,Tait-1968}, and Lawvere~\cite{lawvere:adjinfound}\index{Lawvere}.
\index{proof!assistant}%
Three principal variants of Martin-L\"{o}f's type theory underlie the \NuPRL \cite{constable+86nuprl-book}, \Coq~\cite{Coq}, and 
\Agda \cite{norell2007towards} computer implementations of type theory.  The theory given here differs from these formulations in a number 
of respects, some of which are critical to the homotopy interpretation, while others are technical conveniences or involve concepts that 
have not yet been studied in the homotopical setting.

\index{type theory!intensional}%
\index{type theory!extensional}%
\index{intensional type theory}%
\index{extensional!type theory}%
Most significantly, the type theory described here is derived from the \emph{intensional} version of Martin-L\"{o}f's type 
theory~\cite{Martin-Lof-1973}, rather than the \emph{extensional} version~\cite{Martin-Lof-1979}.  Whereas the extensional theory makes no 
distinction between judgmental and propositional equality, the intensional theory regards judgmental equality as purely definitional, and 
admits a much broader, proof-relevant interpretation of the identity type that is central to the homotopy interpretation.  From the 
homotopical perspective, extensional type theory confines itself to homotopically discrete sets (see \cref{sec:basics-sets}), whereas the 
intensional theory admits types with higher-dimensional structure.  The \NuPRL system~\cite{constable+86nuprl-book} is extensional, whereas 
both \Coq~\cite{Coq} and \Agda~\cite{norell2007towards} are intensional.  Among intensional type theories, there are a number of variants 
that differ in the structure of identity proofs.  The most liberal interpretation, on which we rely here, admits a \emph{proof-relevant} 
interpretation of equality, whereas more restricted variants impose restrictions such as \emph{uniqueness of identity proofs 
  (UIP)}~\cite{Streicher93},
\indexsee{UIP}{uniqueness of identity proofs}%
\index{uniqueness!of identity proofs}%
stating that any two proofs of equality are judgmentally equal, and \emph{Axiom K}~\cite{Streicher93},
\index{axiom!Streicher's Axiom K}
stating that 
the only proof of equality is reflexivity (up to judgmental equality).  These additional requirements may be selectively imposed in the \Coq 
and \Agda\ systems.

%(In the terminology of \cref{cha:hlevels} such a type theory is about $0$-truncated types.)

Another point of variation among intensional theories is the strength of judgmental equality, particularly as regards objects of function type.  Here we include the uniqueness principle\index{uniqueness!principle} ($\eta$-conversion) $f \jdeq \lam{x} f(x)$, as a principle of judgmental equality.  This principle is used, for example, in \cref{sec:univalence-implies-funext}, to show that univalence implies propositional function extensionality.  Uniqueness principles are sometimes considered for other types.
For instance, the uniqueness principle\index{uniqueness!principle!for product types} for cartesian products would be a judgmental version of the propositional equality $\uppt$ which we constructed in \cref{sec:finite-product-types}, saying that $u \jdeq (\proj1(u),\proj2(u))$.
This and the corresponding version for dependent pairs would be reasonable choices (which we did not make), but we cannot include all such rules, because the corresponding uniqueness principle for identity types would trivialize all the higher homotopical structure.  So we are \emph{forced} to leave it out, and the question then becomes where to draw the line. With regards to inductive types, we discuss these points further in~\cref{sec:htpy-inductive}.

It is important for our purposes that (propositional) equality of functions is taken to be \emph{extensional} (in a different sense than that used above!).
This is not a consequence of the rules in this chapter; it will be expressed by \cref{axiom:funext}.
\index{function extensionality}%
This decision is significant for our purposes, because it specifies that equality of functions is as expected in mathematics.  Although we include \cref{axiom:funext} as an axiom, it may be derived from the univalence axiom and the uniqueness principle for functions\index{uniqueness!principle!for function types} (see \cref{sec:univalence-implies-funext}), as well as from the existence of an interval type (see \cref{thm:interval-funext}).

Regarding inductive types such as products, $\Sigma$-types, coproducts, natural numbers, and so on (see \cref{cha:induction}), there are additional choices regarding the formulation of induction and recursion.
\index{pattern matching}%
We have taken \emph{induction principles} as basic and \emph{pattern matching} as derived from them, but one may also do the other; see \cref{cha:rules}.
Usually in the latter case one allows also \emph{deep} pattern matching; see~\cite{Coquand92Pattern}.
There are several reasons for our choice.
One reason is that induction principles are what we obtain naturally in categorical semantics.
Another is that specifying the allowable kinds of (deep) pattern matching is quite tricky;
for instance, \Agda's
\index{proof!assistant!Agda@\textsc{Agda}}%
pattern matching can prove Axiom K by default,
\index{axiom!Streicher's Axiom K}%
although a flag \texttt{--without-K} prevents this~\cite{CDP14}.
Finally, deep pattern matching is not well-understood for \emph{higher} inductive types (see \cref{cha:hits}).
Therefore, we will only use pattern matches such as those described in \cref{sec:pattern-matching}, which are directly equivalent to the application of an induction principle.

\index{proof!assistant!Coq@\textsc{Coq}}%
Unlike the type theory of \Coq, we do not include a primitive type of propositions.  Instead, as discussed in \cref{sec:pat}, we embrace 
the \emph{propositions-as-types (PAT)} principle, identifying propositions with types.
This was suggested originally by de Bruijn~\cite{deBruijn-1973}, Howard~\cite{howard:pat}, Tait~\cite{Tait-1968}, and Martin-L\"{o}f~\cite{Martin-Lof-1972}.
(Our decision is explained more fully in \cref{subsec:pat?,subsec:hprops}.)

We do, however, include a full cumulative hierarchy of universes, so that the type formation and equality judgments become instances of the membership and equality judgments for a universe.
As a convenience, we regard objects of a universe as types, rather than as codes for types; in the terminology of \cite{martin-lof:bibliopolis}, this means we use ``Russell-style universes'' rather than ``Tarski-style universes''.
\index{type!universe!Tarski-style}%
\index{type!universe!Russell-style}%
An alternative would be to use Tarski-style universes, with an explicit coercion\index{coercion, universe-raising} function required to make an element $A:\UU$ of a universe into a type $\mathsf{El}(A)$, and just say that the coercion is omitted when working informally.

We also treat the universe hierarchy as cumulative, in that every type in $\UU_i$ is also in $\UU_j$ for each $j\geq i$.
There are different ways to implement cumulativity formally: the simplest is just to include a rule that if $A:\UU_i$ then $A:\UU_j$.
However, this has the annoying consequence that for a type family $B:A\to \UU_i$ we cannot conclude $B:A\to\UU_j$, although we can conclude $\lam{a} B(a) : A\to\UU_j$.
A more sophisticated approach that solves this problem is to introduce a judgmental subtyping relation $<:$ generated by $\UU_i<:\UU_j$, but this makes the type theory more complicated to study.
Another alternative would be to include an explicit coercion function $\uparrow : \UU_i \to \UU_j$, which could be omitted when working informally.

It is also not necessary that the universes be indexed by natural numbers and linearly ordered.
For some purposes, it is more appropriate to assume only that every universe is an element of some larger universe, together with a ``directedness'' property that any two universes are jointly contained in some larger one.
There are many other possible variations, such as including a universe ``$\UU_{\omega}$'' that contains all $\UU_i$ (or even higher ``large cardinal'' type universes), or by internalizing the hierarchy into a type family $\lam{i} \UU_i$.
The latter is in fact done in \Agda.

The path induction principle for identity types was formulated by Martin-L\"{o}f~\cite{Martin-Lof-1972}.
The based path induction rule in the setting of Martin-L\"of type theory is due to Paulin-Mohring \cite{Moh93}; it can be seen as an intensional generalization of the concept of ``pointwise functionality''\index{pointwise!functionality} for hypothetical judgments from \NuPRL~\cite[Section~8.1]{constable+86nuprl-book}.
The fact that Martin-L\"of's rule implies Paulin-Mohring's was proved by Streicher using Axiom K (see~\cref{sec:hedberg}), by Altenkirch and Goguen as in \cref{sec:identity-types}, and finally by Hofmann without universes (as in \cref{ex:pm-to-ml}); see~\cite[\S1.3 and Addendum]{Streicher93}.

\sectionExercises

\begin{ex}\label{ex:composition}
  Given functions $f:A\to B$ and $g:B\to C$, define their \define{composite}
  \indexdef{composition!of functions}%
  \indexdef{function!composition}%
 $g\circ f:A\to C$.
  \index{associativity!of function composition}%
  Show that we have $h \circ (g\circ f) \jdeq (h\circ g)\circ f$.
\end{ex}

\begin{ex}\label{ex:pr-to-rec}
  Derive the recursion principle for products $\rec{A\times B} $ using only the projections, and verify that the definitional equalities are valid.
  Do the same for $\Sigma$-types.
\end{ex}

\begin{ex}\label{ex:pr-to-ind}
  Derive the induction principle for products $\ind{A\times B}$, using only the projections and the propositional uniqueness principle $\uppt$.
  Verify that the definitional equalities are valid.
  Generalize $\uppt$ to $\Sigma$-types, and do the same for $\Sigma$-types.
  \emph{(This requires concepts from \cref{cha:basics}.)}
\end{ex}

\begin{ex}\label{ex:iterator}
\index{iterator!for natural numbers}
Assuming as given only the \emph{iterator} for natural numbers
\[\ite : \prd{C:\UU} C \to (C \to C) \to \nat \to C \]
with the defining equations
\begin{align*}
\ite(C,c_0,c_s,0)  &\defeq c_0, \\
\ite(C,c_0,c_s,\suc(n)) &\defeq c_s(\ite(C,c_0,c_s,n)),
\end{align*}
derive a function having the type of the recursor $\rec{\nat}$.
Show that the defining equations of the recursor hold propositionally for this function, using the induction principle for $\nat$.
\end{ex}

\begin{ex}\label{ex:sum-via-bool}
\index{type!coproduct}%
Show that if we define $A + B \defeq \sm{x:\bool} \rec{\bool}(\UU,A,B,x)$, then we can give a definition of $\ind{A+B}$ for which the definitional equalities stated in \cref{sec:coproduct-types} hold.
\end{ex}

\begin{ex}\label{ex:prod-via-bool}
\index{type!product}%
Show that if we define $A \times B \defeq \prd{x:\bool}\rec{\bool}(\UU,A,B,x)$, then we can give a definition of  $\ind{A\times B}$ for which the definitional equalities stated in \cref{sec:finite-product-types} hold propositionally (i.e.\ using equality types).
\emph{(This requires the function extensionality axiom, which is introduced in \cref{sec:compute-pi}.)}
\end{ex}

\begin{ex}\label{ex:pm-to-ml}
Give an alternative derivation of $\indidb{A}$ from $\indid{A}$ which avoids the use of universes.
  \emph{(This is easiest using concepts from later chapters.)}
\end{ex}

\begin{ex}\label{ex:nat-semiring}
  \index{multiplication!of natural numbers}%
  Define multiplication and exponentiation using $\rec{\nat}$.
  Verify that $(\nat,+,0,\times,1)$ is a semiring\index{semiring} using only $\ind{\nat}$.
  You will probably also need to use symmetry and transitivity of equality, \cref{lem:opp,lem:concat}.
\end{ex}

\begin{ex}\label{ex:fin}
  \index{finite!sets, family of}%
  Define the type family $\Fin : \nat \to \UU$ mentioned at the end of \cref{sec:universes}, and the dependent function $\fmax : \prd{n:\nat} \Fin(n+1)$ mentioned in \cref{sec:pi-types}.
\end{ex}

\begin{ex}\label{ex:ackermann}
  \indexdef{function!Ackermann}%
  \indexdef{Ackermann function}%
  Show that the Ackermann function $\ack : \nat \to \nat \to \nat$ is definable using only $\rec{\nat}$ satisfying the following equations:
  \begin{align*}
    \ack(0,n) &\jdeq \suc(n), \\
    \ack(\suc(m),0) &\jdeq \ack(m,1), \\
    \ack(\suc(m),\suc(n)) &\jdeq \ack(m,\ack(\suc(m),n)).
  \end{align*}
\end{ex}

\begin{ex}\label{ex:neg-ldn}
  Show that for any type $A$, we have $\neg\neg\neg A \to \neg A$.
\end{ex}

\begin{ex}\label{ex:tautologies}
  Using the propositions as types interpretation, derive the following tautologies.
  \begin{enumerate}
  \item If $A$, then (if $B$ then $A$).
  \item If $A$, then not (not $A$).
  \item If (not $A$ or not $B$), then not ($A$ and $B$).
  \end{enumerate}
\end{ex}

\begin{ex}\label{ex:not-not-lem}
  Using propositions-as-types, derive the double negation of the principle of excluded middle, i.e.\ prove \emph{not (not ($P$ or not $P$))}.
\end{ex}

\begin{ex}\label{ex:without-K}
  Why do the induction principles for identity types not allow us to construct a function $f: \prd{x:A}{p:\id{x}{x}} (\id{p}{\refl{x}})$ with the defining equation
  \[ f(x,\refl{x}) \defeq \refl{\refl{x}} \quad ?\]
\end{ex}

\begin{ex}\label{ex:subtFromPathInd}
  Show that indiscernability of identicals follows from path induction.  
\end{ex}

\begin{ex}\label{ex:add-nat-commutative}
  Show that addition of natural numbers is commutative: $\prd{i,j:\nat} (i+j=j+i)$.
\end{ex}

% Local Variables:
% TeX-master: "hott-online"
% End: