Let $T$ be a theory and let $C$ be a category. If $T$ is modelable in $C$, then there exists a category $T(C)$ of models of $T$ in $C$. If $T$ is again modelable in $T(C)$ then we can get a category of double models $T(T(C))$. Repeating this again produces a category of triple models $T(T(T(C)))$. This iterated internalisation process produces the $n$-fold models of $T$ in $C$. This can be used to study a wide variety of theories in categories.
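Writing $T^n(C)$ as a shorthand for the category of $n$-fold models, the whole process is captured by the recursion $T^0(C) = C$ and $T^{n+1}(C) = T(T^n(C))$.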
For example, we can repeatedly internalise the theory of preorders to categories to get $n$-fold preorders. However, the main things we will want to repeatedly internalise to categories are categories themselves. The iterated internalisations of categories in categories are $n$-fold categories. The double and triple categories we have dealt with previously are special types of $n$-fold categories. We call all $n$-fold categories with $n \gt 3$ multiple categories.
The morphisms of $n$-fold categories are $n$-fold functors. For example, the morphisms of single categories are simply single functors. The morphisms of double categories are double functors, and the morphisms of triple categories are triple functors. In general, a morphism of multiple categories is a multiple functor. For each $n$, the $n$-fold categories and $n$-fold functors together make up a category of $n$-fold categories.
In a previous post, we described double categories. It was mentioned that double categories are categories internal to the category of categories $Cat$. This process of internalising categories in categories can be repeated $n$ times to produce $n$-fold categories. Repeating this process three times produces triple categories.
Definition:
A triple category $C$ is a category internal to the category of double categories. A triple category has the following six components:
A double category of objects $Ob(C)$
A double category of arrows $Arrows(C)$
A source double functor $source: Arrows(C) \to Ob(C)$
A target double functor $target: Arrows(C) \to Ob(C)$
An identity double functor $identity: Ob(C) \to Arrows(C)$
A composition double functor $\circ: Arrows(C) \times_{Ob(C)} Arrows(C) \to Arrows(C)$
Alternatively, a triple category can be described in terms of eight different types of cells, with composition, source, target, and identity operations coming in different directions. So, for example, triple categories have vertical, horizontal, and transversal types of composition. This stands in contrast to ordinary categories, which only have composition defined in one direction.
Cells:
A triple category has eight different types of cells constructed from combining three different types of arrows in three different dimensions.
Objects
Arrows
Vertical arrows
Horizontal arrows
Transversal arrows
Squares
Vertical-Horizontal squares
Vertical-Transversal squares
Horizontal-Transversal squares
Cubes
A triple category $C$ has three different double categories of squares and three different categories of arrows associated to it.
To symbolise the synthesis of functional programming and double categories we will need a new logo. The functional programming paradigm is often symbolised by the Greek letter lambda. On the other hand, double categories are symbolised by the blackboard font letters at the start of their names. To indicate the use of double categories in computer programming languages and their link to the study of functional mappings, we will use a blackboard font lambda.
Perhaps the most fundamental idea in category theory is internalisation. This lets you take some concept familiar from classical mathematics and set theory, and then turn it into an internal concept in some category. With a sufficiently advanced category, with a powerful enough internal language like a topos, you can really remake all mathematics internal to that category. This creates a different perspective on everything.
A particular example of internalisation is that of a double category. This is a category internal to the category of categories $Cat$. These double categories are the first in a line of different types of $n$-fold categories. Looking back at it now, I believe these $n$-fold categories, categories internal to the category of categories, are more fundamental than bicategories, because internalisation is more interesting and fundamental than enrichment.
I want to escape from set-theoretic preconceptions which are still so ubiquitous even as category theory gains traction. The only way I think that you can do that is through internalisation, and the concept of mathematics internal to some category other than $Sets$. If you just go with enrichment, then you have all the assumptions about $Set$-categories packed in; you are just adding more structure to the $Set$-category without doing anything really different.
Perhaps you could then do enriched category theory internal to a given category, i.e. internal enriched category theory. That would resolve all these issues. But the point is that this internalisation notion is more fundamental, and you want a bit of it even if you do enriched category theory. This gives me the impression that double categories are more fundamental than $Cat$-enriched categories.
The perception that double categories are more fundamental than 2-categories is formalised by the fact that any 2-category can be converted into a double category. This can be achieved by defining the horizontal or the vertical morphisms of the double category to be identities. So an advantage of a double-categorical framework is that it can subsume 2-categories as a special case.
Double categories:
A category internal to the category of categories $Cat$ is a structure $C$ that has six components:
A category of objects $Ob(C)$
A category of morphisms $Arrows(C)$
A source functor $S: Arrows(C) \to Ob(C)$
A target functor $T: Arrows(C) \to Ob(C)$
An identity functor $id: Ob(C) \to Arrows(C)$
A composition functor $\circ: Arrows(C) \times_{Ob(C)} Arrows(C) \to Arrows(C)$.
The key takeaway is that for any category $C$, the category of its commutative squares $Sq(C)$ forms a double category. This formalizes the fact that commutative squares can be composed either horizontally or vertically, which is a very important part of the theory of commutative squares.
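To see concretely how the two compositions work, here is a minimal sketch in Clojure (my own hypothetical representation, not a standard library): a commutative square of plain functions is a map with :top, :bottom, :left, and :right entries satisfying $bottom \circ left = right \circ top$, and squares are pasted by composing the parallel sides.
; paste sq2 onto the right of sq1; assumes (:right sq1) = (:left sq2)
(defn paste-horizontally [sq1 sq2]
  {:top    (comp (:top sq2) (:top sq1))
   :bottom (comp (:bottom sq2) (:bottom sq1))
   :left   (:left sq1)
   :right  (:right sq2)})
; paste sq2 onto the bottom of sq1; assumes (:bottom sq1) = (:top sq2)
(defn paste-vertically [sq1 sq2]
  {:top    (:top sq1)
   :bottom (:bottom sq2)
   :left   (comp (:left sq2) (:left sq1))
   :right  (comp (:right sq2) (:right sq1))})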
An internal quiver in a category $C$ is just a diagram $F: T_2^* \to C$. These diagrams consist of a pair of parallel arrows in the category.
Then if we are given such an internal quiver, we can always form an equalizer of its two arrows to get a subobject of the arrow object $A$.
The problem is that this subobject $S$ is not going to contain all the elements we want for a composition domain. For a category $C$, for example, it consists of all endomorphisms instead of all composable pairs of arrows. Instead we have to work with pullbacks, and take the pullback of the source and target arrows. This is why internal categories are always defined over categories that have all pullbacks.
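In the concrete case of $Sets$, this pullback can be computed directly. Here is a minimal Clojure sketch (the names are my own) enumerating the composable pairs for a finite collection of arrows:
; the pullback A ×_O A in Sets: all pairs [g f] with (source g) = (target f)
(defn composable-pairs [arrows source target]
  (for [g arrows, f arrows
        :when (= (source g) (target f))]
    [g f]))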
Then this pullback has the property that it is the equalizer of the first and second arrows as they factor through the product object $A^2$. This makes $R = A \times_O A$ into an internal relation in the category $C$, embedded in the object $A^2$. If we take a look at the above diagram, it's not hard to see that the first and second projection arrows can be made to factor through $A \times_O A$, turning them into a ternary quiver consisting of three parallel arrows going to $A$. This leads to a diagram that looks like this.
Then diagrams of this type are indexed by a category $CQ$. The commutativity conditions of this index category can be used to encode the laws specifying the source and target of composite morphisms. In particular, we can ensure that the source of the composite is the source of the second morphism and the target of the composite is the target of the first morphism. All of that can be encoded in the index category. We then have the idea of a compositional quiver, which is a functor over this index category $CQ$.
The importance of this functorial characterization is that we can form a functor category $C^{CQ}$. The problem, however, is that internal compositional quivers need not have nice input domains. This is why, to have an internal magmoid, we must have a compositional quiver diagram with the property that the composition domain is the pullback $A \times_O A$.
In other words, this means that in the compositional quiver diagram, $A \times_O A$ together with the two projection morphisms forms a universal limiting cone for the pullback. This leads me to an interesting idea: functors with the property that certain subcones within the functor, described as subdiagrams, are universal. This could be used, for example, to characterize when an object in a diagram is a product or a coproduct. Something like this would make for an interesting data structure for those interested in implementing category theory on the machine.
To form an internal category is a little bit more involved. We can always add identities to an internal magmoid diagram, by adding another identity morphism going from objects to arrows. To make that unital magmoid into a category also requires that we satisfy associativity and the unit laws. Rest assured, however, that internal categories can be formed in $Sets^{\to}$ because it has all pullbacks, and if there is anything interesting to be said there I will be saying it eventually. For now these are just some of my thoughts on internal magmoids and related notions.
A key takeaway is that everything should be represented by a functor. Natural transformations, for example, are simply functors from a category into an arrow category, so they also have nicely defined functorial semantics. If we take categories to be something like functors or diagrams internal to some category, that is a step forward too. While internal categories could be just some collection of objects and arrows satisfying some axioms, if we can make them into special types of functors that is most convenient. Functors are the ideal structures we should work towards dealing with.
This blog has been around for a while now, and as time has gone on some of my opinions and views have changed. One thing that has remained the same is that I have always had a mind focused on logic. Everything to do with logic interests me. I don't know why this is; some of us just have a mind focused on logic, I suppose.
As a logician, I have tried to seek out a wide variety of tools of logical reasoning: logic programming, Horn clauses, set theory, predicate logic, ontologies, lattices and formal concept analysis, categorical logic and topoi, etc. As I have done all this research and thinking about logic, I've felt there is a mismatch between functional programming and the techniques I use. My primary problem, then, has been: how do we solve this mismatch between functional programming and logic?
Problem: functional programming is not logical enough
Solution 1: integrate functional and logic programming into a single language
Solution 2: embrace functional programming but make it more logical by developing a new logic of functions
When I first encountered topos theory, I thought that solution 1 would be the way to go. We would simply take existing logical and functional programming techniques and use topoi as a unifying bridge between them. The result would be a unified multi-paradigm declarative language with topos theory as its core.
The idea of unifying functional and logical programming is rather obvious, and it fits with the common practice of multi-paradigm programming. The problem with the approach of throwing everything and the kitchen sink into a language is that it's indecisive. The multi-paradigm approach is too fence-sitting and fails to commit to one thing or another.
It occurred to me in developing the toposic structure theory of computations [2] that we can just develop new logical techniques for handling functions and get our 'logic fix' that way, all without leaving functional programming for another paradigm. Practical work in modern logic uses categorical logic and topos theory. Therefore, this new logic of functional programming will have to be developed using categorical logic and the topos of functions.
What should the future trend in declarative programming be? It should be towards more and better functional programming. We can make things more functional by developing them in the functional languages (Clojure, F#, Haskell, etc.), and we can make these existing functional languages better. Instead of the multi-paradigm concept, functional programming should come first.
A trend toward functional programming has emerged in the last fifteen years. Functional programming languages have seen greater adoption, and several functional features have finally started to make their way into mainstream languages. Lambda expressions have been added to C++ (with C++11), Java (with Java 8), and C# (with version 3.0).
I rode on the wave of this rise in functional programming. When functional programming started becoming mainstream, I decided to join in. As I went through this personal journey, there have been various problems I have sought to solve along the way and ideas that I have sought to make sense of.
Functional modification:
I made my own journey from imperative languages, primarily JavaScript, to more functional ones. One of the first things we are all faced with when making the transition to functional programming is making sense of immutability and functional purity. How is it possible to program if we cannot change anything?
As it turns out, changing things in functional languages is possible. You are just encouraged to do so with pure functions. A case in point is the assoc function in Clojure. This pure function takes in a composite structure and returns a changed version without affecting the original structure.
The assoc function is generalized to handle various composite data structures, all within a functional framework. So we can use assoc to modify the parts of a map:
(def point {:x 1, :y 2})
(assoc point :x 10) ;=> {:x 10, :y 2}
In addition, the assoc-in function can be used to make modifications deep within the nested parts of a composite data structure. For example, it can be used to modify the values within a matrix, as shown below.
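For instance, representing a matrix as a vector of row vectors, assoc-in changes a single entry by following a path of indices:
; change the entry at row 0, column 1 of a nested vector
(assoc-in [[1 2] [3 4]] [0 1] 10) ;=> [[1 10] [3 4]]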
So it would seem that there is a way of modifying things in functional languages. It is just that the modifications take on a different form. This leads us to functional references, which help us to make sense of functional modifications.
Functional references:
Traditional languages like C++, Java, C#, and JavaScript provide generalized imperative references in the form of lvalues (locator values). An lvalue is, roughly, anything that can occur on the left-hand side of an assignment; the right-hand side contains what the lvalue is to be turned into. When you move over to the functional world, you get generalized functional references, also known as lenses.
; the basic idea is that a lens can be defined by combining a pure
; getter function and a pure setter function.
(deftype Lens [getter setter]
  clojure.lang.IFn
  ; invoking a lens as a function applies its getter
  (invoke [this arg] (getter arg))
  (applyTo [this args] (clojure.lang.AFn/applyToHelper this args)))
Lenses implement clojure.lang.IFn, so they can be treated a lot like functions. However, we need to define special functions that wrap their setters to make further use of them.
(defn modify
"Set the part of object at lens to val."
[lens obj val]
(let [setter (.-setter lens)]
(setter obj val)))
(defn zap
"Apply the function func to the part of the obj at lens."
[lens obj func]
(let [setter (.-setter lens)]
(setter obj (func (lens obj)))))
The Clojure get and get-in functions are the counterparts of assoc and assoc-in. We can combine get and assoc to form lenses.
; an example lens which describes the first part of a tuple
(def first-lens
(Lens.
; get the first part of a tuple
(fn [arg]
(get arg 0))
; set the first part of a tuple
(fn [arg v]
(assoc arg 0 v))))
; an example lens which describes the second part of a tuple
(def second-lens
(Lens.
; get the second part of a tuple
(fn [arg]
(get arg 1))
; set the second part of a tuple
(fn [arg v]
(assoc arg 1 v))))
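With the tuple lenses in hand, modify and zap behave as expected:
; usage examples
(first-lens [1 2])           ;=> 1
(modify first-lens [1 2] 10) ;=> [10 2]
(zap second-lens [1 2] inc)  ;=> [1 3]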
A fundamental property of references in imperative and functional languages is that they can be arbitrarily nested. In the functional world, this nesting is defined compositionally.
(defn ^Lens compose-lenses
"combine two lenses to get their composition"
[^Lens f, ^Lens g]
(Lens.
(comp (.-getter f) (.-getter g))
(let [set-f (.-setter f)
set-g (.-setter g)]
(fn [a v]
(set-g a (set-f (g a) v))))))
By using lens composition, we can recapture the nesting provided by Clojure's get-in and assoc-in functions within a much more extensible framework. With lenses, we have now translated references from an imperative context to a functional one.
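For example, composing our two tuple lenses yields a reference to the first entry of the second entry of a nested tuple:
; drill into the first entry of the second entry
(def first-of-second (compose-lenses first-lens second-lens))
(first-of-second [[1 2] [3 4]])           ;=> 3
(modify first-of-second [[1 2] [3 4]] 99) ;=> [[1 2] [99 4]]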
Bijections are lenses:
Another key fact is that bijections are lenses. In particular, we can form a lens from a function and its inverse. This basic fact is implemented in the code below.
(defn bijection-lens
"get a lens from a function and its inverse."
[func inv-func]
(Lens.
func
(fn [arg v]
(inv-func v))))
A most obvious example is the bijection between booleans like true and false and bits like zero and one. It is customary to treat false as zero and true as one, so we can use this to define a bijection, as in the sketch below.
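Here is a minimal sketch of that bijection packaged as a lens (the names are mine):
; a lens viewing a bit as a boolean, built from a bijection
(def bit->boolean-lens
  (bijection-lens
    (fn [bit] (= bit 1))        ; 0/1 to false/true
    (fn [bool] (if bool 1 0)))) ; false/true back to 0/1
; composing with first-lens views the first bit of a pair as a boolean
((compose-lenses bit->boolean-lens first-lens) [0 1])             ;=> false
(modify (compose-lenses bit->boolean-lens first-lens) [0 1] true) ;=> [1 1]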
Recalling that we earlier demonstrated that lenses are compositional, any lens can now be composed with a bijection to get a different lens. A lens now has two different components: the information it refers to and its representation of that information. There is a single notion of the first part of a pair of numbers, but there are potentially infinite representations of that same first part.
Algebraic theory of information:
The fact that lenses can be composed with bijections suggests a dichotomy between information and its accessors.
Information: a location within a larger structure
Accessor: the way that a given piece of information is viewed
The first part of a tuple in $X^2$ is a unique piece of information, but it may have many different representations. Each composition with a bijection produces a different representation. With so many bijections available producing different representations, it is important that we consider lenses and information as different.
There is a further important property. We often find that some reference has more information than another, or that one memory location has more data than another. So, for example, consider the class of 2x2 matrices. These matrices have a first row and a first column, and both of them contain the $(0,0)$ entry. So the $(0,0)$ entry is an information substructure of both the first row and the first column.
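In lens terms, this containment can be seen by drilling down: composing a first-row lens (a hypothetical lens of my own) with first-lens reaches the $(0,0)$ entry, so the first row determines it.
; a lens for the first row of a matrix (a vector of row vectors)
(def first-row-lens
  (Lens.
    (fn [m] (get m 0))
    (fn [m row] (assoc m 0 row))))
; the (0,0) entry sits inside the first row
(def entry-00-lens (compose-lenses first-lens first-row-lens))
(entry-00-lens [[1 2] [3 4]]) ;=> 1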
We can think of a lens as a way of drilling down into part of a structure, and some lenses drill down deeper into a structure. The depth that a lens goes down to is part of the theory of information orders, which is a kind of logic. This motivates our further study of the logic of functional programming.
The varied behaviour of functions:
Even with this groundwork laid by functional references we don't really know how general functions behave. With all this work with lenses, all we can do is translate concepts from imperative programming (like modification and references) to functional programming. But we still don't have a way of describing how functions behave.
To explain why we haven't gotten that far yet, we should look at some examples. If we have a quadruple in the set $X^4$ then we can say that it has four different lenses associated to each of its indices. Each lens refers to a different part of the quadruple. We can define a permutation $f(a,b,c,d) = (b,a,d,c)$ which swaps the first and second pairs of values.
(defn permute-quadruple
"Swap the first value with the second and the third with
the fourth"
[[a b c d]]
[b a d c])
(permute-quadruple [10, 20, 30, 40]) ;=> [20, 10, 40, 30]
This is not simply the modification of some part of a structure. This permutation moves data from one lens to another. There seem to be motions and subcomputations occurring inside of our pure functions! What are these? How do you explain them? This will need to be the subject of further investigation.
(defn transform-quadruple
"A transformation of a quadruple which is not a permutation."
[[a b c d]]
[c a a c])
(transform-quadruple [10, 20, 30, 40]) ;=> [30, 10, 10, 30]
Aside from moving values around from place to place, a function can modify them. For example, you have things like map which lets you take a function and apply it to all parts of a structure.
(defn increment-triple
"Increment all parts of a triple"
[[a b c]]
(vec (map inc [a b c])))
(increment-triple [1 2 3]) ;=> [2 3 4]
One further example uses combiners like + and * to take multiple different places and combine them into a single place in the result.
(defn sum-product-pair
"Get the sum of the first two elements and the product of the
next two."
[[a b c d]]
[(+ a b) (* c d)])
(sum-product-pair [10 20 30 40]) ;=> [30 1200]
These are some of the many examples of things we can do with functions. A few more examples come from linear algebra:
When you transpose a matrix, each entry in the output matrix is determined by an entry in the input matrix (see the sketch after this list).
If you take two matrices and multiply them then each entry of the output matrix is determined by a row in the first matrix and a column in the second.
When you add two matrices, each entry in the output matrix is determined by an entry in the first matrix and an entry in the second matrix.
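As a quick illustration of the first point, here is a transpose in Clojure; the entry at row $i$, column $j$ of the output is exactly the entry at row $j$, column $i$ of the input:
; transpose: output entry (i,j) is the input entry (j,i)
(defn transpose [matrix]
  (vec (apply map vector matrix)))
(transpose [[1 2] [3 4]]) ;=> [[1 3] [2 4]]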
Examples abound throughout mathematics. This concept of information dependence appears everywhere. Once we have a concept of information and of functional references, it is desirable to have a logical theory of how functions behave around these references: how they move information around. It follows that there is still more to be done.
Describing the transport of data:
The varied behaviours of functions that we are seeking to explain all have a common description. They all transport data from one place to another using some intermediate function. Permutations move data from one place to another without changing it along the way. They are purely a rearrangement of data. The map function moves data from a location back to itself, but changes it along the way.
Commutative square diagrams can be used to explain all these varied behaviours of functions that we have been seeing. In order to see this, it might be helpful to explain commutative square diagrams using exceptionally descriptive variable names. While we could describe each of the four components by single-letter variables f, g, h, and i, perhaps the approach I am about to propose will be more suggestive.
We can say that a commutative square diagram for a given function has, in addition to that function, three additional components: an input function, an output function, and a transport function. Intuitively, the commutative square diagram says that the movement of data from the input place to the output place occurs along the given transport function. This is how we will describe the movement of data from one place to another.
With this fundamental formalism we now have the means we need to describe the behavior of functions, such as how data is moved from one place to another. If we look back at the examples we were exploring before, like permutations, transformations, matrix transpositions, the incrementation of triples, arithmetic functions, etc they all have an explanation for their behavior using this fundamental formalism.
Take the function permute-quadruple. It moves the first index to the second index and vice versa, and the third index to the fourth index and vice versa. In each case data is moved from one place to another along the identity function. In the case of increment-triple, each index is moved back to itself along the increment function inc. In these examples, the motion of data from one place to another by a function is described by a commutative square diagram.
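To make this concrete, here is a minimal sketch (commutes? is a hypothetical helper of my own) that tests the commutativity condition $output \circ function = transport \circ input$ on sample arguments, with places represented as getter functions:
; test output ∘ function = transport ∘ input on sample arguments
(defn commutes? [function input-place output-place transport samples]
  (every? (fn [x]
            (= (output-place (function x))
               (transport (input-place x))))
          samples))
; permute-quadruple moves the first place to the second along the identity
(commutes? permute-quadruple #(get % 0) #(get % 1) identity
           [[10 20 30 40]]) ;=> true
; increment-triple moves each place back to itself along inc
(commutes? increment-triple #(get % 0) #(get % 0) inc
           [[1 2 3]]) ;=> true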
When we introduced lenses and functional references we made a very big step forward, because we had a general formalism for describing functional modifications, yet the story could not end there. Introducing these functional references left us open to the question of how various kinds of functions move around the data these lenses point to. With commutative square diagrams, we have the means we need to explain all the behaviour we have been seeing.
Information flows:
What we have described so far is how data can be moved from an input place to an output place along some transport function. But what can be said of data migrations without a transport function? What happens if we just want to talk about input and output locations by themselves?
Then the formalism we get is a pair $(I,O)$ of an input location and an output location. These location pairs, or partition pairs, describe how ignorance spreads, or how information flows, under the operation of a function. The idea of exploring such pairs with a view towards studying how information flows was first put forward by Hartmanis and Stearns [1].
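In $Sets$, the condition on a partition pair $(P,Q)$ for a function $f$ is that $P$-equivalent inputs always have $Q$-equivalent outputs. Here is a minimal Clojure sketch of that condition (my own hypothetical representation, with partitions given as collections of sets):
; (P, Q) is a partition pair for f when every P-block lands in one Q-block
(defn partition-pair? [f p-blocks q-blocks]
  (let [block-of (fn [blocks x]
                   (some #(when (contains? % x) %) blocks))]
    (every? (fn [block]
              (apply = (map #(block-of q-blocks (f %)) block)))
            p-blocks)))
; parity is preserved by adding two
(partition-pair? #(+ % 2)
                 [#{0 2 4} #{1 3 5}]
                 [#{0 2 4 6 8} #{1 3 5 7 9}]) ;=> true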
By studying these partition pairs in the context of commutative square diagrams, we can make them part of a larger framework that includes a theory of the transport of information. At the same time, these information flows help us to describe how these commutative square diagrams function.
In a recent paper of mine, I study these information flows in terms of the commutative diagrams they produce. It is noted, for example, that for any function $f: A \to B$ the partition pairs of $f$ form a lattice, and the properties of these lattices are described. We use these results to form a new kind of structure theory.
The topos of functions:
If we take two commutative square diagrams where the transport of one is the function of the other, then the two diagrams can be composed. The composition of the two square diagrams occurs componentwise along their input and output functions. The result is a new square diagram. If we take this to be our composition law, the result is a category: the category $Sets^{\to}$. If we look a little deeper, we find there is something very special about this category: it is a topos.
The fact that it is a topos gives us a lot. This means that $Sets^{\to}$ has all small limits and colimits, products, coproducts, epi-mono factorisations, a subobject classifier, an object of truth values, logical connectives, an internal logic, exponentiation, power objects, subobject lattices, etc. Everything you can do in a topos you can do in $Sets^{\to}$. This makes $Sets^{\to}$ a broadly interesting subject of study in its own right, with some interesting properties of its own, like the failure of the law of excluded middle to hold.
The key thing for us is how $Sets^{\to}$ can help us all better understand functional programming. Its central concept, the commutative square diagram, describes how data moves around in functions. The commutative square can be described by a function, an input place, an output place, and a transport function that describes how inputs are mapped to outputs. As is true for any form of categorical logic, the dual logical theories of $Sets^{\to}$ are emergent properties of these commutative square diagrams.
Functional lenses let you focus in on some particular piece of a complex data structure. Commutative square diagrams, on the other hand, let you focus on some particular piece of a function. The result is the structure theory of functions. With this structure theory, we can now reason logically about the behaviour of functional programs.
Feedback:
How can we further improve the structure theory of functional programs? Write to me at jhuni.v2@gmail.com or leave a comment in the comment section below.
References:
[1] Hartmanis, J., & Stearns, R. E. (1966). Algebraic structure theory of sequential machines.
[2] Bernier, J. (2023). Toposic structure theory of computations.
[3] Goldblatt, R. (2014). Topoi: The categorial analysis of logic. Elsevier Science.
[4] Johnstone, P. T. (2008). Sketches of an Elephant: A topos theory compendium. Oxford University Press.
[5] Mac Lane, S., & Moerdijk, I. (2010). Sheaves in geometry and logic: A first introduction to topos theory. Springer.
[6] Barr, M., & Wells, C. (1985). Toposes, triples and theories. Springer-Verlag.
[7] McLarty, C. (2005). Elementary categories, elementary toposes. Clarendon Press.
[8] Bell, J. L. (2008). Toposes and local set theories: An introduction. Dover Publications.
[9] Moerdijk, I. (2008). Classifying spaces and classifying topoi. Springer.
[10] Caramello, O. (2018). Theories, sites, toposes: Relating and studying mathematical theories through topos-theoretic 'Bridges'. Oxford University Press.