This is a typed up copy of my lecture notes from the combinatorics
seminar at KTH, 2010-09-01. This is not a perfect copy of what was said
at the seminar, rather a starting point from which the talk grew.
In some points, I've tried to fill in the most sketchy and
un-articulated points with some simile of what I ended up actually
saying.
Combinatorial species started out as a theory to deal with enumerative
combinatorics, by providing a toolset & calculus for formal power
series. (see Bergeron-Labelle-Leroux and Joyal)
As it turns out, not only is species useful for manipulating generating
functions, btu it provides this with a categorical approach that may be
transplanted into other areas.
For the benefit of the entire audience, I shall introduce some
definitions.
Definition: A category C is a collection of objects and arrows
with each arrow assigned a source and target object, such that
- Each object has its own identity arrow 1.
- Chains of arrows are associatively composable, with 1 the identity
of this composition.
Examples: Sets, Finite sets, k-Vector spaces, left and right
R-Modules, Graphs, Groups, Abelian groups.
[tex]\mathbb B[/tex]: finite sets with only bijections as morphisms.
Examples:
- Category of a monoid.
- Category generated by a graph
- Category of a group (groupoid version of the category of a monoid)
- Category of a poset
- Category of Haskell types and functions.
Definition: A functor F is a map of categories, in other words, a
pair of maps Fo on objects and Fa on arrows, such that the identity
arrow maps to identity arrows and compositions of maps map to
compositions of their images.
Examples:
Constant functor: sends every object to A, every map to 1.
Identity functor: sends objects and arrows to themselves.
Underlying set functor, free monoid/vector space/module/... functors.
We are now equipped to define Species:
Definition: A species of structures is a functor [tex]\mathbb
B\to\mathbb B[/tex].
The idea is a set gets mapped to the set of all structures labelled by
the elements in the original set. A bijection on labels maps to the
transport of structures along the relabeling.
Examples:
[tex]A[/tex]: rooted trees, labels on vertices
[tex]G[/tex]: simple graphs, labels on vertices
[tex]Gc[/tex]: connected simple graphs, labels on vertices.
[tex]a[/tex]: trees, labels on vertices.
[tex]D[/tex]: directed graphs, labels on vertices
[tex]\wp[/tex]: subsets, [tex]\wp[U] = \{S: S\subseteq
U\}[/tex].
[tex]End[/tex]: endofunctions
[tex]Inv[/tex]: involutions
[tex]S[/tex]: permutations
[tex]C[/tex]: cycles
[tex]L[/tex]: linear orders
[tex]E[/tex]: sets: [tex]E[U] = \{U\}[/tex]
[tex]e[/tex]: elements: [tex]e[U] = U[/tex]
[tex]X[/tex]: singletons: [tex]X[U] = U[/tex] if [tex]|U|=1[/tex]
and [tex]X[U] = \emptyset[/tex] otherwise
[tex]1[/tex]: empty set: [tex]1[\emptyset]=\{\emptyset\}[/tex],
[tex]1[U] = \emptyset[/tex] otherwise
[tex]0[/tex]: empty species: [tex]0[U] = \emptyset[/tex].
In enumerative combinatorics, the power of species resides in the
association to each species a number of generating functions:
The generating series of a species F is the exponential formal power
series
[tex]F(x) = \sum_{n=0}^\infty |F[n]|\frac{x^n}{n!}[/tex]
where we use the convention [tex][n] = \{1,2,\dots,n\}[/tex], and
[tex]F[n] = F[[n]][/tex].
Thus:
[tex]L(x) = \frac{1}{1-x}[/tex]
[tex]S(x) = \frac{1}{1-x}[/tex]
[tex]E(x) = e^x[/tex]
[tex]e(x) = xe^x[/tex]
[tex]\wp(x) = e^{2x}[/tex]
[tex]X(x) = x[/tex]
[tex]1(x) = 1[/tex]
[tex]0(x) = 0[/tex]
There are a number other in use for combinatorics, but for my purposes,
this is the one I'll focus on.
Operations on species
The real power, though, emerges when we start combining species, and
carry over the combinations to actions on the corresponding power
series.
Addition
The number of ways to form either, say, a graph or a linear order on a
set of labels is the sum of the numbers of ways to form either in
isolation.
This corresponds cleanly to the coproduct in Set, and we may write F+G
for the species
[tex](F+G)[U] = F[U] + G[U][/tex]
(where the second + is the coproduct — i.e. disjoint union — of sets)
In the power series, we get [tex](F+G)(x) = F(x) + G(x)[/tex].
Examples: We may define the species of all non-empty sets
[tex]E_+[/tex] by
[tex]E = 1 + E_+[/tex]
This kind of functional equations is where the theory of species
starts to really shine.
Multiplication
A tricoloring of a set is a subdivision of the set into three disjoint
subsets covering the original set. The number of tricolorings of size
n is
[tex]\sum_{i+j+k=n} \#\text{sets of size i}\cdot\#\text{sets
of size j}\cdot\#\text{sets of size k}[/tex]
A permutation fixes some set of points. The permutation restricted to
the non-fixed points is a derangement. Total number of permutations
on n elements is
[tex]\sum_{i+j=n}\#\text{sets of size
i}\cdot\#\text{derangements of size j}[/tex]
In both of these cases, the total generating series is a product of
the component series, and we end up defining
[tex]F\cdot G[U] = \{(f,g): f\in F[U_1], g\in G[U_2], U_1\cap
U_2 = \emptyset, U_1\cup U_2 = U\}[/tex]
So [tex]F\cdot G[U] = \sum_{U_1,U_2\text{ decompose }U}
F[U_1]\times G[U_2][/tex].
Thus, tricolorings are [tex]E\cdot E\cdot E[/tex] and permutations are
[tex]S = E\cdot Der[/tex], where the set is that of fixed points, and
the derangement captures the actual action of the permutation.
Composition
Endofunctions of sets decompose in their actions on the points as cycles
or directed trees leading in to these cycles.
Since a collection of disjoint cycles corresponds to a permutation, we
can consider such endofunctions to be permutations decorated with rooted
trees attached to points of the permutations; or even permutations of
rooted trees.
To form such a structure on a set U, we'd first partition U into
subsets, put the structure of a rooted tree on each subset, and then the
structure of a permutation on the set of these subsets.
Thus, the number of such structures on n elements is
[tex]\sum_{r\leq n}\sum_{\sum_{k=1}^n i_k = n}
\#\tex{permutations on [r]}\cdot\prod_{k=1}^r\#\text{rooted
trees on [i_k]}[/tex]
This corresponds to the power series [tex]S(A(x))[/tex], and we write,
in general, [tex]F\circ G[U][/tex] for the species of F-structures of
G-structures on subsets.
Examples:
A = X·E(A)
L = 1 + X·L
B = 1 + X·B·B
Pointing
Picking out a single point in a structure on n points can be done in
precisely n ways.
Thus the corresponding generating function will be
[tex]\sum n\cdot f_n\cdot\frac{x^n}{n!}[/tex]
for f-structures with a single label distinguished.
Since we're working with exponential power series, we may notice that
[tex]\frac{\partial}{\partial x} \sum f_n\frac{x^n}{n!} = \sum
f_{n+1}\frac{x^n}{n!}[/tex]
and thus that derivatives are shifts.
Furthermore, [tex]x\cdot\sum f_n\frac{x^n}{n!} = \sum
f_n\frac{x^{n+1}}{n!} = \sum f_{n-1}n\cdot{x^n}{n!}[/tex]
so that the generating function for F-structures with a single
distinguished are
[tex]F^\bullet(x) = x\cdot\frac{\partial}{\partial x}F(x)[/tex]
In species, this process is called pointing.
In functional programming, Conor McBride related this construction to
Huet's Zipper datatypes.
As it turns out, many of the constructions for species make eminent
sense outside the category [tex]\mathbb B[/tex]. In fact, species in
Hask are known to programming language researchers as container
datatypes and the whole calculus translates relatively cleanly.
Functional equations translate to the standard new data type definitions
in Haskell.
Examples:
L = 1 + X·L
`` data List a = Nil | Cons a (List a)``
B = 1 + X·B·B
`` data BinaryTree a = Leaf | Node (BinaryTree a) a (BinaryTree a)``
A = X·E(A)
`` data RootedTree a = Node a (Set (RootedTree a))``
usually, we simulate the Set here by a List. If we need for our
rooted trees to be planar, we can in fact impose a Linear Order
structure instead, and get something like
Ap = X·L(Ap)
The species interpretation of [tex]\frac{\partial}{\partial x}[/tex]
corresponding to leaving a hole in the structure carries over cleanly,
so that [tex]\frac{\partial}{\partial x}T[/tex] is the type of
T-with-a-hole.
Two derivatives of lists
We can deal with [tex]\frac{\partial}{\partial x}L[/tex] in two
different ways:
1.
DL = D(1+X·L) = D1 + D(X·L) = 0 + DX·L+X·DL
and thus
DL-X·DL = 1·L
so (1-X)·DL = L
and thus
DL = L·1/(1-X)
Now, from L=1+X·L follows by a similar cavalier use of subtraction and
division — which, by the way, in species theory is captured by the
idea of a virtual species, and dealt with relatively cleanly — that
L = 1+X·L
so
L-X·L = 1
and thus
(1-X)·L = 1
so
L = 1/(1-X)
Thus, we can conclude that
DL = L·1/(1-X) = L·L
and thus, a list with a hole is a pair of lists: the stuff before the
hole and the stuff after the hole.
2.
We could, instead of using implicit differentiation, as in 1, attack
the derivation we had of
L = 1/(1-X) = 1+X+X·X+X·X·X+…
Indeed,
DL = D(1/(1-X)) = D(1+X+X·X+…) = 0+1+2X+3X·X+…
which we can observe factors as
= (1+X+X·X+X·X·X+…)·(1+X+X·X+X·X·X+…) = L·L
Or we can just use the division rule
DL = D(1/(1-X)) = D(1/u)·D(1-X) = -1/(u·u)·(-1) = 1/[(1-X)·(1-X)] =
L·L
Gröbner bases for operads
All of this becomes relevant to the implementation of Buchberger's
algorithm on shuffle operads (see Dotsenko—Khoroshkin and
Dotsenko—Vejdemo-Johansson) in the step where the S-polynomials (and
thereby also the reductions) are defined. With a common multiple
defined, we need some way to extend the modifications that take the
initial term to the common multiple to the rest of that term.
For this, it turns out, that the derivative of the tree datatype used
provides theoretical guarantees that only partially filled in trees of
the right size, with holes the right size, can be introduced; and also
provides an easy and relatively efficient algorithm for contructing the
hole-y trees and later filling in the holes.