# Gröbner bases for Operads — or what I did on my vacation

Published: Fri 08 May 2009

This post gives a swift, breakneck introduction to Gröbner bases. It is not meant as a gentle introduction; rather, it leads up to announcing my latest research output.

Recall how you would run the Gaussian algorithm on a matrix. You'd take the topmost non-zero entry in the leftmost non-zero column, divide its row by that value, and then use that row to eliminate everything else in the corresponding column.

Once the matrix is in echelon form, we can do lots of things with it. Importantly, we can back-substitute along the leading terms to solve the corresponding system of equations.
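To make this concrete, here is a minimal sketch of that row reduction in Haskell — my own illustration over the rationals, not production code: find a pivot, scale its row to a leading 1, clear the pivot column from the other rows, and recurse on the remaining columns.

```haskell
type Row = [Rational]

-- Bring a rectangular matrix into row echelon form.
echelon :: [Row] -> [Row]
echelon [] = []
echelon rows
  | all null rows = rows                       -- no columns left
  | otherwise = case break ((/= 0) . head) rows of
      -- current column is all zeros: skip it and recurse
      (_, []) -> map (0 :) (echelon (map tail rows))
      (above, p : below) ->
        let pivot  = map (/ head p) p          -- scale pivot row to a leading 1
            elim r = zipWith (\a b -> a - head r * b) r pivot
        in pivot : map (0 :) (echelon (map (tail . elim) (above ++ below)))
```

Back-substitution then reads solutions off the leading 1s, exactly as described above.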

The starting point for the theory of Gröbner bases was the observation that the same method — with some modification — could be used to produce, from a bunch of polynomials, something that ends up being as useful as a row-reduced echelon form.

Basically, the central theorem-definition of Gröbner bases (due to Buchberger, who named them after his thesis advisor Gröbner) says that a Gröbner basis for an ideal in a polynomial ring is a set of generators for that ideal with the following property: if we take any element of the polynomial ring and divide it by the generators, one after another, until no more divisions can be performed, then the result is uniquely determined.

This need not be the case to begin with — dividing, with remainder, by a whole bunch of polynomials can exhibit weird loops and other bad behaviour, so that the remainder depends on the order in which we divide. Having a Gröbner basis means this is no longer a problem.

The important bit of the theorem is that we can GET a Gröbner basis by iteratively trying out combinations of the generators: we look for overlaps between their leading terms and use these to generate "bad examples". Any bad example that doesn't reduce completely to 0 under division by the generators is needed to complete the Gröbner basis; by adjoining it we grow our generating set, but not the ideal it generates. The method works by growing the generating set until every bad example built out of two of the generators really does reduce completely to 0.

And, the central theorem says, if THIS works, then reduction works in general, and we have a tool as good for solving systems of polynomial equations as the echelon form is for linear equation systems.
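To ground the commutative story, here is a naive end-to-end sketch in Haskell, written purely as an illustration — all names and representations are my own choices, not any standard library's. Monomials are exponent vectors under lex order, `normalForm` is the division procedure, `spoly` builds the "bad examples" from overlapping leading terms, and `buchberger` is the completion loop. A serious implementation would add pair-selection criteria and a pluggable monomial order.

```haskell
import qualified Data.Map.Strict as M
import Data.List (tails)

-- Monomials as exponent vectors (lex order = list order);
-- polynomials as maps from monomial to rational coefficient.
type Mono = [Int]
type Poly = M.Map Mono Rational

lt :: Poly -> (Mono, Rational)          -- leading term under lex
lt = M.findMax

divides :: Mono -> Mono -> Bool
divides u v = and (zipWith (<=) u v)

mdiv, mlcm :: Mono -> Mono -> Mono
mdiv = zipWith (-)
mlcm = zipWith max

padd :: Poly -> Poly -> Poly
padd p q = M.filter (/= 0) (M.unionWith (+) p q)

-- Multiply a polynomial by the single term c * x^m.
term :: Mono -> Rational -> Poly -> Poly
term m c = M.map (* c) . M.mapKeys (zipWith (+) m)

-- Divide p by the list gs: subtract off multiples of the generators
-- until no leading term of any g divides any term of p.
normalForm :: [Poly] -> Poly -> Poly
normalForm gs p
  | M.null p  = p
  | otherwise =
      let (mp, cp) = lt p
      in case [g | g <- gs, fst (lt g) `divides` mp] of
           []      -> M.insert mp cp (normalForm gs (M.delete mp p))
           (g : _) -> let (mg, cg) = lt g
                      in normalForm gs (padd p (term (mdiv mp mg) (negate (cp / cg)) g))

-- The "bad example" built from two generators: cancel their leading terms.
spoly :: Poly -> Poly -> Poly
spoly g h =
  let (mg, cg) = lt g; (mh, ch) = lt h; l = mlcm mg mh
  in padd (term (mdiv l mg) (1 / cg) g) (term (mdiv l mh) (negate (1 / ch)) h)

-- Buchberger's loop: keep adjoining non-zero reductions of S-polynomials.
buchberger :: [Poly] -> [Poly]
buchberger start = go start [(g, h) | (g : rest) <- tails start, h <- rest]
  where
    go basis [] = basis
    go basis ((g, h) : pairs) =
      let r = normalForm basis (spoly g h)
      in if M.null r
           then go basis pairs
           else go (basis ++ [r]) (pairs ++ [(b, r) | b <- basis])
```

On the generators x^2 - y and x*y - x (two variables, lex with x > y) this terminates after adjoining one extra element, and every element of the ideal then reduces to 0.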

## Losing commutativity

The next interesting step is to look at non-commutative polynomial rings. Here, a deep theoretical issue suddenly pops up. We know that the word problem for monoids — i.e. whether a given string can be reduced to the empty string by a set of rewriting rules — is unsolvable in general. It hooks up with Turing machines and the limits of theoretical computer science; in essence, we can encode problems as Gröbner basis computations for non-commutative algebras that we know cannot possibly be solved in finite time.
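The rewriting picture is easy to play with in a few lines of Haskell (a toy of my own, unrelated to any library): repeatedly replace the leftmost occurrence of any rule's left-hand side with its right-hand side. Nothing guarantees this loop halts for an arbitrary rule set — that is exactly the undecidability at play — but for well-behaved rules it computes a normal form.

```haskell
import Data.List (inits, tails, isPrefixOf)

-- One rewriting step: replace the leftmost occurrence of any
-- rule's left-hand side, or report that no rule applies.
step :: [(String, String)] -> String -> Maybe String
step rules s =
  case [ pre ++ rhs ++ drop (length lhs) suf
       | (pre, suf) <- zip (inits s) (tails s)
       , (lhs, rhs) <- rules
       , lhs `isPrefixOf` suf ] of
    []      -> Nothing
    (t : _) -> Just t

-- Rewrite until no rule applies; may not terminate in general.
reduce :: [(String, String)] -> String -> String
reduce rules s = maybe s (reduce rules) (step rules s)
```

With the rules aa → (empty) and bb → (empty), the string "abba" reduces all the way to the empty string, while "aba" is already a normal form.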

So we cannot hope for the situation to be as good as for commutative algebras. However, there is a theorem floating around - the Diamond lemma by Bergman - that says that if we DO get a Gröbner basis computation that halts in finite time, then all the good things that we had in the commutative case - such as reductions modulo the generators being well defined - hold for the things we've computed. In other words, the only thing that could go wrong would be that the computation of the Gröbner basis doesn't finish in finite time.

## Enter operads

The next step in generality is to move from algebras to operads. Here, an operad is a collection {O(n) : n ≥ 1} of vector spaces with permutation actions attached to them. Hence, for each n, there is a map Sn → Hom(O(n),O(n)); in other words, we can apply any permutation of n things to any element of the component O(n) and get something back out of it.

Furthermore, an operad O = {O(n)} comes equipped with structure operations. These behave like the composition of multilinear functions — so we have a way of plugging one function into another:
f(a1,a2,...,g(ai,...,aj),...,an)
and this composition is associative, so the order in which we carry out a sequence of compositions doesn't matter.
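To see what these structure operations do in a familiar setting, we can model an n-ary operation as a function on a list of n arguments; the helper below is a hypothetical sketch of mine, not anything from a library. Plugging g into slot i of f splits the argument list around that slot, and the associativity claim means that performing several such substitutions in different orders gives the same function.

```haskell
-- Plug the n-ary operation g into the i-th argument slot of f.
-- The result takes g's arguments in place of that one slot of f.
composeAt :: Int -> ([a] -> a) -> Int -> ([a] -> a) -> ([a] -> a)
composeAt i f n g xs =
  let (before, rest) = splitAt (i - 1) xs   -- arguments left of slot i
      (inner, after) = splitAt n rest       -- g's arguments, then the rest
  in f (before ++ [g inner] ++ after)
```

For instance, plugging sum into slot 1 and product into slot 3 of a binary f, in either order, yields the same function f(sum(x1,x2), product(x3,x4)).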

These gadgets show up in modern approaches to universal-algebra-style questions, and also all over the place in topology; some of the earliest instances came from homotopy theory.

Recently, Dotsenko and Khoroshkin released a paper on Gröbner bases for operads. In it, they figure out how to find the kind of canonical, ordered basis you need in order to mimic the whole Gröbner basis workflow above, and they specify how to produce a Diamond Lemma for operads. It turns out that Gröbner bases are useful for proving operads Koszul — something I won't discuss here, but which is important in the operad theory context.

Basically, instead of working in a polynomial ring on some set of variables, we start out with our variables in one of these graded sets {V(n)}. Then we build rooted trees whose internal vertices of degree n+1 (one root edge and n inputs) are labeled by elements from V(n).

The vector space spanned by all such trees forms an operad, where the composition operation works by grafting the root of one tree onto the leaf of another, at whichever numbered position we need to compose.
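As a toy model of the nonsymmetric (planar) part of this, here is a sketch with names of my own choosing — not the actual types or API of any package: trees with numbered leaves and labeled internal vertices, and grafting that substitutes a tree into a numbered leaf and then renumbers all leaves from left to right.

```haskell
-- Rooted planar trees: numbered leaves, labeled internal vertices.
data Tree = Leaf Int | Node String [Tree]
  deriving (Eq, Show)

-- Graft tree t into leaf number i of tree s, then renumber all
-- leaves left to right (the nonsymmetric composition convention).
graft :: Int -> Tree -> Tree -> Tree
graft i s t = renumber (subst s)
  where
    subst (Leaf j) | j == i    = t
                   | otherwise = Leaf j
    subst (Node lbl ts)        = Node lbl (map subst ts)

-- Relabel the leaves 1, 2, ... in left-to-right order.
renumber :: Tree -> Tree
renumber tr = fst (go 1 tr)
  where
    go n (Leaf _)      = (Leaf n, n + 1)
    go n (Node lbl ts) = let (ts', n') = goList n ts in (Node lbl ts', n')
    goList n []        = ([], n)
    goList n (u : us)  = let (u', n1)  = go n u
                             (us', n2) = goList n1 us
                         in (u' : us', n2)
```

With v a binary vertex labeled m2, graft 1 v v is the tree for m2(m2(1,2),3) and graft 2 v v the tree for m2(1,m2(2,3)) — the two sides of associativity that reappear in the session below.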

The resulting construction is the free operad on the generating set, and it takes the role that the polynomial ring had in the previous examples. What Dotsenko and Khoroshkin pointed out is that if we somewhat restrict the kinds of actions we allow from permutations on these trees, we end up with exactly one representative fitting the restricted context for each tree that occurs as a basis element of the free operad.

So we can impose some sort of ordering on these trees, and use them to mimic the Diamond lemma.

Indeed, what we end up doing is forming the overlaps between leading terms: we find trees into which parts of the leading-term trees of a pair of operad elements can embed, and use the resulting procedure to build the same kind of bad cases we need to test in Buchberger's algorithm.

And again, it turns out that while we may not always get an answer within finite time, if we're lucky then the answer we DO get has all the properties we could dream of. And not only that - all the previous kinds of Gröbner bases embed as special cases of doing it this way.

I heard of this, and got my hands on the paper, when I first arrived at CIRM in Luminy, outside Marseille, for two weeks of operad theory: a master's course and a conference on the subject. And I got so excited — once upon a time this was essentially my proposed PhD thesis project — that I decided to sit down and code the whole thing up right away!

And code away I did. Once the two weeks were over, with valuable help from Vladimir Dotsenko and Eric Hoffbeck, I had ended up with a working Haskell implementation of the whole paradigm. It's now available from HackageDB, and runs at least under GHC 6.8 and 6.10:

```
$ ghci -cpp Math.Operad
GHCi, version 6.10.1: http://www.haskell.org/ghc/  :? for help

*Math.Operad> let v = corolla 2 [1,2]
*Math.Operad> let g1t1 = nsCompose 1 v v
*Math.Operad> let g1t2 = nsCompose 2 v v
*Math.Operad> let g2t2 = shuffleCompose 1 [1,3,2] v v
*Math.Operad> let g1 = (oet g1t1) + (oet g1t2) :: FreeOperad Integer
*Math.Operad> let g2 = (oet g2t2) - (oet g1t2) :: FreeOperad Integer
(0.00 secs, 524996 bytes)
[
+1 % 1*m2(1,m2(2,m2(3,4))),

+1 % 1*m2(m2(1,2),3)
+1 % 1*m2(1,m2(2,3)),

+1 % 1*m2(m2(1,3),2)
+(-1) % 1*m2(1,m2(2,3))]
(0.41 secs, 55184352 bytes)
```
And we see here that with this particular setup — using m2 as our single variable, and picking as a generating set
m2(m2(x1,x2),x3) + m2(x1,m2(x2,x3))
and
m2(m2(x1,x3),x2) - m2(x1,m2(x2,x3))
— we get a Gröbner basis almost instantly, containing additionally the generator
m2(x1,m2(x2,m2(x3,x4)))