This blogpost is inspired to a large part by comments made by Rob
Ghrist, in connection to his talks on using the Euler characteristic
integration theory to count targets detected by sensor networks.
He pointed out that the underlying principle inducing the rule
[tex]\chi(A\cup B) = \chi(A)+\chi(B)-\chi(A\cap B)[/tex]
goes under many names, among those \emph{Inclusion-Exclusion},
favoured among computer scientists (and combinatoricists). He also
pointed out that the origin of this principle is the Mayer-Vietoris
long exact sequence
[tex]\cdots\to H_{n}(A\cap B)\to H_{n}(A)\oplus H_{n}(B)\to
H_{n}(A\cup b)\to\cdots[/tex]
In this blog post, I'd like to give more meat to this assertion as well
as point out how the general principle of Inclusion-Exclusion for finite
sets follows immediately from Mayer-Vietoris.
Inclusion-Exclusion, and the passage from two sets to many
The basic principle of Inclusion-Exclusion says that if we have two
sets, [tex]A[/tex] and [tex]B[/tex], then the following relationship
of cardinalities holds:
[tex]|A\cup B| = |A| + |B| - |A\cap B[/tex]
A first proof would be performed by referring to an appropriate Venn
diagram, or by pointing out that if we try to count all the elements
of [tex]A\cup B[/tex] by adding the counts for [tex]A[/tex] and
[tex]B[/tex], then we've overcounted the elements of [tex]A\cap
B[/tex], and thus need to subtract these.
Now, this is good and convincing for the situation where we have just
two sets, but what if we had a whole family? What if we had
[tex]A_{1},\dots,A_{n}[/tex]? Well, our old friend the complete
induction rides to our rescue. Suppose we had already proven that for
[tex]n-1[/tex] sets [tex]A_{i}[/tex], the following formula holds:
[tex]\left|\bigcup_{i\in\{1,\dots,n-1\}} A_{i}\right| =
\sum_{k=1}^{n-1} (-1)^{1+k}\sum_{S\subseteq\{1,\dots,n-1\},
|S|=k}\left|\bigcap_{s\in S} A_{s}\right|[/tex]
If you think this formula is imposing, let me show you the first few
ways it works out, so that you can get a feeling for it:
[tex]|A_{1}\cup A_{2}| = |A_{1}| + |A_{2}| - |A_{1}\cap
A_{2}[/tex]
[tex]
\begin{multline*}
|A_{1}\cup A_{2}\cup A_{3}| = \\
|A_{1}|+|A_{2}|+|A_{3}|-|A_{1}\cap A_{2}|-|A_{1}\cap
A_{3}|-|A_{2}\cap A_{3}|+\\
|A_{1}\cap A_{2}\cap A_{3}|
\end{multline*}[/tex]
Here it starts becoming apparent why it is called the
Inclusion-Exclusion formula: we include all elements from all sets, but
then we've gotten too many elements from the pairwise intersections, so
we exclude those, but then we've gotten too few elements from the triple
intersections, so we re-include those, but then we've gotten too few
elements from the quadruple intersections - and on it goes.
If our assumption holds, we can consider what the appropriate size of
the union of [tex]n[/tex] different sets should be. We can pick one set,
say [tex]A_{n}[/tex], and treat it separately. Then, certainly, the
union of the rest has size given by the formula, by our induction
assumption. So, consider what happens when we add in the elements in
[tex]A_{n}[/tex]. We then overcount all elements in any of the
intersections [tex]A_{i}\cap A_{n}[/tex], so we have to subtract
those. But we then have undercounted all elements in any intersection on
the form [tex]A_{i}\cap A_{j}\cap A_{n}[/tex], so we need to add
those back in. And so we go on, and for each intersection that occurred
in the formula for [tex]n-1[/tex] sets, we find ourselves forced to add
one new term, of the opposite sign, that counts the intersection of that
old term with [tex]A_{n}[/tex].
Thus, the new formula takes on the exact guise of the general formula we
wanted to prove, since the terms involving [tex]A_{n}[/tex] are
precisely indexed, in that formula, by the terms excluding
[tex]A_{n}[/tex], but with opposite signs.
Encoding set sizes homologically
So, now that we know that pairwise Inclusion-Exclusion implies the
general Inclusion-Exclusion, let us take a look on how we can express
Inclusion-Exclusion with homology. We note first of all that
[tex]H_{0}(X)[/tex] counts the number of path-connected components of a
space [tex]X[/tex]. This follows easily from a few properties. First, a
basis for the [tex]0[/tex]-chains of [tex]X[/tex] is given by the points
in [tex]X[/tex]. Second, two [tex]0[/tex]-chains are homologous
precisely if there is a [tex]1[/tex]-chain whose boundary is the
difference between those [tex]0[/tex]-chains. The boundary of a path is
the difference of its endpoints, so we end up with two basis elements
being homologous precisely if there is a path connecting them.
Thus, two [tex]0[/tex]-chain elements are homologous precisely if
they're in the same path-component of [tex]X[/tex]. So, the homology
classes of [tex]0[/tex]-chains are precisely the path-connected
components.
So, the [tex]0[/tex]-degree homology classes count path-components. How
do we use that? Well ... if we consider a finite set [tex]X[/tex] with
the discrete topology - i.e. open sets are single points and any unions
of single point sets, then a path is a continuous function
[tex][0,1]\to X[/tex], i.e. one where the preimage of any open set is
open. Thus, the preimage of any point is an open set. Suppose a path
contains more than one point. Then we can partition the interval
[tex][0,1][/tex] into two disjoint open sets, by taking one set to be
the preimage of one of the points, and the other set to be the preimage
of everything else. However, the interval [tex][0,1][/tex] is connected,
which means that if there are two such open sets, one of them has to be
empty.
Thus, the path-connected components of a discrete topological space are
the points themselves. So if [tex]X[/tex] is discrete, then
[tex]H_{0}(X) = |X|[/tex].
Long exact sequences in homology and the Mayer-Vietoris sequence
It is a theorem from homological algebra that if we have a short exact
sequence of chain complexes
[tex]0\to A_{*}\to B_{*}\to C_{*}\to 0[/tex]
then there is a long exact sequence in homology given by
[tex]\cdots\to H_{n}(A_{*})\to H_{n}(B_{*})\to
H_{n}(C_{*})\to H_{n-1}(A_{*})\to\cdots[/tex]
Almost all interesting long exact sequences in topology are special
cases of this particular fact. We will derive one interesting sequence:
the Mayer-Vietoris sequence from this.
Take a topological space [tex]X[/tex]. The singular chain complex
[tex]C_{*}(X)[/tex] forms a chain complex with [tex]C_{n}(X)[/tex]
the free abelian group of [tex]n[/tex]-chains and the differential given
by the usual formula.
Suppose we have two spaces: [tex]X[/tex] and [tex]Y[/tex]. Then any
[tex]n[/tex]-chain on [tex]X\cap Y[/tex] is an [tex]n[/tex]-chain on
[tex]X[/tex] and an [tex]n[/tex]-chain on [tex]Y[/tex]. So we get an
inclusion map [tex]C_{n}(X\cap Y)\to C_{n}(X)\oplus
C_{n}(Y)[/tex] by [tex]\sigma\mapsto(\sigma,-\sigma)[/tex]. We
choose this sign instead of the image [tex](\sigma,\sigma)[/tex] to
make the next step cleaner. It is an injection because
[tex]C_{n}[/tex] is a free abelian group, and the injections
[tex]X\leftarrow X\cap Y\rightarrow Y[/tex] are set maps on bases
for the corresponding chain groups.
Compatibility of this map with the differential is almost as easy. If we
first apply the differential and then map into the direct sum, we get
the map [tex]\sigma\mapsto d\sigma\mapsto(d\sigma,-d\sigma)[/tex].
If we instead first map into the direct sum, and then apply the
differentials, we get the map
[tex]\sigma\mapsto(\sigma,-\sigma)\mapsto(d\sigma,-d\sigma)[/tex].
These are equal, and thus the map we've constructed is a chain map.
So, [tex]C_{*}(X\cap Y)\to C_{*}(X)\oplus C_{*}(Y)[/tex] is an
injective map of chain complexes.
If we now have a pair of chains, we can construct a chain on the union
of the two spaces. An [tex]n[/tex]-chain on [tex]X[/tex] is
automatically an [tex]n[/tex]-chain on [tex]X\cup Y[/tex]. Similarly,
an [tex]n[/tex]-chain on [tex]Y[/tex] is automatically an
[tex]n[/tex]-chain on [tex]X\cup Y[/tex]. So both [tex]C_{n}(X)[/tex]
and [tex]C_{n}(Y)[/tex] inject into [tex]C_{n}(X\cup Y)[/tex] by the
same kind of argument as with the injection [tex]C_{n}(X\cap Y)\to
C_{n}(X)[/tex].
The injections we get this way can be put together to a map
[tex]C_{n}(X)\oplus C_{n}(Y)\to C_{n}(X\cup Y)[/tex] by, for
instance, [tex](\sigma,\tau)\mapsto\sigma+\tau[/tex]. I am going to
fudge over the issue of whether the [tex]n[/tex]-chains on [tex]X[/tex]
and [tex]Y[/tex] form a generating set for the [tex]n[/tex]-chains on
[tex]X\cup Y[/tex], and just take it on faith that this works. The
details are not particularly difficult - they are just obnoxiously long
and annoying.
Now, given that we believe we can get a generating set by taking the
union of the bases for the [tex]n[/tex]-chains on [tex]X[/tex] and
[tex]Y[/tex], then this means that the map I gave above is surjective.
Indeed, a basis element [tex]\sigma\in C_{n}(X)[/tex] is hit by the
element [tex](\sigma,0)[/tex], and a basis element [tex]\tau\in
C_{n}(Y)[/tex] is hit by the element [tex](0,\tau)[/tex].
This argument proves exactness at the first and last term of the short
exact sequence
[tex]0\to C_{*}(X\cap Y)\to C_{*}(X)\oplus C_{*}(Y)\to
C_{n}(X\cup Y)\to 0[/tex]
and it remains to prove exactness at the middle term. So we need to
prove that
[tex]\mathrm{ker}((\sigma,\tau)\mapsto\sigma+\tau) =
\mathrm{im}(\sigma\mapsto(\sigma,-\sigma))[/tex]
Now, since [tex]\sigma+(-\sigma)=0[/tex], the inclusion of the image
in the kernel is obvious. So, suppose now that [tex]\sigma\in
C_{n}(X)[/tex], [tex]\tau\in C_{n}(Y)[/tex] and
[tex]\sigma+\tau=0[/tex]. Then, in [tex]C_{n}(X\cup Y)[/tex], we
know that [tex]\sigma=-\tau[/tex]. But then, [tex]\tau\in
C_{n}(X)[/tex] since [tex]\sigma\in C_{n}(X)[/tex] and similarly
[tex]\sigma\in C_{n}
(Y)[/tex] since [tex]\tau\in C_{n}(Y)[/tex]. So
[tex]\sigma,\tau\in C_{n}(X\cap Y)[/tex] and
[tex]\tau=-\sigma[/tex]. But then this kernel element really is in
the image of the inclusion we gave.
So this is a short exact sequence of chain complexes. It follows from
the theorem of long exact sequences in homology that there is an
associated long exact sequence, called the Mayer-Vietoris sequence:
[tex]\cdots\to H_{n}(X\cap Y)\to H_{n}(X)\oplus H_{n}(Y)\to
H_{n}(X\cup Y)\to H_{n-1}(X\cap Y)\to\cdots[/tex]
Putting it all together
Suppose [tex]X[/tex] is a discrete topological space. Then any
continuous map from a [tex]k[/tex]-simplex [tex]\Delta^{k}[/tex] to
[tex]X[/tex] is by necessity constant at one point. So all the singular
simplices in [tex]X[/tex] are constant maps, and thus a basis for any
[tex]n[/tex]-chain group is given by just the points in [tex]X[/tex] as
such. The differential acts on such a constant map by sending it to an
alternating sum of identical maps the cardinality of which is one more
than the dimension of the simplex. So the differential vanishes at every
other point in the chain complex, and is an isomorphism on the rest. But
then we have exactly two possibilities for a homology computation:
Case 1: [tex]C_{n}(X)\overset1\to C_{n-1}(X)\overset0\to
C_{n-2}(X)[/tex]
Here, the kernel is the entire chain group. But so is the image. Thus
the homology group is trivial.
Case 2: [tex]C_{n}(X)\overset0\to C_{n-1}(X)\overset1\to
C_{n-2}(X)[/tex]
Here, the kernel is trivial, but then again, so is the image. Thus,
again, so is the homology group.
The only part not really covered by this is the homology group
[tex]H_{0}(X)[/tex], since this is computed from a diagram on the shape
[tex]C_{1}(X)\overset0\to C_{0}(X)\to0[/tex]. So
[tex]H_{0}(X)[/tex] behaves like we showed above, and is the only
non-trivial homology group of the discrete space.
But then, the only bit that remains non-zero in the Mayer-Vietoris
sequence is
[tex]0\to H_{0}(X\cap Y)\to H_{0}(X)\oplus H_{0}(Y)\to
H_{0}(X\cup Y)\to 0[/tex]
If we have a short exact sequence of free abelian groups, then this
relates the number of basis elements in each of the groups with the
structure of the sequence. Let our short exact sequence be given by
[tex]0\to A\to B\to C\to 0[/tex]. then, by injectivity of the map
[tex]A\to B[/tex], the cardinality of a basis of the image in
[tex]B[/tex] is equal to the cardinality of a basis for [tex]A[/tex]. By
surjectivity of the map [tex]B\to C[/tex], and by Noether's theorem,
the cardinality of a basis of [tex]B[/tex] is equal to the cardinality
of a basis of the kernel of this map, plus the cardinality of a basis of
[tex]C[/tex]. By exactness in the middle, this kernel basis is equal to
the image basis from A. So, we have, writing [tex]|G|[/tex] for the
cardinality of a basis of the free abelian group [tex]G[/tex], that
[tex]|B| = |A|+|C|[/tex]. Thus, [tex]|A|-|B|+|C|=0[/tex].
And if we apply this to the Mayer-Vietoris sequence fragment we
acquired, it follows that
[tex]|X\cap Y| - |X| - |Y| + |X\cup Y| = 0[/tex]
and thus
[tex]|X\cup Y| = |X| + |Y| - |X\cap Y|[/tex]