This is the first of a projected twopart series of articles about ultrafilters. The former introduces ultrafilters and establishes various properties about them, so that the latter article can concentrate entirely on applications of ultrafilters.
Much of this material is from Professor Imre Leader’s lecture course on Ramsey theory, notes of which can be obtained from Paul Russell’s website.
I’ve been intending to write about ultrafilters for some time, but, whilst simple to define and manipulate, they are rather abstract objects. I suppose I’ll first give a formal definition, and then attempt to provide a more intuitive interpretation of what an ultrafilter is.
So, an ultrafilter U on a set S is a collection of subsets of S with the following properties:
 If A is in U, then every superset of A is also in U (U is an upset);
 If A and B are in U, then the intersection A ∩ B is also in U;
 Precisely one of A and S\A belongs to U.
The correct way to view ultrafilters is a way to assign a finitelyadditive measure to S, where every set has either measure 0 (almost nothing) or measure 1 (almost everything). Then, U is the collection of big (measure1) subsets of S. The axioms then have entirely intuitive interpretations:

If A is a big set (has measure 1), then so is every superset of A;
 The union of two measure0 sets also has measure 0;
 If A has measure 1, its complement has measure 0, and viceversa.
Very often, we take S to be the set of positive integers, . From the ultrafilter axioms, we can deduce the following:

If we partition a measure1 set into finitely many sets, then precisely one of those has measure 1.
There is a notion of a stronger version of an ultrafilter (on necessarily uncountable sets) where ‘finitely’ is replaced with ‘countably’ (thus is a genuine measure in the usual sense), but their existence is equivalent to that of a certain large cardinal called a measurable cardinal for obvious reasons. Anyway, in the remainder of the article we will only consider ultrafilters on the set N of positive integers.
Examples of ultrafilters
When trying to explain ultrafilters, I am compelled to provide a few examples. The easiest examples to provide are the principal (boring) ultrafilters, such as the following:
A set has measure 1 if and only if it contains the number 7.
There is a principal ultrafilter for each natural number n, by replacing 7 with n. Principal ultrafilters are somewhat silly, since they only care about a single element of S. Note that if a finite set has measure 1 with respect to some ultrafilter, then one of its elements must have measure 1 and the ultrafilter must be principal. So, in a nonprincipal ultrafilter, all measure1 sets are infinite (as they should be).
Anyway, what’s an example of a nonprincipal ultrafilter?
It transpires that nonprincipal (or free) ultrafilters cannot be explicitly constructed, relying heavily on the axiom of choice. We create an ultrafilter U by extending a filter F, which is a strictly weaker notion than that of an ultrafilter. A filter F need only satisfy the following properties:

The empty set does not belong to F;
 The entire set S does belong to F;
 If A and B belong to F, then so does the intersection;
 If A belongs to F, then so does every superset of A.
Unlike ultrafilters, there are lots of interesting examples of filters. For instance, we can take the filter of cofinite sets, where a set belongs to F if and only if its complement is finite.
There is an obvious partial order on the set of filters, namely inclusion. If a filter is not an ultrafilter, then we can extend it by throwing in another set. If we have a chain (totallyordered set of filters), then the union is a filter which is an upper bound. Hence, by Zorn’s lemma, we can extend any filter to a maximal filter (an ultrafilter).
Ultrafilters as quantifiers
You are probably aware of the universal quantifier ∀ and the existential quantifier ∃. Given your favourite ultrafilter U, we can define the quantifier ∀U (pronounced ‘for Umost’) as follows:
 ‘∀U x : P(x)’ means ‘P(x) is satisfied by a measure1 set of x in S’
It has the beautiful property that it distributes over all Boolean operators:
 ∀U x : (P(x) or Q(x)) ⇔ (∀U x : P(x)) or (∀U x : Q(x))
 ∀U x : (P(x) and Q(x)) ⇔ (∀U x : P(x)) and (∀U x : Q(x))
 ∀U x : (not P(x)) ⇔ not (∀U x : P(x))
The first two of these also apply to ∃ and ∀, respectively, but the third applies to neither (∀ and ∃ are interchanged, rather than preserved, when conjugated by logical negation).
Warning: it does not commute with itself (∀U x ∀U y is not the same as ∀U y ∀U x). This again contrasts with the more familiar logical quantifiers.
The StoneCech compactification of N
The space of ultrafilters on the positive integers is denoted βN and admits a natural topology. Specifically, we define a base of open sets:

For every set A of positive integers, let C(A) = {U in βN  A in U} be an open set.
C(N) and C(Ø) are βN and Ø, respectively. The intersection of two base sets, C(A) and C(B), is easily verified to be C(A ∩ B), so this is indeed a base of open sets. A set is open if and only if it is some arbitrary union of base sets. Note that again we have C distributing over everything:

C(A ∩ B) = C(A) ∩ C(B)
 C(A ∪ B) = C(A) ∪ C(B)
 C(N \ A) = C(N) \ C(A) = βN \ C(A)
In particular, the base sets C(A) are both open and closed.
So, why are we interested in the topology on βN? It transpires that it has several very nice properties:

Hausdorffness: For two distinct ultrafilters, U and V, we can find a set A such that A belongs to U but not to V. Then, C(A) contains U and C(N\A) contains V. Since C(A) and C(N\A) are disjoint open sets, this means that βN is a Hausdorff space.
 Compactness: Suppose we have a collection P of closed sets, such that the intersection of any finite subset of P is nonempty. It suffices to consider the case where the elements of P are all base sets. Indeed, let P = {C(A) : A in Q}, where Q is a collection of subsets of N, such that any intersection of finitely many elements of Q is nonempty. Let F be the filter of all supersets of elements of Q, and extend it to an ultrafilter U. U must necessarily belong to all elements of P, and therefore their intersection. This establishes the finite intersection property, which is trivially equivalent to topological compactness.
 N is dense in βN: We can embed N in βN by associating the number n with the principal ultrafilter {A in N : n in A}. Then N is a dense subset of βN, since for every nonempty C(A) in the base of open sets, we can choose an arbitrary a in A and observe that the principal ultrafilter on a belongs to C(A). βN is actually the largest compact Hausdorff space in which N is dense, and is called the StoneCech compactification of N. It is a surprising fact — which is not too difficult but still nontrivial to prove — that after removing N from βN, we are left with an inseparable space (one with no countable dense subset).
Even though it is very nice from a topological point of view, little is known about βN. Assuming the continuum hypothesis, Parovicenko proved that several topological assumptions are sufficient to characterise βN; conversely, in the absence of the continuum hypothesis, there exist nonhomeomorphic spaces satisfying the same properties. Also, if we assume certain large cardinal axioms, then βN has completely different properties than under the (incompatible) assumption of the continuum hypothesis.
Ultrafilter addition and idempotent ultrafilters
Recall that N admits a commutative addition operation. This can be extended to a noncommutative addition on βN, where we define the sum of two ultrafilters as follows:

U + V = {A ⊆ N  ∀U x ∀V y : x + y in A}
Note that this can easily be verified to satisfy the defining properties of an ultrafilter. This is associative, since (U + V) + W = U + (V + W) = {A ⊆ N  ∀U x ∀V y ∀W z : x + y + z in A}. We can also prove that it is leftcontinuous, that is to say that U + V is a continuous function of U. Of course, working in a general topological space, there is only one definition of continuity that we’re allowed to use: the preimage of an open set is open.
It suffices to prove that, for an arbitrary fixed ultrafilter V, the preimage of C(A) is always open, for every A ⊆ N. So, let’s unpack the definitions:
Note that U + V = {A ⊆ N  ∀U x : (∀V y : x + y in A)}, so the preimage of C(B) is the set:

{U in βN  {A ⊆ N  ∀U x : (∀V y : x + y in A)} in C(B)}
 = {U in βN  B in {A ⊆ N  ∀U x : (∀V y : x + y in A)}}
 = {U in βN  ∀U x : (∀V y : x + y in B)}
 = {U in βN  {x in N  (∀V y : x + y in B)} in U}
 = C({x in N  (∀V y : x + y in B)})
which is open, by definition. To summarise, the operation + is associative and leftcontinuous, and βN is a compact Hausdorff space. We’re now going to forget absolutely everything else about ultrafilters, and prove the following lemma:
The idempotent lemma: If X is a nonempty compact Hausdorff space and + is a leftcontinuous associative operation, then there exists x in X such that x + x = x.
The proof requires extensive use of Zorn’s lemma.
We firstly consider the collection S of nonempty closed sets Y with the property that Y + Y is a subset of Y. The entire space X is trivially one such example, so our collection S is nonempty. Endow S with the partial order of inclusion.
If we have a chain of such sets, then the intersection is nonempty (since all finite intersections are nonempty, and we are working in a compact Hausdorff space where compact sets are closed and viceversa) and closed (by virtue of being an intersection of closed sets). So, every chain has a ‘lower bound’ in S and thus we can find a minimal Y by Zorn’s lemma.
Choose an arbitrary x in Y, which we can do by the fact that Y is nonempty. Y + x is the continuous image of a nonempty compact set, thus nonempty and compact. Also, (Y + x) + (Y + x) = Y + Y + x + x, by associativity, which is clearly a subset of Y + x. Hence, Y + x is also in S and is a subset of Y, so Y + x = Y (as otherwise Y + x would be more minimal than Y).
Now define Z = {y in Y  y + x = x}, which is nonempty since Y + x = Y, and Y contains x. It is the preimage of a singleton set (which must be compact and thus closed) under a continuous function, so is itself closed. If y and y‘ both belong to Z, then (y + y‘) + x = y + (y‘ + x) = y + x = x, so Z + Z is a subset of Z and therefore belongs to S. So Z = Y by minimality of Y.
Hence, since x is in Y, x is in Z and therefore x + x = x, concluding our claim.
Consequently, there must exist an ultrafilter U with U + U = U. It is clear that such an ultrafilter must necessarily be nonprincipal. U is known as an idempotent ultrafilter. Nonprincipal ultrafilters themselves already have many applications (as we shall see in the followup post), but idempotent ultrafilters are even more useful, powerful and generally awesome. I can imagine that you’re all bursting with excitement in anticipation of the followup article, so I shall endeavour to write it as soon as feasibly possible.
Thank you for this article. I haven’t asked for it, but it seems to clarify everything about ultrafilters I wished to know 🙂 I also ask for few things about article:
You mentioned that “There is a principal ultrafilter for each natural number n”, but you seem to forgot to mention these are all, i.e. principal ultrafilters are exactly these defined by single element.
When mentioning measurable cardinals, you said they have to be countably additive. On Wikipedia, it’s said that it must be κadditive. Does former imply latter?
Other than that, I wish you Merry Christmas! (P.S. I love the new logo ^^)
Thank you, and Merry Christmas!
Yes, I promised that I would have a Christmasthemed banner over the festive period (and I’m going to reuse last winter’s New Year banner).
Stanislaw Ulam (who deserves to be better known than he currently is) showed that if κ is the minimal cardinal admitting a countablyadditive measure, then κ must also admit a κadditive measure. Hence, the existence of a countablyadditive ultrafilter is indeed equivalent to the axiom that a measurable cardinal exists.
Pingback: Applications of ultrafilters  Complex Projective 4Space