Čech cohomology and illusions

Jeffery Mensah · January 19, 2025 · ∼2500 words

This post builds on Roger Penrose's article On the Cohomology of Impossible Figures, in which he shows a remarkable connection between impossible drawings, such as the Penrose triangle, and nontrivial cohomology groups. In simple terms, these perceptual illusions work because an obstruction prevents one from synthesizing the local information in the figure into a consistent global construct; this is precisely what cohomology measures. We give a brief introduction to Čech cohomology and show how a well-known auditory illusion, the Shepard tone, works by the same principles discussed by Penrose.

Čech cohomology

Let $X$ be a topological space and $\mathscr{U} = \{U_i\}_{i\in I}$ be an open cover of $X$. We can build a simplicial complex $\mathrm{N}(\mathscr{U})$ out of $\mathscr{U}$ by imagining the open sets $U_i$ as "thickened" vertices, as follows. Let $\mathrm{N}_0(\mathscr{U})$ be the set of all $1$-tuples $[U_i]$ for $i \in I$, which we call vertices or $0$-simplices. Intuitively, two vertices $[U_i]$ and $[U_j]$ bound a "line segment" or a $1$-simplex $[U_i, U_j]$ if $U_i$ and $U_j$ have nonempty intersection. Continuing, we say that three $1$-simplices $[U_i, U_j]$, $[U_j, U_k]$, $[U_k, U_i]$ bound a "triangle" or a $2$-simplex if $U_i \cap U_j \cap U_k \neq \varnothing$. In general, we may define the set of $k$-simplices

$$\mathrm{N}_{k}(\mathscr{U}) = \left\{ (U_{0}, \ldots, U_{k}) \in \mathscr{U}^{k+1} \bigm| \bigcap_{i=0}^{k} U_i \neq \varnothing \right\}.$$

We call $\mathrm{N}(\mathscr{U}) = \bigcup_{k = 0}^{\infty} \mathrm{N}_k(\mathscr{U})$ the nerve complex of $\mathscr{U}$. Define partial boundary maps $\partial_{k, i} \colon \mathrm{N}_k(\mathscr{U}) \to \mathrm{N}_{k-1}(\mathscr{U})$ by

$$\partial_{k, i}([U_0, \ldots, U_k]) = [U_0, \ldots, \widehat{U_i}, \ldots, U_k],$$

where $\widehat{\,\cdot\,}$ signifies removal of an open set. As a direct consequence of our construction, every $2$-simplex $[U_i, U_j, U_k]$ is "bounded" by three $1$-simplices $[U_j, U_k]$, $[U_i, U_k]$, and $[U_i, U_j]$, which are in turn bounded by a total of six $0$-simplices (counting duplicates), as demonstrated below.

Notice that each $0$-simplex $[U_i]$ is counted twice, since the $1$-simplices bounding $[U_i, U_j, U_k]$ join together at the vertices. This suggests that by appropriately giving each simplex in the boundary a "sign coefficient" and summing them together, we may make the double boundary of a simplex vanish. To formalize this idea, we consider the space of Čech $k$-chains $\check{\mathrm{C}}_k(\mathscr{U}) = \mathbb{Z}\mathrm{N}_k(\mathscr{U})$ formed by all finite formal combinations of simplices with integer coefficients. Then we may define (oriented) boundary maps

$$\partial_{k} \colon \check{\mathrm{C}}_{k}(\mathscr{U}) \to \check{\mathrm{C}}_{k-1}(\mathscr{U}); \qquad \partial_k ([U_0, \ldots, U_{k}]) = \sum_{i=0}^{k} (-1)^{i} \partial_{k,i} ([U_0, \ldots, U_k]).$$

One can then check that this choice of coefficients correctly eliminates the duplicate second boundaries. In other words, $\partial_{k} \circ \partial_{k+1} = 0$, so $\check{\mathrm{C}}_\bullet(\mathscr{U})$ forms a chain complex.
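The vanishing of the double boundary can be checked by machine on a small example. The following is a minimal sketch, not part of the original construction: it models open sets as finite Python sets (a hypothetical toy cover), stores a chain as a dictionary from index tuples to integer coefficients, and uses the helper names `nerve` and `boundary`, which are assumptions introduced here for illustration.

```python
from itertools import product

def nerve(cover, k):
    """k-simplices of the nerve: (k+1)-tuples of indices whose sets share a point."""
    indices = range(len(cover))
    return [idx for idx in product(indices, repeat=k + 1)
            if set.intersection(*(cover[i] for i in idx))]

def boundary(chain):
    """Oriented boundary of a chain stored as {simplex: integer coefficient}.

    Intended for simplices of dimension >= 1; on a 0-simplex it returns the
    empty tuple, which plays no role in the checks below.
    """
    result = {}
    for simplex, coeff in chain.items():
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]  # drop the i-th open set
            result[face] = result.get(face, 0) + (-1) ** i * coeff
    # Discard faces whose signed contributions cancelled.
    return {face: c for face, c in result.items() if c != 0}

# Hypothetical toy cover: three finite "open sets" that all contain the point 2.
cover = [{0, 1, 2}, {1, 2, 3}, {2, 3, 0}]

# Boundary of the 2-simplex [U_0, U_1, U_2] with the alternating signs.
triangle_boundary = boundary({(0, 1, 2): 1})

# The double boundary of every 2-simplex of the nerve should be the zero chain.
double_boundary_vanishes = all(
    boundary(boundary({s: 1})) == {} for s in nerve(cover, 2)
)
print(triangle_boundary, double_boundary_vanishes)
```

Here a chain is zero exactly when its dictionary is empty, so the final check is a direct restatement of $\partial_1 \circ \partial_2 = 0$ on this toy nerve.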

Čech cochains

Recall that a presheaf of abelian groups on a topological space $X$ is a contravariant functor $\mathscr{F} \colon \mathbf{Open}(X)^{\mathrm{op}} \to \mathbf{Ab}$. In other words, $\mathscr{F}$ assigns to every open set $U \subseteq X$ an abelian group $\mathscr{F}(U)$ of sections, and for every inclusion $V \subseteq U$ of open sets, there is a restriction morphism $\operatorname{res}_{U, V} \colon \mathscr{F}(U) \to \mathscr{F}(V)$ which takes a section over $U$ and restricts it to a section over $V$. A simple example of a presheaf of abelian groups is given by the sheaf $\mathrm{C}(-, \mathbb{R})$ of continuous real-valued functions on a space $X$. Here, addition of sections is just given by pointwise addition of functions, and restriction is given by the usual definition of function restriction.

Given a presheaf of abelian groups $\mathscr{F}$, we may define a Čech $k$-cochain with coefficients in $\mathscr{F}$ to be a map $f$ which assigns to each $k$-simplex $\sigma \in \mathrm{N}_k(\mathscr{U})$ a section $f(\sigma) \in \mathscr{F}(|\sigma|)$, where $|\sigma|$ is the intersection of all open sets in the simplex. This extends to a homomorphism

$$f \colon \check{\mathrm{C}}_{k}(\mathscr{U}) \to \prod_{\sigma \in \mathrm{N}_k(\mathscr{U})} \mathscr{F}(|\sigma|),$$

which we also refer to as a $k$-cochain. These form an abelian group $\check{\mathrm{C}}^k(\mathscr{U}, \mathscr{F})$ and between these we have similarly defined coboundary maps

$$\delta_{k} \colon \check{\mathrm{C}}^{k}(\mathscr{U}, \mathscr{F}) \to \check{\mathrm{C}}^{k+1}(\mathscr{U}, \mathscr{F}); \qquad (\delta_k f) (\sigma) = \sum_{i=0}^{k+1} (-1)^{i} f(\partial_{k+1, i}\sigma)|_{|\sigma|},$$

where $\cdot|_{|\sigma|}$ denotes the restriction of any of the above sections to the common support $|\sigma|$. To define this for all integers, for $k < 0$ we set $\check{\mathrm{C}}^{k}(\mathscr{U}, \mathscr{F}) = 0$. As before, one can also check that $\delta_{k} \circ \delta_{k - 1} = 0$, so these groups form a cochain complex. We say that cochains in the image of $\delta_{k-1}$ are $k$-coboundaries and cochains in the kernel of $\delta_k$ are $k$-cocycles. Then the quotient

$$\check{\mathrm{H}}^k(\mathscr{U}, \mathscr{F}) \overset{\rm def}{=} \frac{\ker \delta_k}{\operatorname{im} \delta_{k-1}}$$

is the $k$th Čech cohomology group of $X$ with respect to the open cover $\mathscr{U}$ and values in $\mathscr{F}$. It will be nontrivial when there exist $k$-cocycles which are not equal to the coboundary of any $(k-1)$-cochain.

We can describe low-dimensional cocycles and coboundaries as follows. A $0$-simplex is just determined by an open set $U \in \mathscr{U}$, and so a $0$-cochain just assigns a section in $\mathscr{F}(U)$ over each open set $U \in \mathscr{U}$. The coboundary of a $0$-cochain $f$ is given by the $1$-cochain

$$(\delta f)([U, V]) = f([V])|_{U \cap V} - f([U])|_{U \cap V}.$$

In other words, $\delta f$ compares the sections over $U$ and $V$ on their common intersection $U \cap V$. By definition, the condition for $f$ to be a cocycle is that $\delta f = 0$, which states that the sections given by $f$ must all agree on their common intersections. In principle, this means that they can be "glued" together to form a global section on $X$ (however, since $\mathscr{F}$ is only a presheaf, we do not know that such an object is actually a section). On the other hand $\check{\mathrm{C}}^{-1}(\mathscr{U}, \mathscr{F}) = 0$, so there are no nontrivial $0$-coboundaries. It follows that when $\mathscr{F}$ is a sheaf, we have $\check{\mathrm{H}}^0(\mathscr{U}, \mathscr{F}) \cong \mathscr{F}(X)$.

A $1$-simplex is given by a pair of intersecting open sets, so a $1$-cochain assigns to each pair $[U, V]$ a section in $\mathscr{F}(U \cap V)$. The coboundary of a $1$-cochain $g$ is given by the $2$-cochain

$$(\delta g)([U, V, W]) = g([V, W])|_{U \cap V \cap W} - g([U, W])|_{U \cap V \cap W} + g([U, V])|_{U \cap V \cap W}.$$

Intuitively, this evaluates $g$ on the "triangle" formed by taking the boundary of the $2$-simplex $[U, V, W]$. If $g$ is a $1$-cocycle, then this vanishes, which implies that evaluating $g$ on the "path" formed by the $1$-simplices $[U, V]$ and $[V, W]$ is the same as evaluating $g$ on the path from $U$ to $W$. In other words,

$$g([U, W])|_{U \cap V \cap W} = g([U, V])|_{U \cap V \cap W} + g([V, W])|_{U \cap V \cap W},$$

which is commonly called the cocycle condition.

Example. Let $X$ be a manifold and $\pi \colon L \to X$ be a real line bundle on $X$ with a trivializing open cover $\mathscr{U} = \{U_i\}_{i \in I}$. Then for each $i \in I$ there exists a chart $\phi_i \colon \pi^{-1}(U_i) \to U_i \times \mathbb{R}$, and for any two indices $i, j \in I$ such that $U_i \cap U_j \neq \varnothing$, the transition map is of the form

$$\tau_{ji} = \phi_{j} \circ \phi_i^{-1} \colon (U_i \cap U_j) \times \mathbb{R} \to (U_j \cap U_i) \times \mathbb{R}; \quad \tau_{ji}(x, v) = \Big(x, [g_{ji}(x)] (v)\Big),$$

where $g_{ji} \colon U_i \cap U_j \to \mathrm{GL}(1, \mathbb{R})$ is a continuous function. One may check that these functions satisfy the cocycle condition $g_{ki} = g_{kj} \cdot g_{ji}$ where defined. Hence, the $1$-cochain $g$, with values in the sheaf of continuous $\mathrm{GL}(1, \mathbb{R})$-valued functions, defined by $g([U_i, U_j]) = g_{ij}$ is a $1$-cocycle.

If $X = \mathbb{S}^1$ and $L$ is the Möbius strip, then we may cover the circle by two intervals $U$ and $V$, such that their intersection $U \cap V$ becomes a disjoint union of two intervals $I$ and $J$. Moreover, we may choose charts such that $g_{UV} = +1$ on $I$ and $g_{UV} = -1$ on $J$, with the change in sign representing the $180^\circ$ twist throughout the strip.

Suppose, for the sake of contradiction, that $g = \delta f$ for some $0$-cochain $f$. Then, without loss of generality, we can take $f([U]) = 1$, which implies $f([V])$ is positive on $I$ and negative on $J$. By continuity, this implies that $f([V])$ must vanish at some point in $V$, which is disallowed as $0 \not\in \mathrm{GL}(1, \mathbb{R})$. It follows that $\check{\mathrm{H}}^1(\mathscr{U}, \mathscr{F})$ is nontrivial.
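The contradiction can also be seen by tracking signs only. The sketch below is an illustrative reduction, not the author's argument verbatim: it assumes that a continuous nowhere-vanishing function on a connected interval has constant sign, so a $0$-cochain is summarized by one sign on $U$ and one on $V$, and it writes the coboundary multiplicatively as $f([V])/f([U])$. The function name `coboundary_signs` is hypothetical.

```python
from itertools import product

# Signs of the Möbius transition function g_UV on the two components I and J.
g_signs = (+1, -1)

def coboundary_signs(sign_u, sign_v):
    """Signs of (delta f)([U, V]) on (I, J) for a 0-cochain with the given signs.

    Multiplicatively, (delta f)([U, V]) = f([V]) / f([U]). Since f([U]) and
    f([V]) each have constant sign on a connected interval, the sign of the
    quotient is the same on I and on J.
    """
    ratio = sign_v * sign_u  # dividing by +1 or -1 equals multiplying by it
    return (ratio, ratio)

# Every possible sign pattern of a 0-cochain yields equal signs on I and J.
candidates = [coboundary_signs(su, sv) for su, sv in product((+1, -1), repeat=2)]
is_coboundary_possible = g_signs in candidates
print(candidates, is_coboundary_possible)
```

No candidate has opposite signs on $I$ and $J$, which is the sign-level shadow of the statement that $g$ is not a coboundary.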

Shepard tones

A Shepard tone is an auditory illusion consisting of a periodic tone whose pitch appears to rise or fall indefinitely. The illusion works by exploiting the manner in which we perceive pitch color from a combination of raw frequencies.

A pure tone is simply a sinusoidal pressure wave $\rho \colon \mathbb{R} \to \mathbb{R}$ of the form $\rho(t) = A\sin(2\pi f t + \phi)$, where $f$ is the frequency. Perceptually, pure tones with frequencies $f$ and $2f$ are thought of as having the same "color". In common musical terms, we say that the two pure tones are separated by an octave. For example, in standard tuning the note $\mathrm{A}_4$ is defined to be $440 \, \mathrm{Hz}$, and the note $\mathrm{A}_5$, which is at $880\,\mathrm{Hz}$, sounds the "same", even though they have different frequencies. This partitions the set of pure frequencies, denoted by $\mathbf{F} = \mathbb{R}_{>0}$, into pitch classes $[f] = \{ 2^n f\mid n \in \mathbb{Z} \}$; we denote the set of pitch classes by the quotient $\mathbf{P} = \mathbf{F}/{\sim}$.
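As a quick illustration of the quotient $\mathbf{F} \to \mathbf{P}$, one can represent a pitch class by the fractional part of $\log_2 f$, since multiplying $f$ by $2^n$ shifts $\log_2 f$ by the integer $n$. This is a sketch under that identification; the function name `pitch_class` and the $660\,\mathrm{Hz}$ comparison tone are assumptions made here for illustration.

```python
import math

def pitch_class(frequency_hz):
    """Representative of [f] in P: the fractional part of log2(f), in [0, 1)."""
    return math.log2(frequency_hz) % 1.0

a4, a5 = 440.0, 880.0   # one octave apart, so the same pitch class
other = 660.0            # hypothetical comparison tone, not an octave of A4

# Compare with a tolerance because the values are floating-point logarithms.
same_class = math.isclose(pitch_class(a4), pitch_class(a5), abs_tol=1e-9)
different_class = not math.isclose(pitch_class(a4), pitch_class(other), abs_tol=1e-9)
print(same_class, different_class)
```

The representative is only a convenient coordinate on $\mathbf{P}$; it jumps from values near $1$ back to $0$ each time the frequency crosses a power of two.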

In nature, a pure tone by itself is almost never encountered. Instead, tones are usually produced along with a set of overtones, which typically have frequencies at integer multiples of a lowest fundamental frequency. Our brains detect this fundamental frequency, which determines a single pitch color for the tone, despite the tone being made up of a combination of different frequencies. A Shepard tone consists of an infinite superposition of tones separated by octaves (with decaying amplitudes), so that there is no one true fundamental frequency; such a tone cannot be canonically modeled by any point in $\mathbf{F}$. Despite this, the brain still makes an arbitrary choice, lifting a point $p \in \mathbf{P}$ to a frequency $f \in \mathbf{F}$. A continuously changing Shepard tone $\gamma \colon I \to \mathbf{P}$ is lifted to a continuous curve $\overline{\gamma} \colon I \to \mathbf{F}$.
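The lack of a canonical fundamental can be made concrete by listing the components of a Shepard tone. The following is a minimal sketch under stated assumptions: the infinite superposition is truncated to a nominal audible band, and the decaying amplitudes are modeled by a Gaussian envelope in $\log_2$-frequency. The band limits, the envelope, and the name `shepard_components` are all illustrative choices, not taken from the text above.

```python
import math

def shepard_components(base_hz, f_min=20.0, f_max=20000.0,
                       center_hz=440.0, width_octaves=2.0):
    """List (frequency, amplitude) pairs for all octaves of base_hz in the band.

    Assumptions: truncation to [f_min, f_max] and a Gaussian amplitude
    envelope centered at center_hz, measured in octaves.
    """
    # Move to the lowest octave of base_hz that still lies in the band.
    f = base_hz
    while f / 2 >= f_min:
        f /= 2
    while f < f_min:
        f *= 2
    components = []
    while f <= f_max:
        distance = math.log2(f) - math.log2(center_hz)  # offset in octaves
        amplitude = math.exp(-distance ** 2 / (2 * width_octaves ** 2))
        components.append((f, amplitude))
        f *= 2  # step up one octave
    return components

# Starting from 440 Hz or from 880 Hz produces exactly the same components,
# so neither frequency is singled out as "the" fundamental.
from_a4 = shepard_components(440.0)
from_a5 = shepard_components(880.0)
print(from_a4 == from_a5, len(from_a4))
```

Because the amplitude depends only on the component's own frequency and not on the starting point, the result is a function of the pitch class $[f]$ alone.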

The ambiguity group

Write $\mathbb{S}^1 = \mathbb{R}/\mathbb{Z}$ and let $\gamma \colon \mathbb{S}^1 \to \mathbf{P}$ be the periodic Shepard tone given by $\gamma(t) = [2^t]$. If $I \subseteq \mathbb{S}^1$ is an open interval, then a partial lift $\widetilde{\gamma} \colon I \to \mathbf{F}$ is completely determined by its value on a single point $t_0 \in I$. If $\alpha$ and $\beta$ are two lifts of $\gamma|_{I}$, then for all $t \in I$,

$$\log_{2} \alpha(t) - \log_2 \beta(t) = \log_{2} \alpha(t_0) - \log_2 \beta(t_0) \in \mathbb{Z}.$$

In other words, the ambiguity in the lift of $\gamma|_{I}$ is determined by a single integer. Two observers listening to $\gamma|_I$ may perceive two different frequency curves, but they must "differ" by an integer. This is what Penrose refers to as the ambiguity group, which acts on the set of possible observations, or lifts, of $\gamma|_I$. Formally, since $\mathbf{F} \to \mathbf{P}$ is a principal $\mathbb{Z}$-bundle, each fiber is a homogeneous $\mathbb{Z}$-space, for which subtraction is defined. Then the ambiguity group of an open set $U \subsetneq \mathbb{S}^1$ is defined to be

$$\mathscr{A}(U) = \Big\{ \log_2 \alpha - \log_2 \beta \bigm| \alpha \text{ and } \beta \text{ are lifts of } \gamma|_{U} \Big\}.$$

For example, if $U$ is the disjoint union of two intervals, then $\mathscr{A}(U) \cong \mathbb{Z} \times \mathbb{Z}$, since a lift is determined by a single point in each interval. If a lift exists over $U$, then $\mathscr{A}(U)$ is the set of locally constant functions $U \to \mathbb{Z}$. Assigning the zero group for $U = \mathbb{S}^1$, this forms a presheaf of abelian groups on $\mathbb{S}^1$.

Cocycles and coboundaries

Let $\mathscr{U} = \{U_i\}_{i \in I}$ be an open covering of $\mathbb{S}^1$ by subintervals such that there do not exist any triple intersections consisting of distinct open sets. For example, we may cover $\mathbb{S}^1$ by two intervals, as done in the previous example. Now, imagine that a listener focuses on each interval $U_i$ separately, locally observing a lift $\gamma_i$ of $\gamma|_{U_i}$ (which does not necessarily have to agree with neighboring lifts). This defines a $1$-cochain $s \in \check{\mathrm{C}}^1(\mathscr{U}, \mathscr{A})$ given by $s([U_i, U_j]) = \log_2 \gamma_j|_{U_i \cap U_j} - \log_2 \gamma_i|_{U_i \cap U_j}$. Since every $2$-simplex $[U_i, U_j, U_k]$ can be assumed to be of the form $[U_i, U_i, U_j]$ without loss of generality, $s$ must be a cocycle, since

$$(\delta s) ([U_i, U_i, U_j]) = s([U_i, U_j]) - s([U_i, U_j]) + s([U_i, U_i])|_{U_i \cap U_j} = 0.$$

Intuitively, this condition says that a listener can locally adjust parts of their observations on $U_i$ and $U_j$ by an element of the ambiguity group $\mathscr{A}(U_i \cap U_j)$ so that they agree. A natural question is whether a listener can adjust each whole observation (i.e., not just on the intersections) by elements of the ambiguity groups $\mathscr{A}(U_i)$ so that the lifts all agree. Such an adjustment corresponds to a $0$-cochain $a \in \check{\mathrm{C}}^0(\mathscr{U}, \mathscr{A})$ with the property that

$$\log_2 \gamma_{i}|_{U_i \cap U_j} - a([U_i])|_{U_i \cap U_j} = \log_2 \gamma_{j}|_{U_j \cap U_i} - a([U_j])|_{U_j \cap U_i}$$

whenever $U_i$ and $U_j$ intersect. It follows that

$$(\delta a)([U_i, U_j]) = a([U_j])|_{U_j \cap U_i} - a([U_i])|_{U_i \cap U_j} = \log_2 \gamma_{j}|_{U_i \cap U_j} - \log_2 \gamma_{i}|_{U_j \cap U_i} = s([U_i, U_j]).$$

Hence, the individual lifts can be made to agree if and only if $s$ is a coboundary. Perceptually, this would imply that one could listen to the entire periodic Shepard tone $\gamma \colon \mathbb{S}^1 \to \mathbf{P}$ and observe a genuine periodic "melody" $\widetilde{\gamma} \colon \mathbb{S}^1 \to \mathbf{F}$. Of course, this is not the case, which can be seen intuitively by observation, or by the fact that $\mathbf{F} \to \mathbf{P}$ is a nontrivial principal bundle, and hence does not admit any global sections.

Alternatively, one may directly show $[s]$ is a nonzero element of the first Čech cohomology group $\check{\mathrm{H}}^1(\mathscr{U}, \mathscr{A})$. For example, if $\mathscr{U}$ is a covering of the circle by two intervals $U$ and $V$, then the intersection $U \cap V$ consists of two disjoint intervals, so $\mathscr{A}(U \cap V) = \mathbb{Z} \times \mathbb{Z}$. One can check that $s([U, V])$ is of the form $(k, k \pm 1)$, while $(\delta a)([U, V])$ is of the form $(k, k)$, implying that $s$ cannot be a coboundary.
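The two-interval computation can be carried out numerically. The sketch below makes several illustrative assumptions: $U$ and $V$ are the images in $\mathbb{R}/\mathbb{Z}$ of the real intervals $(-0.1, 0.6)$ and $(0.4, 1.1)$, lifts are written as $t \mapsto 2^{t+m}$ and $t \mapsto 2^{t+n}$ in those real coordinates, and the two components of $U \cap V$ are sampled at one point each. The function names are hypothetical.

```python
import math

def lift_u(t, m):
    """A lift of gamma over U, with t taken in the real interval (-0.1, 0.6)."""
    return 2.0 ** (t + m)

def lift_v(t, n):
    """A lift of gamma over V, with t taken in the real interval (0.4, 1.1)."""
    return 2.0 ** (t + n)

def cocycle(m, n):
    """s([U, V]) = log2(gamma_V) - log2(gamma_U) on the two components of U ∩ V."""
    # Component I around t = 0.5: the same real coordinate in both intervals.
    on_i = math.log2(lift_v(0.5, n)) - math.log2(lift_u(0.5, m))
    # Component J around t = 0 in U's coordinate, which is t = 1 in V's.
    on_j = math.log2(lift_v(1.0, n)) - math.log2(lift_u(0.0, m))
    return (round(on_i), round(on_j))

def coboundary(a_u, a_v):
    """(delta a)([U, V]) for integers a_u in A(U) and a_v in A(V)."""
    return (a_v - a_u, a_v - a_u)

shifts = range(-3, 4)
cocycles = {cocycle(m, n) for m in shifts for n in shifts}
coboundaries = {coboundary(a, b) for a in range(-10, 11) for b in range(-10, 11)}
never_a_coboundary = cocycles.isdisjoint(coboundaries)
print(sorted(cocycles)[:3], never_a_coboundary)
```

With these orientation choices every value of $s([U, V])$ has the form $(k, k + 1)$; the opposite sign in $(k, k \pm 1)$ corresponds to the tone $\gamma(t) = [2^{-t}]$ or to swapping the roles of the two components.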

Other illusions

The fact that the Shepard tone works as an illusion can be seen as a consequence of the fact that $\check{\mathrm{H}}^1(\mathbb{S}^1, \mathbb{Z})$ is nontrivial. In essence this is the same obstruction described by Penrose for the tribar, whose paradoxical nature is connected to the fact that $\check{\mathrm{H}}^1(\mathbb{S}^1, \mathbb{R}^\times)$ is nontrivial. As suggested in the conclusion of Penrose's article, it may be possible to construct more interesting visual and auditory illusions by considering more complicated ambiguity groups (for example, the Necker cube and $\mathbb{Z}/2\mathbb{Z}$).