by Carl Wang-Erickson
Barry Mazur’s article “Deforming Galois representations” [1] is regarded today as opening up a new direction in algebraic number theory. Appearing in print in 1989 following a conference talk at the Mathematical Sciences Research Institute (MSRI) in Berkeley in March 1987, this article introduced moduli theory of Galois representations into the consciousness of number theorists. By the time I began as a PhD student about twenty years later, Mazur’s article was promoted to my generation of students as the proper starting point to grasp one of the strands of mathematical developments needed to appreciate the “\( R = \mathbb{T} \) theorems” that were understood as the pinnacle of \( p \)-adic techniques to establish arithmetic Langlands correspondences. To the novice, it was at least easy to appreciate that it was important to understand both “\( R \)” and “\( \mathbb{T} \)” to start to grok “\( R = \mathbb{T} \),” and we understood that the start of the story of \( R \) was Mazur’s paper. Indeed, “\( R \)” stands for a universal deformation ring for \( p \)-adic Galois representations satisfying some condition of interest, as Mazur first formulated it. Then homomorphisms \( R \to \overline{\mathbb{Q}}_p \) are in bijection with the Galois representations of interest. And \( \mathbb{T} \) is the image ring of an action of Hecke operators on some natural Hecke module, so that homomorphisms \( \mathbb{T} \to \overline{\mathbb{Q}}_p \) are in bijection with Hecke eigensystems. Thus “\( R = \mathbb{T} \)” not only supplies a desired bijection between \( p \)-adic Galois representations and Hecke eigensystems, but establishes this bijection along with a common \( p \)-adic interpolation.
That said, the influence of this paper seems not to have been predicted at first. Some attendees of the MSRI conference felt that Mazur had gone down a rabbit hole, wondering what this work could possibly be good for. The answer emerged about six years later in Andrew Wiles’s lectures and preprint: you could prove modularity of elliptic curves using \( R \)!
As Wiles remarks in his Fermat paper [e11], by the time of the MSRI conference in March 1987,
Hida had constructed some explicit one-parameter families of Galois representations. In an attempt to understand this, Mazur had been developing the language of deformations of Galois representations. Moreover, Mazur realized that the universal deformation rings he found should be given by Hecke rings, at least in certain special cases ([e11], p. 450).
Wiles points out here that the first instances of “\( R = \mathbb{T} \)” appeared in Mazur’s article, articulating a vision to establish modularity of Galois representations that Wiles’s Fermat paper brought to fruition. Not only that, but Mazur’s article, at once, introduced the deformation theory of Galois representations thoroughly enough that it can still be regarded as an excellent reference today; characterized it using homological tools; and showed how to impose local conditions corresponding to various flavors of modular forms. These topics of Mazur’s article, and a small sample of directions of inquiry reflecting the influence of his article, are what we will discuss in what follows.
1. The motivation for and formulation of R
Haruzo Hida’s article Galois representations into \( \operatorname{GL}_2(\mathbb{Z}_p[\![ X ]\!]) \) attached to ordinary cusp forms appeared in print in 1986 [e3]. It was the primary inspiration for Mazur’s article. By this time, 2-dimensional \( p \)-adic Galois representations associated to single cusp forms were well-understood. Let us begin by reviewing them on our way to introducing Hida’s construction of “big” Galois representations.
1.1. The p-adic Galois representations associated to cuspidal Hecke eigenforms
Let \( G_{\mathbb Q} = \operatorname{Gal}(\overline{\mathbb{Q}}/{\mathbb Q}) \) and choose Frobenius elements \( \operatorname{Fr}_\ell \in G_{\mathbb Q} \) for all prime numbers \( \ell \). Let \( f \) denote a normalized cuspidal Hecke eigenform of weight \( k \in \mathbb{Z}_{\geq 1} \) that is new of level \( N \in \mathbb{Z}_{\geq 1} \). Representing its Fourier expansion as \[ f(z) = \sum_{n \geq 1} a_n(f) q^n \qquad\text{where }q = e^{2 \pi i z} ,\]
in fact the \( a_n(f) \) are algebraic and generate a number field, the Hecke field of \( f \), denoted \[ {\mathbb Q}(f) = {\mathbb Q}(a_n(f)_{n \geq 1}).\]
Given a choice of \( p \)-adic norm on \( {\mathbb Q}(f) \), thought of as an embedding \( \sigma_v : {\mathbb Q}(f) \hookrightarrow \overline{\mathbb{Q}}_p \), there exists a unique (up to isomorphism) irreducible continuous representation \[ \rho_{f,v} : G_{\mathbb Q} \to \operatorname{GL}_2(\overline{\mathbb{Q}}_p) \]
characterized by \[ \operatorname{Tr} \rho_{f,v}(\operatorname{Fr}_\ell) = \sigma_v(a_\ell(f)) \quad \text{ for all primes }\ell \nmid Np. \]
This \( \rho_{f,v} \) was constructed by Shimura (when \( k=2 \)), Deligne (when \( k > 2 \)), and Deligne–Serre (when \( k=1 \)). It has the following further properties.
- \( \det \rho_{f,v}(c) = -1 \) for any choice of complex conjugation element \( c \in G_{\mathbb Q} \). One calls such 2-dimensional representations of \( G_{\mathbb Q} \) odd.
- \( \det \rho_{f,v} = \chi_{f,v} \cdot \kappa^{k-1} \), where \( \kappa : G_{\mathbb Q} \to \mathbb{Z}_p^\times \) denotes the \( p \)-adic cyclotomic character and \[ \chi_f : (\mathbb{Z}/N\mathbb{Z})^\times \to \overline{\mathbb{Q}}^\times \] denotes the nebencharacter of \( f \); we also think of \( \chi_f \) as a character of \( G_{\mathbb Q} \) using class field theory, and write \( \chi_{f,v} := \sigma_v \circ \chi_f \).
- \( \rho_{f,v} \) is ramified only at places dividing \( Np \infty \), which is why \[ \operatorname{Tr} \rho_{f,v}(\operatorname{Fr}_\ell) = \sigma_v(a_\ell(f)) \] is well-defined.
- The restriction of \( \rho_{f,v} \) to a decomposition group at \( p \), \[ G_p = \operatorname{Gal}(\overline{\mathbb{Q}}_p/{\mathbb{Q}}_p) \subset G_{\mathbb Q} ,\] is de Rham and has Hodge–Tate weights \( [0,k-1] \); it is also crystalline if \( p \nmid N \).
- At primes \( q \) dividing \( N \), the restriction of \( \rho_{f,v} \) to the decomposition group \( G_q \) at \( q \) is ramified, and its restriction to an inertia subgroup \( I_q \subset G_q \) is constrained by the \( q \)-typical part of the level group for which \( f \) is modular.
It is conjectured that all Galois representations arising “in nature” (that is, from arithmetic geometry) also come from modular forms and their generalizations. A classic example of a Galois representation arising from arithmetic geometry is furnished by the \( p \)-adic Tate module of an elliptic curve \( E \) defined over \( {\mathbb Q} \), \[ T_p E = \varprojlim_n E[p^n] = \Bigl[\dotsm \buildrel{\cdot p}\over\to E[p^{n+1}] \buildrel{\cdot p}\over\to E[p^n] \buildrel{\cdot p}\over\to \dotsm \buildrel{\cdot p}\over\to E[p^2] \buildrel{\cdot p}\over\to E[p] \Bigr]. \]
Because \( E(\mathbb{C}) \) is a complex torus, making \[ E(\mathbb{C})[p^n] \simeq \mathbb{Z}/p^n \mathbb{Z} \times \mathbb{Z}/p^n \mathbb{Z} \]
as groups, one finds that \( T_p E \simeq \mathbb{Z}_p \times \mathbb{Z}_p \). The Galois action on the algebraic points of \( E[p^n] \) is compatible with the limit, giving a continuous action of \( G_{\mathbb Q} \) on \( T_p E \), or with a choice of basis, \[ \rho_{E,p} : G_{\mathbb Q} \to \operatorname{GL}_2(\mathbb{Z}_p). \]
This has the property that for all primes \( \ell \neq p \) at which \( E \) has good reduction, \[ \operatorname{Tr} \rho_{E,p}(\operatorname{Fr}_\ell) = \ell + 1 - \# E(\mathbb{F}_\ell). \]
Moreover, \( \rho_{E,p} \otimes_{\mathbb{Z}_p} \overline{\mathbb{Q}}_p \) is irreducible and odd, and its restriction to \( G_p \) has Hodge–Tate weights \( [0,1] \). Then the modularity conjecture for elliptic curves can be expressed as the existence of a cuspidal Hecke eigenform \( f = f_E \) with Hecke field \( {\mathbb Q}(f) = {\mathbb Q} \) such that \( \rho_{E,p} \simeq \rho_{f,p} \), or, equivalently, such that \( a_\ell(f) = \ell + 1 - \# E(\mathbb{F}_\ell) \) for all but finitely many primes \( \ell \).
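The second formulation can be tested numerically in a single case. The following is a minimal sketch, not taken from the article: it assumes the curve \( y^2 + y = x^3 - x^2 \) of conductor 11 and the weight 2 form \( q \prod_{n \geq 1} (1-q^n)^2(1-q^{11n})^2 \) of level 11, and the function names are hypothetical. It compares \( \ell + 1 - \# E(\mathbb{F}_\ell) \) with the \( \ell \)-th Fourier coefficient for a few primes of good reduction.

```python
def count_points(l):
    """Number of points of y^2 + y = x^3 - x^2 over F_l, including infinity."""
    lhs_counts = {}
    for y in range(l):
        v = (y * y + y) % l
        lhs_counts[v] = lhs_counts.get(v, 0) + 1
    # For each x, count the y whose left-hand side matches x^3 - x^2.
    affine = sum(lhs_counts.get((x**3 - x * x) % l, 0) for x in range(l))
    return affine + 1  # add the point at infinity


def eta_product_coeffs(n_max):
    """Coefficients of q * prod (1 - q^n)^2 (1 - q^(11n))^2 up to q^n_max."""
    coeffs = [0] * (n_max + 1)
    coeffs[1] = 1  # the leading factor q
    for n in range(1, n_max + 1):
        for step in (n, 11 * n):
            if step > n_max:
                continue
            for _ in range(2):  # each factor appears squared
                # Multiply in place by (1 - q^step), from the top degree down.
                for i in range(n_max, step - 1, -1):
                    coeffs[i] -= coeffs[i - step]
    return coeffs


# Primes of good reduction (the curve has bad reduction only at 11).
GOOD_PRIMES = [2, 3, 5, 7, 13, 17, 19, 23]
a = eta_product_coeffs(max(GOOD_PRIMES))
traces = {l: l + 1 - count_points(l) for l in GOOD_PRIMES}
matches = all(traces[l] == a[l] for l in GOOD_PRIMES)
print(traces, matches)
```

Agreement at finitely many primes is of course only evidence, not a proof; the point of the sketch is to make the two sides of the equality concrete.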
1.2. Hida’s interpolation of ordinary cusp forms and their Galois representations
Hida established [e4] 1-parameter \( p \)-adic interpolations of cuspidal Hecke eigenforms that are ordinary, in a sense we now explain.
Fix an integer \( N \in \mathbb{Z}_{\geq 1} \). We call this \( N \) a tame level because we assume \( p \nmid N \) and consider modular forms of level \( \Gamma_1(Np^r) \) for \( r \in \mathbb{Z}_{\geq 1} \); we also may let \( r=0 \) and let \( \Gamma_1(Np^0) := \Gamma_1(N) \cap \Gamma_0(p) \). Let \( U_p \) refer to the usual Hecke operator at \( p \) at these levels. We consider pairs \( (f,v) \) as above — that is, \( f \) is a normalized cuspidal Hecke eigenform, now of level \( \Gamma_1(Np^r) \) for \( r \in \mathbb{Z}_{\geq 0} \), and \( v \) is a place over \( p \) of \( {\mathbb Q}(f) \). Such an \( (f,v) \) is called \( p \)-ordinary or simply ordinary provided that \( a_p(f) \) is a \( v \)-unit, or, equivalently, that the \( v \)-adic valuation of the \( U_p \)-eigenvalue of \( f \) is 0.
Letting \[ S_k(p^r) := S_k(\Gamma_1(Np^r)) \otimes_\mathbb{Z} \mathbb{Z}_p ,\]
the module of cusp forms of weight \( k \) and level \( \Gamma_1(Np^r) \) over \( \mathbb{Z}_p \), there is a canonical direct summand \( S_k^{\operatorname{ord}}(p^r) \subset S_k(p^r) \) on which \( U_p \) acts by \( p \)-units. Similarly, denoting by \( \mathbb{T}_k(p^r) \) the Hecke algebra over \( \mathbb{Z}_p \) generated by Hecke endomorphisms within \( \operatorname{End}_{\mathbb{Z}_p}(S_k(p^r)) \), there is a canonical quotient ring \( \mathbb{T}_k^{\operatorname{ord}}(p^r) \) of \( \mathbb{T}_k(p^r) \) on which \( U_p \) acts by a \( p \)-unit (via its regular action). While homomorphisms \( \mathbb{T}_k(p^r) \to \overline{\mathbb{Q}}_p \) are in bijection with pairs \( (f,v) \) of level \( \Gamma_1(Np^r) \), it is exactly those \( (f,v) \) that are ordinary that factor through \( \mathbb{T}_k(p^r) \twoheadrightarrow \mathbb{T}_k^{\operatorname{ord}}(p^r) \).
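The summand can be cut out by an idempotent. A standard way to do this, which the text above does not spell out, is Hida's projector \( e = \lim_{n} U_p^{n!} \). The toy sketch below is an illustration under stated assumptions: a hypothetical \( 2 \times 2 \) integer matrix stands in for \( U_p \), with one unit eigenvalue and one non-unit eigenvalue at \( p = 3 \), and the limit is approximated modulo \( p^4 \).

```python
from math import factorial


def mat_mul(a, b, m):
    """Product of square matrices with entries reduced modulo m."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) % m for j in range(n)]
            for i in range(n)]


def mat_pow(a, e, m):
    """a**e modulo m by repeated squaring."""
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    base = [[x % m for x in row] for row in a]
    while e > 0:
        if e & 1:
            result = mat_mul(result, base, m)
        base = mat_mul(base, base, m)
        e >>= 1
    return result


def ordinary_idempotent(u, p, precision, n=12):
    """Approximate e = lim U^(n!) modulo p^precision (n = 12 is ample here)."""
    return mat_pow(u, factorial(n), p ** precision)


p, precision = 3, 4
U = [[2, 1], [0, 3]]  # hypothetical stand-in for U_p: eigenvalues 2 (unit), 3 (non-unit)
e = ordinary_idempotent(U, p, precision)
print(e)
```

In this toy case \( e \) is the projector onto the eigenline for the unit eigenvalue 2 along the eigenline for 3, so its image plays the role of the ordinary summand and its kernel the role of the complement on which \( U_p \) is topologically nilpotent.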
Hida established the following \( p \)-integral \( p \)-adic interpolation of ordinary \( (f,v) \). For simplicity letting \( p \) be odd, let \begin{equation} \label{eq: Lambda decomp} \Lambda := \mathbb{Z}_p[\![ \mathbb{Z}_p^\times ]\!] \simeq \prod_{\nu : \mathbb{F}_p^\times \to \mathbb{Z}_p^\times} \mathbb{Z}_p[\![ X]\!], \tag{1.2.1} \end{equation}
the Iwasawa algebra of \( \mathbb{Z}_p^\times \). The isomorphism at the right comes from the decomposition \[\mathbb{Z}_p^\times \simeq \mathbb{F}_p^\times \times (1 + p\mathbb{Z}_p), \]
and sends the topological generator \( [1+p] \in 1 + p\mathbb{Z}_p \) to \( 1+X \). The spectrum of \( \Lambda \) may be thought of as a weight space, where \( k \in \mathbb{Z} \) corresponds to a homomorphism \[ \varphi_k : \Lambda \to \mathbb{Z}_p, \quad \mathbb{Z}_p^\times \ni [z] \mapsto z^{k-1}, \]
and the additional data of (the \( p \)-typical part of a) primitive nebencharacter \[ \chi_v : (\mathbb{Z}/Np^r\mathbb{Z})^\times \to \overline{\mathbb{Q}}_p^\times \]
of modulus \( Np^r \) gives rise to a homomorphism \[ \varphi_{k,\chi} : \Lambda \to \mathbb{Z}_p[\zeta_{p^{r-1}}] .\]
All of the Hecke algebras above are naturally \( \Lambda \)-algebras along the map \[ \Lambda \to \mathbb{T}_k(p^r), \qquad \mathbb{Z}_p^\times \ni [z] \mapsto z^{k-1}\cdot \langle z\rangle_{p^r}, \]
where \( \langle z\rangle_{p^r} \) refers to the level \( p^r \) diamond Hecke operator evaluated at \( z \pmod{p^r} \).
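The decomposition \( \mathbb{Z}_p^\times \simeq \mathbb{F}_p^\times \times (1 + p\mathbb{Z}_p) \) and the weight maps \( \varphi_k \) can be made concrete by working modulo \( p^n \). The following is a minimal sketch with hypothetical function names; it assumes \( p \) odd and uses the standard fact that \( z^{p^{n-1}} \) is the Teichmüller representative of \( z \) modulo \( p^n \).

```python
def teichmuller(z, p, n):
    """Teichmuller representative of z modulo p^n (z must be prime to p)."""
    return pow(z, p ** (n - 1), p ** n)


def decompose(z, p, n):
    """Write z = omega * u in (Z/p^n)^x with omega^(p-1) = 1 and u = 1 mod p."""
    modulus = p ** n
    omega = teichmuller(z, p, n)
    u = z * pow(omega, -1, modulus) % modulus
    return omega, u


def phi_k(z, k, p, n):
    """Weight-k specialization [z] -> z^(k-1), computed modulo p^n."""
    return pow(z, k - 1, p ** n)


p, n = 5, 3
omega, u = decompose(7, p, n)
print(omega, u)  # omega is a 4th root of unity mod 125, u is 1 mod 5
```

Because \( \varphi_k \) is defined on group-like elements \( [z] \), it is multiplicative in \( z \); changing \( k \) by a multiple of \( (p-1)p^{n-1} \) does not change \( \varphi_k \) modulo \( p^n \), which is one elementary way to see why weights vary \( p \)-adically continuously.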
\( \mathbb{T}_k^{\operatorname{ord}}(p^r)_{\chi_v} \) denoting the pushout along \[ \chi_v : \mathbb{Z}_p[(\mathbb{Z}/p^r\mathbb{Z})^\times] \to \mathbb{Z}_p[\zeta_{p^{r-1}}] \]
of \( \mathbb{T}_k^{\operatorname{ord}}(p^r) \).
One thinks of \( \mathbb{T}_N^{\operatorname{ord}} \) as a universal 1-dimensional family of Hecke eigensystems because it is finite and flat over \( \Lambda \), and because \( \Lambda \) has one parameter in the sense of \eqref{eq: Lambda decomp}. On many local component rings of \( \mathbb{T}_N^{\operatorname{ord}} \) (which is semilocal), the rank relative to \( \Lambda \) is 1, meaning that \( \mathbb{T}_N^{\operatorname{ord}} \) has a local factor that is isomorphic to \( \mathbb{Z}_p[\![ X]\!] \) compatibly with the weight map \( \Lambda \to \mathbb{T}_N^{\operatorname{ord}} \) and one of the factors of the decomposition \eqref{eq: Lambda decomp}. For notational convenience, let us write \( \bar f \) for a congruence class of ordinary \( (f,v) \) of tame level \( N \); these classes bijectively label the maximal ideals of \( \mathbb{T}_N^{\operatorname{ord}} \). On such local components, which we write as \( \mathbb{T}_{N,\bar f}^{\operatorname{ord}} \), there is a smooth 1-parameter family where the parameter is the weight, and one can think of this family as parameterizing an ordinary cusp form \( (f_k,v_k) \in S_k^{\operatorname{ord}}(p^0) \) of weight \( k \geq 2 \) by specializing along each weight map \( \varphi_k : \Lambda \to \mathbb{Z}_p \) that factors through \( \Lambda \twoheadrightarrow \mathbb{Z}_p[\![ X]\!] \).
Hida further found that Galois representations interpolate along with the Hecke eigensystems. For simplicity, we state the case of rank 1 relative to \( \Lambda \), where we have the ordinary cusp forms \( (f_k,v_k) \) parameterized by the weight \( k \).
In particular, the representation \[ \rho \otimes_{\mathbb{Z}_p[\![ X]\!]} \overline{\mathrm{Frac}(\mathbb{Z}_p[\![ X]\!])} \]
is irreducible and \( \operatorname{Tr} \rho(\operatorname{Fr}_\ell) \) is equal to the image of the Hecke operator \( T_\ell \) in \( \mathbb{Z}_p[\![ X]\!] \), for all primes \( \ell \nmid Np \). Under a mild assumption, there is an isomorphism \begin{equation} \label{eq: ord shape} \rho\vert_{G_p} \simeq \biggl( \begin{matrix} \ast \,&\,\ast \\ {0}&{\nu(U_p)} \end{matrix} \biggr) : G_p \to \operatorname{GL}_2\bigl(\,\overline{\mathrm{Frac}(\mathbb{Z}_p[\![ X]\!])}\,\bigr). \tag{1.2.4} \end{equation}
1.3. The motivating question
How should one think about these “big” Galois representations constructed by Hida, interpolating all of the \( \rho_{f,v} \) corresponding to ordinary \( (f,v) \) in the congruence class \( \bar f \)? The following question expresses some motivation for Mazur’s investigation.
is a projection onto a local factor of \( \mathbb{T}_N^{\operatorname{ord}} \) as in Theorem 1.2.3, is this \( \rho^{\operatorname{ord}} \) identical to \( \rho \)?
As a first step to translate the motivating question into a precise question, we address the residual case. Indeed, letting \( \mathbb{F} \) denote the residue field of \( \smash{\mathbb{T}_{N,\bar f}^{\operatorname{ord}}} \), the congruence class \( \bar f \) can be thought of as the \( \mathbb{F} \)-valued Hecke eigensystem arising via the residue map \( \mathbb{T}_{N,\bar f}^{\operatorname{ord}} \twoheadrightarrow \mathbb{F} \). There exists a unique semisimple Galois representation \[ \bar\rho = \rho_{\bar f} : G_{\mathbb Q} \to \operatorname{GL}_2(\mathbb{F}) \]
characterized in the same way as in Theorem 1.2.3: \[ \operatorname{Tr} \bar\rho(\operatorname{Fr}_\ell) = T_\ell, \qquad \text{for all primes } \ell \nmid Np, \]
where we think of \( T_\ell \in \mathbb{F} \) via the residue map. We call this the residual (Galois) representation associated to \( \bar f \).
Next, having fixed this \( \bar\rho \), we may understand the possible Galois representations that can arise from the ordinary \( (f,v) \) in the congruence class \( \bar f \) by following Mazur in the opening lines of his article [1]:
Given a continuous homomorphism \[ G_{{\mathbb Q},S} \stackrel{\bar{\rho}}{\longrightarrow} \operatorname{GL}_2(\mathbb{F}_p), \]
… the motivating problem of this paper is to study, in a systematic way, the possible liftings of \( \bar\rho \) to \( p \)-adic representations \[ G_{{\mathbb Q},S} \stackrel{\rho}{\longrightarrow} \operatorname{GL}_2(\mathbb{Z}_p). \]
This notion of lift can be generalized to any complete local \( \mathbb{Z}_p \)-algebra \( (A,\mathfrak{m}_A) \) with residue field \( \mathbb{F}_p \). A lift of \( \bar\rho \) to \( A \) is a continuous homomorphism \( \rho_A : G_{{\mathbb Q},S} \to \operatorname{GL}_2(A) \) such that there is an equality of homomorphisms \( \rho_A \pmod{\mathfrak{m}_A} = \bar\rho \) from \( G_{{\mathbb Q},S} \) to \( \operatorname{GL}_2(\mathbb{F}_p) \). In particular, the putative “largest possible” \( \rho^{\operatorname{ord}} \) related to the congruence class \( \bar f \), as in Question 1.3.1, should be realizable as a lift of \( \bar\rho \) to a local ring \( R^{\operatorname{ord}} = R_{\bar\rho}^{\operatorname{ord}} \).
These developments lead to refined questions, which Mazur addresses.
1.4. Mazur’s introduction of deformation theory and its dimension
Mazur goes on to introduce his systematic study of lifts, introducing deformation theory and initially ignoring the ordinary condition.
We use the techniques of deformation theory. … We prove that if \( \bar\rho \) is absolutely irreducible, there is a universal deformation of \( \bar\rho \), i.e., a complete noetherian local ring \( R = R(\bar\rho) \) with residue field \( \mathbb{F}_p \), and a continuous homomorphism \[ G_{{\mathbb Q},S} \stackrel{\rho}{\longrightarrow} \operatorname{GL}_2(R) \]
(well-defined up to conjugation by an element in \( \operatorname{GL}_2(R) \) which reduces to the identity matrix modulo the maximal ideal in \( R \)) which is universal in an evident sense. Under the assumption that \( p > 2 \) and that \( S \) contains the primes \( p \) and \( \infty \), we show that the Krull dimension of \( R/pR \) is \( \geq 1 \) if \( \det(\rho) \) is even, and it is \( \geq 3 \) if \( \det(\rho) \) is odd, with equality holding if the deformation problem is unobstructed.
As the text indicates, there is a notion of deformation of \( \bar\rho \), which is an equivalence class of lifts under a condition sometimes called strict equivalence: two lifts \( \rho_A \) and \( \rho^{\prime}_A \) are strictly equivalent when there exists a matrix \( x \in 1 + M_2(\mathfrak{m}_A) \subset \operatorname{GL}_2(A) \) such that \( \rho_A = x \cdot \rho^{\prime}_A \cdot x^{-1} \).
Here is a more formal and detailed version of what Mazur has summarized in the introduction above.
- There is a universal deformation of \( \bar\rho \), consisting of the data of a complete Noetherian local \( \mathbb{Z}_p \)-algebra \( R_{\bar\rho} \) with residue field \( \mathbb{F}_p \) and a homomorphism \( \rho^\mathrm{univ}_{\bar\rho} : G_{{\mathbb Q},S} \to \operatorname{GL}_2(R_{\bar\rho}) \). It is universal in the sense that, for any lift \( \rho_A : G_{{\mathbb Q},S} \to \operatorname{GL}_2(A) \) there exists a unique homomorphism \( \phi : R_{\bar\rho} \to A \) such that \( \operatorname{GL}_2(\phi) \circ \rho_{\bar\rho}^\mathrm{univ} \) and \( \rho_A \) are strictly equivalent.
- If \( p > 2 \) and \( S \) contains the primes \( p \) and \( \infty \), then there are bounds on Krull dimension according to the parity of \( \det \bar\rho \).
- If \( \bar\rho \) is odd, then \( \dim R_{\bar\rho} \geq 3 \).
- If \( \bar\rho \) is even, then \( \dim R_{\bar\rho} \geq 1 \).
Mazur’s lower bounds on Krull dimension come from the following relations between deformations and Galois cohomology.
- Lifts \( \tilde \rho \) of \( \bar\rho : G_{{\mathbb Q},S} \to \operatorname{GL}_d(\mathbb{F}_p) \) to the dual numbers \( \mathbb{F}_p[\epsilon]/(\epsilon^2) \) are in bijection with continuous (inhomogeneous) 1-cocycles on \( G_{{\mathbb Q},S} \) valued in the adjoint representation of \( \bar\rho \); these cocycles form an \( \mathbb{F}_p \)-vector space denoted \( Z^1(G_{{\mathbb Q},S}, \operatorname{ad}\bar\rho) \). This follows from the straightforward bijection between such 1-cocycles \( f \) and their realization in a \( 2d \)-dimensional (over \( \mathbb{F}_p \)) representation, which amounts to \( \tilde \rho \), as \[ \tilde\rho = \begin{pmatrix} \bar\rho & \text{``}\epsilon\text{''} f \bar\rho\\ 0 & \bar\rho \end{pmatrix} . \]
- Strict equivalences of such lifts amount to conjugation by matrices of the form \( \bigl(\begin{smallmatrix}{1}&{\ast}\\{0}&{1}\end{smallmatrix}\bigr) \), under which the orbit of \( f \) is precisely a cohomology class. Consequently, deformations to the dual numbers are in bijection with first adjoint cohomology, \( H^1(G_{{\mathbb Q},S},\operatorname{ad}\bar\rho) \).
- Obstructions are realized in second adjoint cohomology by a sort of product map. The simplest example (as Mazur describes in ([1], Section 1.6, Remark)) arises when considering whether some \( \tilde \rho \) as above, with matching cohomology class \( [f] \), extends from the dual numbers to \( \mathbb{F}_p[\epsilon]/(\epsilon^3) \). It extends to \( \mathbb{F}_p[\epsilon]/(\epsilon^3) \) if and only if the cup product \begin{equation} \label{eq: quadratic} H^1(\dotsm) \otimes H^1(\dotsm) \to H^2(G_{{\mathbb Q},S},\operatorname{ad}\bar\rho) \tag{1.4.2} \end{equation} vanishes on \( [f] \otimes [f] \).
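The first two items above can be checked by hand on a finite group. The following is a toy sketch under loud assumptions: a dihedral subgroup of order 8 in \( \operatorname{GL}_2(\mathbb{F}_5) \) stands in for the image of \( \bar\rho \), the matrix `X` is arbitrary, and all names are hypothetical. It verifies that a coboundary \( f(g) = X - gXg^{-1} \) satisfies the cocycle identity, that \( (1 + \epsilon f)\bar\rho \) is multiplicative over the dual numbers, and that this lift is the conjugate of the trivial lift by \( 1 + \epsilon X \), i.e. strictly equivalent to it.

```python
P = 5  # small prime, chosen only for illustration


def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) % P
                       for j in range(2)) for i in range(2))


def add(a, b):
    return tuple(tuple((a[i][j] + b[i][j]) % P for j in range(2)) for i in range(2))


def neg(a):
    return tuple(tuple(-x % P for x in row) for row in a)


def inv(a):
    d = pow((a[0][0] * a[1][1] - a[0][1] * a[1][0]) % P, -1, P)
    return ((a[1][1] * d % P, -a[0][1] * d % P),
            (-a[1][0] * d % P, a[0][0] * d % P))


def generate(gens):
    """Closure of the generators under multiplication (a finite group)."""
    identity = ((1, 0), (0, 1))
    seen, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = mul(g, s)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen


S, T = ((0, 1), (1, 0)), ((2, 0), (0, 3))  # S T S = T^{-1}: dihedral of order 8
G = generate([S, T])
X = ((1, 2), (3, 4))  # arbitrary element of M_2(F_5)


def cocycle(g):
    """The coboundary f(g) = X - g X g^{-1}."""
    return add(X, neg(mul(mul(g, X), inv(g))))


def lift(g):
    """(1 + eps f(g)) g, stored as (constant part, eps part)."""
    return (g, mul(cocycle(g), g))


def dual_mul(a, b):
    """Product in M_2(F_p[eps]/(eps^2))."""
    return (mul(a[0], b[0]), add(mul(a[0], b[1]), mul(a[1], b[0])))


is_cocycle = all(cocycle(mul(g, h)) == add(cocycle(g), mul(mul(g, cocycle(h)), inv(g)))
                 for g in G for h in G)
is_hom = all(dual_mul(lift(g), lift(h)) == lift(mul(g, h)) for g in G for h in G)
# (1 + eps X) g (1 - eps X) = g + eps (X g - g X)
is_conjugate_of_trivial = all(lift(g) == (g, add(mul(X, g), neg(mul(g, X)))) for g in G)
print(len(G), is_cocycle, is_hom, is_conjugate_of_trivial)
```

A cocycle that is not a coboundary would pass the first two checks but fail the third for every choice of `X`; such classes are exactly the nonzero elements of \( H^1 \).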
Here is the general conclusion that Mazur derives from these arguments. Let \( d^i := \dim_{\mathbb{F}_p} H^i(G_{{\mathbb Q},S}, \operatorname{ad}\bar\rho) \).
is canonically isomorphic to \( H^1(G_{{\mathbb Q},S},\operatorname{ad}\bar\rho) \). Choosing an arbitrary surjection \( \sigma : \mathbb{Z}_p[\![ x_1, \dotsc, x_{d^1}]\!] \twoheadrightarrow R_{\bar\rho} \) that is an isomorphism on tangent spaces, there is an injection \begin{equation} \label{eq: obstruction map} \operatorname{Hom}_{\mathbb{F}_p}\biggl(\frac{\ker \sigma}{(p, x_1, \dotsc, x_{d^1}) \cdot \ker \sigma}, \mathbb{F}_p \biggr) \hookrightarrow H^2(G_{{\mathbb Q},S}, \operatorname{ad}\bar\rho). \tag{1.4.4} \end{equation}
Following Mazur, let \[ \delta = d^1 - d^2 \]
denote the expected dimension of \( R_{\bar\rho}/pR_{\bar\rho} \). Under the expectation that \( R_{\bar\rho} \) is \( \mathbb{Z}_p \)-flat, this \( \delta \) is also the expected relative dimension of \( R_{\bar\rho} \) over \( \mathbb{Z}_p \). The following corollary expresses the extent to which we can approach this expectation merely from the deformation theory of Proposition 1.4.3.
If the expected dimension is achieved by \( R_{\bar\rho}/pR_{\bar\rho} \), then \( R_{\bar\rho}/pR_{\bar\rho} \) is a local complete intersection ring. If \( d^2=0 \), then \( R_{\bar\rho} \) is isomorphic to a power series ring over \( \mathbb{Z}_p \) of the expected dimension.
The bounds on Krull dimension given in Theorem 1.4.1 follow from Corollary 1.4.5 and the global Euler characteristic formula, as Mazur argues in greater generality in ([1], Section 1.10).
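The arithmetic behind these bounds is short enough to sketch. The block below is an illustration, not Mazur's argument verbatim: it assumes \( p \) odd (so complex conjugation is diagonalizable with eigenvalues \( \pm 1 \)), takes \( d^0 = 1 \) as for an absolutely irreducible \( \bar\rho \), and uses the global Euler characteristic formula in the form \( d^0 - d^1 + d^2 = \sum_{v \mid \infty} \dim (\operatorname{ad}\bar\rho)^{G_v} - [F:{\mathbb Q}] \dim \operatorname{ad}\bar\rho \). The function names and the allowance for a general number field \( F \) are ours.

```python
def fixed_dim(signs):
    """Dimension of the d x d matrices fixed by conjugation by diag(signs).

    Conjugation scales the (i, j) matrix unit by signs[i] * signs[j],
    so the fixed space is spanned by the units with signs[i] == signs[j].
    """
    return sum(1 for a in signs for b in signs if a == b)


def expected_dimension(d, real_places, complex_places, h0=1):
    """delta = d^1 - d^2 from the global Euler characteristic formula.

    real_places: one tuple of +1/-1 eigenvalues of complex conjugation
                 per real place of the base field (each of length d).
    complex_places: number of complex places of the base field.
    h0: dimension of H^0 of the adjoint (1 if only scalars commute).
    """
    degree = len(real_places) + 2 * complex_places
    archimedean = sum(fixed_dim(s) for s in real_places) + complex_places * d * d
    return h0 + degree * d * d - archimedean


print(expected_dimension(2, [(1, -1)], 0))  # odd 2-dimensional case over Q
print(expected_dimension(2, [(1, 1)], 0))   # even 2-dimensional case over Q
```

For \( d = 1 \) over a field with \( r_2 \) complex places the same function returns \( r_2 + 1 \), independently of the number of real places.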
1.5. Expected dimension and flatness conjectures
Along the way toward presenting the results of Section 1.4, Mazur introduces the following perspectives and questions regarding \( R_{\bar\rho} \) that continue to be of essential interest today.
- The noetherianness of \( R_{\bar\rho} \) follows from an abstract group-theoretic condition, which he calls \( \Phi_p \) and which is now typically called Mazur’s \( \Phi_p \) finiteness condition. The \( \Phi_p \) condition on a profinite group \( G \) is that, for any finite index subgroup of \( G \), its maximal pro-\( p \) quotient is topologically finitely generated. Thanks to the Hermite–Minkowski theorem, \( G_{{\mathbb Q},S} \) satisfies \( \Phi_p \) for all primes \( p \).
- The theory of Section 1.4 works perfectly well under the following generalizations.
- For all dimensions \( d \in \mathbb{Z}_{\geq 1} \) with \( \operatorname{GL}_d \) generalizing \( \operatorname{GL}_2 \) as above.
- For all Galois groups \( G_{F,S} \) for number fields \( F \) and a finite set of places \( S \) of \( F \), which satisfies \( \Phi_p \) for the same reason as does \( G_{{\mathbb Q},S} \).
- For all finite residue fields \( k \), generalizing the \( \mathbb{F}_p \) above.
Since Mazur’s article, other algebraic groups have been considered in place of \( \operatorname{GL}_d \).
- For representations of \( G_{{\mathbb Q},S} \) or other absolute Galois groups, it is natural to ask whether the expected dimension and complete intersection property of Corollary 1.4.5 always hold.
- The question of Krull dimension is subtle and extrapolates from questions that are known to be difficult: in the case \( d=1 \) for \( G_{F,S} \) and any \( \rho \), the expected dimension conjecture (Conjecture 1.5.1, below), is equivalent to the \( p \)-adic Leopoldt conjecture for \( F \) ([1], Lemma 4, Section 1.10). In this case, \( \delta = r_2 + 1 \), where \( r_2 \) is the number of complex places of \( F \).
- Mazur had no example of an irreducible component of \( \operatorname{Spec} R \) on which \( p \) is nilpotent, i.e., which “doesn’t lift to characteristic zero” ([1], Section 1.10, Rem.). This is considered today to be a flatness conjecture (that is, flatness over \( \mathbb{Z}_p \)).
Mazur remarked that he did not have any counterexamples to the following statement, which has since developed into a conjecture encompassing the questions raised by (iii)–(v).
1.6. Intellectual lineage
While we have mainly emphasized the novelty of Mazur’s use of deformation theory, its historical precedents bear mentioning. Indeed, Mazur mentions two in the elided portion of the quote that begins Section 1.4:
We use the techniques of deformation theory. There have been numerous studies of the global variation of representations over \( \mathbb{C} \) of finitely generated groups; cf. the memoir of Lubotzky and Magid [e2] or the recent preprint of Goldman and Millson [e6]. The viewpoint we adopt here is similar, with the exception that in our context (our groups are profinite and our representations are \( p \)-adic) it makes sense only to consider formal deformations.
Goldman and Millson showed that deformations of representations of fundamental groups of compact Kähler manifolds have at most quadratic singularities, i.e., the quadratic term described in \eqref{eq: quadratic} suffices as a presentation of the deformation ring. Lubotzky and Magid deal with moduli varieties of representations of finitely generated groups. In Section 7 of [e2], entitled “Historical remarks” (which touches on a surprisingly broad array of mathematical problems with links to the moduli theory of representations), they point out that Weil, in his study of discrete subgroups \( \Gamma \) of a Lie group with Lie algebra \( \mathfrak{g} \), seems to have been the first to notice that deformations are controlled by \( H^1(\Gamma, \mathfrak{g}) \) [e1]. This is an important insight that Mazur draws upon.
2. Local conditions and the first “R \({}= \mathbb{T} \)”
Having established the above foundational results of Galois deformation theory, Mazur returns to the motivating case, imposing a \( p \)-local ordinary condition on Galois representations and modular forms. The rest of his paper imposes sufficiently narrow restrictions on \( \bar\rho \) so that he can prove that a deformation ring for ordinary Galois representations, \( R_{\bar\rho}^{\operatorname{ord}} \), is naturally isomorphic to an ordinary Hecke algebra. This outcome is the first historical example of an \( R = \mathbb{T} \) theorem ([1], Section 2.5, Proposition 14), setting the stage for the ensuing decades of exploration of these deformation spaces and proofs of modularity theorems. For example, Mazur anticipated one direction of inquiry: after previewing his proof that certain \( R_{\bar\rho}^{\operatorname{ord}} \) are naturally isomorphic to an ordinary Hecke algebra, Mazur asks, “Are all representations [in the deformation space] similarly approximable” even when no ordinary condition is imposed? ([1], p. 386).
2.1. The ordinary deformation ring
Because Galois representations \( \bar\rho: G_{{\mathbb Q},S} \to \operatorname{GL}_2(\mathbb{F}_p) \) associated to holomorphic modular forms are odd (Section 1.1), by Corollary 1.4.5, there are too many dimensions — at least 3, which is expected — compared to Hida’s 1-dimensional families of \( p \)-ordinary Hecke eigensystems (Theorem 1.2.2). Mazur realized that this difference in dimension can be attributed to the \( p \)-local condition that \( p \)-adic Galois representations \( \rho \) associated to \( p \)-ordinary modular forms must satisfy: the restriction \( \rho\vert_{G_p} \) to the decomposition subgroup at \( p \), \( G_p \subset G_{{\mathbb Q},S} \), must be reducible with an unramified quotient as in \eqref{eq: ord shape}.
To make this rigorous, Mazur describes the ordinary local condition: given \( \rho_A : G_{{\mathbb Q},S} \to \operatorname{GL}_2(A) \) and thinking of \( A \oplus A \) as the natural module for the \( G_{{\mathbb Q},S} \)-action via \( \rho_A \), one calls \( \rho_A \) ordinary when there exists an \( A \)-rank 1 free summand \( M \subset A \oplus A \) that is invariant under the inertia subgroup \( I_p \subset G_{{\mathbb Q},S} \).
Mazur claims that there exists a universal ordinary deformation ([1], Proposition 3, Section 1.7) and finds its coordinate ring, the universal ordinary deformation ring \( R_{\bar\rho}^{\operatorname{ord}} \), as a quotient of \( R_{\bar\rho} \) under a supplemental assumption ([1], Proposition 12, Section 2.5). The Zariski-closedness of a local condition gained recognition as an important goal in the theory.
2.2. “Neat” residual representations
Now that \( R_{\bar\rho}^{\operatorname{ord}} \) exists, Mazur’s next challenge is to show that it is small enough to be isomorphic to an ordinary Hecke algebra. To do this, he chooses a situation where the ambient deformation ring \( R_{\bar\rho} \) is as nice as possible, relative to the restrictions of Corollary 1.4.5, when \( \bar\rho \) is irreducible and odd: this is the case where \( R_{\bar\rho} \) is isomorphic to a power series ring over \( \mathbb{Z}_p \) in 3 variables. He shows that this case occurs when \( \bar\rho \) satisfies a condition called “neat,” which we will develop shortly.
Mazur considers the case where \( S = \{p, \infty\} \), so \( G_{{\mathbb Q},S} \) is the Galois group of \( {\mathbb Q} \) ramified at only \( p \) and \( \infty \); and selects the residual representation \( \bar\rho \) to be an induced representation \[ \bar\rho = \operatorname{Ind}_{{\mathbb Q}(\sqrt{-p})}^{\mathbb Q} \bar\chi : G_{{\mathbb Q},S} \to \operatorname{GL}_2(\mathbb{F}_p) \]
from a nontrivial character \( \bar\chi : G_{{\mathbb Q}(\sqrt{-p})} \to \mathbb{F}_p^\times \) of the imaginary quadratic field \( {\mathbb Q}(\sqrt{-p}) \) contained in the \( p \)-th cyclotomic field; thus one must have \( p \equiv 3\pmod{4} \). Mazur imposes additional assumptions, as follows, and also lists some primes \( p \) for which these assumptions are satisfied:
- the class number of \( L = \overline{\mathbb{Q}}^{\ker \bar\rho} \) is relatively prime to \( p \);
- the abelian cyclic extension \( L/{\mathbb Q}(\sqrt{-p}) \) is ramified only at the prime of \( {\mathbb Q}(\sqrt{-p}) \) over \( p \);
- \( \bar\rho \) satisfies a neatness condition of ([1], Section 1.12), which means that certain \( \operatorname{Gal}(L/{\mathbb Q}) \)-modules constructed from the unit group \( L^\times \) have no nontrivial \( \operatorname{ad}\bar\rho \)-isotypical parts.
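The congruence \( p \equiv 3 \pmod 4 \) stated before the list of assumptions can be seen numerically. A standard fact, not spelled out in the text, is that the quadratic Gauss sum \( g = \sum_{a} \bigl(\tfrac{a}{p}\bigr) \zeta_p^a \) lies in \( {\mathbb Q}(\zeta_p) \) and satisfies \( g^2 = (-1)^{(p-1)/2} p \), so that \( \sqrt{-p} \in {\mathbb Q}(\zeta_p) \) exactly when \( p \equiv 3 \pmod 4 \). The sketch below, with hypothetical function names, evaluates \( g^2 \) in floating point.

```python
import cmath


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p and a prime to p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def gauss_sum_squared(p):
    """Square of the quadratic Gauss sum attached to the odd prime p."""
    g = sum(legendre(a, p) * cmath.exp(2j * cmath.pi * a / p) for a in range(1, p))
    return g * g


for p in (3, 5, 7, 11, 13):
    # Real part is -p when p = 3 mod 4 and +p when p = 1 mod 4, up to rounding.
    print(p, p % 4, round(gauss_sum_squared(p).real))
```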
The neatness assumption is good enough to deduce that \( H^2(G_{{\mathbb Q},S}, \operatorname{ad}\bar\rho) = 0 \), so that the unrestricted Galois deformation ring \( R_{\bar\rho} \) is isomorphic to a power series ring \( \mathbb{Z}_p[\![ X_1,X_2,X_3 ]\!] \).
Then Mazur checks that there is a unique lift of \( \bar\rho \) to \( \rho : G_{{\mathbb Q}, S} \to \operatorname{GL}_2(\mathbb{Z}_p) \) such that the image of \( \rho \) is isomorphic to the image of \( \bar\rho \) via reduction modulo \( p \). He can also identify this as the Galois representation \( \rho_f \) attached to a weight 1 \( p \)-ordinary cuspidal eigenform \( f \) of level \( \Gamma_0(p) \).
Because \( \bar\rho \) is induced, its image is a dihedral group. Such representations will be called globally dihedral in what follows.
2.3. Establishing \( R = \mathbb{T} \)
Now we have enough information to express Mazur’s ordinary \( R = \mathbb{T} \) theorem and sketch its proof. Let \( R_{\bar\rho}^{\operatorname{ord}} \) denote the universal ordinary deformation ring for \( \bar\rho \), which is naturally a quotient \( R_{\bar\rho} \twoheadrightarrow R_{\bar\rho}^{\operatorname{ord}} \). Let \( \mathbb{T}^{\operatorname{ord}} \) denote Hida’s ordinary Hecke algebra of tame level \( N=1 \). The weight 1 eigenform \( f \) gives a homomorphism \( \phi_f : \mathbb{T}^{\operatorname{ord}} \to \mathbb{Z}_p \) sending each Hecke operator to its eigenvalue on \( f \). Let \( \mathbb{T}_{\bar f}^{\operatorname{ord}} \) denote the completion of \( \mathbb{T}^{\operatorname{ord}} \) at the maximal ideal that is the kernel of the composite \[ \phi_{\bar f} : \mathbb{T}^{\operatorname{ord}} \stackrel{\phi_f}{\longrightarrow} \mathbb{Z}_p \twoheadrightarrow \mathbb{F}_p. \]
Mazur’s argument in ([1], Sections 2.1–2.5) to establish a natural isomorphism \( R^{\operatorname{ord}}_{\bar\rho} \stackrel{\sim}{\rightarrow} \mathbb{T}^{\operatorname{ord}}_{\bar f} \) proceeds through the following steps.
- The lift \( \rho_f \) of \( \bar\rho \) is the only lift (among all coefficient rings \( (A,\mathfrak{m}_A) \), properly interpreted) under which the image of the inertia subgroup \( I_p \) is constant, i.e., maps isomorphically to \( \bar\rho(I_p) \) modulo \( \mathfrak{m}_A \) (Proposition 9 and Lemma 5, [1]).
- There is a quotient \( R_{\bar\rho} \twoheadrightarrow R^\mathrm{dih}_{\bar\rho} \) factoring all of the maps \( R_{\bar\rho} \to A \) such that the associated Galois representation \( \rho_A \) is dihedral, i.e., it has pro-dihedral image. Moreover, there is an isomorphism \( R^{\mathrm{dih}}_{\bar\rho} \simeq \mathbb{Z}_p [\![ t_1, t_2]\!] \) (Lemma 6, Proposition 11, [1]). Because \( \bar\rho_f \) and \( \rho_f \) have dihedral image, the associated map \( R_{\bar\rho} \to \mathbb{Z}_p \) factors through \( R_{\bar\rho} \twoheadrightarrow R^{\mathrm{dih}}_{\bar\rho} \).
- The ordinary and globally dihedral loci within \( \operatorname{Spf} R_{\bar\rho} \) are transverse in the strongest possible way, notwithstanding the fact that the \( \mathbb{Z}_p \)-point associated to \( \rho_f \) lies in both of these two subloci (Lemma 7, [1]). Namely, the intersection of the globally dihedral and ordinary subspaces of the tangent space \( \mathfrak{t}_{\bar\rho} \) of \( \operatorname{Spf} R_{\bar\rho} \), \[ \mathfrak{t}_{\bar\rho} = \operatorname{Hom}_{\mathbb{F}_p} \biggl(\frac{\mathfrak{m}_{\bar\rho}}{(\mathfrak{m}_{\bar\rho}^2, p)}, \mathbb{F}_p\biggr) \cong H^1(G_{{\mathbb Q},S}, \operatorname{ad}\bar\rho) \simeq \mathbb{F}_p^{\oplus 3}, \] is the zero subspace.
- Because \( R^\mathrm{dih}_{\bar\rho} \simeq \mathbb{Z}_p [\![ t_1, t_2]\!] \), the globally dihedral subspace of \( \mathfrak{t}_{\bar\rho} \) has dimension 2. Since \( \mathfrak{t}_{\bar\rho} \) has dimension 3, the transversality result of the preceding step implies that the tangent space of the ordinary sublocus has dimension \( \leq 1 \) (Proposition 13, [1]). This is equivalent to the existence of some surjection \( \mathbb{Z}_p[\![ t]\!] \twoheadrightarrow R_{\bar\rho}^{\operatorname{ord}} \).
- By interpolating the Galois representations associated to the ordinary eigenforms parameterized by \( \mathbb{T}_{\bar f}^{\operatorname{ord}} \) following Hida, there is a representation \[ \rho_{\mathbb{T}_{\bar f}^{\operatorname{ord}}} : G_{{\mathbb Q},S} \to \operatorname{GL}_2(\mathbb{T}_{\bar f}^{\operatorname{ord}}). \] This representation is ordinary and deforms \( \bar\rho \), and therefore there is a surjective map \( R_{\bar\rho}^{\operatorname{ord}} \twoheadrightarrow \mathbb{T}_{\bar f}^{\operatorname{ord}} \).
- Because \( \mathbb{T}_{\bar f}^{\operatorname{ord}} \) is \( \Lambda \)-flat and \( R_{\bar\rho}^{\operatorname{ord}} \) is a quotient of \( \mathbb{Z}_p[\![ t]\!] \), the surjection provides an isomorphism \[ R_{\bar\rho}^{\operatorname{ord}} \cong \mathbb{T}_{\bar f}^{\operatorname{ord}}. \]
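The last steps of this argument can be summarized as a short dimension count. The following is only a sketch. The notation \( \mathfrak{t}^{\mathrm{dih}} \) and \( \mathfrak{t}^{\operatorname{ord}} \) for the globally dihedral and ordinary subspaces of \( \mathfrak{t}_{\bar\rho} \) is introduced here for illustration, and the sketch assumes, as in Hida theory, that \( \mathbb{T}_{\bar f}^{\operatorname{ord}} \) is finite as well as flat over \( \Lambda \cong \mathbb{Z}_p[\![ T ]\!] \). \[ \dim \mathfrak{t}^{\operatorname{ord}} = \dim\bigl(\mathfrak{t}^{\operatorname{ord}} + \mathfrak{t}^{\mathrm{dih}}\bigr) - \dim \mathfrak{t}^{\mathrm{dih}} + \dim\bigl(\mathfrak{t}^{\operatorname{ord}} \cap \mathfrak{t}^{\mathrm{dih}}\bigr) \leq 3 - 2 + 0 = 1. \] Given this bound, the composite \[ \mathbb{Z}_p[\![ t]\!] \twoheadrightarrow R_{\bar\rho}^{\operatorname{ord}} \twoheadrightarrow \mathbb{T}_{\bar f}^{\operatorname{ord}} \] is a surjection from a domain of Krull dimension 2 onto a ring that, under the finiteness and flatness assumption, also has Krull dimension 2. A nonzero kernel would cut the dimension of the quotient to at most 1, so the kernel is zero and both surjections are isomorphisms.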
2.4. Complements
Among the parts of his paper that we have not discussed in detail, Mazur engages in prescient discussions of 2-dimensional Galois representations deforming the neat residual representation \( \bar\rho \) fixed above, as follows.
- The closed locus of \( \operatorname{Spf} R_{\bar\rho} \) in which the action of \( I_p \) is reducible is the union of two hypersurfaces, each of which consists of representations whose restriction to \( I_p \) stabilizes a fixed flag. The intersection of these two hypersurfaces is a line in which the action of \( I_p \) is reducible and decomposable. Such loci figured into studies of Ghate–Vatsal [e18], [e24], who proved (under mild assumptions) that the \( I_p \)-reducible and decomposable locus generically consists of dihedral representations of \( G_{{\mathbb Q},S} \).
- The analytic space \( (\operatorname{Spf} R_{\bar\rho})^\mathrm{an} \) associated to \( \operatorname{Spf} R_{\bar\rho} \) admits a map, due to Sen, to the 2-dimensional analytic affine space of quadratic monic polynomials in \( {\mathbb{Q}}_p[t] \). Loosely speaking, this keeps track of Hodge–Tate weights \( p \)-adically interpolating over \( (\operatorname{Spf} R_{\bar\rho})^\mathrm{an} \). Mazur asks several provocative questions about loci with weights \( (0,0) \). The algebraization and characterization of loci with fixed such weights played a substantial role in subsequent studies and applications of Galois deformation theory.
3. The influence of Galois deformation theory
The vast influence of Mazur’s paper manifests in many ways. To conclude, what follows is an attempt to mention some of the most prominent directions represented among developments sparked by Mazur’s paper, though undoubtedly there are more.
(1) There are many further Zariski-closedness results and characterizations regarding loci within Galois deformation spaces with local properties of interest as in Section 2.4, one of the first being the PhD thesis of Ramakrishna [e9]. Later, Kisin expanded such results to a broader array of conditions from \( p \)-adic Hodge theory [e22], [e19].
(2) Wiles [e11] and Taylor–Wiles [e12] developed methods — the numerical criterion and the Taylor–Wiles method — to compare Galois deformation rings with Hecke algebras and establish \( R = \mathbb{T} \) theorems, followed by works of many others to expand the scope of these methods. Among such efforts, innovations to the Taylor–Wiles method due to Kisin [e22], [e21] stand out, as he applied these innovations to establish the modularity conjecture (Fontaine–Mazur conjecture for \( \operatorname{GL}_2/{\mathbb Q} \)) in a broad array of cases.
(3) Skinner–Wiles [e15] applied the Taylor–Wiles method to address residually reducible cases of the Fontaine–Mazur conjecture for \( \operatorname{GL}_2/{\mathbb Q} \).
(4) There are explicit studies of spaces of deformations of Galois representations, including those of Mazur’s PhD supervisees Boston [e7] and Böckle [e16].
(5) The “infinite fern” of modular points within Galois deformation spaces was identified by Mazur [3] and Gouvêa–Mazur [2]. This work, along with Coleman’s development of the theory of overconvergent \( p \)-adic modular forms [e13], led to the development of \( p \)-adic analytic families of modular forms, the first example given by the Coleman–Mazur eigencurve [4].
(6) Clozel–Harris–Taylor [e20] expanded the scope of deformation-theoretic methods (e.g., the Taylor–Wiles method) to prove automorphy theorems for \( d \)-dimensional Galois representations satisfying conditions such that they ought to arise from automorphic representations of unitary groups, generalizing to \( d > 2 \) the case of \( d=2 \) and \( \operatorname{GL}_2 \). There were other earlier studies of deformation theory for Galois representations valued in higher rank reductive groups, for example due to Tilouine [e14].
(7) There are efforts such as those of Diamond–Flach–Guo [e17] and Bellaïche–Chenevier [e23] to use Galois deformation spaces, and their relation to Hecke algebras, to draw conclusions about Selmer groups and Bloch–Kato conjectures.
(8) The development of the theory of pseudocharacters due to Wiles [e5], Taylor [e8], Carayol [e10], and others applying deformation theory to the trace function arising from representations. This was further developed and applied by Bellaïche–Chenevier [e23] and Chenevier [e25]. A decisive development of the theory of pseudocharacters, extending the notion to \( G \)-valued representations where \( G \) is a reductive group, was provided by V. Lafforgue in the course of his proof of the unramified Langlands correspondence over function fields [e26].
(9) The introduction of stacks of Galois representations such as in work of the author [e29] and of Emerton–Gee [e34], varying among all residual representations. In particular, these no longer require the restriction, necessary for Mazur’s deformation theory, that the residual representation has no nonscalar automorphisms.
(10) The proof of the expected dimension and \( \mathbb{Z}_p \)-flatness properties, Conjecture 1.5.1, is due in the \( \ell \)-adic local case to Böckle–Iyengar–Paškūnas [e35] (when \( \ell = p \)), and to Helm [e30], Dat–Helm–Kurinczuk–Moss [e37] and Zhu [e38] (when \( \ell \neq p \)). It is currently open in the global case.
(11) Venkatesh–Galatius’s expansion of the scope of Galois deformation techniques into derived settings [e28], setting up the Galois side of derived enrichments first set up on the automorphic side by Calegari–Geraghty [e27].
(12) Recently there have been spectacular applications of higher dimensional Galois representations and their deformation theory, such as Newton–Thorne’s work on symmetric power functoriality for modular forms [e31], [e32] and Hilbert modular forms [e39], and that of Boxer–Calegari–Gee–Pilloni on the modularity of abelian surfaces [e33], [e36].
There is now a lot of attention directed at the potential for geometrizations of the arithmetic Langlands correspondences analogous to those of the geometric Langlands program. Such geometrization emphasizes sheaves on moduli stacks of Galois representations over the points of this stack, suggesting that there are even more yet-unexplored layers of significance of the Galois deformation spaces pioneered by Mazur.
Carl Wang-Erickson is currently an assistant professor of mathematics at the University of Pittsburgh. His research focuses on the relationship between \( p \)-adic families of Galois representations and the cohomology of modular curves.