by Wen-Ching Winnie Li
1. Introduction
1.1 Ogg’s impact on classical modular forms
The group \( \operatorname{SL}_2(\mathbb R) \) acts on the upper half-plane \( \mathfrak H \) by fractional linear transformations. A classical modular form is a holomorphic function on \( \mathfrak H \) with symmetries with respect to a congruence subgroup \( \Gamma \) of \( \operatorname{SL}_2(\mathbb Z) \). It is called a cusp form if it vanishes at the cusps of \( \Gamma \). A modular form has a Fourier expansion. Historically, the arithmetic of modular forms is understood by way of studying their Fourier coefficients. The twentieth century witnessed fantastic advancements in modular forms, from theory to applications. Listed below are some landmark breakthroughs relevant to the theme of this article.
- In the 1930s Hecke introduced the Hecke operators \( T_p \) for primes \( p \) acting on modular forms, thereby reducing the study of modular forms to common eigenforms of \( T_p \) (see [e5], Numbers 39, 40).
- Weil’s converse theorem [e9] of 1967 provides an analytic criterion for a holomorphic function on \( \mathfrak H \) defined by given Fourier coefficients to be a modular form for \( \Gamma_0(N) \). This originates from Hecke’s converse theorem for modular forms for \( \operatorname{SL}_2(\mathbb Z) \), namely the case \( N=1 \) [e1].
- Atkin and Lehner’s newform theory [e10] of 1970 further reduces the study of common Hecke eigenforms for congruence subgroups of type \( \Gamma_0(N) \) to newforms.
Weil’s converse theorem and the Atkin–Lehner newform theory listed above were extended to all classical modular forms in the 1970s by Miyake [e12] and Li [e16].
Andrew Ogg made important contributions to classical modular forms in the 1960s and the 1970s. His 1969 Benjamin notes Modular Forms and Dirichlet Series [1] elaborated the converse theorem by Hecke and Weil, among other things. It was a popular book in modular forms for a long time.
To explain his contribution to the newform theory, let \[ f(\tau) = \sum_{n\ge 1} a_n e^{2\pi i n\tau} \] be a cuspidal newform of weight \( k \), level \( N \) and character \( \chi \), normalized with \( a_1=1 \). Then \( f \) is an eigenfunction of the Hecke operator \( T_p \) with eigenvalue \( a_p \) for primes \( p\, \nmid\, N \), and the Fourier coefficients \( a_n \) are determined by \( a_p \) for all primes \( p \) in the following way, expressed in terms of the L-function attached to \( f \): \begin{align*} L(f,s) &=\sum_{n\ge 1} a_n n^{-s}\\ &= \prod_{p\,|\,N}\,\frac{1}{1-a_p p^{-s}} ~\prod_{p\, \nmid\, N}\,\frac{1}{1 - a_pp^{-s} + \chi(p)p^{k-1-2s}}. \end{align*}
For primes \( p \,\nmid\, N \), the Ramanujan conjecture, proved by Deligne [e13], [e14] for weight \( k > 2 \), by Eichler and Shimura [e2], [e4] for \( k=2 \), and by Deligne and Serre [e15] for \( k=1 \), asserts that \[ |a_p| \le 2p^{(k-1)/2}. \]
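As an illustration of how the Euler product determines \( a_n \) from the \( a_p \), here is a minimal sketch for the simplest case \( N=1 \) with trivial \( \chi \). It uses the recursion \( a_{p^{r+1}} = a_p a_{p^r} - \chi(p)p^{k-1}a_{p^{r-1}} \) implied by the local factor at \( p \nmid N \), together with multiplicativity. The sample data are the classical coefficients \( \tau(2)=-24 \), \( \tau(3)=252 \) of the weight-12 form \( \Delta \); the function names are illustrative and not taken from the article.

```python
def prime_power_coeff(a_p, p, r, k, chi_p=1):
    """a_{p^r} from a_p for p not dividing N, via the degree-2 Euler factor."""
    prev, cur = 0, 1  # convention: a_{p^{-1}} = 0, a_{p^0} = 1
    for _ in range(r):
        prev, cur = cur, a_p * cur - chi_p * p ** (k - 1) * prev
    return cur

def factorize(n):
    """Trial-division factorization, returned as {prime: exponent}."""
    factors, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def coeff(n, a_primes, k):
    """a_n as the product of the a_{p^r} over the prime powers dividing n."""
    result = 1
    for p, r in factorize(n).items():
        result *= prime_power_coeff(a_primes[p], p, r, k)
    return result

def satisfies_ramanujan(a_p, p, k):
    """|a_p| <= 2 p^{(k-1)/2}, tested exactly as a_p^2 <= 4 p^{k-1}."""
    return a_p * a_p <= 4 * p ** (k - 1)

tau = {2: -24, 3: 252}  # prime-indexed coefficients of Delta (weight 12, level 1)
print([coeff(n, tau, 12) for n in (4, 6, 8, 9)])
```

The bound is tested in squared form so that only integer arithmetic is involved.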
The size of \( a_p \) for \( p\,|\,N \), as described below, was proved in [e16] using a theorem of Ogg in [3]: \( |a_p| = p^{(k-1)/2} \) if \( \chi \) is not a character mod \( N/p \); when \( \chi \) is a character mod \( N/p \), then \( a_p^2 = \chi(p)p^{k-2} \) if \( p\parallel N \), and \( a_p=0 \) otherwise.
Ogg also studied the convolution of L-series attached to cusp forms. Given two cusp forms \( f \) and \( g \) of weight \( k \) and level \( N \) with Fourier coefficients \( \{a_n\} \) and \( \{b_n\} \) respectively, Rankin introduced the convolution L-series \[ L_{f,g}(s) = \prod_{p\,\nmid\, N}\,\frac{1}{1-p^{-2s}} \,\sum_{n \ge 1} \,a_n \overline{b_n} n^{-(s-k+1)} \] defined for \( \Re s \) large. When \( N=1 \), he showed that it had an analytic continuation to the whole complex plane and obtained a functional equation relating the values at \( s \) and \( 1-s \). Ogg [2] extended this to square-free \( N \) by supplementing the missing Euler factors at primes dividing \( N \).
The case of an arbitrary pair of newforms was settled in the late 1970s in [e16], [e19]. Langlands reinterpreted the L-functions attached to cuspidal newforms as L-functions attached to cuspidal automorphic representations of \( \operatorname{GL}_2 \) over \( \mathbb Q \). In the context of representation theory, the above result can be rephrased as determining the local L- and \( \epsilon \)-factors of \( \pi_1 \times \pi_2 \) for two cuspidal automorphic representations \( \pi_1 \) and \( \pi_2 \) of \( \operatorname{GL}_2 \) over \( \mathbb Q \) so that, when multiplied together, they give rise to a global functional equation for the L-function \( L(\pi_1 {\times} \pi_2, s) \) under \( s \to 1-s \).
1.2 Computing traces of Hecke operators
We have seen that a newform is a common eigenfunction of the Hecke operators and is determined by the eigenvalues of the Hecke operators. Thus it is natural to ask, given a congruence subgroup \( \Gamma \) of \( \operatorname{SL}_2(\mathbb R) \), how to compute the traces of Hecke operators on the space of modular forms for \( \Gamma \) with a given weight. Since the space of modular forms for \( \Gamma \) decomposes as the direct sum of the space of Eisenstein series and the space of cusp forms, and the former is well understood, we are interested in explicit trace formulae of Hecke operators on the space \( S_{k+2}(\Gamma) \) of cusp forms of weight \( k+2\ge 2 \) for \( \Gamma \). When \( k=0 \), the cusp forms of weight 2 are in one-to-one correspondence with holomorphic differentials on the compactified modular curve \( X_\Gamma= \Gamma \backslash \mathfrak H^* \), where \( \mathfrak H^* \) is \( \mathfrak H \) for \( \Gamma \) cocompact, and it is \( \mathfrak H \cup \{\text{cusps}\} \) for \( \Gamma \) non-cocompact; hence the dimension of the space \( S_2(\Gamma) \) is the genus of \( X_\Gamma \). In particular, \( S_2(\Gamma) \) is trivial if \( X_\Gamma \) has genus 0. Furthermore, we shall assume \( k \) even if \( -\operatorname{Id} \in \Gamma \) because in this case the space is trivial for \( k \) odd.
This was done by Ahlgren [e27] for \( \Gamma_0(4) \); Ahlgren and Ono [e26] for \( \Gamma_0(8) \); Frechette, Ono and Papanikolas [e28] for newforms of \( \Gamma_0(8) \); Fuselier [e31], [e35] for \( \operatorname{SL}_2(\mathbb Z) \); and Lennon [e33], [e32] for \( \Gamma_0(3) \) and \( \Gamma_0(9) \). They all followed Ihara’s approach in [e8], combining counting elliptic curves over finite fields with the Selberg trace formula. In [e38] Ono and Saad did it for \( \Gamma_0(2) \) and \( \Gamma_0(4) \), following Zagier’s work [e25] for \( \operatorname{SL}_2(\mathbb Z) \). Their method uses the Rankin–Cohen brackets of Zagier’s mock modular form.
All the groups mentioned above have genus 0 and are non-cocompact. When \( \Gamma \) is cocompact, the modular curve \( X_\Gamma \) is called a Shimura curve. It has no cusps. In this case we identify the space of modular forms with cusp forms of \( \Gamma \) by abuse of language. Ogg was interested in Shimura curves \( X_\Gamma \) and modular forms for \( \Gamma \) in the 1980s; see [4], [5]. Hecke operators on forms for a cocompact \( \Gamma \) are defined in the same way as for non-cocompact \( \Gamma \). However, due to the lack of cusps, there is no a priori choice of a point on the Shimura curve \( X_\Gamma \) to play the role that the cusp at \( \infty \) plays for non-cocompact groups in facilitating the study of the Hecke operators. This has been a major obstacle in studying the arithmetic of modular forms on Shimura curves. Yang’s paper [e34] illustrates this point well.
Inspired by Long, Li and Tu [e37] and by Scholl [e22], in a recent joint work with Hoffman, Long and Tu [e39], we obtained explicit trace formulae of Hecke operators on \( S_{k+2}(\Gamma) \) in terms of hypergeometric character sums for certain arithmetic triangle groups \( \Gamma \). Our geometric approach gives a unified treatment for \( \Gamma \) elliptic modular (the non-cocompact case), including the aforementioned groups, and \( \Gamma \) arising from the indefinite quaternion algebra \( B_6 = \bigl(\frac{-1, 3}{\mathbb{Q}}\bigr) \) over \( \mathbb{Q} \) with discriminant 6 (the cocompact case). The same method can also be applied to obtain eigenvalues of the Hecke operators.
1.3 Geometric interpretation of the trace formula
Let \( \Gamma \) be a congruence subgroup commensurable with \( \operatorname{SL}_2(\mathbb Z) \) (hence non-cocompact) or the norm 1 group \( \mathcal O^1 \) of a maximal order \( \mathcal O \) of an indefinite nonsplit quaternion algebra \( B \) over \( \mathbb{Q} \) (hence cocompact). Assume the compactified modular curve \( X_\Gamma \) has a Shimura canonical model defined over \( \mathbb{Q} \).
The first geometric interpretation of the Hecke operators was the celebrated Eichler–Shimura congruence relation introduced by Eichler [e2] and generalized by Shimura [e4] for Hecke operators acting on weight-2 cusp forms for elliptic modular groups. This was extended to quaternionic groups by Kuga and Shimura [e7]. For forms of weight \( k+2 \ge 3 \), using the moduli interpretation of \( X_\Gamma \), Deligne [e13] for \( \Gamma \) non-cocompact and Ohta [e20] for \( \Gamma \) cocompact constructed, for each prime \( \ell \), an automorphic \( \ell \)-adic sheaf \( V^k(\Gamma)_\ell \) on \( X_\Gamma \otimes \overline {\mathbb Q} \) which provided the following geometric interpretation of the Hecke traces.
Combined with the Grothendieck–Lefschetz fixed point formula, we obtain a geometric interpretation of the Hecke trace in terms of the sum of Frobenius traces: \begin{eqnarray} -{\operatorname{Tr}} (T_p\mid S_{k+2} (\Gamma)) = \sum_{\lambda \in X_{\Gamma}(\mathbb F_p)} {\operatorname{Tr}}({\operatorname{Frob}}_\lambda \mid (V^{k} (\Gamma)_{\ell})_{\bar{\lambda}} ). \end{eqnarray}
Here \[ {\operatorname{Tr}}({\operatorname{Frob}}_\lambda \mid (V^{k} (\Gamma)_{\ell})_{\bar{\lambda}} )= {\operatorname{Tr}}({\operatorname{Frob}}_\mathfrak P \mid (V^{k} (\Gamma)_{\ell})_{\bar{\lambda^{\prime}}} ) \] for any algebraic point \( \lambda^{\prime} \) on \( X_\Gamma \) which reduces to \( \lambda \) modulo a degree-1 prime \( \mathfrak P \) above \( p \).
1.4 Our approach
Equation (1) would give an explicit Hecke trace formula if one could compute the Frobenius traces on stalks of the automorphic sheaf. We show that this can be carried out when the group \( \Gamma \) is an arithmetic triangle group of type (a) or (b) as specified in Section 2.1. More precisely, we prove that, for such a \( \Gamma \), the automorphic sheaf with minimal \( k \), namely \( V^1(\Gamma)_\ell \) for \( \Gamma \) of type (a) and \( V^2(\Gamma)_\ell \) for \( \Gamma \) of type (b), up to twist by a degree-1 sheaf, is isomorphic to the hypergeometric sheaf attached to the hypergeometric datum \( \operatorname{HD}(\Gamma) \) introduced by Katz in [e24], [e30] and further extended by Beukers, Cohen and Mellit in [e36], for which the Galois action on a stalk has Frobenius traces explicitly expressed by hypergeometric character sums. Since, as explained in Section 3.1, for the groups we consider the Frobenius traces on sheaves with larger \( k \) are symmetric powers of those with minimal \( k \), we obtain the desired Hecke traces for all \( k \ge 1 \).
Our unified approach has the following two merits.
I. Suppose the space \( S_{k+2}(\Gamma) \) has dimension \( m \). Using the same method, we can compute the traces of \( (\operatorname{Frob}_{p})^r \) for \( 1 \le r \le m \), which in turn give rise to the eigenvalues of \( T_p \) on \( S_{k+2}(\Gamma) \). This is especially useful when we explore Hecke eigenvalues for modular forms on Shimura curves.
II. Let \( \Gamma^{\prime} \) be a congruence subgroup of \( \Gamma \), where \( \Gamma \) is an arithmetic triangle group of type (a) or (b) while \( \Gamma^{\prime} \) is not. Assume that the curve \( X_{\Gamma^{\prime}} \) is defined over \( \mathbb{Q} \) and there is an explicit \( \mathbb{Q} \)-rational covering map \( \pi: X_{\Gamma^{\prime}} \to X_{\Gamma} \). Then the automorphic sheaves on \( X_{\Gamma^{\prime}} \) are the pull-backs of those on \( X_{\Gamma} \) via \( \pi^* \). By pulling back the hypergeometric sheaf on \( X_\Gamma \) via \( \pi^* \) we can express the Hecke traces on \( S_{k+2}(\Gamma^{\prime}) \) explicitly in terms of the hypergeometric character sums attached to \( \operatorname{HD}(\Gamma) \).
1.5 Organization of this paper
This paper is organized as follows. Section 2.1 describes the arithmetic triangle subgroups \( \Gamma \) of types (a) and (b) considered in this paper. In Section 2.2 the explicit trace formula for the Hecke operator \( T_p \) on \( S_{k+2}(\Gamma) \) is stated in Theorem 2 for \( \Gamma \) of type (a) and in Theorem 3 for \( \Gamma \) of type (b). Two examples for the cocompact group \( \Gamma=(2,4,6) \) of type (b) are illustrated in Section 2.3. The first is the trace formula for the one-dimensional \( S_8(2,4,6) \); in this case the trace of \( T_p \) is also the eigenvalue of \( T_p \). The second is the two-dimensional \( S_{24}(2,4,6) \). We demonstrate that by applying the same method to compute the traces of \( \operatorname{Frob}_p \) and \( (\operatorname{Frob}_{p})^2 \), we obtain the two eigenvalues of \( T_p \). As an application, Theorem 4 in Section 2.4 gives the trace of \( T_p \) on \( S_{k+2}(2,6,6) \) expressed in terms of hypergeometric character sums attached to the datum for \( (2,4,6) \). Observe that \( (2,6,6) \) is not of type (a) or (b). The trace formula is obtained by pulling back the hypergeometric sheaf on \( X_{(2,4,6)} \) along the explicit 2-fold projection \( \pi_2: X_{(2,6,6)} \to X_{(2,4,6)} \) defined over \( \mathbb{Q} \). The automorphic sheaves \( V^k(\Gamma) \) and hypergeometric sheaves \( \mathcal H(\operatorname{HD}(\Gamma)) \) on \( X_\Gamma \) as well as their key properties are recalled in Section 3.1 and Section 3.2, respectively, each having a complex part affording the action of \( \Gamma \) and an \( \ell \)-adic counterpart for each prime \( \ell \) affording Galois actions. In geometric language, these two parts can be combined and reinterpreted as a sheaf on \( X_\Gamma \) of its étale fundamental group. A sketch of the proof of our main results, Theorems 2 and 3, is given in Section 4.
We use Katz’s rigidity theorem (Theorem 7) and our comparison theorem (Theorem 8) to prove that \( V^1(\Gamma) \simeq \mathcal H(\operatorname{HD}(\Gamma)) \) for \( \Gamma \) of type (a) and \( V^2(\Gamma) \) is isomorphic to the twist of \( \mathcal H(\operatorname{HD}(\Gamma)) \) by an explicit degree-1 sheaf for \( \Gamma \) of type (b). This gives rise to the contributions of \( \operatorname{Frob}_\lambda \) at nonsingular \( \lambda \in X_\Gamma(\mathbb F_p) \) in (1), which correspond to \( \lambda \in \mathbb F_p^\times, \lambda \ne 1 \) in the trace formulas stated in Theorems 2 and 3. Singular points of \( X_\Gamma(\mathbb F_p) \) arise from the cusps and elliptic points of \( X_\Gamma \). The contribution at a cusp is 1 for \( k \) even; for \( k \) odd, it is \( \pm 1 \) or 0, depending on the type of reduction at \( p \) of the degenerate elliptic curve representing the cusp. Each elliptic point \( \lambda \) of \( X_\Gamma \) is a CM point. The contribution at \( \lambda \) depends on \( k \), the order of the point and the behavior of \( p \) in the CM field. The contributions at singular points are computed separately, and the celebrated Néron–Ogg–Shafarevich criterion is used.
2. Main results
2.1 The groups we consider
Let \( e_1, e_2, e_3 \) be elements in \( \mathbb Z_{ > 0} \cup\{\infty\} \). Defined in terms of generators and relations, an arithmetic triangle group \[ (e_1, e_2, e_3) := \langle g_1, g_2, g_3~|~g_1^{e_1} = g_2^{e_2} = g_3^{e_3} = g_1g_2g_3= \operatorname{Id} \rangle \] can be realized as a discrete subgroup \( \Gamma \) of \( \operatorname{PSL}_2(\mathbb R) \) acting on \( \mathfrak H \). A fundamental domain of \( \Gamma \) is a triangle with the three vertices \( v_1, v_2, v_3 \) stabilized by (a conjugate of) \( g_1, g_2, g_3 \), respectively. Here \( v_i \) is a cusp of \( \Gamma \) if \( e_i = \infty \); if \( e_i \) is a positive integer, then \( v_i \) is an elliptic point of order \( e_i \). These groups are classified by Takeuchi [e17], [e18]. In order that the compactified modular curve \( X_\Gamma \) afford a hypergeometric sheaf over \( \mathbb{Q} \), it must be of genus zero and its Shimura canonical model must be defined over \( \mathbb{Q} \) with the three vertices of \( X_\Gamma \) of orders \( e_1, e_2, e_3 \) being \( \mathbb{Q} \)-rational. As explained in ([e39], Section 3), the concerns on the required properties of the hypergeometric sheaves on \( X_\Gamma \) led us to the seven subgroups, \( \Gamma = (e_1, e_2, e_3) \) listed below; those of type (a) are isomorphic to a subgroup of \( \operatorname{SL}_2(\mathbb Z) \) not containing \( -\operatorname{Id} \), while those of type (b) are isomorphic to a subgroup of \( \operatorname{SL}_2(\mathbb R) \) modulo \( \pm \operatorname{Id}{:} \)
- (a) \( (3, \infty, \infty) \simeq \Gamma_1(3) \), \( (\infty, \infty, \infty) \simeq \Gamma_1(4) \);
- (b) \( (2, \infty, \infty) \simeq \Gamma_0(2)/\{\pm \operatorname{Id}\} \), \( (2,3, \infty) \simeq \operatorname{PSL}_2(\mathbb Z) \), \( (2,4, \infty) \simeq \langle \Gamma_0(2), w_2\rangle/\{\pm \operatorname{Id}\}=\Gamma_0(2)^+/\{\pm \operatorname{Id}\} \), \( (2,6,\infty) \simeq \langle \Gamma_0(3), w_3 \rangle/\{\pm \operatorname{Id}\}=\Gamma_0(3)^+/\{\pm \operatorname{Id}\} \), \( (2,4,6) = \langle \mathcal O_{B_6}^1, w_2, w_3, w_6\rangle/\{\pm \operatorname{Id}\} \).
Here \( w_2, w_3, w_6 \) are the Atkin–Lehner involutions. They are well known for congruence subgroups of \( \operatorname{SL}_2(\mathbb Z) \). We explain those for the quaternion algebra \( B_6 \). A maximal order of \( B_6 \), which is unique up to conjugation, is \[ \mathcal O_{B_6} = \mathbb Z + \mathbb ZI + \mathbb ZJ + \mathbb Z \,\frac{1+I+J+IJ}{2}, \] where \( I^2=-1 \), \( J^2=3 \), \( IJ=-JI \). The Atkin–Lehner involutions \( w_6, w_2, w_3 \) are defined as the elements \( 3I+IJ \), \( 1+I \), \( (3+3I+J+IJ)/2 \) of \( \mathcal O_{B_6} \), respectively, normalized by dividing by the positive square root of the respective reduced norm 6, 2, 3. They act on the curve \( X_{(2,4,6)} \) and stabilize the elliptic points of orders \( 2,4,6 \), respectively.
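The reduced norms quoted above can be checked directly. The sketch below assumes the standard formula \( \operatorname{nrd}(a+bI+cJ+dIJ) = a^2 - \alpha b^2 - \beta c^2 + \alpha\beta d^2 \) for the quaternion algebra with \( I^2=\alpha \), \( J^2=\beta \), \( IJ=-JI \), specialized to \( \alpha=-1 \), \( \beta=3 \). The function name and the coordinate tuples are illustrative.

```python
from fractions import Fraction

def reduced_norm(a, b, c, d, alpha=-1, beta=3):
    """Reduced norm of a + b*I + c*J + d*IJ, where I^2 = alpha, J^2 = beta, IJ = -JI."""
    a, b, c, d = (Fraction(x) for x in (a, b, c, d))
    return a * a - alpha * b * b - beta * c * c + alpha * beta * d * d

half = Fraction(1, 2)

# Coordinates (a, b, c, d) with respect to the basis 1, I, J, IJ.
elements = {
    "w6": (0, 3, 0, 1),                          # 3I + IJ
    "w2": (1, 1, 0, 0),                          # 1 + I
    "w3": (3 * half, 3 * half, half, half),      # (3 + 3I + J + IJ)/2
    "generator": (half, half, half, half),       # (1 + I + J + IJ)/2
}

norms = {name: reduced_norm(*coords) for name, coords in elements.items()}
print(norms)
```

Exact rational arithmetic is used because two of the elements have half-integer coordinates.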
2.2 Statements of main results
A hypergeometric datum is a set \( \operatorname{HD}=\{\alpha, \beta\} \), where \( \alpha=\{a_1,\dots, a_n\} \) and \( \beta=\{b_1=1, b_2,\dots, b_n\} \) are multisets with \( a_i, b_j \in \mathbb{Q}^\times \). It is primitive if \( a_i -b_j \notin \mathbb Z \) for \( 1 \le i, j \le n \). We associate a primitive hypergeometric datum \( \operatorname{HD}(\Gamma) = \{\alpha(\Gamma), \beta(\Gamma)\} \) to each of the above seven subgroups \( \Gamma \), as shown in Table 1. \begin{gather*} \begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|} \hline \Gamma=(e_1,e_2,e_3)&(3, \infty, \infty) & (\infty, \infty, \infty) & (2,\infty,\infty)&(2,3,\infty)&(2,4,\infty)&(2,6,\infty)&(2,4,6)\\ \hline \alpha(\Gamma) & \bigl\{\frac13, \frac23\bigr\} & \bigl\{\frac12, \frac12\bigr\} & \bigl\{\frac12,\frac12,\frac12\bigr\} &\bigl\{\frac12,\frac16,\frac56\bigr\}&\bigl\{\frac12,\frac14,\frac34\bigr\}&\bigl\{\frac12,\frac13,\frac23\bigr\}&\bigl\{\frac12,\frac14,\frac34\bigr\}\\ \beta(\Gamma)& \{1,1\} & \{1,1\} & \{1,1,1\}&\{1,1,1\}&\{1,1,1\}&\{1,1,1\}& \bigl\{1,\frac56,\frac76\bigr\}\\ \hline \end{array} \\ \textbf{Table 1.}\,\text{The hypergeometric datum } \operatorname{HD}(\Gamma) \text{ attached to } \Gamma. \end{gather*}
When the sets \( \alpha, \beta \) in a primitive hypergeometric datum \( \operatorname{HD}=\{\alpha, \beta\} \) are defined over \( \mathbb{Q} \), for each prime \( p \) and \( \lambda \in \mathbb F_p^\times \), Beukers, Cohen and Mellit defined in [e36] the hypergeometric character sum \( H_p(\operatorname{HD}, \lambda) \). The detailed definition is a bit long and will be omitted. To give a flavor, suppose \( \alpha=\{a_1,\dots, a_n\} \), \( \beta=\{b_1=1, b_2,\dots, b_n\} \). Consider primes \( p \equiv 1 \pmod M \), where \( M = \operatorname{lcd}(\operatorname{HD}) \) is the least common denominator of all \( a_i \) and \( b_j \). Let \( \lambda \in \mathbb F_p^\times \). Then \( H_p(\operatorname{HD}, \lambda) \) can be expressed in terms of Gauss sums as follows: \[ H_p(\operatorname{HD}(\Gamma), \lambda) = \,\frac{1}{1-p}\sum_{s=0}^{p-2}\omega^s((-1)^n\lambda)\prod_{j=1}^n\,\frac{g(\omega^{(p-1)a_j+s})g(\bar{\omega}^{(p-1)b_j+s})}{g(\omega^{(p-1)a_j})g(\bar{\omega}^{(p-1)b_j})}. \] In the above, \( \omega \) is a character of \( \mathbb F_p^\times \) of order \( p-1 \), and \( g(\omega^j) \) is the Gauss sum attached to the character \( \omega^j \) and a fixed nontrivial additive character of \( \mathbb F_p \). The resulting expression is independent of the choice of the additive character and \( \omega \) for our \( \alpha, \beta \) since they are defined over \( \mathbb{Q} \). Note that when characters \( \omega^{(p-1)b_j+s} \) and \( \omega^{(p-1)b_j} \) are nontrivial, we have \[ \omega^s(-1)\,\frac{g(\omega^{(p-1)a_j+s})g(\bar{\omega}^{(p-1)b_j+s})}{g(\omega^{(p-1)a_j})g(\bar{\omega}^{(p-1)b_j})} = \frac{g(\omega^{(p-1)a_j+s})/g(\omega^{(p-1)a_j})}{g(\omega^{(p-1)b_j+s})/g(\omega^{(p-1)b_j})} \] by a Gauss sum identity. 
Therefore \( H_p(\operatorname{HD},\lambda) \) is regarded as a finite field analog of the complex-valued classical hypergeometric function \( _nF_{n-1}(\alpha, \beta; \lambda) \) defined by \[_nF_{n-1}(\alpha, \beta; \lambda) = \sum_{s=0}^{\infty} \prod_{j=1}^n \,\frac{\Gamma(a_j+s)/\Gamma(a_j)}{\Gamma(b_j+s)/\Gamma(b_j)} \lambda^s \] in terms of the usual \( \Gamma \)-function for \( \lambda \in {\mathbb C} \) such that the series converges.
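The Gauss sum identity used above follows from the classical relation \( g(\chi)g(\bar\chi) = \chi(-1)p \) for a nontrivial character \( \chi \) of \( \mathbb F_p^\times \). The following sketch checks both numerically for small primes. It assumes a particular generator \( \omega \), defined through the smallest primitive root, and the additive character \( x \mapsto e^{2\pi i x/p} \); the helper names are illustrative.

```python
import cmath

def primitive_root(p):
    """Smallest generator of the cyclic group F_p^x, for an odd prime p."""
    for g in range(2, p):
        if len({pow(g, t, p) for t in range(p - 1)}) == p - 1:
            return g
    raise ValueError("expected an odd prime")

def gauss_sum(p, e):
    """g(omega^e) = sum over x != 0 of omega^e(x) * exp(2*pi*i*x/p).

    omega is the character with omega(g^t) = exp(2*pi*i*t/(p-1)) for the
    generator g chosen above; negative e gives the conjugate character.
    """
    g = primitive_root(p)
    total, x = 0, 1
    for t in range(p - 1):              # x runs through g^0, g^1, ..., g^(p-2)
        total += cmath.exp(2j * cmath.pi * (e * t / (p - 1) + x / p))
        x = x * g % p
    return total

def reflection_gap(p, e):
    """|g(omega^e) g(omega^-e) - omega^e(-1) p|; should vanish when omega^e is nontrivial."""
    sign = (-1) ** e                    # omega(-1) = -1 because -1 = g^((p-1)/2)
    return abs(gauss_sum(p, e) * gauss_sum(p, -e) - sign * p)

def ratio_identity_gap(p, A, B, s):
    """Difference of the two sides of the displayed identity, with A = (p-1)a_j, B = (p-1)b_j."""
    lhs = (-1) ** s * gauss_sum(p, A + s) * gauss_sum(p, -(B + s)) / (
        gauss_sum(p, A) * gauss_sum(p, -B))
    rhs = (gauss_sum(p, A + s) / gauss_sum(p, A)) / (
        gauss_sum(p, B + s) / gauss_sum(p, B))
    return abs(lhs - rhs)

print(max(reflection_gap(7, e) for e in range(1, 6)))
print(ratio_identity_gap(7, 3, 2, 1))
```

Both printed quantities should be zero up to floating-point rounding. The exponents passed to `ratio_identity_gap` must be chosen so that \( \omega^{B} \) and \( \omega^{B+s} \) are nontrivial, as the text requires.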
We shall express \( \operatorname{Tr}(T_p, S_{k+2}(\Gamma)) \) in terms of these hypergeometric character sums according to the type of \( \Gamma \).
Denote by \( N_\Gamma = 3 \) (resp. 4) the level of the group \( (3, \infty, \infty)=\Gamma_1(3) \) (resp. \( (\infty, \infty, \infty) = \Gamma_1(4) \)). Observe that \( {\operatorname{Tr}}(T_p, S_{k+2}(\Gamma))=0 \) when \( p \equiv -1 \pmod{N_\Gamma} \) and \( k \) is odd.
To state our results for groups of type (b), for integers \( m \ge 1 \), let \( F_m(S, T) \) be the degree-\( m \) polynomial in \( S \) and \( T \) defined by the recursive relation \begin{align} & F_{m+1}(S, T) = (S-T)F_m(S, T) - T^2F_{m-1}(S, T),\notag\\ \label{F_m} & F_0(S, T) = 1, \quad F_1(S, T)=S. \end{align}
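For orientation, the recursion can be unrolled by hand or by machine. The sketch below evaluates \( F_m(S,T) \) at integer arguments and compares it with the expansions \( F_2 = S^2 - ST - T^2 \) and \( F_3 = S^3 - 2S^2T - ST^2 + T^3 \), which follow from the recursion by direct computation. The function name `F` is chosen for illustration only.

```python
def F(m, S, T):
    """Evaluate F_m(S, T) using F_{m+1} = (S - T) F_m - T^2 F_{m-1}, F_0 = 1, F_1 = S."""
    if m == 0:
        return 1
    prev, cur = 1, S                    # (F_0, F_1)
    for _ in range(m - 1):
        prev, cur = cur, (S - T) * cur - T * T * prev
    return cur

def matches_low_degree_expansions(bound=6):
    """Compare F_2 and F_3 with their expanded forms on a grid of integer points."""
    for S in range(-bound, bound + 1):
        for T in range(-bound, bound + 1):
            if F(2, S, T) != S**2 - S * T - T**2:
                return False
            if F(3, S, T) != S**3 - 2 * S**2 * T - S * T**2 + T**3:
                return False
    return True

print(matches_low_degree_expansions())
```

With \( S = a_{(2,4,6)}(\lambda,p) \) and \( T = p \), the expansion of \( F_3 \) is the cubic that appears in the weight-8 example of Section 2.3.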
In fact, we may choose \( \alpha_{N(z),p} \) to be the Jacobi sum \( J_\omega\bigl(\frac13,\frac13\bigr) \) (resp. \( J_\omega\bigl(\frac14,\frac14\bigr) \)) if \( N(z) \in \{3, 6\} \) (resp. \( N(z) = 4 \)). Here \( \omega \) is any generator of the group of characters of \( \mathbb F_p^\times \). We remark that a Jacobi sum \( J_\omega(a,b) \) is itself a hypergeometric character sum.
2.3 Examples
To exhibit concrete examples, consider modular forms on the Shimura curve \( X_{(2,4,6)} \) arising from the quaternion algebra \( B_6 \). The group \( \mathcal O_{B_6}^1 \) of norm 1 elements in \( \mathcal O_{B_6} \) modulo center \( \{\pm \operatorname{Id}\} \) is denoted by \( (2,2,3,3) \) since its fundamental domain contains four elliptic points, two of order 2 and two of order 3. As explained by Baba and Granath in ([e29], Section 3.1), the algebra of modular forms on \( (2,2,3,3) \) is a polynomial ring generated by forms \( h_4, h_6, h_{12} \) of weights \( 4, 6, 12 \), respectively, subject to one relation \( h_{12}^2 + 3h_6^4 + h_4^6 = 0 \). In other words, \[ \bigoplus_{k \ge0} S_{2k}(2,2,3,3)= \mathbb C[h_4, h_6, h_{12}]/(h_{12}^2 + 3h_6^4 + h_4^6). \] Moreover, \( h_4, h_6, h_{12} \) are eigenfunctions of the Atkin–Lehner operators \( w_2 \) and \( w_3 \) with eigenvalues \[ -1, -1, \quad +1, -1,\quad\text{ and }\quad -1, +1,\quad\text{respectively}. \] Hence they are also eigenfunctions of \( w_2w_3=w_6 \).
As \( (2,4,6) = \langle \mathcal O_{B_6}^1, w_2, w_3 \rangle/\{\pm \operatorname{Id}\} \), the lowest \( k \) with nontrivial \( S_{k+2}(2,4,6) \) is \( k=6 \), in which case \( S_8(2,4,6)=\langle h_4^2 \rangle \) is one-dimensional. By the Jacquet–Langlands correspondence [e11], [e34], \( h_4^2 \) corresponds to the normalized weight-8 level 6 cuspidal newform \( f_{6.8.a.a} \) in the L-functions and Modular Forms Database (LMFDB) notation. For primes \( p > 5 \), denote by \( a_p(f) \) the eigenvalue of \( T_p \) on an eigenfunction \( f \). Thus \( a_p(h_4^2) \) is equal to \( a_p(f_{6.8.a.a}) \). Theorem 3 above gives \[ \eqalign{ - a_p(h_4^2) &= -a_p(f_{6.8.a.a})\cr &=\sum_{\lambda \in \mathbb{F}_p,\lambda \neq 0,1} \bigl(a_{(2,4,6)}(\lambda,p)^3-2pa_{(2,4,6)}(\lambda,p)^2-p^2a_{(2,4,6)}(\lambda,p)+p^3\bigr)\cr &\qquad{}+p\bigl((pH_p(\operatorname{HD}(2,4,6);1))^2-p^2\bigr) +\Bigl(\bigl(\tfrac{-1}p\bigr)+\bigl(\tfrac{-3}p\bigr)+\bigl(\tfrac{-6}p\bigr)\Bigr)p^3, } \] where \[ a_{(2,4,6)}(\lambda, p) = \bigl(\tfrac{-3(1-1/\lambda)}{p}\bigr)pH_p(\operatorname{HD}(2,4,6), 1/\lambda). \]
The space \( S_{24}(2,4,6)=\langle h_4^6, h_6^4 \rangle \) is two-dimensional. We illustrate how to obtain the two eigenvalues \( a_{1,p} \) and \( a_{2,p} \) of \( T_p \) on this space for a prime \( p > 3 \). Computing the traces of \( \operatorname{Frob}_p \) and \( (\operatorname{Frob}_p)^2 \) on \[ H^1_{\text{ét}}(X_{(2,4,6)}\otimes\overline{\mathbb{Q}}, V^{22}(2,4,6)_{\ell}) \] for a prime \( \ell \neq p \), we obtain \begin{align*} -(a_{1,p}+a_{2,p}) &= \sum_{\lambda \in X_{(2,4,6)}(\mathbb F_p)} \operatorname{Tr}(\operatorname{Frob}_\lambda \mid (V^{22}(2,4,6)_{\ell})_{\bar{\lambda}} ) \\ &= \mathcal E(p)+\sum_{\lambda \in \mathbb F_p,\lambda \neq 0,1}F_{11}(a_{(2,4,6)}(\lambda,p),p),\\ -(a_{1,p}^2+a_{2,p}^2) &= -4p^{23}+\mathcal E(p^2)+\sum_{\lambda \in \mathbb F_{p^2},\lambda \neq 0,1}F_{11}(a_{(2,4,6)}(\lambda,p^2),p^2), \end{align*} where \( \mathcal E(q) \) denotes the total contribution from the elliptic points in \( X_{(2,4,6)}(\mathbb F_q) \). Its value for \( q=p \) is as stated in Theorem 3, and for \( q=p^2 \) it is \[ \mathcal E(p^2)=\sum_{N=2,4,6} \sum_{-\frac{22}{2N} \le i\le \frac{22}{2N}} p^{22} (\alpha_{N,p^2}^2/p^2)^{iN}, \] where \[ \alpha_{4,p^2}= J_\omega\bigl(\tfrac14,\tfrac14\bigr)^2,\quad \alpha_{6,p^2}=J_\omega\bigl(\tfrac13,\tfrac13\bigr)^2 \] for a generator \( \omega \) of \( \widehat{\mathbb F_{p^2}^\times} \), and \( \alpha_{2,p^2}^2 \) is any root of \[ T^2-p^2H_{p^2}(\operatorname{HD}(2,4,6),1)T+p^4=0 .\] The computations give \[ \begin{array}{c|ccccc} p& 5&7&11\\ \hline a_{1,p}+a_{2,p} & 25248156 & 5764462768 & 1017121470024\\ a_{1,p}^2+a_{2,p}^2 & 70010194261011336 & 60171677733273590912 & 3068149691314205892000288 \end{array} \] from which we get the eigenvalues \[ 12624078\pm 5184\beta, \quad 2882231384\pm 129600\beta\quad \text{ and }\quad 508560735012 \pm 31363200\beta \] with \( \beta=\sqrt{1296640489} \) for \( p=5,7,11 \), respectively. 
The two Hecke eigenforms in \( S_{24}(2,4,6) \) correspond to the newforms in the newform orbit 6.24.a.d in LMFDB.
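The passage from the two power sums to the two eigenvalues is a small exercise with Newton's identities: the eigenvalues are the roots of \( x^2 - p_1x + e_2 \) with \( p_1 = a_{1,p}+a_{2,p} \) and \( e_2 = (p_1^2 - p_2)/2 \), where \( p_2 = a_{1,p}^2+a_{2,p}^2 \), so the discriminant is \( 2p_2 - p_1^2 \). The sketch below carries this out on the tabulated values in exact integer arithmetic. The function name and return convention are illustrative, and the radicand 1296640489 is taken from the text rather than derived.

```python
from math import isqrt

def eigenvalue_form(p1, p2, D):
    """Write the roots of x^2 - p1*x + (p1^2 - p2)/2 as m +/- c*sqrt(D).

    p1 and p2 are the first and second power sums of the two eigenvalues.
    Returns (m, c) with integers m, c, or None if the discriminant
    2*p2 - p1^2 is not of the form 4*c^2*D with p1 even.
    """
    disc = 2 * p2 - p1 * p1
    if p1 % 2 or disc < 0 or disc % (4 * D):
        return None
    c_squared = disc // (4 * D)
    c = isqrt(c_squared)
    return (p1 // 2, c) if c * c == c_squared else None

D = 1296640489  # radicand beta^2 quoted in the text

# (p1, p2) columns of the table for p = 5 and p = 7.
table = {
    5: (25248156, 70010194261011336),
    7: (5764462768, 60171677733273590912),
}

for p, (p1, p2) in table.items():
    print(p, eigenvalue_form(p1, p2, D))
```

For \( p=5 \) this returns `(12624078, 5184)` and for \( p=7 \) it returns `(2882231384, 129600)`, in agreement with the eigenvalues listed above. The same call can be applied to the column for \( p=11 \).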
2.4 Applications
The canonical model of the Shimura curve \( X_{(2,2,3,3)} \) is \[ x^2 + 3y^2+z^2=0 ;\] it is defined over \( \mathbb{Q} \), of genus 0, and contains no real points. The Atkin–Lehner involution \( w_2 \) (resp. \( w_3 \)) sends \( [x:y:z] \) to \( [x:-y:z] \) (resp. \( [-x: y:z] \)) with fixed points \( z_2^{\pm} = [1:0:\pm i] \) (resp. \( z_3^{\pm} = [0:1: \pm \sqrt{-3}] \)) elliptic of order 2 (resp. 3). The fixed points \[ z_6^\pm = [\pm \sqrt{-3}:1:0] \] of \( w_6 \) are not elliptic points. The three groups \[ \langle \mathcal O_{B_6}^1, w_i\rangle/\{\pm \operatorname{Id}\} \] for \( i = 3,2,6 \) are denoted \( (2,6,6) \), \( (3,4,4) \) and \( (2,2,2,3) \). As described in [e29], the Shimura curve \( X_{(2,2,3,3)} \) is a 2-fold cover of the curves \[ X_{(2,6,6)},\quad X_{(3,4,4)}, \quad\text{ and }\quad X_{(2,2,2,3)}, \] all projective lines over \( \mathbb{Q} \), with the covering maps sending \[ [x:y:z] \in X_{(2,2,3,3)} \] to \[ [y:z],\quad [x:z], \quad \text{ and } \quad [x:y],\quad\text{respectively}. \] It follows from the definition that \( X_{(2,6,6)} \), \( X_{(3,4,4)} \) and \( X_{(2,2,2,3)} \) are 2-fold covers of \( X_{(2,4,6)} \) under the \( \mathbb{Q} \)-rational covering maps \begin{align*} \pi_2 &: [y:z] \mapsto [-3y^2-z^2 : y^2],\\ \pi_6 &: [x:z] \mapsto \Bigl[x^2 : \frac{-x^2-z^2}{3}\Bigr],\\ \pi_3 &: [x:y] \mapsto [x^2: y^2]. \end{align*}
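The three maps \( \pi_2, \pi_6, \pi_3 \) are compatible with the projections from \( X_{(2,2,3,3)} \): on the conic \( x^2+3y^2+z^2=0 \), each composite sends \( [x:y:z] \) to \( [x^2:y^2] \). The sketch below checks this by brute force on the reduction of the conic modulo a small prime. Working over \( \mathbb F_7 \) is an arbitrary choice made here for illustration, and the function names are hypothetical.

```python
def conic_points(p):
    """Nonzero affine solutions of x^2 + 3y^2 + z^2 = 0 over F_p."""
    return [
        (x, y, z)
        for x in range(p) for y in range(p) for z in range(p)
        if (x, y, z) != (0, 0, 0) and (x * x + 3 * y * y + z * z) % p == 0
    ]

def same_projective_point(u, v, p):
    """[u0:u1] == [v0:v1] in P^1(F_p), tested by the vanishing of a 2x2 determinant."""
    return (u[0] * v[1] - u[1] * v[0]) % p == 0

def composites_agree(p):
    """Check pi_2([y:z]) = pi_6([x:z]) = pi_3([x:y]) for every point of the conic mod p."""
    inv3 = pow(3, -1, p)  # modular inverse; needs Python 3.8+ and p != 3
    for x, y, z in conic_points(p):
        via_pi2 = ((-3 * y * y - z * z) % p, y * y % p)
        via_pi6 = (x * x % p, (-x * x - z * z) * inv3 % p)
        via_pi3 = (x * x % p, y * y % p)
        if not (same_projective_point(via_pi2, via_pi3, p)
                and same_projective_point(via_pi6, via_pi3, p)):
            return False
    return True

print(len(conic_points(7)), composites_agree(7))
```

A smooth conic over \( \mathbb F_p \) has \( p+1 \) projective points, so for \( p=7 \) the list contains \( 8\cdot 6=48 \) affine representatives.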
The relations among these curves are depicted in the diagram below. All covering maps are explicit and \( \mathbb{Q} \)-rational. The three elliptic points on \( X_{(2,4,6)} \) are images of \( z_6^{\pm} \), \( z_2^{\pm} \) and \( z_3^{\pm} \), all \( \mathbb{Q} \)-rational.
Observe that, while the canonical models for \( X_{(2,6,6)} \), \( X_{(3,4,4)} \) and \( X_{(2,2,2,3)} \) are projective lines over \( \mathbb{Q} \), each curve contains an elliptic point which is rational only over a quadratic extension of \( \mathbb{Q} \). This explains why \( (2,6,6) \) and \( (3,4,4) \) are not among the seven \( \Gamma \)’s considered before. Nonetheless, using the explicit projections \( \pi_2, \pi_6, \pi_3 \) defined above, we can pull back the hypergeometric sheaf on \( X_{(2,4,6)} \) to sheaves on the three 2-fold covers \( X_{(2,6,6)} \), \( X_{(3,4,4)} \) and \( X_{(2,2,2,3)} \) to obtain explicit formulae for the traces of Hecke operators on the spaces of modular forms for \( (2,6,6) \), \( (3,4,4) \) and \( (2,2,2,3) \) respectively, and then those for \( (2,2,3,3) \) by inclusion and exclusion. See Section 7.1 of [e39] for more details. Here we state the Hecke trace formula for \( \Gamma = (2,6,6) \) as an example.
We elaborate more on the contributions from elliptic points \( \lambda \in X_\Gamma(\mathbb F_p) \). When \( \lambda \) comes from the \( \mathbb{Q} \)-rational elliptic point \( t=0 \) of \( (2,6,6) \) of order 2, \( \pi_2(0) \) is the elliptic point on \( X_{(2,4,6)} \) of order 4 and the contribution depends on the behavior of \( p \) in \( \mathbb{Q}(\sqrt{-1}) \) as described by \eqref{eq_fo}. This is the unique elliptic point on \( X_\Gamma(\mathbb F_p) \) when \( -3 \) is a nonsquare \( \!\!\pmod p \). In this case there are \( p \) terms in the first sum, coming from \( \lambda \in P^1(\mathbb F_p), \lambda \ne 0 \). When \( -3 \) is a square \( \!\!\pmod p \), \( X_\Gamma(\mathbb F_p) \) has two more elliptic points, from \( t=\pm 1/\sqrt{-3} \), which are projected to the elliptic point of order 6 on \( X_{(2,4,6)} \) by \( \pi_2 \). Hence they give rise to the same contribution, as described by the second rule in \eqref{eq_fo} since \( p \) splits in the CM field \( \mathbb{Q}(\sqrt{-3}) \). In this case the first sum consists of \( p-2 \) terms, coming from \( \lambda \in P^1(\mathbb F_p), \lambda \ne 0, \pm 1/\sqrt{-3} \).
Similar results can be obtained for other suitable subgroups of \( (2,4,6) \) or with \( (2,4,6) \) replaced by the other six \( \Gamma \) studied in Theorems 2 and 3.
Explicit Hecke trace formulae have many applications. For instance, knowing Hecke traces can lead to special values of hypergeometric functions. As an example, it follows from the reasoning in Yang [e34] that the Hecke trace \( a_7(h_4^2) \) above yields the identity \[ {}_3F_2\Biggl[\begin{matrix} \frac12 &\frac14 &\frac34\\ &\frac56 &\frac76 \end{matrix} ;\ \frac{2^{10}\cdot3^{3}\cdot5^{6}\cdot7}{11^{4}\cdot 23^4}\Biggr] =\frac{11\cdot 23}{140\sqrt 3} \,\frac{2^{1/3}(4+2\sqrt 2)}{7^{7/6}} \,\frac{\Gamma(7/6)\Gamma(13/24)\Gamma(19/24)}{\Gamma(5/6)\Gamma(17/24)\Gamma(23/24)}. \] The trace formulae in this paper are used by Grove in [e40] to obtain the vertical Sato–Tate distribution, as \( p \to \infty \), of the normalized \( H_p(\operatorname{HD}(\Gamma); \lambda) \), \( \lambda \in \mathbb F_p^\times \), for \( \Gamma \) of type (a). The distribution for \( \Gamma \) of type (b) can be obtained by a similar idea.
3. Sheaves
3.1 Automorphic sheaves \( V^k(\Gamma)_{\mathbb C} \) and \( V^k(\Gamma)_\ell \)
Write \( X_\Gamma^\circ = X_\Gamma \setminus \{ \)cusps, elliptic points\( \} \). The automorphic sheaves \( V^k(\Gamma)_{\mathbb C} \) and \( V^k(\Gamma)_\ell \) on \( X_\Gamma \) are first defined on \( X_\Gamma^\circ \), then extended to \( X_\Gamma \) along the inclusion \( \iota: X_\Gamma^\circ \to X_\Gamma \). They are first defined for \( \Gamma \) torsion-free; the push-forward is then used to obtain sheaves for groups with torsion elements. When \( \Gamma \) is torsion-free, the natural map \( \mathfrak H \to X_\Gamma^{\circ} \) is a universal cover whose covering group \( \Gamma \) is the fundamental group \( \pi_1(X_\Gamma^{\circ}, *) \). If \( \Gamma \) is not torsion-free, then \( \Gamma \) mod its center is a quotient of \( \pi_1(X_\Gamma^{\circ}, *) \).
The group \( \operatorname{SL}_2(\mathbb R) \) acts on \( {\mathbb C}^2 \) canonically and it gives rise to the action on \( \operatorname{Sym}^k ({\mathbb C}^2) \). Restricting to the subgroup \( \Gamma \) gives the action of \( \Gamma \) on \( \operatorname{Sym}^k ({\mathbb C}^2) \). For each integer \( k\ge 1 \), the complex sheaf \( V^k(\Gamma)_{\mathbb C} \) is a rank \( k+1 \) local system on \( X_\Gamma \) defined by \[ V^k(\Gamma)_{\mathbb C} = \Gamma \backslash \bigl(\mathfrak H^* \times \operatorname{Sym}^k ({\mathbb C}^2)\bigr). \] The stalk at \( x \in X_\Gamma^\circ \) is \( \operatorname{Sym}^k ({\mathbb C}^2) \), that at \( x \) a cusp or elliptic point is \[ \bigl(\operatorname{Sym}^k ({\mathbb C}^2)\bigr)^{\Gamma_x}, \] the subspace fixed by the stabilizer \( \Gamma_x \) of \( x \) in \( \Gamma \). It was shown by Eichler [e3] and Shimura [e6] for \( \Gamma \) non-cocompact and by Kuga and Shimura [e7] for \( \Gamma \) cocompact that \[ H^1(X_\Gamma, V^k(\Gamma)_{\mathbb C}) \simeq S_{k+2}(\Gamma) \oplus \overline{S_{k+2}(\Gamma)}. \]
Deligne [e13] and Ohta [e20] constructed, for each integer \( k \ge 1 \) and a prime \( \ell \), an \( \ell \)-adic sheaf \( V^k(\Gamma)_\ell \) on \( X_\Gamma \otimes \overline {\mathbb Q} \) using the moduli interpretation of \( X_\Gamma \). To ease our notation, we use \( V^k(\Gamma)_{\ell, \bar \lambda} \) to denote the stalk at the point \( \lambda \) of the sheaf \( V^k(\Gamma)_\ell \). For \( \Gamma \) elliptic modular and torsion-free, \( V^k(\Gamma)_\ell \) has rank \( k+1 \). More precisely, from the universal elliptic curve \( \mathcal E \) over \( X_\Gamma^\circ \), the stalk at an algebraic point \( \lambda \in X_\Gamma^\circ(K) \) over a number field \( K \) is \[ V^k(\Gamma)_{\ell,\bar \lambda} = \operatorname{Sym}^{k}~ \bigl(V^1(\Gamma)_{\ell,\bar \lambda}\bigr), \] where \[ V^1(\Gamma)_{\ell, \bar \lambda} = H_{\text{ét}}^1(\mathcal E_\lambda \otimes \overline{\mathbb{Q}}, {\mathbb Q}_\ell) \] endowed with the action of \( \operatorname{Gal}(\overline {\mathbb{Q}}/K) \). In particular, if \( \Gamma \) is elliptic modular and \( -\operatorname{Id} \notin \Gamma \), then at \( \lambda \in X_\Gamma^\circ(\mathbb{Q}) \) we have, for all \( k \ge 1 \) and almost all primes \( p \), \[ \operatorname{Tr} (\operatorname{Frob}_{p} | V^k(\Gamma)_{\ell,\bar {\lambda}}) = \sum_{j=0}^{\lfloor \frac k2 \rfloor }(-1)^j\binom{k-j}{j} p^j \operatorname{Tr} \bigl(\operatorname{Frob}_{p} | V^1(\Gamma)_{\ell,\bar {\lambda}}\bigr)^{k-2j}. \] This is the relation used in Theorem 2.
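The symmetric-power trace relation above lends itself to a quick numerical check. The sketch below assumes, as the formula implicitly does, that \( \operatorname{Frob}_p \) acts on \( V^1(\Gamma)_{\ell,\bar\lambda} \) with trace \( a \) and determinant \( p \). It compares the closed formula with the recursion \( s_k = a\,s_{k-1} - p\,s_{k-2} \) satisfied by complete homogeneous symmetric polynomials in the two eigenvalues. The function names are hypothetical.

```python
from math import comb


def sym_power_trace(a: int, p: int, k: int) -> int:
    """Trace of Frob_p on Sym^k, via the closed formula in the text."""
    return sum(
        (-1) ** j * comb(k - j, j) * p ** j * a ** (k - 2 * j)
        for j in range(k // 2 + 1)
    )


def sym_power_trace_recursive(a: int, p: int, k: int) -> int:
    """Same trace from s_0 = 1, s_1 = a, s_k = a*s_{k-1} - p*s_{k-2}."""
    if k == 0:
        return 1
    prev, cur = 1, a
    for _ in range(k - 1):
        prev, cur = cur, a * cur - p * prev
    return cur
```

For \( k=2 \) and \( k=3 \) both functions give \( a^2 - p \) and \( a^3 - 2pa \), respectively.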
For \( \Gamma \) arising from the indefinite nonsplit quaternion algebra \( B \) over \( \mathbb{Q} \) and torsion-free, \( V^k(\Gamma)_\ell \) has rank \( k+1 \) for \( k \) even. From the universal abelian surface over \( X_\Gamma \) with quaternion multiplication by \( B \), at \( \lambda \in X_\Gamma(K) \) there is an \( \ell \)-adic degree-2 representation \( \rho_{\ell,\lambda} \) of \( \operatorname{Gal}(\overline{\mathbb{Q}}/K) \) such that \[ V^{k}(\Gamma)_{\ell,\bar\lambda} = \operatorname{Sym}^{k}~ (\rho_{\ell,\lambda}), \] similar to the elliptic modular case. The rank of \( V^k(\Gamma)_\ell \) for \( k \) odd is very different. For example, \( V^1(\Gamma)_\ell \) has rank 4. See ([e39], Section 2.2) for more detail.
If \( \Gamma \) contains \( -\operatorname{Id} \), be it elliptic modular or quaternionic, \( V^{k}(\Gamma)_\ell \) and \( V^k(\Gamma)_{\mathbb C} \) are both zero for \( k \) odd; for \( k \ge 2 \) even, \( \lambda \in X_\Gamma^\circ(\mathbb{Q}) \), and almost all primes \( p \), \[ \operatorname{Tr}\bigl(\operatorname{Frob}_{p} | V^k(\Gamma)_{\ell,\bar {\lambda}}\bigr) = F_{k/2}\Bigl(\operatorname{Tr}\bigl(\operatorname{Frob}_{p} | V^2(\Gamma)_{\ell,\bar {\lambda}}\bigr), p\Bigr), \] where \( F_m \) is defined by (2). This is the relation used in Theorem 3.
Hence we only need to replace \( V^2(\Gamma)_\ell \) or \( V^1(\Gamma)_\ell \) by a sheaf whose Frobenius traces are computable. Our goal is to show that, for each of the seven triangle subgroups \( \Gamma \) in Section 2.1, this can be achieved using the hypergeometric sheaf associated to \( \operatorname{HD}(\Gamma) = \{\alpha(\Gamma), \beta(\Gamma)\} \).
3.2 Hypergeometric sheaves \( \mathcal H(\operatorname{HD})_{\mathbb C} \) and \( \mathcal H(\operatorname{HD})_\ell \)
In this subsection we introduce the hypergeometric sheaf, both complex and \( \ell \)-adic, attached to a primitive hypergeometric datum \[ \operatorname{HD} = \{\alpha \,{=}\, \{a_1, \dots,a_n\}, \beta\,{=}\,\{b_1{=}1, b_2,\dots,b_n\}\} \] of length \( n \).
As we have seen in Section 2.2, the classical hypergeometric function associated to \( \operatorname{HD} \) is \[_nF_{n-1}(\operatorname{HD}; z)= \sum_{r\ge 0} \prod_{1 \le i \le n}\frac{\Gamma(a_i+r)/\Gamma(a_i)}{\Gamma(b_i+r)/\Gamma(b_i)} ~z^r. \] Beukers and Heckman showed in [e23] that \( _nF_{n-1}(\operatorname{HD}; z) \) satisfies the \( n \)-th order linear ordinary differential equation \begin{align*} & \bigl[\theta (\theta + b_2-1) \cdots (\theta + b_n-1) - z(\theta+a_1)\cdots (\theta+a_n)\bigr]F = 0,\\ & \theta = z \frac{d}{dz} \end{align*} which has three regular singularities at \( 0, 1, \infty \) with local exponents \[ \eqalign{ &0=1-b_1, 1-b_2,\dots, 1-b_n & \quad\text{ at } z&=0,\cr &a_1,\dots, a_n & \quad\text{ at } z&=\infty,\cr &0, 1,2,\dots, n-2, -1 + \textstyle\sum_{j} (b_j - a_j) & \quad\text{ at } z&=1. } \]
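As a small numerical illustration of the series and of the local exponent at \( z=1 \), the sketch below sums the hypergeometric series term by term and evaluates \( -1+\sum_j (b_j-a_j) \). It is not taken from the cited works: the function names are hypothetical, the series is truncated at a fixed number of terms, and convergence is only assumed for \( |z| < 1 \).

```python
from fractions import Fraction


def hyp_partial_sum(alpha, beta, z, terms=200):
    """Partial sum of nF_{n-1}(HD; z) for HD = {alpha, beta}.

    Consecutive terms differ by the factor z * prod (a_i + r) / (b_i + r),
    which is the ratio of the Gamma quotients in the defining series.
    """
    total, term = 0.0, 1.0
    for r in range(terms):
        total += term
        ratio = 1.0
        for a, b in zip(alpha, beta):
            ratio *= (float(a) + r) / (float(b) + r)
        term *= ratio * z
    return total


def exponent_at_one(alpha, beta):
    """The exceptional local exponent -1 + sum(b_j - a_j) at z = 1."""
    return -1 + sum(beta) - sum(alpha)
```

With \( \alpha=\{1/2\} \), \( \beta=\{1\} \) the partial sums approximate \( (1-z)^{-1/2} \), and for the data \( \alpha=\{\frac12,\frac12\} \) or \( \{\frac13,\frac23\} \) with \( \beta=\{1,1\} \) the exceptional exponent at \( z=1 \) is 0.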
By assembling solutions of this differential equation through the nonsingular points we define a rank-\( n \) complex local system on \( \mathbb P^1({\mathbb C})\setminus\{0, 1, \infty\} \). Fix a base point \( z_0 \in \mathbb P^1({\mathbb C})\setminus\{0, 1, \infty\} \). For \( i \in \{0, 1, \infty\} \), denote by \( C_i \) a simple counterclockwise loop starting and ending at \( z_0 \) enclosing \( i \) and not the other two singularities. Then \( \pi_1(\mathbb P^1({\mathbb C})\setminus\{0, 1, \infty\}, z_0) \) is generated by the homotopy classes \( [C_0] \), \( [C_1] \), \( [C_\infty] \) with one relation \( [C_\infty][C_1][C_0] = \operatorname{Id} \). Let \( g_0, g_1, g_\infty \in \operatorname{GL}_n({\mathbb C}) \) be the monodromy action on the fiber at \( z_0 \) obtained by extending solutions analytically along \( C_0, C_1, C_\infty \), respectively.
The monodromy group of \( \operatorname{HD} \) is the triangle group \[ \langle g_0, g_1, g_\infty ~|~ g_\infty g_1 g_0 = \operatorname{Id} \rangle = ( e_0, e_1, e_\infty ), \] where \( e_i \) is the order of \( g_i \). The eigenvalues of \( g_\infty \) are \( e^{2\pi i a_j} \), those of \( g_0 \) are \( e^{-2\pi i b_j} \), while \( g_1 \) is a pseudoreflection, that is, \( g_1-I_n \) has rank 1. This describes the complex hypergeometric sheaf \( \mathcal H(\operatorname{HD})_{\mathbb C} \) on \( \mathbb P^1({\mathbb C})\setminus\{0, 1, \infty\} \) with the monodromy group \( ( e_0, e_1, e_\infty) \).
On the algebraic side, for each prime \( \ell \), Katz [e24], [e30] introduced an \( \ell \)-adic rank-\( n \) hypergeometric sheaf \( \mathcal H(\operatorname{HD})_\ell \) on the multiplicative group \( \mathbb G_m \) with the action of \( \operatorname{Gal}(\overline {\mathbb{Q}}/\mathbb{Q}(\zeta_M)) \), where \( M =\operatorname{lcd}(\operatorname{HD}) \). Its action on the fiber at \( \lambda \in \mathbb{Q}(\zeta_M)^\times \) has Frobenius traces given by hypergeometric character sums, which are finite-field analogs of \( {}_nF_{n-1}(\operatorname{HD};1/\lambda) \). When the datum \( \operatorname{HD} \) is defined over \( \mathbb{Q} \), that is, the set of the column vectors \[\Bigl\{ \Bigl(\,\begin{matrix} a_i\\b_i \end{matrix}\,\Bigr) \mod {\mathbb Z} \Bigr\} \] is invariant under multiplication by elements in \( (\mathbb Z/M\mathbb Z)^\times \), the Galois action at the fiber above \( \lambda \in \mathbb{Q}^\times \) can be extended to \( \operatorname{Gal}(\overline {\mathbb{Q}}/\mathbb{Q}) \). One such extension, denoted \( \rho^{\operatorname{BCM}}_{\operatorname{HD},\lambda, \ell} \), was studied by Beukers, Cohen and Millet [e36]. We summarize key properties of this representation below.
- for primes \( p\,\nmid\, M\ell \) such that \( \operatorname{ord}_p \lambda = 0 \), we have \[ \operatorname{Tr}\rho_{\operatorname{HD},\lambda,\ell}^{\operatorname{BCM}}(\operatorname{Frob}_p) = \bigl(\tfrac{-1}{p}\bigr)^\delta p^{(n-m)/2} H_p\bigl(\operatorname{HD}; \tfrac{1}{\lambda}\bigr); \]
- the degree of \( \rho_{\operatorname{HD},\lambda,\ell}^{\operatorname{BCM}} \) is \( n \) for \( \lambda \ne 1 \), and \( n-1 \) for \( \lambda = 1 \);
- the characteristic polynomial of \( \operatorname{Frob}_p \) has coefficients in \( {\mathbb Z} \) and all roots in \( {\mathbb C} \) have the same absolute value \( p^{(n-1)/2} \).
Here \( n-m \) is even, \( \delta = 1 \) when \( \sum a_i \equiv 1/2 \mod {\mathbb Z} \) and \( 2\| M \), and \( \delta=0 \) otherwise. Moreover, the family of representations \[ \bigl\{\rho_{\operatorname{HD},\lambda,\ell}^{\operatorname{BCM}} : \text{prime }\ell\bigr\} \] of \( \operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \) is compatible so that the trace of \( \operatorname{Frob}_p \) is independent of the choice of \( \ell \ne p \).
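The condition that a datum be defined over \( \mathbb{Q} \), stated before the list of properties above, can be tested mechanically. The sketch below is illustrative only and rests on two assumptions: \( M \) is taken to be the least common denominator of all parameters, and the collection of columns \( (a_i, b_i) \bmod \mathbb Z \) is compared as a multiset, paired in the order in which \( \alpha \) and \( \beta \) are listed. The function names are hypothetical.

```python
from fractions import Fraction
from math import gcd, lcm


def lcd(alpha, beta):
    """Least common denominator M of all parameters in the datum."""
    return lcm(*(Fraction(x).denominator for x in list(alpha) + list(beta)))


def is_defined_over_Q(alpha, beta):
    """Check invariance of the columns (a_i, b_i) mod Z under (Z/MZ)^x."""
    M = lcd(alpha, beta)
    cols = sorted((Fraction(a) % 1, Fraction(b) % 1) for a, b in zip(alpha, beta))
    for c in range(1, M + 1):
        if gcd(c, M) != 1:
            continue  # only units modulo M act
        scaled = sorted(((c * a) % 1, (c * b) % 1) for a, b in cols)
        if scaled != cols:
            return False
    return True
```

For example, \( \{\{\frac13,\frac23\},\{1,1\}\} \) and \( \{\{\frac12,\frac12\},\{1,1\}\} \) pass the test, while \( \{\{\frac13,\frac13\},\{1,1\}\} \) fails because multiplication by 2 sends \( \frac13 \) to \( \frac23 \).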
Therefore, in order that \( V^2(\Gamma) \) or \( V^1(\Gamma) \) is isomorphic to a hypergeometric sheaf or its twist by a character, the modular curve \( X_\Gamma \) has to possess the following properties:
- \( X_\Gamma \) is defined over \( \mathbb{Q} \) and isomorphic to \( \mathbb P^1 \);
- \( X_\Gamma \setminus X_\Gamma^\circ \) consists of three \( \mathbb{Q} \)-rational points, which we may assume to be \( 0, 1, \infty \);
- \( \Gamma \) or \( \bar \Gamma :=\Gamma/\Gamma\cap \{\pm \operatorname{Id}\} \) is an arithmetic triangle group.
For the seven subgroups \( \Gamma \) in Section 2.1, the modular curves \( X_\Gamma \) meet these conditions, and the data \( \operatorname{HD}(\Gamma)=\{\alpha(\Gamma), \beta(\Gamma)\} \) in Table 1 are defined over \( \mathbb{Q} \) so that Theorem 5 applies.
For our seven subgroups \( \Gamma \) in Section 2.1, we have defined complex and \( \ell \)-adic automorphic sheaves \( V^k(\Gamma) \) and hypergeometric sheaves \( \mathcal H(\operatorname{HD}(\Gamma)) \), first on \( X_\Gamma^\circ \simeq \mathbb P^1 \smallsetminus \{0, 1, \infty\} = X \), then extended to \( X_\Gamma \) along the inclusion \( \iota: X_\Gamma^\circ \to X_\Gamma \). Upon choosing an isomorphism between \( \overline{{\mathbb Q}_\ell} \) and \( {\mathbb C} \), we may combine the complex and \( \ell \)-adic \( V^k(\Gamma) \) (respectively \( \mathcal H(\operatorname{HD}(\Gamma)) \)) together as a sheaf on \( X_\Gamma \) with action by the étale fundamental group \[ \pi_1(X_\mathbb{Q}, \bar \xi)=\operatorname{Gal}(K/\mathbb{Q}(t)) .\]
4. A sketch of the proof of main theorems
Let \( \Gamma=(e_0, e_1, e_\infty) \) be one of the seven triangle subgroups in Section 2.1, and let \( \lambda = \lambda(\Gamma) \) be the parameter of the curve \( X_\Gamma \) taking values \( 0, 1, \infty \) at the three points of order \( e_0, e_1, e_\infty \), respectively. At a prime \( p \) where \( X_\Gamma \) has good reduction, \[ X_\Gamma(\mathbb F_p) = \mathbb P^1(\mathbb F_p) = \mathbb F_p \cup \{\infty\}. \] The contribution to the trace of \( T_p \) on \( S_{k+2}(\Gamma) \) from a nonsingular \( \lambda \in X_\Gamma(\mathbb F_p) \), namely \[ \lambda \in \mathbb F_p^\times, \quad \lambda \ne 1, \] follows from the theorem below, in which \( \chi_d \) denotes the quadratic character attached to \( \mathbb{Q}(\sqrt d) \), which we identify as a character of \( \operatorname{Gal}(\overline {\mathbb{Q}}/\mathbb{Q}) \) by class field theory, or a character of \( \operatorname{Gal}(K/\mathbb{Q}(t)) \) trivial on \( \operatorname{Gal}(K/\overline{\mathbb{Q}}(t)) \) by (6).
- For \( \Gamma \) of type (a), that is, \( \Gamma \) is \( (\infty, \infty, \infty) \) or \( (3, \infty, \infty) \), we have \[ V^1(\Gamma)_\ell \simeq \mathcal H(\operatorname{HD}(\Gamma))_\ell. \]
- For \( \Gamma \) of type (b), that is, \( \Gamma \) is one of \( (2, \infty, \infty), (2,3,\infty), (2,4,\infty), (2,6,\infty),(2,4,6) \), we have \[ V^2(\Gamma)_\ell \simeq \chi_\Gamma \otimes \mathcal H\bigl(\bigl\{\{1/2\},\{1\}\bigr\}\bigr)_\ell\otimes \mathcal H(\operatorname{HD}(\Gamma))_\ell. \] Here \( \chi_\Gamma \) is a character equal to \( \chi_{-1} \) for \( \Gamma= (2,4,\infty) \), \( \chi_{3} \) for \( \Gamma = (2,4,6) \), and trivial for the other three groups.
To prove Theorem 6, we take the geometric viewpoint explained in Remark 2 to compare the automorphic sheaf and the hypergeometric sheaf occurring in (i) and (ii). Two general results are used. The first one is Katz’s rigidity theorem, which compares two complex sheaves.
- The local monodromies at 1 of both \( \mathcal F \) and \( \mathcal G \) are pseudoreflections;
- At both 0 and \( \infty \), \( \mathcal F \) and \( \mathcal G \) have the same characteristic polynomials of local monodromies.
Then \( \mathcal F \) and \( \mathcal G \) are isomorphic.
The table below shows that \( V^1(\Gamma)_{\mathbb C} \simeq \mathcal H(\operatorname{HD}(\Gamma))_{\mathbb C} \) for \( \Gamma \) of type (a): \begin{gather*} \begin{array}{|c|c|c|c|c|c|c|c|} \hline (e_0,e_1,e_\infty)&\text{Generators}& \text{Exponents}& {\operatorname{HD}(\Gamma)} & {\mathcal H(\operatorname{HD}(\Gamma))_{\mathbb C}}\\ \hline (\infty,\infty,3) & g_0=T:= \bigl({1 \atop 0}{1\atop 1}\bigr) & (1,1) & \alpha=\bigl\{\frac{1}{3}, \frac{2}{3}\bigr\} &\mathcal H\bigl(\bigl\{\frac1{3},\frac23\bigr\},\{1,1\}\bigr)_{\mathbb C}\\ \cong \Gamma_1(3) & g_1=S:= \bigl({-2 \atop -3}{3\atop4}\bigr) &(1,1)&\beta=\{1,1\}& \\ & g_\infty=(ST)^{-1} &\bigl(\frac13,\frac23\bigr)& &\\ \hline (\infty,\infty,\infty) & g_1=T:= \bigl({1\atop 0}{1\atop1}\bigr) & (1,1) &\alpha=\bigl\{\frac12, \frac12\bigr\} &\mathcal H\bigl(\bigl\{\frac1{2},\frac12\bigr\},\{1,1\}\bigr)_{\mathbb C}\\ \cong \Gamma_1(4) &g_0=(ST)^{-1} &(1,1)&\beta=\{1,1\} & \\ & g_\infty=S:= \bigl({1\atop4}{-1\atop-3}\bigr) &\bigl(\frac12,\frac12\bigr)& &\\ \hline \end{array} \end{gather*} A similar table for type (b) groups shows \( V^2(\Gamma)_{\mathbb C} \simeq \mathcal H(\{\{1/2\},\{1\}\})_{\mathbb C} \otimes\mathcal H(\operatorname{HD}(\Gamma))_{\mathbb C} \).
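The entries of the type (a) table can be sanity-checked with integer \( 2\times 2 \) matrices. The sketch below assumes that each stacked pair in the table is a column of the matrix, so that \( T \) has rows \( (1,1) \) and \( (0,1) \). It verifies the relation \( g_\infty g_1 g_0 = \operatorname{Id} \), that \( g_1 \) is a pseudoreflection, and that the traces match the listed exponents. The helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two 2x2 integer matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(A):
    """Inverse of a 2x2 integer matrix of determinant 1."""
    (a, b), (c, d) = A
    assert a * d - b * c == 1
    return [[d, -b], [-c, a]]


def trace(A):
    return A[0][0] + A[1][1]


def rank_of_minus_identity(A):
    """Rank of A - I, used to test the pseudoreflection condition."""
    n = [[A[0][0] - 1, A[0][1]], [A[1][0], A[1][1] - 1]]
    if n == [[0, 0], [0, 0]]:
        return 0
    return 1 if n[0][0] * n[1][1] - n[0][1] * n[1][0] == 0 else 2


IDENTITY = [[1, 0], [0, 1]]
T = [[1, 1], [0, 1]]
S3 = [[-2, 3], [-3, 4]]   # S in the row labelled Gamma_1(3)
S4 = [[1, -1], [4, -3]]   # S in the row labelled Gamma_1(4)

# (g_0, g_1, g_infty) as assigned in the two rows of the table
row_gamma1_3 = (T, S3, inverse(matmul(S3, T)))
row_gamma1_4 = (inverse(matmul(S4, T)), T, S4)
```

A trace of 2 corresponds to the characteristic polynomial \( (x-1)^2 \) (exponents \( (1,1) \)), a trace of \( -1 \) to \( x^2+x+1 \) (exponents \( (\frac13,\frac23) \)), and a trace of \( -2 \) to \( (x+1)^2 \) (exponents \( (\frac12,\frac12) \)).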
Once the two complex sheaves are shown to be isomorphic, the following comparison theorem says that their \( \ell \)-adic counterparts may differ at most by a twist by a character of \( \operatorname{Gal}(\overline{\mathbb{Q}}/\mathbb{Q}) \).
We then compare the two sheaves at a fiber to determine the twist character \( \chi_\Gamma \) for each \( \Gamma \).
It remains to compute the contributions from the three singular points on \( X_\Gamma \). These are computed separately, using the celebrated Néron–Ogg–Shafarevich criterion. At a cusp the contribution is equal to 1 for \( k \) even, as done in Scholl [e22], while for \( k \) odd it is \( \pm 1 \) or 0 depending on the type of reduction at \( p \) of the degenerate elliptic curve representing the cusp. All elliptic points are CM points. We first determine the CM structure at each elliptic point, then compute the contribution at \( p \) from each elliptic point, which mainly depends on \( k \), the order of the point and the behavior of \( p \) in the CM field, as shown in Theorems 2 and 3.
Wen-Ching Winnie Li received her Ph.D. from the University of California, Berkeley in 1974, supervised by Andrew P. Ogg. In 1979 she joined the Pennsylvania State University, where she has been Distinguished Professor of Mathematics since 2012.