Celebratio Mathematica

Antoni Zygmund

The development of square functions in the work of A. Zygmund

by Elias M. Stein

I’ve de­cided to write this es­say about “square func­tions” for two reas­ons. First, their de­vel­op­ment has been so in­ter­twined with the sci­entif­ic work of A. Zyg­mund that it seems highly ap­pro­pri­ate to do so now on the oc­ca­sion of his 80th birth­day. Also these func­tions are of fun­da­ment­al im­port­ance in ana­lys­is, stand­ing as they do at the cross­ing of three im­port­ant roads many of us have trav­elled by: com­plex func­tion the­ory, the Four­i­er trans­form (or or­tho­gon­al­ity in its vari­ous guises), and real-vari­able meth­ods. In fact, the more re­cent ap­plic­a­tions of these ideas, de­scribed at the end of this es­say, can be seen as con­firm­a­tion of the sig­ni­fic­ance Zyg­mund al­ways at­tached to square func­tions.

This is go­ing to be a partly his­tor­ic­al sur­vey, and so I hope you will al­low me to take the usu­al liber­ties as­so­ci­ated with this kind of en­ter­prise: I will break up the ex­pos­i­tion in­to cer­tain “his­tor­ic­al peri­ods”, five to be pre­cise; and by do­ing this I will be able to sug­gest my own views as to what might have been the key in­flu­ences and ideas that brought about these de­vel­op­ments.

One word of ex­plan­a­tion about “square func­tions” is called for. A deep concept in math­em­at­ics is usu­ally not an idea in its pure form, but rather takes vari­ous shapes de­pend­ing on the uses it is put to. The same is true of square func­tions. These ap­pear in a vari­ety of forms, and while in spir­it they are all the same, in ac­tu­al prac­tice they can be quite dif­fer­ent. Thus the meta­morph­os­is of square func­tions is all im­port­ant.

First period (1922–1926): The primordial square functions

It ap­pears that square func­tions arose first in an ex­pli­cit form in a beau­ti­ful the­or­em of Kaczmarz and Zyg­mund deal­ing with the al­most every­where sum­mab­il­ity of or­tho­gon­al ex­pan­sions. The the­or­em was proved in 1926 as the cul­min­a­tion of sev­er­al pa­pers each had writ­ten at about that time. The the­or­em it­self was an out­growth of what cer­tainly was one of the main pre­oc­cu­pa­tions of ana­lysts at that time, namely the ques­tion of con­ver­gence of Four­i­er series. The prob­lem was the fol­low­ing. Sup­pose \( f=f(\theta) \) is a con­tinu­ous func­tion on the circle, \( 0\leq\theta\leq 2\pi \), or more gen­er­ally as­sume that \( f \) is in \( L^{2}(0,2\pi) \) or even that \( f \) is merely in­teg­rable; then does its Four­i­er series \begin{equation} \sum a_{n}e^{in \theta} \quad\text{with}\quad a_{n}=\frac{1}{2\pi}\int_{0}^{2\pi}f(\theta)e^{-in \theta}\,d\theta, \label{eqnon} \end{equation} con­verge al­most every­where?

A re­lated par­al­lel is­sue was the cor­res­pond­ing ques­tion for a gen­er­al or­thonor­mal ex­pan­sion, but now lim­ited to \( f\in L^{2} \). Thus if \( \{\phi_{n}\} \) is an or­thonor­mal sys­tem, and if \[ f\sim\sum a_{n}\phi_{n} \quad\text{with}\quad a_{n}=\int f\overline{\phi}_{n} ,\] where \( \sum |a_{n}|^{2} < \infty \), then what could be said about the con­ver­gence al­most every­where of \begin{equation} \sum_{n=1}^{\infty}a_{n}\phi_{n}(x)? \label{eqntw} \end{equation}

The peri­od we are deal­ing with (1922–1926) was marked by sev­er­al strik­ing achieve­ments in this area, whose es­sen­tial in­terest is not di­min­ished even when viewed from the dis­tant per­spect­ive of more than a half cen­tury. The first res­ult to men­tion was the con­struc­tion by Kolmogorov in 1923 [e3] of an \( L^{1} \) func­tion whose Four­i­er series \eqref{eqnon} di­verged al­most every­where.1 This con­struc­tion made even more press­ing the ques­tion of wheth­er the Four­i­er series \eqref{eqnon} con­verges al­most every­where when (say) \( f \) be­longs to \( L^{2} \), a prob­lem that was not solved till more than forty years later. We shall turn to that in a mo­ment, but now we point out that Kolmogorov’s ex­ample put in­to sharp­er re­lief the \( L^{2} \) res­ults for gen­er­al or­thonor­mal de­vel­op­ments that had been ob­tained (in 1922 and 1923) by Rademach­er and Men­shov. They showed that if \begin{equation} \sum|a_{n}|^{2}(\log n)^{2} < \infty \label{eqnth} \end{equation} then the series \eqref{eqntw} con­verges a.e.

Moreover the condition \eqref{eqnth} is best possible in the sense that if \( \{\lambda_{n}\} \) is monotonic and \[ \smash{\lambda_{n}}/\log n\rightarrow 0 ,\] then there exists an orthonormal system \( \{\phi_{n}\} \) and an expansion \eqref{eqntw} which diverges a.e., while \[ \sum \smash{|a_{n}|^{2}\lambda_{n}^{2}} < \infty .\]

For or­din­ary Four­i­er series it was proved2 that the con­di­tion \eqref{eqnth} could be re­laxed and be re­placed by \begin{equation} \sum_{-\infty}^{\infty}|a_{n}|^{2}\log(|n|+2) < \infty. \label{eqnfo} \end{equation}

This last res­ult stood un­sur­passed for forty years un­til Car­leson in 1966 showed that in­deed the Four­i­er series of an \( L^{2} \) func­tion con­verged al­most every­where. It may be in­ter­est­ing to note here that the ba­sic tools re­quired for Car­leson’s the­or­em — the prop­er­ties of the Hil­bert trans­form and their re­la­tion with par­tial sums of Four­i­er series — were first brought to light in this early peri­od: Kolmogorov’s proof of the weak-type (1, 1) prop­erty in 1925; M. Riesz’s pa­per of 1927 [e12] con­tain­ing the \( L^{p} \) in­equal­it­ies for con­jug­ate func­tions and par­tial sums; and Be­sicov­itch’s work (in 1923 [e2] and 1926 [e6]) which began the de­vel­op­ment of “real-vari­able” meth­ods for Hil­bert trans­forms.

Against this background we can now state the idea of Kaczmarz and Zygmund. It asserts as a general principle that for an \( L^{2} \) orthonormal expansion (i.e., one where \( \sum |a_{n}|^{2} < \infty \)), at almost all points the summability of the series \( \sum a_{n}\phi_{n}(x) \) by one method has as a consequence the summability by any other method which is essentially stronger than convergence. A special (but typical) case is as follows:

The­or­em 1:

Suppose \( \sum|a_{n}|^{2} < \infty \). Then \( \sum a_{n}\phi_{n}(x) \) is Cesàro summable at almost every point \( x \) where it is Abel summable.

Recall that the series is Abel summable at \( x \) if \[ \lim_{r\rightarrow 1^{-}}\sum a_{n}r^{n}\phi_{n}(x) \quad\text{exists}. \] In addition, setting \begin{align*} s_{n} &=\sum_{k=0}^{n}a_{k}\phi_{k} \quad\text{and}\\ \sigma_{n} & =(s_{0}+s_{1}+\cdots+s_{n-1})/n , \end{align*} the Cesàro summability at \( x \) means the existence of the limit \[ \lim_{n\rightarrow\infty}\sigma_{n}(x) .\]

If a series is Cesàro sum­mable it is auto­mat­ic­ally Abel sum­mable (an ex­er­cise!), but the con­verse is in gen­er­al not true. To gain a bet­ter idea of the scope of The­or­em 1 let us point out that \[ \sigma_{n}(x) =\sum_{k=0}^{n}\Bigl(1-\frac kn\Bigr)a_{k}\phi_{k}(x) \] and a res­ult sim­il­ar to The­or­em 1 holds when \( \sigma_{n}(x) \) is re­placed by \[ \sigma_{n}^{\epsilon}(x) = \sum_{k=0}^{n}\Bigl(1-\frac kn\Bigr)^{\epsilon}a_{k}\phi_{k}(x) \quad\text{with }\epsilon > 0 \] (which cor­res­ponds es­sen­tially to \( (C, \epsilon) \) sum­mab­il­ity), but not for \( \epsilon=0 \) which of course would give the usu­al con­ver­gence.
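
The formula for \( \sigma_{n} \) can be checked by interchanging the order of summation in the definition. The following routine verification uses only the partial sums \( s_{j} \) defined above:
\begin{align*}
\sigma_{n} &=\frac{1}{n}\sum_{j=0}^{n-1}s_{j} =\frac{1}{n}\sum_{j=0}^{n-1}\sum_{k=0}^{j}a_{k}\phi_{k} \\
&=\frac{1}{n}\sum_{k=0}^{n-1}(n-k)\,a_{k}\phi_{k} =\sum_{k=0}^{n-1}\Bigl(1-\frac kn\Bigr)a_{k}\phi_{k},
\end{align*}
since each index \( k \) is counted once for every \( j \) with \( k\leq j\leq n-1 \), that is, \( n-k \) times. Extending the sum to \( k=n \) changes nothing, because the weight \( 1-k/n \) vanishes there.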

For the proof of Theorem 1 Kaczmarz and Zygmund used a square function which they introduced for this purpose, namely \begin{equation} K(f)=\biggl(\sum_{n=2}^{\infty}n|\sigma_{n}-\sigma_{n-1}|^{2}\biggr)^{\mkern-2mu{1/2}} \label{eqnfi} \end{equation} with \( f\sim\sum a_{n}\phi_{n} \). The basic fact was the following lemma, an \( L^{2} \) inequality:

\[ \|K(f)\|_{L^{2}} \leq C \|f\|_{L^{2}}. \]

Clearly \[ \sigma_{n}-\sigma_{n-1}=\frac{1}{n(n-1)}\sum_{k=0}^{n-1}ka_{k}\phi_{k},\] so \[ \|\sigma_{n}-\sigma_{n-1}\|_{2}^{2}\leq\frac{c}{n^{4}}\sum_{k < n}k^{2}|a_{k}|^{2}, \quad n\geq 2, \] and thus \[ \sum_{2}^{\infty}n\|\sigma_{n}-\sigma_{n-1}\|_{2}^{2}\leq c^{\prime}\sum|a_{k}|^{2}=c^{\prime}\|f\|_{2}^{2} ,\] which proves the lemma.
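
For readers who want the constants, here is one way to fill in these steps. The values \( c=4 \) and \( c^{\prime}=2 \) below are simply what this particular bookkeeping produces; they are not taken from the original paper. Comparing the weights \( 1-k/n \) and \( 1-k/(n-1) \) gives
\[ \sigma_{n}-\sigma_{n-1}=\sum_{k=0}^{n-1}\Bigl(\frac{k}{n-1}-\frac{k}{n}\Bigr)a_{k}\phi_{k} =\frac{1}{n(n-1)}\sum_{k=0}^{n-1}k\,a_{k}\phi_{k}, \]
so by orthonormality, and since \( n-1\geq n/2 \) for \( n\geq 2 \),
\[ \|\sigma_{n}-\sigma_{n-1}\|_{2}^{2} =\frac{1}{n^{2}(n-1)^{2}}\sum_{k < n}k^{2}|a_{k}|^{2} \leq\frac{4}{n^{4}}\sum_{k < n}k^{2}|a_{k}|^{2}. \]
Multiplying by \( n \), summing, and interchanging the order of summation,
\[ \sum_{n\geq 2}n\|\sigma_{n}-\sigma_{n-1}\|_{2}^{2} \leq 4\sum_{k\geq 1}k^{2}|a_{k}|^{2}\sum_{n > k}\frac{1}{n^{3}} \leq 4\sum_{k\geq 1}k^{2}|a_{k}|^{2}\int_{k}^{\infty}\frac{dx}{x^{3}} =2\sum_{k\geq 1}|a_{k}|^{2}. \]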

To prove the the­or­em one in­vokes a vari­ant of the clas­sic­al Tauberi­an ar­gu­ment, namely, if \( \smash{\sum A_{n}} \) is Abel sum­mable and \( \sum nA_{n}^{2} < \infty \), then \( \sum A_{n} \) con­verges. Now set \( A_{n}=\sigma_{n}-\sigma_{n-1} \); then the Abel sum­mab­il­ity of \( \sum A_{n} \) fol­lows from the cor­res­pond­ing Abel sum­mab­il­ity of \( \sum a_{n}\phi_{n} \). The Tauberi­an con­di­tion holds at al­most all points be­cause of the lemma, and hence one ob­tains a.e. the con­ver­gence of \( \sum(\sigma_{n}-\sigma_{n-1}) \), prov­ing the the­or­em.
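
The Tauberian step can be verified directly. What follows is a sketch of one standard route, not necessarily the argument of the original paper; the symbols \( S_{N} \), \( A(r) \) and the auxiliary index \( M \) are introduced here only for this purpose. Write \( S_{N}=\sum_{n\leq N}A_{n} \) and \( A(r)=\sum A_{n}r^{n} \), and take \( r=1-1/N \). Then
\[ |S_{N}-A(r)|\leq\sum_{n\leq N}|A_{n}|(1-r^{n})+\sum_{n > N}|A_{n}|r^{n}. \]
Since \( 1-r^{n}\leq n(1-r)=n/N \), the Cauchy–Schwarz inequality gives, for any fixed \( M < N \),
\[ \sum_{n\leq N}|A_{n}|(1-r^{n}) \leq\frac{1}{N}\sum_{n\leq M}n|A_{n}| +\frac{1}{N}\Bigl(\sum_{M < n\leq N}n\Bigr)^{1/2}\Bigl(\sum_{n > M}n|A_{n}|^{2}\Bigr)^{1/2} \leq\frac{1}{N}\sum_{n\leq M}n|A_{n}| +\Bigl(\sum_{n > M}n|A_{n}|^{2}\Bigr)^{1/2}, \]
while
\[ \sum_{n > N}|A_{n}|r^{n} \leq\Bigl(\sum_{n > N}n|A_{n}|^{2}\Bigr)^{1/2}\Bigl(\sum_{n > N}\frac{r^{2n}}{n}\Bigr)^{1/2} \leq\Bigl(\sum_{n > N}n|A_{n}|^{2}\Bigr)^{1/2}, \]
because \( \sum_{n > N}r^{2n}/n\leq 1/(N(1-r))=1 \). Letting \( N\rightarrow\infty \) and then \( M\rightarrow\infty \) shows that \( S_{N}-A(1-1/N)\rightarrow 0 \), so \( S_{N} \) converges to the Abel sum.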

We have seen the first ex­ample of a square func­tion, namely \eqref{eqnfi}. While here it plays a minor role, its ba­sic char­ac­ter is already re­vealed: Be­cause of the agil­ity of its quad­rat­ic nature it can ex­ploit eas­ily any situ­ation in which or­tho­gon­al­ity might be im­port­ant.

Second period (1931–1938): Littlewood and Paley

Our scene shifts now from the Con­tin­ent to Eng­land, and to the work of Lit­tle­wood and Pa­ley. Our at­ten­tion will be fo­cused on two im­port­ant series of con­nec­ted pa­pers: three jointly by Lit­tle­wood and Pa­ley 1931–1938 [e14], [e17], [e19], and two by Pa­ley 1932 [e15], [e16]. The in­vest­ig­a­tions de­scribed in these pa­pers were ini­ti­ated sim­ul­tan­eously (the first pa­per in each series was sub­mit­ted in April 1931), but be­cause of Pa­ley’s death in 1933 the fi­nal ver­sions of sev­er­al of the pa­pers were prob­ably Lit­tle­wood’s work alone. It is also in­ter­est­ing to note that no ref­er­ence is made in these pa­pers to the res­ults de­scribed above, and so it is a reas­on­able guess that they were not aware of the pos­sible rel­ev­ance of the ideas of Kaczmarz and Zyg­mund.

The main theme of the Lit­tle­wood–Pa­ley work was to con­sider the “dy­ad­ic de­com­pos­i­tion” of Four­i­er series, namely \[ f(\theta)=\sum_{k=0}^{\infty}\Delta_{k}(\theta), \] with \begin{align*} & \Delta_{k}(\theta) =\sum_{2^{k-1}\leq|n| < 2^{k}}a_{n}e^{in \theta}, \quad k\geq1;\\ & \Delta_{0} =a_{0}. \end{align*}

Their basic result was that the \( L^{p} \) norm of a function is equivalent to the \( L^{p} \) norm of the square function associated with its dyadic decomposition.

The­or­em 2:

For \( 1 < p < \infty \), \[ \biggl\Vert\Bigl(\sum_{k=0}^{\infty}|\Delta_{k}(\theta)|^{2}\Bigr)^{1/2}\biggr\Vert_{p}\simeq\|f\|_{p}. \]
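
For \( p=2 \) the equivalence is in fact an identity, which shows what the theorem extends. In this short check the normalization \( \|h\|_{2}^{2}=\int_{0}^{2\pi}|h|^{2}\,d\theta \) is assumed for definiteness, and the block \( k=0 \) is understood as \( \{n=0\} \). The dyadic blocks involve disjoint sets of frequencies, so by Parseval's formula
\[ \int_{0}^{2\pi}\sum_{k\geq 0}|\Delta_{k}(\theta)|^{2}\,d\theta =2\pi\sum_{k\geq 0}\ \sum_{2^{k-1}\leq|n| < 2^{k}}|a_{n}|^{2} =2\pi\sum_{n}|a_{n}|^{2} =\int_{0}^{2\pi}|f(\theta)|^{2}\,d\theta. \]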

To prove this theorem they needed and thus formulated an “abelian” analogue, where partial sums are replaced by Abel means, i.e., by the Poisson integral \( u(r, \theta) \) of \( f \). Thus given \( f \), let \( \Phi \) be the holomorphic function in the unit disc with \( \operatorname{Re}(\Phi)=u \), and \( \operatorname{Im}(\Phi(0))=0 \). They defined another square function, the “\( g \)-function” of \( f \), by \[ g(f)(\theta)=\biggl(\int_{0}^{1}(1-r)\bigl|\Phi^{\prime}(re^{i\theta})\bigr|^{2}\,dr\biggr)^{1/2} \] and proved the following

The­or­em 3:

With \( 1 < p < \infty \) \begin{equation} \|g(f)\|_{p}\simeq\|f\|_{p} \quad\textit{if } a_{0}=0. \label{eqnsi} \end{equation}
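
The case \( p=2 \) can be verified by a direct computation. This is a sketch, with the coefficients \( c_{n} \) introduced here as notation: write \( \Phi(z)=\sum_{n\geq 1}c_{n}z^{n} \), which is permitted since \( a_{0}=0 \) gives \( \Phi(0)=0 \). Then
\begin{align*}
\int_{0}^{2\pi}g(f)(\theta)^{2}\,d\theta &=\int_{0}^{1}(1-r)\int_{0}^{2\pi}\bigl|\Phi^{\prime}(re^{i\theta})\bigr|^{2}\,d\theta\,dr \\
&=2\pi\sum_{n\geq 1}n^{2}|c_{n}|^{2}\int_{0}^{1}(1-r)r^{2n-2}\,dr =2\pi\sum_{n\geq 1}\frac{n}{2(2n-1)}|c_{n}|^{2}.
\end{align*}
The factor \( n/(2(2n-1)) \) lies between \( 1/4 \) and \( 1/2 \), so \( \|g(f)\|_{2}^{2}\simeq\sum|c_{n}|^{2} \), and for real-valued \( f \) the latter is a constant multiple of \( \|f\|_{2}^{2} \).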

Pa­ley sought a bet­ter un­der­stand­ing of the nature of these prob­lems by con­sid­er­ing vari­ants of The­or­em 2 where the Four­i­er series ex­pan­sion is re­placed by the Walsh–Pa­ley ex­pan­sion. The Walsh–Pa­ley func­tions (called Walsh–Kaczmarz func­tions at that time) are now usu­ally de­scribed as fol­lows. We identi­fy the in­ter­val \( [0,1] \) with the com­pact group con­sist­ing of an in­fin­ite product of cop­ies of the two-ele­ment group (via the usu­al bin­ary ex­pan­sion). The char­ac­ters of that group are the Walsh–Pa­ley func­tions. Writ­ing each in­teger as a sum of powers of 2 gives a nat­ur­al enu­mer­a­tion of the char­ac­ters \( \{\phi_{n}\}_{n=0}^{\infty} \). If we set \begin{align*} & f\sim\sum a_{n}\phi_{n} \quad\text{and}\\ & \Delta_{k}=s_{2^{k}}-s_{2^{k-1}}=\sum_{2^{k-1} < n\leq 2^{k}}a_{n}\phi_{n} \quad\text{with}\\ & \Delta_{0}=a_{0}, \end{align*} then Pa­ley’s the­or­em reads as

The­or­em 4:

For the Walsh–Pa­ley series, with \( 1 < p < \infty \) \[ \biggl\Vert\Bigl(\sum|\Delta_{k}|^{2}\Bigr)^{1/2}\biggr\Vert_{p}\simeq\|f\|_{p}. \]

What makes the proof of Theorem 4 easier than that of Theorem 2 are the various simplifications inherent in the fact that \( \{s_{2^{k}}(f)\} \) is a martingale sequence. The name “martingale” had not yet been coined. Moreover, a systematic extension of Theorem 4 from the point of view of martingales, and its further exploration in the magical world of Brownian motion: all these came much later, as we shall see. However in Paley’s time some of the arguments typical of martingale theory were already understood. Thus it had been observed that \( s_{2^{k}}(f) \) was constant on each of the \( 2^{k} \) intervals (of length \( 2^{-k} \)) of the form \[ \bigl((l-1)/2^{k}, \ l/2^{k}\bigr) , \qquad l=1,\ldots,2^{k}, \] and that the value of \( s_{2^{k}}(f) \) on each of these intervals was the mean-value of \( f \) there. From this it is obvious that when \( f\in L^{p} \), \( 1\leq p\leq\infty \), the functions \( \{s_{2^{k}}(f)\} \) are bounded in \( L^{p} \) norm; the analogue for Fourier series is definitely nonobvious when \( 1 < p < \infty \), and in fact false when \( p=1 \) or \( p=\infty \).

We shall now de­scribe the main device Pa­ley used in his proof of The­or­em 4. Pa­ley was, from what one can learn about his life, a man of cour­age and al­most reck­less dar­ing. A hint of that spir­it can be found in his ap­proach to dif­fi­cult math­em­at­ic­al prob­lems. When faced by the proof of an in­equal­ity like \begin{equation} \int\Bigl(\sum|\Delta_{k}|^{2}\Bigr)^{p/2}\,dx\leq A_{p}^{p}\int|f|^{p}\,dx \label{eqnse} \end{equation} where \( p \) is e.g. an even in­teger \( 2r \), he in­stinct­ively sought to face the prob­lem head-on by mul­tiply­ing out the \( r \) in­fin­ite sums, and then com­ing to grips dir­ectly with the res­ult­ing mul­ti­tude of terms. This kind of au­da­cious at­tack is not so com­mon in our time when it is easi­er to rely on a vari­ety of soph­ist­ic­ated gad­gets which are house­hold items for the work­ing ana­lyst. But giv­en Pa­ley’s re­source­ful­ness this ap­proach worked mar­velously well. His key ob­ser­va­tion was that \begin{equation} \sum_{i_{r}}\int\Delta_{i_{1}}^{2}\Delta_{i_{2}}^{2}\cdots\Delta_{i_{r}}^{2}\,dx \leq \int\Delta_{i_{1}}^{2}\cdots\Delta_{i_{r-1}}^{2}f^{2}\,dx \label{eqnei} \end{equation} where the sum­ma­tion is taken over those \( i_{r} \) for which \( i_{r} > \max(i_{1},\ldots \), \( i_{r-1}) \), which in turn fol­lows from the mar­tin­gale prop­erty that \begin{equation} \int g(x)\Delta_{k}(x)\,dx=0 \label{eqnni} \end{equation} whenev­er \( g \) is “meas­ur­able with re­spect to the past”. From \eqref{eqnei} Pa­ley was able to achieve the proof of \eqref{eqnse} in a few strokes.
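
One way to see how \eqref{eqnei} follows from \eqref{eqnni} is sketched below. The sketch assumes \( f \) is real-valued and ignores questions of convergence; these simplifications, and the symbols \( m \) and \( G \), are introduced here and are not part of Paley's text. Set \( m=\max(i_{1},\ldots,i_{r-1}) \) and \( G=\Delta_{i_{1}}^{2}\cdots\Delta_{i_{r-1}}^{2} \), which is measurable with respect to the past at level \( m \). Since \( f-s_{2^{m}}=\sum_{i > m}\Delta_{i} \), and the cross terms \( \int G\Delta_{i}\Delta_{j}\,dx \) with \( m < i < j \) vanish by \eqref{eqnni} (the function \( G\Delta_{i} \) being measurable with respect to the past of \( \Delta_{j} \)),
\[ \sum_{i > m}\int G\Delta_{i}^{2}\,dx =\int G\,(f-s_{2^{m}})^{2}\,dx =\int Gf^{2}\,dx-\int Gs_{2^{m}}^{2}\,dx \leq\int Gf^{2}\,dx, \]
where the middle equality uses \eqref{eqnni} once more, in the form \( \int Gs_{2^{m}}(f-s_{2^{m}})\,dx=0 \).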

The same idea inspired Littlewood and Paley’s proof of Theorem 3, although the execution is more complicated; a more recondite form of \eqref{eqnei} must be proved, and here nothing as simple as \eqref{eqnni} holds. The appropriate substitute must be fashioned with care out of Green’s theorem in conjunction with the identity \[ \Delta(|\Phi|^{2})=4|\Phi^{\prime}|^{2} .\] With Theorem 3 proved, Littlewood and Paley were able to deduce Theorem 2, but here also the steps required were not easy. It was only after their theory was reexamined by Zygmund and his student Marcinkiewicz, that a clearer and broader view of the whole subject began to emerge. To this we shall now turn.

Third period (1938–1945): Marcinkiewicz and Zygmund

There are two sig­ni­fic­ant events that marked the peri­od we are now con­cerned with. The first, which even pred­ated the Lit­tle­wood–Pa­ley col­lab­or­a­tion, was the in­tro­duc­tion by Lus­in in 1930 [e13] of his “area in­teg­ral”. The idea of Lus­in seems to have sparked no fur­ther in­terest un­til Mar­cinkiewicz and Zyg­mund took up the sub­ject again about 8 years later. There began a brief but very cre­at­ive peri­od of work by them — a flower­ing of the the­ory where con­nec­tions with a vari­ety of oth­er ideas were brought to light. The second event, a tra­gic one, fol­lowed soon there­after with the death of Mar­cinkiewicz in 1940, and it was left to Zyg­mund alone to re­solve some of the is­sues that their work had led them to.

It may help to cla­ri­fy the de­scrip­tion of the prin­cip­al ideas that Mar­cinkiewicz and Zyg­mund con­trib­uted to the study of square func­tions if we or­gan­ize our present­a­tion in terms of the four main lines along which their work pro­ceeded.

The first subject we shall treat (and the only one that was, strictly speaking, joint work) deals with the area integral of Lusin. The definition of this is as follows. Suppose \( \Phi(z) \) is holomorphic in the unit disc and define \( A(\Phi)(\theta) \) by \begin{equation} (A(\Phi)(\theta))^{2}=\int_{\Gamma(\theta)}|\Phi^{\prime}(z)|^{2}\,dx\,dy \label{eqnonze} \end{equation} with \( \Gamma(\theta) \) a standard “triangle” (nontangential approach region) in the unit disc with vertex at \( e^{i\theta} \). Observe that the expression represents the area of the image of \( \Gamma(\theta) \) under the mapping \( z\rightarrow\Phi(z) \), with points counted according to their multiplicity. Lusin’s discovery was that if \( \Phi \) is bounded, then \( A(\Phi)(\theta) \) is finite for almost every \( \theta \); more generally that \begin{equation} \|A(\Phi)(\theta)\|_{2}\simeq\|\Phi\|_{2} \quad\text{if } \Phi(0)=0. \label{eqnonon} \end{equation}

Marcinkiewicz and Zygmund realized that on the one hand there was a close analogy between the Littlewood–Paley \( g \)-function and \( A(\Phi) \) (in fact \( A \) is a pointwise majorant of \( g \), and the same kind of \( L^{p} \) inequalities held for \( A \) as for \( g \)); but on the other hand they surmised that the parallel between these two square functions should not be pushed too far. The main result they obtained for \( A \) was a localized version of Lusin’s result. This can be stated as follows. Let \[ \Phi^{*}(\theta)=\sup_{z\in\Gamma(\theta)}|\Phi(z)| .\]

The­or­em 5a:

If \( \Phi \) is holo­morph­ic in the unit disc, then for al­most every \( \theta \), \( \Phi^{*}(\theta) < \infty \) im­plies \( A(\Phi)(\theta) < \infty \).

The con­verse was proved five years later by Spen­cer,3 namely

The­or­em 5b:

If \( \Phi \) is holo­morph­ic in the unit disc, then for al­most every \( \theta \), \( A(\Phi)(\theta) < \infty \) im­plies \( \Phi^{*}(\theta) < \infty \).

A cor­res­pond­ing con­verse for \( g \)-func­tions is false, and so the area in­teg­ral \( A \) has some spe­cial af­fin­it­ies with the bound­ary be­ha­vi­or of \( \Phi \), go­ing bey­ond what it shares with \( g \).

The second line of in­vest­ig­a­tion was Zyg­mund’s reex­am­in­a­tion of the Lit­tle­wood–Pa­ley the­or­em for the dy­ad­ic de­com­pos­i­tion of Four­i­er series. His ana­lys­is led him to re­cast and sim­pli­fy the ideas of the proof. These sim­pli­fic­a­tions had im­port­ant con­sequences for later work, as we shall see; but their im­me­di­ate in­terest was that it al­lowed him to con­nect the square func­tion \( \bigl(\sum|\Delta_{k}|^{2}\bigr)^{1/2} \) with the one he and Kaczmarz had con­sidered a dozen years earli­er in their study of sum­mab­il­ity of or­tho­gon­al series (see \eqref{eqnfi}). We sup­pose that we take the Four­i­er ex­pan­sion and set \[ f(\theta)\sim\sum_{n\geq 0}a_{n}e^{in \theta} ,\] \( f\in L^{p} \), so that \( f\in H^{p} \). If we write as be­fore \[ K(f)(\theta)=\biggl(\sum_{n\geq 1}n \bigl|\sigma_{n}(\theta)-\sigma_{n-1}(\theta)\bigr|^{2}\biggr)^{1/2} \] where \[ \sigma_{n}(\theta)=\sum_{0\leq k < n}\Bigl(1-\frac kn\Bigr)a_{k}e^{ik\theta} ,\] then we can state the fol­low­ing the­or­em:

The­or­em 6:

\( \|K(f)\|_{p}\leq A_{p}\|f\|_{p},\ 1 < p < \infty \).4

The proof of this the­or­em re­quired two steps. First, like that of The­or­em 2, one needed the \( L^{p} \) in­equal­it­ies for the \( g \)-func­tion (see \eqref{eqnsi}). Here the ma­jor sim­pli­fic­a­tion was made by Zyg­mund some years later5 and it came in the proof of the fact that \[ \|g(f)\|_{p}\leq A_{p}\|f\|_{p} ,\] when \( p > 2 \). (The case \( p=2 \) was easy, and the range \( p < 2 \) was re­du­cible to \( p=2 \) by the ar­ti­fice stand­ard in those days of us­ing Blasch­ke product de­com­pos­i­tions for \( H^{p} \) func­tions.) For the dif­fi­cult case \( p > 2 \) a “square du­al­ity” was used. An in­geni­ous ar­gu­ment shows that whenev­er \( \phi \geq 0 \), \begin{equation} \int g(f)^{2}\phi\,d\theta \leq c \biggl\{\int g(f)g(\phi)M(f)\,d\theta + \int|f|^{2}\phi\,d\theta\biggr\} \label{eqnontw} \end{equation} where \( M \) is the Hardy–Lit­tle­wood max­im­al func­tion. For \( p\geq 4 \), \eqref{eqnontw} then gives the de­sired res­ult as a con­sequence of the case \( p\leq 2 \) ap­plied to \( g(\phi) \). In­cid­ent­ally, the no­tion of square du­al­ity which seems to have ori­gin­ated in this con­text con­tin­ues to find oth­er ap­plic­a­tions of in­terest.
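
To see why the range \( p\geq 4 \) comes out, here is how \eqref{eqnontw} can be combined with duality. This is a sketch: it assumes that \( \|g(f)\|_{p} \) is finite a priori, it uses the \( L^{p} \) boundedness of \( M \), and the exponent \( q \) is notation introduced here. Let \( q=p/(p-2) \) be the exponent conjugate to \( p/2 \); then \( 1 < q\leq 2 \) exactly when \( p\geq 4 \). For \( \phi\geq 0 \) with \( \|\phi\|_{q}\leq 1 \), Hölder's inequality with exponents \( p,q,p \) (note \( 1/p+1/q+1/p=1 \)) applied to \eqref{eqnontw} gives
\[ \int g(f)^{2}\phi\,d\theta \leq c\bigl\{\|g(f)\|_{p}\,\|g(\phi)\|_{q}\,\|M(f)\|_{p}+\|f\|_{p}^{2}\bigr\} \leq c^{\prime}\bigl\{\|g(f)\|_{p}\,\|f\|_{p}+\|f\|_{p}^{2}\bigr\}, \]
using \( \|g(\phi)\|_{q}\leq A_{q}\|\phi\|_{q} \) (the case of exponents at most 2) and \( \|M(f)\|_{p}\leq C\|f\|_{p} \). Taking the supremum over such \( \phi \) turns the left side into \( \|g(f)\|_{p}^{2} \), so the ratio \( X=\|g(f)\|_{p}/\|f\|_{p} \) satisfies \( X^{2}\leq c^{\prime}(X+1) \), which forces \( X \) to be bounded by a constant depending only on \( p \).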

The second sim­pli­fic­a­tion Zyg­mund made was in the man­ner in which one could re­duce the \( L^{p} \) con­trol of \( \bigl(\sum|\Delta_{k}|^{2}\bigr)^{1/2} \) to that of the \( g \)-func­tion; and in fact a whole list of oth­er square func­tions (in par­tic­u­lar, \( \bigl(\sum_n |\sigma_{n}-\sigma_{n-1}|^{2}\bigr)^{1/2} \)) could be handled in the same way.6 This stream­lin­ing of the proof he found can be said to have led dir­ectly to the “Mar­cinkiewicz mul­ti­pli­er the­or­em”.

In its one-di­men­sion­al form the cel­eb­rated the­or­em that bears Mar­cinkiewicz’s name can be stated as fol­lows. Sup­pose we con­sider a trans­form­a­tion \( T \) giv­en by a mul­ti­pli­er se­quence \( \{\lambda_{n}\}_{-\infty}^{\infty} \), defined by \[ Tf\sim\sum\lambda_{n}a_{n}e^{in \theta} \quad\text{ whenever } f\sim\sum a_{n}e^{in \theta}. \] Then \( T \) is bounded on \( L^{p},\ 1 < p < \infty \), if (i) the se­quence \( \{\lambda_{n}\} \) is bounded, and (ii) if it var­ies boundedly over each dy­ad­ic block; more pre­cisely, \[ \sum_{2^{k}\leq|j| < 2^{k+1}} |\lambda_{j}-\lambda_{j-1}|\leq M .\] (Note that the spe­cial case when the se­quence is con­stant on each dy­ad­ic block is an im­me­di­ate con­sequence of The­or­em 2.) In one di­men­sion the the­or­em’s greatest mer­it is, I be­lieve, in its for­mu­la­tion rather than its proof; the lat­ter is much the same as that of The­or­em 6.
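
The special case mentioned in the parenthetical remark can be made explicit. Suppose \( \lambda_{n}=\gamma_{k} \) for all \( n \) in the \( k \)-th dyadic block of Theorem 2 (the symbols \( \gamma_{k} \) are introduced only for this remark). Then the dyadic blocks of \( Tf \) are \( \gamma_{k}\Delta_{k} \), and two applications of Theorem 2 give
\[ \|Tf\|_{p} \simeq\biggl\Vert\Bigl(\sum_{k}|\gamma_{k}\Delta_{k}|^{2}\Bigr)^{1/2}\biggr\Vert_{p} \leq\sup_{k}|\gamma_{k}|\,\biggl\Vert\Bigl(\sum_{k}|\Delta_{k}|^{2}\Bigr)^{1/2}\biggr\Vert_{p} \simeq\sup_{k}|\gamma_{k}|\,\|f\|_{p}. \]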

It is in the pas­sage to high­er di­men­sions, however, that one finds the great sig­ni­fic­ance of Mar­cinkiewicz’s work on mul­ti­pli­ers. Its im­port­ance was not only the fact that one could use hitherto one-di­men­sion­al meth­ods to prove \( n \)-di­men­sion­al res­ults; even more pro­found were the ap­plic­a­tions to oth­er ques­tions, such as es­tim­ates for par­tial dif­fer­en­tial equa­tions, already en­vis­aged at that time. We can now see in ret­ro­spect that Mar­cinkiewicz thus an­ti­cip­ated some of the ba­sic in­equal­it­ies later proved by the the­ory of sin­gu­lar in­teg­rals.7 For sim­pli­city of nota­tion we shall state the Mar­cinkiewicz mul­ti­pli­er the­or­em in the case of two di­men­sions. Con­sider the mul­ti­pli­er op­er­at­or \( T \) giv­en by \[ Tf\sim\sum\lambda_{nm}a_{nm}e^{i(n\theta+m\phi)} \quad\text{for } f\sim\sum a_{nm}e^{i(n\theta+m\phi)} .\] Let \( I_{k} \) de­note the dy­ad­ic in­ter­val \[ \{n\mid 2^{k-1}\leq|n| < 2^{k}\} \quad\text{and}\quad J_{l}=\{m\mid 2^{l-1}\leq|m| < 2^{l}\} .\] Write \begin{align*} & \Delta_{1}\lambda_{n,m}=\lambda_{n+1,m}-\lambda_{n,m},\\ & \Delta_{2}\lambda_{n,m}=\lambda_{n,m+1}-\lambda_{n,m}, \text{ and}\\ & \Delta_{1,2}=\Delta_{1}\cdot\Delta_{2}. \end{align*} Now as­sume the fi­nite­ness of the fol­low­ing four quant­it­ies:

  1. \( \sup_{n,m}|\lambda_{n,m}| \);

  2. \( \sup_{k,m}\sum_{n\in I_{k}}|\Delta_{1}\lambda_{n,m}| \), and \( \sup_{n,l}\sum_{m\in J_{l}}|\Delta_{2}\lambda_{n,m}| \); and

  3. \( \sup_{k,l}\sum_{n\in I_{k}}\sum_{m\in J_{l}}|\Delta_{1}\Delta_{2}\lambda_{n,m}| \).

The­or­em 7:

Un­der the as­sump­tion made above, \( T \) is bounded on \( L^{p},\ 1 < p < \infty \).

The last of the four ma­jor lines of in­vest­ig­a­tion con­cern­ing square func­tions that Mar­cinkiewicz and Zyg­mund un­der­took dealt with the at­tempt to find a com­pletely “real-vari­able” ana­logue of the func­tions of Lus­in and Lit­tle­wood–Pa­ley. Start­ing with a func­tion \( f \) on the circle, the area in­teg­ral and \( g \)-func­tions are defined in terms of holo­morph­ic (or har­mon­ic) func­tions whose bound­ary val­ues are re­lated to \( f \). Also the dy­ad­ic square func­tion of The­or­em 2 re­quires the Four­i­er ex­pan­sion of \( f \). What was de­sired was a vari­ant that could be defined more dir­ectly in terms of the ba­sic real-vari­able op­er­a­tions such as in­teg­ra­tion, dif­fer­en­ti­ation, etc.

After some ex­per­i­ment­a­tion Mar­cinkiewicz hit upon the idea of con­sid­er­ing \begin{equation} \mu(F)(x)=\biggl(\int_{0}^{\pi}\bigl|F(x+t)+F(x-t)-2F(x)\bigr|^{2} \frac{dt}{t^{3}}\biggr)^{1/2} \label{eqnonth} \end{equation} with \[ F(x)=\int^{x}f(t)\,dt. \]

It was not dif­fi­cult to see that \[ \|\mu(F)\|_{L^{2}}\simeq\|f\|_{L^{2}} \quad\text{ if }\,\int_{0}^{2\pi}f(x)\,dx =0. \] With this, and us­ing the real-vari­able tools he had already de­veloped, he was able to prove the ana­logue of the the­or­em he and Zyg­mund had found for the area in­teg­ral (The­or­em 5a). The res­ult was as fol­lows.

The­or­em 8a:

Sup­pose \( F\in L^{2} \). If \( F^{\prime}(x) \) ex­ists in a set \( E \), then \( \mu(F)(x) < \infty \) for al­most every \( x \in E \).
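
The \( L^{2} \) equivalence stated just before Theorem 8a can be checked on the Fourier transform side. The following is a sketch, assuming \( f\sim\sum_{n\neq 0}a_{n}e^{inx} \) and hence \( F\sim\sum_{n\neq 0}(a_{n}/in)e^{inx} \) up to an additive constant, which the second difference annihilates. One has
\[ F(x+t)+F(x-t)-2F(x) \sim\sum_{n\neq 0}\frac{a_{n}}{in}(2\cos nt-2)e^{inx} =-4\sum_{n\neq 0}\frac{a_{n}}{in}\sin^{2}\Bigl(\frac{nt}{2}\Bigr)e^{inx}, \]
so by Parseval's formula and the substitution \( s=|n|t \),
\[ \int_{0}^{2\pi}\mu(F)(x)^{2}\,dx =32\pi\sum_{n\neq 0}\frac{|a_{n}|^{2}}{n^{2}}\int_{0}^{\pi}\sin^{4}\Bigl(\frac{nt}{2}\Bigr)\frac{dt}{t^{3}} =32\pi\sum_{n\neq 0}|a_{n}|^{2}\int_{0}^{|n|\pi}\sin^{4}\Bigl(\frac{s}{2}\Bigr)\frac{ds}{s^{3}}. \]
The last integral lies between its values for the upper limits \( \pi \) and \( \infty \), both of which are finite and positive, and therefore \( \|\mu(F)\|_{2}^{2}\simeq\sum_{n\neq 0}|a_{n}|^{2}\simeq\|f\|_{2}^{2} \).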

The ques­tions that arose were first, wheth­er some of the oth­er prop­er­ties of the area in­teg­ral or \( g \)-func­tion held as well for \( \mu \); and, more in­ter­est­ingly, what was the real sig­ni­fic­ance of the Mar­cinkiewicz func­tion. Zyg­mund found an an­swer to the first ques­tion in 1944 [5] when he proved

The­or­em 8b:

For \( 1 < p < \infty \), \[ \|\mu(F)\|_{L^{p}}\simeq\|f\|_{L^{p}} \quad\text{ if }\, \int_{0}^{2\pi}f(x)\,dx=0. \]

The argument he developed to show this was not an easy one. He was required to invoke the most arcane of the square functions, the function \( g^{*} \), which Littlewood and Paley had also studied. He established the \( L^{p} \) inequalities for it and showed that it actually was a majorant of the Marcinkiewicz function. Incidentally \( g^{*} \) is defined by \[ (g^{*}(\Phi)(\theta))^{2}=\int_{0}^{1}\int_{0}^{2\pi}\bigl|\Phi^{\prime}(r e^{i(\theta+\phi)})\bigr|^{2}\bigg|\frac{1-r}{1-r e^{i\phi}}\bigg|^{2}\,d\phi\, dr, \] and so it also majorizes the area integral \eqref{eqnonze}, but it takes into account “the tangential” approach to the boundary.8 The problem that remained was to discover whether there was a converse to the local result given by Theorem 8a, or to put the question more broadly, to find the meaning of the Marcinkiewicz function. It was to be almost twenty more years before an answer to that question would be found.

Fourth period (1950–1964): Zygmund and his students

Start­ing about 1950 a new dir­ec­tion of con­sid­er­able im­port­ance began to emerge in force. Hin­ted at in earli­er work (of Be­sicov­itch and Mar­cinkiewicz, among oth­ers), its thrust was the de­vel­op­ment of “real-vari­able” meth­ods to re­place com­plex func­tion the­ory — that favored ally of one-di­men­sion­al Four­i­er ana­lys­is. What made this new em­phas­is par­tic­u­larly timely, in fact in­dis­pens­able, was that only with tech­niques com­ing from real-vari­able the­ory could one hope to come to grips with many in­ter­est­ing \( n \)-di­men­sion­al ana­logues of the one-di­men­sion­al the­ory.

The math­em­atician an­im­at­ing this de­vel­op­ment was Ant­oni Zyg­mund. In many ways he set the broad out­lines of the ef­fort, he mastered by his work some of the cru­cial dif­fi­culties, and was throughout the source of in­spir­a­tion for his stu­dents and col­lab­or­at­ors.

a: The area integral

A pioneering result in this new direction was Calderón’s extension to \( \mathbf{R}^{n} \) of the theorem of Marcinkiewicz and Zygmund concerning the area integral, a subject he had taken up at the suggestion of Zygmund. The setting for this is as follows. We let \[ \mathbf{R}_{+}^{n+1}=\{(x, y), x=(x_{1},\ldots,x_{n})\in \mathbf{R}^{n}, y\in \mathbf{R}^{+}\} \] be the upper half-space, and suppose that \( u(x, y) \) is harmonic (with respect to the \( n+1 \) variables \( x_{1},\ldots, \) \( x_{n}, y \)). Sometimes we shall assume that \( u \) is in fact the Poisson integral of an appropriate function \( f \) defined on \( \mathbf{R}^{n} \), and then we shall write \( u=\operatorname{PI}(f) \). We let \( \Gamma=\{(x, y), \) \( |x| < y\} \) be a standard cone with vertex at the origin, \( \Gamma^{\prime} \) its truncated version, \( \Gamma^{\prime}=\Gamma\cap\{y < 1\} \). For any \( \bar{x}\in \mathbf{R}^{n} \), \( \Gamma(\bar{x}) \) and \( \Gamma^{\prime}(\bar{x}) \) will be the corresponding cones with vertices at \( \bar{x} \). The area integral of \( u \) is defined by \begin{equation} (A(u)(\bar{x}))^{2}=\int_{\Gamma(\bar{x})}|\nabla u|^{2}y^{1-n}\,dx\, dy \label{eqnonfo} \end{equation} where \( |\nabla u|^{2}=|\partial u/\partial y|^{2}+\sum_{j=1}^{n}|\partial u/\partial x_{j}|^{2} \).

Sim­il­arly for the loc­al the­ory one needs the ana­logue of \eqref{eqnonfo} where \( \Gamma(\bar{x}) \) is re­placed by \( \Gamma^{\prime}(\bar{x}) \); this defines \( A_{\mathrm{loc}}(u)(\bar{x}) \). The max­im­al func­tion \( u^{*} \) is defined by \[ u^{*}(\bar{x})=\sup_{(x,y)\in\Gamma(\bar{x})}|u(x, y)| ,\] and its loc­al ana­logue \( u_{\mathrm{loc}}^{*} \) is giv­en by re­pla­cing \( \Gamma(\bar{x}) \) by \( \Gamma^{\prime}(\bar{x}) \) in the defin­i­tion.

The­or­em 9a:

Suppose \( u \) is harmonic in \( \mathbf{R}_{+}^{n+1} \). Then \( A_{\mathrm{loc}}(u)(\bar{x}) < \infty \) at almost every point \( \bar{x}\in \mathbf{R}^{n} \) where \( u_{\mathrm{loc}}^{*}(\bar{x}) < \infty \).

Calderón’s proof of this the­or­em was pub­lished at the same time (1950) as an­oth­er im­port­ant res­ult he found, namely the ex­ten­sion of Privalov’s the­or­em: \( u \) has a nontan­gen­tial lim­it at al­most every \( \bar{x}\in \mathbf{R}^{n} \), where \( u_{\mathrm{loc}}^{*}(\bar{x}) < \infty \). We shall dis­cuss the ideas be­hind the proof of The­or­em 9a later when we take up its con­verse. Now we turn to the “glob­al” ver­sion, i.e., the high­er-di­men­sion­al ana­logue of the Lit­tle­wood–Pa­ley the­or­em (The­or­em 3).

The­or­em 9b:

Sup­pose \( u= \operatorname{PI} (f) \), then \[ \|A(u)\|_{L^{p}}\simeq\|f\|_{L^{p}},\quad 1 < p < \infty. \]

It would be difficult after 25 years to recall the precise thoughts that motivated the proof of Theorem 9b, nor would it be easy now for one to appreciate the difficulties that seemed then to stand in the way. But I do remember that those of us who were graduate students of Zygmund in the middle 1950’s were shaped by the event, akin to the Creation, which appeared to some of us to be the beginning of everything important: the 1952 Acta paper which developed, via the Calderón–Zygmund lemma, the real-variable methods giving the extension of the Hilbert transform to \( n \) dimensions. What was more natural, therefore, than to attempt to prove the \( L^{p} \) boundedness of \( f\rightarrow A(u) \) by adapting these methods? This idea indeed worked, although the initial complicated proofs were later much simplified. The analysis succeeded as well for the Marcinkiewicz function \eqref{eqnonth}, and proved also that the mappings \( f\rightarrow A(u) \) and \( f\rightarrow\mu(F) \) were of weak-type (1, 1).

We turn now to the proof of The­or­em 9a. Its one-di­men­sion­al ver­sion (The­or­em 5a) had been done by us­ing com­plex func­tion the­ory, in par­tic­u­lar con­form­al map­pings. So a com­pletely dif­fer­ent ap­proach was needed. The idea be­hind it can be un­der­stood by ex­amin­ing the case \( p=2 \) of The­or­em 9b, which has an easy proof. A dir­ect cal­cu­la­tion shows that \begin{equation} \int_{\mathbf{R}^{n}}A^{2}(u)\,dx=c\int_{\mathbf{R}_{+}^{n+1}}y|\nabla u|^{2}\,dx\,dy, \label{eqnonfi} \end{equation} where \( c \) is the volume of the unit ball. Next we can use the fact that \[ |\nabla u|^{2}=\tfrac{1}{2}\Delta(|u|^{2}) ,\] and so by Green’s the­or­em \begin{align*} \int_{\mathbf{R}^{n}}A^{2}(u)\,dx & =\frac{c}{2}\iint_{\mathbf{R}_{+}^{n+1}}y\Delta(|u|^{2})\,dx\,dy\\ & =\frac{c}{2}\int|u(x,0)|^{2}\,dx, \end{align*} which proves The­or­em 9b for \( p=2 \), since \( u(x, 0)=f(x) \). Thus in or­der to con­trol \( A_{\mathrm{loc}}(u)(x) \) on a set \( E \), it is nat­ur­al to con­sider \[ \int_{E}A_{\mathrm{loc}}^{2}(u)(x)\,dx \] which in turn is dom­in­ated by \[ c \int_{R(E)} y|\nabla u|^{2}\,dx\,dy ,\] where \( R(E) \) is a stand­ard “saw­tooth” re­gion in \( \mathbf{R}_{+}^{n+1} \) based on \( E \). At this stage (which is the turn­ing point of the proof) Calderón in­voked Green’s the­or­em for an­oth­er re­gion con­tain­ing \( R(E) \), whose Green’s func­tion he could es­sen­tially bound from be­low by \( c^{\prime}y \).
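The two steps of the case \( p=2 \) described above can be written out in a little more detail. What follows is only a sketch under stated assumptions: that \( A(u) \) is defined as in \eqref{eqnonfo} with the cone \( \Gamma(\bar{x})=\{(x,y): |x-\bar{x}| < y\} \) and the weight \( y^{1-n} \), that \( u \) is real-valued, and that \( u \) and its gradient decay fast enough at infinity to justify the integrations by parts. By Fubini’s theorem, \begin{align*} \int_{\mathbf{R}^{n}}A^{2}(u)(\bar{x})\,d\bar{x} & =\iint_{\mathbf{R}_{+}^{n+1}}|\nabla u(x,y)|^{2}\,y^{1-n}\,\bigl|\{\bar{x}: |x-\bar{x}| < y\}\bigr|\,dx\,dy\\ & =c\iint_{\mathbf{R}_{+}^{n+1}}y|\nabla u|^{2}\,dx\,dy, \end{align*} because the ball of radius \( y \) in \( \mathbf{R}^{n} \) has measure \( cy^{n} \). Since \( \Delta u=0 \), \[ \Delta(u^{2})=2|\nabla u|^{2}+2u\,\Delta u=2|\nabla u|^{2} .\] Finally, writing \( G=u^{2} \), the \( x \)-derivatives in \( \Delta G \) integrate to zero over \( \mathbf{R}^{n} \), while for each fixed \( x \) \[ \int_{0}^{\infty}y\,\frac{\partial^{2}G}{\partial y^{2}}\,dy=\Bigl[y\,\frac{\partial G}{\partial y}\Bigr]_{0}^{\infty}-\int_{0}^{\infty}\frac{\partial G}{\partial y}\,dy=G(x,0), \] so that \( \iint y\,\Delta(u^{2})\,dx\,dy=\int|u(x,0)|^{2}\,dx \), as used in the text.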

To prove the converse of Theorem 9a along these lines appeared to require, among other things, appropriate bounds from above for Green’s function for such regions, and that seemed much beyond what could be done then. What turned out to be the right course of action was to finesse the problem of Green’s function and to proceed directly with estimates that followed from the finiteness of \[ \int_{R(E)}y|\nabla u|^{2}\,dx\,dy .\] These arguments also proved to be useful in other situations, as we shall see later. The result obtained was

The­or­em 9c:

Sup­pose \( u \) is har­mon­ic in \( \mathbf{R}_{+}^{n+1} \). Then \( u_{\mathrm{loc}}^{*}(\bar{x}) < \infty \) for al­most all points \( \bar{x}\in \mathbf{R}^{n} \) where \( A_{\mathrm{loc}}(u)(\bar{x}) < \infty \).

I re­mem­ber quite vividly the ex­cite­ment sur­round­ing the events at the time of this work. It was March 1959, and I had re­turned to the Uni­versity of Chica­go the fall be­fore. Fre­quently I met with my friends Guido Weiss and Mary Weiss, and to­geth­er we of­ten found ourselves in Zyg­mund’s of­fice (Eck­hart 309, two doors from mine). With our teach­er our con­ver­sa­tions ranged over a wide vari­ety of top­ics (not all math­em­at­ic­al) and more than once the sub­ject of square func­tions arose. When this happened the mood would change, if only slightly, as if in de­fer­ence to their spe­cial status, and the en­igma that sur­roun­ded them. I had an idea which seemed prom­ising. But be­fore we could see where it might lead came the spring break. Fur­ther work would have to be held in abey­ance since we were each go­ing our own ways: Zyg­mund trav­elled to Bo­ston to vis­it Calderón; Guido and Mary Weiss, hav­ing bor­rowed my Chev­ro­let, drove to Vir­gin­ia for a va­ca­tion trip; and I went to New York to be mar­ried.

b: The Marcinkiewicz function

Influenced by the renewed interest in area integrals, and encouraged by some recent work he had done with Mary Weiss, Zygmund returned to the study of the Marcinkiewicz integral \eqref{eqnonth} and the problem of finding a converse to Theorem 8a. He was convinced that now (more than 20 years after Marcinkiewicz’s original work) the time was ripe to see matters to a conclusion. He suggested to me that we work on the problem together, and of course I was very happy to accept his offer. For me this was a unique and rewarding collaboration, not just because of the special satisfaction one derives when accepted as an equal by one’s teacher, but also because as it turned out he did most of the work that really counted!

We realized first that Theorem 8a itself could be somewhat strengthened; what was required was the notion of the derivative \( F^{\prime}(x) \) existing (at \( x \)) “in the \( L^{2} \) sense”. Thus \( F^{\prime}(x) \) existed in this generalized sense if \begin{equation} \frac{1}{h}\int_{0}^{h}\bigg|\frac{F(x+t)-F(x)}{t}-F^{\prime}(x)\bigg|^{2}\,dt\rightarrow 0, \quad\text{ as }\, h\rightarrow 0. \label{eqnonsi} \end{equation}

The finer ver­sion of The­or­em 8a was then: If \( F\in L^{2} \) had a de­riv­at­ive in the sense of \eqref{eqnonsi} at each \( x \in E \), then \( \mu(F)(x) < \infty \) for al­most every \( x\in E \). It was in this form that one might seek a con­verse. The ba­sic plan was to try to make mat­ters turn on the ana­log­ous situ­ation which held for the area in­teg­ral, where one can pass from the fi­nite­ness of a quad­rat­ic ex­pres­sion to the ex­ist­ence of a lim­it. After a series of re­duc­tions we were able to show that at each point \( x \) where \( \mu(F)(x) < \infty \) one had \begin{equation} \int_{|t|\leq y}\bigg|\frac{\partial^{2}u}{\partial y^{2}}(x+t, y)+\frac{\partial^{2}u}{\partial y^{2}}(x-t, y)\bigg|^{2}\,dt\,dy < \infty \label{eqnonse} \end{equation} with \( u=\operatorname{PI}(F) \). On the oth­er hand we could show (us­ing The­or­em 5b) that at al­most every \( x \) where \begin{equation} \int_{|t|\leq y}\bigg|\frac{\partial^{2}u}{\partial y^{2}}(x+t, y)\bigg|^{2}\,dt\,dy < \infty \label{eqnonei} \end{equation} the con­clu­sion \eqref{eqnonsi} ac­tu­ally held.

The ba­sic dif­fi­culty, the pas­sage from \eqref{eqnonse} to \eqref{eqnonei}, was over­come by Zyg­mund us­ing a clev­er “desym­met­riz­a­tion” ar­gu­ment; sev­er­al weeks later he presen­ted me with an es­sen­tially fi­nal draft of the pa­per which he had typed him­self!

There were sev­er­al vari­ants of the fi­nal res­ult — in­volving ex­ten­sions to \( n \)-di­men­sions, or high­er de­riv­at­ives, or even frac­tion­al de­riv­at­ives. The simplest ver­sion, however, was the fol­low­ing:

The­or­em 10:

Let \( F\in L^{2}(0,2\pi) \). Then the set of points \( x \) where \[ \int_{0}^{\pi}\bigl|F(x+t)+F(x-t)-2F(x)\bigr|^{2}\,dt/t^{3} < \infty, \] and the set of points where \( F^{\prime}(x) \) exists in the \( L^{2} \) sense (i.e., \eqref{eqnonsi}) differ by a set of measure zero.
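Two simple cases may help to show what the two conditions of Theorem 10 measure. They are illustrations chosen here, not examples from the original paper, and since the theorem is an almost-everywhere statement it asserts nothing about any single point. For \( F(x)=\sin x \) one has \[ F(x+t)+F(x-t)-2F(x)=-4\sin x\,\sin^{2}(t/2), \] so the integrand is at most \( t^{4}/t^{3}=t \) and \[ \int_{0}^{\pi}\bigl|F(x+t)+F(x-t)-2F(x)\bigr|^{2}\,dt/t^{3}\leq\int_{0}^{\pi}t\,dt=\frac{\pi^{2}}{2} < \infty \] at every \( x \), in agreement with the existence of \( F^{\prime}(x) \) everywhere. On the other hand, if \( F(x)=|x|^{1/2} \) near \( x=0 \) (extended to a function in \( L^{2} \) of the circle), then at \( x=0 \) the second difference equals \( 2t^{1/2} \) and \[ \int_{0}^{\pi}\bigl|2t^{1/2}\bigr|^{2}\,dt/t^{3}=4\int_{0}^{\pi}\frac{dt}{t^{2}}=\infty, \] while the difference quotient \( F(t)/t=t^{-1/2} \) is not square integrable near \( t=0 \), so \eqref{eqnonsi} fails at \( x=0 \) for every candidate value of \( F^{\prime}(0) \).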

Fifth period (1966–present): Further applications of square functions

We have traced the de­vel­op­ment of square func­tions from their be­gin­nings to a stage where their nature was much bet­ter un­der­stood, in terms of a series of deep the­or­ems that had been ob­tained. Yet it is only more re­cently that their cent­ral role in sev­er­al fields of ana­lys­is has be­come more ap­par­ent. I shall try to de­scribe this very briefly in terms of three spe­cif­ic areas: \( H^{p} \) spaces, sym­met­ric dif­fu­sion semig­roups, and dif­fer­en­ti­ation the­ory in \( \mathbf{R}^{n} \).

a: \( H^{p} \) theory

Be­gin­ning in about 1966 two sep­ar­ate dir­ec­tions of re­search in­volving square func­tions were un­der­taken, and when brought to­geth­er these ul­ti­mately led to a rich har­vest in the the­ory of \( H^{p} \) spaces. The first star­ted with Burk­hold­er’s [e31] ex­ten­sion of Pa­ley’s the­or­em (The­or­em 4 for Walsh–Pa­ley series) to gen­er­al mar­tin­gales. He ob­served that Pa­ley’s ar­gu­ment ex­ten­ded to this gen­er­al set­ting, but also found his own ap­proach which was very dif­fer­ent. He showed that if \[ E_{k}=E(\,\cdot\,\mid\mathcal{F}_{k}) \] are the con­di­tion­al ex­pect­a­tions for an in­creas­ing se­quence of \( \sigma \)-fields \( \{\mathcal{F}_{k}\}_{k=0}^{\infty} \), then with \( E_{-1}(f)\equiv 0 \), \begin{equation} \biggl\Vert\Bigl(\sum_{k=0}^{\infty}\bigl|(E_{k}-E_{k-1})(f)\bigr|^{2}\Bigr)^{1/2}\biggr\Vert_p\simeq\lim_{k\rightarrow\infty}\|E_{k}(f)\|_{p}, \quad 1 < p < \infty. \label{eqnonni} \end{equation}
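For \( p=2 \) the equivalence \eqref{eqnonni} is in fact an identity, and the reason is pure orthogonality. The following sketch assumes \( f\in L^{2} \) and writes \( d_{k}=(E_{k}-E_{k-1})(f) \), a notation introduced here for convenience. If \( j < k \), then \( d_{j} \) is \( \mathcal{F}_{j} \)-measurable and \( E_{j}(d_{k})=0 \), so \( \int d_{j}\,\overline{d_{k}}=0 \). Since \( E_{N}(f)=\sum_{k=0}^{N}d_{k} \), it follows that \[ \|E_{N}(f)\|_{2}^{2}=\sum_{k=0}^{N}\|d_{k}\|_{2}^{2}=\biggl\Vert\Bigl(\sum_{k=0}^{N}|d_{k}|^{2}\Bigr)^{1/2}\biggr\Vert_{2}^{2}, \] and letting \( N\rightarrow\infty \) gives \eqref{eqnonni} with equality when \( p=2 \). The content of the theorem lies in the range \( p\neq 2 \), where no such orthogonality is available.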

Next, in work with Gundy, and later also with Silverstein, the following advances were made: It was shown that \eqref{eqnonni} extended to \( p\leq 1 \) if \( \lim_{k\rightarrow\infty}\|E_{k}(f)\|_{p} \) was replaced with \( \bigl\|\sup_{k}|E_{k}(f)|\bigr\|_{p} \), for a large class of martingales. This class incidentally includes those occurring for the Walsh–Paley series, but more importantly these results went over to the (continuous parameter) martingales arising from Brownian motion applied to harmonic functions. To be more precise, let \( z_{t}(\omega) \) denote the standard Brownian motion in the complex \( z \)-plane, starting at the origin and stopped when reaching the unit circle. Here \( 0\leq t < \infty \) is the time parameter, and \( \omega \) labels the Brownian path, with \( \omega \in\Omega \), \( \Omega \) being the probability space. If \( u \) is harmonic in the unit disc, \( t\rightarrow u(z_{t}(\omega)) \) is a continuous-time martingale. Let \[ M_{B}(u)(\omega)=\sup_{0\leq t < \infty} |u(z_{t}(\omega))| \] be the Brownian maximal function, and \( S(u)(\omega) \) the martingale square function, \[ S(u)(\omega)=\biggl(\int_{0}^{\infty}|\nabla u(z_{t}(\omega))|^{2}\,dt\biggr)^{1/2} .\] Their result then was that \begin{equation} \|Su\|_{L^{p}(\Omega)}\simeq\|M_{B}(u)\|_{L^{p}(\Omega)},\quad 0 < p < \infty, \label{eqntwze} \end{equation} whenever \( u(0)=0 \).

The most strik­ing ap­plic­a­tion of this circle of ideas was a con­clu­sion drawn from \eqref{eqntwze}, to wit, whenev­er \( F=u+iv \) is holo­morph­ic in the unit disc, then \( F\in H^{p} \) if and only if \( u^{*}\in L^{p},\ 0 < p < \infty \).

The second line of research began when a more direct connection between standard multiplier operators and square functions was discovered. The result was easy to state. Whenever \( T \) is a multiplier operator of the Marcinkiewicz type on \( \mathbf{R}^{n} \) (more precisely one that satisfies the kind of conditions put in Hörmander’s version of that multiplier theorem), then the area integral corresponding to \( T(f) \) is pointwise dominated by a \( g^{*} \) function of \( f \), i.e., \begin{equation} A(T\mkern-4mu f)(x)\leq cg_{\lambda}^{*}(f)(x), \label{eqntwon} \end{equation} where \[ g_{\lambda}^{*}(f)(x)=\biggl(\int\bigl|\nabla u(x-t, y)\bigr|^{2}\Bigl(\frac{y}{y+|t|}\Bigr)^{n\lambda}y^{1-n}\,dy\,dt\biggr)^{1/2}, \] and \( \lambda \) is a parameter which depends on the nature of the multiplier. An \( H^{p} \) theory in \( \mathbf{R}^{n} \) had already been initiated several years before (by the efforts of G. Weiss and others), and using it and \eqref{eqntwon} it followed that these multipliers also extended to bounded operators on \( H^{p} \).

From these con­sid­er­a­tions it might be guessed that a ba­sic tool for \( H^{p} \) the­ory is the re­la­tion between square func­tions and max­im­al prop­er­ties of (har­mon­ic) func­tions. Here im­port­ant con­tri­bu­tions were made by C. Fef­fer­man. One of the res­ults ob­tained in this dir­ec­tion was the fol­low­ing the­or­em:

The­or­em 11:

Sup­pose that \( u \) is har­mon­ic in \( \mathbf{R}_{+}^{n+1} \), and \( u(x, y)\rightarrow 0 \) as \( y\rightarrow\infty \). Then [e40] \[ \|A(u)\|_{p}\simeq\|u^{*}\|_{p}, \quad 0 < p < \infty. \]

Incidentally it should be remarked that the proof used the same approach as its “local” analogue, Theorem 9c, but additional arguments of a quantitative nature were of course needed. More recently some of these results for square functions have been extended to product domains, and in this context generalizations of Theorems 9 and 11 have been found.

b: Symmetric diffusion semigroups

The semigroups which are the subject of the title are a family of operators \( \{T^{t}\}_{t\geq 0} \), each bounded and selfadjoint on \( L^{2} \), with \( T^{t} \) having norm \( \leq 1 \) on every \( L^{p} \), \( 1\leq p\leq\infty \), and \[ T^{t_{1}+t_{2}}=T^{t_{1}}T^{t_{2}} ,\] with \[ \lim_{t\rightarrow 0}T^{t}f=f \] for \( f\in L^{2} \). Sometimes the additional hypotheses are made that \( T^{t}(1)=1 \), and \( T^{t} \) is positivity-preserving.

The significance of this notion derives from the many important examples of such semigroups in analysis, and the many rich properties that they share. In fact some of the basic results discussed above have versions valid in this context. Here we mention two, a maximal theorem, and a multiplier theorem in the spirit of Marcinkiewicz’s theorem (Theorem 7).

The­or­em 11a:

\( \bigl\|\sup_{t > 0}|T^{t}f|\bigr\|_{p}\leq A_{p}\|f\|_{p},\ 1 < p\leq\infty \).

To for­mu­late the mul­ti­pli­er the­or­em we write \( T^{t} \) in terms of its spec­tral de­com­pos­i­tion, \[ T^{t}\mkern-2mu=\mkern-2mu\int_0^{\infty} e^{-\lambda t}\,dE(\lambda) ,\] where \( E(\lambda) \) is a spec­tral res­ol­u­tion on \( L^{2} \). For each bounded Borel meas­ur­able func­tion \( m \) on \( (0, \infty) \), con­sider the “mul­ti­pli­er” op­er­at­or \( T_{m} \) giv­en by \[ T_{m}=\int_0^\infty m(\lambda)\,dE(\lambda) .\] Here we as­sume that \( m \) is of the form \[ m(\lambda)=\lambda\int_{0}^{\infty}M(s)e^{-\lambda s}\,ds ,\] with \( M \) a bounded func­tion.

The­or­em 11b:

\( \|T_{m}(f)\|_{p}\leq A_{p}\|f\|_{p},\ 1 < p < \infty \).
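To see what the hypothesis on \( m \) allows, note first that any such \( m \) is automatically bounded: for \( \lambda > 0 \), \[ |m(\lambda)|\leq\lambda\int_{0}^{\infty}|M(s)|e^{-\lambda s}\,ds\leq\|M\|_{\infty} .\] A standard illustration (given here as an example, with \( \gamma \) a real parameter introduced for the purpose) is the choice \( M(s)=s^{-i\gamma}/\Gamma(1-i\gamma) \), which is bounded since \( |s^{-i\gamma}|=1 \). Then \[ m(\lambda)=\frac{\lambda}{\Gamma(1-i\gamma)}\int_{0}^{\infty}s^{-i\gamma}e^{-\lambda s}\,ds=\frac{\lambda}{\Gamma(1-i\gamma)}\cdot\frac{\Gamma(1-i\gamma)}{\lambda^{1-i\gamma}}=\lambda^{i\gamma}, \] so that \( T_{m} \) is the imaginary power \( \int_{0}^{\infty}\lambda^{i\gamma}\,dE(\lambda) \) of the generator of the semigroup.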

A key tool used for the proof of both these theorems is the family of Littlewood–Paley type functions \[ g_{k}(f)(x)=\biggl(\int_{0}^{\infty}t^{2k-1}\Bigl|\frac{\partial^{k}}{\partial t^{k}}T^{t}(f)\Bigr|^{2}\,dt\biggr)^{1/2} \quad\text{with }k=1,2,\ldots. \] Also for \( T_{m} \) a relation of the same kind as \eqref{eqntwon} holds.
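As with the classical \( g \)-function, the case \( p=2 \) follows at once from the spectral decomposition. The following is a sketch, with \( E_{0} \) denoting (in notation introduced here) the projection onto the part of the spectrum at \( \lambda=0 \). Since \( \frac{\partial^{k}}{\partial t^{k}}T^{t}f=\int_{0}^{\infty}(-\lambda)^{k}e^{-\lambda t}\,dE(\lambda)f \), Fubini’s theorem gives \begin{align*} \|g_{k}(f)\|_{2}^{2} & =\int_{0}^{\infty}t^{2k-1}\Bigl\|\frac{\partial^{k}}{\partial t^{k}}T^{t}f\Bigr\|_{2}^{2}\,dt\\ & =\int_{\lambda > 0}\biggl(\int_{0}^{\infty}t^{2k-1}\lambda^{2k}e^{-2\lambda t}\,dt\biggr)\,d(E(\lambda)f,f)\\ & =\frac{\Gamma(2k)}{4^{k}}\,\|f-E_{0}f\|_{2}^{2}, \end{align*} because the inner integral equals \( \Gamma(2k)/2^{2k} \) for every \( \lambda > 0 \). In particular \( \|g_{1}(f)\|_{2}\leq\tfrac{1}{2}\|f\|_{2} \).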

c: Differentiation theorems in \( \mathbf{R}^{n} \)

Probably the most dramatic applications of square functions occur in differentiation theory. The general problem here is to prove that \begin{equation} \lim_{\operatorname{diam} R\rightarrow 0}\frac{1}{\mu(R)}\int_{R}f(x-y)\,d\mu(y)=f(x)\quad \text{ a.e.} \label{eqntwtw} \end{equation} where \( R \) ranges over a suitable collection \( \mathcal{R} \) of sets “centered” at the origin. The classical examples of these are (i) where \( \mathcal{R} \) is the collection of all balls (or cubes) containing the origin, and (ii) where \( \mathcal{R} \) is the collection of all rectangles containing the origin, with sides parallel to the axes. For each of these results a Vitali-type covering theorem has played a decisive role. Thus it may seem surprising that the alien notion of square functions would turn out to be the appropriate idea in related situations, where covering arguments were unavailing. In formulating the results obtained this way we shall, as is usual, deal with the corresponding maximal function \[ M_{\mathcal{R}}(f)(x)=\sup_{R\in\mathcal{R}}\frac{1}{\mu(R)}\bigg|\int_{R}f(x-y)\,d\mu(y)\bigg|, \] and the possibility of asserting inequalities of the type \begin{equation} \|M_{\mathcal{R}}(f)\|_{p}\leq A_{p}\|f\|_{p}. \label{eqntwth} \end{equation}

The­or­em 12:

The in­equal­ity \eqref{eqntwth} holds in the fol­low­ing cases:

  1. \( \mathcal{R} \) is the col­lec­tion of spheres centered at the ori­gin; \( d\mu \) is the uni­form sur­face meas­ure; and \( n\geq 3 \), with \( p > n/(n-1) \).

  2. \( \mathcal{R} \) is the col­lec­tion of ini­tial seg­ments \( \{\gamma(t), 0\leq t\leq h\} \) of a smooth curve \( t\rightarrow\gamma(t) \), with \( \gamma(0)=0 \), and \( \gamma \) hav­ing nonzero “curvature” at the ori­gin; here \( d\mu \) is arc-length, \( n\geq 1 \) and \( p > 1 \).

  3. \( \mathcal{R} \) is the col­lec­tion of rect­angles (in \( \mathbf{R}^{2} \)) con­tain­ing the ori­gin, which make an angle \( \theta_{k} \) with a fixed dir­ec­tion, where \( \{\theta_{k}\} \) is a se­quence of num­bers tend­ing rap­idly to zero; here \( p > 1 \).

The proof of each part of this theorem requires its own square function. We shall not describe these rather complicated quadratic functions here, but refer the reader to the literature for further details.


Since the original draft of this essay was written, two new results were found which use square functions in a decisive way.

The first is the solu­tion of the prob­lem of Cauchy’s in­teg­ral for Lipschitz curves by Coi­f­man, McIn­tosh, and Mey­er [e50]. It is to be noted that in Calderón’s ini­tial work on this prob­lem (1965), square func­tions were already used in a cru­cial way. In par­tic­u­lar the in­equal­ity \[ c\|F\|_{H^{p}}\leq\|A(F)\|_{p}, \quad p\leq 1 ,\] was proved there for this pur­pose.

The second res­ult deals with the stand­ard max­im­al func­tion in \( \mathbf{R}^{n} \) \[ M_{n}(f)(x)=\sup_{r > 0}\frac{1}{c_{n}r^{n}}\bigg|\int_{|y|\leq r}f(x-y)\,dy\bigg|, \] where \( c_{n} \) is the volume of the unit ball in \( \mathbf{R}^{n} \).

The ques­tion that arises is, how does the \( L^{p} \) norm of \( M_{n} \) be­have for large \( n \)? The best that can be proved by the usu­al Vi­tali cov­er­ing ar­gu­ments gives \[ \|M_{n}(f)\|_{p}\leq A(p, n)\|f\|_{p}, \quad 1 < p ,\] with \( A(p, n)\leq A(p)\,2^{n/p} \), which is a large growth as \( n\rightarrow\infty \). However much more can be said.

The­or­em 13:

\( \|M_{n}(f)\|_{p}\leq A_{p}\|f\|_{p},\ 1 < p\leq\infty \), with \( A_{p} \) in­de­pend­ent of \( n \).

The idea of the proof is to con­sider in \( \mathbf{R}^{m} \) the max­im­al func­tions \( M_{m,k} \) defined by \[ M_{m,k}(f)(x)=\sup_{r > 0}\frac{\big|\int_{|y|\leq r}f(x-y)|y|^{k}\,dy\big|}{\int_{|y|\leq r}|y|^{k}\,dy},\quad k\geq 0. \] Then if \( m \) is so large that \( p > m/(m-1) \),

\begin{equation} \|M_{m,k}(f)\|_{p}\leq A_{p,m}\|f\|_{p} \label{eqntwfo} \end{equation}

with \( A_{p,m} \) independent of \( k \), \( k\geq 0 \). This follows from Theorem 12, part 1. From this Theorem 13 is obtained by lifting the \( m \)-dimensional result \eqref{eqntwfo} into \( \mathbf{R}^{n} \), where \( n\geq m \) (and \( k=n-m \)), by integrating over the Grassmannian of \( m \)-planes in \( \mathbf{R}^{n} \) through the origin.
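The lifting step can be sketched as follows. The notation \( G \), \( d\pi \) and \( M_{m,k}^{\pi} \) is introduced here for illustration, and the details are a reconstruction rather than a quotation of the original proof. Let \( G \) be the Grassmannian of \( m \)-planes \( \pi \) through the origin in \( \mathbf{R}^{n} \), with its rotation-invariant probability measure \( d\pi \), and let \( k=n-m \). Integration in polar coordinates gives, for integrable \( g \) on \( \mathbf{R}^{n} \), \[ \int_{\mathbf{R}^{n}}g(y)\,dy=c_{n,m}\int_{G}\int_{\pi}g(y)|y|^{k}\,dy\,d\pi, \] where the inner integral is taken with respect to \( m \)-dimensional Lebesgue measure on \( \pi \). Applying this both to \( y\mapsto f(x-y) \) restricted to \( |y|\leq r \) and to the indicator of that ball, and noting that \( \int_{y\in\pi,\,|y|\leq r}|y|^{k}\,dy \) is the same for every \( \pi \), one finds \[ \frac{1}{c_{n}r^{n}}\int_{|y|\leq r}f(x-y)\,dy=\int_{G}\frac{\int_{y\in\pi,\,|y|\leq r}f(x-y)|y|^{k}\,dy}{\int_{y\in\pi,\,|y|\leq r}|y|^{k}\,dy}\,d\pi .\] Hence \( M_{n}(f)(x)\leq\int_{G}M_{m,k}^{\pi}(f)(x)\,d\pi \), where \( M_{m,k}^{\pi} \) denotes \( M_{m,k} \) acting in the variables of the plane \( x+\pi \). Minkowski’s integral inequality, together with \eqref{eqntwfo} applied on each translate of \( \pi \) and Fubini’s theorem in the splitting \( \mathbf{R}^{n}=\pi\oplus\pi^{\perp} \), then gives \( \|M_{n}(f)\|_{p}\leq A_{p,m}\|f\|_{p} \). Since \( m \) depends only on \( p \), the bound does not grow with \( n \).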


[1] A. Zygmund: “Une remarque sur un théorème de M. Kaczmarz” [A remark on a theorem of M. Kaczmarz], Math. Z. 25:1 (1926), pp. 297–298. MR 1544811 JFM 52.0278.01

[2] A. Zygmund: “Sur l’application de la première moyenne arithmétique dans la théorie des séries de fonctions orthogonales” [On the application of the first arithmetic mean to the theory of series of orthogonal functions], Fundam. Math. 10 (1927), pp. 356–362. JFM 53.0267.04

[3] J. Marcinkiewicz and A. Zygmund: “A theorem of Lusin,” Duke Math. J. 4:3 (1938), pp. 473–485. MR 1546069 JFM 64.0268.01 Zbl 0019.42001

[4] A. Zygmund: “On the convergence and summability of power series on the circle of convergence, I,” Fundam. Math. 30 (1938), pp. 170–196. Part II was published in Proc. London Math. Soc. 47:1. JFM 64.1054.01 Zbl 0019.01602

[5] A. Zygmund: “On certain integrals,” Trans. Am. Math. Soc. 55 (1944), pp. 170–204. MR 0009966 Zbl 0061.13902

[6] A. Zygmund: “Proof of a theorem of Littlewood and Paley,” Bull. Am. Math. Soc. 51:6 (1945), pp. 439–446. MR 0012306 Zbl 0060.14703

[7] A. P. Calderon and A. Zygmund: “On the existence of certain singular integrals,” Acta Math. 88 (December 1952), pp. 85–139. Dedicated to Professor Marcel Riesz, on the occasion of his 65th birthday. MR 0052553 Zbl 0047.10201

[8] A. Zygmund: “On the Littlewood–Paley function \( g^*(\theta) \),” Proc. Natl. Acad. Sci. U.S.A. 42:4 (April 1956), pp. 208–212. MR 0077700 Zbl 0072.07201

[9] M. Weiss and A. Zygmund: “A note on smooth functions,” Nederl. Akad. Wetensch. Proc. Ser. A 62:1 (1959), pp. 52–58. MR 0107122 Zbl 0085.05701

[10] A. Zygmund: Trigonometric series, 2nd edition, vol. I. Cambridge University Press (New York), 1959. First volume of an enlarged edition of 1935 original. MR 0107776 Zbl 0085.05601

[11] A.-P. Calderón and A. Zygmund: “Local properties of solutions of elliptic partial differential equations,” Studia Math. 20:2 (1961), pp. 171–225. This expands on an article published in Proc. Natl. Acad. Sci. U.S.A. 46:10 (1960). MR 0136849 Zbl 0099.30103

[12] E. M. Stein and A. Zygmund: “On the differentiability of functions,” Studia Math. 23:3 (1963–1964), pp. 247–283. Dedicated to E. Hille on the occasion of his 70th birthday. MR 0158955 Zbl 0122.30203