# Celebratio Mathematica

## Antoni Zygmund

### The development of square functions in the work of A. Zygmund

#### by Elias M. Stein

I’ve de­cided to write this es­say about “square func­tions” for two reas­ons. First, their de­vel­op­ment has been so in­ter­twined with the sci­entif­ic work of A. Zyg­mund that it seems highly ap­pro­pri­ate to do so now on the oc­ca­sion of his 80th birth­day. Also these func­tions are of fun­da­ment­al im­port­ance in ana­lys­is, stand­ing as they do at the cross­ing of three im­port­ant roads many of us have trav­elled by: com­plex func­tion the­ory, the Four­i­er trans­form (or or­tho­gon­al­ity in its vari­ous guises), and real-vari­able meth­ods. In fact, the more re­cent ap­plic­a­tions of these ideas, de­scribed at the end of this es­say, can be seen as con­firm­a­tion of the sig­ni­fic­ance Zyg­mund al­ways at­tached to square func­tions.

This is go­ing to be a partly his­tor­ic­al sur­vey, and so I hope you will al­low me to take the usu­al liber­ties as­so­ci­ated with this kind of en­ter­prise: I will break up the ex­pos­i­tion in­to cer­tain “his­tor­ic­al peri­ods”, five to be pre­cise; and by do­ing this I will be able to sug­gest my own views as to what might have been the key in­flu­ences and ideas that brought about these de­vel­op­ments.

One word of ex­plan­a­tion about “square func­tions” is called for. A deep concept in math­em­at­ics is usu­ally not an idea in its pure form, but rather takes vari­ous shapes de­pend­ing on the uses it is put to. The same is true of square func­tions. These ap­pear in a vari­ety of forms, and while in spir­it they are all the same, in ac­tu­al prac­tice they can be quite dif­fer­ent. Thus the meta­morph­os­is of square func­tions is all im­port­ant.

#### First period (1922–1926): The primordial square functions

It ap­pears that square func­tions arose first in an ex­pli­cit form in a beau­ti­ful the­or­em of Kaczmarz and Zyg­mund deal­ing with the al­most every­where sum­mab­il­ity of or­tho­gon­al ex­pan­sions. The the­or­em was proved in 1926 as the cul­min­a­tion of sev­er­al pa­pers each had writ­ten at about that time. The the­or­em it­self was an out­growth of what cer­tainly was one of the main pre­oc­cu­pa­tions of ana­lysts at that time, namely the ques­tion of con­ver­gence of Four­i­er series. The prob­lem was the fol­low­ing. Sup­pose $f=f(\theta)$ is a con­tinu­ous func­tion on the circle, $0\leq\theta\leq 2\pi$, or more gen­er­ally as­sume that $f$ is in $L^{2}(0,2\pi)$ or even that $f$ is merely in­teg­rable; then does its Four­i­er series $$\sum a_{n}e^{in \theta} \quad\text{with}\quad a_{n}=\frac{1}{2\pi}\int_{0}^{2\pi}f(\theta)e^{-in \theta}\,d\theta, \label{eqnon}$$ con­verge al­most every­where?

A re­lated par­al­lel is­sue was the cor­res­pond­ing ques­tion for a gen­er­al or­thonor­mal ex­pan­sion, but now lim­ited to $f\in L^{2}$. Thus if $\{\phi_{n}\}$ is an or­thonor­mal sys­tem, and if $f\sim\sum a_{n}\phi_{n} \quad\text{with}\quad a_{n}=\int f\overline{\phi}_{n} ,$ where $\sum |a_{n}|^{2} < \infty$, then what could be said about the con­ver­gence al­most every­where of $$\sum_{n=1}^{\infty}a_{n}\phi_{n}(x)? \label{eqntw}$$

The peri­od we are deal­ing with (1922–1926) was marked by sev­er­al strik­ing achieve­ments in this area, whose es­sen­tial in­terest is not di­min­ished even when viewed from the dis­tant per­spect­ive of more than a half cen­tury. The first res­ult to men­tion was the con­struc­tion by Kolmogorov in 1923 [e3] of an $L^{1}$ func­tion whose Four­i­er series \eqref{eqnon} di­verged al­most every­where.1 This con­struc­tion made even more press­ing the ques­tion of wheth­er the Four­i­er series \eqref{eqnon} con­verges al­most every­where when (say) $f$ be­longs to $L^{2}$, a prob­lem that was not solved till more than forty years later. We shall turn to that in a mo­ment, but now we point out that Kolmogorov’s ex­ample put in­to sharp­er re­lief the $L^{2}$ res­ults for gen­er­al or­thonor­mal de­vel­op­ments that had been ob­tained (in 1922 and 1923) by Rademach­er and Men­shov. They showed that if $$\sum|a_{n}|^{2}(\log n)^{2} < \infty \label{eqnth}$$ then the series \eqref{eqntw} con­verges a.e.

Moreover the condition \eqref{eqnth} is best possible in the sense that if $\{\lambda_{n}\}$ is monotonic and $\smash{\lambda_{n}}/\log n\rightarrow 0 ,$ then there exist an orthonormal system $\{\phi_{n}\}$ and an expansion \eqref{eqntw} which diverges a.e., while $\sum \smash{|a_{n}|^{2}\lambda_{n}^{2}} < \infty .$

For or­din­ary Four­i­er series it was proved2 that the con­di­tion \eqref{eqnth} could be re­laxed and be re­placed by $$\sum_{-\infty}^{\infty}|a_{n}|^{2}\log(|n|+2) < \infty. \label{eqnfo}$$

This last res­ult stood un­sur­passed for forty years un­til Car­leson in 1966 showed that in­deed the Four­i­er series of an $L^{2}$ func­tion con­verged al­most every­where. It may be in­ter­est­ing to note here that the ba­sic tools re­quired for Car­leson’s the­or­em — the prop­er­ties of the Hil­bert trans­form and their re­la­tion with par­tial sums of Four­i­er series — were first brought to light in this early peri­od: Kolmogorov’s proof of the weak-type (1, 1) prop­erty in 1925; M. Riesz’s pa­per of 1927 [e11] con­tain­ing the $L^{p}$ in­equal­it­ies for con­jug­ate func­tions and par­tial sums; and Be­sicov­itch’s work (in 1923 [e2] and 1926 [e6]) which began the de­vel­op­ment of “real-vari­able” meth­ods for Hil­bert trans­forms.

Against this background we can now state the idea of Kaczmarz and Zygmund. It asserts as a general principle that for an $L^{2}$ orthonormal expansion (i.e., one where $\sum |a_{n}|^{2} < \infty$), at almost all points the summability of the series $\sum a_{n}\phi_{n}(x)$ by one method has as a consequence its summability by any other method which is essentially stronger than convergence. A special (but typical) case is as follows:

The­or­em 1:

Sup­pose $\sum|a_{n}|^{2} < \infty$. Then $\sum a_{n}\phi_{n}(x)$ is Cesàro sum­mable at al­most each point $x$ where it is Abel sum­mable.

Recall that the series is Abel summable at $x$ if $\lim_{r\rightarrow 1^{-}}\sum a_{n}r^{n}\phi_{n}(x) \quad\text{exists}.$ In addition, setting \begin{align*} s_{n} &=\sum_{k=0}^{n}a_{k}\phi_{k} \quad\text{and}\\ \sigma_{n} & =(s_{0}+s_{1}+\cdots+s_{n-1})/n , \end{align*} the Cesàro summability at $x$ means the existence of the limit $\lim_{n\rightarrow\infty}\sigma_{n}(x) .$

If a series is Cesàro sum­mable it is auto­mat­ic­ally Abel sum­mable (an ex­er­cise!), but the con­verse is in gen­er­al not true. To gain a bet­ter idea of the scope of The­or­em 1 let us point out that $\sigma_{n}(x) =\sum_{k=0}^{n}\Bigl(1-\frac kn\Bigr)a_{k}\phi_{k}(x)$ and a res­ult sim­il­ar to The­or­em 1 holds when $\sigma_{n}(x)$ is re­placed by $\sigma_{n}^{\epsilon}(x) = \sum_{k=0}^{n}\Bigl(1-\frac kn\Bigr)^{\epsilon}a_{k}\phi_{k}(x) \quad\text{with }\epsilon > 0$ (which cor­res­ponds es­sen­tially to $(C, \epsilon)$ sum­mab­il­ity), but not for $\epsilon=0$ which of course would give the usu­al con­ver­gence.
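The exercise mentioned above can be done in one line. What follows is a standard sketch, not taken from the original papers, written for a numerical series $\sum a_{k}$ with $s_{n}$ and $\sigma_{n}$ as defined earlier, and assuming that the power series converge for $0\leq r < 1$ (which is the case, for instance, when the $\sigma_{n}$ are bounded). Summing by parts twice gives

$$\sum_{k=0}^{\infty}a_{k}r^{k}=(1-r)\sum_{k=0}^{\infty}s_{k}r^{k}=(1-r)^{2}\sum_{n=1}^{\infty}n\sigma_{n}r^{n-1}, \qquad (1-r)^{2}\sum_{n=1}^{\infty}nr^{n-1}=1.$$

Thus each Abel mean is an average of the Cesàro means with nonnegative weights $(1-r)^{2}nr^{n-1}$ of total mass 1, and each individual weight tends to 0 as $r\rightarrow 1^{-}$. Hence $\sigma_{n}\rightarrow s$ forces the Abel means to converge to $s$ as well.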

For the proof of The­or­em 1 Kaczmarz and Zyg­mund used a square func­tion which they in­tro­duced for this pur­pose, namely $$K(f)=\biggl(\sum_{n=2}^{\infty}n|\sigma_{n}-\sigma_{n-1}|^{2}\biggr)^{\mkern-2mu{1/2}} \label{eqnfi}$$ with $f\sim\sum a_{n}\phi_{n}$. The ba­sic fact was the $L^{2}$ in­equal­ity.

Lemma:

$\|K(f)\|_{L^{2}} \leq C \|f\|_{L^{2}}.$

Clearly $\sigma_{n}-\sigma_{n-1}=\frac{1}{n(n-1)}\sum_{k=0}^{n-1}ka_{k}\phi_{k},$ so $\|\sigma_{n}-\sigma_{n-1}\|_{2}^{2}\leq\frac{c}{n^{4}}\sum_{k < n}k^{2}|a_{k}|^{2}, \quad n\geq 2,$ and thus $\sum_{2}^{\infty}n\|\sigma_{n}-\sigma_{n-1}\|_{2}^{2}\leq c^{\prime}\sum|a_{k}|^{2}=c^{\prime}\|f\|_{2}^{2} ,$ which proves the lemma.
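The last step is an interchange of the order of summation, which may be worth spelling out. In the sketch below the explicit constants are ours: one may take $c=4$, because $n-1\geq n/2$ for $n\geq 2$, and we use the crude bound $\sum_{n > k}n^{-3}\leq\int_{k}^{\infty}x^{-3}\,dx=1/(2k^{2})$ for $k\geq 1$. Then

$$\sum_{n=2}^{\infty}n\cdot\frac{4}{n^{4}}\sum_{k < n}k^{2}|a_{k}|^{2} = 4\sum_{k=1}^{\infty}k^{2}|a_{k}|^{2}\sum_{n > k}n^{-3} \leq 2\sum_{k=1}^{\infty}|a_{k}|^{2},$$

so that $c^{\prime}=2$ would do. The term $k=0$ plays no role because of the factor $k^{2}$.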

To prove the the­or­em one in­vokes a vari­ant of the clas­sic­al Tauberi­an ar­gu­ment, namely, if $\smash{\sum A_{n}}$ is Abel sum­mable and $\sum nA_{n}^{2} < \infty$, then $\sum A_{n}$ con­verges. Now set $A_{n}=\sigma_{n}-\sigma_{n-1}$; then the Abel sum­mab­il­ity of $\sum A_{n}$ fol­lows from the cor­res­pond­ing Abel sum­mab­il­ity of $\sum a_{n}\phi_{n}$. The Tauberi­an con­di­tion holds at al­most all points be­cause of the lemma, and hence one ob­tains a.e. the con­ver­gence of $\sum(\sigma_{n}-\sigma_{n-1})$, prov­ing the the­or­em.
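For completeness, here is one standard way to verify the Tauberian statement just used; it is a sketch, and not necessarily the argument employed in 1926. Write $s_{N}=\sum_{n\leq N}A_{n}$ and $r_{N}=1-1/N$ (notation introduced here). Since $0\leq 1-r^{n}\leq n(1-r)$,

$$\Bigl|s_{N}-\sum_{n}A_{n}r_{N}^{n}\Bigr| \leq \frac{1}{N}\sum_{n\leq N}n|A_{n}| + \sum_{n > N}|A_{n}|r_{N}^{n}.$$

By the Cauchy–Schwarz inequality, for any $M < N$,

$$\frac{1}{N}\sum_{n\leq N}n|A_{n}| \leq \frac{1}{N}\sum_{n\leq M}n|A_{n}| + \Bigl(\sum_{n > M}n|A_{n}|^{2}\Bigr)^{1/2}, \qquad \sum_{n > N}|A_{n}|r_{N}^{n} \leq \Bigl(\sum_{n > N}n|A_{n}|^{2}\Bigr)^{1/2},$$

where the first bound uses $\sum_{n\leq N}n\leq N^{2}$ and the second uses $\sum_{n > N}r_{N}^{2n}/n\leq N^{-1}(1-r_{N})^{-1}=1$. Letting $N\rightarrow\infty$ and then $M\rightarrow\infty$ shows that $s_{N}-\sum A_{n}r_{N}^{n}\rightarrow 0$, so $s_{N}$ converges to the Abel sum.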

We have seen the first ex­ample of a square func­tion, namely \eqref{eqnfi}. While here it plays a minor role, its ba­sic char­ac­ter is already re­vealed: Be­cause of the agil­ity of its quad­rat­ic nature it can ex­ploit eas­ily any situ­ation in which or­tho­gon­al­ity might be im­port­ant.

#### Second period (1931–1938): Littlewood and Paley

Our scene shifts now from the Con­tin­ent to Eng­land, and to the work of Lit­tle­wood and Pa­ley. Our at­ten­tion will be fo­cused on two im­port­ant series of con­nec­ted pa­pers: three jointly by Lit­tle­wood and Pa­ley 1931–1938 [e14], [e17], [e19], and two by Pa­ley 1932 [e15], [e16]. The in­vest­ig­a­tions de­scribed in these pa­pers were ini­ti­ated sim­ul­tan­eously (the first pa­per in each series was sub­mit­ted in April 1931), but be­cause of Pa­ley’s death in 1933 the fi­nal ver­sions of sev­er­al of the pa­pers were prob­ably Lit­tle­wood’s work alone. It is also in­ter­est­ing to note that no ref­er­ence is made in these pa­pers to the res­ults de­scribed above, and so it is a reas­on­able guess that they were not aware of the pos­sible rel­ev­ance of the ideas of Kaczmarz and Zyg­mund.

The main theme of the Lit­tle­wood–Pa­ley work was to con­sider the “dy­ad­ic de­com­pos­i­tion” of Four­i­er series, namely $f(\theta)=\sum_{k=0}^{\infty}\Delta_{k}(\theta),$ with \begin{align*} & \Delta_{k}(\theta) =\sum_{2^{k-1}\leq|n| < 2^{k}}a_{n}e^{in \theta}, \quad k\geq1;\\ & \Delta_{0} =a_{0}. \end{align*}

Their ba­sic res­ult was that the $L^{p}$ norm of a func­tion was equi­val­ent with the $L^{p}$ norm of the square func­tion as­so­ci­ated with its dy­ad­ic de­com­pos­i­tion.

The­or­em 2:

For $1 < p < \infty$, $\biggl\Vert\Bigl(\sum_{k=0}^{\infty}|\Delta_{k}(\theta)|^{2}\Bigr)^{1/2}\biggr\Vert_{p}\simeq\|f\|_{p}.$

To prove this theorem they needed and thus formulated an “abelian” analogue, where partial sums are replaced by Abel means, i.e., by the Poisson integral $u(r, \theta)$ of $f$. Thus given $f$, let $\Phi$ be the holomorphic function in the unit disc with $\operatorname{Re}(\Phi)=u$, and $\operatorname{Im}(\Phi(0))=0$. They defined another square function, the “$g$-function” of $f$, by $g(f)(\theta)=\biggl(\int_{0}^{1}(1-r)\bigl|\Phi^{\prime}(re^{i\theta})\bigr|^{2}\,dr\biggr)^{1/2}$ and proved the following

The­or­em 3:

With $1 < p < \infty$ $$\|g(f)\|_{p}\simeq\|f\|_{p} \quad\textit{if } a_{0}=0. \label{eqnsi}$$
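For orientation, the case $p=2$ of Theorems 2 and 3 is a direct computation with Parseval's formula. The following sketch rests on conventions that are ours: $f$ is real-valued with $a_{0}=0$, we write $\Phi(z)=\sum_{n\geq 1}c_{n}z^{n}$ (so that $c_{n}=2a_{n}$), and norms are normalized by $\|f\|_{2}^{2}=\int_{0}^{2\pi}|f|^{2}\,d\theta=\pi\sum_{n\geq 1}|c_{n}|^{2}$. Since the dyadic blocks are mutually orthogonal,

$$\biggl\Vert\Bigl(\sum_{k}|\Delta_{k}|^{2}\Bigr)^{1/2}\biggr\Vert_{2}^{2}=\sum_{k}\|\Delta_{k}\|_{2}^{2}=\|f\|_{2}^{2},$$

while, using $\int_{0}^{1}(1-r)r^{2n-2}\,dr=\frac{1}{2n(2n-1)}$,

$$\|g(f)\|_{2}^{2}=\int_{0}^{1}(1-r)\,2\pi\sum_{n\geq 1}n^{2}|c_{n}|^{2}r^{2n-2}\,dr = 2\pi\sum_{n\geq 1}\frac{n}{2(2n-1)}|c_{n}|^{2}.$$

The factor $\frac{n}{2(2n-1)}$ lies between $\frac14$ and $\frac12$, so under these normalizations $\tfrac12\|f\|_{2}^{2}\leq\|g(f)\|_{2}^{2}\leq\|f\|_{2}^{2}$.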

Paley sought a better understanding of the nature of these problems by considering variants of Theorem 2 where the Fourier series expansion is replaced by the Walsh–Paley expansion. The Walsh–Paley functions (called Walsh–Kaczmarz functions at that time) are now usually described as follows. We identify the interval $[0,1]$ with the compact group consisting of an infinite product of copies of the two-element group (via the usual binary expansion). The characters of that group are the Walsh–Paley functions. Writing each integer as a sum of powers of 2 gives a natural enumeration of the characters $\{\phi_{n}\}_{n=0}^{\infty}$. If we set \begin{align*} & f\sim\sum a_{n}\phi_{n} \quad\text{and}\\ & \Delta_{k}=s_{2^{k}}-s_{2^{k-1}}=\sum_{2^{k-1}\leq n < 2^{k}}a_{n}\phi_{n} \quad\text{with}\\ & \Delta_{0}=a_{0}, \end{align*} where $s_{m}=\sum_{n < m}a_{n}\phi_{n}$ denotes the sum of the first $m$ terms, then Paley’s theorem reads as

The­or­em 4:

For the Walsh–Pa­ley series, with $1 < p < \infty$ $\biggl\Vert\Bigl(\sum|\Delta_{k}|^{2}\Bigr)^{1/2}\biggr\Vert_{p}\simeq\|f\|_{p}.$

What makes the proof of Theorem 4 easier than that of Theorem 2 are the various simplifications inherent in the fact that $\{s_{2^{k}}(f)\}$ is a martingale sequence. The name “martingale” had not yet been coined. Moreover, a systematic extension of Theorem 4 from the point of view of martingales, and its further exploration in the magical world of Brownian motion, all came much later, as we shall see. However in Paley’s time some of the arguments typical of martingale theory were already understood. Thus it had been observed that $s_{2^{k}}(f)$ was constant on each of the $2^{k}$ intervals (of length $2^{-k}$) of the form $\bigl((l-1)/2^{k}, \ l/2^{k}\bigr) , \qquad l=1,\ldots,2^{k},$ and that the value of $s_{2^{k}}(f)$ on each of these intervals was the mean-value of $f$ there. From this it is obvious that when $f\in L^{p}$, $1\leq p\leq\infty$, the $\{s_{2^{k}}(f)\}$ are bounded in $L^{p}$ norm; the analogue for Fourier series is definitely nonobvious when $1 < p < \infty$, and in fact false when $p=1$ or $p=\infty$.
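The boundedness in $L^{p}$ asserted above follows in one line from the averaging description. As a sketch, write $I$ for the dyadic intervals of length $2^{-k}$ and $f_{I}=|I|^{-1}\int_{I}f\,dx$ for the mean value of $f$ on $I$ (notation introduced here). By Jensen's (or Hölder's) inequality, for $1\leq p < \infty$,

$$\int_{0}^{1}|s_{2^{k}}(f)|^{p}\,dx=\sum_{I}|I|\,|f_{I}|^{p}\leq\sum_{I}\int_{I}|f|^{p}\,dx=\|f\|_{p}^{p},$$

and the case $p=\infty$ is immediate since an average never exceeds the supremum.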

We shall now describe the main device Paley used in his proof of Theorem 4. Paley was, from what one can learn about his life, a man of courage and almost reckless daring. A hint of that spirit can be found in his approach to difficult mathematical problems. When faced by the proof of an inequality like $$\int\Bigl(\sum|\Delta_{k}|^{2}\Bigr)^{p/2}\,dx\leq A_{p}^{p}\int|f|^{p}\,dx \label{eqnse}$$ where $p$ is e.g. an even integer $2r$, he instinctively sought to face the problem head-on by multiplying out the $r$ infinite sums, and then coming to grips directly with the resulting multitude of terms. This kind of audacious attack is not so common in our time when it is easier to rely on a variety of sophisticated gadgets which are household items for the working analyst. But given Paley’s resourcefulness this approach worked marvelously well. His key observation was that $$\sum_{i_{r}}\int\Delta_{i_{1}}^{2}\Delta_{i_{2}}^{2}\cdots\Delta_{i_{r}}^{2}\,dx \leq \int\Delta_{i_{1}}^{2}\cdots\Delta_{i_{r-1}}^{2}f^{2}\,dx \label{eqnei}$$ where the summation is taken over those $i_{r}$ for which $i_{r} > \max(i_{1},\ldots,i_{r-1})$, which in turn follows from the martingale property that $$\int g(x)\Delta_{k}(x)\,dx=0 \label{eqnni}$$ whenever $g$ is “measurable with respect to the past”. From \eqref{eqnei} Paley was able to achieve the proof of \eqref{eqnse} in a few strokes.
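To see how \eqref{eqnei} comes out of \eqref{eqnni}, here is a sketch under simplifying assumptions that are ours: $f$ is real-valued and a finite Walsh–Paley sum, so that all sums below are finite. Fix $i_{1},\ldots,i_{r-1}$, let $m=\max(i_{1},\ldots,i_{r-1})$, and put $G=\Delta_{i_{1}}^{2}\cdots\Delta_{i_{r-1}}^{2}$, which is constant on the dyadic intervals of length $2^{-m}$. For $m < i < j$ the function $G\Delta_{i}$ is measurable with respect to the past of $\Delta_{j}$, so the cross terms $\int G\Delta_{i}\Delta_{j}\,dx$ vanish, and since $\sum_{i > m}\Delta_{i}=f-s_{2^{m}}$,

$$\sum_{i_{r} > m}\int G\Delta_{i_{r}}^{2}\,dx=\int G\,(f-s_{2^{m}})^{2}\,dx=\int Gf^{2}\,dx-\int Gs_{2^{m}}^{2}\,dx\leq\int Gf^{2}\,dx.$$

The middle equality uses $\int Gs_{2^{m}}(f-s_{2^{m}})\,dx=0$, which is again an instance of \eqref{eqnni} applied to the function $Gs_{2^{m}}$.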

The same idea inspired Littlewood and Paley’s proof of Theorem 3, although the execution is more complicated; a more recondite form of \eqref{eqnei} must be proved, and here nothing as simple as \eqref{eqnni} holds. The appropriate substitute must be fashioned with care out of Green’s theorem in conjunction with the identity $\Delta(|\Phi|^{2})=4|\Phi^{\prime}|^{2} .$ With Theorem 3 proved, Littlewood and Paley were able to deduce Theorem 2, but here also the steps required were not easy. It was only after their theory was reexamined by Zygmund and his student Marcinkiewicz that a clearer and broader view of the whole subject began to emerge. To this we shall now turn.

#### Third period (1938–1945): Marcinkiewicz and Zygmund

There are two sig­ni­fic­ant events that marked the peri­od we are now con­cerned with. The first, which even pred­ated the Lit­tle­wood–Pa­ley col­lab­or­a­tion, was the in­tro­duc­tion by Lus­in in 1930 [e13] of his “area in­teg­ral”. The idea of Lus­in seems to have sparked no fur­ther in­terest un­til Mar­cinkiewicz and Zyg­mund took up the sub­ject again about 8 years later. There began a brief but very cre­at­ive peri­od of work by them — a flower­ing of the the­ory where con­nec­tions with a vari­ety of oth­er ideas were brought to light. The second event, a tra­gic one, fol­lowed soon there­after with the death of Mar­cinkiewicz in 1940, and it was left to Zyg­mund alone to re­solve some of the is­sues that their work had led them to.

It may help to cla­ri­fy the de­scrip­tion of the prin­cip­al ideas that Mar­cinkiewicz and Zyg­mund con­trib­uted to the study of square func­tions if we or­gan­ize our present­a­tion in terms of the four main lines along which their work pro­ceeded.

The first sub­ject we shall treat (and the only one that was, strictly speak­ing, joint work) deals with the area in­teg­ral of Lus­in. The defin­i­tion of this is as fol­lows. Sup­pose $\Phi(z)$ is holo­morph­ic in the unit disc and define $A(\Phi)(\theta)$ by $$(A(\Phi)(\theta))^{2}=\int_{\Gamma(\theta)}|\Phi^{\prime}(z)|^{2}\,dx\,dy \label{eqnonze}$$ with $\Gamma(\theta)$ a stand­ard “tri­angle” (nontan­gen­tial ap­proach re­gion) in the unit disc with ver­tex at $e^{i\theta}$. Ob­serve that the ex­pres­sion rep­res­ents the area of the im­age of $\Gamma(\theta)$ un­der the map­ping $z\rightarrow\Phi(z)$, with points coun­ted ac­cord­ing to their mul­ti­pli­city. Lus­in’s dis­cov­ery was that if $\Phi$ is bounded, then $A(\Phi)(\theta)$ is fi­nite for al­most any $\theta$; more gen­er­ally that $$\|A(\Phi)(\theta)\|_{2}\simeq\|\Phi\|_{2} \quad\text{if } \Phi(0)=0. \label{eqnonon}$$

Marcinkiewicz and Zygmund realized that on the one hand there was a close analogy between the Littlewood–Paley $g$-function and $A(\Phi)$ (in fact $A$ is a pointwise majorant of $g$, and the same kind of $L^{p}$ inequalities held for $A$ as for $g$); but on the other hand they surmised that the parallel between these two square functions should not be pushed too far. The main result they obtained for $A$ was a localized version of Lusin’s result. This can be stated as follows. Let $\Phi^{*}(\theta)=\sup_{z\in\Gamma(\theta)}|\Phi(z)| .$

The­or­em 5a:

If $\Phi$ is holo­morph­ic in the unit disc, then for al­most every $\theta$, $\Phi^{*}(\theta) < \infty$ im­plies $A(\Phi)(\theta) < \infty$.

The con­verse was proved five years later by Spen­cer,3 namely

The­or­em 5b:

If $\Phi$ is holo­morph­ic in the unit disc, then for al­most every $\theta$, $A(\Phi)(\theta) < \infty$ im­plies $\Phi^{*}(\theta) < \infty$.

A cor­res­pond­ing con­verse for $g$-func­tions is false, and so the area in­teg­ral $A$ has some spe­cial af­fin­it­ies with the bound­ary be­ha­vi­or of $\Phi$, go­ing bey­ond what it shares with $g$.

The second line of investigation was Zygmund’s reexamination of the Littlewood–Paley theorem for the dyadic decomposition of Fourier series. His analysis led him to recast and simplify the ideas of the proof. These simplifications had important consequences for later work, as we shall see; but their immediate interest was that they allowed him to connect the square function $\bigl(\sum|\Delta_{k}|^{2}\bigr)^{1/2}$ with the one he and Kaczmarz had considered a dozen years earlier in their study of summability of orthogonal series (see \eqref{eqnfi}). We suppose that we take the Fourier expansion and set $f(\theta)\sim\sum_{n\geq 0}a_{n}e^{in \theta} ,$ $f\in L^{p}$, so that $f\in H^{p}$. If we write as before $K(f)(\theta)=\biggl(\sum_{n\geq 1}n \bigl|\sigma_{n}(\theta)-\sigma_{n-1}(\theta)\bigr|^{2}\biggr)^{1/2}$ where $\sigma_{n}(\theta)=\sum_{0\leq k < n}\Bigl(1-\frac kn\Bigr)a_{k}e^{ik\theta} ,$ then we can state the following theorem:

The­or­em 6:

$\|K(f)\|_{p}\leq A_{p}\|f\|_{p},\ 1 < p < \infty$.4

The proof of this the­or­em re­quired two steps. First, like that of The­or­em 2, one needed the $L^{p}$ in­equal­it­ies for the $g$-func­tion (see \eqref{eqnsi}). Here the ma­jor sim­pli­fic­a­tion was made by Zyg­mund some years later5 and it came in the proof of the fact that $\|g(f)\|_{p}\leq A_{p}\|f\|_{p} ,$ when $p > 2$. (The case $p=2$ was easy, and the range $p < 2$ was re­du­cible to $p=2$ by the ar­ti­fice stand­ard in those days of us­ing Blasch­ke product de­com­pos­i­tions for $H^{p}$ func­tions.) For the dif­fi­cult case $p > 2$ a “square du­al­ity” was used. An in­geni­ous ar­gu­ment shows that whenev­er $\phi \geq 0$, $$\int g(f)^{2}\phi\,d\theta \leq c \biggl\{\int g(f)g(\phi)M(f)\,d\theta + \int|f|^{2}\phi\,d\theta\biggr\} \label{eqnontw}$$ where $M$ is the Hardy–Lit­tle­wood max­im­al func­tion. For $p\geq 4$, \eqref{eqnontw} then gives the de­sired res­ult as a con­sequence of the case $p\leq 2$ ap­plied to $g(\phi)$. In­cid­ent­ally, the no­tion of square du­al­ity which seems to have ori­gin­ated in this con­text con­tin­ues to find oth­er ap­plic­a­tions of in­terest.
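The exponent bookkeeping behind the restriction $p\geq 4$ can be made explicit. The following sketch assumes $\|g(f)\|_{p} < \infty$ a priori (for instance for trigonometric polynomials) and writes $q=p/(p-2)$ for the exponent conjugate to $p/2$; both the notation and the a priori assumption are ours. By duality $\|g(f)\|_{p}^{2}$ is the supremum of $\int g(f)^{2}\phi\,d\theta$ over $\phi\geq 0$ with $\|\phi\|_{q}\leq 1$, and since $\frac1p+\frac1q+\frac1p=1$, Hölder's inequality applied to \eqref{eqnontw} gives

$$\int g(f)^{2}\phi\,d\theta\leq c\,\bigl\{\|g(f)\|_{p}\,\|g(\phi)\|_{q}\,\|M(f)\|_{p}+\|f\|_{p}^{2}\,\|\phi\|_{q}\bigr\}.$$

Now $q\leq 2$ exactly when $p\geq 4$, so $\|g(\phi)\|_{q}\leq A_{q}\|\phi\|_{q}$ by the case already known. Together with the maximal theorem $\|M(f)\|_{p}\leq B_{p}\|f\|_{p}$ this yields $X^{2}\leq c\,(A_{q}B_{p}XY+Y^{2})$ with $X=\|g(f)\|_{p}$ and $Y=\|f\|_{p}$, and hence $X\leq A_{p}Y$.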

The second simplification Zygmund made was in the manner in which one could reduce the $L^{p}$ control of $\bigl(\sum|\Delta_{k}|^{2}\bigr)^{1/2}$ to that of the $g$-function; and in fact a whole list of other square functions (in particular, $\bigl(\sum n|\sigma_{n}-\sigma_{n-1}|^{2}\bigr)^{1/2}$) could be handled in the same way.6 This streamlining of the proof he found can be said to have led directly to the “Marcinkiewicz multiplier theorem”.

In its one-di­men­sion­al form the cel­eb­rated the­or­em that bears Mar­cinkiewicz’s name can be stated as fol­lows. Sup­pose we con­sider a trans­form­a­tion $T$ giv­en by a mul­ti­pli­er se­quence $\{\lambda_{n}\}_{-\infty}^{\infty}$, defined by $Tf\sim\sum\lambda_{n}a_{n}e^{in \theta} \quad\text{ whenever } f\sim\sum a_{n}e^{in \theta}.$ Then $T$ is bounded on $L^{p},\ 1 < p < \infty$, if (i) the se­quence $\{\lambda_{n}\}$ is bounded, and (ii) if it var­ies boundedly over each dy­ad­ic block; more pre­cisely, $\sum_{2^{k}\leq|j| < 2^{k+1}} |\lambda_{j}-\lambda_{j-1}|\leq M .$ (Note that the spe­cial case when the se­quence is con­stant on each dy­ad­ic block is an im­me­di­ate con­sequence of The­or­em 2.) In one di­men­sion the the­or­em’s greatest mer­it is, I be­lieve, in its for­mu­la­tion rather than its proof; the lat­ter is much the same as that of The­or­em 6.
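The parenthetical remark about sequences that are constant on dyadic blocks can be checked directly. As a sketch, suppose $\lambda_{n}=\gamma_{k}$ for all $n$ in the $k$-th dyadic block of Theorem 2, with $\sup_{k}|\gamma_{k}|\leq M$ (the symbol $\gamma_{k}$ is introduced here only for illustration). Then $\Delta_{k}(Tf)=\gamma_{k}\Delta_{k}(f)$, and applying Theorem 2 twice,

$$\|Tf\|_{p}\simeq\biggl\Vert\Bigl(\sum_{k}|\gamma_{k}\Delta_{k}(f)|^{2}\Bigr)^{1/2}\biggr\Vert_{p}\leq M\,\biggl\Vert\Bigl(\sum_{k}|\Delta_{k}(f)|^{2}\Bigr)^{1/2}\biggr\Vert_{p}\simeq M\,\|f\|_{p}, \qquad 1 < p < \infty.$$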

It is in the pas­sage to high­er di­men­sions, however, that one finds the great sig­ni­fic­ance of Mar­cinkiewicz’s work on mul­ti­pli­ers. Its im­port­ance was not only the fact that one could use hitherto one-di­men­sion­al meth­ods to prove $n$-di­men­sion­al res­ults; even more pro­found were the ap­plic­a­tions to oth­er ques­tions, such as es­tim­ates for par­tial dif­fer­en­tial equa­tions, already en­vis­aged at that time. We can now see in ret­ro­spect that Mar­cinkiewicz thus an­ti­cip­ated some of the ba­sic in­equal­it­ies later proved by the the­ory of sin­gu­lar in­teg­rals.7 For sim­pli­city of nota­tion we shall state the Mar­cinkiewicz mul­ti­pli­er the­or­em in the case of two di­men­sions. Con­sider the mul­ti­pli­er op­er­at­or $T$ giv­en by $Tf\sim\sum\lambda_{nm}a_{nm}e^{i(n\theta+m\phi)} \quad\text{for } f\sim\sum a_{nm}e^{i(n\theta+m\phi)} .$ Let $I_{k}$ de­note the dy­ad­ic in­ter­val $\{n\mid 2^{k-1}\leq|n| < 2^{k}\} \quad\text{and}\quad J_{l}=\{m\mid 2^{l-1}\leq|m| < 2^{l}\} .$ Write \begin{align*} & \Delta_{1}\lambda_{n,m}=\lambda_{n+1,m}-\lambda_{n,m},\\ & \Delta_{2}\lambda_{n,m}=\lambda_{n,m+1}-\lambda_{n,m}, \text{ and}\\ & \Delta_{1,2}=\Delta_{1}\cdot\Delta_{2}. \end{align*} Now as­sume the fi­nite­ness of the fol­low­ing four quant­it­ies:

1. $\sup_{n,m}|\lambda_{n,m}|$;

2. $\sup_{k,m}\sum_{n\in I_{k}}|\Delta_{1}\lambda_{n,m}|$, and $\sup_{n,l}\sum_{m\in J_{l}}|\Delta_{2}\lambda_{n,m}|$; and

3. $\sup_{k,l}\sum_{n\in I_{k}}\sum_{m\in J_{l}}|\Delta_{1}\Delta_{2}\lambda_{n,m}|$.

The­or­em 7:

Un­der the as­sump­tion made above, $T$ is bounded on $L^{p},\ 1 < p < \infty$.

The last of the four ma­jor lines of in­vest­ig­a­tion con­cern­ing square func­tions that Mar­cinkiewicz and Zyg­mund un­der­took dealt with the at­tempt to find a com­pletely “real-vari­able” ana­logue of the func­tions of Lus­in and Lit­tle­wood–Pa­ley. Start­ing with a func­tion $f$ on the circle, the area in­teg­ral and $g$-func­tions are defined in terms of holo­morph­ic (or har­mon­ic) func­tions whose bound­ary val­ues are re­lated to $f$. Also the dy­ad­ic square func­tion of The­or­em 2 re­quires the Four­i­er ex­pan­sion of $f$. What was de­sired was a vari­ant that could be defined more dir­ectly in terms of the ba­sic real-vari­able op­er­a­tions such as in­teg­ra­tion, dif­fer­en­ti­ation, etc.

After some ex­per­i­ment­a­tion Mar­cinkiewicz hit upon the idea of con­sid­er­ing $$\mu(F)(x)=\biggl(\int_{0}^{\pi}\bigl|F(x+t)+F(x-t)-2F(x)\bigr|^{2} \frac{dt}{t^{3}}\biggr)^{1/2} \label{eqnonth}$$ with $F(x)=\int^{x}f(t)\,dt.$

It was not dif­fi­cult to see that $\|\mu(F)\|_{L^{2}}\simeq\|f\|_{L^{2}} \quad\text{ if }\,\int_{0}^{2\pi}f(x)\,dx =0.$ With this, and us­ing the real-vari­able tools he had already de­veloped, he was able to prove the ana­logue of the the­or­em he and Zyg­mund had found for the area in­teg­ral (The­or­em 5a). The res­ult was as fol­lows.

The­or­em 8a:

Sup­pose $F\in L^{2}$. If $F^{\prime}(x)$ ex­ists in a set $E$, then $\mu(F)(x) < \infty$ for al­most every $x \in E$.
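The $L^{2}$ equivalence stated just before Theorem 8a is a computation with Parseval's formula. The following sketch rests on conventions that are ours: $f\sim\sum_{n\neq 0}a_{n}e^{inx}$, so that $F$ is periodic with $F\sim\text{const}+\sum_{n\neq 0}\frac{a_{n}}{in}e^{inx}$, and $\|f\|_{2}^{2}=2\pi\sum|a_{n}|^{2}$. The $n$-th Fourier coefficient of $F(x+t)+F(x-t)-2F(x)$ is $\frac{a_{n}}{in}(2\cos nt-2)=-\frac{4a_{n}}{in}\sin^{2}\frac{nt}{2}$, so after the change of variable $s=|n|t$,

$$\|\mu(F)\|_{2}^{2}=2\pi\sum_{n\neq 0}\frac{16|a_{n}|^{2}}{n^{2}}\int_{0}^{\pi}\sin^{4}\frac{nt}{2}\,\frac{dt}{t^{3}}=32\pi\sum_{n\neq 0}|a_{n}|^{2}\int_{0}^{|n|\pi}\sin^{4}\frac{s}{2}\,\frac{ds}{s^{3}}.$$

The last integral increases with $|n|$; it is bounded below by its (positive) value for $|n|=1$ and above by the integral over $(0,\infty)$, which is finite because the integrand is $O(s)$ near $0$ and $O(s^{-3})$ at infinity. Hence $\|\mu(F)\|_{2}^{2}\simeq\sum|a_{n}|^{2}\simeq\|f\|_{2}^{2}$.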

The ques­tions that arose were first, wheth­er some of the oth­er prop­er­ties of the area in­teg­ral or $g$-func­tion held as well for $\mu$; and, more in­ter­est­ingly, what was the real sig­ni­fic­ance of the Mar­cinkiewicz func­tion. Zyg­mund found an an­swer to the first ques­tion in 1944 [5] when he proved

The­or­em 8b:

For $1 < p < \infty$, $\|\mu(F)\|_{L^{p}}\simeq\|f\|_{L^{p}} \quad\text{ if }\, \int_{0}^{2\pi}f(x)\,dx=0.$

The argument he developed to show this was not an easy one. He was required to invoke the most arcane of the square functions, the function $g^{*}$, which Littlewood and Paley had also studied. He established the $L^{p}$ inequalities for it and showed that it actually was a majorant of the Marcinkiewicz function. Incidentally $g^{*}$ is defined by $(g^{*}(\Phi)(\theta))^{2}=\int_{0}^{1}\int_{0}^{2\pi}\bigl|\Phi^{\prime}(r e^{i(\theta+\phi)})\bigr|^{2}\bigg|\frac{1-r}{1-r e^{i\phi}}\bigg|^{2}\,d\phi\, dr,$ and so it also majorizes the area integral \eqref{eqnonze}, but it takes into account “the tangential” approach to the boundary.8 The problem that remained was to discover whether there was a converse to the local result given by Theorem 8a, or to put the question more broadly, to find the meaning of the Marcinkiewicz function. It was to be almost twenty more years before an answer to that question would be found.

#### Fourth period (1950–1964): Zygmund and his students

Start­ing about 1950 a new dir­ec­tion of con­sid­er­able im­port­ance began to emerge in force. Hin­ted at in earli­er work (of Be­sicov­itch and Mar­cinkiewicz, among oth­ers), its thrust was the de­vel­op­ment of “real-vari­able” meth­ods to re­place com­plex func­tion the­ory — that favored ally of one-di­men­sion­al Four­i­er ana­lys­is. What made this new em­phas­is par­tic­u­larly timely, in fact in­dis­pens­able, was that only with tech­niques com­ing from real-vari­able the­ory could one hope to come to grips with many in­ter­est­ing $n$-di­men­sion­al ana­logues of the one-di­men­sion­al the­ory.

The math­em­atician an­im­at­ing this de­vel­op­ment was Ant­oni Zyg­mund. In many ways he set the broad out­lines of the ef­fort, he mastered by his work some of the cru­cial dif­fi­culties, and was throughout the source of in­spir­a­tion for his stu­dents and col­lab­or­at­ors.

##### a: The area integral

A pioneering result in this new direction was Calderón’s extension to $\mathbf{R}^{n}$ of the theorem of Marcinkiewicz and Zygmund concerning the area integral, a subject he had taken up at the suggestion of Zygmund. The setting for this is as follows. We let $\mathbf{R}_{+}^{n+1}=\{(x, y), x=(x_{1},\ldots,x_{n})\in \mathbf{R}^{n}, y\in \mathbf{R}^{+}\}$ be the upper half-space, and suppose that $u(x, y)$ is harmonic (with respect to the $n+1$ variables $x_{1},\ldots,x_{n}, y$). Sometimes we shall assume that $u$ is in fact the Poisson integral of an appropriate function $f$ defined on $\mathbf{R}^{n}$, and then we shall write $u=\operatorname{PI}(f)$. We let $\Gamma=\{(x, y), |x| < y\}$ be a standard cone with vertex at the origin, $\Gamma^{\prime}$ its truncated version, $\Gamma^{\prime}=\Gamma\cap\{y < 1\}$. For any $\bar{x}\in \mathbf{R}^{n}$, $\Gamma(\bar{x})$ and $\Gamma^{\prime}(\bar{x})$ will be the corresponding cones with vertices at $\bar{x}$. The area integral of $u$ is defined by $$(A(u)(\bar{x}))^{2}=\int_{\Gamma(\bar{x})}|\nabla u|^{2}y^{1-n}\,dx\, dy \label{eqnonfo}$$ where $|\nabla u|^{2}=|\partial u/\partial y|^{2}+\sum_{j=1}^{n}|\partial u/\partial x_{j}|^{2}$.

Sim­il­arly for the loc­al the­ory one needs the ana­logue of \eqref{eqnonfo} where $\Gamma(\bar{x})$ is re­placed by $\Gamma^{\prime}(\bar{x})$; this defines $A_{\mathrm{loc}}(u)(\bar{x})$. The max­im­al func­tion $u^{*}$ is defined by $u^{*}(\bar{x})=\sup_{(x,y)\in\Gamma(\bar{x})}|u(x, y)| ,$ and its loc­al ana­logue $u_{\mathrm{loc}}^{*}$ is giv­en by re­pla­cing $\Gamma(\bar{x})$ by $\Gamma^{\prime}(\bar{x})$ in the defin­i­tion.

The­or­em 9a:

Sup­pose $u$ is har­mon­ic in $\mathbf{R}_{+}^{n+1}$. Then $A_{\mathrm{loc}}u(\bar{x}) < \infty$ at al­most every point $\bar{x}\in \mathbf{R}^{n}$ where $u_{\mathrm{loc}}^{*}(\bar{x}) < \infty$.

Calderón’s proof of this the­or­em was pub­lished at the same time (1950) as an­oth­er im­port­ant res­ult he found, namely the ex­ten­sion of Privalov’s the­or­em: $u$ has a nontan­gen­tial lim­it at al­most every $\bar{x}\in \mathbf{R}^{n}$, where $u_{\mathrm{loc}}^{*}(\bar{x}) < \infty$. We shall dis­cuss the ideas be­hind the proof of The­or­em 9a later when we take up its con­verse. Now we turn to the “glob­al” ver­sion, i.e., the high­er-di­men­sion­al ana­logue of the Lit­tle­wood–Pa­ley the­or­em (The­or­em 3).

The­or­em 9b:

Sup­pose $u= \operatorname{PI} (f)$, then $\|A(u)\|_{L^{p}}\simeq\|f\|_{L^{p}},\quad 1 < p < \infty.$

It would be dif­fi­cult after 25 years to re­call the pre­cise thoughts that mo­tiv­ated the proof of The­or­em 9b, nor would it be easy now for one to ap­pre­ci­ate the dif­fi­culties that seemed then to stand in the way. But I do re­mem­ber that those of us who were gradu­ate stu­dents of Zyg­mund in the middle 1950’s were shaped by the event, akin to the Cre­ation, which ap­peared to some of us to be the be­gin­ning of everything im­port­ant: the 1952 Acta pa­per which de­veloped via the Calderón–Zyg­mund lemma, the real vari­able meth­ods giv­ing the ex­ten­sion of the Hil­bert trans­form to $n$-di­men­sions. What was more nat­ur­al, there­fore, than to at­tempt to prove the $L^{p}$ bounded­ness of $f\rightarrow A(u)$ by ad­apt­ing these meth­ods? This idea in­deed worked, al­though the ini­tial com­plic­ated proofs were later much sim­pli­fied. The ana­lys­is suc­ceeded as well for the Mar­cinkiewicz func­tion \eqref{eqnonth}, and proved also that the map­pings $f\rightarrow A(u)$ and $f\rightarrow\mu(F)$ were of weak-type (1, 1).

We turn now to the proof of The­or­em 9a. Its one-di­men­sion­al ver­sion (The­or­em 5a) had been done by us­ing com­plex func­tion the­ory, in par­tic­u­lar con­form­al map­pings. So a com­pletely dif­fer­ent ap­proach was needed. The idea be­hind it can be un­der­stood by ex­amin­ing the case $p=2$ of The­or­em 9b, which has an easy proof. A dir­ect cal­cu­la­tion shows that $$\int_{\mathbf{R}^{n}}A^{2}(u)\,dx=c\int_{\mathbf{R}_{+}^{n+1}}y|\nabla u|^{2}\,dx\,dy, \label{eqnonfi}$$ where $c$ is the volume of the unit ball. Next we can use the fact that $|\nabla u|^{2}=\tfrac{1}{2}\Delta(|u|^{2}) ,$ and so by Green’s the­or­em \begin{align*} \int_{\mathbf{R}^{n}}A^{2}(u)\,dx & =\frac{c}{2}\iint_{\mathbf{R}_{+}^{n+1}}y\Delta(|u|^{2})\,dx\,dy\\ & =\frac{c}{2}\int|u(x,0)|^{2}\,dx, \end{align*} which proves The­or­em 9b for $p=2$, since $u(x, 0)=f(x)$. Thus in or­der to con­trol $A_{\mathrm{loc}}(u)(x)$ on a set $E$, it is nat­ur­al to con­sider $\int_{E}A_{\mathrm{loc}}^{2}(u)(x)\,dx$ which in turn is dom­in­ated by $c \int_{R(E)} y|\nabla u|^{2}\,dx\,dy ,$ where $R(E)$ is a stand­ard “saw­tooth” re­gion in $\mathbf{R}_{+}^{n+1}$ based on $E$. At this stage (which is the turn­ing point of the proof) Calderón in­voked Green’s the­or­em for an­oth­er re­gion con­tain­ing $R(E)$, whose Green’s func­tion he could es­sen­tially bound from be­low by $c^{\prime}y$.
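The “direct calculation” behind \eqref{eqnonfi} is an application of Fubini’s theorem. As a sketch, write $v_{n}$ for the volume of the unit ball in $\mathbf{R}^{n}$ (a notation introduced here). The point $(x,y)$ belongs to $\Gamma(\bar{x})$ exactly when $|x-\bar{x}| < y$, which describes a ball of volume $v_{n}y^{n}$ in the variable $\bar{x}$, so

$$\int_{\mathbf{R}^{n}}A^{2}(u)(\bar{x})\,d\bar{x}=\iint_{\mathbf{R}_{+}^{n+1}}|\nabla u|^{2}\,y^{1-n}\,\bigl|\{\bar{x} : |\bar{x}-x| < y\}\bigr|\,dx\,dy=v_{n}\iint_{\mathbf{R}_{+}^{n+1}}y\,|\nabla u|^{2}\,dx\,dy.$$

The identity $|\nabla u|^{2}=\tfrac12\Delta(u^{2})$, for real-valued harmonic $u$, follows from $\Delta(u^{2})=2u\,\Delta u+2|\nabla u|^{2}$ and $\Delta u=0$.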

To prove the con­verse of The­or­em 9a along these lines ap­peared to re­quire, among oth­er things, ap­pro­pri­ate bounds from above for Green’s func­tion for such re­gions, and that seemed much bey­ond what could be done then.9 What turned out to be the right course of ac­tion was to fin­esse the prob­lem of Green’s func­tion and to pro­ceed dir­ectly with es­tim­ates that fol­lowed from the fi­nite­ness of $\int_{R(E)}y|\nabla u|^{2}\,dx\,dy .$ These ar­gu­ments also proved to be use­ful in oth­er situ­ations, as we shall see later. The res­ult ob­tained was

The­or­em 9c:

Sup­pose $u$ is har­mon­ic in $\mathbf{R}_{+}^{n+1}$. Then $u_{\mathrm{loc}}^{*}(\bar{x}) < \infty$ for al­most all points $\bar{x}\in \mathbf{R}^{n}$ where $A_{\mathrm{loc}}(u)(\bar{x}) < \infty$.

I re­mem­ber quite vividly the ex­cite­ment sur­round­ing the events at the time of this work. It was March 1959, and I had re­turned to the Uni­versity of Chica­go the fall be­fore. Fre­quently I met with my friends Guido Weiss and Mary Weiss, and to­geth­er we of­ten found ourselves in Zyg­mund’s of­fice (Eck­hart 309, two doors from mine). With our teach­er our con­ver­sa­tions ranged over a wide vari­ety of top­ics (not all math­em­at­ic­al) and more than once the sub­ject of square func­tions arose. When this happened the mood would change, if only slightly, as if in de­fer­ence to their spe­cial status, and the en­igma that sur­roun­ded them. I had an idea which seemed prom­ising. But be­fore we could see where it might lead came the spring break. Fur­ther work would have to be held in abey­ance since we were each go­ing our own ways: Zyg­mund trav­elled to Bo­ston to vis­it Calderón; Guido and Mary Weiss, hav­ing bor­rowed my Chev­ro­let, drove to Vir­gin­ia for a va­ca­tion trip; and I went to New York to be mar­ried.

##### b: The Marcinkiewicz function

In­flu­enced by the re­newed in­terest in area in­teg­rals, and en­cour­aged by some re­cent work he had done with Mary Weiss,10 Zyg­mund re­turned to the study of the Mar­cinkiewicz in­teg­ral \eqref{eqnonth} and the prob­lem of find­ing a con­verse to The­or­em 8a. He was con­vinced that now (more than 20 years after Mar­cinkiewicz’s ori­gin­al work) the time was ripe to see mat­ters to a con­clu­sion. He sug­ges­ted to me that we work on the prob­lem to­geth­er, and of course I was very happy to ac­cept his of­fer. For me this was a unique and re­ward­ing col­lab­or­a­tion — not just be­cause of the spe­cial sat­is­fac­tion one de­rives when ac­cep­ted as an equal by one’s teach­er — but also be­cause as it turned out he did most of the work that really coun­ted!

We real­ized first that The­or­em 8a it­self could be some­what strengthened; what was re­quired was the no­tion of the de­riv­at­ive $F^{\prime}(x)$ ex­ist­ing (at $x$) “in the $L^{2}$ sense”. Thus $F^{\prime}(x)$ ex­is­ted in this gen­er­al­ized sense if11 $$\frac{1}{h}\int_{0}^{h}\bigg|\frac{F(x+t)-F(x)}{t}-F^{\prime}(x)\bigg|^{2}\,dt\rightarrow 0, \quad\text{ as }\, h\rightarrow 0. \label{eqnonsi}$$

The finer ver­sion of The­or­em 8a was then: If $F\in L^{2}$ had a de­riv­at­ive in the sense of \eqref{eqnonsi} at each $x \in E$, then $\mu(F)(x) < \infty$ for al­most every $x\in E$. It was in this form that one might seek a con­verse. The ba­sic plan was to try to make mat­ters turn on the ana­log­ous situ­ation which held for the area in­teg­ral, where one can pass from the fi­nite­ness of a quad­rat­ic ex­pres­sion to the ex­ist­ence of a lim­it. After a series of re­duc­tions we were able to show that at each point $x$ where $\mu(F)(x) < \infty$ one had $$\int_{|t|\leq y}\bigg|\frac{\partial^{2}u}{\partial y^{2}}(x+t, y)+\frac{\partial^{2}u}{\partial y^{2}}(x-t, y)\bigg|^{2}\,dt\,dy < \infty \label{eqnonse}$$ with $u=\operatorname{PI}(F)$. On the oth­er hand we could show (us­ing The­or­em 5b) that at al­most every $x$ where $$\int_{|t|\leq y}\bigg|\frac{\partial^{2}u}{\partial y^{2}}(x+t, y)\bigg|^{2}\,dt\,dy < \infty \label{eqnonei}$$ the con­clu­sion \eqref{eqnonsi} ac­tu­ally held.

The ba­sic dif­fi­culty, the pas­sage from \eqref{eqnonse} to \eqref{eqnonei}, was over­come by Zyg­mund us­ing a clev­er “desym­met­riz­a­tion” ar­gu­ment; sev­er­al weeks later he presen­ted me with an es­sen­tially fi­nal draft of the pa­per which he had typed him­self!

There were sev­er­al vari­ants of the fi­nal res­ult — in­volving ex­ten­sions to $n$-di­men­sions, or high­er de­riv­at­ives, or even frac­tion­al de­riv­at­ives. The simplest ver­sion, however, was the fol­low­ing:

The­or­em 10:

Let $F\in L^{2}(0,2\pi)$. Then the set of points $x$ where $\int_{0}^{\pi}\bigl|F(x+t)+F(x-t)-2F(x)\bigr|^{2}\,dt/t^{3} < \infty$, and the set of points where $F^{\prime}(x)$ exists in the $L^{2}$ sense (i.e., \eqref{eqnonsi}) differ by a set of measure zero.
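
A pointwise computation shows why the weight $t^{-3}$ is matched to first-order differentiability. It is offered only as a heuristic, since Theorem 10 is a statement up to sets of measure zero, and the two functions used are illustrative choices made here, not examples from the original paper. If $F$ is twice continuously differentiable near $x$, then $|F(x+t)+F(x-t)-2F(x)|\leq Ct^{2}$ for small $t$; if instead $F(t)=|t|^{1/2}$ near $x=0$, the second difference at $0$ equals $2t^{1/2}$. For small $\delta > 0$,

$$\int_{0}^{\delta}\frac{(Ct^{2})^{2}}{t^{3}}\,dt=\frac{C^{2}\delta^{2}}{2} < \infty, \qquad \int_{0}^{\delta}\frac{(2t^{1/2})^{2}}{t^{3}}\,dt=4\int_{0}^{\delta}\frac{dt}{t^{2}}=\infty.$$

In the second case the difference quotient $(F(t)-F(0))/t=t^{-1/2}$ satisfies $\int_{0}^{h}|t^{-1/2}-a|^{2}\,dt=\infty$ for every constant $a$, so \eqref{eqnonsi} also fails at $x=0$, in agreement with the theorem.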

#### Fifth period (1966–present): Further applications of square functions

We have traced the de­vel­op­ment of square func­tions from their be­gin­nings to a stage where their nature was much bet­ter un­der­stood, in terms of a series of deep the­or­ems that had been ob­tained. Yet it is only more re­cently that their cent­ral role in sev­er­al fields of ana­lys­is has be­come more ap­par­ent. I shall try to de­scribe this very briefly in terms of three spe­cif­ic areas: $H^{p}$ spaces, sym­met­ric dif­fu­sion semig­roups, and dif­fer­en­ti­ation the­ory in $\mathbf{R}^{n}$.

##### a: $H^{p}$ theory

Be­gin­ning in about 1966 two sep­ar­ate dir­ec­tions of re­search in­volving square func­tions were un­der­taken, and when brought to­geth­er these ul­ti­mately led to a rich har­vest in the the­ory of $H^{p}$ spaces. The first star­ted with Burk­hold­er’s [e31] ex­ten­sion of Pa­ley’s the­or­em (The­or­em 4 for Walsh–Pa­ley series) to gen­er­al mar­tin­gales. He ob­served that Pa­ley’s ar­gu­ment ex­ten­ded to this gen­er­al set­ting, but also found his own ap­proach which was very dif­fer­ent. He showed that if $E_{k}=E(\,\cdot\,\mid\mathcal{F}_{k})$ are the con­di­tion­al ex­pect­a­tions for an in­creas­ing se­quence of $\sigma$-fields $\{\mathcal{F}_{k}\}_{k=0}^{\infty}$, then with $E_{-1}(f)\equiv 0$, $$\biggl\Vert\Bigl(\sum_{k=0}^{\infty}\bigl|(E_{k}-E_{k-1})(f)\bigr|^{2}\Bigr)^{1/2}\biggr\Vert_p\simeq\lim_{k\rightarrow\infty}\|E_{k}(f)\|_{p}, \quad 1 < p < \infty. \label{eqnonni}$$
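
For $p=2$ the relation \eqref{eqnonni} is an identity that follows from orthogonality alone. The computation below is a sketch for real-valued $f\in L^{2}$; the notation $d_{k}=(E_{k}-E_{k-1})(f)$ is introduced here for convenience. It uses only that $E_{j}E_{k}=E_{j}$ for $j\leq k$ and that each $E_{j}$ is self-adjoint on $L^{2}$.

$$\begin{aligned} \int d_{j}\,d_{k} &=\int d_{j}\,E_{j}(d_{k})=\int d_{j}\,\bigl(E_{j}E_{k}-E_{j}E_{k-1}\bigr)(f)=0, \qquad j < k,\\ \|E_{N}(f)\|_{2}^{2} &=\Bigl\|\sum_{k=0}^{N}d_{k}\Bigr\|_{2}^{2}=\sum_{k=0}^{N}\|d_{k}\|_{2}^{2}=\int\sum_{k=0}^{N}|d_{k}|^{2},\\ \biggl\|\Bigl(\sum_{k=0}^{\infty}|d_{k}|^{2}\Bigr)^{1/2}\biggr\|_{2} &=\lim_{N\rightarrow\infty}\|E_{N}(f)\|_{2}. \end{aligned}$$

The substance of \eqref{eqnonni} is therefore the passage from $p=2$ to the full range $1 < p < \infty$, where no such orthogonality is available.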

Next, in work with Gundy, and later also with Sil­ver­stein, the fol­low­ing ad­vances were made:12 It was shown that \eqref{eqnonni} ex­ten­ded to $p\leq 1$ if $\lim_{k\rightarrow\infty}\|E_{k}(f)\|_{p}$ was re­placed with $\|\sup_{k}E_{k}(f)\|_{p}$, for a large class of mar­tin­gales. This class in­cid­ent­ally in­cludes those oc­cur­ring for the Walsh–Pa­ley series, but more im­port­antly these res­ults went over to the (con­tinu­ous para­met­er) mar­tin­gales arising from Browni­an mo­tion ap­plied to har­mon­ic func­tions. To be more pre­cise, let $z_{t}(\omega)$ de­note the stand­ard Browni­an mo­tion in the com­plex $z$-plane, start­ing at the ori­gin and stopped when reach­ing the unit circle. Here $0\leq t < \infty$ is the time para­met­er, and $\omega$ la­bels the Browni­an path, with $\omega \in\Omega$, $\Omega$ be­ing the prob­ab­il­ity space. If $u$ is har­mon­ic in the unit disc, $t\rightarrow u(z_{t}(\omega))$ is a con­tinu­ous-time mar­tin­gale. Let $M_{B}(u)(\omega)=\sup_{0\leq t < \infty} |u(z_{t}(\omega))|$ be the Browni­an max­im­al func­tion, and $S(u)(\omega)$ the mar­tin­gale square func­tion, $S(u)(\omega)=\biggl(\int_{0}^{\infty}|\nabla u(z_{t}(\omega))|^{2}\,dt\biggr)^{1/2} .$ Their res­ult then was that $$\|Su\|_{L^{p}(\Omega)}\simeq\|M_{B}(u)\|_{L^{p}(\Omega)},\quad 0 < p < \infty, \label{eqntwze}$$ whenev­er $u(0)=0$.
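
For $p=2$ the equivalence \eqref{eqntwze} can be checked directly. The following is a sketch under assumptions added here for illustration: $u$ is real-valued and harmonic in a neighborhood of the closed disc, each coordinate of $z_{t}$ is a standard one-dimensional Brownian motion, $\tau$ denotes the exit time from the disc, and the integral defining $S(u)$ is read as running up to time $\tau$. Since $u$ is harmonic, Itô's formula has no drift term, and the Itô isometry together with Doob's $L^{2}$ maximal inequality gives

$$\begin{aligned} u(z_{t\wedge\tau}) &=\int_{0}^{t\wedge\tau}\nabla u(z_{s})\cdot dz_{s} \qquad (u(0)=0),\\ \mathbf{E}\,S(u)^{2} &=\mathbf{E}\int_{0}^{\tau}|\nabla u(z_{s})|^{2}\,ds=\mathbf{E}\,|u(z_{\tau})|^{2},\\ \mathbf{E}\,|u(z_{\tau})|^{2} &\leq\mathbf{E}\,M_{B}(u)^{2}\leq 4\,\mathbf{E}\,|u(z_{\tau})|^{2}. \end{aligned}$$

Hence $\|Su\|_{L^{2}(\Omega)}\leq\|M_{B}(u)\|_{L^{2}(\Omega)}\leq 2\|Su\|_{L^{2}(\Omega)}$. The depth of \eqref{eqntwze} lies in the remaining exponents, in particular $p\leq 1$.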

The most strik­ing ap­plic­a­tion of this circle of ideas was a con­clu­sion drawn from \eqref{eqntwze}, to wit, whenev­er $F=u+iv$ is holo­morph­ic in the unit disc, then $F\in H^{p}$ if and only if $u^{*}\in L^{p},\ 0 < p < \infty$.

The second line of research began when a more direct connection between standard multiplier operators and square functions was discovered. The result was easy to state. Whenever $T$ is a multiplier operator of the Marcinkiewicz type on $\mathbf{R}^{n}$ (more precisely one that satisfies the kind of conditions put in Hörmander’s version of that multiplier theorem), then the area integral corresponding to $T(f)$ is pointwise dominated by a $g^{*}$ function of $f$, i.e., $$A(T\mkern-4mu f)(x)\leq cg_{\lambda}^{*}(f)(x), \label{eqntwon}$$ where $g_{\lambda}^{*}(f)(x)=\biggl(\int\bigl|\nabla u(x-t, y)\bigr|^{2}\Bigl(\frac{y}{y+|t|}\Bigr)^{n\lambda}y^{1-n}\,dy\,dt\biggr)^{1/2},$ and $\lambda$ is a parameter which depends on the nature of the multiplier. An $H^{p}$ theory in $\mathbf{R}^{n}$ had already been initiated several years before (by the efforts of G. Weiss and others), and using it and \eqref{eqntwon} it followed that these multipliers also extended to bounded operators on $H^{p}$.

From these con­sid­er­a­tions it might be guessed that a ba­sic tool for $H^{p}$ the­ory is the re­la­tion between square func­tions and max­im­al prop­er­ties of (har­mon­ic) func­tions. Here im­port­ant con­tri­bu­tions were made by C. Fef­fer­man. One of the res­ults ob­tained in this dir­ec­tion was the fol­low­ing the­or­em:

The­or­em 11:

Sup­pose that $u$ is har­mon­ic in $\mathbf{R}_{+}^{n+1}$, and $u(x, y)\rightarrow 0$ as $y\rightarrow\infty$. Then [e40] $\|A(u)\|_{p}\simeq\|u^{*}\|_{p}, \quad 0 < p < \infty.$

In­cid­ent­ally it should be re­marked that the proof used the same ap­proach as its “loc­al” ana­logue, The­or­em 9c, but ad­di­tion­al ar­gu­ments of a quant­it­at­ive nature were of course needed. More re­cently some of these res­ults for square func­tions have been ex­ten­ded to product do­mains, and in this con­text gen­er­al­iz­a­tions of The­or­ems 9 and 11 have been found.13

##### b: Symmetric diffusion semigroups

The semigroups which are the subject of the title are families of operators $\{T^{t}\}_{t\geq 0}$, each bounded and self-adjoint on $L^{2}$, with $T^{t}$ having norm $\leq 1$ on every $L^{p}$, $1\leq p\leq\infty$, and $T^{t_{1}+t_{2}}=T^{t_{1}}T^{t_{2}}$, with $\lim_{t\rightarrow 0}T^{t}f=f$ for $f\in L^{2}$. Sometimes the additional hypotheses are made that $T^{t}(1)=1$, and that $T^{t}$ is positivity-preserving.

The significance of this notion derives from the many important examples of such semigroups in analysis, and the many rich properties that they share. In fact some of the basic results discussed above have versions valid in this context. Here we mention two, a maximal theorem, and a multiplier theorem in the spirit of Marcinkiewicz’s theorem (Theorem 7).

The­or­em 11a:

$\bigl\|\sup_{t > 0}|T^{t}f|\bigr\|_{p}\leq A_{p}\|f\|_{p},\ 1 < p\leq\infty$.

To for­mu­late the mul­ti­pli­er the­or­em we write $T^{t}$ in terms of its spec­tral de­com­pos­i­tion, $T^{t}\mkern-2mu=\mkern-2mu\int_0^{\infty} e^{-\lambda t}\,dE(\lambda) ,$ where $E(\lambda)$ is a spec­tral res­ol­u­tion on $L^{2}$. For each bounded Borel meas­ur­able func­tion $m$ on $(0, \infty)$, con­sider the “mul­ti­pli­er” op­er­at­or $T_{m}$ giv­en by $T_{m}=\int_0^\infty m(\lambda)\,dE(\lambda) .$ Here we as­sume that $m$ is of the form $m(\lambda)=\lambda\int_{0}^{\infty}M(s)e^{-\lambda s}\,ds ,$ with $M$ a bounded func­tion.

The­or­em 11b:

$\|T_{m}(f)\|_{p}\leq A_{p}\|f\|_{p},\ 1 < p < \infty$.
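
One family of multipliers of the required form, given here as an illustration rather than as part of the statement above, is the imaginary powers $m(\lambda)=\lambda^{i\gamma}$ with $\gamma$ real. Taking $M(s)=s^{-i\gamma}/\Gamma(1-i\gamma)$, which is bounded because $|s^{-i\gamma}|=1$, the Laplace transform formula $\int_{0}^{\infty}s^{a-1}e^{-\lambda s}\,ds=\Gamma(a)\lambda^{-a}$ with $a=1-i\gamma$ gives

$$\lambda\int_{0}^{\infty}M(s)e^{-\lambda s}\,ds =\frac{\lambda}{\Gamma(1-i\gamma)}\,\Gamma(1-i\gamma)\,\lambda^{i\gamma-1} =\lambda^{i\gamma}, \qquad \lambda > 0.$$

If one writes $T^{t}=e^{-tL}$ (the symbol $L$ for the generator is notation assumed here), then $T_{m}$ is the operator $L^{i\gamma}$, and Theorem 11b asserts its boundedness on $L^{p}$, $1 < p < \infty$.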

Key tools in the proof of both these theorems are the Littlewood–Paley type functions $$g_{k}(f)(x)=\biggl(\int_{0}^{\infty}t^{2k-1}\Bigl|\frac{\partial^{k}}{\partial t^{k}}T^{t}(f)(x)\Bigr|^{2}\,dt\biggr)^{1/2}, \quad k=1,2,\ldots.$$ Also for $T_{m}$ a relation of the same kind as \eqref{eqntwon} holds.14
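
On $L^{2}$ the size of $g_{k}(f)$ can be computed exactly from the spectral decomposition. The following sketch assumes that the interchange of integrals is justified and writes $E_{0}$ for the projection onto the functions left fixed by every $T^{t}$ (the spectral subspace at $\lambda=0$); this notation is introduced here. Since $\frac{\partial^{k}}{\partial t^{k}}T^{t}=\int_{0}^{\infty}(-\lambda)^{k}e^{-\lambda t}\,dE(\lambda)$,

$$\begin{aligned} \|g_{k}(f)\|_{2}^{2} &=\int_{0}^{\infty}t^{2k-1}\Bigl\|\frac{\partial^{k}}{\partial t^{k}}T^{t}f\Bigr\|_{2}^{2}\,dt =\int_{\lambda > 0}\biggl(\int_{0}^{\infty}t^{2k-1}\lambda^{2k}e^{-2\lambda t}\,dt\biggr)\,d\bigl(E(\lambda)f,f\bigr)\\ &=\frac{\Gamma(2k)}{4^{k}}\int_{\lambda > 0}d\bigl(E(\lambda)f,f\bigr) =\frac{\Gamma(2k)}{4^{k}}\,\|f-E_{0}f\|_{2}^{2}. \end{aligned}$$

For $k=1$ this reads $\|g_{1}(f)\|_{2}=\tfrac{1}{2}\|f-E_{0}f\|_{2}$, the semigroup analogue of the $p=2$ identity for the classical Littlewood–Paley function.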

##### c: Differentiation theorems in $\mathbf{R}^{n}$

Probably the most dramatic applications of square functions occur in differentiation theory. The general problem here is to prove that $$\lim_{\operatorname{diam} R\rightarrow 0}\frac{1}{\mu(R)}\int_{R}f(x-y)\,d\mu(y)=f(x)\quad \text{ a.e.} \label{eqntwtw}$$ where $R$ ranges over a suitable collection $\mathcal{R}$ of sets “centered” at the origin. The classical examples of these are (i) where $\mathcal{R}$ is the collection of all balls (or cubes) containing the origin, and (ii) where $\mathcal{R}$ is the collection of all rectangles containing the origin, with sides parallel to the axes. For each of these results a Vitali-type covering theorem has played a decisive role. Thus it may seem surprising that the alien notion of square functions would turn out to be the appropriate idea in related situations, where covering arguments were unavailing. In formulating the results obtained this way we shall, as is usual, deal with the corresponding maximal function $M_{\mathcal{R}}(f)(x)=\sup_{R\in\mathcal{R}}\frac{1}{\mu(R)}\bigg|\int_{R}f(x-y)\,d\mu(y)\bigg|,$ and the possibility of asserting inequalities of the type $$\|M_{\mathcal{R}}(f)\|_{p}\leq A_{p}\|f\|_{p}. \label{eqntwth}$$

The­or­em 12:

The in­equal­ity \eqref{eqntwth} holds in the fol­low­ing cases:

1. $\mathcal{R}$ is the col­lec­tion of spheres centered at the ori­gin; $d\mu$ is the uni­form sur­face meas­ure; and $n\geq 3$, with $p > n/(n-1)$.

2. $\mathcal{R}$ is the col­lec­tion of ini­tial seg­ments $\{\gamma(t), 0\leq t\leq h\}$ of a smooth curve $t\rightarrow\gamma(t)$, with $\gamma(0)=0$, and $\gamma$ hav­ing nonzero “curvature” at the ori­gin; here $d\mu$ is arc-length, $n\geq 1$ and $p > 1$.

3. $\mathcal{R}$ is the col­lec­tion of rect­angles (in $\mathbf{R}^{2}$) con­tain­ing the ori­gin, which make an angle $\theta_{k}$ with a fixed dir­ec­tion, where $\{\theta_{k}\}$ is a se­quence of num­bers tend­ing rap­idly to zero; here $p > 1$.

The proof of each part of this the­or­em re­quires its own square func­tion. We shall not de­scribe these rather com­plic­ated quad­rat­ic func­tions here, but refer the read­er to the lit­er­at­ure for fur­ther de­tails.15
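
The restriction $p > n/(n-1)$ in the first part cannot be relaxed, as a standard example shows. The example is supplied here as an illustration and is not part of the discussion above. Let $f(x)=|x|^{1-n}\bigl(\log(1/|x|)\bigr)^{-1}$ for $|x|\leq\tfrac{1}{2}$ and $f(x)=0$ otherwise. For $p=n/(n-1)$ one has $(1-n)p=-n$, and in polar coordinates

$$\int_{\mathbf{R}^{n}}|f|^{p}\,dx =c\int_{0}^{1/2}r^{-n}\Bigl(\log\frac{1}{r}\Bigr)^{-p}r^{n-1}\,dr =c\int_{0}^{1/2}\frac{dr}{r\,(\log(1/r))^{p}} < \infty,$$

since $p > 1$. On the other hand, for $x\neq 0$ the sphere of radius $|x|$ centered at $x$ passes through the origin. Near the origin its surface measure is comparable to $\rho^{n-2}\,d\rho$, where $\rho$ is the distance to the origin, so the corresponding spherical mean of $f$ is bounded below by a multiple of

$$\int_{0}^{\delta}\rho^{1-n}\Bigl(\log\frac{1}{\rho}\Bigr)^{-1}\rho^{n-2}\,d\rho =\int_{0}^{\delta}\frac{d\rho}{\rho\,\log(1/\rho)}=\infty.$$

Thus $M_{\mathcal{R}}(f)(x)=\infty$ for every $x\neq 0$, and \eqref{eqntwth} fails for $p=n/(n-1)$.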

#### Epilogue

Since the ori­gin­al draft of this es­say was writ­ten two new res­ults were found which use square func­tions in a de­cis­ive way.

The first is the solu­tion of the prob­lem of Cauchy’s in­teg­ral for Lipschitz curves by Coi­f­man, McIn­tosh, and Mey­er [e50]. It is to be noted that in Calderón’s ini­tial work on this prob­lem (1965), square func­tions were already used in a cru­cial way. In par­tic­u­lar the in­equal­ity $c\|F\|_{H^{p}}\leq\|A(F)\|_{p}, \quad p\leq 1 ,$ was proved there for this pur­pose.

The second result deals with the standard maximal function in $\mathbf{R}^{n}$, $$M_{n}(f)(x)=\sup_{r > 0}\frac{1}{c_{n}r^{n}}\bigg|\int_{|y|\leq r}f(x-y)\,dy\bigg|,$$ where $c_{n}$ is the volume of the unit ball in $\mathbf{R}^{n}$.

The ques­tion that arises is, how does the $L^{p}$ norm of $M_{n}$ be­have for large $n$? The best that can be proved by the usu­al Vi­tali cov­er­ing ar­gu­ments gives $\|M_{n}(f)\|_{p}\leq A(p, n)\|f\|_{p}, \quad 1 < p ,$ with $A(p, n)\leq A(p)\,2^{n/p}$, which is a large growth as $n\rightarrow\infty$. However much more can be said.

The­or­em 13:

$\|M_{n}(f)\|_{p}\leq A_{p}\|f\|_{p},\ 1 < p\leq\infty$, with $A_{p}$ in­de­pend­ent of $n$.

The idea of the proof is to con­sider in $\mathbf{R}^{m}$ the max­im­al func­tions $M_{m,k}$ defined by $M_{m,k}(f)(x)=\sup_{r > 0}\frac{\big|\int_{|y|\leq r}f(x-y)|y|^{k}\,dy\big|}{\int_{|y|\leq r}|y|^{k}\,dy},\quad k\geq 0.$ Then if $m$ is so large that $p > m/(m-1)$,

$$\|M_{m,k}(f)\|_{p}\leq A_{p,m}\|f\|_{p} \label{eqntwfo}$$

with $A_{p,m}$ independent of $k$, $k\geq 0$. This follows from Theorem 12, Part 1. From this, Theorem 13 is obtained by lifting the $m$-dimensional result \eqref{eqntwfo} into $\mathbf{R}^{n}$, where $n\geq m$ (and $k=n-m$), by integrating over the Grassmannian of $m$-planes in $\mathbf{R}^{n}$ through the origin.
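
The lifting step rests on writing Lebesgue measure in $\mathbf{R}^{n}$ as an average over $m$-planes. What follows is a sketch of that mechanism, with notation introduced here: $G$ is the Grassmannian of $m$-planes $\pi$ through the origin, $d\pi$ its rotation-invariant probability measure, $dy_{\pi}$ Lebesgue measure on $\pi$, and $\omega_{j}$ the surface area of the unit sphere $S^{j}$. Comparing polar coordinates in $\mathbf{R}^{n}$ and in each $\pi$ gives, for integrable $g$,

$$\int_{\mathbf{R}^{n}}g(y)\,dy=\frac{\omega_{n-1}}{\omega_{m-1}}\int_{G}\int_{\pi}g(y)\,|y|^{n-m}\,dy_{\pi}\,d\pi .$$

Applying this to $g(y)=f(x-y)\chi_{|y|\leq r}(y)$ and to $g=\chi_{|y|\leq r}$, and noting that $\int_{\pi\cap\{|y|\leq r\}}|y|^{n-m}\,dy_{\pi}$ does not depend on $\pi$, one obtains with $k=n-m$

$$\frac{1}{c_{n}r^{n}}\int_{|y|\leq r}f(x-y)\,dy =\int_{G}\frac{\int_{\pi\cap\{|y|\leq r\}}f(x-y)\,|y|^{k}\,dy_{\pi}}{\int_{\pi\cap\{|y|\leq r\}}|y|^{k}\,dy_{\pi}}\,d\pi, \qquad\text{so}\qquad M_{n}(f)(x)\leq\int_{G}M_{m,k}^{\pi}(f)(x)\,d\pi,$$

where $M_{m,k}^{\pi}$ denotes $M_{m,k}$ acting in the directions of $\pi$. By Minkowski's inequality and \eqref{eqntwfo} applied on each translate of $\pi$, it follows that $\|M_{n}(f)\|_{p}\leq A_{p,m}\|f\|_{p}$, a bound in which $n$ no longer appears.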

### Works

[1] A. Zygmund: “Une remarque sur un théorème de M. Kaczmarz” [A remark on a theorem of M. Kaczmarz], Math. Z. 25:1 (1926), pp. 297–298. MR 1544811 JFM 52.0278.01

[2] A. Zygmund: “Sur l’application de la première moyenne arithmétique dans la théorie des séries de fonctions orthogonales” [On the application of the first arithmetic mean to the theory of series of orthogonal functions], Fundam. Math. 10 (1927), pp. 356–362. JFM 53.0267.04

[3] J. Marcinkiewicz and A. Zygmund: “A theorem of Lusin,” Duke Math. J. 4:3 (1938), pp. 473–485. MR 1546069 JFM 64.0268.01 Zbl 0019.42001

[4] A. Zygmund: “On the convergence and summability of power series on the circle of convergence, I,” Fundam. Math. 30 (1938), pp. 170–196. Part II was published in Proc. London Math. Soc. 47:1. JFM 64.1054.01 Zbl 0019.01602

[5] A. Zygmund: “On certain integrals,” Trans. Am. Math. Soc. 55 (1944), pp. 170–204. MR 0009966 Zbl 0061.13902

[6] A. Zygmund: “Proof of a theorem of Littlewood and Paley,” Bull. Am. Math. Soc. 51:6 (1945), pp. 439–446. MR 0012306 Zbl 0060.14703

[7] A. P. Calderon and A. Zygmund: “On the existence of certain singular integrals,” Acta Math. 88 (December 1952), pp. 85–139. Dedicated to Professor Marcel Riesz, on the occasion of his 65th birthday. MR 0052553 Zbl 0047.10201

[8] A. Zygmund: “On the Littlewood–Paley function $g^*(\theta)$,” Proc. Natl. Acad. Sci. U.S.A. 42:4 (April 1956), pp. 208–212. MR 0077700 Zbl 0072.07201

[9] M. Weiss and A. Zygmund: “A note on smooth functions,” Nederl. Akad. Wetensch. Proc. Ser. A 62:1 (1959), pp. 52–58. MR 0107122 Zbl 0085.05701

[10] A. Zygmund: Trigonometric series, 2nd edition, vol. I. Cambridge University Press (New York), 1959. First volume of an enlarged edition of 1935 original. MR 0107776 Zbl 0085.05601

[11] A.-P. Calderón and A. Zygmund: “Local properties of solutions of elliptic partial differential equations,” Studia Math. 20:2 (1961), pp. 171–225. This expands on an article published in Proc. Natl. Acad. Sci. U.S.A. 46:10 (1960). MR 0136849 Zbl 0099.30103

[12] E. M. Stein and A. Zygmund: “On the differentiability of functions,” Studia Math. 23:3 (1963–1964), pp. 247–283. Dedicated to E. Hille on the occasion of his 70th birthday. MR 0158955 Zbl 0122.30203