by Ronald Getoor
I first met Kai Lai Chung in 1955 when he gave a seminar talk at Princeton, where I was an instructor at the time. I believe that he spoke about his work on Markov chains. After so many years I remember very little about the talk, but I clearly remember how impressed I was with the enthusiasm and energy displayed by the speaker. I had the pleasure of spending the academic year 1964–65 at Stanford, and during that time Kai Lai and I became good friends. We had the opportunity to discuss mathematics in some depth, and we often had lunch together. He had become interested in potential theory, and had invited Marcel Brelot to visit Stanford during the spring quarter of 1965, and give a course on classical potential theory. By the end of the term, he and I were the only ones still attending Brelot’s lectures! During the 1970s we had an extensive correspondence about Markov processes, probabilistic potential theory and related topics. Interacting with Kai Lai on any level was always extremely stimulating and rewarding.
In what follows I am going to comment on some of his work that was especially important and influential in areas that are of interest to me.
Excursions
During the early 1970s there was a considerable body of work on what might be called the general theory of excursions of a Markov process. Perhaps the definitive work in this direction was the paper of Maisonneuve [e9]. Shortly thereafter Chung’s paper [6] on Brownian excursions appeared. Some of his results had been announced earlier in [5]. Chung did not make use of the general theory; rather, working by hand, he made a deep study of the excursions of Brownian motion from the origin, using the special properties of Brownian motion. This paper was a tour de force of direct methods for penetrating the mysteries of these excursions. Guided somewhat, it seems, by analogy with his previous work on Markov chains, and inspired by Lévy’s work, he obtained a wealth of explicit formulas for the distributions of various random variables and processes derived from an excursion. I shall describe briefly a few of his results, without reproducing the detailed explicit expressions in the paper.
Let \( B = B(t) \) denote one-dimensional Brownian motion starting from the origin, and let \( Y = |B| \). Fix \( t > 0 \). Following Chung, define \begin{align*} \gamma(t) &=\sup\{s\le t : Y(s) =0\},\\ \beta(t) &= \inf\{s\ge t: Y (s) = 0\}. \end{align*} The intervals \( (\gamma(t), \beta(t)) \) and \( (\gamma(t), t) \) are called the excursion interval straddling \( t \), and the interval of meandering ending at \( t \), respectively. Let \( L(t) = \beta(t) - \gamma(t) \) and \( L^-(t) = t - \gamma(t) \) denote the lengths of these intervals. Chung begins by giving a direct derivation of a number of results, originally due to Lévy, which lead to the joint distribution of \( (\gamma(t), Y(t), \beta(t)) \). Moreover, based on his earlier work on Markov chains, he is able to write these formulas in a particularly illuminating form. Define \begin{align*} Z^-(u) &= Y(\gamma(t) + u) \quad\text{for } 0 \le u \le L^-(t),\\ Z(u) &= Y(\gamma(t) +u) \quad\text{for }0\le u \le L(t). \end{align*} Then \( Z^- \) is called the meandering process, and \( Z \) the excursion process. Theorem 4 gives the joint law of \( \gamma(t) \) and \( Z^- \), while Theorem 6 contains the joint law of \( \gamma(t), L(t) \) and \( Z \). (Chung denotes both the meandering process and the excursion process by \( Z \); I have changed the notation for this exposition.) Chung then applies these results to calculate the distributions of various functionals of these processes. Particularly interesting is Theorem 7, which contains an explicit formula for the maximum of \( Z \) conditioned on \( \gamma(t) \) and \( L(t) \). A consequence is that \[ F(x) = 1 +2 \sum^\infty_{n=1} (1-2n^2x) e^{-n^2x} \quad\text{for } 0 < x < \infty \] defines a distribution function! This is discussed in some detail. Other functionals were also studied. Of special interest to me is the occupation time of an interval \( (a, b) \) during an excursion, defined by \[ S(t, a, b) = \int^{\beta(t)}_{\gamma(t)} 1_{(a,b)} (Y(s))\,ds.
\] Among other things, Chung showed that \( S(t, 0,\varepsilon)/\varepsilon^2 \) has a limiting distribution as \( \varepsilon \downarrow 0 \), and computed its first four moments. In [e10] it was shown that this distribution was the convolution of the first passage distribution \( P(R \in ds) \) with itself, where \( R=\inf \{s: Y(s) =1\} \).
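The claim that the series \( 1 + 2\sum_{n\ge 1}(1-2n^2x)e^{-n^2x} \) defines a distribution function can be made plausible by a short calculation. What follows is only a sketch of one standard route, not a reproduction of Chung’s argument. Write \( \theta(x) = \sum_{n\in\mathbb{Z}} e^{-n^2x} \) (a notation introduced here for convenience), and let \( \theta^{\prime} \) denote its derivative. Differentiating term by term, and then using the Jacobi identity \( \theta(x) = \sqrt{\pi/x}\;\theta(\pi^2/x) \), one finds \begin{align*} 1 + 2\sum^\infty_{n=1} (1-2n^2x)\, e^{-n^2x} &= \theta(x) + 2x\,\theta^{\prime}(x) \\ &= -2\pi^{5/2} x^{-3/2}\, \theta^{\prime}(\pi^2/x) \\ &= 4\pi^{5/2} x^{-3/2} \sum^\infty_{n=1} n^2 e^{-n^2\pi^2/x}. \end{align*} The first expression tends to 1 as \( x\to\infty \), while the last is positive and tends to 0 as \( x \downarrow 0 \); these are the boundary values one expects of a distribution function on \( (0,\infty) \). Monotonicity requires a further argument and is not addressed by this sketch.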
Moderate Markov processes
In the paper [7] some of the basic properties of a left-continuous moderate Markov process were formulated and proved. It was more or less ignored when it first appeared, even though the importance of this class of processes was evident from the fundamental paper of Chung and Walsh [1] on time reversal of Markov processes. In [1], it was called the moderately strong Markov property, and the process was right continuous. To the best of my knowledge, the terminology “moderate Markov property” first appeared in [2]. In 1987, Fitzsimmons [e11] was able to modify somewhat the Chung–Walsh methods, and so to construct a left-continuous moderate Markov dual process for any given Borel right process and excessive measure \( m \) as duality measure. The Chung–Walsh theory corresponds to \( m \) being the potential of a measure \( \mu \) which served as a fixed initial distribution. More importantly, Fitzsimmons showed that this dual was a powerful tool in studying the potential theory of the underlying Borel right process. Consequently, there was renewed interest in left-continuous moderate Markov processes, and the Chung–Glover paper [7] was immediately relevant. It has become the basic reference for properties of these processes.
Probabilistic potential theory
The paper [3] was perhaps Chung’s most influential contribution to what is commonly known as probabilistic potential theory. (This excludes his work on gauge theorems and Schrödinger equations.) In it, he obtained a beautiful expression for the equilibrium distribution of a set, in terms of the last-exit distribution from the set and the potential kernel density of the underlying process. He emphasized and clearly stated that his approach involved working directly with the last exit time. This was an important innovation since such times are not stopping times, and so were not part of the available machinery at that time. Immediately following Chung’s paper (more precisely, its announcement) and inspired by it, Meyer [e8] and Getoor and Sharpe [e7] obtained similar results under different hypotheses. Numerous authors then developed techniques for handling last exit and more general times, which became part of the standard machinery of Markov processes. In two additional papers, [4] and (with K. Murali Rao) [8], conditions were given under which the equilibrium measure obtained from the last-exit distribution is a multiple of the measure of minimum energy, as in classical situations. In [8], symmetry was not assumed, and so a modified form of energy was introduced in order to obtain reasonable results. Additional implications in potential theory of the hypotheses that he had introduced in [3], and also their relationship with the more common duality hypotheses, were explored with K. Murali Rao in [9] and with Ming Liao and Rao in [10]. Of particular importance was the result giving sufficient conditions for the validity of Hunt’s hypothesis B in [9]. These four papers were very original, but for some reason they were not as influential as the paper [3].
For historical reasons, I should point out that the relationship between the equilibrium measure and the last-exit distribution had appeared a few years earlier in Port and Stone’s memoir on infinitely divisible processes; see sections 8 and 11 of [e6]. One may wonder why Chung’s paper [3] was immediately so influential, while the result in Port and Stone was hardly noticed at the time. Certainly it was unknown to Chung, and evidently Meyer was also unaware of it. The most likely reasons for this are two-fold: (1) The result in Port and Stone was buried in a memoir of just over two hundred pages; in addition, their proof of the integral condition for the transience or recurrence of an infinitely divisible process attracted the most attention at the time. (2) In Chung’s paper it was the main result, it was clearly stated, and it was proved by a direct, easily understood argument.
I shall now explain in a bit more detail what Chung did. I’ll try to emphasize the ideas, leaving aside the technicalities. So, suppose that \( X = (X_t, P^x) \) is a Hunt process taking values in a locally compact, separable Hausdorff space \( E \). If \( B\in \mathcal{E} \) (the \( \sigma \)-algebra of Borel subsets of \( E \)), define the hitting time \( T_B \) and the last exit time \( \lambda_B \) of \( B \) by \[ T_B= \inf\{t > 0: X_t \in B\} \quad\text{and}\quad \lambda_B = \sup \{t > 0: X_t \in B\}, \] where the infimum (respectively, supremum) of the empty set is \( \infty \) (respectively, 0). Let \[ U(x, B) = E^x \int^\infty_0 1_B(X_t) \,dt \] denote the potential kernel of \( X \), and suppose that \( U(\,\cdot\,, K) \) is bounded for \( K \) compact; in particular, \( X \) is transient. For the moment, suppose \( X \) is a Brownian motion in \( \mathbb{R}^d \) for \( d\ge 3 \). Then, \[ U(x, B) = \int_Bu(x, y) \,dy \] where \( u(x, y) = c_d|x-y|^{2-d} \) is the Newtonian potential kernel appropriately normalized. A classical result in potential theory states that if \( K \subset \mathbb{R}^d \) is compact and has positive (Newtonian) capacity, then there exists a unique measure \( \mu_K \), called the equilibrium measure or distribution of \( K \), carried by \( K \) and whose potential \begin{equation}\label{1} p_K(x) = U\mu_K(x) = \int u(x, y)\,\mu_K(dy) \end{equation} is less than or equal to 1 everywhere, and takes the value 1 on \( K \). Actually, \( p_K\equiv 1 \) on \( K \) only if \( K \) is regular; in general, there may be an exceptional subset of \( K \) of capacity zero on which \( p_K < 1 \). The function \( p_K \) is called the equilibrium potential of \( K \), and may be characterized as the unique superharmonic function \( v \) on \( \mathbb{R}^d \) such that \( 0 \le v \le 1 \), \( v \) is harmonic on \( \mathbb{R}^d\backslash K \), and \( \{v < 1\}\cap K \) has capacity zero (and \( v \equiv 1 \) on \( K \) if \( K \) is regular).
Evidently Kakutani [e1] was the first person to note that \begin{equation}\label{2} p_K(x) = P^x(T_K < \infty) = P^x(X_t \in K \text{ for some } t > 0). \end{equation} One may ask for what class of Borel sets \( B\subset \mathbb{R}^d \) does there exist a measure \( \mu_B \) such that \begin{equation}\label{3} P^x(T_B < \infty) = \int u(x, y) \,\mu_B(dy), \end{equation} and what can be said about \( \mu_B \). This is the equilibrium problem, as stated in the first paragraph of Chung’s paper.
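A worked instance may help fix ideas; the symbols \( K_r \) and \( \sigma_r \) are introduced here for illustration only. Let \( K_r = \{y : |y| \le r\} \) be the closed ball of radius \( r \) in \( \mathbb{R}^d \), \( d\ge 3 \), and let \( \sigma_r \) be the uniform probability measure on the sphere \( \{|y| = r\} \). By the classical mean value computation for the Newtonian kernel (Newton’s theorem), \[ \int |x-y|^{2-d}\,\sigma_r(dy) = \max\{|x|, r\}^{2-d}. \] Hence, with \( \mu = (r^{d-2}/c_d)\,\sigma_r \), \[ U\mu(x) = \int c_d|x-y|^{2-d}\,\mu(dy) = \min\bigl\{1, (r/|x|)^{d-2}\bigr\}, \] which is at most 1 everywhere and equal to 1 on \( K_r \). Since \( \mu \) is carried by \( K_r \), the uniqueness statement above identifies it as the equilibrium measure \( \mu_{K_r} \), and Kakutani’s formula then gives \( P^x(T_{K_r} < \infty) = (r/|x|)^{d-2} \) for \( |x| \ge r \), the familiar hitting probability of a ball.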
Now return to the situation in which \( X \) is a Hunt process, as described in the first few sentences of the preceding paragraph. For \( B\in \mathcal{E} \), recall the definitions of the hitting time \( T_B \) and the last exit time \( \lambda_B \). The set \( B \) is transient, provided \( P^x(\lambda_B < \infty)=1 \) for all \( x \). Also note that \[ p_B(x) = P^x(T_B < \infty) = P^x(\lambda_B > 0). \] Fix \( B \) transient, and let \( p = p_B \). It is easily checked that \( p \) is excessive, and \( P_t p \to 0 \) as \( t\to \infty \). Here, \( P_{t} = (P_t(x,\,\cdot\,)) \) is the transition semigroup of \( X \). Formally, from semigroup theory, \( (p-P_\varepsilon p)/\varepsilon \to - \mathcal{G}p \), where \( \mathcal{G} \) is the “generator” of \( (P_t) \), and \( p = U(-\mathcal{G} p) \), with \( U \) the potential kernel of \( X \) as defined above. Of course, in general \( p \) is not in the domain of \( \mathcal{G} \). However, if we want to represent \( p \) as the potential of something, then one expects it to be some sort of limit of \( p_\varepsilon = (p - P_{\varepsilon} p)/\varepsilon \) as \( \varepsilon \downarrow 0 \). This idea had been used by McKean and Tanaka [e4], Volkonski [e3] and Šur [e5] to represent excessive functions as potentials of additive functionals. More relevant to the present discussion, using the same basic idea, Hunt [e2] had shown, for what are now called Hunt processes satisfying, in addition, the existence of a nice dual process and subject to a type of Feller condition and a transience hypothesis, that, if \( B \) has compact closure, then \eqref{3} holds, where now \( u(x, y) \) is the potential density associated with \( X \) and its dual; in particular, \( U(x, dy) = u(x, y) \,m(dy) \), where \( m \) is the duality measure — Lebesgue measure when \( X \) is Brownian motion.
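The heuristic \( p = U(-\mathcal{G}p) \) has a rigorous counterpart at the level of \( p_\varepsilon \). The following is a sketch, using only that \( p \) is excessive, bounded by 1, and satisfies \( P_tp\to 0 \). For \( T \ge \varepsilon \), \[ \int^T_0 P_t(p - P_\varepsilon p)\,dt = \int^\varepsilon_0 P_t p\,dt - \int^{T+\varepsilon}_T P_tp\,dt. \] Because \( t\mapsto P_tp \) is decreasing, the last integral is at most \( \varepsilon P_Tp \), which tends to 0 as \( T\to\infty \). Since \( p - P_\varepsilon p \ge 0 \), monotone convergence on the left side yields \[ Up_\varepsilon = \frac 1\varepsilon \int^\varepsilon_0 P_tp\,dt, \] and the right side increases to \( p \) as \( \varepsilon\downarrow 0 \), again because \( p \) is excessive. Thus \( p \) is always the increasing limit of the potentials \( Up_\varepsilon \); the real question is whether the \( p_\varepsilon \) themselves converge in a useful sense.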
Chung’s key observation was to note that \[ p-P_{\varepsilon}p = P^{\centerdot}(\lambda_B > 0) - P^{\centerdot}(\lambda_B > \varepsilon) = P^{\centerdot}(0 < \lambda_B \le \varepsilon). \] Suppose \( f\ge 0 \) is a bounded continuous function, and for simplicity write \( \lambda = \lambda_B \). Then, by the Markov property, \begin{align*} U\bigl[f(p -P_{\varepsilon}p)\bigr] &= E^{\centerdot} \int^\infty_0 f(X_t) \,P^{X(t)}(0 < \lambda \le\varepsilon) \,dt \\ &= E^{\centerdot} \int^\infty_0 f(X_t) \,1_{\{0 < \lambda \circ \theta_t \le \varepsilon\}}\, dt. \end{align*} Here, \( \theta_t \) is the shift operator which shifts the origin of the path from 0 to \( t \) so that \( X_s\circ \theta_t=X_{s+t} \) for \( s\ge 0 \). It is easily checked that \( \lambda \circ \theta_t=(\lambda-t)^+ \). Plugging this into the last integral and recalling that \( p_{\varepsilon}=(p -P_{\varepsilon}p)/\varepsilon \), one finds \begin{align} \label{4} U[fp_{\varepsilon}] &= \frac 1{\varepsilon} E^{\centerdot} \Bigl[\int^\lambda_{(\lambda - {\varepsilon})^+} f(X_t) \,dt; \lambda > 0\Bigr] \\ &\to E^{\centerdot}\bigl[f(X_{\lambda-}); \lambda > 0\bigr] \qquad\text{as }\varepsilon\downarrow 0. \nonumber \end{align} Suppose that there exists a Radon measure \( m \) on \( E \) such that \( U(x, dy) = u(x, y) \,m(dy) \). Then, Chung imposed analytic conditions on the potential density \( u(x, y) \) which implied the existence of a measure \( \mu_B \) such that \begin{align*} U[fp_{\varepsilon}](x) &= \int u(x, y) f(y) p_{\varepsilon}(y) \,m(dy) \\ &\to \int u(x, y) f(y) \mu_B (dy) = U[f\mu_B] (x) \qquad\text{as }\varepsilon\downarrow 0 \end{align*} for all bounded continuous \( f \) with compact support. Combining this with \eqref{4}, we obtain \begin{equation}\label{5} E^x[f(X_{\lambda -}); \lambda > 0 ] = U[f\mu_B] (x), \end{equation} and taking a sequence of such \( f \) increasing to 1, \begin{equation}\label{6} p_B(x) = P^x [ T_B < \infty] = P^x[\lambda_B > 0] =U\mu_B(x).
\end{equation} Defining the last-exit distribution \( L_B(x, dy) = P^x[X_{\lambda-} \in dy, \lambda > 0] \), \eqref{5} implies that \begin{equation}\label{7} L_B(x, dy) = u(x, y) \,\mu_B(dy). \end{equation} This formula \eqref{7} is the celebrated result of Chung which gives the probabilistic meaning of the equilibrium measure \( \mu_B \). The measure \( \mu_B \) is carried by \( \overline B \), even by \( \partial B \) when \( X \) has continuous paths. Under Chung’s or Hunt’s hypotheses, \( \mu_B \) is a Radon measure; more generally, under duality without Feller conditions, \( \mu_B \) is \( \sigma \)-finite.
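For illustration, take \( X \) to be Brownian motion in \( \mathbb{R}^d \), \( d\ge 3 \), with \( u(x,y) = c_d|x-y|^{2-d} \), and let \( B \) be the closed ball of radius \( r \) centered at the origin. Classical potential theory gives \( \mu_B = (r^{d-2}/c_d)\,\sigma_r \), where \( \sigma_r \) denotes the uniform probability measure on the sphere \( \{|y| = r\} \) (a symbol used only in this example). Then \eqref{7} reads \[ L_B(x, dy) = c_d |x-y|^{2-d}\cdot \frac{r^{d-2}}{c_d}\,\sigma_r(dy) = \Bigl(\frac{r}{|x-y|}\Bigr)^{d-2} \sigma_r(dy). \] For \( x = 0 \) this is \( \sigma_r \) itself: starting from the center, the position at the last exit is uniformly distributed on the sphere, as rotational symmetry demands. For \( |x| > r \), the total mass of \( L_B(x,\,\cdot\,) \) is \( (r/|x|)^{d-2} = P^x(\lambda_B > 0) \), in agreement with \eqref{6}.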
Let me derive a simple consequence of \eqref{5}, and for simplicity I shall suppose \( X \) is a Brownian motion in \( \mathbb{R}^d \) with \( d\ge 3 \). Let \( B\subset \mathbb{R}^d \) be transient, for example with \( \overline B \) compact. As before, \( \lambda = \lambda_B \). Since the paths are continuous, \eqref{5} and the Markov property imply that \[ E^x\bigl[f(X_\lambda); 0 < \lambda \le t \bigr] = U[f\mu_B](x) - P_t U[f \mu_B](x) \] for \( t > 0 \) and \( f \) bounded with compact support. Now, \( P_t (x, dy) = g_t(y-x) \,dy \), where \( g_t \) is the familiar Gauss kernel. Hence, \[ E^x\bigl[f(X_\lambda); 0 < \lambda \le t\bigr] = \int \Bigl[\int^t_0 g_s (y-x)\,ds\Bigr] f(y)\,\mu_B(dy). \] Integrating over \( x\in\mathbb{R}^d \) we obtain, since \( g_s \) is a probability density, \[ \int_{\mathbb{R}^d} dx\, E^x\bigl[f(X_\lambda); 0 < \lambda \le t\bigr] = t \int f\,d\mu_B; \] that is, \begin{equation}\label{8} P^m \bigl[X_\lambda \in dy, \lambda \in dt\bigr] = dt \,\mu_B(dy) \quad\text{for }t > 0, \end{equation} where \( m \) is Lebesgue measure. Thus, \( X_\lambda \) and \( \lambda \) are independent under the \( \sigma \)-finite measure \( P^m \), and their joint distribution under \( P^m \) is the product of \( \mu_B \) and Lebesgue measure. To my mind, this is one of the nicest probabilistic interpretations of the equilibrium measure for Brownian motion. Actually, this is valid in much more generality. For example, if \( X \) has a strong dual and the duality measure \( m \) is invariant, then \begin{equation}\label{9} P^m\bigl(X_{\lambda-} \in dy, \,\lambda \in dt\bigr) = dt\,\mu_B (dy) \quad\text{for }t > 0. \end{equation} See [e7]. In particular this holds for transient Lévy processes in \( \mathbb{R}^d \) whose potential kernel is absolutely continuous. In general, if \( m \) is not invariant, then \( X_{\lambda-} \) and \( \lambda \) are not independent under \( P^m \).
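Returning to the derivation of \eqref{8}, the appeal to the Markov property can be spelled out as follows; this is a sketch in the Brownian setting of the preceding paragraph. Since \( \lambda\circ\theta_t = (\lambda - t)^+ \), one has \( \{\lambda\circ\theta_t > 0\} = \{\lambda > t\} \), and on this event the last exit position of the shifted path is again \( X_\lambda \). Applying the Markov property at time \( t \) and then \eqref{5}, \begin{align*} E^x\bigl[f(X_\lambda); \lambda > t\bigr] &= E^x\Bigl[E^{X_t}\bigl[f(X_\lambda); \lambda > 0\bigr]\Bigr] \\ &= P_t U[f\mu_B](x). \end{align*} Subtracting this from \eqref{5} gives the expression for \( E^x[f(X_\lambda); 0 < \lambda\le t] \). Moreover, \( u(x,y) = \int^\infty_0 g_s(y-x)\,ds \), so that \[ U[f\mu_B](x) - P_tU[f\mu_B](x) = \int \Bigl[\int^t_0 g_s(y-x)\,ds\Bigr] f(y)\,\mu_B(dy), \] the subtraction being legitimate because \( U[f\mu_B] \le \sup|f| \) is finite. Finally, letting \( f \) increase to 1 in the integrated identity yields \( P^m(0 < \lambda \le t) = t\,\mu_B(\mathbb{R}^d) \): under \( P^m \), the mass of paths that leave \( B \) for the last time during \( (0, t] \) grows linearly in \( t \), at a rate equal to the total mass of the equilibrium measure.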