by Richard S. Palais and Chuu-Lian Terng
Introduction
Many mathematicians consider Shiing-Shen Chern to be the outstanding contributor to research in differential geometry in the second half of the twentieth century. Just as geometry in the first half-century bears the indelible stamp of Élie Cartan, so the seal of Chern appears large on the canvas of geometry that has been painted in the past fifty years. And beyond the great respect and admiration that his scientific accomplishments have brought him, there is also a remarkable affection and esteem for Chern on the part of countless colleagues, students, and personal friends. This reflects another aspect of his career — the friendship, warmth, and consideration Chern has always shown to others throughout a life devoted as much to helping younger mathematicians develop their full potential as to his own research.
Our recounting of Chern’s life is in two sections: the first, more biographical in nature, concentrates on details of his personal and family history; the second gives a brief report on his research and its influence on the development of twentieth-century mathematics.
Our main sources for the preparation of this article were the four volumes of Chern’s selected papers [24], [28], [30], [29] published by Springer-Verlag, a collection of Chern’s Chinese articles by Science Press [27], and many conversations with Chern himself.
Early life
Chern was born on October 28, 1911 in Jia Xin. His father, Bao Zheng Chern, passed the city level Civil Service examinations at the end of the Qing Dynasty, and later graduated from Zhe Jiang Law School and practiced law. He and Chern’s mother, Mei Han, had one other son and two daughters.
Because his grandmother liked to have him at home, Shiing-Shen was not sent to elementary school, but instead learned Chinese at home from his aunt. His father was often away working for the government, but once when his father was at home he taught Shiing-Shen about numbers, and the four arithmetic operations. After his father left, Shiing-Shen went on to teach himself arithmetic by working out many exercises in the three volumes of Bi Shuan Mathematics. Because of this he easily passed the examination and entered Xiu Zhou School, fifth grade, in 1920.
His father worked for the court in Tianjin and decided to move the family there in 1922. Chern entered Fu Luen middle school that year and continued to find mathematics easy and interesting. He worked a large number of exercises in Higher Algebra by Hall and Knight, and in Geometry and Trigonometry by Wentworth and Smith. He also enjoyed reading and writing.
1926–30, Nankai University
Chern passed the college entrance examinations in 1926, at age fifteen, and entered Nankai University to study Mathematics. In the late 1920s there were few mathematicians with a PhD degree in all of China, but Chern’s teacher, Lifu Jiang, had received a doctoral degree from Harvard with Julian Coolidge. Jiang had a strong influence on Chern’s course of study; he was very serious about his teaching, giving many exercises and personally correcting all of them. Nankai provided Chern with an excellent education during four happy years.
1930–34, Qing Hua graduate school
In the early 1930s, many mathematicians with PhD degrees recently earned abroad were returning to China and starting to train students. It appeared to Chern that this new generation of teachers did not encourage students to become original and strike out on their own, but instead set them to work on problems that were fairly routine generalizations of their own thesis research. Chern realized that to attain his goal of high quality advanced training in mathematics he would have to study abroad. Since his family could not cover the expense this would involve, he knew that he would require the support of a government fellowship. He learned that a student graduating from Qing Hua graduate school with sufficiently distinguished records could be sent abroad with support for further study, so, after graduating from Nankai in 1930, he took and passed the entrance examination for Qing Hua graduate school. At that time the four professors of mathematics at Qing Hua were Qinglai Xiong, Guangyuan Sun, Wuzhi Yang (C. N. Yang’s father), and Zhifan Zheng (Chern’s father-in-law to be), and Chern studied projective differential geometry with Professor Sun.
While at Nankai Chern had taken courses from Jiang on the theory of curves and surfaces, using a textbook written by W. Blaschke. Chern had found this deep and fascinating, so when Blaschke visited Beijing in 1932, Chern attended all of his series of six lectures on web geometry. In 1934, when Chern graduated from Qing Hua, he was awarded a two-year fellowship for study in the United States but, because of his high regard for Blaschke, he requested permission from Qing Hua to use the fellowship at the University of Hamburg instead. The acting chairman, Professor Wuzhi Yang, helped both to arrange the fellowship for Chern and to obtain permission for him to use it in Germany. This was the year that the Nazis were starting to expel Jewish professors from the German universities, but Hamburg University had opened only several years before and, perhaps because it was so new, it remained relatively calm and a good place for a young mathematician to study.
1934–36, Hamburg University
Chern arrived at Hamburg University in September of 1934, and started working under Blaschke’s direction on applications of Cartan’s methods in differential geometry. He received the Doctor of Science degree in February 1936. Because Blaschke travelled frequently, Chern worked much of the time with Blaschke’s assistant, Kähler. Perhaps the major influence on him while at Hamburg was Kähler’s seminar on what is now known as Cartan–Kähler Theory. This was then a new theory and everyone at the Institute attended the first meeting. By the end of the seminar only Chern was left, but he felt that he had benefited greatly from it. When his two year fellowship ended in the summer of 1936, Chern was offered appointments at both Qing Hua and Beijing University. But he was also offered another year of support from The Chinese Culture Foundation and, with the recommendation of Blaschke, he went to Paris in 1936–37 to work under the renowned geometer Élie Cartan.
1936–37, Paris
When Chern arrived in Paris in September of 1936, Cartan had so many students eager to work with him that they lined up to see him during his office hours. Fortunately, after two months Cartan invited Chern to see him at home for an hour once every other week during the remaining ten months he was in Paris. Chern spent all his efforts preparing for these biweekly meetings, working very hard and very happily. He learned moving frames, the method of equivalence, more of Cartan–Kähler theory, and, most importantly according to Chern himself, he learned the mathematical language and the way of thinking of Cartan. The three papers he wrote during this period represented the fruits of only a small part of the research that came out of this association with Cartan.
1937–43, Kunming and the Southwest University Consortium
Chern received an appointment as Professor of Mathematics at Qing Hua in 1937. But before he could return to China, invading Japanese forces had touched off the long and tragic Sino-Japanese war. Qing Hua joined with Peking University and Nankai University to form a three-university consortium, first at Changsha, and then, beginning in January 1938, at Kunming, where it was called the Southwest Associated University. Chern taught at both places. The consortium had an excellent faculty, and in particular Luogeng Hua was also Professor of Mathematics there. Chern had many excellent students in Kunming, some of whom later made substantial contributions to mathematics and physics. Among these were the mathematician H. C. Wang and the Nobel prize-winning physicist C. N. Yang. Because of the war, there was little communication with the outside world and the material life was meager. But Chern was fortunate enough to have Cartan’s recent papers to study, and he immersed himself in these and in his own research. The work begun during this difficult time would later become a major source of inspiration in modern mathematics.
Chern’s family
In 1937 Chern and Ms. Shih-Ning Cheng became engaged in Changsha, having been introduced by Wuzhi Yang. She had recently graduated from Dong Wu University, where she had studied biology. They were married in July of 1939, and Mrs. Chern went to Shanghai in 1940 to give birth to their first child, a son Buo Lung. The war separated the family for six years and they were not reunited until 1946. They have a second child, a daughter, Pu (married to Chingwu Chu, one of the main contributors in the development of high temperature superconductors).
The Cherns have had a beautiful and full marriage and family life. Mrs. Chern has always been at his side and Chern greatly appreciated her efforts to maintain a serene environment for his research. He expressed this in a poem he wrote on her sixtieth birthday:
Thirty-six years together
Through times of happiness
And times of worry too.
Time’s passage has no mercy.
We fly the Skies and cross the Oceans
To fulfill my destiny;
Raising the children fell
Entirely on your shoulders.
How fortunate I am
To have my works to look back upon,
I feel regrets you still have chores.
Growing old together in El Cerrito is a blessing.
Time passes by,
And we hardly notice.
In 1978 Chern wrote in the article “A summary of my scientific life and works”:
“I would not conclude this account without mentioning my wife’s role in my life and work. Through war and peace and through bad and good times we have shared a life for forty years, which is both simple and rich. If there is credit for my mathematical works, it will be hers as well as mine.”
1943–45, Institute for Advanced Study at Princeton
By now Chern was recognized as one of the outstanding mathematicians of China, and his work was drawing international attention. But he felt unsatisfied with his achievements, and when O. Veblen obtained a membership for him at the Institute for Advanced Study in 1943, he decided to go despite the great difficulties of wartime travel. In fact, it required seven days for Chern to reach the United States by military aircraft!
This was one of the most momentous decisions of Chern’s life, for in those next two years in Princeton he was to complete some of his most original and influential work. In particular, he found an intrinsic proof of The Generalized Gauss–Bonnet Theorem [9], and this in turn led him to discover the famous Chern characteristic classes [10]. In 1945 Chern gave an invited hour address to the American Mathematical Society, summarizing some of these striking new advances. The written version of this talk [11] was an unusually influential paper, and as Heinz Hopf remarked in reviewing it for Mathematical Reviews it signaled the arrival of a new age in global differential geometry (“Dieser Vortrag… zeigt, dass wir uns einer neuen Epoche in der ‘Differentialgeometrie im Grossen’ befinden”).
1946–48, Academia Sinica
Chern returned to China in the spring of 1946. The Chinese government had just decided to set up an Institute of Mathematics as part of Academia Sinica. Lifu Jiang was designated chairman of the organizing committee, and he in turn appointed Chern as one of the committee members. Jiang himself soon went abroad, and the actual work of organizing the Institute fell to Chern. At the Institute, temporarily located in Shanghai, Chern emphasized the training of young people. He selected the best recent undergraduates from universities all over China and lectured to them twelve hours a week on recent advances in topology. Many of today’s outstanding Chinese mathematicians came from this group, including Wenjun Wu, Shantao Liao, Guo Tsai Chen, and C. T. Yang. In 1948 the Institute moved to Nanjing, and Academia Sinica elected eighty-one charter members, Chern being the youngest of these.
Chern was so involved in his research and with the training of students that he paid scant attention to the civil war that was engulfing China. One day however, he received a telegram from J. Robert Oppenheimer, then Director of the Institute for Advanced Study, saying “If there is anything we can do to facilitate your coming to this country please let us know.” Chern went to read the English language newspapers and, realizing that Nanjing would soon become embroiled in the turmoil that was rapidly overtaking the country, he decided to move the whole family to America. Shortly before leaving China he was also offered a position at the Tata Institute in Bombay. The Cherns left from Shanghai on December 31, 1948, and spent the Spring Semester at the Institute in Princeton.
1949–60, Chicago University
Chern quickly realized that he would not soon be able to return to China, and so would have to find a permanent position abroad. At this time, Professor Marshall Stone of the University of Chicago Mathematics Department had embarked on an aggressive program of bringing to Chicago stellar research figures from all over the world, and in a few years’ time he had made the Chicago department one of the premier centers for mathematical research and graduate education worldwide. Among this group of outstanding scholars was Chern’s old friend, André Weil, and in the summer of 1949 Chern too accepted a professorship at the University of Chicago. During his eleven years there Chern had ten doctoral students. He left in 1960 for the University of California at Berkeley, where he remained until his retirement in 1979.
Chern and C. N. Yang
Chern’s paper on characteristic classes was published in 1946 and he gave a one semester course on the theory of connections in 1949. Yang and Mills published their paper introducing the Yang–Mills theory into physics in 1954. Chern and Yang were together in Chicago in 1949 and again in Princeton in 1954. They are good friends and often met and discussed their respective research. Remarkably, neither realized until many years later that they had been studying different aspects of the same thing!
1960–79, UC Berkeley
Chern has commented that two factors convinced him to make the move to Berkeley. One was that the Berkeley department was growing vigorously, giving him the opportunity to build a strong group in geometry. The other was… the warm weather. During his years at Berkeley, Chern directed the thesis research of thirty-one students. He was also teacher and mentor to many of the young postdoctoral mathematicians who came to Berkeley for their first jobs. (This group includes one of the coauthors of this article; the other was similarly privileged at Chicago.) During this period the Berkeley Department became a world-famous center for research in geometry and topology. Almost all geometers in the United States, and in much of the rest of the world too, have met Chern and been strongly influenced by him. He has always been friendly, encouraging, and easy to talk with on a personal level, and since the 1950s his research papers, lecture notes, and monographs have been the standard source for students desiring to learn differential geometry. When he “retired” from Berkeley in 1979, there was a week long “Chern Symposium” in his honor, attended by over three hundred geometers. In reality, this was a retirement in name only; during the five years that followed, not only did Chern find time to continue occasional teaching as Professor Emeritus, but he also went “up the hill” to serve as the founding director of the Berkeley Mathematical Sciences Research Institute (MSRI).
1981–present, the three institutes
In 1981 Chern, together with Calvin Moore, Isadore Singer, and several other San Francisco Bay area mathematicians wrote a proposal to the National Science Foundation for a mathematical research institute at Berkeley. Of the many such proposals submitted, this was one of only two that were eventually funded by the NSF. Chern became the first director of the resulting Mathematical Sciences Research Institute (MSRI), serving in this capacity until 1984. MSRI quickly became a highly successful institute and many credit Chern’s influence as a major factor.
In fact, Chern has been instrumental in establishing three important institutes of mathematical research: The Mathematical Institute of Academia Sinica (1946), The Mathematical Sciences Research Institute in Berkeley, California (1981), and The Nankai Institute for Mathematics in Tianjin, China (1985). It was remarkable that Chern did this despite a reluctance to get involved with details of administration. In such matters his adoption of Laozi’s philosophy of “Wu Wei” (roughly translated as “Let nature take its course”) seems to have worked admirably. Chern has always believed strongly that China could and should become a world leader in mathematics. But for this to happen he felt two preconditions were required:
The existence within the Chinese mathematical community of a group of strong, confident, creative people, who are dedicated, unselfish, and aspire to go beyond their teachers, even as they wish their students to go beyond them.
Ample support for excellent library facilities, research space, and communication with the world-wide mathematical community. (Chern claimed that these resources were as essential for mathematics as laboratories were for the experimental sciences).
It was to help in achieving these goals that Chern accepted the job of organizing the mathematics institute of Academia Sinica during 1946 to 1948, and the reason why he returned to Tianjin to found the Mathematics Institute at Nankai University after his retirement in 1984 as director of MSRI.
During 1965–76, because of the Cultural Revolution, China lost a whole generation of mathematicians, and with them much of the tradition of mathematical research. Chern started visiting China frequently after 1972, to lecture, to train Chinese mathematicians, and to rekindle these traditions. In part because of the strong bonds he had with Nankai University, he founded the Nankai Mathematical Research Institute there in 1985. This Institute has its own housing, and attracts many visitors both from China and abroad. In some ways it is modeled after the Institute for Advanced Study in Princeton. One of its purposes is to have a place where mature mathematicians and graduate students from all of China can spend a period of time in contact with each other and with foreign mathematicians, concentrating fully on research. Another is to have an inspiring place in which to work; one that will be an incentive for the very best young mathematicians who get their doctoral degrees abroad to return home to China.
Honors and awards
Chern was invited three times to address The International Congress of Mathematicians. He gave an Hour Address at the 1950 Congress in Cambridge, Massachusetts (the first ICM following the Second World War), spoke again in 1958, at Edinburgh, Scotland, and was invited to give a second Hour Address at the 1970 ICM in Nice, France. These Congresses are held only every fourth year and it is unusual for a mathematician to be invited twice to give a plenary Hour Address.
During his long career Chern was awarded numerous honorary degrees. He was elected to the US National Academy of Sciences in 1961, and received the National Medal of Science in 1975 and the Wolf Prize in 1983. The Wolf Prize was instituted in 1979 by the Wolf Foundation of Israel to honor scientists who had made outstanding contributions to their field of research. Chern donated the prize money he received from this award to the Nankai Mathematical Institute. He is also a foreign member of The Royal Society of London, the Accademia dei Lincei, and the French Academy of Sciences. A more complete list of the honors he received can be found in the Curriculum Vitae in [28].
An overview of Chern’s research
Chern’s mathematical interests have been unusually wide and far-ranging and he has made significant contributions to many areas of geometry, both classical and modern. Principal among these are:
Geometric structures and their equivalence problems
Integral geometry
Euclidean differential geometry
Minimal surfaces and minimal submanifolds
Holomorphic maps
Webs
Exterior Differential Systems and Partial Differential Equations
The Gauss–Bonnet Theorem
Characteristic classes
Since it would be impossible within the space at our disposal to present a detailed review of Chern’s achievements in so many areas, rather than attempting a superficial account of all facets of his research, we have elected to concentrate on those areas where the effects of his contributions have, in our opinion, been most profound and far-reaching. For further information concerning Chern’s scientific contributions the reader may consult the four volume set, Shiing-Shen Chern: Selected Papers [24], [28], [30], [29]. This includes a Curriculum Vitae, a full bibliography of his published papers, articles of commentary by André Weil and Phillip Griffiths, and a scientific autobiography in which Chern comments briefly on many of his papers.
One further caveat; the reader should keep in mind that this is a mathematical biography, not a mathematical history. As such, it concentrates on giving an account of Chern’s own scientific contributions, mentioning other mathematicians only if they were his coauthors or had some particularly direct and personal effect on Chern’s research. Chern was working at the cutting edge of mathematics and there were of course many occasions when others made discoveries closely related to Chern’s and at approximately the same time. A far longer (and different) article would have been required if we had even attempted to analyze such cases. But it is not only for reasons of space that we have avoided these issues. A full historical treatment covering this same ground would be an extremely valuable undertaking, and will no doubt one day be written. But that will require a major research effort of a kind that neither of the present authors has the training or qualifications even to attempt.
Before turning to a description of Chern’s research, we would like to point out a unifying theme that runs through all of it: his absolute mastery of the techniques of differential forms and his artful application of these techniques in solving geometric problems. This was a magic mantle, handed down to him by his great teacher, Élie Cartan. It permitted him to explore in depth new mathematical territory where others could not enter. What makes differential forms such an ideal tool for studying local and global geometric properties (and for relating them to each other) is their two complementary aspects. They admit, on the one hand, the local operation of exterior differentiation, and on the other the global operation of integration over chains, and these are related via Stokes’ Theorem.
Geometric structures and their equivalence problems
Much of Chern’s early work was concerned with various “equivalence problems”. Basically, the question is how to determine effectively when two geometric structures of the same type are “equivalent” under an appropriate group of geometric transformations. For example, given two curves in space, when is there a Euclidean motion that carries one onto the other? Similarly, when are two Riemannian structures locally isometric? Classically one tried to associate with a given type of geometric structure various “invariants”, that is, simpler and better understood objects that do not change under an isomorphism, and then show that certain of these invariants are a “complete set”, in the sense that they determine the structure up to isomorphism. Ideally one should also be able to specify what values these invariants can assume by giving relations between them that are both necessary and sufficient for the existence of a structure with a given set of invariants. The goal is a theorem like the elegant classic paradigm of Euclidean plane geometry, stating that the three side lengths of a triangle determine it up to congruence, and that three positive real numbers arise as side lengths precisely when each is less than the sum of the other two. For smooth, regular space curves the solution to the equivalence problem was known early in the last century. If to a given space curve \( \sigma(s) \) (parametrized by arc length) we associate its curvature \( \kappa(s) \) and torsion \( \tau(s) \), it is easy to show that these two smooth scalar functions are invariant under the group of Euclidean motions, and that they uniquely determine a curve up to an element of that group. Moreover any smooth real valued functions \( \kappa \) and \( \tau \) can serve as curvature and torsion as long as \( \kappa \) is positive. The more complex equivalence problem for surfaces in space had also been solved by the mid 1800s. 
Here the invariants turned out to be two smooth quadratic forms on the surface, the first and second fundamental forms, of which the first, the metric tensor, had to be positive definite and the two had to satisfy the so-called Gauss and Codazzi equations. The so-called “form problem”, that is the local equivalence problem for Riemannian metrics, was also solved classically (by Christoffel and Lipschitz). The solution is still more complex and superficially seems to have little in common with the other examples above.
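As a concrete instance of the space-curve case described above, the following worked computation may help. It is only a sketch: the circular helix and the constants \( a \), \( b \), \( c \) are chosen here for illustration, and the Frenet sign convention \( B' = -\tau N \) is an assumption of the example rather than something fixed by the text.

```latex
% Circular helix with a > 0 and c = \sqrt{a^2 + b^2}, parametrized by arc length s.
% T, N, B denote the unit tangent, principal normal, and binormal (Frenet frame).
\[
\begin{aligned}
\sigma(s) &= \bigl(a\cos(s/c),\; a\sin(s/c),\; bs/c\bigr), \\
T = \sigma'(s) &= \tfrac{1}{c}\bigl(-a\sin(s/c),\; a\cos(s/c),\; b\bigr),
  \qquad |T|^2 = \tfrac{a^2 + b^2}{c^2} = 1, \\
T'(s) &= \tfrac{a}{c^2}\bigl(-\cos(s/c),\; -\sin(s/c),\; 0\bigr) = \kappa N,
  \qquad \kappa = \tfrac{a}{c^2}, \quad N = \bigl(-\cos(s/c),\; -\sin(s/c),\; 0\bigr), \\
B = T \times N &= \tfrac{1}{c}\bigl(b\sin(s/c),\; -b\cos(s/c),\; a\bigr), \\
B'(s) &= \tfrac{b}{c^2}\bigl(\cos(s/c),\; \sin(s/c),\; 0\bigr) = -\tau N,
  \qquad \tau = \tfrac{b}{c^2}.
\end{aligned}
\]
```

Conversely, given constants \( \kappa > 0 \) and \( \tau \), the choices \( a = \kappa/(\kappa^2 + \tau^2) \) and \( b = \tau/(\kappa^2 + \tau^2) \) produce a helix with exactly these invariants. By the uniqueness statement above, every space curve with constant curvature and torsion is therefore congruent to a circular helix (a circle when \( \tau = 0 \)).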
As Chern was starting his research career, a major challenge facing geometry was to find what this seemingly disparate class of examples had in common, and thereby discover a general framework for the Equivalence Problem. Cartan saw this clearly, and had already made important steps in that direction with his general machinery of “moving frames”. His approach was to reduce a general equivalence problem to one of a special class of equivalence problems for differential forms. More precisely, he would associate to a given type of local geometric structure in open sets \( U \) of \( \mathbf{R}^n \), an “equivalent” structure, given by specifying:
(1) a subgroup \( G \) of \( \mathbf{GL}(n, \mathbf{R}) \),
(2) certain local coframe fields \( \{\theta_i \} \) in open subsets \( U \) of \( \mathbf{R}^n \) (i.e., \( n \) linearly independent differential 1-forms in \( U \)).
The condition of equivalence for \( \{\theta_i\} \) in \( U \) and \( \{\theta_i^{\ast}\} \) in \( U^{\ast} \) is the existence of a diffeomorphism \( \varphi \) of \( U \) with \( U^{\ast} \) such that \[ \varphi^{\ast} (\theta_i^{\ast}) = \sum_{j=1}^n a_{ij} \theta_j ,\] where \( (a_{ij} ) \) is a smooth map of \( U \) into \( G \). A geometric structure defined by the choices (1) and (2) is now usually called a “\( G \)-structure”, a name introduced by Chern in the course of formalizing and explicating Cartan’s approach. For a given geometric structure one must choose the related \( G \)-structure so that its notion of equivalence coincides with that for the originally given geometric structure, so the invariants of the \( G \)-structure will also be the same as for the given geometric structure. In the case of the form problem one takes \( G = \mathbf{O}(n) \), and given a Riemannian metric \( ds^2 \) in \( U \) chooses any \( \theta_i \) such that \[ ds^2 = \sum_{i=1}^n\theta_i^2 \] in \( U \). While not always so obvious as in this case (and a real geometric insight is sometimes required for their discovery) most other natural geometric equivalence problems, including the ones mentioned above, do admit reformulation in terms of \( G \)-structures.
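To make the equivalence condition concrete, here is a small worked example for the form problem with \( n = 2 \) and \( G = \mathbf{O}(2) \). It is an illustrative sketch only: the flat metric, the polar coordinates \( (r, \alpha) \), and the region \( U \) are assumptions made for the example, and the sum in the equivalence condition is read as running over the repeated index \( j \).

```latex
% U: an open subset of the plane on which polar coordinates (r, \alpha) are defined, r > 0,
% with x = r\cos\alpha, y = r\sin\alpha. Flat metric: ds^2 = dx^2 + dy^2 = dr^2 + r^2 d\alpha^2.
% Two coframe fields, each satisfying ds^2 = \theta_1^2 + \theta_2^2:
\[
\theta_1 = dr, \quad \theta_2 = r\,d\alpha,
\qquad\qquad
\theta_1^{\ast} = dx, \quad \theta_2^{\ast} = dy .
\]
% Taking \varphi to be the identity map of U and expanding dx, dy in the coframe \theta:
\[
\begin{aligned}
dx &= \cos\alpha\,dr - r\sin\alpha\,d\alpha = \cos\alpha\,\theta_1 - \sin\alpha\,\theta_2, \\
dy &= \sin\alpha\,dr + r\cos\alpha\,d\alpha = \sin\alpha\,\theta_1 + \cos\alpha\,\theta_2,
\end{aligned}
\qquad
(a_{ij}) =
\begin{pmatrix}
\cos\alpha & -\sin\alpha \\
\sin\alpha & \cos\alpha
\end{pmatrix}.
\]
```

Since \( (a_{ij}) \) is a rotation matrix at every point, it is a smooth map of \( U \) into \( \mathbf{O}(2) \), so the two coframe fields are equivalent in the above sense: they describe the same \( \mathbf{O}(2) \)-structure, namely the flat metric.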
But do we gain anything besides uniformity from such a reformulation? In fact, we do, for Cartan also developed general techniques for finding complete sets of invariants for \( G \)-structures. Unfortunately, however, carrying out this solution of the Equivalence Problem in complete generality depends on his powerful but difficult theory of Pfaffian systems in involution, with its method of prolongation, a theory not widely known or well understood even today. In fact, while his preeminence as a geometer was clearly recognized towards the end of his career, many great mathematicians confessed to finding Cartan’s work hard going at best, and few mathematicians of his day were able to comprehend fully his more novel and innovative advances. For example, in a review of one of his books (Bull. Amer. Math. Soc. vol. 44, p. 601) H. Weyl made this often quoted admission:
“Cartan is undoubtedly the greatest living master in differential geometry… Nevertheless I must admit that I found the book, like most of Cartan’s papers, hard reading…”
Given this well-known difficulty Cartan had in communicating his more esoteric ideas, one can easily imagine that his important insights on the Equivalence Problem might have lain buried. Fortunately they were spared such a fate.
Recall that Chern had spent his time at Hamburg studying the Cartan–Kähler theory of Pfaffian systems with Kähler, and immediately after Hamburg Chern spent a year in Paris continuing his study of these techniques with Cartan. Clearly Chern was ideally prepared to carry forward the attack on the Equivalence Problem. In a series of beautiful papers over the next twenty years not only did he do just that, but he also explained and reformulated the theory with such clarity and geometric appeal that much (though by no means all!) of the theory has become part of the common world-view of differential geometers, to be found in the standard textbooks on geometry. Those two decades were also, not coincidentally, the years that saw the development of the theory of fiber bundles and of connections on principal \( G \)-bundles. These theories were the result of the combined research efforts of many people and had multiple sources of inspiration both in topology and geometry. One major thread in that development was Chern’s work on the Equivalence Problem and his related research on characteristic classes that grew out of it. In order to discuss this important work of Chern we must first define some of the concepts and notations that he and others introduced.
Using current geometric terminology, a \( G \)-structure for a smooth \( n \)-dimensional manifold \( M \) is a reduction of the structure group of its principal tangent coframe bundle from \( \mathbf{GL}(n, \mathbf{R}) \) to the subgroup \( G \). In particular, the total space of this reduction is a principal \( G \)-bundle, \( P \), over \( M \) consisting of the admissible coframes \[ \theta = (\theta_1, \dots, \theta_n) ,\] and we can identify the \( G \)-structure with this \( P \). There are \( n \) canonically defined 1-forms \( \omega_i \) on \( P \); if \[ \Pi : P\rightarrow M \] is the bundle projection, then the value of \( \omega_i \) at \( \theta \) is \( \Pi^{\ast} (\theta_i ) \). The kernel of \( D\Pi \) is of course the subbundle of the tangent bundle \( T\!P \) of \( P \) tangent to its fibers, and is usually called the vertical subbundle \( V \). Clearly the canonical forms \( \omega_i \) vanish on \( V \). The group \( G \) acts on the right on \( P \), acting simply transitively on each fiber, so we can identify the vertical space \( V_{\theta} \) at any point \( \theta \) with the Lie algebra \( L(G) \) of left-invariant vector fields on \( G \). Now, as Ehresmann first noted, a “connection” in Cartan’s sense for the given \( G \)-structure (or as we now say, a \( G \)-connection for the principal bundle \( P \)) is the same as a “horizontal” subbundle \( H \) of \( T\!P \) complementary to \( V \) and invariant under \( G \). Instead of \( H \) it is equivalent to consider the projection of \( T\!P \) onto \( V \) along \( H \) which, by the above identification of \( V_{\theta} \) with \( L(G) \), is an \( L(G) \)-valued 1-form \( \omega \) on \( P \), called the “connection 1-form”. 
If we denote the right action of \( g \in G \) on \( P \) by \( R_g \), then the invariance of \( H \) under \( G \) translates to the transformation law \[ R^{\ast}_g (\omega) = \operatorname{ad}(g^{-1}) \circ \omega \] for \( \omega \), where \( \operatorname{ad} \) denotes the adjoint representation of \( G \) on \( L(G) \). \( L(G) \)-valued forms on \( P \) transforming in this way are called equivariant. Since \( L(G) \) is a subalgebra of the Lie algebra \( L(\mathbf{GL}(n, \mathbf{R})) \) of \( n{\times}n \) matrices, we can regard \( \omega \) as an \( n{\times}n \) matrix-valued 1-form on \( P \), or equivalently as a matrix \( \omega_{ij} \) of \( n^2 \) real-valued 1-forms on \( P \).
If \( \sigma : [0, 1] \rightarrow M \) is a smooth path in \( M \) from \( p \) to \( q \), then the connection defines a canonical \( G \)-equivariant map \( \pi_{\sigma} \) of the fiber \( P_p \) to the fiber \( P_q \), called parallel translation along \( \sigma \); namely \( \pi_{\sigma} (\theta) =\tilde{\sigma}(1) \), where \( \tilde{\sigma} \) is the unique horizontal lift of \( \sigma \) starting at \( \theta \). In general, parallel translation depends on the path \( \sigma \), not just on the endpoints \( p \) and \( q \). If it depends only on the homotopy class of \( \sigma \) with fixed endpoints, then the connection is called “flat”. It is easy to see that this is so if and only if the horizontal subbundle \( H \) of \( T\!P \) is integrable, and using the Frobenius integrability criterion, this translates to \[ d\omega_{ij} =\sum_k \omega_{ik} \wedge \omega_{kj} .\] Thus it is natural to define the matrix \( \Omega_{ij} \) of so-called curvature forms of the connection (whose vanishing is necessary and sufficient for flatness) by \[ d\omega_{ij}= \sum_k \omega_{ik} \wedge \omega_{kj} - \Omega_{ij} \] or, in matrix notation, \[ d\omega = \omega \wedge \omega - \Omega .\] Since \( \omega \) is equivariant, so is \( \Omega \). Differentiating the defining equation of the curvature forms gives \( d\Omega = d\omega \wedge \omega - \omega \wedge d\omega \), and substituting \( d\omega = \omega \wedge \omega - \Omega \) yields the Bianchi identity, \[ d\Omega = \omega \wedge \Omega - \Omega \wedge \omega .\] A local cross-section \( \theta : U \rightarrow P \) is called an “admissible local coframe” for the \( G \)-structure, and we can use it to pull back the connection forms and curvature forms to forms \( \psi_{ij} \) and \( \Psi_{ij} \) on \( U \). 
Any other admissible coframe field \( \hat{\theta} \) in \( U \) is related to \( \theta \) by a unique “change of gauge”, \( g \) in \( U \) (i.e., a unique map \( g : U \rightarrow G \)) such that \[ \hat{\theta}(x) = R_{g(x)} \theta(x) .\] If we use \( \hat{\theta} \) to also pull back the connection and curvature forms to forms \( \hat{\psi} \) and \( \hat{\Psi} \) on \( U \), then, using matrix notation, it follows easily from the equivariance of \( \omega \) and \( \Omega \) that \[ \hat{\psi} = dg\, g^{-1} + g\psi g^{-1} \quad\text{and}\quad \hat{\Psi} = g\Psi g^{-1} .\]
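As a quick consistency check on these formulas (a sketch only, assuming a coframe field \( \theta \) on \( U \) for which \( \psi = 0 \), an assumption made purely for illustration), take \( \hat{\theta} = R_g \theta \) for an arbitrary gauge \( g : U \rightarrow G \). The transformation law gives \( \hat{\psi} = dg\, g^{-1} \), and since \( d(g^{-1}) = -g^{-1}\, dg\, g^{-1} \), \[ d\hat{\psi} = -dg \wedge d(g^{-1}) = dg\, g^{-1} \wedge dg\, g^{-1} = \hat{\psi} \wedge \hat{\psi} ,\] which is exactly the flatness condition, in agreement with \( \hat{\Psi} = g\, 0\, g^{-1} = 0 \).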
But where do connections fit into the Equivalence Problem? While Cartan’s solution to the equivalence problem for \( G \)-structures was complicated in the general case, it became much simpler for the special case that \( G \) is the trivial subgroup \( \{e\} \). For this reason Cartan had developed a method by which one could sometimes reduce a \( G \)-structure on a manifold \( M \) to an \( \{e\} \)-structure on a new manifold obtained by “adding variables” corresponding to coordinates in the group \( G \). Chern recognized that this new manifold was just the total space \( P \) of the principal \( G \)-bundle, and that Cartan’s reduction method amounted to finding an “intrinsic \( G \)-connection” for \( P \), i.e., one canonically associated to the \( G \)-structure. Indeed the canonical 1-forms \( \omega_i \) together with a linearly independent set of the connection forms \( \omega_{ij} \), defined by the intrinsic connection, give a canonical coframe field for \( P \), which of course is the same as an \( \{e\} \)-structure. Finally, Chern realized that in this setting one could describe geometrically the invariants for a \( G \)-structure given by Cartan’s general method; in fact they can all be calculated from the curvature forms of the intrinsic connection.
Note that this covers one of the most important examples of a \( G \)-structure; namely the case \( G = \mathbf{O}(n) \), corresponding to Riemannian geometry. The intrinsic connection is of course the “Levi-Civita connection”. Moreover, in this case it is also easy to explain how to go on to “solve the form problem”, i.e., to find explicitly a complete set of local invariants for a Riemannian metric. In fact, they can be taken as the components of the Riemann curvature tensor and its covariant derivatives in Riemannian normal coordinates. To see this, note first the obvious fact that there is a local isometry of the Riemannian manifold \( (M, g) \) with \( (M^{\ast}, g^{\ast}) \) carrying the orthonormal frame \( e_i \) at \( p \) to \( e^{\ast}_i \) at \( p^{\ast} \) if and only if in some neighborhood of the origin the components \( g_{ij} (x) \) of the metric tensor of \( M \) with respect to the Riemannian normal coordinates \( x_k \) defined by \( e_i \) are identical to the corresponding components \( g^{\ast}_{ij} (x) \) of the metric tensor of \( M^{\ast} \) with respect to Riemannian normal coordinates defined by \( e^{\ast}_i \). The proof is then completed by using the easy, classical fact ([e1], Appendix II) that each coefficient in the Maclaurin expansion of \( g_{ij} (x) \) can be expressed as a universal polynomial in the components of the Riemann tensor and a finite number of its covariant derivatives.
Let us denote by \( N (G) \) the semidirect product \( G \ltimes \mathbf{R}^n \) of affine transformations of \( \mathbf{R}^n \) generated by \( G \) and the translations. Correspondingly we can “extend” the principal \( G \)-bundle \( P \) of linear frames to the associated principal \( N (G) \)-bundle \( N (P ) \) of affine frames. Chern noted in [12] that the above technique could be expressed more naturally, and could be generalized to a wide class of groups \( G \), if one looked for intrinsic \( N (G) \)-connections on \( N (P ) \). The curvature of an \( N (G) \)-connection on \( N (P ) \) is a two-form \( \Omega \) on \( N (P ) \) with values in the Lie algebra \( L(N (G)) \) of \( N (G) \). Now \( L(N (G)) \) splits canonically as the direct sum of \( L(G) \) and \( L(\mathbf{R}^n) = \mathbf{R}^n \), and \( \Omega \) splits accordingly. The \( \mathbf{R}^n \) valued part, \( \tau \), of \( \Omega \) is called the torsion of the connection, and what Chern exploited was the fact that he could in certain cases define “intrinsic” \( N (G) \) connections by putting conditions on \( \tau \). For example, the Levi-Civita connection can be characterized as the unique \( N (\mathbf{O}(n)) \) connection on \( N (P ) \) such that \( \tau = 0 \). In fact, in [12] Chern showed that if the Lie algebra \( L(G) \) satisfied a certain simple algebraic condition (“property C”) then it was always possible to define an intrinsic \( N (G) \) connection in this way, and he proved that any compact \( G \) satisfies property C. He also pointed out here, from the point of view of Cartan’s theory of pseudogroups, why some \( G \)-structures do not admit intrinsic connections. The pseudogroup of a \( G \)-structure \( \Pi : P \rightarrow M \) is the pseudogroup of local diffeomorphisms of \( M \) whose differential preserves the subbundle \( P \). 
It is elementary that the group of bundle automorphisms of a principal \( G \)-bundle that preserve a given \( G \)-connection is a finite dimensional Lie group and so a fortiori the pseudogroup of a \( G \)-structure with a canonically defined connection will be a Lie group. But there are important examples of groups \( G \) for which the pseudogroup of a \( G \)-structure is of infinite dimension. For example, if \( n = 2m \) and we take \( G = \mathbf{GL}(m, \mathbf{C}) \), then a \( G \)-structure is the same thing as an almost-complex structure, and the group of automorphisms is an infinite pseudogroup.
Chern solved many concrete equivalence problems. In [1] and [4] he carried this out for the path geometry defined by a third order ordinary differential equation. Here the \( G \)-structure is on the contact manifold of unit tangent vectors of \( \mathbf{R}^2 \), and \( G \) is the ten-dimensional group of circle preserving contact transformations. In [2], [3] he generalized this to the path geometry of systems of \( n \)-th order ordinary differential equations. In [8] he considers a generalized projective geometry, i.e., the geometry of a \( (k + 1)(n - k) \)-parameter family of \( k \)-dimensional submanifolds in \( \mathbf{R}^n \), and in [6], [7] the geometry defined by an \( (n - 1) \)-parameter family of hypersurfaces in \( \mathbf{R}^n \). In [22] (jointly with Moser) and in [23] he considers real hypersurfaces in \( \mathbf{C}^n \). This latter research played a fundamental role in the development of the theory of CR manifolds.
Integral geometry
The group \( G \) of rigid motions of \( \mathbf{R}^n \) acts transitively on various spaces \( S \) of geometric objects (e.g., points, lines, affine subspaces of a fixed dimension, spheres of a fixed radius) so that these spaces can be regarded as homogeneous spaces, \( G/H \), and the invariant measure on \( G \) induces an invariant measure on \( S \). This is the so-called “kinematic density”, first introduced by Poincaré, and the basic problem of integral geometry is to express the integrals of various geometrically interesting quantities with respect to the kinematic density in terms of known integral invariants (see [17]). The simplest example is Crofton’s formula for a plane curve \( C \), \[ \int n (\ell \cap C)\, d\ell= 2L (C) \] where \( L(C) \) is its length, \( n(\ell \cap C) \) is the number of its intersection points with a line in the plane, and \( d\ell \) is the kinematic density on lines. We can interpret this formula as saying that the average number of times that a line meets a curve (i.e., is incident with a point on the curve) is equal to twice the length of the curve.
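Crofton's formula is easy to test numerically. The following Python sketch is an illustration only: it assumes the usual parametrization of a line as \( x\cos\varphi + y\sin\varphi = p \) with density \( dp\, d\varphi \), replaces the curve by a closed polygon, and uses a plain midpoint rule; the function name `crofton_integral` and the grid sizes are choices made here, not anything taken from [17].

```python
import math

def crofton_integral(polygon, n_phi=180, n_p=1000):
    """Midpoint-rule estimate of the integral of n(l ∩ C) dl for a closed polygon C."""
    # A line is {(x, y): x*cos(phi) + y*sin(phi) = p}, phi in [0, pi), p real,
    # and the (assumed) kinematic density on lines is dl = dp dphi.
    radius = max(math.hypot(x, y) for x, y in polygon)  # lines with |p| > radius miss C
    d_phi = math.pi / n_phi
    d_p = 2.0 * radius / n_p
    edges = list(zip(polygon, polygon[1:] + polygon[:1]))
    count = 0
    for i in range(n_phi):
        phi = (i + 0.5) * d_phi
        c, s = math.cos(phi), math.sin(phi)
        # Signed heights of both endpoints of every edge along the line normal.
        heights = [(ax * c + ay * s, bx * c + by * s) for (ax, ay), (bx, by) in edges]
        for j in range(n_p):
            p = -radius + (j + 0.5) * d_p
            for ha, hb in heights:
                if (ha - p) * (hb - p) < 0:  # the line separates the two endpoints
                    count += 1
    return count * d_phi * d_p

unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
# The perimeter is 4, so the estimate should be close to 2 * 4 = 8.
print(crofton_integral(unit_square))
```

Lines through a vertex are ignored by the strict inequality; they form a set of measure zero and do not affect the integral.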
In [5], Chern laid down the foundations for a much generalized integral geometry. In [e4], André Weil says of this paper that:
“… it lifted the whole subject at one stroke to a higher plane than where Blaschke’s school had left it, and I was impressed by the unusual talent and depth of understanding that shone through it.”
Chern first extended the classical notion of “incidence” to a pair of elements from two homogeneous spaces \( G/H \) and \( G/K \) of the same group \( G \). Given \( aH \in G/H \) and \( bK \in G/K \), Chern calls them “incident” if \[ aH \cap bK \neq\emptyset .\] This definition plays an important role in the theory of Tits buildings.
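To see that this recovers the classical notion, consider the following illustrative case (the base point \( x_0 \) and base line \( \ell_0 \) are notation introduced here). Let \( G \) be the group of rigid motions of the plane, \( H \) the stabilizer of a point \( x_0 \), and \( K \) the stabilizer of a line \( \ell_0 \) through \( x_0 \), so that \( aH \) corresponds to the point \( a x_0 \) and \( bK \) to the line \( b\ell_0 \). If \( ah = bk \) with \( h \in H \) and \( k \in K \), then \[ a x_0 = a h x_0 = b k x_0 \in b\ell_0 ,\] so the point lies on the line. Conversely, if \( a x_0 \in b\ell_0 \), then \( b^{-1} a x_0 \in \ell_0 \), and since the translations along \( \ell_0 \) belong to \( K \) there is a \( k \in K \) with \( k x_0 = b^{-1} a x_0 \); then \( h = a^{-1} b k \) fixes \( x_0 \), so \( h \in H \) and \( ah = bk \), i.e., \( aH \cap bK \neq \emptyset \).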
In [13] and [17] Chern obtained fundamental kinematic formulas for two submanifolds in \( \mathbf{R}^n \). The integral invariants in Chern’s formula arise naturally in Weyl’s formula for the volume of a tube \( T_{\rho} \) of radius \( \rho \) about a \( k \)-dimensional submanifold \( X \) of \( \mathbf{R}^n \). Setting \( m = n - k \), Weyl’s formula is: \[ V (T_\rho) =\sum_{0\leq i\leq k,\, i \text{ even}} c_i \mu_i (X)\rho^{m+i}. \] Here the \( c_i \) are constants depending on \( m \) and \( i \), and \[ \mu_i(X) = \int_X I_i(\Omega) ,\] where \( I_i \) is a certain adjoint invariant polynomial of degree \( i/2 \) on the Lie algebra of \( \mathbf{O}(n) \), and \( \Omega \) is the curvature form with respect to the induced metric on \( X \). Chern’s formula (also discovered independently by Federer) is: \[ \int \mu_e (M_1 \cap g M_2) \, dg = \sum_{0 \leq i \leq e,\, i \text{ even}} c_i \mu_i (M_1) \mu_{e-i} (M_2), \] where \( M_1 \) and \( M_2 \) are submanifolds of \( \mathbf{R}^n \) of dimensions \( p \) and \( q \) respectively, \( e \) is even, \( 0 \leq e \leq p + q - n \), and the \( c_i \) are constants depending on \( n, p, q, e \). Griffiths made the following comment concerning this paper [e3]:
“Chern’s proof of [this formula] exhibits a number of characteristic features. Of course, one is the use of moving frames…. Another is that the proof proceeds by direct computation rather than by establishing an elaborate, conceptual framework; in fact upon closer inspection there is such a conceptual framework, as described in [5], however, the philosophical basis is not isolated but is left to the reader to understand by seeing how it operates in a nontrivial problem.”
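Returning to Weyl’s tube formula, two elementary cases show how the even-indexed terms arise. They are worked here only as illustrations; the values of the products \( c_i \mu_i \) are read off from the examples and are not taken from [13] or [17]. For a circle \( X \) of radius \( R \) in \( \mathbf{R}^2 \) we have \( k = m = 1 \), and for \( \rho < R \) the tube is an annulus, \[ V(T_\rho) = \pi (R+\rho)^2 - \pi (R-\rho)^2 = 4\pi R\, \rho = 2 L(X)\, \rho ,\] so only the \( i = 0 \) term appears. For a sphere \( X \) of radius \( R \) in \( \mathbf{R}^3 \) we have \( k = 2 \), \( m = 1 \), and \[ V(T_\rho) = \tfrac{4\pi}{3}\bigl((R+\rho)^3 - (R-\rho)^3\bigr) = 8\pi R^2 \rho + \tfrac{8\pi}{3}\rho^3 .\] The \( i = 0 \) term is twice the area of \( X \) times \( \rho \), while the \( i = 2 \) term does not depend on \( R \), as one expects of a multiple of the total Gaussian curvature \( \int_X K \mathit{dA} = 4\pi \).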
Euclidean differential geometry
One of the main topics in classical differential geometry is the study of local invariants of submanifolds in Euclidean space under the group of rigid motions, i.e., the equivalence problem for submanifolds. The solution is classical. In fact, the first and second fundamental forms, \( I \) and \( \mathit{II} \), and the induced connection \( \nabla^\nu \) on the normal bundle of a submanifold satisfy the Gauss, Codazzi and Ricci equations, and they form a complete set of local invariants for submanifolds in \( \mathbf{R}^n \). Explicitly these invariants are as follows:
\( I \) is the induced metric on \( M \),
\( \mathit{II} \) is a quadratic form on \( M \) with values in the normal bundle \( \nu(M ) \) such that, for any unit tangent vector \( u \) and unit normal vector \( v \) at \( p \), \[ {\mathit{II}}_v (u) = \langle{\mathit{II}}(u), v\rangle \] is the curvature at \( p \) of the plane curve \( \sigma \) formed by intersecting \( M \) with the plane spanned by \( u \) and \( v \), and
if \( s \) is a smooth normal field then \( \nabla^{\nu} (s) \) is the orthogonal projection of the differential \( ds \) onto the normal bundle \( \nu(M) \).
\( {\mathit{II}}_v = \langle{\mathit{II}},v\rangle \) is called the second fundamental form in the direction of \( v \). The self-adjoint operator \( A_v \) corresponding to \( {\mathit{II}}_v \) is called the shape operator of \( M \) in the direction \( v \).
Chern’s work in this field involved mainly the relation between the global geometry of submanifolds and these local invariants. He wrote many important papers in the area, but because of space limitations we will concentrate only on the following:
(1) Minimal surfaces
Since the first variation for the area functional for submanifolds of \( \mathbf{R}^n \) is the trace of the second fundamental form, a submanifold \( M \) of \( \mathbf{R}^n \) is called minimal if \[ \operatorname{trace}(\mathit{II}\,) = 0 .\] Let \( \mathbf{Gr}(2, n) \) denote the Grassmann manifold of 2-planes in \( \mathbf{R}^n \). The Gauss map \( G \) of a surface \( M \) in \( \mathbf{R}^n \) is the map from \( M \) to the Grassmann manifold \( \mathbf{Gr}(2, n) \) defined by \[ G(x) = \text{the tangent plane to }M\text{ at }x .\] The Grassmann manifold \( \mathbf{Gr}(2, n) \) can be identified as the hyperquadric \[ z^2_1 + \dots + z^2_n = 0 \] of \( \mathbf{{CP}}^{n-1} \) (via the map that sends a 2-plane \( V \) of \( \mathbf{R}^n \) to the complex line spanned by \( e_1 + ie_2 \), where \( (e_1 , e_2 ) \) is an orthonormal base for \( V \)). Thus \( \mathbf{Gr}(2, n) \) has a complex structure. On the other hand, an oriented surface in \( \mathbf{R}^n \) has a conformal and hence complex structure through its induced Riemannian metric. In [16], Chern proved that an immersed surface in \( \mathbf{R}^n \) is minimal if and only if the Gauss map is antiholomorphic. This theorem was proved by Pinl for \( n = 4 \) and is the starting point for relating minimal surfaces with the value distribution theory of Nevanlinna, Weyl, and Ahlfors. One of the fundamental results of minimal surface theory is the Bernstein uniqueness theorem, which says that a minimal graph \( z = f (x, y) \) in \( \mathbf{R}^3 \), defined for all \( (x, y) \in \mathbf{R}^2 \), must be a plane. Note that the image of the Gauss map of an entire graph lies in a hemisphere. Bernstein’s theorem as generalized by Osserman says that if the image of the Gauss map of a complete minimal surface of \( \mathbf{R}^3 \) is not dense, then the minimal surface is a plane. In [16], using a classical theorem of E. Borel, Chern generalized the Bernstein–Osserman theorem to a density theorem on the image of the Gauss map of a complete minimal surface in \( \mathbf{R}^n \), that is not a plane. More refined density theorems were established in [18], a joint paper with Osserman.
Motivated by Calabi’s work on minimal 2-spheres in \( \mathbf{S}^n \), Chern developed in [20] a general formalism for osculating spaces for submanifolds. He proved that given a minimal surface in a space form there is an integer \( m \) such that the osculating spaces of order \( m \) are parallel along the surface, and gave a complete system of local invariants, with their relations. As a consequence, he proved an analogue of Calabi’s theorem: if a minimal sphere of constant Gaussian curvature \( K \) in a space form of constant sectional curvature \( c \) is not totally geodesic, then \[ K = \frac{2c}{m(m + 1)} .\]
(2) Tight and taut immersions
We first recall a theorem of Fenchel, proved in 1929: if \( \alpha(s) \) is a simple closed curve in \( \mathbf{R}^3 \), parametrized by its arc length, and \( k(s) \) is its curvature function, then \[ \int |k(s)|\,ds \geq 2\pi ,\] and equality holds if and only if \( \alpha \) is a convex plane curve. Fary and Milnor proved that if \( \alpha \) is knotted then this integral must be greater than \( 4\pi \).
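Fenchel's inequality can be observed numerically on plane curves, regarded as curves in \( \mathbf{R}^3 \). The Python sketch below is only an illustration: it approximates \( \int |k(s)|\,ds \) by the sum of the absolute exterior angles of an inscribed polygon, and the two sample curves (an ellipse and a three-lobed polar curve \( r = 1 + \tfrac{1}{2}\cos 3t \)) as well as all function names are choices made here.

```python
import math

def total_absolute_curvature(points):
    """Sum of |exterior angles| of a closed polygon, a discrete stand-in for ∫|k| ds."""
    n = len(points)
    total = 0.0
    for i in range(n):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[(i + 1) % n]
        ux, uy = x1 - x0, y1 - y0  # incoming edge at the vertex points[i]
        vx, vy = x2 - x1, y2 - y1  # outgoing edge at the vertex points[i]
        # Signed turning angle from the incoming edge to the outgoing edge.
        total += abs(math.atan2(ux * vy - uy * vx, ux * vx + uy * vy))
    return total

def sample(curve, n=2000):
    """Sample a closed curve t -> (x, y), t in [0, 2*pi), at n equally spaced parameters."""
    return [curve(2.0 * math.pi * i / n) for i in range(n)]

def ellipse_point(t):
    return (2.0 * math.cos(t), math.sin(t))

def lobed_point(t):
    r = 1.0 + 0.5 * math.cos(3.0 * t)  # r > 0, so the curve is simple, but not convex
    return (r * math.cos(t), r * math.sin(t))

ellipse = sample(ellipse_point)
lobed = sample(lobed_point)
print(total_absolute_curvature(ellipse) / (2.0 * math.pi))  # convex: ratio equals 1 up to rounding
print(total_absolute_curvature(lobed) / (2.0 * math.pi))    # non-convex: ratio exceeds 1
```

For the convex curve the inscribed polygon is itself convex, so its exterior angles sum to \( 2\pi \) up to rounding; the concave arcs of the lobed curve contribute additional turning, in line with the equality statement of the theorem.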
In [14] and [15], Chern and Lashof generalized these results to submanifolds of \( \mathbf{R}^n \). Let \( M^m \) be a compact \( m \)-dimensional manifold, \[ f : M \rightarrow \mathbf{R}^n \] an immersion, \( \nu^1 (M) \) the unit normal sphere bundle of \( M \), and \( dv \) the natural volume element of \( \nu^1 (M ) \). Let \[ N : \nu^1 (M ) \rightarrow \mathbf{S}^{n-1} \] denote the normal map, i.e., \( N \) maps the unit normal vector \( v \) at \( x \) to the parallel unit vector at the origin. Let \( da \) denote the volume element of \( \mathbf{S}^{n-1} \). Then the Lipschitz–Killing curvature \( G \) on \( \nu^1(M ) \) is defined by the equation \[ N^{\ast} (da) = G\, dv ,\] i.e., \( G(v) \) is the absolute value of the determinant of the shape operator \( A_v \) of \( M \) along the unit normal direction \( v \). The absolute total curvature \( \tau (M, f ) \) of the immersion \( f \) is the normalized volume of the image of \( N \), \[ \tau (M, f ) =\frac{1}{c_{n-1}}\,\int_{\nu^1(M)} |\operatorname{det}(A_v)|\,dv, \] where \( c_{n-1} \) is the volume of the unit \( (n - 1) \)-sphere. In [14] Chern and Lashof generalized Fenchel’s theorem by showing that \[ \tau(M, f ) \geq 2 ,\] with equality if and only if \( M \) is a convex hypersurface of an \( (m + 1) \)-dimensional affine subspace \( V \). In [15] they obtained the sharper result that \[ \tau (M, f ) \geq \sum\beta_i (M ), \] where \( \beta_i (M ) \) is the \( i \)-th Betti number of \( M \).
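The equality case is easily checked for the round sphere (a worked illustration, not taken from [14] or [15]). Let \( M \) be the sphere of radius \( R \) in \( \mathbf{R}^{m+1} \), so that \( n = m + 1 \). The unit normal bundle \( \nu^1(M) \) consists of two copies of \( M \) (the outward and the inward unit normals), the shape operator is \( A_v = \pm R^{-1}\, \mathrm{Id} \), and the volume of \( M \) is \( c_m R^m \). Hence \[ \tau(M, f) = \frac{1}{c_m} \cdot 2 \cdot c_m R^m \cdot R^{-m} = 2 = \beta_0(M) + \beta_m(M) ,\] so both Chern–Lashof inequalities are equalities in this case.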
An immersion \( f : M \rightarrow \mathbf{R}^n \) is called tight if \( \tau (M, f ) \) is equal to the infimum, \( \tau (M ) \), of the absolute total curvature among all the immersions of M into Euclidean spaces of arbitrary dimensions. The study of absolute total curvature and tight immersion has become an important field in submanifold geometry that has seen many interesting developments in recent years. An important step in this development is Kuiper’s reformulation of tightness in terms of critical point theory. He showed that for a given compact manifold \( M \), \( \tau (M ) \) is the Morse number \( \gamma \) of \( M \), i.e., the minimum number of critical points a nondegenerate Morse function must have. Moreover, an immersion of \( M \) is tight if and only if every nondegenerate height function has exactly \( \tau (M ) = \gamma \) critical points. Another development is the concept of taut immersion introduced by Banchoff and Carter–West. An immersion of \( M \) into \( \mathbf{R}^n \) is called taut if every nondegenerate Euclidean distance function from a fixed point in \( \mathbf{R}^n \) to the submanifold has exactly \( \gamma \) critical points. Taut implies tight, and moreover a taut immersion is an embedding. Tautness is invariant under conformal transformations, hence using stereographic projection we may assume taut submanifolds lie in the sphere. Pinkall proved that the tube \( M_\epsilon \) of radius \( \epsilon \) around a submanifold \( M \) in \( \mathbf{R}^n \) is a taut hypersurface if and only if \( M \) is a taut submanifold. In particular, this gives two facts: one is that the parallel hypersurface of a taut hypersurface in \( \mathbf{S}^n \) is again taut, another is that to understand taut submanifolds it suffices to understand taut hypersurfaces. 
Since the Lie sphere group (the group of contact transformations carrying spheres to spheres) is generated by conformal transformations and parallel translations, tautness is invariant under the Lie sphere group. Note also that the \( \epsilon \)-tube \( M_\epsilon \) of a submanifold \( M \) in \( \mathbf{S}^n \) is an immersed Legendre submanifold of the contact manifold of the unit tangent sphere bundle of \( \mathbf{S}^n \). Thus tautness really should be defined for Legendre submanifolds of the contact manifold of the unit tangent sphere bundle of \( \mathbf{S}^n \). Chern and Cecil make this concept precise in [26] and lay some of the basic differential geometric groundwork for Lie sphere geometry. There are many interesting examples of tight and taut submanifolds and many interesting theorems concerning them. But some of the most basic questions are still unanswered; for example there are no good necessary and sufficient conditions known for a compact manifold to be immersed in Euclidean space as a tight or taut submanifold, and a complete set of local invariants for Lie sphere geometry is yet to be found.
The generalized Gauss–Bonnet theorem
Geometers tend to make a sharp distinction between “local” and “global” questions, and it is common not only to regard global problems as somehow more important, but even to consider local theory “old-fashioned” and unworthy of serious effort. Chern however has always maintained that research on these seemingly polar aspects of geometry must of necessity go hand-in-hand; he felt that one could not hope to attack the global theory of a geometric structure until one understood its local theory (i.e., the equivalence problem), and moreover, once one had discovered the local invariants of a theory, one was well on the way towards finding its global invariants as well! We shall next explain how Chern came to this contrary attitude, for it is an interesting and revealing story, involving the most exciting and important events of his research career: his discovery of an “intrinsic proof” of the Generalized Gauss–Bonnet Theorem and, flowing out of that, his solution of the characteristic class problem for complex vector bundles by his striking and elegant construction of what are now called “Chern classes” from his favorite raw material, the curvature forms of a connection. The Gauss–Bonnet Theorem for a closed, two-dimensional Riemannian manifold \( M \) was surely one of the high points of classical geometry, and it was generally recognized that generalizing it to higher dimensional Riemannian manifolds was a central problem of global differential geometry. The theorem states that the most basic topological invariant of \( M \), its Euler characteristic \( \chi (M ) \), can be expressed as \( 1/2\pi \) times the integral over M of its most basic geometric invariant, the Gaussian curvature function \( K \). Although there were many published proofs of this, Chern reproved it for himself by a new method that was very natural from a moving frames perspective. Moreover, unlike the published proofs, Chern’s had the potential to generalize to higher dimensions.
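The simplest check of the classical theorem is the round sphere of radius \( R \), for which \( K = 1/R^2 \) and the area is \( 4\pi R^2 \): \[ \frac{1}{2\pi}\int_M K \mathit{dA} = \frac{1}{2\pi}\cdot\frac{1}{R^2}\cdot 4\pi R^2 = 2 = \chi(\mathbf{S}^2) ,\] independently of \( R \). Similarly a flat torus has \( K = 0 \) and Euler characteristic zero.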
To explain Chern’s method, we start by applying the standard moving frames approach to n-dimensional oriented Riemannian manifolds \( M \), then specialize to \( n = 2 \). The orientation together with the Riemannian structure give an \( \mathbf{SO}(n) \) structure for \( M \). Since the Lie algebra \( L(\mathbf{SO}(n)) \) is just the skew-adjoint \( n{\times}n \) matrices, in the principal \( \mathbf{SO}(n) \) bundle \( F (M ) \) of oriented orthonormal frames of \( M \), in addition to the \( n \) canonical 1-forms \( (\omega_{i}) \), we will have the connection 1-forms for the Levi-Civita connection, a skew-adjoint \( n {\times} n \) matrix of 1-forms \( \omega_{ij} \), characterized uniquely by the zero torsion condition, \[ d\omega_i =\sum_j \omega_{ij} \wedge \omega_j .\] The components \( R_{ijkl} \) of the Riemann curvature tensor in the frame \( \omega_i \) are determined from the curvature forms \( \Omega_{ij} \) by \[ \Omega_{ij} = \frac{1}{2}\sum_{kl} R_{ijkl} \omega_k \wedge \omega_l \] (plus the condition of being skew-symmetric in \( (i, j) \) and in \( (k, l) \)).
When \( n = 2 \), the Lie algebra \( L(\mathbf{SO}(n)) \) is 1-dimensional; \( \omega_{11} = \omega_{22} = 0 \) and \( \omega_{21} = -\omega_{12} \), so there is only one independent \( \omega_{ij} \), namely \( \omega_{12} \), and so only one curvature equation, \[ d\omega_{12} = -\Omega_{12} = -R_{1212} \,\omega_1 \wedge \omega_2 .\] Now it is easily seen that \( R_{1212} \) is a constant on every fiber \( \Pi^{-1} (x) \), and its value is in fact the Gaussian curvature \( K(x) \). We can identify the area 2-form, \( \mathit{dA} \), on \( M \) with \( -\theta_1 \wedge \theta_2 \), where \( (\theta_1 , \theta_2 ) \) is any oriented orthonormal frame, so that \[ \Pi^{\ast}(\mathit{dA}) = -\omega_1 \wedge \omega_2 .\] Thus we can rewrite the above curvature equation as a formula for the pull-back of the Gauss–Bonnet integrand, \( K \mathit{dA} \), to \( F (M ) \): \begin{equation} \label{ast} \Pi^\ast (K \mathit{dA}) = d\omega_{12}. \tag{*} \end{equation} In [25] Chern remarks that, along with zero torsion equations, the formula \( (\text{*}) \) contains
“… all the information on local Riemannian geometry in two dimensions [and] gives global consequences as well. A little meditation convinces one that \( (\text{*}) \) must be the formal basis of the Gauss–Bonnet formula, and this is indeed the case. It turns out that the proof of the n-dimensional Gauss–Bonnet formula can be based on this idea….”
Chern noticed a remarkable property of \( (\text{*}) \). Since the Gauss–Bonnet integrand is a 2-form on a 2-dimensional manifold, it is automatically closed, and hence its pull-back under \( \Pi^{\ast} \) must also be closed. But (except when \( M \) is a torus) \( K \mathit{dA} \) is never exact, so we do not expect its pull back to be exact. Nevertheless, \( (\text{*}) \) says that it is! This phenomenon of a closed but nonexact form on the base of a fiber bundle becoming exact when pulled up to the total space is called transgression. As we shall see, it plays a key rôle in Chern’s proof.
By elementary topology, in the complement \( M^{\prime} \) of any point \( p \) of a closed Riemannian manifold \( M \) one can always define a smooth vector field \( e_1 \) of unit length, and the index of this vector field at \( p \) is \( \chi(M ) \). We will now see how this well-known characterization of the Euler characteristic together with the transgression formula \( (\text{*}) \) leads quickly to Chern’s proof of the Gauss–Bonnet theorem for two-dimensional \( M \). Let \( e_2 \) denote the unit length vector field in \( M^{\prime} \) making \( (e_1 , e_2 ) \) an oriented frame, and let \( \theta \) denote the dual coframe field in \( M^{\prime} \). Since \( \Pi \) composed with \( \theta \) is the identity map of \( M^{\prime} \), we have \[ d(\theta^{\ast}(\omega_{12} )) = \theta^\ast (d\omega_{12} ) = K \mathit{dA} \] in \( M^{\prime} \), so \[ \int_M K \mathit{dA} = \int_{M^{\prime}} K \mathit{dA} =\int_{M^{\prime}} d(\theta^{\ast} (\omega_{12} )) .\] If we write \( M_\epsilon \) for the complement of the open \( \epsilon \)-ball about \( p \), then \[ \int_{M^{\prime}}= \lim_{\epsilon\rightarrow 0}\int_{M_\epsilon} ,\] and by Stokes’ Theorem, \[ \int_M K \mathit{dA} = \lim_{\epsilon \rightarrow 0} \int_{S_{\epsilon}} \theta^\ast (\omega_{12} ) ,\] where \( S_\epsilon = \partial M_\epsilon \) is the distance sphere of radius \( \epsilon \) about \( p \). The proof will be complete if we can identify the right hand side of the latter equation with \( 2\pi \) times the index of \( e_1 \) at \( p \).
Choose Riemannian normal coordinates in a neighborhood \( U \) of \( p \) and let \( (\hat{e}_1 , \hat{e}_2 ) \) denote the local frame field in \( U \) defined by orthonormalizing the corresponding coordinate basis vectors, and \( \hat{\theta} \) the dual coframe field. If \( \alpha(x) \) denotes the angle between \( e_1(x) \) and \( \hat{e}_1 (x) \), then we recall that the standard expression for the index or winding number of \( e_1 \) with respect to \( p \) is \[ \frac{1}{2\pi}\int_C d\alpha \] where \( C \) is a small simple closed curve surrounding \( p \); so we will be done if we can show that the right hand side above is equal to \[ \int_{S_\epsilon} d\alpha .\]
Let \( \rho(\alpha) \in \mathbf{SO}(2) \) denote rotation through an angle \( \alpha \). The gauge transformation \( g : U \rightarrow \mathbf{SO}(2) \) from the coframe \( \hat{\theta} \) to the coframe \( \theta \) is just \( g(x) = \rho(\alpha(x)) \), so by the transformation law for pull-backs of connection forms noted above, \[ \theta^\ast (\omega_{12} ) = d\alpha + \hat{\theta}^\ast (\omega_{12} ) .\] Thus \( \int_{S_\epsilon} \theta^\ast (\omega_{12} ) \) can be written as the sum of two terms. The first is the desired \( \int_{S_\epsilon} d\alpha \), and the second term, \[ \int_{S_\epsilon} \hat{\theta}^\ast (\omega_{12} ) \] clearly tends to zero with \( \epsilon \) since the integrand is continuous at \( p \), while the length of \( S_\epsilon \) tends to zero.
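The \( d\alpha \) term can be verified by a direct computation. As a sketch, assume the matrix convention \[ \rho(\alpha) = \begin{pmatrix} \cos\alpha & \sin\alpha \\ -\sin\alpha & \cos\alpha \end{pmatrix} \] (this choice is an assumption made here; with the transposed convention the sign of \( d\alpha \) is reversed). Then for \( g = \rho(\alpha) \), \[ dg\, g^{-1} = d\alpha \begin{pmatrix} -\sin\alpha & \cos\alpha \\ -\cos\alpha & -\sin\alpha \end{pmatrix} \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} = d\alpha \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} ,\] whose \( (1,2) \) entry is \( d\alpha \). Since \( \mathbf{SO}(2) \) is abelian, conjugation by \( g \) leaves the connection matrix unchanged, and the \( (1,2) \) entry of the transformation law reduces to the formula for \( \theta^\ast (\omega_{12}) \) used in the proof.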
We now return to the case of a general n-dimensional oriented Riemannian manifold \( M \) and develop some machinery we will need to explain the remarkable results that grew out of this approach to the two-dimensional Gauss–Bonnet Theorem.
A basic problem is how to construct differential forms on \( M \) canonically from the metric. Up in the coframe bundle, \( F (M ) \), there is an easy way to construct differential forms naturally from the metric — simply take “polynomials” in the curvature forms \( \Omega_{ij} \). Certain forms \( \Lambda \) constructed this way will “define” a form \( \lambda \) on \( M \) by the relation \[ \Lambda = \Pi^\ast \lambda ,\] and these are the forms we are after.
To make this precise we consider the ring \( \mathcal{R} \) of polynomials with real (or complex) coefficients in \( n(n - 1)/2 \) variables \( \{X_{ij}\} \), \( 1 \leq i < j \leq n \). We use matrix notation; \( X \) denotes the \( n {\times} n \) matrix \( X_{ij} \) of elements of \( \mathcal{R} \), where \( X_{ji} = -X_{ij} \) for \( i < j \), and \( X_{ii} = 0 \). For \( g \in \mathbf{SO}(n) \), \[ \operatorname{ad}(g)X = gXg^{-1} \] is the matrix \[ \sum_{k,l} g_{ik} X_{kl} g_{jl} \] of elements of \( \mathcal{R} \). If for \( g \) in \( \mathbf{SO}(n) \) and \( P \) in \( \mathcal{R} \) we define \( \operatorname{ad}(g)P \) in \( \mathcal{R} \) by \[ \bigl(\operatorname{ad}(g)P \bigr)(X) = P \bigl(\operatorname{ad}(g)X\bigr) ,\] this defines an “adjoint” action of \( \mathbf{SO}(n) \) on \( \mathcal{R} \) (by ring automorphisms). The subring of “ad-invariant” elements of \( \mathcal{R} \) is denoted by \( \mathcal{R}^{\operatorname{ad}} \). For future reference we note that we can also regard \( X \) as representing the general \( n {\times} n \) skew-symmetric matrix, i.e., the general element of the Lie algebra \( L(\mathbf{SO}(n)) \), and \( \mathcal{R} \) is just the ring of polynomial functions on \( L(\mathbf{SO}(n)) \).
The curvature 2-forms \( \Omega_{ij} \), being of even degree, commute with each other under exterior multiplication, so we can substitute them in elements \( P \) of \( \mathcal{R} \); if \( P (X) \) is homogeneous of degree \( d \) in the \( X_{ij} \), then \( P (\Omega) \) will be a differential \( 2d \)-form on \( F (M ) \).
Now let \( \theta \) be a local orthonormal coframe field in an open set \( U \) of \( M \), i.e., a local section \( \theta : U \rightarrow F (M ) \), and let \( \Psi = \theta^\ast (\Omega ) \) denote the matrix of pulled back curvature forms in \( U \). Since \( \theta^\ast \) is a Grassmann algebra homomorphism, for any \( P \) in \( \mathcal{R} \), \[ \theta^\ast \bigl(P (\Omega)\bigr) = P (\Psi) .\] In particular for any \( x \) in \( U \) we have \[ \theta^\ast \bigl(P (\Omega )\bigr)_x = P (\Psi_x ) .\] If \( \hat{\Psi} \) is the matrix of curvature forms in \( U \) corresponding to some other local coframe field, \( \hat{\theta} \) in \( U \), and \( g : U \rightarrow \mathbf{SO}(n) \) is the change of gauge mapping \( \theta \) to \( \hat{\theta} \), then as noted \[ \hat{\Psi}_x = \operatorname{ad}(g(x))\Psi_x ,\] so we find \[ P (\hat{\Psi}_x ) = \bigl(\operatorname{ad}(g(x))P\bigr)(\Psi_x ) .\] Thus in general the pulled back form \( P (\Psi) \) depends on the choice of \( \theta \) and is only defined locally, in \( U \). However if (and only if) \( P \) is in the subring \( \mathcal{R}^{\operatorname{ad}} \) of \( \operatorname{ad} \)-invariant polynomials, the form \( P (\Psi) \) is a globally well-defined form on \( M \), independent of the choice of local frame fields \( \theta \) used to pull back the locally defined curvature matrices \( \Psi \). In this case it is clear that \[ \Pi^\ast \bigl(P (\Psi)\bigr) = P (\Omega ) ,\] a relation that uniquely determines \( P (\Psi) \).
There are many ways one might attempt to generalize the Gauss–Bonnet Theorem for surfaces, but perhaps the most obvious and natural is to associate with every compact, oriented, \( n \)-dimensional Riemannian manifold without boundary, \( M \), an \( n \)-form \( \lambda \) on \( M \) that is canonically defined from the metric, and has the property that \[ \int_M \lambda = c_n \chi(M ) ,\] where \( c_n \) is some universal constant. If \( n \) is odd then Poincaré duality implies that \( \chi(M ) = 0 \) when \( M \) is without boundary, and since we will only consider the closed case here, we will assume \( n = 2k \). (On the other hand, for odd-dimensional manifolds with boundary, the Gauss–Bonnet Theorem is interesting and decidedly nontrivial!). From the above discussion it is clear that we should define \( \lambda = P (\Psi) \), where \( P \) is an ad-invariant polynomial, homogeneous of degree \( k \) in the \( X_{ij} \). In fact there is an obvious candidate for \( P \): the classical Pfaffian, \( \operatorname{Pf} \), uniquely determined (up to sign) by the condition that \[ \operatorname{Pf}(X)^2 = \det(X) \] (cf. [e2], p. 309).
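The two properties of the Pfaffian used here can be checked by hand in the first interesting case, \( n = 4 \). The following is only a sketch: it assumes the standard closed formula \( \operatorname{Pf}(X) = X_{12}X_{34} - X_{13}X_{24} + X_{14}X_{23} \) for \( 4 {\times} 4 \) skew-symmetric matrices, and the particular matrices `X`, `g`, and `reflection` are illustrative choices, not taken from the text. It verifies in exact arithmetic that \( \operatorname{Pf}(X)^2 = \det(X) \), that \( \operatorname{Pf} \) is unchanged by \( \operatorname{ad}(g) \) for a rotation \( g \), and that it changes sign under a reflection (which is why the invariance is under \( \mathbf{SO}(n) \) rather than \( \mathbf{O}(n) \)).

```python
from fractions import Fraction
from itertools import permutations


def det(m):
    """Determinant via the Leibniz permutation expansion (exact for ints/Fractions)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term = term * m[i][perm[i]]
        total += term
    return total


def pf4(x):
    """Pfaffian of a 4x4 skew-symmetric matrix (standard closed formula)."""
    return x[0][1] * x[2][3] - x[0][2] * x[1][3] + x[0][3] * x[1][2]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def conjugate(g, x):
    """ad(g)X = g X g^{-1}; for orthogonal g the inverse is the transpose."""
    return matmul(matmul(g, x), transpose(g))


def plane_rotation(i, j, c, s):
    """Rotation in the (i, j) coordinate plane with cosine c and sine s."""
    g = [[Fraction(int(r == q)) for q in range(4)] for r in range(4)]
    g[i][i], g[i][j], g[j][i], g[j][j] = c, -s, s, c
    return g


# Illustrative skew-symmetric matrix (hypothetical data).
X = [[0, 1, 2, 3],
     [-1, 0, 4, 5],
     [-2, -4, 0, 6],
     [-3, -5, -6, 0]]

# A rational element of SO(4), built from two Pythagorean rotations.
g = matmul(plane_rotation(0, 2, Fraction(3, 5), Fraction(4, 5)),
           plane_rotation(1, 3, Fraction(5, 13), Fraction(12, 13)))

# An element of O(4) with determinant -1.
reflection = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

print("Pf(X) =", pf4(X), " det(X) =", det(X))
print("Pf(ad(g)X) =", pf4(conjugate(g, X)))
print("Pf(ad(reflection)X) =", pf4(conjugate(reflection, X)))
```

Exact rational arithmetic is used so that the comparisons are equalities rather than floating-point approximations.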
A Generalized Gauss–Bonnet Theorem had already been proved in two papers, one by Allendoerfer and the other by Fenchel. Both proofs were “extrinsic” — they assumed \( M \) could be isometrically embedded in some Euclidean space. (A paper of Allendoerfer and Weil implied that the existence of local isometric embeddings was enough, thereby settling the case of analytic metrics). These earlier proofs wrote the Generalized Gauss–Bonnet integrand as the volume element times a scalar that was a complicated polynomial in the components of the Riemann tensor. In [9] Chern for the first time wrote the integrand as the Pfaffian of the curvature forms and then provided a simple and elegant intrinsic proof of the theorem along the lines of the above proof for surfaces.
Let \( \mathbf{S}(M ) \) denote the bundle of unit vectors of the tangent bundle to \( M \), and \( \pi : \mathbf{S}(M ) \rightarrow M \) the natural projection. Given a coframe \( \theta \) in \( F (M ) \) let \( e_1 (\theta) \) denote the first element of the frame dual to \( \theta \). Then \( e_1 : F (M ) \rightarrow \mathbf{S}(M ) \) is a fiber bundle and clearly \( \Pi : F (M ) \rightarrow M \) factors as \[ \Pi = \pi \circ e_1 .\] Let \( \lambda \) be the \( n \)-form \( \operatorname{Pf}(\Psi) \) on \( M \), and \( \Lambda = \pi^\ast (\lambda) \) its pull-back to \( \mathbf{S}(M ) \). In [9] Chern first proves a transgression lemma for \( \Lambda \), i.e., he explicitly finds an \( (n - 1) \)-form \( \Theta \) on \( \mathbf{S}(M ) \) satisfying \( d\Theta = \Lambda \). As in two dimensions let \( M^{\prime} \) be the complement of some point \( p \) in \( M \) and construct a smooth cross-section \( \xi \) of \( \mathbf{S}(M ) \) over \( M^{\prime} \). Then \( \pi\circ \xi \) is the identity map of \( M^{\prime} \), so just as in the two dimensional argument we find \[ d(\xi^\ast (\Theta)) = \lambda ,\] and \[ \int_M \lambda = \lim_{\epsilon \rightarrow 0}\int_{S_\epsilon} \xi^\ast (\Theta) .\] Finally, the construction of \( \Theta \) is so explicit that Chern is able to evaluate the right hand side by an argument similar to the one in the surface case, and he finds that it is indeed a universal constant times the Euler characteristic of \( M \).
Mathematicians in general value proofs of new facts much more highly than elegant new proofs of old results. It is worth commenting why [9] is an exception to this rule. The earlier proofs of the Generalized Gauss–Bonnet Theorem were virtually a dead end while, as we shall see below, Chern’s intrinsic proof was a key that opened the door to the secrets of characteristic classes.
Characteristic classes
The coframe bundle, \( F (M ) \), that keeps reappearing in our story, is an important example of a mathematical structure known as a principal \( G \)-bundle. These were first defined and their study begun only in the late 1930s, but their importance was quickly recognized by topologists and geometers, and the theory underwent intensive development during the 1940s. By the end of that decade the beautiful classification theory had been worked out, and with it the related theory of “characteristic classes”, a concept whose importance for the mathematics of the latter half of the twentieth century it would be difficult to exaggerate. (As we will see below, in the language we have been using, the classification problem is the equivalence problem for principal bundles, and characteristic classes are invariants for this equivalence problem).
In order to explain Chern’s role in these important developments we will first review some of the basic mathematical background of the theory.
We will consider only the case of a Lie group \( G \). Since the theory is essentially the same for a Lie group and one of its maximal compact subgroups, we will also assume that \( G \) is compact. A “space” will mean a paracompact topological space, and a \( G \)-space will mean a space, \( P \), together with a continuous right action of \( G \) on \( P \). We will write \( R_g \) for the homeomorphism \( p \mapsto pg \). The \( G \)-space \( P \) is called a principal \( G \)-bundle if the action is free, i.e., if for all \( p \) in \( P \), \( R_g (p) \neq p \) unless \( g \) is the identity element \( e \) of \( G \). More specifically, \( P \) is called a principal \( G \)-bundle over a space \( X \) if we are given some fixed homeomorphism of \( X \) with the orbit space \( P/G \), or equivalently if there is given a “projection map” \[ \Pi : P \rightarrow X \] such that the \( G \) orbits of \( P \) are exactly the “fibers” \( \Pi^{-1} (x) \) of the map \( \Pi \). \( P \) is called the total space of the bundle, and we often denote the bundle by the same symbol as the total space. A map \( \sigma : X \rightarrow P \) that is a right inverse to \( \Pi \), i.e., that satisfies \( \Pi \circ \sigma = \operatorname{id}_X \), is called a section. Two \( G \)-bundles over \( X \), \( \Pi_i : P_i \rightarrow X \), \( i = 1, 2 \) are considered “equivalent” if there is a \( G \)-equivariant homeomorphism \[ \varphi : P_1 \rightarrow P_2 \quad\text{such that}\quad \Pi_1 = \Pi_2 \circ \varphi .\] The principal \( G \)-bundle over \( X \) defined by \( P = X \times G \) with \( R_g (x, \gamma) = (x, \gamma g) \) and \( \Pi(x, \gamma) = x \) is called the product bundle, and any bundle equivalent to the product bundle is called a trivial bundle. Clearly \( x \mapsto (x, e) \) is a section of the product bundle, so any trivial bundle has a section.
Conversely, if \( \Pi : P \rightarrow X \) has a section \( \sigma \), then \( \varphi(x, g) = R_g (\sigma(x)) \) is an equivalence of the product bundle with \( P \), i.e., a principal \( G \)-bundle is trivial if and only if it admits a section. We will denote the set of equivalence classes \( [P] \) of principal \( G \)-bundles \( P \) over \( X \) by \( \operatorname{Bndl}_G (X) \).
Given a principal \( G \)-bundle \( \Pi : P \rightarrow X \) and a continuous map \( f : Y \rightarrow X \), we can define a bundle \( f^\ast (P ) \) over \( Y \), called the bundle induced from \( P \) by the map \( f \). Its total space is \[ \bigl\{(p, y) \in P \times Y \bigm| \Pi(p) = f (y)\bigr\} ,\] with the projection \( (p, y) \mapsto y \) and the \( G \)-action \[ R_g (p, y) = \bigl(R_g (p), y\bigr) .\] It is easy to see that \( f^\ast \) maps equivalent bundles to equivalent bundles, so it induces a map (also denoted by \( f^\ast \)) from \( \operatorname{Bndl}_G (X) \) to \( \operatorname{Bndl}_G (Y ) \). If \( \Pi : P \rightarrow X \) is a principal \( G \)-bundle then \( \Pi^\ast (P ) \) is a principal \( G \)-bundle over the total space \( P \), called the “square” of the original bundle. In fact this bundle is always trivial, since it admits the “diagonal” section \( p \mapsto (p, p) \). As we will see below, this simple observation is the secret behind transgression!
The first nontrivial fact in the theory is the so-called “covering homotopy theorem”; it says that the induced map \[ f^\ast : \operatorname{Bndl}_G (X) \rightarrow \operatorname{Bndl}_G (Y ) \] depends only on the homotopy class \( [f ] \) of \( f \). We can paraphrase this by saying that \( \operatorname{Bndl}_G (\,\cdot\,) \) is a contravariant functor from the category of spaces and homotopy classes of maps to the category of sets. Now a cohomology theory is also such a functor, and a characteristic class for \( G \)-bundles can be defined as simply a natural transformation from \( \operatorname{Bndl}_G (\,\cdot\,) \) to some cohomology theory \( H^\ast (\,\cdot\,) \). Of course this fancy language isn’t essential and was only invented about the same time as bundle theory. It just says that a characteristic class \( c \) is a function that assigns to each principal \( G \)-bundle \( P \) over any space \( X \) an element \( c(P ) \) in \( H^\ast (X) \), with the “naturality” property that \[ c\bigl(f^\ast (P )\bigr) = f^\ast \bigl(c(P )\bigr) ,\] for any continuous \( f : Y \rightarrow X \). We fix some cohomology theory \( H^\ast (\,\cdot\,) \) and denote by \( \operatorname{Char}(G) \) the set of all characteristic classes for \( G \)-bundles. Since \( H^\ast (X) \) has the structure of a ring with unit, so does \( \operatorname{Char}(G) \), and the characteristic class problem for \( G \) is the problem of explicitly identifying this ring. Note that a trivial bundle is induced from a map to a space with one point, so all its characteristic classes (except the unit class) must be zero. More generally, equality of all characteristic classes of two bundles is a necessary (and in some circumstances sufficient) test for their equivalence, and this is one of the important uses of characteristic classes.
The remarkable and beautiful classification theorem for principal \( G \)-bundles “solves” the classification problem at least in the sense of reducing it to a standard problem of homotopy theory. Given spaces \( X \) and \( Z \) let \( [X, Z] \) denote the set of homotopy classes of maps of \( X \) into \( Z \). Note that \( [\,\cdot\, , Z] \) is a contravariant functor, much like \( \operatorname{Bndl}_G \): any map \( f : Y \rightarrow X \) induces a pull-back map \[ f^\ast : [h] \mapsto [h \circ f ] \] of \( [X, Z] \) to \( [Y, Z] \). Moreover if \( \Pi : P \rightarrow Z \) is any principal \( G \)-bundle then we have a map \[ [h] \mapsto [h^\ast (P)] \] of \( [X, Z] \) to \( \operatorname{Bndl}_G (X) \) that is “natural” (i.e., it commutes with all “pull-back” maps \( f^\ast \)). We call \( P \) a universal principal \( G \)-bundle if the latter map is bijective. The heart of the classification theorem is the fact that universal \( G \)-bundles do exist. In fact it can be shown that a principal \( G \)-bundle is universal provided its total space is contractible, and there are even a number of methods for explicitly constructing such bundles.
We will denote by \( \mathcal{U}_G \) some choice of universal principal \( G \)-bundle. Its base space will be denoted by \( \mathcal{B}_G \) and is called the classifying space for \( G \). (Although \( \mathcal{B}_G \) is not unique, its homotopy type is). If \( \Pi : P \rightarrow X \) is any principal \( G \)-bundle then, by definition of universal, there is a unique homotopy class \( [h] \) of maps of \( X \) to \( \mathcal{B}_G \) such that \( P \) is equivalent to \( h^\ast (\mathcal{U}_G ) \). Any representative \( h \) is called a classifying map for \( P \). Clearly if \( f : Y \rightarrow X \) then \( h \circ f \) is a classifying map for \( f^\ast (P ) \). Also, the classifying map for \( \mathcal{U}_G \) is just the identity map of \( \mathcal{B}_G \).
It is now easy to give a solution of sorts to the characteristic class problem for \( G \); namely \( \operatorname{Char}(G) \) is canonically isomorphic to \( H^\ast (\mathcal{B}_G ) \). In fact each \( c \in H^\ast (\mathcal{B}_G ) \) defines a characteristic class (also denoted by \( c \)) by the formula \[ c(P ) = f^\ast (c) ,\] where \( f \) is a classifying map for \( P \), and the inverse map is just \( c \mapsto c(\mathcal{U}_G ) \).
This is a distillation of ideas developed between 1935 and 1950 by Chern, Ehresmann, Hopf, Feldbau, Pontryagin, Steenrod, Stiefel, and Whitney. While elegant in its simplicity, the above version is still too abstract and general to be of use in finding \( \operatorname{Char}(G) \) for a specific group \( G \). It is also of little use in calculating the characteristic classes of bundles that come up in geometric problems, for it is not often an easy matter to find a classifying map from geometric data. We shall discuss how Chern put flesh on these bones by finding concrete models for classifying spaces and, more importantly, by showing how to calculate explicitly de Rham theory representatives of many characteristic classes from the curvature forms of connections.
Let \( \mathbf{V}(n, N + n) \) denote the Stiefel manifold of \( n \)-frames in \( \mathbf{R}^{N +n} \), consisting of all orthonormal sequences \[ e = (e_1 , \dots, e_n) \] of vectors in \( \mathbf{R}^{N +n} \). There is an obvious free action of \( \mathbf{O}(n) \) on \( \mathbf{V}(n, N + n) \), and the orbit of \( e \) consists of all \( n \)-frames spanning the same \( n \)-dimensional linear subspace that \( e \) does. Thus we have an \( \mathbf{O}(n) \) principal bundle \[ \Pi : \mathbf{V}(n, N +n) \rightarrow \mathbf{Gr}(n, N +n) ,\] where \( \mathbf{Gr}(n, N +n) \) is the Grassmannian of all \( n \)-dimensional linear subspaces of \( \mathbf{R}^{N +n} \). In the early 1940s it was known from results of Steenrod and Whitney that this bundle is “universal for compact k-dimensional polyhedra”, provided \( N \geq k + 1 \). This means that for any compact polyhedral space \( X \), with \( \dim(X) \leq k \), every principal \( \mathbf{O}(n) \) bundle over \( X \) is of the form \[ h^\ast \bigl(\mathbf{V}(n, N +n)\bigr) \] for a unique \( [h] \) in \( [X, \mathbf{Gr}(n, N + n)] \). In [12] Chern and Y. F. Sun generalized these results to show that this bundle is also universal for compact \( k \)-dimensional ANRs. (If one wants universal bundles in the strict sense described above, one need only form the obvious inductive limit, \[ \Pi : \mathbf{V}(n, \infty) \rightarrow \mathbf{Gr}(n, \infty) ,\] by letting \( N \) tend to infinity. But for the finite dimensional problems of geometry it is preferable to stick with these finite dimensional models). By replacing the real numbers respectively by the complex numbers and the quaternions, Chern and Sun proved analogous results for the other classical groups \( \mathbf{U}(n) \) and \( \mathbf{Sp}(n) \). 
They went on to note that if \( G \) is any compact Lie group, then by taking a faithful representation of \( G \) in some \( \mathbf{O}(n) \), \( \mathbf{V}(n, N +n) \) becomes a principal \( G \) bundle by restriction, and the corresponding orbit space \[ \mathbf{V}(n, N + n)/G \] becomes a classifying space \( \mathcal{B}_G \) for compact ANRs of dimension \( \leq k \).
The Grassmannians make good models for classifying spaces, for they are well-studied explicit objects whose cohomology can be investigated using both algebraic and geometric techniques. From such computations Chern knew that there was an n-dimensional “Euler class” \( e \) in \( \operatorname{Char}(\mathbf{SO}(n)) \). If \( M \) is a smooth, compact, oriented \( n \)-dimensional manifold then \( e(F (M )) \in H^n (M ) \) when evaluated on the fundamental class of \( M \) is just \( \chi(M ) \). One can thus interpret the Generalized Gauss–Bonnet Theorem as saying that \( \lambda = \operatorname{Pf}(\Psi) \) represents \( e(F (M )) \) in de Rham cohomology. This inspired Chern to look for a general technique for representing characteristic classes by de Rham classes. This was in 1944–1945, while Chern was in Princeton, and he discussed this problem frequently with his friend André Weil who encouraged him in this search.
It might seem natural to start by trying to represent \( \mathbf{SO}(n) \) characteristic classes by closed differential forms, but Chern made what was to be a crucial observation: the cohomology of the real Grassmannians is complicated. In particular it contains a lot of \( \mathbf{Z}_2 \) torsion, and this part of the cohomology is invisible to de Rham theory. On the other hand Chern knew that Ehresmann, in his thesis, had calculated the homology of complex Grassmannians and showed there was no torsion. In fact Ehresmann showed that certain explicit algebraic cycles (the “Schubert cells”) form a free basis for the homology over \( \mathbf{Z} \). It follows from de Rham’s Theorem that all the cohomology classes for \( \mathcal{B}_{\mathbf{U}(n)} \) can be represented by closed differential forms. These forms, when pulled back by the classifying map of a principal \( \mathbf{U}(n) \)-bundle, will then represent the characteristic classes of the bundle in de Rham cohomology. While this is fine in theory, it still depends on knowing a classifying map, while what is needed in practice is a method to calculate these characteristic forms from geometric data. We now explain Chern’s beautiful algorithm for doing this.
Let \( \Pi : P \rightarrow M \) be a smooth principal \( \mathbf{U}(n) \)-bundle over a smooth manifold \( M \). Recall that a connection for \( P \) can be regarded as a 1-form \( \omega \) on \( P \) with values in the Lie algebra of \( \mathbf{U}(n) \), \( L(\mathbf{U}(n)) \), which consists of all \( n {\times} n \) skew-Hermitian complex matrices. Equivalently we can regard \( \omega \) as an \( n {\times} n \) matrix of complex-valued 1-forms \( \omega_{ij} \) on \( P \) satisfying \( \omega_{ji} = -\bar{\omega}_{ij} \), and similarly for the associated curvature 2-forms \( \Omega_{ij} \).
We will denote by \( \mathcal{R} \) the ring of complex-valued polynomial functions on the vector space \( L(\mathbf{U}(n)) \). Using the usual basis for \( L(\mathbf{U}(n)) \), we can identify \( \mathcal{R} \) with complex polynomials in the \( n(n - 1) \) variables \( X_{ij} \), \( Y_{ij} \), \( 1 \leq i < j \leq n \) and the \( n \) variables \( Y_{ii} \), \( 1 \leq i \leq n \). \( Z \) will denote the \( n {\times} n \) matrix of elements in \( \mathcal{R} \) defined by \begin{align*} & Z_{ij} = X_{ij} + \sqrt{-1}\,Y_{ij},\\ & Z_{ji} = -X_{ij} + \sqrt{-1}\,Y_{ij}\quad\text{and}\\ & Z_{ii} = \sqrt{-1}\,Y_{ii} \end{align*} for \( 1 \leq i < j \leq n \). We can also regard \( Z \) as representing the general element of \( L(\mathbf{U}(n)) \), and we will write \( Q(Z) \) rather than \( Q(X_{ij} , Y_{ij}) \) to denote elements of \( \mathcal{R} \). The adjoint action of the group \( \mathbf{U}(n) \) on its Lie algebra \( L(\mathbf{U}(n)) \) is now given by \[ \operatorname{ad}(g)(Z) = gZg^{-1} ,\] just as in the \( \mathbf{SO}(n) \) case above, and as in that case we define the adjoint action of \( \mathbf{U}(n) \) on \( \mathcal{R} \) by \[ \bigl(\operatorname{ad}(g)Q\bigr)(Z) = Q\bigl(\operatorname{ad}(g)Z\bigr) .\] As before we denote by \( \mathcal{R}^{\operatorname{ad}} \) the subring of \( \mathcal{R} \) consisting of ad-invariant polynomials. Once again we can substitute the curvature forms \( \Omega_{ij} \) for the \( Z_{ij} \) in an element \( Q(Z) \) in \( \mathcal{R} \), and obtain a differential form \( Q(\Omega) \) on \( P \); if \( Q \) is homogeneous of degree \( d \) in its variables then \( Q(\Omega) \) is a \( 2d \)-form. The same argument as in the \( \mathbf{SO}(n) \) case shows that if \( Q \in \mathcal{R}^{\operatorname{ad}} \) then \( Q(\Omega) \) is the pull-back of a uniquely determined form \( Q(\Psi) \) on \( M \). Using the Bianchi identity, Chern showed that \[ d\bigl(Q(\Psi)\bigr) = 0 \] (cf. [e2], p. 297), so \( Q(\Psi) \) represents an element \( [Q(\Psi)] \) in \( H^\ast (M ) \), the complex de Rham cohomology ring of \( M \). If we use a different connection \( \omega^{\prime} \) on \( P \) with curvature matrix \( \Omega^{\prime} \) then we get a different closed form \( Q(\Psi^{\prime}) \) on \( M \) with \[ \Pi^\ast \bigl(Q(\Psi^{\prime})\bigr) = Q(\Omega^{\prime}) .\] What is the relation between \( Q(\Psi^{\prime} ) \) and \( Q(\Psi) \)? Weil provided Chern with the necessary lemma: they differ by an exact form, so that \( [Q(\Psi)] \) is a well-defined element of \( H^\ast (M ) \), independent of the connection. We will denote it by \( \hat{Q}(P ) \). (Weil’s lemma can be derived as a corollary of the fact that \( Q(\Psi) \) is closed. For the easy but clever proof see [e2], p. 297).
If \( h : M^{\prime} \rightarrow M \) is a smooth map, then a connection on \( P \) “pulls-back” naturally to one on the \( \mathbf{U}(n) \)-bundle \( h^\ast (P ) \) over \( M^{\prime} \). The curvature forms likewise are pull-backs, from which it is immediate that \[ \hat{Q}\bigl(h^\ast (P )\bigr) = h^\ast \bigl(\hat{Q}(P )\bigr) .\] In other words, \( Q \mapsto \hat{Q} \) is a map from \( \mathcal{R}^{\operatorname{ad}} \) into \( \operatorname{Char}(\mathbf{U}(n)) \). It is clearly a ring homomorphism, and in recognition of Weil’s lemma Chern called it the Weil homomorphism, but it is more commonly referred to as the Chern–Weil homomorphism.
For \( \mathbf{U}(n) \) the ring \( \mathcal{R}^{\operatorname{ad}} \) of ad-invariant polynomials on its Lie algebra has an elegant and explicit description that follows easily from the diagonalizability of skew-Hermitian operators and the classic classification of symmetric polynomials. Extend the adjoint action of \( \mathbf{U}(n) \) to the polynomial ring \( \mathcal{R}[t] \) by letting it act trivially on the new indeterminate \( t \). The characteristic polynomial \[ \operatorname{det}(Z + tI) = \sum_{k=0}^n \sigma_k (Z)\,t^{n-k} \] is clearly ad-invariant, and hence its coefficients \( \sigma_k (Z) \) belong to \( \mathcal{R}^{\operatorname{ad}} \). Substituting a particular matrix for \( Z \) in \( \sigma_k (Z) \) gives the \( k \)-th elementary symmetric function of its eigenvalues; in particular \( \sigma_1 (Z) = \operatorname{trace}(Z) \) and \( \sigma_n (Z) = \det(Z) \). Now if \[ P (t_1, \dots, t_n ) \in \mathbf{C}[t_1 , \dots, t_n ] \] then of course \[ P \bigl(\sigma_1 (Z),\dots, \sigma_n (Z)\bigr) \] is also in \( \mathcal{R}^{\operatorname{ad}} \). In fact, \[ \mathcal{R}^{\operatorname{ad}} = \mathbf{C}[\sigma_1, \dots,\sigma_n ] ,\] i.e., \[ P (t_1 , \dots, t_n ) \mapsto P \bigl(\sigma_1 (Z), \dots, \sigma_n (Z)\bigr) \] is a ring isomorphism. From this fact, together with Ehresmann’s explicit description of the homology of complex Grassmannians, Chern was easily able to verify that the Chern–Weil homomorphism is in fact an isomorphism of \( \mathcal{R}^{\operatorname{ad}} \) with \( \operatorname{Char}(\mathbf{U}(n)) \).
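The ad-invariance of the coefficients \( \sigma_k \) can be checked numerically on a small example. The sketch below makes several assumptions that are not in the text: it computes \( \sigma_k (Z) \) as the sum of the principal \( k {\times} k \) minors of \( Z \) (a standard linear-algebra fact), and the skew-Hermitian matrix `Z` and the unitary matrix `g` are hypothetical sample data chosen for illustration.

```python
from itertools import combinations, permutations


def det(m):
    """Determinant via the Leibniz expansion; the empty matrix has determinant 1."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term = term * m[i][perm[i]]
        total += term
    return total


def sigma(k, z):
    """sigma_k(z): sum of the principal k x k minors of z (so sigma_0 = 1)."""
    n = len(z)
    return sum(det([[z[i][j] for j in idx] for i in idx])
               for idx in combinations(range(n), k))


def char_poly_at(z, t):
    """Evaluate det(z + t*I) directly."""
    n = len(z)
    return det([[z[i][j] + (t if i == j else 0) for j in range(n)]
                for i in range(n)])


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def dagger(a):
    """Conjugate transpose; equals the inverse when a is unitary."""
    return [[a[j][i].conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


# Hypothetical skew-Hermitian matrix: Z[j][i] == -conj(Z[i][j]).
Z = [[1j, 2 + 1j, -1j],
     [-2 + 1j, -2j, 3],
     [-1j, -3, 1j]]

# Hypothetical unitary matrix (its rows are orthonormal in C^3).
g = [[0.6, -0.8, 0],
     [0.8j, 0.6j, 0],
     [0, 0, 1]]

W = matmul(matmul(g, Z), dagger(g))  # ad(g)Z = g Z g^{-1}

for k in range(4):
    print(k, sigma(k, Z), sigma(k, W))
```

Each printed row should show the same value for \( Z \) and for \( \operatorname{ad}(g)Z \), up to floating-point rounding.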
For technical reasons it is convenient to renormalize the polynomials \( \sigma_k (Z) \), defining \[ \gamma_k (Z) = \sigma_k \Bigl( \frac{1}{2\pi i} Z\Bigr) .\] Then we get a \( \mathbf{U}(n) \)-characteristic class \( c_k = \hat{\gamma}_k \) of dimension \( 2k \), called the \( k \)-th Chern class, and these \( n \) classes \( c_1 , \dots, c_n \) are polynomial generators for the characteristic ring \( \operatorname{Char}(\mathbf{U}(n)) \); that is, each \( \mathbf{U}(n) \)-characteristic class \( c \) can be written uniquely as a polynomial in the Chern classes.
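As a small worked illustration (a sketch that assumes only the normalization \( \gamma_k (Z) = \sigma_k (Z/2\pi i) \) just given and the standard Newton identity \( \sigma_2 (Z) = \tfrac{1}{2}\bigl(\operatorname{trace}(Z)^2 - \operatorname{trace}(Z^2 )\bigr) \)), the two lowest Chern classes of \( P \) are represented on \( M \) by the forms \[ \gamma_1 (\Psi) = \frac{1}{2\pi i}\operatorname{trace}(\Psi), \qquad \gamma_2 (\Psi) = \frac{1}{8\pi^2}\Bigl(\operatorname{trace}(\Psi \wedge \Psi) - \operatorname{trace}(\Psi) \wedge \operatorname{trace}(\Psi)\Bigr) ,\] where \( \operatorname{trace}(\Psi \wedge \Psi) = \sum_{i,j} \Psi_{ij} \wedge \Psi_{ji} \), and the factor \( 1/8\pi^2 \) arises from \( (1/2\pi i)^2 = -1/4\pi^2 \) combined with the \( \tfrac{1}{2} \) in Newton's identity. The identity may be applied to the matrix of 2-forms because 2-forms commute under exterior multiplication. The overall sign of \( \gamma_1 (\Psi) \) depends on the sign convention chosen for the curvature, which is not restated here.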
If \( F (Z) \) is a formal power series, \( F = \sum^\infty_0 F_r \), where \( F_r \) is a homogeneous polynomial of degree \( r \), then for finite dimensional spaces, \( \hat{F}_r \) will vanish for large \( r \) so \[ \hat{F} = \sum^\infty_0\hat{F}_r \] will be a well-defined characteristic class. Many important classes were defined in this way by Hirzebruch, and Chern used the power series \[ E(Z) = \operatorname{trace}\Bigl(\exp\Bigl( \frac{1}{2\pi i} Z\Bigr)\Bigr) \] to define the Chern character, \[ \mathbf{ch} = \hat{E} .\] It plays a vital role in the Atiyah–Singer Index Theorem.
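To see what the lowest terms of \( \mathbf{ch} \) look like, one can expand the exponential. The following is a sketch that uses only the definition of \( E \) above, the classes \( c_k = \hat{\gamma}_k \), and Newton's identity \( \operatorname{trace}(W^2 ) = \sigma_1 (W)^2 - 2\sigma_2 (W) \) applied to \( W = \frac{1}{2\pi i} Z \). The homogeneous part of \( E \) of degree \( r \) is \[ E_r (Z) = \frac{1}{r!}\operatorname{trace}(W^r ) ,\] so that \( E_0 = n \), \( E_1 = \gamma_1 \), and \( E_2 = \tfrac{1}{2}(\gamma_1^2 - 2\gamma_2 ) \). Hence for a \( \mathbf{U}(n) \)-bundle \[ \mathbf{ch} = n + c_1 + \tfrac{1}{2}\bigl(c_1^2 - 2c_2 \bigr) + \cdots ,\] where the omitted terms have dimension six and higher.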
Chern also developed a generalization of the Chern–Weil homomorphism for an arbitrary compact Lie group \( G \). The adjoint action of \( G \) on its Lie algebra \( L(G) \) induces one on the ring \( \mathcal{R} \) of complex-valued polynomial functions on \( L(G) \), so we have a subring \( \mathcal{R}^{\operatorname{ad}} \) of adjoint invariant polynomials. Substituting curvature forms of \( G \)-connections on \( G \)-principal bundles into such invariant polynomials \( Q \), we get as above a Chern–Weil homomorphism \( Q \mapsto \hat{Q} \) of \( \mathcal{R}^{\operatorname{ad}} \) to the characteristic ring \( \operatorname{Char}(G) \) (defined with respect to complex de Rham cohomology) and this is again an isomorphism. Of course, for general \( G \) the homology of the classifying space \( \mathcal{B}_G \) will have torsion, so there will be other characteristic classes beyond those picked up by de Rham theory. Moreover the explicit description of the ring of adjoint invariant polynomials is in general fairly complicated.
Chern left the subject of characteristic classes for nearly twenty years, but then returned to it in 1974 in a now famous joint paper with J. Simons [21]. This paper is a detailed and elegant study of the phenomenon of transgression in principal bundles. Let \( M \) be an \( n \)-dimensional smooth manifold, \( \Pi : P \rightarrow M \) a smooth principal \( G \)-bundle over \( M \), \( \omega \) a \( G \)-connection in \( P \), and \( \Omega \) the matrix of curvature 2-forms. Given an adjoint invariant polynomial \( Q \) on \( L(G) \), homogeneous of degree \( \ell \), we have a globally defined closed \( 2\ell \)-form \( Q(\Psi) \) on \( M \) that represents the characteristic class \( \hat{Q}(P ) \in H^{2\ell} (M ) \), and that is characterized by \[ \Pi^{\ast} \bigl(Q(\Psi)\bigr) = Q(\Omega) .\] Chern and Simons first point out the simple reason why \( Q(\Omega) \) must be an exact form on \( P \). Indeed, by the naturality of characteristic classes under pull-back, \( Q(\Omega) \) represents \( \hat{Q}(\Pi^{\ast} (P )) \). But as we saw earlier, \( \Pi^{\ast} (P ) \), the “square” of the bundle \( P \), is a principal \( G \)-bundle over \( P \) with a global cross-section, hence it is trivial and all of its characteristic classes must vanish. In particular \[ \hat{Q}\bigl(\Pi^{\ast} (P )\bigr) = 0 ,\] i.e., \( Q(\Omega) \) is exact.
They next write down an explicit formula in terms of \( Q \), \( \omega \), and \( \Omega \) for a \( (2 \ell- 1) \)-form \( \mathit{TQ}(\omega) \) on \( P \), and show that \[ d\mathit{TQ}(\omega) = Q(\Omega) .\] \( \mathit{TQ}(\omega) \) is natural under pull-back of a bundle and its connection. Now suppose \( 2\ell > n \). Then \( Q(\Psi) = 0 \), so of course \( Q(\Omega) = 0 \), i.e., in this case \( \mathit{TQ}(\omega) \) is closed, and so defines an element \( [\mathit{TQ}(\omega)] \) of \( H^{2\ell -1} (P ) \). If \( 2\ell > n + 1 \) Chern and Simons show this cohomology class is independent of the choice of connection \( \omega \), and so defines a “secondary characteristic class”. However if \( 2\ell = n + 1 \) then they show that \( [\mathit{TQ}(\omega)] \) does depend on the choice of connection \( \omega \).
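For orientation, here is the simplest instance, written as a sketch under assumptions that are not fixed by the text above: \( G \) is a matrix group, the curvature is normalized by \( \Omega = d\omega + \omega \wedge \omega \), and \( Q(Z) = \operatorname{trace}(Z^2 ) \), so that \( \ell = 2 \). With these conventions the form \[ \mathit{TQ}(\omega) = \operatorname{trace}\Bigl(\omega \wedge d\omega + \tfrac{2}{3}\,\omega \wedge \omega \wedge \omega\Bigr) = \operatorname{trace}\Bigl(\omega \wedge \Omega - \tfrac{1}{3}\,\omega \wedge \omega \wedge \omega\Bigr) \] is the familiar Chern–Simons 3-form, and the identity \( d\mathit{TQ}(\omega) = Q(\Omega) \) can be checked directly: \begin{align*} d\mathit{TQ}(\omega) &= \operatorname{trace}(d\omega \wedge d\omega) + 2\operatorname{trace}(d\omega \wedge \omega \wedge \omega),\\ Q(\Omega) &= \operatorname{trace}(d\omega \wedge d\omega) + 2\operatorname{trace}(d\omega \wedge \omega \wedge \omega) + \operatorname{trace}(\omega \wedge \omega \wedge \omega \wedge \omega). \end{align*} The two agree because \( \operatorname{trace}(\omega \wedge \omega \wedge \omega \wedge \omega) = 0 \): cyclically moving a matrix of 1-forms past a matrix of 3-forms inside the trace changes the sign, so this term equals its own negative.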
They now consider the case \( G =\mathbf{GL}(n, \mathbf{R}) \) and the adjoint invariant polynomials \( Q_i \) defined by \[ \det(X + tI) =\sum^n_{i=0} Q_i (X)\,t^{n-i} .\] Taking \( Q = Q_{2k-1} \) they again show \( Q(\Omega) = 0 \) provided \( \omega \) restricts to an \( \mathbf{O}(n) \) connection on an \( \mathbf{O}(n) \)-subbundle of \( P \), so of course in this case too we have a cohomology class \( [\mathit{TQ}(\omega)] \). They specialize to the case that \( P \) is the bundle of bases for the tangent bundle of \( M \) and \( \omega \) is the Levi-Civita connection of a Riemannian structure. Then \( [\mathit{TQ}(\omega)] \) is defined, but depends in general on the choice of Riemannian metric. Now they prove a remarkable and beautiful fact: \( [\mathit{TQ}(\omega)] \) is invariant under conformal changes of the Riemannian metric! Such conformal invariants have recently been adopted by physicists in formulating so-called conformal quantum field theories.
Chern also returned to the consideration of characteristic classes and transgression in another joint paper, this one with R. Bott [19]. Here they consider holomorphic bundles over complex analytic manifolds, where there is a refined exterior calculus, using the \( \partial \) and \( \bar{\partial} \) operators, and they prove a transgression formula for the top Chern form of a Hermitian structure with respect to the operator \( i\partial \bar{\partial} \). This work has applications both to complex geometry (especially the study of the zeros of holomorphic sections), and to algebraic number theory. In recent years it has played an important role in papers by J. M. Bismut, H. Gillet, and C. Soulé.
“Retirement”
For most mathematicians, retirement is a one-time event followed by a period of declining mathematical activity. But as with so much else, Chern’s attitude towards retirement is highly nonstandard. Both authors remember well attending a series of enjoyable so-called retirement parties for Chern, as he retired first from UC Berkeley, then several years later as Director of MSRI, etc. But in each case, instead of retiring, Chern merely replaced one demanding job with another.
Finally, in 1992, Dr. Hu Guo-Ding took over as director of the Nankai Institute of Mathematics and Chern declared himself truly retired. In fact though, he travels back to Nankai one or more times each year and continues to play an active role in the life of the Institute. The Institute now has an excellent library, has become increasingly active in international exchanges, and has many well-trained younger members. In 1995, the occasion of the tenth anniversary of the Nankai Institute was celebrated with a highly successful international conference, attended by many well-known physicists and mathematicians.
Chern also continues to be very active in mathematical research, and when asked why he doesn’t slow down and take it a little easier, his stock “excuse” is that he does not know how to do anything else. He says he tries to work in areas that he feels have a future, avoiding the current fashions. His recent interests have been Lie sphere geometry, several complex variables, and particularly Finsler geometry. Chern’s interest in the latter subject has a long history. Already in 1948 he solved the equivalence problem for the subject in “Local Equivalence and Euclidean connections in Finsler spaces” (reprinted in [28]). Chern feels that the time is now ripe to recast all the beautiful global results of Riemannian geometry of the past several decades in the Finsler context, and he points out that thinking of Riemannian geometry as a special case of Finsler geometry was already advocated by David Hilbert in his twenty-third problem at the turn of the last century. Chern himself has recently taken some steps in that direction, in “On Finsler geometry” (C. R. Acad. Sci. Paris, t. 314, Série I, p. 757–761, 1992), and with David Bao, “On a notable connection in Finsler geometry” (Houston Journal of Math., v. 19, no. 1, 1993). He has also recently spelled out the general program in a paper that is as yet unpublished, “Riemannian geometry as a special case of Finsler geometry”.