Celebratio Mathematica — Wright

by Allyn Jackson

Margaret Wright.

Photo courtesy of the Simons Foundation.

Margaret H. Wright has been a leading figure in numerical analysis for more than forty years. Born in California in 1944, she earned a bachelor’s degree in mathematics (1964) and a master’s degree in computer science (1965) from Stanford University. For six years she worked as a programmer at GTE Sylvania. Seeking more autonomy in her career, she returned to Stanford in 1971 and earned a PhD in computer science in 1976, under the direction of Gene Golub and Walter Murray. She then worked for 12 years as a research associate in the Systems Optimization Laboratory that George Dantzig had founded at Stanford. In 1988, Wright took a position as a researcher at AT&T Bell Labs, a place she called “paradise.” In 2001, in the midst of growing uncertainty about the future of Bell Labs, she moved to the Courant Institute of Mathematical Sciences at New York University (NYU), where today she is the Silver Professor of Computer Science.

Wright’s work, which has focused primarily on optimization, unites theoretical prowess, technical command, and real-world implementation. She has been at the center of major developments in applied mathematics, including the interior-point revolution in linear programming.

While Wright describes herself as “boring” and a “goody two-shoes,” she exudes an easygoing charisma. That quality, together with her stature in research, a high level of personal integrity, and a disarming sense of humor, has made her a trusted and beloved leader in a host of administrative capacities, including as the head of the Scientific Computing Research Department at Bell Labs, as chair of the Computer Science Department at NYU, and as president of the Society for Industrial and Applied Mathematics (SIAM).

Among her honors are the Award for Distinguished Public Service from the American Mathematical Society (2002) and the John von Neumann Lecture Prize from SIAM (2019). She was elected to the American Academy of Arts and Sciences and to both the U.S. National Academy of Engineering and the U.S. National Academy of Sciences.

The following interview with Wright, conducted in September and October 2019, has been edited and condensed.

Math and sanitation

Jackson: I want to start with some recent history. In July this year you received the von Neumann Prize and presented the prize lecture at ICIAM [International Congress on Industrial and Applied Mathematics] in Valencia, Spain.1 You talked about a problem you worked on that came from the Department of Sanitation of the City of New York. Can you tell me about that problem and what the outcome was?

Wright: In October 2018, a person working in information technology at the New York Department of Sanitation emailed Russ Caflisch, the director of Courant, and sketched out an “optimization” problem. Russ asked if I wanted to help, and I thought, Why not? It would be like my days at Bell Labs, where I had really enjoyed working on real-world problems.

The problem was this. The Department of Sanitation has several thousand employees. Each sanitation worker has a “home station,” and these are spread over the five boroughs. Often employees want to change their home station. There are periods a few times a year when they can request this change by filling in a paper form and listing in order of preference three home stations they would like to move to. Then the department reassigns people if possible.

The absolute priority for reassignment is seniority, which is a unique integer for each person; you can’t have two people with the same seniority. Also, a worker who does not wish to move cannot be forced to do so. The department’s high-level policy, broadly speaking, is to reassign people in priority order, subject to availability. However, there is a complication when someone of lower priority moves to a position that has been requested by a higher-priority person, thus freeing up a previously occupied position. In such a case, further iterations of the assignment process are required.

At first this felt like a problem in optimization, but it’s actually a matching problem. So I started reading the literature about matching problems of this kind, and met on several occasions (for “fun”) with colleagues from the Department of Sanitation. This matching problem is sometimes called “housing with sitting tenants.” There is an algorithm that solves the problem, with a great name, “You Request My House, I Get Your Turn” (YRM-IGYT).2 (It is also called “Top Trading Cycles” in the econometrics literature.) It starts by going down the list in priority order. When it’s your turn, suppose someone wants your house. Then they get your turn: they are moved up the priority list. Various properties of this algorithm can be proved. Using it initially seemed ideal for the Sanitation Department, but then I found some examples where the algorithm produced an answer that did not do what the department wanted.
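To make the mechanism concrete, here is a minimal Python sketch of the textbook form of this algorithm, in which each house holds at most one tenant. It is only an illustration under stated assumptions, not the Department of Sanitation's procedure: stations there hold many workers, and the department ended up developing its own related algorithm. The function name `yrmh_igyt`, its parameters, and the sample workers and houses are all hypothetical.

```python
def yrmh_igyt(priority, prefs, home, vacant):
    """Textbook "You Request My House, I Get Your Turn" (one tenant per house).

    priority: agents, highest priority first
    prefs:    agent -> requested houses, most preferred first
    home:     agent -> house currently occupied (agents without one are absent)
    vacant:   houses with no current tenant
    Returns a dict agent -> assigned house (None if nothing was available).
    """
    owner = {house: agent for agent, house in home.items()}
    available = set(vacant) | set(owner)
    assignment = {}

    def top_choice(agent):
        options = list(prefs.get(agent, []))
        if agent in home:
            options.append(home[agent])  # nobody can be forced to move
        return next((h for h in options if h in available), None)

    def assign(agent, house):
        assignment[agent] = house
        available.discard(house)

    for starter in priority:
        if starter in assignment:
            continue
        chain = [starter]  # agents holding the turn; the last one chooses next
        while chain:
            agent = chain[-1]
            house = top_choice(agent)
            tenant = owner.get(house)
            if tenant is None or tenant == agent or tenant in assignment:
                # Vacant house, own house, or a house whose tenant has moved on.
                assign(agent, house)
                chain.pop()
            elif tenant in chain:
                # The requests form a cycle: everyone in it gets the house requested.
                start = chain.index(tenant)
                cycle = chain[start:]
                for i, member in enumerate(cycle):
                    assign(member, home[cycle[(i + 1) % len(cycle)]])
                del chain[start:]
            else:
                # "You request my house, I get your turn."
                chain.append(tenant)
    return assignment


# Hypothetical example: A is the most senior worker, h4 is the only vacancy.
priority = ["A", "B", "C"]
home = {"A": "h1", "B": "h2", "C": "h3"}
prefs = {"A": ["h2", "h4"], "B": ["h4"], "C": ["h2"]}
result = yrmh_igyt(priority, prefs, home, vacant={"h4"})
```

In this example A requests B's house, so B is given the turn first and takes the vacancy h4. A then moves into h2, and C, whose only request is no longer available, stays in h3. A plain pass down the priority list would instead have sent A to h4 and left B with nowhere to go.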

Unfortunately, this doesn’t have a great happy ending, like “Mathematician comes in, solves problem, everyone is happy.” The YRM-IGYT algorithm, nice as it was, does not really solve the problem; the Department of Sanitation developed a new, related algorithm. This phenomenon — needing to adjust the mathematical problem to match reality — is familiar when you are working on real-world problems. And it was very enjoyable for me to work with the Department of Sanitation team.

Jackson: And it’s nice that the Department of Sanitation felt they could reach out to the mathematicians.

Wright: Yes. When I talked to the person who had written to Russ, I asked about this. The person said, “Well, Courant is a famous math institute. We have a problem. It’s math.”

I also want to say that, in many real-world problems, finding an algorithm is only the first step. Next you have to write “bullet-proof code,” as we say. Often people who are on the more theoretical side don’t think about that. They just say, here’s an algorithm, end of story. Writing good code is essential.

A very American life

Jackson: The Department of Sanitation work was a recent chapter in the story of your life. Let’s go back to the beginning, to the first chapter. Where in your mind does the story of your life begin?

Wright: I have not had a very exciting life! It’s very American. I was born in San Francisco, because my mother was a doctor and she wanted me to be born in a certain hospital in San Francisco where she had done her training. We lived in an agricultural town, Hanford, California, where the population was about 10,000. It’s probably not much bigger than that now.

I didn’t like living in a small town. It was too much of everybody knowing what everybody else was doing. Not that as a child I did anything bad! But I have a strong sense of privacy. We lived in Hanford until I was ten, which is when my parents split up. Then my mom and the kids moved to Tucson.

My birthday is in February, and the cutoff date for going to kindergarten was March 1st. So I was one of the youngest children in kindergarten. At that time kids were allowed to skip grades in California, though later it was considered bad psychologically for them. I skipped fourth grade in Hanford, so that for most of my time in school and college, I was almost two years younger than everybody else.

Jackson: Your mother became a doctor at a time when there were very few women becoming doctors. What was her story?

Wright: She said that half of the students in her medical school class were women. She got her medical degree in 1937. When she went to the 50th class reunion, there were half women! Now, this is only one data point, the University of California Medical School. But she thought the worst discrimination against women started after World War II. When she was in medical school, there wasn’t a lot of “women are inferior” feeling — and she did say proudly that she’d been number one in her class. So she never talked about the struggle of being a woman.

Jackson: How did she juggle career and family when you were growing up?

Wright: She had three children. My dad was also a doctor. He was a general practitioner of a kind they don’t have anymore — he did surgery, he delivered babies. I remember as a child, I would hear the phone ring at 2 AM, and my dad would get up and rush out to deliver a baby. My mom’s specialty was “eye, ear, nose and throat,” a specialty that doesn’t exist anymore. She did not have those last-minute emergencies, which would have been very difficult with three children.

I’ve thought about this, given how things are now, when women are so tormented about domestic arrangements if they have a challenging career. My mom never talked about it at all. We had a very nice woman who was from Iowa who came to stay with us after school. There was another person who would occasionally come in for major house cleaning. It sounds idyllic, doesn’t it?

Jackson: It sounds great! Did your parents encourage you a lot in school, did they insist on perfect homework? What were they like in that regard?

Wright: Well — this goes with me being a boring person — I always did so well in school, and so did my brothers, that no one ever said anything. I was always either the top student or the next-to-top student.

Jackson: What did you want to be when you grew up? Did you dream about being a singer, or an astronaut, something like that?

Wright: No! As I said, I am a very boring person! My dad kept saying to my brothers and me, “One of you has to be a doctor.” I am not exactly a rebellious person, but I thought: I don’t know about that! My uncle was a lawyer. My family was not associating with professors and academics and researchers, or with engineers or scientists. What adults did I know who had interesting-sounding jobs? Doctor and lawyer. That’s it.

A high school counselor once said to me, “I see you have gotten all As in biology and chemistry. Have you thought that you might do something in those areas? You could be a nurse.” Now, I am going to be accused of elitism! Nursing is a perfectly good profession. But I said: “My mom is a doctor, and if I were going to be in medicine, that’s what I would want to be.” The counselor was shocked. She said, “Okay, but what about having a family?” I said, “Well, my mom managed to do it.”

I knew I was going to go to college. I didn’t worry about it at all, in contrast to kids today. Women were not admitted to the Ivy Leagues at that time — this would have been in 1959-1960. My parents had both gone to UC Berkeley. Although they weren’t together when I was in high school, they both were of course interested in where I would go to college. As a child of a Berkeley alum, I could get in-state tuition at Berkeley. I knew Stanford was a good school, and they admitted women.

So I applied to Stanford and Berkeley, and my “safety school” was the University of Arizona, because we were in Tucson. This is going to be a very obnoxious comment, but if you were in the top three-quarters of your high school class, you would get admitted to Arizona. So I figured I would get in! In the end I got admitted to all of them and decided to go to Stanford, possibly to be different from my parents.

Jackson: Cost was not an issue? Berkeley would have been essentially free with in-state tuition, and Stanford not.

Wright: Yes, and Stanford had the staggering figure of I think \$1000 a year as tuition.

Jackson: That sounds like nothing today!

Wright: Yes, though not at that time. But my parents were both doctors. They could afford it. They said, it will be expensive, but okay. That’s how I ended up at Stanford, which as it turned out was great. Berkeley would probably have been okay too.

Jackson: Yes, both are excellent, you couldn’t have gone wrong. But Stanford has a different feel, being a private university.

Wright: It was different. A big part of life at the University of Arizona was sororities and fraternities. I was against them, possibly because I wasn’t a “popular” girl. There were the “cool” people, and I was never in that group. The cool girls in high school talked about going to the University of Arizona and joining sororities, and I remember thinking: That sounds really horrible. Stanford at that time had fraternities, but no sororities.

Jackson: When you were in high school, did you encounter the attitude that girls are not supposed to be smart?

Wright: People did say that, if you are a girl and you are too smart, boys won’t like you, and you shouldn’t do better than everybody else. We played a game in our math class where one student would stand behind the desk of another, and the teacher would ask a question. Whoever got the answer fastest would move on to stand behind the next chair. I always went all the way around the room. Afterwards people would say, “The boys are not going to like you if you do this.” I remember thinking, Too bad for them!

Jackson: Good for you! Where did that confidence come from?

Wright: Well, I’ve thought about that. If I am not sure I can do something well, I am insecure. I was a nervous wreck before giving the von Neumann Lecture, as I am before every big talk. But in the context of high school, I knew that I was a good student. If I say I knew I was smart, it sounds obnoxious, but I really did. I just never had any trouble with anything in high school.

Of course in college that changes, right? In my freshman year at Stanford, which was 1960, one of the freshmen committed suicide. He had been from a small town in the Midwest. When he left to go to Stanford, he went by train. A band came to the station and played for him, and the mayor of his town made a speech about how proud the town was of him. When he got to Stanford, he was totally out of his depth. If you grow up in an environment where you are always the best and you can maintain that in your life, good for you. But most people can’t. Certainly at Stanford, I knew I wasn’t the best.

Jackson: It’s much more selective, it’s a different environment.

Wright: Right. Some students went to very challenging prep schools, where they were really pushed academically. That wasn’t happening at Catalina High School, where I went.

Getting the computer to do what you want

Jackson: What was your intended major when you arrived at Stanford?

Wright: I had a fabulous French teacher in high school. And I really liked math because it was so neat — and I am using the word “neat” in the sense of tidy. It fit together, you could prove things. I felt that was wonderful. But I also liked English. I’ve always been a big reader. People would say that I always had my nose in a book.

So when I first went to Stanford and had to indicate my interests, I put mathematics, history, English, French. Those were the areas I thought I might major in. At that time there was a general studies requirement, so for the first two years you had to take English, history, science, math, and so on. That was not a problem for me. The hard part came when I was a junior and had to pick a major. Because my mom had had a job and my parents had gotten divorced, I knew from the beginning that I needed to be able to earn my own living.

Someone told me that I could get a good job if I majored in math. I remember being puzzled. Why would anyone pay me to do math? It’s too much fun! The person just said: Trust me. So you can see how completely naive I was about the world of scholarship and academia, or for that matter, working in industry. No one came to Stanford and gave talks about how you use math in industry, as is done today. But I just thought, Okay, I’ll major in math. And what a good choice that was.

Stanford had computer science courses, but no computer science major. A big change in my life came when I took my first computer science course. I especially liked that you could make the computer do what you wanted. In math, you could prove things, but with the computer, you could write a program to solve a problem, and the solution would come out the way you wanted it to.

Jackson: How did you program? Did you use keypunch cards?

Wright: We did. We programmed in a language called ALGOL, which still exists, but in minor form. You had to go to the computing center and bring your stack of cards. You would punch them on the keypunch machine, and then hand the stack to the person at the desk, and they would take it away. You would come back later, and they would give you your output.

I took my first computer science course my junior year and loved it. If there had been a major in computer science, I probably would have done that. Don’t get me wrong, I loved math. But the computer science was so great!

Jackson: Who taught computer science?

Wright: Gene Golub taught the first course, and Cleve Moler, who was then a PhD student, taught the second one.

Jackson: Two greats in computer science.

Wright: Yes, and both of them were excellent teachers. This sounds corny, but they conveyed excitement. Computer science was very interactive. I liked that a lot.

Socially noticeable as different — a female, and a math major

Jackson: When you were an undergraduate, were you the only female in your math classes?

Wright: Mostly, yes. I had a differential equations class that had 150 students, and I was the only woman. I never dared to miss class because if I did, it was obvious that I was not there!

Jackson: Did that affect you? Did you feel out of place?

Wright: Yes, I felt out of place. No one said anything, but when you are a minority in any context, you do stand out. Also, it was common for undergraduates to ask one another about their majors. When I said math, they’d say, “I always hated math, I was never good in math” — in an indignant way, as if it was my fault that they didn’t like math! Often this brought the conversation to a screeching halt. If you were, say, a history major, people could ask which country, or which period. With math, people didn’t know what to say. So my being socially noticeable as different did happen, and it still happens.

When I went to graduate school to get my master’s and later my PhD, there were not very many women, but it was certainly not a case of one out of a big number. There was a small number.

Jackson: How did you decide to go on for a master’s at Stanford, and did you think about going directly for a PhD?

Wright: I needed to get a job when I graduated. By the time I got my master’s, I could get the degree in computer science, and I knew there were programmers who got paid. So I got a master’s in computer science, but I also took several numerical analysis courses, and that’s really what I was interested in.

There was no one in my family who had done an academic PhD, so I had no idea what that meant. I regarded professors with awe. I couldn’t picture myself being a professor.

Jackson: Did you have any women professors at Stanford?

Wright: Definitely not in math or computer science. I don’t remember any women professors on the technical side. Thinking about it in retrospect, I am sure that it affected me. I never thought, “I can’t possibly do this”; I was just interested in other things. I never had what you might call a mentor. No professor ever said, “Are you interested in research? Maybe you could think about graduate school” — which is what we professors do now. But I didn’t expect that. Students now have more of an expectation that someone will be supportive. So it’s different now — it’s much better!

Jackson: After your master’s, you got a job at Sylvania [in Mountain View, California], which makes me think of light bulbs. But I don’t think you were working on light bulbs.

Wright: It was GTE Sylvania, and GTE stands for General Telephone and Electronics. They made electronic equipment. They had a group that wrote computer programs for simulations, although they didn’t call them simulations. For example, one of the projects I worked on was ship photo recognition. I suspect it might have had a military purpose. They would get photos of ships and scan them in, in some way, and we were supposed to say what kind of ship it was. That sounds pretty modern, right?

They needed computer programs to perform the simulation and compute an estimate of how wide the ship was, or of other measurements. These were nonlinear least-squares problems, and there was a brand-new technique called quasi-Newton methods that had been developed by Fletcher and Powell in England.3 I wrote FORTRAN codes that did that. And I found it utterly fascinating.

This was teamwork, and it had good and bad aspects. It was really fun to talk to the other people and learn from them. But there was also the “You’re not in charge” feeling, so if I wanted to do something a certain way, others might say, “No, we have to do it this way.” I think that’s fairly typical for teamwork. It’s rare that everyone agrees. I worked there for six years. I learned a lot, not so much about technical topics, but about the real world, about getting a project done, what you had to do when it had to be ready in a month. I really liked that.

Moving on to a PhD

Jackson: At this time, you got married and had a daughter.

Wright: That’s right. I got married in 1965. I had to get a job, because my husband was a law student at Stanford and his tuition needed to be paid. The tuition was tiny compared to now. Being a programmer paid pretty well, and I wanted to work. At that time, living together without being married was completely frowned upon. So we got married, and we had a daughter in 1968.

During our daughter’s early years, I wanted to work part-time — 20 hours, or two and a half days, per week. This was amazingly easy to arrange with Sylvania. I did not appreciate at the time how enlightened this was on their part. Child care was, of course, an important issue, but we were extremely lucky to find a young mother with a daughter who was basically the same age as our daughter. This wonderful woman was happy to have a part-time job that included spending the day with her child, having a playmate of the same age, and being paid. So that arrangement — which could not have been better — lasted until our daughter started kindergarten. Then, again by good fortune, and because we were willing, in fact happy, to pay a reasonable salary, we hired a dedicated “older” woman — probably slightly younger than I am now! — who would come to the house when our daughter finished school and remain until I got home. She was able to continue working for us essentially all the way through high school.

Eventually I thought about returning to Stanford to get a PhD because at Sylvania I was getting fed up with having other people tell me what to do and also with the blatant discrimination against women, which was common at the time. I was unsure what I needed to do, but when I contacted a Stanford professor, he said that because I had been a master’s student, I could return as a PhD student without actually applying. I really should not have been allowed to do that. I should have been forced to take the GREs!

As a result, I returned to Stanford in 1971 as a PhD student, planning to work in numerical analysis, with Gene Golub. Gene loved having visitors, and one interesting visitor after another visited Stanford thanks to him. While I was a PhD student, Philip Gill came from England for a short visit. He had worked on optimization at the National Physical Laboratory, which is something that probably doesn’t exist anymore — a government research lab with a permanent staff who were basically free to do whatever they wanted. The next year, Walter Murray, who was a close colleague of Philip’s from NPL, also visited and taught an elementary numerical analysis class. I was Walter’s teaching assistant.

In England at that time, and even today, students specialize at a much earlier age than in the U.S. Mathematics undergraduates in England arrive already knowing a lot of mathematics. That’s what Walter was used to. At Stanford, the students who took the undergraduate introduction to numerical analysis often didn’t know much mathematics. Some had barely had calculus. Walter assumed they knew all about integration, analysis, differential equations — and they didn’t. The students would come to see me as the TA, and of course I was sympathetic, because I knew that they didn’t have the right background. I would say to Walter, “You really can’t assume they know this.” And he would say, “They should know that!” And I would say, “They don’t know it — believe me, they don’t!” Two graduate students were also taking the class — two graduate students mixed with around 60 undergraduates. The graduate students kept telling Walter the class was too slow and too easy. Walter would say to me, “Look, they are telling me it’s too easy!” And I said, “Talk to the other students!” Not surprisingly, the undergraduates were afraid to say anything to him.

Jackson: That’s not easy for you as a TA and a graduate student to go to the professor and say, “You’re not doing your course right.”

Wright: Well, Walter is a very friendly, informal person. But I don’t think he ever fully appreciated the lack of mathematical knowledge of the undergraduates.

A year or two later, the department created the Forsythe Award for Student Contributions to Teaching, and I was the first recipient. I’m sorry if this sounds like bragging, but I was pretty proud of it. It was named after George Forsythe, who had been the founding chair of the computer science department — a wonderful man. The citation said something tactful like “for contributions in being a teaching assistant for a visitor.”

When Walter went back to England, Gene Golub, to whom I will always be grateful, suggested I go to the NPL and said he would support me on his grant. I am amazed, looking back on it now, that he could use his grant to send me to England for six months. I don’t know how happy funding agencies would be about that today! So my little family and I went to England. And it was really great.

The NPL was a wonderful place. I had been warned that people weren’t going to be friendly, they would be “stiff upper lip” British people. They weren’t anything like that. They were very friendly and constantly inviting us over to dinner. And I got a lot of work done on my thesis. That was in 1974–75.

Jackson: Were there women at the NPL?

Wright: Yes, there were. Were they at the same level as Walter and Philip? No. They were usually programmers.

When people there wrote papers, they would give their names with only initials. Instead of Philip E. Gill, as it would usually be in the U.S., it would be P. E. Gill. But if one author was the programmer, and the programmer was a woman, then they would put her first name, for example P. E. Gill and Gwen Peters. I asked why they did not give everybody’s first name. That led to a lively discussion! I was only a visitor, only a student. I had no power. But I kept asking, what is this?

Jackson: Can you tell me about the topic of your PhD thesis?

Wright: Walter had an idea, which he had worked on in his thesis, about two classes of methods for nonlinearly constrained optimization. One was penalty function methods, the other was barrier function methods. In a sense they are complementary.

If you are trying to minimize some function, and you have constraints, one way to solve the constrained problem is to transform it into a problem that has no constraints (i.e., an unconstrained problem). One way to do that is to create a new function by combining the one you are trying to minimize with the constraints.

With a penalty method, the new function penalizes you if the constraints aren’t satisfied. So, for example, if you have a constraint $ x = 1 $, you would add something that penalizes you when $ x $ is not equal to 1. Under fairly general conditions, you can show that the minimizers of the penalty function converge to the solution.
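As a small worked illustration (the objective $ x^2 $, the quadratic form of the penalty, and the parameter $ \rho $ are assumptions chosen for simplicity, not taken from the interview), consider minimizing $ x^2 $ subject to $ x = 1 $:

$$ P(x, \rho) = x^2 + \frac{\rho}{2}\,(x - 1)^2, \qquad P'(x, \rho) = 2x + \rho\,(x - 1) = 0 \;\Longrightarrow\; x(\rho) = \frac{\rho}{2 + \rho}. $$

The unconstrained minimizer $ x(\rho) $ violates the constraint for every finite $ \rho $, but it tends to the constrained solution $ x = 1 $ as $ \rho \to \infty $.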

Barrier methods are sort of the opposite. Here you have an inequality constraint, for example, $ x \ge 1 $. Starting at a point where the constraint is satisfied, you minimize a new function that combines the original function with a “barrier” function that becomes infinite as you get closer and closer to the boundary of the constraint, $ x = 1 $.
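A one-variable sketch shows the idea (the objective $ x $, the logarithmic form of the barrier, and the parameter $ \mu > 0 $ are assumptions made here for illustration). To minimize $ x $ subject to $ x \ge 1 $, one can minimize instead

$$ B(x, \mu) = x - \mu \ln(x - 1), \qquad B'(x, \mu) = 1 - \frac{\mu}{x - 1} = 0 \;\Longrightarrow\; x(\mu) = 1 + \mu. $$

Every minimizer $ x(\mu) $ lies strictly inside the feasible region, and it approaches the constrained solution $ x = 1 $ from the inside as $ \mu \to 0 $.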

In his thesis, Walter had proved various results about penalty functions, and he thought I could work on barrier functions. That’s what my thesis was about. He didn’t know that at this time barrier methods were considered to be inherently flawed. The flaw is that, broadly speaking, when you take something that’s finite and add things to it that gradually become infinite, you create something ill-conditioned. So, although barrier methods were very popular in the 1960s, by 1975 other methods had come along that seemed to be better and didn’t suffer from inherent ill-conditioning.
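The ill-conditioning can be seen in a two-variable example (again an illustrative assumption, using a logarithmic barrier with parameter $ \mu > 0 $). To minimize $ x_1 + \tfrac{1}{2} x_2^2 $ subject to $ x_1 \ge 1 $, take

$$ B(x, \mu) = x_1 + \tfrac{1}{2} x_2^2 - \mu \ln(x_1 - 1), \qquad \nabla^2 B(x, \mu) = \begin{pmatrix} \dfrac{\mu}{(x_1 - 1)^2} & 0 \\ 0 & 1 \end{pmatrix}. $$

The minimizer is $ x(\mu) = (1 + \mu,\, 0) $, where the Hessian equals $ \operatorname{diag}(1/\mu,\, 1) $. For $ \mu < 1 $ its condition number is $ 1/\mu $, which grows without bound as $ \mu \to 0 $, exactly the regime in which the barrier minimizer approaches the true solution.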

Still, I was happy with my thesis. At this time I wanted to stay in the Bay Area, for personal reasons. George Dantzig had created a research group in the Stanford operations research department, and I was hired there in a soft-money position. Things at the NPL were not going well — they were being forced more and more to work on projects and get grants. So Philip and Walter were interested in moving, and George Dantzig was able to obtain funding to hire them too.

Dantzig: Small, sweet, smart

Jackson: George Dantzig is a huge name in your field. Can you tell me about him?

Wright: He is one of the sweetest and smartest people I have ever known. He was quite small in terms of height, very soft-spoken. I never heard him raise his voice. And he was just a delight. He is called the father of linear programming and won all kinds of awards for his mathematical contributions.

One of his dreams was to have an organization, which he called the Systems Optimization Laboratory (SOL), to deal with very large real-world problems. George had written a couple of papers urging that such a place be created.4 He had to rely on government grants, so he could never really get the huge organization that he wanted. But he started with having a few people who were working on optimization and trying to solve real-world problems. That was his dream. I was very lucky to become a teeny part of that.

When the Stanford OR department hired me to work at SOL, some people there thought my job would be to take someone else’s algorithm and write a program that carried it out. But I thought that I was going to come up with new algorithms! It was definitely a mismatch. But George just said, “Fine, Margaret, you can work on that.” And he was the one with the money.

Jackson: He understood that you were capable of doing that.

Wright: He didn’t say that explicitly, but I optimistically assumed that was what he thought.

Michael Saunders, who is originally from New Zealand, had finished his PhD in computer science at Stanford a few years earlier. He worked on the numerical aspects of linear programming, and his work is rightly famous. He then returned to New Zealand for a few years. Philip and Walter and I, who knew Michael well, were hoping he would come back to Stanford, and that we four would be together. That’s what eventually happened. This was around 1979. We were then called — I wouldn’t say “affectionately”! — the Gang of Four.

I’ve drifted in this conversation away from Dantzig, but I can tell you that as a human being, he was amazing. People would come to Stanford to the OR department and say they wanted to meet “the famous Professor Dantzig.” One time it was a group from Japan, all dressed up in suits. Maybe they were people from a business that used linear programming. George came into my office and said, “There is a group here and they want to meet me. I can’t face it. Can I just sit in here for a while?” He couldn’t take the “this is the famous Professor Dantzig” thing, because he was basically shy. Of course I said, “Yes, you can hide in my office until they go away!”

The media go wild for algorithms

Jackson: Dantzig was the inventor of the simplex method. There is a 2004 paper on the simplex method by Spielman and Teng5 in which they said something sort of puzzling. They wrote: “In spite of half a century of attempts to unseat it, the simplex method remains the most popular method for solving linear programs. However, there has been no satisfactory theoretical explanation of its excellent performance.” Other methods seemed like they might end up being better than simplex, but simplex has continued to be used and to be very effective. Can you comment on this?

Wright: Yes, I can, and I’ll bring in my own knowledge of barrier functions.

Theoretical computer science developed a huge amount of very important theory, including the ideas of polynomial-time, NP-complete, exponential-time, etc. The idea grew that a polynomial-time algorithm would always be better than an algorithm with, say, worst-case exponential time.

For years people believed that the simplex method must be worst-case exponential, but there was no example. Then, in the 1970s, Klee and Minty gave an example where, on a problem with $ n $ variables, the simplex method takes $ 2^n - 1 $ steps.6 That was the worst case. But people who routinely solved large linear programs had observed that in practice simplex behaves like a polynomial-time method: it typically takes $ 2n $ or $ 3n $ iterations. It’s very puzzling, a huge gap between theory and practice.

For at least a decade people tried to find a polynomial-time algorithm for linear programming. Several attempts turned out to be wrong. This all changed in 1979. Leonid Khachiyan, in the Soviet Union, was working on various algorithms for nonlinear problems that had been developed earlier in the Soviet Union. He came up with what turned out to be a polynomial-time linear programming algorithm.7 It was a big story on the front page of the New York Times.8 Remember, the Soviet Union was our enemy — and they had this algorithm. The press went wild and published stories about how the Russians would be able to crack our codes, which was complete nonsense.9

Reporters were calling George of course, because they wanted his comments. He didn’t want to talk to them, so one of them got me. The reporter asked, “Can you tell me in simple terms what a polynomial-time algorithm is?” I said, “Do you know what a polynomial is?” He said no, so I said, “Okay, let’s take $ x^2 $ and $ x^3 $…”. And he said, “That’s too hard! I think polynomial-time means really, really fast.” I said, “Actually it doesn’t mean really, really fast.” He said, “Well, that’s what I’m going to put in my story!” He missed his chance to have the definition of a polynomial!

Lots of people, including the Gang of Four, started implementing Khachiyan’s algorithm and running it on a variety of linear programs. Simplex was always much, much faster. We knew the Klee–Minty example showed that simplex was exponential-time in the worst case, but on a standardized test set of linear programming problems, Khachiyan’s algorithm was slower. This was a case where the “fast” algorithm in theory, Khachiyan’s, was slower than the “slow” algorithm. You’ll still meet people from theoretical computer science who will say, “Isn’t there a polynomial-time algorithm from Khachiyan that’s always faster than the simplex method?” You just have to throw up your hands!

Then in 1984, Narendra Karmarkar of Bell Labs announced a polynomial-time algorithm for linear programming that was supposedly 50 times faster than the simplex method on every problem. That’s a dramatic statement. That was also on the front page of the New York Times and, as Ron Graham says, “above the fold.”10

Jackson: Was Ron Graham behind the huge publicity for Karmarkar’s work?

Wright: I would guess so. Ron is a great mathematician, a great scientist. At the time he was a high-level manager in the Mathematics Research Center at Bell Labs. And he is a master of publicity. He can take an apparently dry topic in mathematics and generate a huge amount of public interest in it.

There were news articles all over, including in Time magazine, all featuring the 28-year-old Narendra Karmarkar. By the way, Khachiyan had also been 28 when he announced his algorithm, so people were talking about “the magic age, 28.”

George was fascinated to learn about Karmarkar’s work (and also about Khachiyan’s) because George loved linear programming. So he invited Karmarkar to speak at Stanford. But Karmarkar was being very cagey. He said the algorithm was AT&T proprietary, so he could not give details. In his talk, he wrote a few equations on the board and just made a few comments.

Now, the Gang of Four — Philip, Walter, Michael, and I — noticed that the equations Karmarkar wrote down had the same format as the equations in a barrier method. One of us said, “That looks like a barrier function.” People have asked us which one of us figured that out. But it was a group thing. I did my thesis on barrier methods, Walter gave me my thesis problem, Philip knew all about it, Michael knew all about it. Any one of us could have said this, and we don’t remember who it was. But it got said: This looks like a barrier function.

So we and John Tomlin (of Ketron) started working on this. Karmarkar’s method, which is called an interior-point method, was being presented as something brand-new. Claims were being made that Karmarkar’s method was consistently much faster than the simplex method on large linear programs. On the other hand, remembering what had happened with Khachiyan’s method, many experts in linear programming believed that the simplex method would always be faster.

In the midst of this controversy, in August 1985 I gave a talk at the Mathematical Programming Symposium, a major conference in optimization held that year at MIT, about our work on Karmarkar’s method. Now, how did I happen to give the talk? People have asked me that many times: “You’re the one people used to mistake for the programmer! How did you get to give this talk?” The answer is that Walter was originally invited to give this talk, and then something came up and he couldn’t go. Michael and Philip then agreed that I should give the talk, which we worked on together late into the night before.

The room was packed. I said that we had shown, first, that there was an equivalence between Karmarkar’s interior-point method and barrier methods.11 Most of the audience had never heard of barrier methods because they had gone out of style, and this result was a surprise to many. But the numerical results were what aroused the strongest emotion. In our extensive computational tests on a large suite of linear programs, recognized as a “fair” comparison, the results were split — sometimes the Karmarkar method was faster, sometimes simplex was faster. Why was this dramatic? Because the simplex devotees in the audience were upset that a “nonlinear” method (Karmarkar) could be competitive with the simplex method. And the Karmarkar fans were absolutely convinced that his new algorithm must always be faster than the simplex method. So neither group was happy. Many audience members came up to me afterward and started yelling and complaining. It was quite exciting.

What about today? Bob Bixby, from Rice University, is one of the world’s leading experts on computational linear programming. I see Bob about once a year, and I ask him, “Which is better, simplex or barrier?” And the answer is: simplex is better some of the time, and barrier is better some of the time.

So that quotation from the Spielman–Teng paper is still true. And no one knows exactly which method is better for which linear programming problem. In other parts of numerical analysis, we know that if a problem is of a certain type, you should use Method A, and if it’s of another type you should use Method B. But not for linear programming.

Jackson: It’s still not understood.

Wright: I still meet people who say, “Well surely no one uses the simplex method anymore.” And I say, “Yes, they do!”

For completeness, I have to mention that that paper of Spielman and Teng, which is on smoothed analysis of algorithms, is fantastic. They prove that, in the context of smoothed analysis, the simplex method is polynomial-time. Both of them have been showered with honors, deservedly so, for that analysis.

The interior-point revolution

Jackson: AT&T said that Karmarkar’s work was proprietary. How did you find out enough about it to work out the equivalence to barrier methods?

Wright: Karmarkar’s early talks were typically titled something like, “The New Age of Linear Programming,” and almost never contained any equations. So when he spoke at Stanford, he was asked many questions and wrote a few sparse equations on the board. But one of these was the equation for finding the next step in the iteration defined in his method. That’s the one that looked like the equations that came up in barrier methods. It involves a diagonal matrix in which each diagonal element is 1 over the square of the value of the corresponding constraint. That made us think that his method was connected with barrier methods. Soon thereafter, he published a paper that included a few equations.
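To make the diagonal-matrix remark concrete, here is a sketch in notation assumed purely for illustration; it comes neither from the interview nor from Karmarkar's papers. Take a linear program with objective $ c^T x $ and linear inequality constraints whose values are $ r_i(x) = a_i^T x - b_i > 0 $, where $ a_i^T $ is row $ i $ of a matrix $ A $, and consider the standard logarithmic barrier function with barrier parameter $ \mu > 0 $:

```latex
\begin{aligned}
B(x,\mu) &= c^T x - \mu \sum_{i=1}^{m} \ln r_i(x), \qquad r_i(x) = a_i^T x - b_i, \\
\nabla B(x,\mu) &= c - \mu A^T D^{-1} e, \qquad D = \operatorname{diag}\bigl(r_1(x),\dots,r_m(x)\bigr), \quad e = (1,\dots,1)^T, \\
\nabla^2 B(x,\mu) &= \mu A^T D^{-2} A, \qquad D^{-2} = \operatorname{diag}\bigl(1/r_1(x)^2,\dots,1/r_m(x)^2\bigr), \\
\mu A^T D^{-2} A \, p &= -\bigl(c - \mu A^T D^{-1} e\bigr) \quad \text{(Newton step } p\text{)}.
\end{aligned}
```

The matrix $ D^{-2} $ in the Newton equations is diagonal, with entries equal to 1 over the squared constraint values. This is the kind of structure that would suggest a barrier function to someone who had worked with one.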

So we talked about it and worked through it. It’s not difficult, I have to say. One of the things that was stated frequently at the beginning of the publicity was that no one would understand the mathematics of this for ten years. Well, it doesn’t take very long to figure it out, once you know it’s there. It wasn’t deep, new mathematics. Now, once it was known, Nesterov and Nemirovskii, two people originally from the Soviet Union, wrote a beautiful book about the theory of barrier methods.12 That was deep mathematics.

To complete the circle in a way, in 1996 Kurt Anstreicher wrote a paper about SUMT, the Sequential Unconstrained Minimization Technique.13 SUMT refers to a class of methods (penalty and barrier functions) and to a widely used software package dating from the 1960s. His paper proved that SUMT is a polynomial-time algorithm for linear programming. People had spent years desperately trying to find polynomial-time methods, and this one was there all the time. They just didn’t have the right perspective.

Interior-point methods are now a whole field, and research on barrier methods led to the field of semidefinite programming. Karmarkar’s work really changed optimization in a big way.14

Jackson: Can you tell me more about the impact of the equivalence between barrier methods and Karmarkar’s interior-point methods?

Wright: Given the date of his PhD, Karmarkar would not have been taught about barrier methods, whose heyday was in the 1960s. As I mentioned, by the mid-1970s they were considered fatally flawed. When I finished my thesis on barrier methods in 1976, Walter and I wrote a paper about a barrier method. This was my first experience trying to get a paper published. One of the referees wrote: “I can’t believe anyone would waste their time on a useless method that is not worth worrying about.” That was of course very devastating to me, that this person completely dismissed this part of my thesis. When Karmarkar’s results came in 1984, no one was thinking about barrier methods. Once people started thinking about them, the floodgates opened.

Barrier methods had been proposed in the 1960s originally for nonlinearly constrained optimization. After Karmarkar’s work was connected to barrier methods, people started to look for generalizations, such as optimization subject to constraints involving a matrix — for example, the matrix must be positive semidefinite. This activity continues today.

People used to teach linear programming and the simplex method, and that was it. Today, if you don’t teach Newton’s method and barrier methods, you are not doing a good job. It’s part of the field now.

Let me just say that George Dantzig, bless his heart, was very excited about all the developments around interior-point methods. He was quoted as saying something like, “I’d give 20 years of my life to be around as a young person now and work on this.” It would have been a whole new outlook for him.

From theorems to algorithms to software

Jackson: The Gang of Four did everything. You proved theorems, you designed algorithms, you wrote software. It is unusual to have four people working so closely.

Wright: It was unusual. We agreed we would write all our papers with all four names, in alphabetical order. Philip Gill of course favored that! I said, “I’m going to change my name to Aardvark, then I’ll get to be first!”

But there was no other way to do it. We all knew that if we tried for every paper to ask, Who did the majority of the work?, then we would all hate each other pretty soon. We thought that keeping the four names, always in the same order, was a very good policy, but it sometimes led to misunderstandings outside the Gang.

Jackson: Do you have a favorite example of some work the Gang of Four did, in which you proved theorems, came up with algorithms, and then wrote software?

Wright: There is a class of methods called sequential quadratic programming (SQP) methods, in which a nonlinearly constrained problem is solved by constructing a sequence of quadratic programs (quadratic objective, linear constraints). In contrast, penalty and barrier functions generate a sequence of unconstrained problems. Bob Wilson, of the Stanford business school, proposed SQP methods for convex programming in his PhD thesis in 1963. At that time, people said that such a method would never work and would be far too expensive. But we wanted to develop some theory about SQP methods and also to write code, because programming them was tricky. So the Gang of Four worked together on that. The code is called NPSOL, for “Nonlinear Programming Solution” or “Nonlinear Programming at SOL”. If you think writing papers together is complicated, writing code together is almost impossible! But we managed. Stanford set up a procedure through which people could license NPSOL, and it became very widely used and generated royalties.
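As a rough sketch of what a single SQP iteration solves, in notation assumed here for illustration (it is not taken from the interview or from the NPSOL documentation): to minimize $ f(x) $ subject to $ c(x) \ge 0 $, let $ J(x) $ be the Jacobian of the constraints and let $ H_k $ be a symmetric matrix approximating the Hessian of the Lagrangian at the current iterate $ x_k $. A typical quadratic programming subproblem for the search direction $ p $ is then:

```latex
\begin{aligned}
\min_{p \in \mathbb{R}^n} \quad & \nabla f(x_k)^T p + \tfrac{1}{2}\, p^T H_k\, p \\
\text{subject to} \quad & c(x_k) + J(x_k)\, p \ge 0 .
\end{aligned}
```

The next iterate is $ x_{k+1} = x_k + \alpha_k p_k $, where $ p_k $ solves the subproblem and the step length $ \alpha_k $ is chosen, for example, by a line search on a merit function. The objective is quadratic and the constraints are linear in $ p $, matching the "quadratic objective, linear constraints" description above.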

Jackson: Where did NPSOL end up getting used?

Wright: Companies that use optimization, like Boeing and oil companies, were probably the biggest customers. Jet Propulsion Laboratory also used it. It was basically free for academic purposes, and the price was reasonable through the Stanford licensing agreement.

Jackson: In 1981, you, Philip Gill, and Walter Murray wrote the now-classic book Practical Optimization.15 What was the motivation behind that book?

Wright: What often happened is that people would want to use NPSOL, and then one of us would have to give them lectures about optimization. So after we had worked on NPSOL, one of us said that we should write a book summarizing our experiences.

Michael didn’t want to work on the book because he had too many other things to do — and probably because he knew how many arguments we would have! The rest of us did not fully grasp how hard it was going to be, so we went ahead.

Did you know that this was the second book ever published with $ \mathrm{\TeX} $?

Jackson: No, I didn’t know that.

Wright: [Donald] Knuth’s was the first, of course. I heard about $ \mathrm{\TeX} $ because I went to a lot of talks in the computer science department. I thought we should do the book in $ \mathrm{\TeX} $ because we didn’t have a good secretary to do technical typing. Using $ \mathrm{\TeX} $ was difficult then: you had to create the text, compile it, wait to print it all (slowly) on a roll of paper, correct mistakes, and repeat — a tedious process. But it was worth it because we were in control.

For the camera-ready copy, we were very lucky. There was a high-resolution printer in the computer science department called the Alphatype. Printing on the Alphatype was expensive and complicated, but we knew a helpful undergraduate who agreed (with Knuth’s permission) to print our pages out in the middle of the night. I recently read a letter written by Knuth about the early days of $ \mathrm{\TeX} $, in which he described how he was “pleasantly surprised” to discover pages from Practical Optimization near the Alphatype. He did not know at the time who was writing those pages!

It was very exciting to be early users of $ \mathrm{\TeX} $; we could never have done the book without it. Knuth should win every prize in the world for creating $ \mathrm{\TeX} $ and enhancing the work of writers.

Jackson: And giving it out for free, from the very beginning.

Wright: Yes, absolutely.

Bell Labs: Open doors, open discussions

Jackson: To return to the timeline — you worked for twelve years at the Systems Optimization Lab. Then in 1988 you moved to Bell Labs. I’m wondering how you made that transition.

Wright: At some point I started thinking I should leave SOL, because I felt that no one outside Stanford knew who I was. There were the four of us working together, which was great, but I thought, in 20 years do I still want to be doing this? We did not want to change to having single-authored papers. Walter was a research professor, and the rest of us were research associates. At the time at Stanford, only one person per department could be a research professor. We talked to the chair of the OR department and asked if we could all be research professors. He said it was impossible, and that was that.

So you can see, our situation posed this optimization problem, of making us all individually happy while maintaining the structure we had set up. It just wasn’t going to work. Leaving was heartbreaking, but in the end it was a good thing.

I’ve talked to young women who were in the same exact situation, meaning they have a research soft-money position that they really like, but they are not going to get promoted at their university. It doesn’t bother them at the beginning, and it bothers them later. That is what happened to me. And it wasn’t clear at all that the OR department cared about having women faculty. Stanford around that time made a big public announcement that they’d hired six women in the engineering school. I think only one of those six got tenure. It’s not dissimilar to the way things still are. They say, “We care deeply about this” — it’s like when you make a call and you get a machine and it says “Your call is very important to us.” You think, If it were very important, a person would answer the phone!

When I thought about leaving SOL, I made some discreet inquiries, and in the end I interviewed seriously at Wisconsin, in the computer science department, and at Bell Labs. I had offers from both. Wisconsin was wonderful. I agonized over this. Should I go to this great university, or should I go to an industrial lab, which I had no experience of? Finally I said, I’ll see what Bell Labs is like. And it was indeed wonderful.

My offer came in 1986, and I asked to stay a few months at Stanford because Philip, Walter, and I were trying to finish our second book.16 It took a lot longer than that. Finally, in February of 1988, on the Presidents’ Day holiday, I went to Bell Labs. I knew people there, so it wasn’t like going into a completely unknown situation. And I had taken leave from Stanford, so I could go back if I wanted. Still, it was scary. But once I was at Bell Labs, I realized that this was the place for me.

At that point, my daughter was 20 and a college student. My husband and I had broken up. So it seemed like a good time to go, in the sense that I wasn’t disrupting anybody else’s life.

Jackson: What did you experience of the legendary atmosphere of Bell Labs?

Wright: How can I say this? I am basically a goody two-shoes. I try to follow the rules, probably too much so. I’m kind of a nerd — not a wild and crazy nerd but a goody two-shoes nerd. I was immediately struck by how different things were at Bell Labs.

Not long after I arrived, a visitor was giving a talk to a small audience, maybe 15 people. I have to say, the speaker was totally boring. About five minutes into the talk, a couple of the Bell Labs people just got up and left. They didn’t try to sneak out discreetly, they just got up and left. I was shocked! I stayed, of course. Afterwards, I said, “I can’t believe that people walked out.” One of my colleagues, who is not a nasty, mean person, said, “The talk was boring, they have better things to do, they just left. What’s the problem?” At Stanford it was much more polite. You just wouldn’t walk out of a talk.

Every once in a while there would be a meeting where a top manager would speak and people from the scientific end, from Bell Labs, were invited. And they gave that manager a hard time! They’d say, “Wait a minute, this is a really stupid idea! Why are you doing this?” No one seemed to mind it. Lively questioning was always encouraged at Bell Labs — it took me a while to get used to being free to speak out.

People’s doors were always open. Colleagues would come in with an idea, write something on the board and say, “What do you think?” We didn’t do that at Stanford, because the Gang of Four were working together, so we tended to talk to each other. There was always the possibility that students would come by and want to ask questions about homework, so my door was closed at least half the time. At Bell Labs, the doors were open, and people were always available for a technical discussion.

There were 14 percent women among the members of technical staff. I noticed early on a difference in approach between men and women. Several of the men would drop by and say, “I’ve got this great idea, isn’t it terrific?” The women tended to be much more guarded. Several of my female colleagues noticed the same thing, we talked about it, and eventually agreed that it’s okay to be openly positive about one’s own work. So being at Bell Labs made me a bit more outgoing. The “good little girl” does not always get recognized.

A method from the vegetable research station

Jackson: Can you explain what direct-search methods are, and in particular what the Nelder–Mead algorithm is?

Wright: First I’ll tell you about a wireless project I worked on called WISE. That was one of the most fun things I worked on at Bell Labs.

In the Computing Science Research Center at Bell Labs, people’s job was to do good research. It was not to contribute explicitly to the company’s bottom line. We were supposed to publish, go to conferences, give talks, etc. But if we did want to talk to people who had real problems to solve, Bell Labs was very happy about that and welcomed it.

My director at the time, Ravi Sethi, happened to talk to an AT&T executive who worked on wireless systems. The question was where to put wireless base stations in a building, in order to optimize coverage. This was 20 years ago, before, for example, cell phones were in wide use. Ravi talked about this problem to me and others in Bell Labs, including Brian Kernighan, a well-known computer scientist; David Gay, who also worked in optimization; Steve Fortune, who worked in computational geometry; and Reinaldo Valenzuela, a wireless engineer. It took several meetings before we figured out how to formulate what was needed and what skills were required. The associated project needed, of course, to have a cute name, so it was called WISE (“wireless systems engineering”).

The obvious first question: How do you define coverage? And how do you calculate it? You need to calculate the power that’s received at any given point in the building. Steve realized that you need to use the structure of the building, because walls are made of various materials, and radio waves are governed by physical laws describing how waves bounce off objects and reflect. Steve worked on that, and Brian worked on writing brilliant code and designing the user interface, because they wanted to be able to show customers exactly what coverage they would get.

What about the optimization? Well, this was a very complicated function, as you can imagine. Here’s a building, here’s a picture of the walls, here are some radio waves bouncing around. You can’t just write down a formula. When you use computational geometry to calculate the power that’s received, it doesn’t come out to be a nice, differentiable function. That was the key point. We needed to optimize it, but we didn’t have a mathematical form for the function.

I had never previously been interested in optimization methods that don’t use derivatives, i.e., you are optimizing a function that you can calculate, but you do not have its derivatives. When I gave a talk about this work at the 1995 Dundee Meeting in Numerical Analysis, I tried to introduce a distinction between what I called model-based methods and direct-search methods. With a model-based method, you make a mathematical model of the objective function, and then you use the derivative of the model to approximate the derivative of the function. What I called direct-search methods were those that do not “in their heart” calculate derivatives. I actually wrote that in a paper!17

The WISE project led me to the Nelder–Mead method, a direct-search method that does not build a model of the function. It has a certain set of manipulations that it performs on the points you’ve observed, or where you’ve evaluated the function. Nelder and Mead, sadly no longer with us, were statisticians in the UK. They worked at the National Vegetable Research Station, which is a name that many people find hilarious! But when you think about the importance of statistics in agriculture, it makes sense.

Nelder and Mead published their method in 1965 and called it “the simplex method.”18 Apparently they either didn’t know about the Dantzig simplex method or didn’t think it was important to say theirs was different. So that is often confusing.

It’s interesting that the Nelder–Mead method was published in The Computer Journal, a very prestigious journal. At that time, you could publish papers that said — and this is what Nelder and Mead basically did — “We have thought of this method, here is an example, and it works well on these problems.” Period. No proof, no lemma, no theory. But their method was incredibly effective and worked really well. It was (and is still) widely used and extremely popular. Others had written about it, but as far as I could find, existing theory was not about the original method, but rather about modified versions.

Being a mathematician, I thought there’s got to be some theory about the original method, we just need to find it. So in addition to developing a specialized version of Nelder–Mead that we used for WISE, I hoped to prove, if possible, some results about the original method. Jeff Lagarias, a brilliant mathematician who was at Bell Labs, and I started to talk about it.

One of the great things about Bell Labs, as it then was, was that you could walk up one flight of stairs, or over to another building, and there would be a leading world expert in the area you were interested in. And you were encouraged to talk to each other. That’s why I’ve described Bell Labs as paradise.

I was a friend of Jeff’s, and we got interested in the Nelder–Mead method. The method can be described on one piece of paper with a few figures. And you think, How hard can this be? Turns out it was hard.

A clever counterexample

Wright: I gave my talk in Dundee and said that Jeff and I hoped to prove convergence of Nelder–Mead for strictly convex functions. In the audience — I loved what happened! — was Ken McKinnon (from Edinburgh). He raised his hand and said, “I think I have a counterexample to that.” Thank God I didn’t say “we proved this”! I was surprised, but I said: Okay. Ken said he would bring it in the next day.

Jackson: Had he already thought of that counterexample?

Wright: I don’t know. He may have thought of it on the spot; he is very smart. He must be one of the few people in the whole world who taught the Nelder–Mead method in his optimization class. So he knew it very well.

He came the next day, and by gosh, he had a counterexample, the McKinnon counterexample, as it is known — very clever, in two dimensions, strictly convex, continuously differentiable. I was actually happy he had done that, although I wished we had thought of it!

Jeff and I had some nice results that made the theory more tidy. We agreed that we would try to publish two papers in the SIAM Journal on Optimization: Ken’s paper with the counterexample, and then a paper by Jeff and me, plus Paul Wright and Jim Reeds, who also worked at Bell Labs and who had talked to us about the problem.19

Despite the counterexample, Nelder–Mead works well in practice. The question remained, What can you say about the Nelder–Mead method?

Bjorn Poonen was a summer intern in the Math Center at Bell Labs, and Jeff talked to him about Nelder–Mead. Bjorn said he thought we could give a convergence proof for Nelder–Mead for two-dimensional problems that satisfy certain very strict conditions. Then I left Bell Labs, Bjorn went to Berkeley, and we still had not written this paper. It took a long time, and most of the delay was my struggling to try to figure out one thing that I hoped we could prove. I never could figure that out, so finally we wrote a paper.20

Jackson: That appeared much later, in 2012.

Wright: Yes, much later. It applies to an extremely limited problem set: two variables, strictly convex, bounded level sets, twice-continuously differentiable, positive definite Hessian. That does not fit practical problems, does it? It didn’t help anybody solve real-world problems, but it was a missing piece in the theory, and that’s why it was important.

I kept trying to understand the Nelder–Mead method. I still haven’t succeeded. You can run so many problems on it, not just in two dimensions, and it will work brilliantly. I know there is some mathematical property that, if we could only find it, would explain how it does so well.

At the SIAM meeting at Stanford in 1997, I decided to organize a session where George Dantzig and John Nelder would talk about the two simplex methods. We took a great picture of Dantzig and Nelder together. It has a certain sweetness, because George Dantzig was fairly short and John Nelder was very tall.

Roger Mead had been at the University of Reading in the UK. When I gave a talk there, naturally I wanted to meet him. He had retired and had no idea that his method had 20,000 citations! Today it has more than 30,000!21

John Nelder (left) and George Dantzig, at the 1997 SIAM meeting at Stanford.

Photo courtesy of Margaret Wright.

Jackson: Your paper with Lagarias and Poonen uses methods from discrete dynamical systems. Does that come out of left field, the use of discrete dynamical systems here?

Wright: No one else had taken this view, so yes! That was Jeff’s insight. You can look at the pictures that go with Nelder–Mead — for example, there are web sites with applets that run Nelder–Mead, and in some sense they look like discrete dynamical systems.

How did Nelder and Mead think of the method? I asked both of them about this, and they said, “Well, you look at the points, and you see which one is the worst, and then you move away from it to a new point.” That makes sense, right? You do it in a structured way, with a simplex in $ n $ dimensions, and, there you go. But the moves are not obvious.
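The recipe described here (rank the points, find the worst, move away from it) can be sketched in a few lines. The following Python is a minimal illustration using the commonly quoted coefficients (reflection 1, expansion 2, contraction 1/2, shrink 1/2). The function name `nelder_mead`, its parameters, the axis-aligned starting simplex, and the stopping rule are all assumptions made for this sketch; it is not Nelder and Mead's original code, nor the specialized version used for WISE. As the McKinnon counterexample shows, convergence is not guaranteed in general.

```python
def nelder_mead(f, start, step=1.0, max_iter=1000, tol=1e-8):
    """Minimize f over R^n with a basic Nelder-Mead simplex search (sketch)."""
    n = len(start)
    # Assumed initial simplex: the start point plus one offset point per axis.
    simplex = [list(start)]
    for i in range(n):
        vertex = list(start)
        vertex[i] += step
        simplex.append(vertex)
    values = [f(v) for v in simplex]

    def move_from_worst(centroid, worst, t):
        # centroid + t * (centroid - worst); t > 0 moves away from the worst point.
        return [c + t * (c - w) for c, w in zip(centroid, worst)]

    for _ in range(max_iter):
        # Order the vertices from best (lowest f) to worst (highest f).
        order = sorted(range(n + 1), key=lambda i: values[i])
        simplex = [simplex[i] for i in order]
        values = [values[i] for i in order]
        best, worst = simplex[0], simplex[-1]

        # Assumed stopping rule: stop once the simplex has become tiny.
        size = max(abs(x - b) for v in simplex[1:] for x, b in zip(v, best))
        if size <= tol:
            break

        # Centroid of every vertex except the worst one.
        centroid = [sum(v[j] for v in simplex[:-1]) / n for j in range(n)]

        reflected = move_from_worst(centroid, worst, 1.0)
        f_r = f(reflected)
        if f_r < values[0]:
            # The reflected point is the new best, so try going further.
            expanded = move_from_worst(centroid, worst, 2.0)
            f_e = f(expanded)
            if f_e < f_r:
                simplex[-1], values[-1] = expanded, f_e
            else:
                simplex[-1], values[-1] = reflected, f_r
        elif f_r < values[-2]:
            # Better than the second-worst vertex: accept the reflection.
            simplex[-1], values[-1] = reflected, f_r
        else:
            # Contract outside if the reflection beat the worst, else inside.
            t = 0.5 if f_r < values[-1] else -0.5
            contracted = move_from_worst(centroid, worst, t)
            f_c = f(contracted)
            if f_c < min(f_r, values[-1]):
                simplex[-1], values[-1] = contracted, f_c
            else:
                # Nothing helped: shrink every vertex halfway toward the best.
                for i in range(1, n + 1):
                    simplex[i] = [b + 0.5 * (x - b) for b, x in zip(best, simplex[i])]
                    values[i] = f(simplex[i])

    k = min(range(n + 1), key=lambda i: values[i])
    return simplex[k], values[k]


if __name__ == "__main__":
    # A smooth two-variable test function with its minimum at (1, -2).
    point, value = nelder_mead(lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2, [0.0, 0.0])
    print(point, value)
```

Note that the method only ever compares function values; it never forms a derivative or a model of the function, which is what makes it a direct-search method in the sense described earlier.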

Jackson: There is a lot of geometric intuition there, maybe based on the contact with practical problems, like the ones they must have had at the Vegetable Research Station.

Wright: It was indeed intuition about real-world problems. In the UK in the early 1960s, most of the work in optimization was about methods that were practical. Plus Nelder and Mead both stressed that their main interest was statistics.

I don’t have good geometric intuition. I am terrible in three-dimensional geometry, for example. Steve Fortune used to think it was hilarious because he would sketch a three-dimensional figure, and I would say, “I can’t understand that.”

Jackson: So what kind of intuition do you have? How would you characterize it?

Wright: Algebraic. Linear algebra, with a matrix that is ill-conditioned, with a singular value decomposition, where you have the singular vectors and the singular values. The eigendecomposition. You are in $ n $ dimensions, but you have a matrix, and you can look at it and you can figure out what its product with a vector is going to be like. That’s what my intuition is.
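As a brief illustration of this algebraic picture, in standard textbook notation that is assumed here rather than taken from the interview: suppose $ A $ is a real nonsingular $ n \times n $ matrix with singular values $ \sigma_1 \ge \cdots \ge \sigma_n > 0 $, left singular vectors $ u_i $, and right singular vectors $ v_i $. Then

$$ A = U \Sigma V^{T} = \sum_{i=1}^{n} \sigma_i \, u_i v_i^{T}, \qquad A x = \sum_{i=1}^{n} \sigma_i \, (v_i^{T} x) \, u_i, \qquad \kappa_2(A) = \frac{\sigma_1}{\sigma_n}. $$

The middle formula shows how the product with a vector can be read off from the decomposition: the component of $ x $ along each $ v_i $ is scaled by $ \sigma_i $ and sent to the direction $ u_i $. When $ A $ is ill-conditioned, $ \kappa_2(A) $ is large, so components of $ x $ along the singular vectors with small $ \sigma_i $ are nearly wiped out in $ A x $.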

Academia is different

Jackson: Can you tell me about the circumstances of your going to NYU?

Wright: Because of the 1995 split of AT&T into AT&T, Lucent, and NCR, Bell Labs became much more focused on business. This is not to blame anybody. It was just different. AT&T went from having very little competition to having a lot of competition. So the people at Bell Labs started being pushed to work on business-related topics. Also, because of the split, the math center went to AT&T and the computer science center stayed at Lucent. Many of my colleagues were suddenly in different locations.

This was a very tense time, 2001, 2002. Before the split, hardly anybody ever left Bell Labs. After the split, everybody wondered, Should we leave? Nearby were New York City, IBM, Rutgers University, and so on, but there weren’t enough jobs at the right level for everyone in the computer science center, which was about 60 people. And we all wanted to stay together. We had tremendous loyalty to each other. It’s corny to say it, but we loved our center.

At some point Brian Kernighan said he was leaving to be a professor at Princeton. That was the first chip in the wall. Once word gets out that someone is leaving, people start getting in touch, and that’s what happened to me. Around that time, I got a call from Dave McLaughlin, who was then director of the Courant Institute. They needed a chair for the computer science department. I wasn’t really interested in being chair, but that’s what the position was. Dave is wonderful — he said, “Margaret, just visit us. No commitment, just come and see.” So I did. I knew people there, and it was a great group. I was very close to going to another school, but I thought, I love New York City. It’s the Courant Institute. I’ll go to NYU.

Jackson: Was that a difficult transition?

Wright: The transition to department chair was not difficult for me in some sense, because at Bell Labs I had been head of a department of six people. But it was different at NYU. I don’t think I am a hard-nosed person, but if someone is not doing what they are supposed to do, and you are their boss, I think you should have some power over them, to try to get them to do the right thing. At Bell Labs, I could fire people. That makes a difference — even if I didn’t fire them, they knew I had that power. But a tenured faculty member basically cannot be fired. The department chair can say, “I want you to do this,” and the faculty member can just reply, “I don’t want to do that.”

Soon after I went to NYU, there was a reception for new faculty in arts and sciences, and the president of NYU spoke. Everyone just stood there. No one was disrespectful, no one interrupted. At Bell Labs, if a top executive had a meeting, people would go, and they would listen and join in. Here, people just were not paying any attention. And I thought: This is a different place.

Jackson: Another difference is that you teach at NYU.

Wright: Right. I had promised to stay at least three years as chair. I negotiated that I would not have to teach at first. I tend to throw myself into teaching and spend a lot of time on it, and I figured I needed time to learn how to be chair.

Students make a big difference. In my first week as chair, I was sitting in my office and a student came and said, “I’m in a class, and the professor is being totally unfair to me.” She started crying. I was stunned. I had just met the faculty member whom she was talking about. When I contacted him, he said, “Oh, please. She’s failing the class and trying every trick in the book to get her grade changed.” She had lied to me! And her lie was of course immediately going to be detected. That shocked me. I realized then I would need to deal with a more complex environment and a different group of people.

Jackson: Less mature, many of them a lot younger.

Wright: Right. And now I always keep a box of Kleenex on my desk! There was another female student who was desperately trying to get her grade changed to an A. The professor told me she didn’t really deserve an A, but he might do it because he could not stand it when women cried. So I said, “Send her to me.” I don’t care if you are a woman crying, I’m not going to change your grade. I thought it was interesting to learn how people react to certain pressures.

Service to the profession: It’s fun

Jackson: You have done an enormous amount of service work. You received the Distinguished Public Service Award from the American Mathematical Society in 2002, and after that kept doing even more! In particular, you served on two international panels to assess the mathematical sciences in the United Kingdom. I am interested to hear your observations about serving on those panels.

Wright: I was a member of the first panel, which was in 2006. Jean-Pierre Bourguignon, the greatest chair in the world, was the chair of it. The United Kingdom has what they call research councils to fund science. The one that handles mathematics is called EPSRC, Engineering and Physical Sciences Research Council. The EPSRC called for the international assessment. That was very interesting, and I learned a lot.

Four years later, another international assessment was held, and the EPSRC asked me to be the chair. Peter Hall from Australia was the vice-chair. We had both been on the first assessment, so we put into effect things that would make the new assessment more efficient.

One of the things that always happens in such reviews is that the people being reviewed prepare ahead of time, and they fill the time telling you what they’ve done. The panel members often end up being frustrated, because they can’t ask the questions they want to ask. So when I was chair of the panel, we tried as hard as we could to be absolutely ruthless about having time for the panel’s questions. Also, the panel felt it was essential that we have a fairly long session without anyone there from EPSRC. That was not welcomed by the EPSRC people, but they agreed.

Margaret Wright at the White House in 2006 to celebrate the awarding of National Medals of Science to Hy Bass (left) and Brad Efron (second from left). Wright served on the nominating committee. At right is Tony Chan, at the time serving as Assistant Director of Mathematical and Physical Sciences at the National Science Foundation.

Photo courtesy of Margaret Wright.

I was very happy with the review, and I think the committee was too. The EPSRC was not happy with the review. I won’t go into details, but the EPSRC was planning to do things a certain way, and the panel was pretty much unanimous in saying that we didn’t think what they were going to do was a good idea.

The overall point that the panel made was the unity of mathematics. You can’t plan the good research that will happen, so it’s important that all the people in mathematics respect and support each other, and the EPSRC should do the same. The EPSRC would have liked it much more if we’d divided mathematics up into little categories.

Jackson: The EPSRC was hoping for some kind of prioritized list of areas that would be good to support?

Wright: Yes, of course they were.

Jackson: What was the most fun you had in your public service?

Wright: Probably the UK review I was the chair of, because it was such a terrific committee. We had many enjoyable times and good discussions. We shared a similar, very positive view about mathematics and what should be done. So that was fun. It might seem weird to say it’s “fun”!

I would also say serving as president of SIAM was a lot of fun. It’s a great organization. We discussed a lot of interesting issues, and we started the Community Lecture for the general public and Diversity Day. I really enjoyed that.

I have enjoyed my committee work because I’ve met so many great people whom I would not have known otherwise, because they work in different fields.

Visibility in the community key for women

Jackson: I want to ask about your current thoughts on the status of women in mathematics and computer science. How have things changed since you first entered these fields?

Wright: They have obviously changed a lot. Department chairs used to say explicitly to women, “We are not going to promote you because you have a husband who will support you, you don’t need the money, and we have this man who needs to get a bigger raise than you do.” They would be very clear about it. It’s more subtle now, but I don’t think the problem is fully solved. By “problem” I mean that not everyone is treated fairly based on ability and that there are sometimes different standards. But it’s much better.

Certain things have contributed to that, like having more women on committees. There is evidence that if the committee picking invited speakers contains no women, women are much less likely to be invited as speakers than if there is a woman on the committee. Does this matter? Yes, because being visible in the community as an invited speaker makes a big difference. Similarly, being visible as the editor of a journal makes a difference. Generally, putting women in positions where they are visible and which are signs of respect is very important, and it happens much more now. The more women are in positions of power, I think the fairer it will get. Things are better but not ideal, and it will take a while before they are ideal.

Jackson: Do you have thoughts about why there are fewer women in computer science than in math?

Wright: A huge amount of research, some controversial, has been done about this. I don’t have any theories that are based on real data. One issue that was discussed a lot in the computer science program at Carnegie Mellon University was that the incoming male undergraduates often had been programming since high school or even junior high school. The females had not done anything like that. Is that because the girls were not encouraged? We don’t know.

Certainly the tech fields have been dominated by men. People have asked me and other women computer scientists, “What difference does it make if all the top people are men? If women are determined and competitive, they should want to get into a field where there is no one like them.” I’m sorry, but most people would not do that. I’m pretty competitive, but I want a career that I enjoy and where I will be welcome. I don’t think most people want to go into a field where no one would like them and they would be ignored or insulted.

Look at pictures in the newspaper of top executives in technology. It’s getting better, but for a long time it was 100 percent men. The press also loves the image of the rebellious young programmer wearing a T-shirt. They don’t think it’s very impressive if you’re just a regular person.

Jackson: There is the stereotype of a mathematician as somebody who’s absorbed in his work to the point of being isolated, withdrawn, even a bit crazy. That kind of behavior is not acceptable in women.

Wright: Right, and it’s similar in computer science — the stereotypical computer scientist is a little crazy. By the way, I feel comfortable in both fields!

I’m not very outgoing. I would be a terrible person in a job that required a lot of socializing. “You don’t want to come to the party, you just want to stay home and work?” someone might ask. “Yes,” I would say. In high school I was in the science club and the math club. These were not the “cool” people, the social people — no, it was a bunch of nerds, before being a nerd became something great. Now people look up to nerds. Well, good, it’s about time!

Footnotes

1:A video of Wright’s von Neumann Lecture can be found at mediauniweb.uv.es/iciam2019.

2:T. Sönmez and M. U. Ünver, “House allocation with existing tenants: A characterization,” Games and Economic Behavior 69 (2010), 425–445.

3:R. Fletcher and M. J. D. Powell, “A rapidly convergent descent method for minimization,” The Computer Journal 6 (1963/64), 163–168.

4:One example is: G. B. Dantzig, “On the need for a systems optimization laboratory,” in: Optimization Methods for Resource Allocation (Proceedings of a 1971 NATO Conference), R. W. Cottle, J. Krarup, eds., English Universities Press, London, 1974.

5:Daniel A. Spielman and Shang-Hua Teng, “Smoothed analysis of algorithms: why the simplex algorithm usually takes polynomial time,” Journal of the ACM 51:3 (2004), 385–463.

6:Victor Klee and George J. Minty, “How good is the simplex algorithm?” in Inequalities, III (Proc. Third Sympos., Univ. California, Los Angeles, Calif., 1969), pp. 159–175, Oved Shisha, ed., Academic Press, New York, 1972.

7:See L. G. Khachiyan, “A polynomial algorithm in linear programming,” (Russian), Doklady Akademii Nauk SSSR 244 (1979), no. 5, 1093–1096.

8:Malcolm W. Browne, “A Soviet discovery rocks world of mathematics,” New York Times, 7 November 1979.

9:It was nonsense, but the Russians were watching. Khachiyan died in 2005, and three years later Vladimir Gurvich published a short memorial in Discrete Applied Mathematics and recounted the following anecdote: “In 1979 papers about Leo appeared in the West, in SIAM News and even in the New York Times. For decades such publicity meant nothing good in the USSR…. Some of Leo’s colleagues in the Computing Center stopped saying hello to him, just in case; on the other hand, some strangers started to do so. The article in the New York Times compared Leo’s breakthrough in LP with the first Russian Sputnik. The word Sputnik caused concern in the highest Soviet spheres and Leo was summoned, not to the KGB but only to the GKNT (Government Committee for Science and Technology, an analogue of the French CNRS)…. An officer asked Leo how his result was related to Sputniks. In particular, he asked does it enable us to attack American Sputniks or, save God, vice versa. One should know Leo’s attitude. He did not answer, ‘What nonsense!’ Instead he said, ‘I find it very unlikely, however, I am not an expert in Sputniks.’ Sure, such an answer cannot be taken as a final one. So the officer kept asking until Leo got tired and confessed, as he frequently did, that his work is purely theoretical and has no practical applications at all. This could be safely reported to the higher level and the interviewer was happy.”

10:James Gleick, “Breakthrough in problem solving,” New York Times, 19 November 1984.

11:Philip E. Gill, Walter Murray, Michael A. Saunders, J. A. Tomlin, and Margaret H. Wright, “On projected Newton barrier methods for linear programming and an equivalence to Karmarkar’s projective method,” Mathematical Programming 36:2 (1986), 183–209.

12:Yurii Nesterov and Arkadii Nemirovskii, Interior-point polynomial algorithms in convex programming, SIAM Studies in Applied Mathematics, 13 (1994).

13:Kurt M. Anstreicher, “On long step path following and SUMT for linear and quadratic programming,” SIAM Journal on Optimization 6:1 (1996), 33–46.

14:Wright spoke on the history of interior-point methods at the American Mathematical Society meeting in Phoenix in 2004 and wrote up the lecture as “The interior-point revolution in optimization: history, recent developments, and lasting consequences,” Bulletin of the American Mathematical Society (N.S.) 42:1 (2005), 39–56.

15:Philip E. Gill, Walter Murray, and Margaret H. Wright, Practical Optimization, Academic Press, 1981. In 2019, the Society for Industrial and Applied Mathematics reissued Practical Optimization in its series “Classics in Applied Mathematics.”

16:Philip E. Gill, Walter Murray, and Margaret H. Wright, Numerical linear algebra and optimization, Addison-Wesley, 1991.

17:M. H. Wright, “Direct search methods: once scorned, now respectable,” pp. 191–208 in Numerical analysis 1995, D. F. Griffiths and G. A. Watson, eds., Pitman Res. Notes Math. Ser., 344, Longman, Harlow, 1996.

18:J. A. Nelder and R. Mead, “A simplex method for function minimization,” Computer Journal 7:4 (1965), 308–313.

19:K. I. M. McKinnon, “Convergence of the Nelder–Mead simplex method to a nonstationary point,” SIAM Journal on Optimization 9:1 (1999), 148–158; Jeffrey C. Lagarias, James A. Reeds, Margaret H. Wright, and Paul E. Wright, “Convergence properties of the Nelder–Mead simplex method in low dimensions,” SIAM Journal on Optimization 9:1 (1999), 112–147.

20:Jeffrey C. Lagarias, Bjorn Poonen, and Margaret H. Wright, “Convergence of the restricted Nelder–Mead algorithm in two dimensions,” SIAM Journal on Optimization 22:2 (2012), 501–532.

21:Wright described the history of the Nelder–Mead method in: “Nelder, Mead, and the other simplex method,” Documenta Mathematica, 2012, Extra vol.: Optimization stories, 271–276.