Case Studies Highlighting CFAR’s Impact on Existential Risk

January 2017

One way to assess what CFAR has done is to look at our alumni. What are CFAR alumni doing, and how do they think that their involvement with CFAR has influenced their path?

For these case studies, we have chosen to focus on alumni who are doing work that is relevant to existential risk, especially in the domain of AI safety, with an emphasis on how their interactions with CFAR and the surrounding community have influenced their path. We attempt to compare their actual path with guesses at how things would have gone in the counterfactual world where they did not become involved with CFAR.

The case studies below are divided into two sections: organizations in which CFAR alumni have played a large role, and individual CFAR alumni. They include people who are working directly on existential risk, as well as people doing broader work in effective altruism that we consider relevant to existential risk (along with other cause areas).

Organization Case Studies

MIRI

The Machine Intelligence Research Institute is one of the main organizations where technical research on AI safety takes place. MIRI narrowed its mission in 2013 to focus on technical research; it currently has 6 research fellows, along with several other people who contribute to its research and a general staff that keeps the organization running.

Most of MIRI’s staff have attended a CFAR workshop, and many attribute some role to CFAR in the path that led them to MIRI or in increasing their effectiveness. MIRI’s executive director Nate Soares estimates that the gains in personal efficacy that staff have made through their interactions with CFAR account for 5-10% of MIRI’s total research output. For example, he thinks that the CFAR content on double crux has helped MIRI researchers collaborate more effectively. Nate considers MIRI’s research team to have been heavily talent-constrained from 2013 to 2016, in that if one of their researchers hadn’t joined MIRI, then MIRI would most likely have had a smaller research team rather than a different researcher on staff.

Here are brief accounts of the role that CFAR played for several members of the MIRI team.

Benya Fallenstein was the first major research hire that MIRI made after choosing to focus on technical research, and is the lead researcher on the agent foundations research agenda (one of MIRI’s two research agendas). Before coming to a CFAR workshop in July 2013, Benya had collaborated with MIRI on research while attending the University of Vienna. MIRI had discussed hiring her full-time, but she was very hesitant to do so because (for various hard-to-articulate reasons) the idea felt psychologically untenable to her. In a dialog with a CFAR staff member shortly after the July 2013 workshop, Benya was able to figure out why leaving her PhD program felt so bad (making use of CFAR techniques such as goal factoring). She realized that these downsides were fixable, and she then made plans for working at MIRI that met her needs and felt tenable.

Nate attributes much of MIRI’s success in pivoting towards pure technical research to Benya’s influence, noting that in addition to her strong technical work she has played the primary role in creating MIRI’s research culture. She also helped attract other researchers to MIRI. Nate attributes his own presence at MIRI to Benya; he was brought on as a researcher in 2014 after attending a MIRI research workshop because Benya asked after him, and expects it would have taken him at least a couple additional years to become involved if not for Benya. Nate was heavily involved in MIRI’s preparations for the January 2015 AI safety conference and became MIRI’s executive director in May 2015, so events would have happened quite differently if Nate’s arrival at MIRI had been delayed.

Jessica Taylor joined MIRI in August 2015 and is the lead researcher on MIRI’s machine learning research agenda (one of MIRI’s two research agendas). Nate Soares guesses that MIRI would probably not have a machine learning research agenda if Jessica were not at MIRI. Before attending a CFAR workshop in January 2015, Jessica was already involved in the broader Bay Area rationality community and had a default plan of doing machine learning research at Google, with hopes that this would have some impact on AI safety. At her CFAR workshop, Jessica chose to focus heavily on the question of her career path (including by applying CFAR techniques like goal factoring, by thinking through the question in conversations with other participants, and by having conversations about it with CFAR staff), and wound up deciding to work at MIRI. She guesses that it is more likely than not that she would have eventually joined MIRI even without her CFAR experience, but that it would have been delayed, and that there is a 10-50% chance that it would not have happened. Jessica estimates that things that she has gotten from CFAR (including alumni events and interactions with the alumni network) have made her roughly 15-20% more productive. She reports that it seems to have helped with: “general productivity, modeling the work I’m doing, thinking about what things about the work feel good or bad, conversational strategies like double crux, noticing motivation problems, debugging them, figuring out what I actually think about things, and figuring out what things I’m afraid of.” For example, Jessica has found Focusing (an introspection technique which she learned through CFAR) useful when she gets stuck on a research question. Posing questions to herself like “Why do I even think this is possible?” and exploring the thoughts and intuitions that emerge often leads to her getting unstuck.

Scott Garrabrant joined MIRI in December 2015 and is primarily responsible for MIRI’s research on logical induction, which Nate Soares considers to be MIRI’s most important research result of the past 5 years. Scott had already made progress on some of MIRI’s research problems while he was a graduate student in mathematics at UCLA before attending the 2015 MIRI Summer Fellows Program, a three-week training for prospective AI safety researchers run by CFAR and funded by MIRI. Scott says, “Before the workshop I had a year left in graduate school and was trying to decide between starting a postdoc in combinatorics or working for MIRI when I graduated. Because of the plan change I had from the workshop, I not only committed to the MIRI option, but worked hard to quickly complete my PhD in 5 months which allowed me to work at MIRI sooner. The workshop facilitated me being around people who got me excited about the MIRI option and also encouraged me to munchkin and graduate at a weird time.” Scott guesses that, if he had not attended CFAR, he most likely would have joined MIRI half a year later but might have chosen to do a postdoc instead. While at MIRI, Scott has noticed several ways in which his experiences with CFAR have influenced his research process. Among other examples, Scott mentions that he spends more time reflecting on how he does research, he has formed long-term collaborations with people who he otherwise would not have reached out to, and he has used internal double crux to find a new way to conceptualize a research question that he was working on.

Patrick LaVictoire is a research fellow at MIRI. He reports: “At the time of the 2012 CFAR workshop, I was having difficulty deciding between an academic career and working as a data scientist. The workshop introduced me to several ideas (particularly IFS and goal factoring) which helped me eventually make the decision to leave academia for Silicon Valley. It also deepened my existing connections with people in the MIRI community, which led to me attending MIRI workshops and co-writing a paper, and eventually joining MIRI in 2015. (Subjectively and with the benefit of hindsight, I think the existence of CFAR changed my likelihood of working at MIRI from about 30% to about 60%, though I don’t expect to be especially well-calibrated on such questions.)”

Tsvi Benson-Tilsen is a graduate student in logic at UC Berkeley who works as a research associate with MIRI. He has assisted with Scott Garrabrant’s research and helped prepare MIRI papers. Tsvi was already in graduate school and interested in AI safety when he attended the 2015 MIRI Summer Fellows Program, a training run by CFAR and funded by MIRI. He had assumed that he would need to wait several years until he had more education and qualifications before contributing to technical research on AI safety, but his experience at MSFP led him to realize that he didn’t need to wait for permission before doing the work and made him feel more comfortable diving in and contributing what he could. He also credits CFAR with improving his ability to do research, including some specific techniques (double crux, trigger-action plans, urge propagation, and goal factoring) as well as a broader mindset shift towards introspecting on his research process and looking for ways to improve. In particular, Tsvi reports that an increased ability to think about the big picture of research has been essential to his work on AI safety.

FLI

The Future of Life Institute is a research and outreach organization founded in March 2014 that works to mitigate existential risks facing humanity, particularly existential risk from advanced artificial intelligence. Of its 5 co-founders, 4 are CFAR alumni (Jaan Tallinn, Max Tegmark, Meia Chita-Tegmark, and Victoria Krakovna).

FLI organized the January 2015 AI safety conference in Puerto Rico, which brought together leading AI researchers in academia and industry with leading AI safety researchers to help foster a field-wide discussion of AI safety. This conference led to a $10M grant for AI safety research from Elon Musk (to which the Open Philanthropy Project added $1.2M), which was administered by FLI. It also led to an open letter on Research Priorities for Robust and Beneficial Artificial Intelligence, which has been signed by over 8,000 people, including technology leaders such as Peter Norvig (Director of Research at Google), Steve Wozniak (co-founder of Apple), and Eric Horvitz (Microsoft Research Director). Since then, FLI has organized various other events which have helped bring concern about AI safety into the mainstream of the field of machine learning, while also beginning to work on other sources of existential risk such as nuclear weapons.

The Boston rationality community played an important role in the start of FLI. Boston had an active rationality community in 2010 (including monthly Less Wrong meetups), which became significantly more active and coordinated in 2011-13 after many of its members attended CFAR workshops and cross-fertilized with the San Francisco Bay Area rationality community. According to Boston rationalist and CFAR alum Jim Babcock, a major turning point came in 2013 when he and several other Boston rationalists decided to live together in a group house which they called Citadel (modeled on the rationalist group houses which they had seen in California and New York). Citadel became a hub for the Boston rationality community, hosting rationality practice sessions, visitors, discussions of the most important problems facing humanity, and other community events. Many Citadel residents were passionate about seeking to reduce existential risk.

One of FLI’s five co-founders, Victoria Krakovna, was part of the Boston rationality community and a founding resident of Citadel. Victoria is currently a researcher on DeepMind’s AI safety team; her story is described in more detail in an individual case study below. In brief, Victoria attended a proto-CFAR workshop in 2011, which contributed to her becoming heavily involved in the growing Boston rationality community, which led to her meeting FLI co-founders Jaan Tallinn and Max Tegmark and becoming part of the team that launched FLI.

Three other FLI co-founders (Jaan Tallinn, Max Tegmark, Meia Chita-Tegmark) attended the March 2013 CFAR workshop and connected with the Boston rationality community in the fall of 2013. Jaan, Max, and FLI co-founder Anthony Aguirre had previously collaborated on the 2011 Foundational Questions Institute conference, and in 2013 they and Meia were interested in doing something more about existential risk. Meeting a community of people who were passionate about existential risk helped crystallize the vision of creating an organization where idealistic volunteers would play a large role in bringing in ideas and making things happen. In the words of Max Tegmark, “CFAR was instrumental in the birth of the Future of Life Institute: 4 of our 5 co-founders are CFAR alumni, and seeing so many talented idealistic people motivated to make the world more rational gave me confidence that we could succeed with our audacious goals.”

In 2014, FLI volunteers did much of the work required to launch a new organization and plan a major conference. Some of the volunteers came from the Boston rationality community (others came from universities and EA groups), with Victoria in charge of coordinating volunteers and Citadel serving as a hub of volunteer activity. Members of the broader rationality/EA community outside of Boston (some of whom were connected via CFAR alumni) also contributed their expertise to the project (for example, by helping to design FLI’s website). Jim Babcock guesses that, in the absence of CFAR, Citadel house would not have been formed in 2013 and the Boston rationality community would have been much less capable of supporting the founding of FLI. Without inspiration and support from the rationality community, Victoria guesses that FLI probably would not have been founded in the form it took and there may not have been an AI safety conference in January 2015, though some of FLI’s co-founders most likely would have pursued some project aimed at mitigating existential risk.

Arbital

Arbital is an online discussion platform intended to make it easier to discuss complex topics such as existential risk. It is a small startup which has the potential to have a substantial impact on public discourse and the chances of an existential win. The seed of Arbital was an idea that Eliezer Yudkowsky, currently an Arbital advisor, had in 2014. Work on Arbital began in 2015 when Alexei Andreev decided to take on the project of building it, and the four-person team at Arbital formed in spring of 2016.

All four members of the Arbital team are CFAR alums, and each attributes a substantial causal role to CFAR in the path that led them to Arbital. Arbital co-founder Alexei Andreev describes Arbital as a project that requires both strong technical skill and close value alignment, which makes it relatively difficult to find people for its team.

Alexei Andreev is a co-founder of Arbital. When Alexei first came to a proto-CFAR workshop in 2011, he was a game designer who had recently become passionate about rationality and existential risk, and was considering making games that would educate people about those two topics. In 2012 he left the gaming industry to get a job at Google doing earning to give (donating primarily to MIRI); in 2014 he left Google to pursue jobs at startups which involved higher expected donations (but more risk), and heard about Eliezer’s idea which was the seed of Arbital; and in 2015 he decided to pursue the Arbital idea instead of doing earning to give.

Each of these steps could have happened in the counterfactual world where Alexei was not involved with CFAR, but there is some chance that each step would have been delayed or would not have taken place. Conversations with CFAR staff in 2011-12 helped Alexei reach the conclusion that earning to give was a more promising option than direct work in gaming (but Alexei suspects that counterfactual Alexei would also have reached that conclusion, though not as quickly). The 2011 workshop helped Alexei develop his social skills, which he thinks may have contributed to his success at finding jobs. Knowing other CFAR alums who had founded successful startups may have contributed to his decision to leave Google in 2014. Counterfactual Alexei might not have heard about the idea for Arbital when actual Alexei did, though he thinks that his relationship with MIRI probably would have led to him learning about the idea. Alexei suspects that counterfactual Alexei would have been less involved in the rationality community, and therefore less eager to take on a heavily rationality-flavored project like Arbital. He also notes that he would’ve been significantly less likely to choose to do Arbital if he hadn’t already decided to do a startup, so progressing more slowly along the same career path might have prevented him from starting Arbital.

Eric Rogstad is a co-founder of Arbital. His best guess is that, if he had not attended the June 2014 CFAR workshop, he would be working in a typical software job and donating 10% of his income to a mix of GiveWell-recommended charities and MIRI. Attending a CFAR workshop put him in connection with a community of people who were working directly on existential risk reduction, which made direct work a salient option to consider rather than just “one of those things that it might be good to do.” Eric cites specific conversations with other CFAR alums as instrumental to his decision to pursue direct work, including conversations at an alumni workshop where he tried to sort through why he didn’t feel fully on board with his explicit verbal models of likely future scenarios. By late 2015, Eric was looking to do direct work related to existential risk, and he decided to join Alexei (whom he originally met at the 2014 CFAR alumni reunion) in co-founding Arbital.

Eric Bruylant is Arbital’s community architect, with the role of building the community of users and predicting how the software’s features will impact the community. Before attending the 2015 MIRI Summer Fellows Program (run by CFAR and funded by MIRI), Eric had a strong interest in AI safety, had read lots of relevant content on the internet, and had ambitious ideas about ways to have a large positive impact on the world. After attending MSFP, Eric developed a skillset and social network which have allowed him to work productively towards bringing about some of his ambitious ideas. Eric described the mindset of people at CFAR as “the exact opposite of learned helplessness”, and found that experiencing more of this mindset, in combination with an increased ability to see what was going on with his mind, was particularly helpful for making this shift. He also developed an ability to talk with people about his ideas in a way that built social momentum towards action (at least in cases where the idea held up to their scrutiny). Eric guesses that these changes probably would not have happened if he had not attended CFAR. Eric learned about Arbital from Alexei while he was in the Bay Area for MSFP, and in early 2016 he decided that he might have a comparative advantage at the Arbital community architect role and got back in touch with Alexei about joining the Arbital team.

Stephanie Zolayvar is a software engineer and designer at Arbital. In 2014, she was a software engineer at Google in Seattle, doing earning to give primarily to global poverty charities and also to existential risk charities. Her best guess is that she would still be in Seattle doing something similar if she had not attended CFAR (though it’s possible that another event, like EA Global, might have shifted her path). Steph attended the September 2014 CFAR workshop, followed by a CFAR mentorship training workshop, followed by volunteering at several CFAR workshops. After getting to know people at CFAR events who were doing direct work, Steph started to think of herself as being in the reference class of people who could do direct work. In mid-2015, she decided to leave Google and move to the Bay Area to look for opportunities to do direct work rather than earning to give (with the extra benefit that she’d be moving close to friends that she’d made). Steph also set about becoming capable of doing whatever work she considered important; she estimates that 1/3 of the progress that she’s made towards this goal has come through CFAR programs and 2/3 through other things happening in the Bay Area rationality/EA community. Without this progress, she thinks that she probably would not have been able to work effectively at a startup such as Arbital.

Individual Case Studies

Victoria Krakovna

Victoria Krakovna is a member of DeepMind’s safety team, where she does technical research on AI safety. She is also one of the 5 co-founders of the Future of Life Institute (described in a separate case study above).

Victoria first attended a proto-CFAR workshop in summer 2011, and has regularly participated in CFAR events since then. Before 2011, Victoria had not given much thought to AI safety and had mainly engaged with the rationality community by reading things online. When Victoria moved to Boston for her PhD program in fall 2011, she connected with some of the Boston rationalists whom she had met at the workshop and quickly became involved in the growing Boston rationality community. Conversations with members of the community contributed to her taking the possibility of AI-related existential risk more seriously. In 2013, Victoria was one of a group of Boston rationalists who came together to start a group house called Citadel. She first met her fellow FLI co-founders Jaan Tallinn and Max Tegmark when they visited Citadel that fall. Victoria’s work at FLI led to her meeting the leaders of DeepMind at FLI’s AI safety conference, which eventually led to her opportunity to join DeepMind’s AI safety team. Victoria guesses that she would have been less likely to become heavily involved in the Boston rationality community if not for the connections that she made through CFAR, and that it is less likely that she’d be working on AI safety if not for her engagement with the Boston rationality community.

Victoria also suspects that her involvement with CFAR influenced the way that she approached graduate school. In the fourth year of her PhD program, she switched to a new project with a new advisor which was a better fit for her interests. She thinks that the strategic ways of thinking that she practiced at CFAR (and within the broader rationality community) contributed to her willingness to change course, which helped her finish her PhD at least a year sooner than she otherwise would have. Having her PhD was essential for getting a research job at DeepMind.

Peter McIntyre

Peter McIntyre is the Director of Coaching at 80,000 Hours, an organization which conducts research on high-impact careers and provides career advice to help people maximize the positive impact of their life’s work (some of the people they have worked with aim to do this through work on AI safety). He has also co-founded Effective Altruism Australia, which has raised about 500,000 AUD for GiveWell-recommended global poverty charities.

In January 2015, Peter attended a CFAR workshop and interned at CFAR for a month. At the time, he was a student in medical school in Australia who was highly interested in effective altruism. He was planning to finish his last 2 years of medical school and then explore career options such as consulting, working for EA organizations, and startups.

Peter cites two main impacts of his time with CFAR. First, it heightened his focus on self-development, which has led to increased personal efficacy. This included a broad mindset shift towards being more proactive about solving problems, as well as regular use of some CFAR techniques such as pre-hindsight and resolve cycles. Second, Peter’s time with CFAR led to him running the 2015 EA Global Melbourne conference, which was pivotal to his career. At his workshop, Peter met other participants who were involved in organizing EA Global, which led to him coordinating with them to run EAG Melbourne. At EAG Melbourne, Peter had conversations that led him to decide to take time off from medical school to immediately explore other job options. Further, through his work on EA Global he made connections that led to job offers at Vesparum Capital, the Future of Humanity Institute, and 80,000 Hours. He worked briefly at Vesparum and FHI before deciding that 80K was the organization where he felt he could make the biggest contribution. Peter guesses that, if he had not come to CFAR in January 2015, it is more likely than not that he wouldn’t have become involved in organizing EA Global (though he still would have attended) and would have finished medical school, perhaps working for an EA organization after a 2-year delay.

Stefan Schubert

Stefan Schubert is a researcher at the Centre for Effective Altruism who has done work on the epistemics of public discourse and is now focused primarily on cause prioritization: coming up with new considerations and models which are relevant for determining how to do good effectively. He initially came across the online rationality community in late 2013 when he was a postdoc at the London School of Economics, which (he suspects) put him on a path towards eventually getting involved in effective altruism. His involvement in EA accelerated rapidly after he attended a CFAR workshop in June 2014, and accelerated further when he then decided to attend EA Global 2014. He began to feel a stronger sense of urgency to do important work, which he attributes in large part to meeting people at the workshop who were very motivated to do good in the world. Inspired by his engagement in the rationality and EA communities, Stefan undertook projects in political rationality, including co-founding the Swedish Network for Evidence-Based Policy. He joined CEA at the beginning of 2016. If he had not come to CFAR in 2014, Stefan guesses that he would not have attended EA Global 2014 and it would have taken longer for him to become heavily involved in EA. Possibly he would have followed a path very similar to his actual path but with a 1-2 year delay, or he might have ended up in a more traditional academic job doing research which drew on EA ideas but was less directly focused on what he sees as the most important open questions in EA.

Ben Hoffman

Ben Hoffman writes about effective altruism, rationality, and existential risk, with the overarching aims of building detailed models of the relevant parts of the world and improving the quality of discourse in the EA and rationality communities. He recently spent several months doing detailed research into the AI landscape to develop a deeper understanding of the risks posed by AI.

Ben has participated in several CFAR programs, beginning with the summer 2011 workshop run by proto-CFAR, and has also interacted with the surrounding rationality community in ways that make it difficult to disentangle the effects. His best guess is that, if not for the influence of the “CFARbundle” (including CFAR and the surrounding rationality community), he would have ended up doing earning to give, perhaps with a job in finance, donating to MIRI and other EA causes. Instead, Ben decided to move to the San Francisco Bay Area and do direct work, first at GiveWell and now independently.

Ben cites 3 main CFARbundle-influenced epistemic shifts which led to his current work. First, he became more comfortable attempting to build his own models, guided by his own taste and judgment, rather than attempting to reason in ways which met external standards or official decision rules. Second, he began seeking out conversations with people in his broader social network who were good sources of information and arguments, or whom he could use as a sounding board to help him think things through. Third, in part as a result of the first two shifts, he was exposed to a wider range of viewpoints, arguments, and perspectives. These three epistemic shifts contributed to his current work on existential risk by making him feel capable of building his own models of the topic, exposing him to a great deal of information about what other people have been thinking and doing about existential risk, and leading him to notice that there seemed to be a scarcity of people who were working on interesting things related to existential risk.

Ben estimates that CFAR is directly responsible for roughly 20% of these epistemic shifts, with an additional 40% coming from the broader rationality community (with CFAR playing an indirect role by having some influence on the surrounding community and by causing Ben to have more contact with that community). Ben found that CFAR content helped directly with doing more self-guided model-building, and thinks that the CFAR mindset towards problem-solving helped him notice that he was ignoring an obvious resource by not making use of the people around him in his thinking.


Thanks to the alumni who agreed to be profiled in these case studies! Many other CFAR alumni are doing valuable work on existential risk and other important causes, including work at OpenAI and at EA organizations. If you are a CFAR alum who is interested in being included in a future case study, let us know.