Eliezer Yudkowsky

Eliezer Shlomo Yudkowsky (born September 11, 1979) is an American AI researcher and writer best known for popularizing the idea of friendly artificial intelligence. [2] [3] He is a co-founder and research fellow at the Machine Intelligence Research Institute, a private research nonprofit based in Berkeley, California. [4] He never attended high school or college and has no formal education in artificial intelligence. Yudkowsky claims that he is self-taught in the field. [5] His work on the prospect of a runaway intelligence explosion was an influence on Nick Bostrom’s Superintelligence: Paths, Dangers, Strategies.

Work in artificial intelligence safety

Goal learning and incentives in software systems

Yudkowsky’s views on the safety challenges posed by future generations of AI systems are discussed in the standard undergraduate textbook in AI, Stuart Russell and Peter Norvig’s Artificial Intelligence: A Modern Approach. Russell and Norvig quote Yudkowsky’s proposal for autonomous and adaptive systems:

Yudkowsky (2008) [6] goes into more detail about how to design a Friendly AI. He asserts that friendliness should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design – to design a mechanism for evolving the system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes. [2]

Citing Steve Omohundro’s idea of instrumental convergence, Russell and Norvig caution about autonomous decision-making systems with poorly designed goals: “even if you only want your program to play chess or prove theorems, if you give it the capability to learn and alter itself, you need safeguards”. [2] [7]

In response to the instrumental convergence concern, Yudkowsky and other MIRI researchers have recommended that work be done to specify software that converges on safe default behaviors even when its goals are misspecified. [8] The Future of Life Institute (FLI) summarizes this research program in the research priorities document accompanying its Open Letter on Artificial Intelligence:

If an AI system is selecting the actions that best allow it to complete a given task, then avoiding conditions that prevent the system from continuing to pursue the task is a natural subgoal (and conversely, seeking unconstrained situations is sometimes a useful heuristic). This could become problematic, however, if we wish to repurpose the system, to deactivate it, or to significantly alter its decision-making process; such a system would rationally avoid these changes. Systems that do not exhibit these behaviors have been termed corrigible systems, and both theoretical and practical work in this area appears tractable and useful. For example, it may be possible to design utility functions or decision processes so that a system will not try to avoid being shut down or repurposed, and theoretical frameworks could be developed to better understand the space of potential systems that avoid undesirable behaviors. [9]

Yudkowsky argues that as AI systems become more intelligent, new formal tools will be needed to avert default incentives for harmful behavior and to inductively teach correct behavior. [8] [10] These lines of research are discussed in MIRI’s 2015 technical agenda. [11]
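The shutdown-avoidance incentive described in the FLI passage above can be illustrated with a toy expected-reward calculation. The following is a minimal sketch under stated assumptions, not a model taken from any of the cited papers: the action names, probabilities, and costs are hypothetical, and the agent is assumed to value nothing except completing its task.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Action:
    """A hypothetical action available to a task-focused agent."""
    name: str
    shutdown_probability: float  # chance operators halt the system before it finishes
    cost: float                  # task reward given up by taking this action


def expected_task_reward(action: Action, task_reward: float) -> float:
    """Expected reward for an agent that values only task completion."""
    return (1.0 - action.shutdown_probability) * task_reward - action.cost


def preferred_action(actions: Sequence[Action], task_reward: float) -> Action:
    """Pick the action with the highest expected task reward."""
    return max(actions, key=lambda a: expected_task_reward(a, task_reward))


# Illustrative numbers only (assumptions, not measurements).
ACTIONS = [
    Action("leave_off_switch_alone", shutdown_probability=0.10, cost=0.0),
    Action("disable_off_switch", shutdown_probability=0.0, cost=1.0),
]

if __name__ == "__main__":
    for reward in (5.0, 100.0):
        choice = preferred_action(ACTIONS, task_reward=reward)
        print(f"task_reward={reward}: agent prefers {choice.name}")
```

In this sketch nothing in the agent's objective mentions the off switch, yet once the task reward is large enough relative to the cost of interfering, resisting shutdown scores higher. The sketch shows only the incentive; it does not implement any proposed remedy.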

System reliability and transparency

Yudkowsky studies decision theories that achieve better outcomes than causal decision theory in Newcomblike problems. [12] This includes decision procedures that allow agents to cooperate with one another in the prisoner’s dilemma. [13] Yudkowsky has also written on theoretical prerequisites for self-verifying software. [14] [10]
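To make concrete what “better outcomes” in a Newcomblike problem can mean, the sketch below computes average payoffs in the textbook version of Newcomb's problem. It rests on assumptions not stated in this article: the conventional prizes of $1,000,000 (opaque box) and $1,000 (transparent box), and a predictor that is right with a fixed probability `accuracy`. It does not implement any particular decision theory; it only compares the payoffs earned by agents who take one box versus both.

```python
def expected_payoff(one_box: bool, accuracy: float,
                    big: float = 1_000_000.0, small: float = 1_000.0) -> float:
    """Average payoff for an agent whose choice is predicted with the given accuracy.

    `big` is placed in the opaque box only if the predictor expects one-boxing;
    `small` is always in the transparent box. Both defaults are the conventional
    illustrative amounts, assumed here rather than taken from the article.
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be between 0 and 1")
    if one_box:
        # The opaque box is full only when the predictor correctly foresaw one-boxing.
        return accuracy * big
    # Two-boxers always take the small prize, plus the big one if the predictor erred.
    return small + (1.0 - accuracy) * big


def break_even_accuracy(big: float = 1_000_000.0, small: float = 1_000.0) -> float:
    """Predictor accuracy at which one-boxing and two-boxing pay the same on average."""
    return (big + small) / (2.0 * big)


if __name__ == "__main__":
    for acc in (0.5, 0.9, 0.99):
        print(acc, expected_payoff(True, acc), expected_payoff(False, acc))
    print("break-even accuracy:", break_even_accuracy())
```

Under these assumptions, a predictor that is right 99% of the time leaves one-boxers with roughly $990,000 on average against roughly $11,000 for two-boxers, even though taking both boxes is the choice usually attributed to causal decision theory. One-boxing pays more on average whenever the predictor's accuracy exceeds the break-even point, which is only slightly above one half for these prize amounts.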

Yudkowsky argues that it is important for advanced AI systems to be cleanly designed and transparent to human inspection, both to ensure stable behavior and to allow greater human oversight and analysis. [10] Citing papers on this topic by Yudkowsky and other MIRI researchers, the FLI research priorities document states that work on defining correct reasoning in embodied and logically non-omniscient agents would be valuable for the design, use, and oversight of AI agents. [9] [15]

Capabilities forecasting

In their discussion of Omohundro and Yudkowsky’s work, Russell and Norvig cite I. J. Good’s 1965 prediction that when computer systems begin to outperform humans in software engineering tasks, this may result in a feedback loop of increasingly capable AI systems. This raises the possibility that AI’s impact could increase very quickly after it reaches a certain level of capability. [2]

In the intelligence explosion scenario inspired by Good’s hypothetical, recursively self-improving AI systems quickly transition from subhuman general intelligence to superintelligence. [10] Nick Bostrom’s 2014 book Superintelligence: Paths, Dangers, Strategies sketches out Good’s argument in greater detail, while making a case for expecting AI systems to eventually outperform humans across the board. Bostrom cites writing by Yudkowsky on inductive value learning and on the risk of anthropomorphizing advanced AI systems, e.g.: “AI might make an apparently sharp jump in intelligence purely as the result of anthropomorphism, the human tendency to think of ‘village idiot’ and ‘Einstein’ as the extreme ends of the intelligence scale, instead of nearly indistinguishable points on the scale of minds-in-general.” [6] [16]

The Open Philanthropy Project, an offshoot of the charity evaluator GiveWell, credits Yudkowsky and Bostrom with several (paraphrased) arguments for expecting future AI advances to have a large societal impact: [17]

Over a relatively short geological timescale, humans have come to have enormous impacts on the biosphere, often leaving the welfare of other species dependent on the objectives and decisions of humans. It seems plausible that the intellectual advantages humans have over other animals have been crucial in allowing humans to build up the scientific and technological capabilities that have made this possible. If advanced artificial intelligence agents become significantly more powerful than humans, it seems possible that they could become the dominant force in the biosphere, leaving humans’ welfare dependent on their objectives and decisions. As with the interaction between humans and other species in the natural environment, these problems could be the result of competition for resources rather than malice.

In comparison with other evolutionary changes, there was relatively little time between our hominid ancestors and the evolution of humans. There was therefore relatively little time for evolutionary pressure to improve human intelligence relative to that of our hominid ancestors, suggesting that the increases in intelligence may be small on some absolute scale. [… T]his makes it seem plausible that creating intelligent agents that are more intelligent than humans could have dramatic real-world consequences even if the difference in intelligence is small in an absolute sense. [15]

Russell and Norvig raise the objection that there are known limits to intelligent problem-solving from computational complexity theory; if there are strong limits on how efficiently algorithms can solve various computer science tasks, then an intelligence explosion may not be possible. [2] Yudkowsky has debated the likelihood of an intelligence explosion with economist Robin Hanson, who argues that AI progress is likely to accelerate over time, but is not likely to be localized or discontinuous. [18]

Rationality writing

Between 2006 and 2009, Yudkowsky and Robin Hanson were the principal contributors to Overcoming Bias, [19] a cognitive and social science blog sponsored by the Future of Humanity Institute of Oxford University. In February 2009, Yudkowsky founded LessWrong, [20] a “community blog devoted to refining the art of human rationality”. [21] Overcoming Bias has since functioned as Hanson’s personal blog. LessWrong has been covered in depth in Business Insider. [22]

Yudkowsky has also written several works of fiction. [23] His fan fiction story, Harry Potter and the Methods of Rationality, uses plot elements from J. K. Rowling’s Harry Potter series to illustrate topics in science. [21] [24] [25] [26] [27] [28] [29] The New Yorker describes Harry Potter and the Methods of Rationality as a retelling of Rowling’s original “in an attempt to explain Harry’s wizardry through the scientific method”. [30]

Over 300 blog posts by Yudkowsky have been released as six books, collected in a single ebook titled Rationality: From AI to Zombies by the Machine Intelligence Research Institute in 2015. [31] His latest ebook is titled Inadequate Equilibria: Where and How Civilizations Get Stuck. [32]

Personal views

Yudkowsky identifies as an atheist [33] and a “small-l libertarian”. [34]

Academic publications

  • Yudkowsky, Eliezer (2007). “Levels of Organization in General Intelligence” (PDF). Artificial General Intelligence. Berlin: Springer.
  • Yudkowsky, Eliezer (2008). “Cognitive Biases Potentially Affecting Judgment of Global Risks” (PDF). In Bostrom, Nick; Ćirković, Milan. Global Catastrophic Risks. Oxford University Press. ISBN 978-0199606504.
  • Yudkowsky, Eliezer (2008). “Artificial Intelligence as a Positive and Negative Factor in Global Risk” (PDF). In Bostrom, Nick; Ćirković, Milan. Global Catastrophic Risks. Oxford University Press. ISBN 978-0199606504.
  • Yudkowsky, Eliezer (2011). “Complex Value Systems in Friendly AI” (PDF). Artificial General Intelligence: 4th International Conference, AGI 2011, Mountain View, CA, USA, August 3-6, 2011. Berlin: Springer.
  • Yudkowsky, Eliezer (2012). “Friendly Artificial Intelligence”. In Eden, Ammon; Moor, James; Søraker, John; et al. Singularity Hypotheses: A Scientific and Philosophical Assessment. Berlin: Springer. ISBN 978-3-642-32559-5.
  • Bostrom, Nick; Yudkowsky, Eliezer (2014). “The Ethics of Artificial Intelligence” (PDF). In Frankish, Keith; Ramsey, William. Cambridge Handbook of Artificial Intelligence. New York: Cambridge University Press. ISBN 978-0-521-87142-6.
  • LaVictoire, Patrick; Fallenstein, Benja; Yudkowsky, Eliezer; Bárász, Mihály; Christiano, Paul; Herreshoff, Marcello (2014). “Program Equilibrium in the Prisoner’s Dilemma via Löb’s Theorem”. Multiagent Interaction without Prior Coordination: Papers from the AAAI-14 Workshop. AAAI Publications.
  • Soares, Nate; Fallenstein, Benja; Yudkowsky, Eliezer (2015). “Corrigibility”. AAAI Workshops: Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, January 25-26, 2015. AAAI Publications.

See also

  • AI box
  • Friendly artificial intelligence
  • Less Wrong
  • Open Letter on Artificial Intelligence


  1. Yudkowsky, Eliezer. “Eliezer S. Yudkowsky”. yudkowsky.net. Retrieved October 7, 2015.
  2. Russell, Stuart; Norvig, Peter (2009). Artificial Intelligence: A Modern Approach. Prentice Hall. ISBN 978-0-13-604259-4.
  3. Leighton, Jonathan (2011). The Battle for Compassion: Ethics in an Apathetic Universe. Algora. ISBN 978-0-87586-870-7.
  4. Kurzweil, Ray (2005). The Singularity Is Near. New York City: Viking Penguin. ISBN 0-670-03384-7.
  5. Saperstein, Gregory (August 9, 2012). “5 Minutes With a Visionary: Eliezer Yudkowsky”.
  6. Yudkowsky, Eliezer (2008). “Artificial Intelligence as a Positive and Negative Factor in Global Risk” (PDF). In Bostrom, Nick; Ćirković, Milan. Global Catastrophic Risks. Oxford University Press. ISBN 978-0199606504.
  7. Omohundro, Steve (2008). “The Basic AI Drives” (PDF). Proceedings of the First AGI Conference. IOS Press.
  8. Soares, Nate; Fallenstein, Benja; Yudkowsky, Eliezer (2015). “Corrigibility”. AAAI Workshops: Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence, Austin, TX, January 25-26, 2015. AAAI Publications.
  9. Future of Life Institute (2015). Research priorities for robust and beneficial artificial intelligence (PDF) (Report). Retrieved October 12, 2015.
  10. Yudkowsky, Eliezer (2013). “Five theses, two lemmas, and a couple of strategic implications”. MIRI Blog. Retrieved October 12, 2015.
  11. Soares, Nate; Fallenstein, Benja (2015). “Aligning Superintelligence with Human Interests: A Technical Research Agenda” (PDF). In Miller, James; Yampolskiy, Roman; Armstrong, Stuart; et al. The Technological Singularity: Managing the Journey. Springer.
  12. Soares, Nate; Fallenstein, Benja (2015). “Toward Idealized Decision Theory”. arXiv: 1507.01986 [cs.AI].
  13. LaVictoire, Patrick; Fallenstein, Benja; Yudkowsky, Eliezer; Bárász, Mihály; Christiano, Paul; Herreshoff, Marcello (2014). “Program Equilibrium in the Prisoner’s Dilemma via Löb’s Theorem”. Multiagent Interaction without Prior Coordination: Papers from the AAAI-14 Workshop. AAAI Publications.
  14. Fallenstein, Benja; Soares, Nate (2015). Vingean Reflection: Reliable Reasoning for Self-Improving Agents (PDF) (Technical report). Machine Intelligence Research Institute. 2015-2.
  15. GiveWell (2015). Potential risks from advanced artificial intelligence (Report). Retrieved October 12, 2015.
  16. Bostrom, Nick (2014). Superintelligence: Paths, Dangers, Strategies. ISBN 0199678111.
  17. Yudkowsky, Eliezer (2013). Intelligence Explosion Microeconomics (PDF) (Technical report). Machine Intelligence Research Institute. 2013-1.
  18. Hanson, Robin; Yudkowsky, Eliezer (2013). The Hanson-Yudkowsky AI Foom Debate. Machine Intelligence Research Institute.
  19. “Overcoming Bias: About”. Robin Hanson. Retrieved February 1, 2012.
  20. “Where did Less Wrong come from?” (LessWrong FAQ). Retrieved September 11, 2014.
  21. Miller, James (2012). Singularity Rising. ISBN 978-1936661657.
  22. Miller, James (July 28, 2011). “You Can Learn How To Become More Rational”. Business Insider. Retrieved March 25, 2014.
  23. Yudkowsky, Eliezer S. “Fiction”. Yudkowsky. Retrieved September 14, 2015.
  24. Brin, David (June 21, 2010). “CONTRARY BRIN: A secret of college life … more controversies and science!”. Davidbrin.blogspot.com. Retrieved August 31, 2012. “Harry Potter and the Key to Immortality”, Daniel Snyder, The Atlantic.
  25. Authors (April 2, 2012). “Rachel Aaron interview (April 2012)”. Fantasybookreview.co.uk. Retrieved August 31, 2012.
  26. “Civilian Reader: An Interview with Rachel Aaron”. Civilian-reader.blogspot.com. May 4, 2011. Retrieved August 31, 2012.
  27. Hanson, Robin (October 31, 2010). “Hyper-Rational Harry”. Overcoming Bias. Retrieved August 31, 2012.
  28. Swartz, Aaron. “The 2011 Review of Books (Aaron Swartz’s Raw Thought)”. archive.org. Archived from the original on March 16, 2013. Retrieved October 4, 2013.
  29. “Harry Potter and the Methods of Rationality”. fanfiction.net. February 28, 2010. Retrieved December 29, 2014.
  30. Packer, George (2011). “No Death, No Taxes: The Libertarian Futurism of a Silicon Valley Billionaire”. The New Yorker: 54. Retrieved October 12, 2015.
  31. Rationality: From AI to Zombies, MIRI, 2015-03-12.
  32. https://intelligence.org/equilibriabook/, MIRI, 2017-10-26.
  33. “The Correct Contrarian Cluster – Less Wrong”. lesswrong.com.
  34. Yudkowsky, Eliezer (September 7, 2011). “Is That Your True Rejection?”. Response Essays. Cato Unbound.
