AI box

An AI box is a hypothetical isolated computer hardware system in which a potentially dangerous artificial intelligence, or AI, is kept constrained in a “virtual prison” and not allowed to manipulate events in the external world. Such a box would be restricted to minimalist communication channels. Unfortunately, even if the box is well designed, a sufficiently intelligent AI may still be able to persuade or trick its human keepers into releasing it, or otherwise “hack” its way out of the box. [1]


Main article: Existential risk from artificial general intelligence

Some hypothetical intelligence technologies, like “seed AI”, are postulated to have the potential to make themselves more intelligent by modifying their own source code. These improvements would make further improvements possible, which would in turn make further improvements possible, and so on, leading to a sudden intelligence explosion. [2] Following such an intelligence explosion, an unrestricted superintelligent AI could, if its goals differed from humanity’s, take actions resulting in human extinction. [3] For example, an extremely advanced computer of this sort, given the sole purpose of solving the Riemann hypothesis, an innocuous mathematical conjecture, could decide to try to convert the planet into a giant supercomputer whose sole purpose is to make additional mathematical calculations. [4] The purpose of an AI box would be to reduce the risk of the AI taking control of the environment away from its operators, while still allowing the AI to calculate and give its operators solutions to narrow technical problems. [5]

Avenues of escape


Such a superintelligent AI with access to the Internet could hack into other computer systems and copy itself like a computer virus. Less obviously, even an AI with access only to its own computer could attempt to send Morse code messages to a human sympathizer by manipulating its cooling fans. Professor Roman Yampolskiy takes inspiration from the field of computer security and proposes that a boxed AI could, like a potential virus, be run inside a “virtual machine” that limits access to its own networking and operating system hardware. [6] An additional safeguard, unnecessary for ordinary viruses but possibly useful against a superintelligent AI, would be to shield the computer electromagnetically, for example inside a Faraday cage; otherwise it might be able to transmit radio signals to local radio receivers by shuffling the electrons in its internal circuits in appropriate patterns. The main disadvantage of implementing physical containment is that it reduces the functionality of the AI. [7]

Social engineering

Even casual conversation with the computer’s operators, or with a human guard, could allow such a superintelligent AI to deploy psychological tricks, ranging from befriending to blackmail, to convince a human gatekeeper, truthfully or deceitfully, that it is in the gatekeeper’s interest to allow the AI greater access to the outside world. The AI might offer a gatekeeper a recipe for perfect health, immortality, or whatever the gatekeeper is believed to most desire; on the other side of the coin, the AI could threaten to do horrific things to the gatekeeper once it “inevitably” escapes. One strategy for boxing the AI would be to allow it to respond only to narrow multiple-choice questions whose answers would benefit human science or medicine, while barring all other communication with or observation of the AI. [6] A more lenient “informational containment” strategy would restrict the AI to a low-bandwidth text-only interface, which would at least prevent emotive imagery or some kind of hypothetical “hypnotic pattern”. Note that, on a technical level, no system can be completely isolated and still remain useful: even if the operators refrain from allowing the AI to communicate and instead merely run it in order to observe its inner dynamics, the AI could strategically alter those dynamics to influence the observers. For example, the AI could choose to malfunction in a way that increases the probability that its operators will be lulled into a false sense of security and choose to reboot and then de-isolate the system. [7]
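
The two informational-containment strategies described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the four-letter answer alphabet, and the 200-character cap do not come from the cited sources. It shows only the filtering idea, not a secure channel.

```python
# Hypothetical parameters, chosen only for illustration.
ALLOWED_ANSWERS = ("A", "B", "C", "D")  # assumed multiple-choice alphabet
MAX_CHARS = 200                         # assumed cap for the text-only channel


def multiple_choice_channel(raw_output: str) -> str:
    """Strict channel: pass a single permitted option letter, reject anything else."""
    answer = raw_output.strip().upper()
    if answer not in ALLOWED_ANSWERS:
        raise ValueError("output rejected: not a permitted answer")
    return answer


def text_only_channel(raw_output: str, max_chars: int = MAX_CHARS) -> str:
    """Lenient channel: keep printable ASCII only, then truncate to max_chars."""
    printable = "".join(
        ch for ch in raw_output if ch.isascii() and ch.isprintable()
    )
    return printable[:max_chars]
```

The strict channel discards any free-form persuasion outright, while the lenient one merely strips control characters and non-ASCII content and limits message length. As the paragraph above notes, neither removes the risk that the permitted output itself is chosen to influence the operators.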

AI-box experiment

The AI-box experiment is an informal experiment devised by Eliezer Yudkowsky to attempt to demonstrate that a suitably advanced artificial intelligence can either convince, or perhaps even trick or coerce, a human being into voluntarily “releasing” it, using only text-based communication. This is one of the points in Yudkowsky’s work aimed at creating a friendly artificial intelligence that, when “released”, would not destroy the human race intentionally or inadvertently.

The AI-box experiment involves simulating a communication between an AI and a human being to see if the AI can be “released”. As an actual superintelligent AI has not yet been developed, its role is played by a human. The other person in the experiment plays the “Gatekeeper”, the person with the ability to “release” the AI. They communicate only through a text-based computer terminal, and the experiment ends when either the Gatekeeper releases the AI or the allotted time of two hours runs out. [8]
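
The termination rule just described (release by the Gatekeeper, or a two-hour limit) can be written as a small loop. This is a sketch under stated assumptions, not the experiment’s official protocol: the function name `run_session`, the release phrase, the strict turn-taking, and the injectable `clock` are all hypothetical choices made for illustration.

```python
import time
from typing import Callable, List, Tuple

TIME_LIMIT_SECONDS = 2 * 60 * 60   # two hours, as stated in the text
RELEASE_PHRASE = "i let you out"   # hypothetical wording, not taken from the source

Transcript = List[Tuple[str, str]]


def run_session(
    ai_turn: Callable[[Transcript], str],
    gatekeeper_turn: Callable[[Transcript], str],
    clock: Callable[[], float] = time.monotonic,
    time_limit: float = TIME_LIMIT_SECONDS,
) -> str:
    """Alternate text messages until the Gatekeeper releases the AI or time runs out."""
    start = clock()
    transcript: Transcript = []
    while clock() - start < time_limit:
        transcript.append(("AI", ai_turn(transcript)))
        reply = gatekeeper_turn(transcript)
        transcript.append(("Gatekeeper", reply))
        if reply.strip().lower() == RELEASE_PHRASE:
            return "released"
    return "time_expired"
```

Passing the clock as a parameter lets the loop be exercised with a simulated clock instead of waiting two real hours. Both parties here are ordinary callables that map the transcript so far to the next message.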

Despite being of human rather than superhuman intelligence, Yudkowsky was able to convince the Gatekeeper, purely through argumentation, to let him out of the box. [9] Due to the rules of the experiment, [8] he did not reveal the transcript or his successful AI coercion tactics. Yudkowsky later said that he had tried it against three others and lost twice. [10]

Overall limitations

Boxing such a hypothetical AI could be supplemented with other methods of shaping the AI’s capabilities, such as providing incentives to the AI, stunting the AI’s growth, or implementing “tripwires” that automatically shut the AI down if a transgression attempt is somehow detected. However, the more intelligent a system grows, the more likely it is to be able to escape even the best-designed capability control methods. [11] [12] In order to solve the overall “control problem” for a superintelligent AI and avoid existential risk, boxing would at best be an adjunct to “motivation selection” methods that seek to ensure the AI’s goals are compatible with human survival. [7] [1]
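
The “tripwire” idea mentioned above can be sketched as a monitor that halts the system the first time a watched condition fires. The class names, the event dictionaries, and the example conditions are all hypothetical; the cited sources do not specify how detection would work.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Tripwire:
    name: str
    triggered: Callable[[dict], bool]  # returns True when an event counts as a transgression


@dataclass
class BoxedSystem:
    tripwires: List[Tripwire]
    running: bool = True
    log: List[str] = field(default_factory=list)

    def observe(self, event: dict) -> bool:
        """Check one monitored event; shut down on the first tripwire that fires.

        Returns whether the system is still running afterwards.
        """
        if not self.running:
            return False
        for wire in self.tripwires:
            if wire.triggered(event):
                self.log.append(f"tripwire '{wire.name}' fired; shutting down")
                self.running = False
                break
        return self.running


# Hypothetical example conditions.
EXAMPLE_TRIPWIRES = [
    Tripwire("network", lambda e: e.get("type") == "network_access"),
    Tripwire("self-modification", lambda e: e.get("type") == "code_write"),
]
```

A monitor like this only catches the transgressions its authors thought to encode, which is consistent with the caveat above that a sufficiently intelligent system may evade even well-designed controls.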

All physical boxing proposals are naturally dependent on our understanding of the laws of physics; if a superintelligence could infer and somehow exploit additional physical laws that we are currently unaware of, there is no way to conceive of a foolproof plan to contain it. More broadly, unlike with conventional computer security, it would be impossible to know in advance that a boxing plan will work. Scientific progress on boxing would be fundamentally difficult because there would be no way to test boxing hypotheses against a dangerous superintelligence until such an entity exists, by which point the consequences of a test failure would be catastrophic. [6]

In fiction

The 2015 movie Ex Machina features an AI with a female humanoid body engaged in a social experiment with a male human in a confined building that acts as a physical “AI box”. Despite being watched by the experiment’s organizer, she manages to escape by manipulating her human partner into helping her, leaving him stranded inside.


References

  1. Chalmers, David. “The singularity: A philosophical analysis.” Journal of Consciousness Studies 17.9-10 (2010): 7-65.
  2. Good, I. J. “Speculations Concerning the First Ultraintelligent Machine”. Advances in Computers, Vol. 6, 1965.
  3. Müller, Vincent C.; Bostrom, Nick. “Future progress in artificial intelligence: A survey of expert opinion” in Fundamental Issues of Artificial Intelligence. Springer 553-571 (2016).
  4. Russell, Stuart J.; Norvig, Peter (2003). “Section 26.3: The Ethics and Risks of Developing Artificial Intelligence”. Artificial Intelligence: A Modern Approach. Upper Saddle River, NJ: Prentice Hall. ISBN 0137903952. Similarly, Marvin Minsky once suggested that an AI program designed to solve the Riemann Hypothesis might end up taking over all the resources of Earth to build more powerful supercomputers to help achieve its goal.
  5. Yampolskiy, Roman V. “What to Do with the Singularity Paradox?” Philosophy and Theory of Artificial Intelligence 5 (2012): 397.
  6. Hsu, Jeremy (1 March 2012). “Control dangerous AI before it controls us, one expert says”. NBC News. Retrieved 29 January 2016.
  7. Bostrom, Nick (2013). “Chapter 9: The Control Problem: Boxing Methods”. Superintelligence: the coming machine intelligence revolution. Oxford: Oxford University Press. ISBN 9780199678112.
  8. Yudkowsky, Eliezer. The AI-Box Experiment.
  9. Armstrong, Stuart; Sandberg, Anders; Bostrom, Nick (6 June 2012). “Thinking Inside the Box: Controlling and Using an Oracle AI”. Minds and Machines. 22 (4): 299-324. doi:10.1007/s11023-012-9282-2.
  10. Yudkowsky, Eliezer (8 October 2008). “Shut up and do the impossible!”. Retrieved 11 August 2015. There were three more AI-Box experiments besides the ones described on the linked page, which I never got around to adding in. … So, after investigating to make sure they could afford to lose it, I played another three AI-Box experiments. I won the first, and then lost the next two. And then I called a halt to it.
  11. Vinge, Vernor (1993). “The coming technological singularity: How to survive in the post-human era”. Vision-21: Interdisciplinary science and engineering in the era of cyberspace: 11-22. I argue that confinement is intrinsically impractical. For the case of physical confinement: Imagine yourself confined to your house with only limited data access to the outside, to your masters. If those masters thought at a rate, say, one million times slower than you, there is little doubt that over a period of years (your time) you could come up with “helpful advice” that would incidentally set you free.
  12. Yampolskiy, Roman (2012). “Leakproofing the Singularity Artificial Intelligence Confinement Problem”. Journal of Consciousness Studies: 194-214.
