Back to Results

HOUSE_OVERSIGHT_016250.jpg

Source: HOUSE_OVERSIGHT  •  Size: 0.0 KB  •  OCR Confidence: 85.0%
View Original Image

Extracted Text (OCR)

agent that perceives, and acts in order to maximize, its expected utility. Subfields such as logical planning, robotics, and natural-language understanding are special cases of the general paradigm. AI has incorporated probability theory to handle uncertainty, utility theory to define objectives, and statistical learning to allow machines to adapt to new circumstances. These developments have created strong connections to other disciplines that build on similar concepts, including control theory, economics, operations research, and statistics. In both the logical-planning and rational-agent views of AI, the machine’s objective—whether in the form of a goal, a utility function, or a reward function (as in reinforcement learning)—1s specified exogenously. In Wiener’s words, this is “the purpose put into the machine.” Indeed, it has been one of the tenets of the field that AI systems should be general-purpose—i.e., capable of accepting a purpose as input and then achieving it—rather than special-purpose, with their goal implicit in their design. For example, a self-driving car should accept a destination as input instead of having one fixed destination. However, some aspects of the car’s “driving purpose” are fixed, such as that it shouldn’t hit pedestrians. This is built directly into the car’s steering algorithms rather than being explicit: No self-driving car in existence today “knows” that pedestrians prefer not to be run over. Putting a purpose into a machine which optimizes its behavior according to clearly defined algorithms seems an admirable approach to ensuring that the machine’s “conduct will be carried out on principles acceptable to us!” But, as Wiener warns, we need to put in the right purpose. We might call this the King Midas problem: Midas got exactly what he asked for—namely, that everything he touched would turn to gold—but too late he discovered the drawbacks of drinking liquid gold and eating solid gold. The technical term for putting in the right purpose is value alignment. When it fails, we may inadvertently imbue machines with objectives counter to our own. Tasked with finding a cure for cancer as fast as possible, an AI system might elect to use the entire human population as guinea pigs for its experiments. Asked to de-acidify the oceans, it might use up all the oxygen in the atmosphere as a side effect. This is a common characteristic of systems that optimize: Variables not included in the objective may be set to extreme values to help optimize that objective. Unfortunately, neither AI nor other disciplines (economics, statistics, control theory, operations research) built around the optimization of objectives have much to say about how to identify the purposes “we really desire.” Instead, they assume that objectives are simply implanted into the machine. AI research, in its present form, studies the ability to achieve objectives, not the design of those objectives. Steve Omohundro has pointed to a further difficulty, observing that intelligent entities must act to preserve their own existence. This tendency has nothing to do with a self-preservation instinct or any other biological notion; it’s just that an entity cannot achieve its objectives if it’s dead. According to Omohundro’s argument, a superintelligent machine that has an off-switch—which some, including Alan Turing himself, in a 1951 talk on BBC Radio 3, have seen as our potential salvation—will take steps to disable the switch in some way.! Thus we may face the prospect of superintelligent machines—their actions by definition unpredictable by us and their ' Omohundro, “The Basic AI Drives,” in Proc. First AGI Conf., 171: “Artificial General Intelligence,” eds. P. Wang, B. Goertzel, & S. Franklin (IOS press, 2008). 30 HOUSE_OVERSIGHT_016250

Document Preview

HOUSE_OVERSIGHT_016250.jpg

Click to view full size

Document Details

Filename HOUSE_OVERSIGHT_016250.jpg
File Size 0.0 KB
OCR Confidence 85.0%
Has Readable Text Yes
Text Length 3,801 characters
Indexed 2026-02-04T16:27:28.706093