HOUSE_OVERSIGHT_013205.jpg

Source: HOUSE_OVERSIGHT  •  Size: 0.0 KB  •  OCR Confidence: 85.0%
Extracted Text (OCR)

Chapter 16: AGI Preschool
Co-authored with Stephan Vladimir Bugaj

16.1 Introduction

In conversations with government funding sources or narrow AI researchers about AGI work, one of the topics that comes up most often is that of "evaluation and metrics", i.e., AGI intelligence testing. We actually prefer to separate this into two topics: environments and methods for careful qualitative evaluation of AGI systems, versus metrics for precise measurement of AGI systems.

The difficulty of formulating bulletproof metrics for partial progress toward advanced AGI has become evident throughout the field, and in Chapter 8 we elaborated one plausible explanation for this phenomenon: the "trickiness" of cognitive synergy. [LWM09], summarizing a workshop on "Evaluation and Metrics for Human-Level AI" held in 2008, discusses some of the general difficulties involved in this type of assessment, and some requirements that any viable approach must fulfill. On the other hand, the lack of appropriate methods for careful qualitative evaluation of AGI systems has been much less discussed, but we consider it actually a more important issue, as well as an easier (though not easy) one to solve.

We haven't actually found the lack of quantitative intelligence metrics to be a major obstacle in our practical AGI work so far. Our OpenCogPrime implementation lags far behind the CogPrime design as articulated in Part 2 of this book, and according to the theory underlying CogPrime, the more interesting behaviors and dynamics of the system will occur only when all the parts of the system have been engineered to a reasonable level of completion and integrated together. So, the lack of a great set of metrics for evaluating the intelligence of our partially-built system hasn't impaired us too much.

Testing the intelligence of the current OpenCogPrime system is a bit like testing the flight capability of a partly-built airplane that only has stubs for wings, lacks tail-fins, has a much less efficient engine than the one that's been designed for use in the first "real" version of the airplane, etc. There may be something to be learned from such preliminary tests, but making them highly rigorous isn't a great use of effort, compared to working on finishing implementing the design according to the underlying theory.

On the other hand, the problem of what environments and methods to use to qualitatively evaluate and study AGI progress has been considerably more vexing to us in practice, as we've proceeded in our work on implementing and testing OpenCogPrime and developing the CogPrime theory. When developing a complex system, it's nearly always valuable to see what this system does in some fairly rich, complex situations, in order to gain a better intuitive understanding of the parts and how they work together. In the context of human-level AGI, the theoretically best way to do this would be to embody one's AGI system in a humanlike body
