HOUSE_OVERSIGHT_016373.jpg

Source: HOUSE_OVERSIGHT • other • Size: 0.0 KB • OCR Confidence: 85.0%

Extracted Text (OCR)

and J. S. Mill and later by behavioral psychologists, like Pavlov and B. F. Skinner. On this view, the abstractness and hierarchical structure of representations is something of an illusion, or at least an epiphenomenon. All the work can be done by association and pattern detection, especially if there are enough data.

Over time, there has been a seesaw between this bottom-up approach to the mystery of learning and Plato’s alternative, top-down one. Maybe we get abstract knowledge from concrete data because we already know a lot, and especially because we already have an array of basic abstract concepts, thanks to evolution. Like scientists, we can use those concepts to formulate hypotheses about the world. Then, instead of trying to extract patterns from the raw data, we can make predictions about what the data should look like if those hypotheses are right. Along with Plato, such “rationalist” philosophers and psychologists as Descartes and Noam Chomsky took this approach.

Here’s an everyday example that illustrates the difference between the two methods: solving the spam plague. The data consist of a long unsorted list of messages in your in-box. The reality is that some of these messages are genuine and some are spam. How can you use the data to discriminate between them?

Consider the bottom-up technique first. You notice that the spam messages tend to have particular features: a long list of addressees, origins in Nigeria, references to million-dollar prizes or Viagra. The trouble is that perfectly useful messages might have these features, too. If you looked at enough examples of spam and non-spam emails, you might see not only that spam emails tend to have those features but that the features tend to go together in particular ways (Nigeria plus a million dollars spells trouble). In fact, there might be some subtle higher-level correlations that discriminate the spam messages from the useful ones: a particular pattern of misspellings and IP addresses, say. If you detect those patterns, you can filter out the spam.

The bottom-up machine-learning techniques do just this. The learner gets millions of examples, each with some set of features and each labeled as spam (or some other category) or not. The computer can extract the pattern of features that distinguishes the two, even if it’s quite subtle.

How about the top-down approach? I get an email from the editor of the Journal of Clinical Biology. It refers to one of my papers and says that they would like to publish an article by me. No Nigeria, no Viagra, no million dollars; the email doesn’t have any of the features of spam. But by using what I already know, and thinking in an abstract way about the process that produces spam, I can figure out that this email is suspicious. (1) I know that spammers try to extract money from people by appealing to human greed. (2) I also know that legitimate “open access” journals have started covering their costs by charging authors instead of subscribers, and that I don’t practice anything like clinical biology. Put all that together and I can produce a good new hypothesis about where that email came from. It’s designed to sucker academics into paying to “publish” an article in a fake journal. The email was a result of the same dubious process as the other spam emails, even though it looked nothing like them. I can draw this conclusion from just one example, and I can go on to test my hypothesis further, beyond anything in the email itself, by googling the “editor.”
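The contrast between the two approaches can be made concrete with a small sketch. Everything in it is an illustrative assumption and not something taken from the text: the feature names, the six-message training set, the choice of a Bernoulli naive Bayes classifier for the bottom-up side, and the prior and likelihood numbers on the top-down side are all invented. Naive Bayes combines evidence across features but treats them as independent given the label, so it does not capture the "features go together" correlations that larger bottom-up systems can learn.

```python
import math
from collections import Counter

# Hypothetical labeled examples: each message is reduced to a set of features.
TRAINING = [
    ({"nigeria", "million_dollars", "many_recipients"}, "spam"),
    ({"viagra", "many_recipients"}, "spam"),
    ({"nigeria", "million_dollars"}, "spam"),
    ({"many_recipients", "meeting"}, "ham"),
    ({"meeting", "mentions_paper"}, "ham"),
    ({"mentions_paper"}, "ham"),
]


def train(examples):
    """Bottom-up step: count how often each feature occurs under each label."""
    label_counts = Counter()
    feature_counts = {}
    vocabulary = set()
    for features, label in examples:
        label_counts[label] += 1
        feature_counts.setdefault(label, Counter()).update(features)
        vocabulary |= features
    return label_counts, feature_counts, vocabulary


def log_score(features, label, model):
    """Log of P(label) times P(each feature present or absent | label)."""
    label_counts, feature_counts, vocabulary = model
    total = sum(label_counts.values())
    score = math.log(label_counts[label] / total)
    for f in vocabulary:
        # Add-one smoothing keeps unseen features from giving probability 0.
        p = (feature_counts[label][f] + 1) / (label_counts[label] + 2)
        score += math.log(p if f in features else 1 - p)
    return score


def classify(features, model):
    """Return the label with the highest score for this set of features."""
    return max(model[0], key=lambda label: log_score(features, label, model))


def posterior(priors, likelihoods):
    """Top-down step: weigh explicit hypotheses against one observation."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    z = sum(unnormalized.values())
    return {h: v / z for h, v in unnormalized.items()}


model = train(TRAINING)
print(classify({"nigeria", "million_dollars"}, model))
print(classify({"mentions_paper"}, model))  # the journal email, as features

# Invented numbers: how likely is an unsolicited invitation from a journal
# outside the recipient's field under each hypothesis about its origin?
beliefs = posterior(
    priors={"genuine_invitation": 0.5, "predatory_journal": 0.5},
    likelihoods={"genuine_invitation": 0.01, "predatory_journal": 0.5},
)
print(beliefs)
```

On this toy data the feature-based classifier flags the Nigeria message but lets the journal email through, because that email shares no surface features with the spam it has seen. The hypothesis comparison reaches the opposite verdict from a single example, but only because background knowledge about how such emails are produced was supplied by hand in the form of the likelihoods.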

Extracted Information

People Mentioned

Noam Chomsky

Document Details

Filename	HOUSE_OVERSIGHT_016373.jpg
File Size	0.0 KB
OCR Confidence	85.0%
Has Readable Text	Yes
Text Length	3,556 characters
Indexed	2026-02-04T16:27:55.863332