HOUSE_OVERSIGHT_017001.jpg
Extracted Text (OCR)
23. Wikipedia. Web. 23 Aug. 2010.
<http://www.wikipedia.org/>.
24. Hoiberg, Dale, ed. Encyclopaedia Britannica. Chicago:
Encyclopaedia Britannica, 2002.
25. Gregorian, Vartan, ed. Censorship: 500 Years of Conflict.
New York: New York Public Library, 1984.
26. Treß, Werner. Wider Den Undeutschen Geist:
Bücherverbrennung 1933. Berlin: Parthas, 2003.
27. Sauder, Gerhard. Die Bücherverbrennung: 10. Mai 1933.
Frankfurt/Main: Ullstein, 1985.
28. Barron, Stephanie, and Peter W. Guenther. Degenerate
Art: the Fate of the Avant-garde in Nazi Germany. Los
Angeles: Los Angeles County Museum of Art, 1991.
29. Google News Archive Search. Web.
<http://news.google.com/archivesearch>.
30. Digital Scriptorium. Web.
<http://www.scriptorium.columbia.edu>.
31. Visual Eyes. Web. <http://www.viseyes.org>.
32. ARTstor. Web. <http://www.artstor.org>.
33. Europeana. Web. <http://www.europeana.eu>.
34. Hathi Trust Digital Library. Web.
<http://www.hathitrust.org>.
35. Barry, John M. The Great Influenza: the Epic Story of the
Deadliest Plague in History. New York: Viking, 2004.
36. J-B.M. was supported by the Foundational Questions in
Evolutionary Biology Prize Fellowship and the Systems
Biology Program (Harvard Medical School). Y.K.S. was
supported by internships at Google. S.P. acknowledges
support from NIH grant HD 18381. E.A. was supported by
the Harvard Society of Fellows, the Fannie and John Hertz
Foundation Graduate Fellowship, the National Defense
Science and Engineering Graduate Fellowship, the NSF
Graduate Fellowship, the National Space Biomedical
Research Institute, and NHGRI Grant T32 HG002295.
This work was supported by a Google Research Award.
The Program for Evolutionary Dynamics acknowledges
support from the Templeton Foundation, NIH grant
R01GM078986, and the Bill and Melinda Gates
Foundation. Some of the methods described in this paper
are covered by US patents 7463772 and 7508978. We are
grateful to D. Bloomberg, A. Popat, M. McCormick, T.
Mitchison, U. Alon, S. Shieber, E. Lander, R. Nagpal, J.
Fruchter, J. Guldi, J. Cauz, C. Cole, P. Bordalo, N.
Christakis, C. Rosenberg, M. Liberman, J. Sheidlower, B.
Zimmer, R. Darnton, and A. Spector for discussions; to
C-M. Hetrea and K. Sen for assistance with Encyclopaedia
Britannica's database, to S. Eismann, W. Treß, and the
City of Berlin website (berlin.de) for assistance
documenting victims of Nazi censorship, to C. Lazell and
G.T. Fournier for assistance with annotation, to M. Lopez
for assistance with Fig. 1, to G. Elbaz and W. Gilbert for
reviewing an early draft, and to Google’s library partners
and every author who has ever picked up a pen, for books.
Supporting Online Material
www.sciencemag.org/cgi/content/full/science.1199644/DC1
Materials and Methods
Figs. S1 to S19
References
27 October 2010; accepted 6 December 2010
Published online 16 December 2010;
10.1126/science.1199644
Fig. 1. “Culturomic” analyses study millions of books at
once. (A) Top row: authors have been writing for millennia;
~129 million book editions have been published since the
advent of the printing press (upper left). Second row:
Libraries and publishing houses provide books to Google for
scanning (middle left). Over 15 million books have been
digitized. Third row: each book is associated with metadata.
Five million books are chosen for computational analysis
(bottom left). Bottom row: a culturomic “timeline” shows the
frequency of “apple” in English books over time (1800-
2000). (B) Usage frequency of “slavery.” The Civil War
(1861-1865) and the civil rights movement (1955-1968) are
highlighted in red. The number in the upper left (1e-4) is the
unit of frequency. (C) Usage frequency over time for “the
Great War” (blue), “World War I” (green), and “World War
II” (red).
Fig. 2. Culturomics has profound consequences for the study
of language, lexicography, and grammar. (A) The size of the
English lexicon over time. Tick marks show the number of
single words in three dictionaries (see text). (B) Fraction of
words in the lexicon that appear in two different dictionaries
as a function of usage frequency. (C) Five words added by
the AHD in its 2000 update. Inset: Median frequency of new
words added to AHD4 in 2000. The frequency of half of these
words exceeded 10⁻⁹ as far back as 1890 (white dot). (D)
Obsolete words added to AHD4 in 2000. Inset: Mean
frequency of the 2220 AHD headwords whose current usage
frequency is less than 10⁻⁸. (E) Usage frequency of irregular
verbs (red) and their regular counterparts (blue). Some verbs
(chide/chided) have regularized during the last two centuries.
The trajectories for “speeded” and “speed up” (green) are
similar, reflecting the role of semantic factors in this instance
of regularization. The verb “burn” first regularized in the US
(US flag) and later in the UK (UK flag). The irregular
“snuck” is rapidly gaining on “sneaked.” (F) Scatter plot of
the irregular verbs; each verb’s position depends on its
regularity (see text) in the early 19th century (x-coordinate)
and in the late 20th century (y-coordinate). For 16% of the
verbs, the change in regularity was greater than 10% (large
font). Dashed lines separate irregular verbs (regularity<50%)
Sciencexpress / www.sciencexpress.org / 16 December 2010 / Page 6 / 10.1126/science.1199644
Downloaded from www.sciencemag.org on December 16, 2010