The Readers Project: Procedural Agents and Literary Vectors

Daniel Howe; John Cayley

Conference:

SIGGRAPH 2011

Type(s):

Art Papers and Presentations

Title:

The Readers Project: Procedural Agents and Literary Vectors

Presenter(s)/Author(s):

Daniel Howe

John Cayley

Abstract:

The Readers Project is an aesthetically oriented system of software entities designed to explore the culture of human reading. These entities, or “readers,” navigate texts according to specific reading strategies based upon linguistic feature analysis and real-time probability models harvested from search engines. As such, they function as autonomous text generators, writing machines that become visible within and beyond the typographic dimension of the texts on which they operate. Thus far the authors have deployed the system in a number of interactive art installations at which audience members can view the aggregate behavior of the readers on a large screen display and also subscribe, via mobile device, to individual reader outputs. As the structures on which these readers operate are culturally and aesthetically implicated, they shed critical light on a range of institutional practices – particularly those of reading and writing – and explore what it means to engage with the literary in digital media.

References:

1. This paper is largely concerned with the details of our project’s analytical and computational methods. However these are pursued as integral to a practice of digital literary art, fully within the context of long-standing discussions concerning the interrelation of digital media and “the literary.” There is an extensive critical literature on this subject, recently summarized and extended, although from a relatively theoretical perspective, in N. Katherine Hayles, Electronic Literature: New Horizons for the Literary (Notre Dame: University of Notre Dame, 2008).

2. M. Gardner, “The Fantastic Combinations of John Conway’s New Solitaire Game of ‘Life’,” Scientific American, 223, 120–123 (1970).

3. In fact, this is historically/culturally determined, a function of the fact that the z-dimension happens to have had little or no significance for the graphic representation of language, or at best only marginal significance, for reasons associated with the media support for graphic language that have been available to date. This situation could change and, arguably, is now changing as it becomes ever easier to make the z-dimension perceptible within devices that represent graphic language. Note also that in sign language the z-dimension is significant, “phonologically” in the technical linguistic sense, and in other grammatical ways as well.

4. This phrase is intended to invoke both the natural language processing research that underlies our project and also the concept of “expressive processing” as vital aspects of much contemporary aesthetic practice, including literary practice, as elaborated by Noah Wardrip-Fruin, Expressive Processing: Digital Fictions, Computer Games, and Software Studies (Cambridge: MIT Press, 2009).

5. See D. Ashlock and J. Tsang, “Evolved Art Via Control of Cellular Automata,” Proceedings of the Eleventh Conference on Congress on Evolutionary Computation (Piscataway, NJ: IEEE Press, 2009) 3338–3344; D. Burraston and E. Edmonds, Cellular Automata in Generative Electronic Music and Sonic Art: Historical and Technical Review (Sydney: Creativity and Cognition Studios, Faculty of Information Technology, University of Technology, 2005); Kenneth E. Perry, “Abstract Mathematical Art,” Byte, December 1986, 181–190 (1986); and Mitchell Whitelaw, Metacreation: Art and Artificial Life (Cambridge: MIT Press, 2004).

6. One might also pre-process texts so as to be able to extract other cellular properties that are not regularly represented in traditional orthography, such as phonemes, morphemes, syllables, etc. As will be clear from our description, while the identity of cells is based on traditional orthographic and typographic distinctions, the strategies and behaviors of particular readers are often based on features extracted by computational analysis of the supply texts. Rhyme, which is based on phonemic analysis, represents one of many such examples.

7. Although we would appreciate connecting our aesthetic research more rigorously with, for example, studies of reading in cognitive science, such relations are only loosely suggested here. The authors are nonetheless involved with the UK AHRC-funded research network Poetry Beyond Text, based at the Universities of Dundee and Kent, in which both cognitive scientists concerned with reading and even cognitive aestheticians have a role. See: projects.beyondtext.ac.uk/poetrybeyondtext/.

8. We use “vector” in a figurative sense, related to its definition as: a quantity (e.g., of directed force or attention) that can be resolved into components. “Vector” also provides us with a noun that can refer to what is really, in this case, a potential direction for the choice of a next word to be read.

9. The term “poetics” is used here to encompass any property or method of language that may be composed for rhetorical or aesthetic effect.

10. We are aware that there is much sophisticated discussion of the interrelation between typography and semantics, typography and literary aesthetics, and so on. Johanna Drucker’s work is exemplary in this regard. Nonetheless, we believe that the distinction proposed here is both novel and critically generative. J. Drucker, The Visible Word: Experimental Typography and Modern Art, 1909–1923 (Chicago: University of Chicago Press, 1994).

11. For precise details of the current definition, please see: thereadersproject.org/?p=contents/neighborhood.html. In our scheme – as a reflection of traditional left-to-right reading in the West – the NE and SE neighbors will not be null where there are lines of type above or below the current word. The NW, N, SW, and S positions may, however, be null, depending on relative word-lengths.

12. E. F. Moore, “Machine Models of Self-Reproduction,” Proceedings of Symposia in Applied Mathematics, The American Mathematical Society, Volume 14, 17–33 (1962).

13. A. A. Markov, “Classical Text in Translation: An Example of Statistical Investigation of the Text Eugene Onegin Concerning the Connection of Samples in Chains,” trans. David Link, Science in Context, 19.4, 591–600 (2006). Online: journals.cambridge.org/production/action/cjoGetFulltext?fulltextid=637500.

14. See thereadersproject.org/?p=contents/readers.html. We might also count as implemented a subtle variation of a simple reader, the “writing to be found” reader that was deployed in the Read for us installation, described here: thereadersproject.org/?p=installations/readforus/readforus.html.

15. Note that the preprocessed identification of perigrams for a text is carried out chiefly for reasons of efficiency. Often, depending on network constraints, the frequencies of particular phrases are cached in advance rather than being searched in real-time. The extraction of perigrams means that considerably fewer word combinations need be considered and processed.

16. The Readers Project is written, chiefly, in Processing (processing.org) and Java, and makes use of the RiTa natural language processing library (www.rednoise.org/rita/) developed by Daniel C. Howe. See D. C. Howe, “RiTa: Creativity Support for Computational Literature,” C&C ’09: Proceeding of the 7th ACM Conference on Creativity and Cognition, Berkeley, October 26–30, 2009 (New York: ACM, 2009) 205–210, retrieved from doi.acm.org/10.1145/1640233. This library also provides objects designed to mine natural language data, in real time, from indexed repositories – those built by certain of the main internet search engines – that represent the most extensive corpus of natural language that has ever been available to language art practitioners. The phrases searched are enclosed in double quotes, providing a rough relative frequency for exact word sequences. There are problems with the way that search engines handle punctuation – whether or not punctuation is considered to break a sequence. (Google, for example, treats punctuation differently in different search portals: all of Google vs. Books.) These problems have been bracketed for the time being.

17. We are also able to constrain our searches to, for example, the indices of Google “books,” thus disregarding much of the commercially or technically implicated Internet text.

18. We believe that the existence of “services” (or pretended cultural vectors) such as those provided by Google, combined with a burgeoning, aesthetically motivated “use” of these services, has profound implications for contemporary artistic practice. Such use also allows artists to engage critically and productively with important socio-economic and political developments in an unprecedented manner. We are unable to address these crucial issues within the scope of this paper, but plan to do so in future contributions.

19. For us, one of the attractions of this approach and these procedures is that they may visualize and perform the workings of protosemantic and sublexical linguistic properties – both traditional poetic properties like rhyme and less-frequently acknowledged properties such as mesostic relations – highlighting their contribution to literary aesthetics. The role of the protosemantic in The Readers Project must wait for fuller treatment in the future. See: S. McCaffery, Prior to Meaning: The Protosemantic and Poetics (Evanston, IL: Northwestern University Press, 2001).
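
Notes 15 and 16 describe searching exact phrases in double quotes to obtain rough relative frequencies, and caching those frequencies in advance. The following is a minimal illustrative sketch of how a reader might use such a cache to choose a next word. It is not the authors' implementation and does not use the RiTa API: the project itself is written in Processing and Java, and the function names `quoted_query` and `choose_next_word`, the dictionary-based cache, and the counts shown are all hypothetical.

```python
from typing import Mapping, Optional, Sequence


def quoted_query(words: Sequence[str]) -> str:
    """Wrap a word sequence in double quotes so it reads as an exact-phrase query."""
    return '"' + " ".join(words) + '"'


def choose_next_word(
    context: Sequence[str],
    candidates: Sequence[str],
    cached_counts: Mapping[str, int],
) -> Optional[str]:
    """Return the candidate whose phrase (context + candidate) has the highest cached count.

    Returns None when no candidate phrase has a positive count in the cache.
    On a tie, the earlier candidate wins.
    """
    best_word: Optional[str] = None
    best_count = 0
    for word in candidates:
        # Look up the exact-phrase query; phrases missing from the cache count as 0.
        count = cached_counts.get(quoted_query([*context, word]), 0)
        if count > best_count:
            best_word, best_count = word, count
    return best_word


if __name__ == "__main__":
    # Invented counts standing in for frequencies cached ahead of time.
    cache = {'"the quick brown"': 900, '"the quick red"': 40}
    print(choose_next_word(["the", "quick"], ["red", "brown", "blue"], cache))
```

In this sketch the candidates stand in for the words neighboring the current word, and the cache stands in for frequencies gathered before an installation runs. Candidates whose phrases are absent from the cache are never selected, which is one possible design choice among several.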
