We are developing a prototype of VisualPage, a software application that will enable humanities researchers to explore and analyze large collections of digitized printed books. Scholarly archives recognize that the historical and cultural meanings of printed documents are conveyed not only in their linguistic content but also in their bibliographic and visual elements, and they therefore typically provide users with digitized page images as well as extracted text. Although tools exist for large-scale analysis of text, researchers interested in the visual features of digitized printed books have been limited to what they can see and compare with their own eyes. The VisualPage tool will enable researchers to explore large document collections, to identify unique or representative items, to trace historical trends in typography, page layout, and book design, and to make comparisons not accessible to the human eye.

For the start-up phase of the project, funded by the National Endowment for the Humanities [HD5156012], we have chosen poetry as the focus of our initial work because the visual appearance of the printed page contributes to the reader’s understanding of a poem’s form and meaning through the conventions of line length, line indentation, and the distribution of white space. The initial data set for this start-up period consists of 300 digitized books of Victorian poetry (approximately 60,000 images) published between 1860 and 1880.

To learn more about this project:
