Digital Clio


Historical Scholarship in the Digital Age

Textual Analysis

For digital history to realize the potential of the technological age, text encoding is necessary. Text encoding is one method of converting original textual materials from analog to digital representations that are electronically searchable for scholarly research. XML is a markup standard that assists in the creation, retrieval, and storage of electronic documents. Through text encoding and XML, researchers can gain a deeper understanding of original texts and documents in electronic form. Such technology allows for systematic manipulation and analysis of complex historical texts, rendering their intricacies in a more understandable form while preserving that complexity. Furthermore, as Perry Willett asserts in his article “Electronic Texts: Audiences and Purposes,” “Electronic texts give humanists access to works previously difficult to find, both in terms of locating entire works, with the Internet as a distributed interconnected library, and in access to the terms and keywords within the works themselves, as a first step in analysis.” The digital tools available to historians for textual analysis have opened new realms of historical inquiry. As tools for researching, analyzing, and teaching, XML and textual analysis offer avenues of research to describe and analyze the literary and linguistic past.
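To make the idea of encoded, searchable text concrete, here is a minimal sketch in Python using the standard library's XML parser. The document and its tag names (`article`, `head`, `p`, `place`) are invented placeholders for illustration, not a real transcription or any particular encoding scheme.

```python
import xml.etree.ElementTree as ET

# Invented placeholder document; the tag names are assumptions for illustration.
ENCODED = """<article>
  <head>Placeholder headline</head>
  <p>A placeholder sentence mentioning <place>Utah</place> and <place>Washington</place>.</p>
</article>"""

root = ET.fromstring(ENCODED)

# Because the places are tagged, they can be retrieved directly
# instead of being guessed at from the raw text.
places = [element.text for element in root.iter("place")]

# The same file still yields the plain text for reading or word counts.
plain_text = " ".join("".join(root.itertext()).split())

print(places)
print(plain_text)
```

The point of the sketch is that one encoded file serves two purposes: the markup supports targeted retrieval, while the underlying text remains available for further analysis.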

Electronic text collections are becoming standard in research, particularly as the Web proliferates and becomes the first place most people turn for information. Willett notes that scholars, students, and librarians are learning that electronic text collections will have to grow considerably in order to reliably meet greater research needs. Currently, electronic text collections are not numerous enough to support broad research in a particular field. As more historical documents are placed online, encoding them in XML makes for a better functioning history web. Large collections of XML-encoded documents make it easier to retrieve documents and relate material for analysis. With tools for textual analysis, such as TokenX, historians can integrate electronic texts more deeply and broadly into their research as a method for analyzing the connective tissue within language and across texts. Because they can access and analyze all the words in all the documents at once and on equal terms, digital analytical tools like TokenX can show change over time and across place and space. Textual analysis, therefore, provides numerous possibilities that can inform historical strategies and introduce new thinking into the current historiography.

Word clouds are a critical element in textual analysis for deciphering language usage and word significance within texts. A word cloud provides a visual depiction of how frequently words occur in a document. By varying font size or color according to frequency, a word cloud identifies the most heavily used words in a document. Another valuable feature of textual analysis is the ability to view particular words in context. Displaying a word in its immediate context lets one see how it is used in several instances within a document. Through such features, researchers and historians can mine the text for information not visible without machine aid and demonstrate connective tissue between the text and a historical argument.
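Both features rest on simple operations: counting words and listing each occurrence of a word with its neighbors. The following Python sketch shows one way this might be done. It is not how TokenX itself works; the function names and the sample sentence are assumptions made for illustration.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(text):
    """Count every word; these counts are what a word cloud scales by."""
    return Counter(tokenize(text))

def keyword_in_context(text, keyword, window=2):
    """Return (left, keyword, right) for each occurrence of keyword."""
    tokens = tokenize(text)
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits

# Invented sample sentence, used only to exercise the functions.
sample = "The army marched west. The territory waited, and the army marched on."

frequencies = word_frequencies(sample)
contexts = keyword_in_context(sample, "army")

print(frequencies.most_common(3))
print(contexts)
```

Note that nothing is filtered out: common words such as "the" are counted alongside everything else, which matters for the kind of analysis discussed next.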

Textual analysis tools help historians analyze the tentative connective threads in historical texts by manipulating and analyzing all words on an equal footing, even the common words that many researchers continually disregard. As John Burrows states in his chapter on textual analysis, “the real value of studying the common words rests on the fact that they constitute the underlying fabric of a text, a barely visible web that gives shape to whatever is being said.” Therefore, historians and researchers will find the study of common words valuable in many different sorts of literary inquiry, where it can help to resolve debates, to carry arguments forward, and to open entirely new questions on the basis of word usage and context.

In my investigation of the Utah Expedition, for instance, textual analysis and TokenX’s many features enabled me to discover the importance of public rhetoric in establishing an us (“Christian” Americans) versus them (Mormons), or American versus “other,” mentality during this episode. In each of the newspaper articles I encoded and analyzed, key words such as “our,” “us,” “them,” and “theirs” stand out as among the most used. Additionally, the context in which these words appear further demonstrates such a mentality. In depicting Mormon tenets, American newspaper writers described “their abominations,” or “their putridity” and wickedness, compared to “our liberty” and “our existence as civilized white men.” Visualizing these words among others in word clouds and in their context within the text revealed American rhetoric that clearly marked Mormons as alien others in the mid-nineteenth century. With these common words holding so much weight in these texts, a new portrait emerges concerning Mormon and American relations.
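A count of this kind can be approximated in a few lines of Python. This is only a sketch under stated assumptions, not a reconstruction of the analysis done in TokenX: the two word lists and the function name `us_them_counts` are assumptions made for illustration, and the sample text is simply the phrases quoted above strung together, not a full article.

```python
import re
from collections import Counter

# Assumed word lists; adjust them to the corpus being studied.
US_WORDS = {"we", "us", "our", "ours"}
THEM_WORDS = {"they", "them", "their", "theirs"}

def us_them_counts(text):
    """Count in-group ("us") and out-group ("them") pronouns in a text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for token in tokens:
        if token in US_WORDS:
            counts["us"] += 1
        elif token in THEM_WORDS:
            counts["them"] += 1
    return counts

# Phrases quoted in the paragraph above, joined only for demonstration.
sample = ("their abominations; their putridity; "
          "our liberty; our existence as civilized white men")

counts = us_them_counts(sample)
print(counts)
```

Run over a whole set of encoded articles, tallies like these would only be a starting point; the words still need to be read in context before they can support a historical argument.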

By tracing the interconnections of all the different textual threads, the analytical procedures used for comparing texts can visually display the relationships among discourse, rhetoric, and ideas rather than simply among people and their actions. These visual representations can help scholars demonstrate the connections and their analysis to students as another non-linear mode of teaching. Therefore, historians and history instructors will find textual encoding and analysis tools critical for piecing together and visually demonstrating historical analysis to students and colleagues alike.


Filed under: Research, Scholarship, Technology, Tools
