Mining large datasets for the humanities
Abstract
This paper considers how libraries can support humanities scholars in working with large digitized collections of cultural material. Although disciplines such as corpus linguistics have already made extensive use of these collections, fields such as literature, history, and cultural studies stand at the threshold of new opportunity.
Libraries can play an important role in helping these scholars make sense of big cultural data, in part because many humanities graduate programs neither treat data skills as a prerequisite nor train their students in data analysis methods. As the ‘laboratory for the humanities,’ libraries are uniquely suited to host new forms of collaborative exploration of big data by humanists. To do this successfully, however, libraries must address three challenges:
1) How to evolve technical infrastructure to support the analysis, not just the presentation, of digitized artifacts.
2) How to work with data that may fall under both copyright and licensing restrictions.
3) How to serve as trusted partners with disciplines that have evolved thoughtful critiques of quantitative and algorithmic methodologies.