There have been two stories recently about the Digging Into Data Challenge Conference that highlighted our Criminal Intent project. The Chronicle of Higher Education has a two-part story on the conference that mentions this project in the second part. They quote the Criminal Intent respondent, Stephen Ramsay:
Mr. Ramsay’s talk celebrated how this kind of Big Data work can enhance rather than diminish the humanities’ traditional engagement with human experience. “The Old Bailey, like the Naked City, has eight million stories. Accessing those stories involves understanding trial length, numbers of instances of poisoning, and rates of bigamy,” he said in his response. “But being stories, they find their more salient expression in the weightier motifs of the human condition: justice, revenge, dishonor, loss, trial. This is what the humanities are about. This is the only reason for an historian to fire up Mathematica or for a student trained in French literature to get into Java.”
The second story is in Science News and is titled Crime’s digital past. They quote us on how digital techniques are received by traditional historians.
Cohen and his colleagues know that many humanities scholars hold digital humanists in as low esteem as Old Bailey prosecutors once held women accused of bigamy. That’s certainly true of historians, in Hitchcock’s view. “About 90 percent of them sit quietly in an archive for a decade and then write a book with their names printed as large as possible on the cover,” Hitchcock says. In their world, data-crunching makes rude noises with no apparent historical meaning.
The With Criminal Intent project was presented by Stéfan Sinclair at a large plenary session of the Association for Canadian Studies in the United States (ACSUS) in Ottawa on November 18th, 2011. The session was chaired by Chad Gaffield, the president of the Social Sciences and Humanities Research Council (SSHRC) and was intended to showcase the transformative potential of the Digging into Data program and the research it supports.
Our White Paper on the Criminal Intent project is now available as a PDF.
Here is the Table of Contents:
Executive Summary 1
0. Introduction and Aims of the Project 2
1. The Old Bailey API 3
2. Zotero As Intermediary 5
3. Voyeur Tools 10
4. ‘Show Me More Like This’ 14
5. Data Warehousing 16
6. Challenges, Academic Reaction and Usability 20
7. Prototyping and Literate Programming 21
8. Pulling It All Together 23
9. References 25
Appendix 1: Connecting Zotero to Voyeur 27
The New York Times has an article on the Criminal Intent project. See Old Bailey Trials Are Tabulated for Scholars Online (“As the Gavels Fell: 240 Years at Old Bailey,” Patricia Cohen, August 17, 2011). They quote a historian who is skeptical of the results of data mining, though he appreciates the resource.
“The Old Bailey Online project has done a great service in making those sources widely (and costlessly) available,” Mr. Langbein wrote in an e-mail. But he complained that the claims about data mining have “a breathless quality: ‘you can expect big things from us,’ but as yet it’s all method and no results.” He said that the new findings belittle the work of a generation of scholars who focused on the 18th century as the turning point in the evolution of the criminal justice system.
The challenge for us will be to show results from the methods.
We found a variety of insights and ongoing implications from this project, ranging from the technical to the interpretative. In no particular order:
1) We realized that to pursue intellectual agendas such as the differing crime patterns of women and men we needed multiple points of entry rather than a single massive visualization. We thus followed several tracks at the same time, including data warehousing (see below), mathematical models, and small-to-large visualizations.
2) We had not anticipated how helpful data warehousing would be. The technique comes out of business intelligence and was unfamiliar to many of us, but it served us well on the project. It struck us that we should look more closely at business processes for ideas for this kind of work.
3) We also did not anticipate how helpful quick modeling with Mathematica would be. Team members Tim Hitchcock and Bill Turkel used Mathematica to create prototypes of various text mining and visualization tools. One advantage of Mathematica is that it makes it very easy to build dynamic prototypes that can be shared with colleagues for feedback.
4) As a team we noticed an interesting interaction in which we had to accept each other’s approaches. This was particularly important because those from the Old Bailey project, who had come in with an appreciation for their structured data, had to come to understand how the Old Bailey could be seen as a mass of unstructured data for text mining. The text miners in the group in turn had to look more closely at what could be done with structured data. This was a fruitful exchange.
5) We are extremely proud that we got these very diverse projects to interoperate on the level of code and of interpretation, something which is often discussed in the digital humanities but rarely executed.
6) Related to 5), we have a newfound appreciation of the power of APIs, and indeed following our project JISC is now advising projects about the usefulness of APIs.
7) We have several outcomes that are leading to additional work, including a new SSHRC grant to the Canadian partners based on the Mathematica work, a new plug-in for Zotero that works beyond Voyeur for text mining collections, and the aforementioned API admiration from JISC.
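The contrast between the structured and unstructured views in point 4, and the warehouse-style counting in point 2, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names (`gender`, `offence`, `text`) and the helper functions `rollup` and `word_frequencies` are all hypothetical, invented for this example, and are not the project's data, schema, or code.

```python
from collections import Counter

# Hypothetical toy records, invented for illustration only.
# They are NOT real Old Bailey data and do not reflect the project's schema.
TRIALS = [
    {"gender": "female", "offence": "theft", "text": "the prisoner took a silk handkerchief"},
    {"gender": "male", "offence": "theft", "text": "the prisoner took a silver watch"},
    {"gender": "female", "offence": "bigamy", "text": "she married a second husband"},
    {"gender": "male", "offence": "theft", "text": "he took a watch from the shop"},
]


def rollup(trials, *dimensions):
    """Structured view: count trials grouped by the chosen fields,
    in the spirit of a data-warehouse roll-up."""
    return Counter(tuple(trial[d] for d in dimensions) for trial in trials)


def word_frequencies(trials):
    """Unstructured view: treat all trial text as one bag of words."""
    return Counter(
        word for trial in trials for word in trial["text"].lower().split()
    )


by_gender_offence = rollup(TRIALS, "gender", "offence")
frequencies = word_frequencies(TRIALS)
```

The same records answer two kinds of question: `by_gender_offence` supports comparisons such as crime patterns of women and men, while `frequencies` ignores the fields entirely and looks only at the words.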
SSHRC has awarded funding to Stéfan Sinclair & Geoffrey Rockwell for the Voyeur Notebooks project. Voyeur Notebooks will prototype a new, web-based interface that will allow humanities researchers and students to create analytic works of “literate programming” that interweave narrative about the intellectual process of text analysis with procedural source code. Designing and implementing this prototype will serve the following five core objectives:
- To determine whether or not it is possible to design and implement a literate programming interface as a web application.
- To determine the optimal architecture, that is, which functionality should run immediately within the browser (client-side) and which should require a call to a web service (server-side).
- To explore what code syntax would be most appropriate for a primarily humanities audience.
- To explore how Voyeur Notebooks might be designed to leverage social media practices.
- To determine the viability of a more robust and full-featured literate programming interface, perhaps supported by a proposal to the Insight Grant program.
We look forward to developing Voyeur Notebooks within the context of the Criminal Intent collaboration.