August 13th, 2011
The New York Times has an article on the Criminal Intent project. See Old Bailey Trials Are Tabulated for Scholars Online (“As the Gavels Fell: 240 Years at Old Bailey,” Patricia Cohen, August 17, 2011). The article quotes a historian who is skeptical of the results of data mining, though he appreciates the resource.
“The Old Bailey Online project has done a great service in making those sources widely (and costlessly) available,” Mr. Langbein wrote in an e-mail. But he complained that the claims about data mining have “a breathless quality: ‘you can expect big things from us,’ but as yet it’s all method and no results.” He said that the new findings belittle the work of a generation of scholars who focused on the 18th century as the turning point in the evolution of the criminal justice system.
The challenge for us will be to show results from the methods.
June 7th, 2011
We found a variety of insights and ongoing implications from this project, ranging from the technical to the interpretative. In no particular order:
1) We realized that to pursue intellectual agendas, such as the differing crime patterns of women and men, we needed multiple points of entry rather than a single massive visualization. We thus followed several tracks at the same time, including data warehousing (see below), mathematical models, and small-to-large visualizations.
2) We had not anticipated how helpful data warehousing would be. It comes out of business intelligence but served us well on the project and was not something many of us were familiar with. It struck us that we need to look more at business processes for ideas for this kind of work.
3) Another thing we did not anticipate was how helpful quick modeling with Mathematica would be. Team members Tim Hitchcock and Bill Turkel used Mathematica to create prototypes of various text mining and visualization tools. One advantage of using Mathematica is that it is very easy to build dynamic prototypes that can be shared with colleagues to get their feedback.
4) As a team we noticed an interesting interaction where we had to accept each other’s approaches. This was particularly important because those in the Old Bailey team who had come in with an appreciation for their structured data had to come to understand how the OB could be seen as a mass of unstructured data for text mining. The text miners in the group in turn had to look more closely at what could be done with structured data. This was a fruitful exchange.
5) We are extremely proud that we got these very diverse projects to interoperate on the level of code and of interpretation, something which is often discussed in the digital humanities but rarely executed.
6) Related to 5), we have a newfound appreciation of the power of APIs, and indeed following our project JISC is now advising projects about the usefulness of APIs.
7) We have several outcomes that are leading to additional work, including a new SSHRC grant to the Canadian partners based on the Mathematica work, a new plug-in for Zotero that works beyond Voyeur for text mining collections, and the aforementioned API admiration from JISC.
March 4th, 2011
November 19th, 2010
Joerg Sanders, working with John Simpson at the University of Alberta, has a first prototype of a “data warehousing” tool that will let users explore the Old Bailey data through an interface for making comparisons. Above you see an example of comparisons by gender over a time period.
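As a rough illustration of the kind of comparison described above, here is a minimal Python sketch that counts trials by decade and defendant gender. It is not the prototype's code: the record layout, the field order, and the sample rows are all hypothetical and stand in for whatever structured fields the Old Bailey data actually exposes.

```python
from collections import Counter

# Hypothetical sample rows: (year, defendant_gender, offence_category).
# These are invented for illustration, not taken from the Old Bailey data.
records = [
    (1745, "female", "theft"),
    (1748, "male", "theft"),
    (1749, "female", "deception"),
    (1752, "male", "violent theft"),
    (1756, "male", "theft"),
    (1758, "female", "theft"),
]


def decade(year):
    """Round a year down to the start of its decade, e.g. 1745 -> 1740."""
    return year - year % 10


def counts_by_decade_and_gender(rows):
    """Count rows for each (decade, gender) pair, like a small pivot table."""
    counts = Counter()
    for year, gender, _offence in rows:
        counts[(decade(year), gender)] += 1
    return counts


if __name__ == "__main__":
    counts = counts_by_decade_and_gender(records)
    # Print one line per decade with the female and male counts side by side.
    for dec in sorted({d for d, _ in counts}):
        print(dec, "female:", counts[(dec, "female")], "male:", counts[(dec, "male")])
```

A data warehouse generalizes this idea: the same counts can be sliced by any structured dimension (offence, verdict, punishment) rather than only by gender and decade.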
June 17th, 2010
Tableau Display of Subset of Old Bailey Data
At the Mind the Gap workshop a Criminal Intent team experimented with a number of promising data visualization and mining techniques. For example, we tried a “data warehouse” visual comparison model using the Tableau software on data formatted for it. The image above shows how we can compare trials based on structural information. Here is a visualization using correspondence analysis.
Correspondence Analysis Visualization
Now we have to select the most promising techniques and develop them further.
Normalized Compression Distance
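For readers unfamiliar with the measure named in the caption above, normalized compression distance (NCD) estimates how similar two texts are by checking how much better they compress together than apart: NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The following is a minimal sketch, assuming zlib as the compressor and using invented sample strings. It is not the code the team used at the workshop.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data (assumed compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Invented sample texts, repeated so the compressor has something to work with.
theft_a = b"the prisoner was indicted for stealing a silver watch " * 20
theft_b = b"the prisoner was indicted for stealing a silver spoon " * 20
forgery = b"she was acquitted of forging a bill of exchange at the sessions " * 20

if __name__ == "__main__":
    print("theft_a vs theft_b:", ncd(theft_a, theft_b))
    print("theft_a vs forgery:", ncd(theft_a, forgery))
```

Lower values indicate more shared content. Because the measure needs no structured fields, it suits treating the trial texts as unstructured data.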
May 6th, 2010
The Criminal Intent project is one of the teams invited to participate in the Mind the Gap: Bridging the Humanities and High Performance Computing workshop. This workshop will bring together humanities research teams that can use High Performance Computing (HPC) with specialists to explore ways of using HPC. The workshop is being organized at the University of Alberta, and Criminal Intent participants from the UK and Canada will use it to identify data-mining techniques to implement.
April 15th, 2010
Screen shot of Voyeur with Old Bailey data
The With Criminal Intent project has connected Voyeur with the Old Bailey Online project in a preliminary prototype. Click here to try Voyeur with a subset of the full Old Bailey Corpus.