Voyeur Tools & Old Bailey

Voyeur Tools is a web-based text reading and analysis environment. It is designed to be user-friendly, flexible and powerful. It is also designed to work with larger document sets – like the Old Bailey – than previous web-based text analysis tools.

Users are encouraged to consult the full documentation available at hermeneuti.ca/voyeur. The content below is meant as a quick overview of using Voyeur Tools, with an emphasis on functionality relevant to the Old Bailey and Zotero.

Voyeur Tools

Points of Entry

The most convenient way to work with the Old Bailey is to use the new Old Bailey API (see here for more information). This interface allows you to do fine-grained searches and then to create a new Voyeur Tools corpus from any result set.


Similarly, it is possible to create a Zotero entry from one or more Old Bailey search results (see more information here). Once you have a Zotero entry you can use the Analytics Plugin to send the entry (with one or more documents) directly to Voyeur Tools.



Finally, it’s possible to submit a set of URLs via the Voyeur Tools front page (voyeurtools.org). For example, one could search for “murder” in the Old Bailey, open each result page, copy the URL of the “print-friendly version” and submit the collected URLs.


Overview of the Default Interface

There are about twenty different tools available in Voyeur. Each tool is modular and can be combined with others as part of a skin. The current default skin in Voyeur Tools includes the following tools that are initially visible:

  • Cirrus (a word cloud visualization)
  • Summary (an overview of the corpus including word counts and aggregate trends)
  • Reader (a scalable text reader that can be used to scroll very large documents)

In addition, the following tools become visible as the user begins interacting with data (by clicking on words, for instance):

  • Word Trends (a distribution graph that shows word frequencies across multiple documents or within a single document)
  • Keyword in Context (a list that shows each occurrence of a word in its context)
  • Words in the Corpus (aggregate frequency information for words in multiple documents)
  • Words in the Document (frequency information for words within individual documents)
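
For readers who want a sense of what these four tools compute, here is a minimal Python sketch of word counts, a word trend and a keyword-in-context listing. It is an illustration only, not the way Voyeur Tools is implemented: the function names, the simple tokenizer and the two sample sentences are all assumptions made for this example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def word_frequencies(documents):
    """Return one Counter of raw word counts per document."""
    return [Counter(tokenize(doc)) for doc in documents]

def corpus_frequencies(documents):
    """Sum the per-document counts into a single corpus-wide Counter."""
    total = Counter()
    for counts in word_frequencies(documents):
        total.update(counts)
    return total

def word_trend(word, documents):
    """Relative frequency of `word` in each document, in document order."""
    trend = []
    for doc in documents:
        tokens = tokenize(doc)
        trend.append(tokens.count(word) / len(tokens) if tokens else 0.0)
    return trend

def keyword_in_context(word, text, width=3):
    """Return each occurrence of `word` with `width` tokens on either side."""
    tokens = tokenize(text)
    hits = []
    for i, token in enumerate(tokens):
        if token == word:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits

# Invented sample texts, not taken from the Old Bailey.
docs = [
    "The prisoner was indicted for murder and pleaded not guilty.",
    "The theft was witnessed by a watchman; no murder was alleged.",
]

print(corpus_frequencies(docs).most_common(3))
print(word_trend("murder", docs))
print(keyword_in_context("murder", docs[0], width=2))
```

The per-document counts correspond roughly to Words in the Document, the summed counts to Words in the Corpus, the list of relative frequencies to Word Trends, and the left/keyword/right triples to Keyword in Context.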



More documentation and help are available for each of the individual tools. Please note that each tool has specific documentation, but several have the following components:


Additional Tools and Skins

Beyond the tools available in the default skin of Voyeur Tools, there are several other tools and skins available. As described in more detail below, it is possible to export a corpus by clicking on the “skin export” icon and choosing the skin builder, which shows the list of tools available and provides an interface for building new skins (combinations of tools).

One of the more interesting skins provides more advanced functionality for correspondence analysis (see a more detailed description here). Essentially, correspondence analysis is one way of representing how words cluster around certain documents according to their relative frequency. The steps to open this skin are not ideal at the moment (we hope to add a more user-friendly mechanism soon):


Points of Exit

Voyeur Tools can be used as a stand-alone text analysis environment, but it has also been designed to facilitate integration into a broader research workflow. There are three main ways of extending the usefulness of Voyeur Tools:

  1. generating a (somewhat) persistent URL with the current tool(s), corpus and settings
  2. generating a code snippet to embed the current tool(s), corpus and settings into remote content (like blogs)
  3. exporting data for use in other applications like Excel

All of these points of exit originate with the “Export” button (the icon that resembles a diskette). There’s a difference between exporting from the skin (the combination of tools) and exporting from individual tools. The skin provides links to the tool browser and the skin builder (to create your own combination of tools) but does not provide functionality to embed code snippets or to export data. In contrast, individual tools do provide functionality for exporting data, in formats that depend on the type of tool. For instance, some of the tabular data tools allow the user to export comma-separated values whereas some of the visualization tools allow the user to export a static image.



Exporting the URL of the current tool(s) and corpus can be useful for bookmarking or for sharing work with colleagues (email, Twitter, etc.). Although no guarantee is made that the corpus will be retained indefinitely, it is unlikely to be cleaned out if it has been accessed in the past two weeks.

Exporting a code snippet is similar to embedding a YouTube clip: you copy and paste the code snippet into your own content such as a web-based essay or blog. Rather than sending texts to Voyeur Tools you’re bringing tools into your texts and allowing your readers to experiment and play with the analytic functionality. The tool is live in its embedded form, which means that the user can change parameters (locally) or spawn new windows with different tools. It is worth noting that some Content Management Systems like WordPress may not allow iframe tags to be used in blog posts without tweaking administrative settings.
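
To give a sense of what such a snippet looks like, the short Python sketch below builds a generic iframe tag around a URL. It is not the code that Voyeur Tools generates: the function name, the default width and height, and the example.org address are placeholders invented for this illustration. In practice you would simply copy the snippet offered by the Export dialog.

```python
from html import escape

def make_embed_snippet(tool_url, width="100%", height="400"):
    """Build an iframe tag that embeds the page at `tool_url`."""
    return (
        f'<iframe src="{escape(tool_url, quote=True)}" '
        f'width="{escape(str(width), quote=True)}" '
        f'height="{escape(str(height), quote=True)}"></iframe>'
    )

# Placeholder address; use the URL that the Export dialog gives you.
snippet = make_embed_snippet("http://example.org/tool?corpus=demo&view=Cirrus")
print(snippet)
```

Escaping the URL matters because ampersands in a query string have to be written as entities inside an HTML attribute.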

Voyeur Tools provides a wide range of functionality for analyzing and visualizing texts, but there may be cases where it’s preferable to do further work in a different application. Voyeur allows you to export data in various formats, depending on the type of tool being used. Supported export formats include plain text, comma-separated values, XML, and static images.
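
As one example of further work outside Voyeur, exported comma-separated values can be read with a few lines of Python. The sketch below is hedged accordingly: the column names "term" and "count" and the sample rows are assumptions made for illustration, so check the header row of your own export and adjust the names to match.

```python
import csv
import io

# Invented stand-in for an exported file; real column names may differ.
exported = io.StringIO(
    "term,count\n"
    "murder,42\n"
    "prisoner,97\n"
    "guilty,65\n"
)

def load_counts(handle, term_column="term", count_column="count"):
    """Read term/count rows from a CSV file object into a dictionary."""
    reader = csv.DictReader(handle)
    return {row[term_column]: int(row[count_column]) for row in reader}

counts = load_counts(exported)

# List the terms from most to least frequent.
top = sorted(counts.items(), key=lambda item: item[1], reverse=True)
for term, count in top:
    print(f"{term}\t{count}")
```

With a real export you would replace the in-memory sample with something like open("export.csv", newline=""), where the file name is again only an example.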

Finally, Voyeur Tools also suggests bibliographic entries for citing the tool(s) that you have used – doing so helps spread the word about Voyeur Tools, which in turn helps to justify continued funding to develop the project.

Final Remarks

Voyeur Tools does a lot, but there are limitations and bugs. Although the underlying system has been designed to support large-scale text analysis, the current server infrastructure has performance and reliability issues. If you try loading a corpus into Voyeur and it doesn’t seem to respond after about a minute, try loading a smaller corpus (or contact us to help load the corpus for you). Similarly, several of the tools are in relatively early development, and it’s best to view results with circumspection and to anticipate problems. Please let us know if you do encounter difficulties as that will help us improve the tools!