About the project

Shorthand was an important part of Dickens's toolkit as a writer, but although he used it extensively for parliamentary reporting, letter writing, and note taking, little is known about how he did so. The unique system that he developed, based upon Gurney's Brachygraphy, is complex and puzzling; Dickens himself called it a 'savage stenographic mystery'.

There are at least 10 known manuscripts of Dickens's shorthand, dating from the 1830s to the late 1860s. These manuscripts are held in 6 archives across the world and in 2 private collections. Several remain undeciphered, including a letter from the 1850s and a set of shorthand booklets collected by Dickens's shorthand pupil, Arthur Stone. These booklets, totalling c.70 pages, include 6 undeciphered shorthand dictation exercises of 1-2 pages each. Dickens's shorthand has proved extremely difficult to decode and, in most cases, experts have been unable to locate the source texts used for the exercises. They could be published or unpublished passages written by Dickens, or by another author. The mystery of these undeciphered texts is as compelling for the public as it is for academics, and the sesquicentenary of Dickens's death in 2020 provides an ideal opportunity to harness wider interest in solving the 'Dickens Code'.

The material is novel in its own right and the task of deciphering it provides a test case with implications far beyond Dickens Studies. An approach that combines machine learning's power to identify patterns across datasets with contextual interpretation by volunteers is likeliest to succeed. However, the limited corpus and idiosyncratic nature of Dickens's shorthand create barriers to machine learning methods, while the complexity of the material places additional demands upon the human interpreter.

Tackling these challenges offers a template for approaching similarly complex decoding problems, where human expertise and technology have to work hand in hand. However, to understand these challenges and identify potential solutions, the 'Dickens Code' problem needs to be viewed in the round. Accordingly, this project will convene a network that draws expertise from different disciplinary areas (Dickens Studies, Digital Humanities, Forensic Linguistics, and Informatics) and stakeholder groups (museums and archives).

In Dickens Studies, enhanced understanding of the author's shorthand will lead to internationally significant insights about Dickens's creative process. For Informatics, the 'Dickens Code' may generate modified or novel approaches to handwritten coded material with broader applicability. In Digital Humanities, the 'Dickens Code' provides a template for engaging users with 'difficult' content. Beyond the Academy, increased public awareness of Dickens's shorthand will bring to light a little-known aspect of the life of one of the world's most famous authors. Throughout his career, Dickens sought to cultivate a close relationship with his readers; 150 years on, the 'Dickens Code' seeks to revitalise this connection by enabling academics and non-academics to work together to uncover Dickens's last unknown texts.