Documentation

Our Digital Edition

In ShakeXSLT Digital Edition we present three different plays of Shakespeare:

The texts have been taken from the draCor website, but have been annotated in TEI by the project authors.

There is not a unique and standard definition of Digital Edition, but a general one could be the following:

The term edition is generally used to describe the result of an interpretative study of a text [from Digital editions and cataloguing, in Digital Scholarly Editing: Theories and Practices, Matthew James Driscoll and Elena Pierazzo, p. 162]

In this edition, thus, we aim at presenting some of the most important works of Shakespeare, presented in a standard way, through XML-TEI and XSLT. In fact, an edition should aim at providing a permanent record of the text analysed, following some basic requirements. It should be:

conform to XML standards for searchability, worldwide integration, interchange and repurposing of data. ‘A well-made electronic scholarly edition will be built on encoding of great complexity and richness’. The digital edition should include searchable text and images made possible with the use of appropriate and meaningful metadata [...] The web platform or content management system (CMS) on which the edition runs should conform to W3C (World Wide Web Consortium) standards; provide project documentation to allow the user to appreciate the edition’s limitations or customisations and thus better utilise the resource [...], text-image linking, as well as hyperlinking.[from A good digital edition? Some recommendations, in Digital Scholarly Editing: Theories and Practices, Matthew James Driscoll and Elena Pierazzo, pp. 180-181]

These requirements should be the basis of every resource which will be presented in a digital space, a well-structured material space. To rule this space, we need to apply the so-called "Editorialization":

Editorialization is the set of dynamics that produce and structure digital space. These dynamics can be understood as the interactions of individual and collective actions within a particular digital environment. [from On Editorialization, Marcello Vitali-Rosati, p. 7]

SDE criteria

Starting from the article of Patrick Sahle Criteria for Reviewing Scholarly Digital Editions, we have tried to accomplish each criterion in the creation of our digital edition.

To start with the first criterion, enough information for the Bibliographic identification of the SDE has been provided, clarifying the title, the responsible editors and the subject of the edition.

To go on with the second criterion, the texts are presented in a clear and complete version, allowing the reader to have a full understanding of them. Images are present, in order to allow a clearer identification of the texts (the original pages are presented).

For what concerns the 3rd criterion, a documentation has been provided in order to allow also non-expert readers to get a context of the edition. The target audience has been thought to be a general public, which has no previous knowledge of the content, neither of how to use a digital edition.

When we create editions, we are thinking about readers in two disciplines: readers who are editors, and readers who are not editors. […] Making editions that work for both editors and for the popular audience will always be tricky, and moving into the digital world does not really make it much easier. […] [from Results from the catalogue, in Digital Scholarly Editing: Theories and Practices, Matthew James Driscoll and Elena Pierazzo, p. 173]

Thus, talkin about the 4 criterion, we have decided to aim at a simple presentation, allowing a browsing through the index of the text, presented in a sidebar. We have also decided to present three different texts, with different plots, to offer a clearer view of the author's work, since

The telling proximity of one work to another, significant gatherings of materials, illustrations entered into the manuscript alongside the text and so forth all shape the way we understand the manuscript, but are often ignored when preparing scholarly editions. [from Building A Social Edition of the Devonshire Manuscript, in Digital Scholarly Editing: Theories and Practices, Matthew James Driscoll and Elena Pierazzo, p. 140]

Authority through DbPedia

For one of the three texts, The Tempest, we have added some authorities for what concerns the characters. We have thus used the tag xenoData, which provides a container element into which metadata in non-TEI formats may be placed. We have decided to take the information from a LOD environment, dbPedia, since

As Tim Berners-Lee (the inventor of the World Wide Web as we know it) remarks, the internet was originally developed for workers to collaborate and access source documents; with wiki and Web 2.0 technology, it is now returning to its roots. [from Building a social edition, in Digital Scholarly Editing: Theories and Practices, Matthew James Driscoll and Elena Pierazzo, p. 147]

Here we present the list of main characters which have been enriched with the corresponding dbPedia links:

Miranda: dbPedia
Caliban: dbPedia
Ferdinand: dbPedia
Stephano: dbPedia
Prospero: dbPedia
Ariel: dbPedia

Using the LOD environment allows us to take another point of view on the content of the edition. From this perspective, the content is not the only thing that matters, but also

collection of dynamic relationships that this content maintains with other content. And these relationships, which are part of an open process, determine the existence of a piece of content. It is the ensemble of relationships and links that make the content accessible and visible, and thus bring it into existence. Completely independent content would be inaccessible, invisible, actually non-existing. [from Processual Authorities, in On Editorialization Marcello Vitali-Rosati, p. 78]

XML-TEI and XSLT

For managing the three texts we have used two main technologies: TEI and XSLT. TEI is an XML-based markup language that enable scholars to store, analyze, and share humanities textual information.

Without guidelines such as the TEI, exchange and repurposing of data will not be possible and electronic editions will be used as standalone objects with their own set of characteristics, objectives and requirements. [from Results from the catalogue, in Digital Scholarly Editing: Theories and Practices, Matthew James Driscoll and Elena Pierazzo, p. 176]

XML

The XML standard (eXtensible Markup Language) is a flexible way to create information formats and electronically share structured data through the web.

TEI

The Text Encoding Initiative (TEI) is a consortium which collectively develops and maintains a standard for the representation of texts in digital form. The TEI Guidelines have been widely used by libraries, museums, publishers, and individual scholars to present texts for online research, teaching, and preservation.

OUR CHOICES

We have chosen TEI because it allows a specific description of the dramatic plays through a series of technical tags.
We have first of all used the filedesc tag for describing the metadata related to the bibliographic entity, such as the author tag, with the author name, within the titleStmt tag. We have described the publication through publicationStmt by addressing the source from which we have taken the texts (DraCor) and then the original text through the tag sourceDesc by inserting information about the title, the author and a respStmt with the editors' names and the license used. Then we have proceeded with the description of the of the play, through the profileDesc tag, which provides information about the language used and a detailed description of non-bibliographic information about the text, within the particDesc tag, in the listPerson tag. Each participant is identified by the tag person, with a persName tag, and an xml:id, meaningful for the specific character.

We have added information about the date of publication of the original text and a link between the text in DraCor and the related Wikidata entity in the standOff tag, which functions as a container element for linked data, contextual information, and stand-off annotations embedded in a TEI document. In the facsimile tag we have added a IIIF Image of the first page of each of the three comedy contained in the First Folio with its description.

The characters have then been inserted also in the front, within a castList tag, with a series of castItem tags. For every castItem a role and a roleDesc have been provided.

For each scene we have highlighted the head, with the number of the scene, and stage, which contains any kind of stage-direction (specified thorugh a type attribute) within the dramatic text. We have then chosen to specify a tag sp, for identifying the various speeches within the text, with the specification of the speaker of the speech.

The text has been then annotated within the l tag, which contains the transcription of a topographic line in the source document.

For what concerns the text, we have created some specific divs, either for the acts or the scenes (through the specification of the type)

XSLT

XSLT (eXtensible Stylesheet Language Transformations) is a high-level declarative language for transforming XML documents, and thus TEI, into other XML documents or other formats such as HTML. We have used this technology to transform TEI-annotated texts into html, in order to work upon them.

The dominant features of XSLT as a declarative language are that it is a rule-based language, where the rules are not arranged in any particular order, and it is side-effect free which enables XSLT rules to be called any number of times and in any order.

XSLT uses XML syntax. The root element is xsl:stylesheet, which must include a namespace declaration for XSLT. The optional xsl:output element tells the type of the target document. The root element is then filled with template rules, which describe how to transform elements in the source document, in this case, XML document. Each template rule consists of two parts: a pattern and a template. The pattern describes which XML element nodes should be processed by the rule. In some cases, patterns are specified using XPath expressions. On the other hand, the template describes the HTML structure that should be generated when nodes that match the pattern are found. In an XSLT stylesheet, a template rule is represented by an xsl:template element. The pattern is the value of its match attribute, and the template is the element’s content. The template may contain a sequence of text nodes and literal result elements to be copied to the output, and instructions to be executed according to the rules of the particular instruction.

Two XSLT instructions that will be used very often in the XSLT stylesheet we generated are xsl:value-of and xsl:apply-templates. The xsl:value-of instruction extracts the data content of an XML element and inserts it into the output. It has a select attribute which consists of a pattern. The xsl:apply-templates instruction finds all nodes that match the select attribute pattern, and processes each node in turn by applying the template rule that matches the node.

As described earlier, XSLT is a language specifically designed for transforming the structure of XML documents. The transformation process works as follows. A list of nodes from the source document is processed to create a result tree fragment. The result tree is constructed by processing a list containing just the root node. Within a list of source nodes, each list member is processed in order and the result tree structures are appended. A node is processed by finding all the template rules with patterns that match the node, and choosing the best amongst them; the template of the chosen rule is then instantiated with the node as the current node and with the list of source nodes as the current node list. A template typically contains instructions that select an additional list of source nodes for processing. This process is continued recursively until the list of source nodes is empty.

An HTML page is generated automatically for each text. We have designed one template matching with the three of our texts. We have used the free trial of the Oxygen editor.

XSLT is a powerful tool which can be used for building, as in this case, a digital edition. Even if

Robinson’s remark about the lack of easy-to-use production tools is unfortunately still valid: there is no software tool nor suite of tools that allows a scholar to produce a full digital edition, be it image-based with a diplomatic text or a critical edition, in a way comparable to how printed editions are prepared. Most of all, there is no ‘standard way’ to do it[from Production, in Digital Scholarly Editing: Theories and Practices, Matthew James Driscoll and Elena Pierazzo, p. 227]

XSLT is a valid tool for this purpose.

XSLT is far more powerful than CSS and can manipulate XML documents in ways that CSS cannot. XSLT can radically transform documents and can combine multiple XML data sources into a single output file or multiple output files.

The Portrait of Shakespeare

The image is taken from the William Jaggard's First Folio i.e. Mr. William Shakespeare's Comedies, Histories, & Tragedies, a collection of plays by William Shakespeare (1623). It is printed in folio format and contains 36 plays. A folio (from Latin foliō, abl. of folium, leaf) is a book or pamphlet made up of one or more full sheets of paper, on each of which four pages of text are printed, two on each side; each sheet is then folded one time to produce two leaves. Each leaf of a folio book thus is one half the size of the original sheet.

WHAT DID SHAKESPEARE LOOK LIKE?

The portrait of Shakespeare on the title page was engraved by Martin Droeshout and is one of only two portraits with any claim to authenticity. As Droeshout would have only been 15 when Shakespeare died it is unlikely that they actually met. Instead his picture was probably drawn from the memory of others, or from an earlier portrait. In his admiring verse ‘To the Reader’ at the start of the First Folio, the writer Ben Jonson declares that the engraver achieved a good likeness – he ‘hit’ or captured Shakespeare’s face well.

Images

The power of IIIF

We have decided to present the images of the operas through IIIF.
The International Image Interoperability Framework (IIIF, spoken as 'triple-I-eff'), is a model for presenting and annotating digital representation of objects. It offers a set of application programming interfaces (APIs) based on open web standards, defined in specifications derived from shared real world use cases.

It is a set of technical specifications built around shared challenges in cultural heritage access. More than the technical specifications, IIIF is a community of software, tools, content, people, and institutions solving image Interoperability challenges.

Is a way to standardize the delivery of images and audio/visual files from servers to different environments on the Web where they can then be viewed and interacted with in many ways.

Modern Web browsers understand how to display formats like .jpg and .mp4 at defined sizes, but cannot do much else. The IIIF specifications align with general Web standards that define how all browsers work to enable richer functionality beyond viewing an image or audio/visual files. For images, that means enabling deep zoom, comparison, structure (i.e., for an object such as a book, structure = page order) and annotation. For audio/visual materials, that means being able to deliver complex structures (such as several reels of film that make up a single movie) along with things like captions, transcriptions/translations, annotations, and more.

IIIF makes these objects work in a consistent way. That enables portability across viewers, the ability to connect and unite materials across institutional boundaries, and more.

So IIIF is a useful and helpful resource, through which some of the most common problems while dealing with the images, such as different images formats, sizes and storage issues, can be easily solved.

IIIF provides two core APIs:
Image API (I want to get image pixels)
Presentation API (I want to display the images)

There are several more APIs that IIIF supports including Search and Authentication. In this DSE we focused on the Image and Presentation APIs. We implemented both in our Digital Scholarly Edition to present the complete range of options.

Image API

The Image API provides a standardized way to request and deliver images. This can be as simple as, give me the original image to give me an upside-down tiled version of the image in gif format. The IIIF Image API is restful and allows for images to be served dynamically or from a static cache.

Images are requested using URI templates that have the following syntax:

{scheme}://{server}{/prefix}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}

You can see and try how this works in practice in the IIIF API playground.

To show Image API in our DSE we used OpenSeadragon, an open-source, web-based viewer for high-resolution zoomable images. It is a deep-zooming javascript client whose primary advantage is a very smooth loading and navigation experience.
It supports the IIIF Image API natively, and is very simple to get up and running by supplying an info.json reference as its "tileSource". This is described in a dedicated page in its documentation.

Presentation API

The Presentation API attaches basic metadata and structure to digital objects, defining how they appear in viewers. It does this via the Manifest, a JSON file which bundles up all the different elements of an IIIF object (such as a single image, or a series of images) with basic metadata (like title, description, and rights information) and structural information (such as page order).

The IIIF Presentation API specifies a web service that returns JSON-LD structured documents that together describe the structure and layout of a digitized object or other collection of images and related content. It enables you to provide metadata about the structure and layout of image objects. Image type objects represent things like:

images
groups of ordered images
groups of images that represent pages (book, manuscript)

The Presentation API provides metadata about how these image objects can be displayed. Many institutions expose all or a large part of their collections through the IIIF Image and Presentation APIs. In particular, we used the Digital Bodleian and the Miami University Library that provide easy IIIF access to their resources in their search interface.

Manifest

The manifest response contains sufficient information for the client to initialize itself and begin to display something quickly to the user. The manifest resource represents a single object and any intellectual work or works embodied within that object. In particular, it includes the descriptive, rights and linking information for the object. It then embeds the sequence(s) of canvases that should be rendered to the user.

The Bodleian Library gives access to a useful tool, the IIIF Manifest Editor. The IIIF Manifest Editor aims to reduce the guesswork and pure JSON editing when creating and/or manipulating manifests. It features a simple and easy-to-use interface with forms to edit the JSON value-pairs and drag-and-drop of images/canvases to facilitate the ordering and editing of the manifest without JSON knowledge. It's possible to validate and download the created/modified IIIF manifest.

Mirador

Mirador is a configurable, extensible, and easy-to-integrate image viewer, which enables image annotation and comparison of images from repositories dispersed around the world. Mirador has been optimized to display resources from repositories that support the IIIF API's.