The aim of this workshop is to present and discuss current ontology based annotation in text studies and to give participants an introduction to, and updated insight into, the field. One of the expected outcomes of the workshop is to throw light on the consequences and experiences of a renewed database approach to computer assisted textual work, based on the developments over the last decade in text encoding as well as in ontological systems.
The Network for Digital Methods in the Arts and Humanities (NeDiMAH) is a research network running from 2011 to 2015, funded by the European Science Foundation (ESF). The network will examine the practice of, and evidence for, advanced ICT methods in the arts and humanities across Europe, and articulate these findings in a series of outputs and publications. To accomplish this, NeDiMAH provides a locus of networking and interdisciplinary exchange of expertise among the trans-European community of digital arts and humanities researchers, as well as those engaged with creating and curating scholarly and cultural heritage digital collections. NeDiMAH will work closely with the EC-funded DARIAH and CLARIN e-research infrastructure projects, as well as other national and international initiatives. NeDiMAH includes the following Working Groups:
The WGs will examine the use of formal computationally-based methods for the capture, investigation, analysis, study, modelling, presentation, dissemination, publication and evaluation of arts and humanities materials for research. To achieve these goals the WGs will organise annual workshops. Whenever possible, the NeDiMAH workshops will be organised in connection with other activities and initiatives in the field.
The NeDiMAH WG3, Linked data and ontological methods, proposes to organise a preconference workshop ‘Ontology based annotation’ in connection with Digital Humanities 2012 in Hamburg.
The use of computers as tools in the study of textual material in the humanities and cultural heritage goes back to the late 1940s, with links back to similar methods used without computer assistance, such as word counting in the late nineteenth century and concordances from the fourteenth century onwards. In the sixty years of computer assisted text research, two traditions can be seen. One includes corpus linguistics and the creation of digital scholarly editions, while the other is related to museum and archival texts. In the former tradition, texts are commonly seen as first-class objects of study, which can be examined by the reader using aesthetic, linguistic or similar methods. In the latter tradition, texts are seen mainly as a source of information; readings concentrate on the content of the texts, not the form of their writing. Typical examples are museum catalogues and historical source documents. These two traditions will be called form oriented and content oriented, respectively. It must be stressed that these categories are not rigorous; they are points on a continuum.
Tools commonly connected to museum and archive work, such as computer based ontologies, can be used to investigate texts of any genre, be they literary texts or historical sources. Any analysis of a text is based on a close reading of it. The same tools can also be used to study texts which are read in both the form oriented and the content oriented way (Eide 2008; Zöllner-Weber & Pichler 2007).
The novelty of the approach lies in its focus on toolsets for modelling such readings in formal systems. The goal is not to make a clear, coherent representation of a text, but rather to highlight inconsistencies as well as consistencies, tensions as well as harmonies, in our readings of the texts. The tools used for such modelling can be created to store and show contradictions and inconsistencies, as well as to provide the user with means to detect and examine them. Such tools are typically used in an iterative way, in which results from one experiment may lead to adjustments in the model or in the way it is interpreted, similar to modelling as described by McCarty (2005). The source materials for this type of research are to be found in the results of decades of digital scholarly editing: not only does a wide variety of texts exist in digital form, but many of these texts have also been encoded in ways which can be used as starting points for the model building. Any part of the encoding can be contested, in the modelling work as well as in the experiments performed on the model. The methods developed in this area, of which the TEI guidelines are an example, provide a theoretical basis for this approach.
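The idea of a model that stores contradictory readings rather than resolving them can be illustrated with a minimal sketch. It is not taken from any of the tools discussed in this proposal: the `Assertion` record, the `find_contradictions` helper and the sample identifiers (`person:A`, `reader-1`, and so on) are all hypothetical, and the sketch assumes, for simplicity, that each property should have a single value, so that two different values for the same subject and property count as a contradiction.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One reading of a passage, kept as a subject-property-value statement."""
    subject: str  # entity the reading is about (hypothetical identifier)
    prop: str     # property asserted about that entity
    value: str    # value assigned by this particular reading
    source: str   # who or what produced the reading (a scholar, an encoding)


def find_contradictions(assertions):
    """Return the (subject, prop) pairs for which the stored readings disagree.

    Assumption of this sketch: every property is single-valued, so more than
    one distinct value for the same pair is treated as a contradiction.
    """
    grouped = defaultdict(list)
    for a in assertions:
        grouped[(a.subject, a.prop)].append(a)
    return {
        key: group
        for key, group in grouped.items()
        if len({a.value for a in group}) > 1
    }


# Two hypothetical readers annotate the same source; both readings are kept.
readings = [
    Assertion("person:A", "birthplace", "Bergen", "reader-1"),
    Assertion("person:A", "birthplace", "Oslo", "reader-2"),
    Assertion("person:A", "occupation", "farmer", "reader-1"),
    Assertion("person:A", "occupation", "farmer", "reader-2"),
]

conflicts = find_contradictions(readings)
for (subject, prop), group in conflicts.items():
    for a in group:
        print(f"{subject} {prop}: {a.value} (according to {a.source})")
```

Because every assertion keeps its source, nothing is overwritten when readings disagree; the conflicting statements remain in the model and can be examined side by side. A real ontology based system would more likely express such statements in a format such as RDF or OWL and let the ontology declare which properties are single-valued.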
At the end of the 1980s Manfred Thaller developed Kleio, a simple ontological annotation system for historical texts. Later, in the 1990s, hypertext, not databases, became the tool of choice for textual editions (Vanhoutte 2010: 131). The annotation system Pliny by John Bradley (2008) was designed both as a practical tool for scholars and because Bradley was interested in how scholars work when studying a text. One of the expected outcomes of this workshop is to throw light on the consequences and experiences of a renewed database approach in computer assisted textual work, based on the developments in text encoding over the last decade as well as in ontological systems.
A basic assumption is that reading a text includes a process of creating a model in the mind of the reader. This modelling process of the mind works in similar ways for all texts, be they fiction or non-fiction (see Ryan 1980). Reading a novel and reading a historical source document both result in models. These models will be different, but they can all be translated into ontologies expressed in computer formats. The external model stored in the computer system will be a different model from the one stored in the mind, but it will still be a model of the text reading. By manipulating the computer based model, new things can be learned about the text in question.
This method represents an answer to Shillingsburg’s call for editions which are open not only for reading by the reader, but also for manipulation (Shillingsburg 2010: 181), and to Pichler’s understanding of digital tools as means to document and explicate our different understandings and interpretations of a text (Zöllner-Weber & Pichler 2007).
A digital edition can be part of the text model stored in the computer system. As tools and representation shape thinking not only through the conclusions they enable but also through the metaphors they deploy (Galey 2010: 100), this model will inevitably lead to other types of questions being asked of the text. A hypothesis is that these new questions will lead to answers giving new insight into the texts under study. Some of these insights would not have been found using other methods.
There is a movement in the humanities from seeking local knowledge about specific cases (McCarty 2005) towards seeking general patterns. The readings discussed here are, in this respect, traditional humanities investigations into specific collections of one or a limited number of texts. The general patterns sought may rather be found on a meta-research level, where one investigates new ways in which research with a traditional scope can be performed.
The workshop is aimed at scholars interested in online and shared annotation of texts and media based on ontologies. Practice in the field is not a requirement. Knowledge of the concept ‘ontology’ or ‘conceptual model’ can be an advantage.
The aim of this workshop is to present and discuss current ontology based annotation in text studies, to give participants an introduction to, and updated insight into, the field, and to bring researchers together. One of the expected outcomes of this workshop is to throw light on the consequences and experiences of a renewed database approach in computer assisted textual work, based on the developments over the last decade in text encoding as well as in ontological systems.
Bradley, J. (2008). Pliny: A model for digital support of scholarship. Journal of Digital Information 9(1). http://journals.tdl.org/jodi/article/view/209/198 (checked 2011-11-01).
Crane, G. (2006). What Do You Do with a Million Books? D-Lib Magazine 12(3). http://www.dlib.org/dlib/march06/crane/03crane.html (checked 2011-11-01).
Eide, Ø. (2008). The Exhibition Problem. A Real-life Example with a Suggested Solution. Literary and Linguistic Computing 23(1): 27-37.
Galey, A. (2010). The Human Presence in Digital Artefacts. In W. McCarty (ed.), Text and Genre in Reconstruction: Effects of Digitalization on Ideas, Behaviours, Products and Institutions. Cambridge: Open Book Publishers, pp. 93-117.
Kleio-system. http://www.hki.uni-koeln.de/kleio/old.website/welcome.html (checked 2011-11-01).
McCarty, W. (2005). Humanities Computing. Basingstoke: Palgrave Macmillan.
Moretti, F. (2005). Graphs, maps, trees: abstract models for a literary history. London: Verso.
Ryan, M.-L. (1980). Fiction, non-factuals, and the principle of minimal departure. Poetics 9: 403-22.
Shillingsburg, P. (2010). How Literary Works Exist: Implied, Represented, and Interpreted. In W. McCarty (ed.), Text and Genre in Reconstruction: Effects of Digitalization on Ideas, Behaviours, Products and Institutions. Cambridge: Open Book Publishers, pp. 165-82.
Vanhoutte, E. (2010). Defining Electronic Editions: A Historical and Functional Perspective. In W. McCarty (ed.), Text and Genre in Reconstruction: Effects of Digitalization on Ideas, Behaviours, Products and Institutions. Cambridge: Open Book Publishers, pp. 119-44.
Zöllner-Weber, A., and A. Pichler (2007). Utilizing OWL for Wittgenstein’s Tractatus. In H. Hrachovec, A. Pichler and J. Wang (eds.), Philosophie der Informationsgesellschaft / Philosophy of the Information Society. Contributions of the Austrian Ludwig Wittgenstein Society. Kirchberg am Wechsel: ALWS, pp. 248-250.