Birte Lönneker and Jan Christoph Meister
Deep Blue may have beaten Kasparov at chess, but whether computers will ever be able to generate well-formed and aesthetically pleasing narratives is still subject to dispute (Bringsjord/Ferrucci; Pérez y Pérez/Sharples). Most AI researchers have come to the conclusion that the generation of natural language narratives that are both domain independent and Turing test compliant is a 'killer application': it defines the outer limit of computational creativity.
The current paper reports on our research into Story Generator Algorithms (SGAs), that is, computational systems designed to generate natural language narratives. Though attempts at designing and implementing SGAs date back to the early 1970s, they have not received much attention in Humanities Computing (HC) circles. The relevance of these experimental and speculative approaches seemed rather limited in the light of practical HC desiderata such as the definition of mark-up conventions, document type definitions, and standards for digital resource building, to mention but a few. However, contrary to this pragmatist line of reasoning, we would like to argue that SGAs, in their abstract models, make explicit some of the cardinal assumptions underlying our intuitive human models of narrative, which in turn have filtered down into the practice of humanities computing, in whose object domain narratives play a dominant role. Our own research methodology is therefore in part empirical (we aim to survey and classify the types of SGAs developed thus far) and in part theoretical. The current paper focuses on one of the theoretically oriented tasks: the design of the architecture for a hypothetical 'ideal' SGA that would be able to emulate advanced, aesthetically validated human storytelling capability.
The system architecture of this ideal SGA is derived from advanced models of storytelling developed in narratology, i.e. the humanities methodology dedicated to the scientific study of narratives. Figure 1 shows the architecture with its four domains:
- the goal domain, in which several kinds of story-telling goals are offered, for a random or user selection;
- the knowledge domain, in which static knowledge is represented in concepts and their interrelations in an ontology, to which language-specific lexica as well as a case-base of previous and system-generated stories are related;
- the histoire domain, containing three modules concerned with the question: "what happens?", or with the production of the content of the story;
- the discours domain, combining two modules that aim to answer the question: "how is the content presented?"
The two system complexes labelled histoire domain and discours domain mirror the two main 'levels' of narratological description introduced by structuralist scholars, histoire and discours (Todorov). However, the 'level' metaphor used by narratologists (a residue of the structuralist 'deep layer' vs. 'surface layer' dichotomy) misleadingly implies a generative hierarchy which is at the same time uni-linear and strictly bottom-up, starting at the histoire level. We prefer instead to use the non-hierarchical 'domain' metaphor because it is better suited to accommodate backtracking procedures. Those procedures are necessary during the generation of the final product, which should eventually reflect the intertwined results of constrained operations on knowledge pertaining to both histoire and discours; the backtracking is therefore possibly recursive and iterative within and between both domains. With respect to implemented models for the generation of natural language artefacts, this view is more in line with the engagement-reflection model of the story generator MEXICA (Pérez y Pérez/Sharples) than with those of the story generator BRUTUS (Bringsjord/Ferrucci) or of generators for technical texts (cf. Reiter), all of which basically use a unidirectional pipeline model. In the two remaining sections, we will briefly present key elements of our model by way of example.
Figure 1: Architecture of an ideal SGA
According to a minimal consensus definition, the histoire (story) is a chronological sequence of narrated events, together with their participants. Some narratologists have identified actions as the constitutive elements of stories, where an action is a special type of event, intentionally caused by at least one of the participants. Often it is also claimed that in order to form a story, the events need to be causally related (cf. Rimmon-Kenan 16–17). However, narratology and related approaches have identified further types of relations between elements in the story domain that might equally contribute to storiness. For example, our previous formulation of a computational model of 'episode' (Meister) has shown that the episode is an interpretive construct of several events (four in that model) based on the activation of various semiological (semantic) relations, including contradictories.
In the system architecture shown above, events are represented as classes or frames and have at least the following slots (properties): a follows slot, the filler (value) of which points to the previous event; the slots parallels and causes, pointing to the respective events; the slots involvesExistent and causedByExistent, pointing to the respective participants (modelled as existents); the slot changesStateInto, whose filler is an attribute-value pair of one of the involved existents (the effect of the event is the resulting state of the affected existent); and, finally, an isIntentional slot to indicate the intentionality of the event. Some of the fillers are mandatory, others are not; some slots can host a list of fillers. Existents like characters and objects (cf. Chatman) are also modelled as concepts with slots (attributes) and fillers (values).
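The frame representation described above can be sketched in Python. This is only an illustration under stated assumptions: the slot names come from the text (rendered in snake_case), but the Python types, the choice of which slots hold lists, the `apply_effect` helper, and the example events are hypothetical and not part of the described system.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Existent:
    """A participant (character or object) with attribute slots and fillers."""
    name: str
    attributes: dict[str, str] = field(default_factory=dict)


@dataclass(eq=False)  # identity-based equality, since events reference each other
class Event:
    """An event frame. Slot names follow the text; the types are assumptions."""
    label: str
    follows: Optional["Event"] = None                      # previous event
    parallels: list["Event"] = field(default_factory=list)
    causes: list["Event"] = field(default_factory=list)
    involves_existent: list[Existent] = field(default_factory=list)
    caused_by_existent: list[Existent] = field(default_factory=list)
    # (affected existent, attribute, new value); None if no state change
    changes_state_into: Optional[tuple[Existent, str, str]] = None
    is_intentional: bool = False

    def apply_effect(self) -> None:
        """Hypothetical helper: write the resulting state into the existent."""
        if self.changes_state_into is not None:
            existent, attribute, value = self.changes_state_into
            existent.attributes[attribute] = value


# Invented example: an intentional attack that causes a change of state.
king = Existent("king", {"status": "alive"})
dragon = Existent("dragon")
attack = Event("dragon attacks king", involves_existent=[king],
               caused_by_existent=[dragon], is_intentional=True)
death = Event("king dies", follows=attack, involves_existent=[king],
              changes_state_into=(king, "status", "dead"))
attack.causes.append(death)
death.apply_effect()
```

Keeping the effect as an attribute-value pair on the event, rather than mutating the existent directly, leaves the event sequence inspectable before any state is changed.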
The model defines further narratological notions in the histoire domain in terms of events and their relatedness; in particular, story schema, closed event sequence and story (cf. Forster 93-94) can be defined as follows:
- A story schema applies to a group of stories whose histoire is similar, for example fairy tales, legends, or detective stories. It constitutes a predefined closed set of existent classes and event classes together with their relations. The ProtoPropp generator presented in another paper of this session is an example of a story-schema-based approach.
- A closed event sequence is a temporal succession of events with a constant (sub)set of participants and a start event and end event. The temporal succession can be obtained from the fillers in the follows slots of the events.
- Instead of explicitly defining what a story is, the model allows a comparison of different event sequences with respect to their storiness, which is measured in terms of a) the numerical ratio between all events and action events in the considered event sequence; b) the numerical ratio between the events and the causal relations in the event sequence.
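The last two definitions can be illustrated with a short Python sketch. It rests on several assumptions: the text does not fix the direction of the two ratios, so the sketch divides action events and causal relations by the total number of events; an action is taken to be any event flagged as intentional; and the check for a constant set of participants is left out. The function names and example events are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # identity-based equality keeps events hashable
class Event:
    label: str
    follows: Optional["Event"] = None                 # previous event
    causes: list["Event"] = field(default_factory=list)
    is_intentional: bool = False                      # True marks an action


def event_sequence(end: Event) -> list[Event]:
    """Recover temporal order by walking the follows links back from the end event."""
    ordered = []
    current = end
    while current is not None:
        ordered.append(current)
        current = current.follows
    ordered.reverse()
    return ordered


def storiness(events: list[Event]) -> tuple[float, float]:
    """Return (action ratio, causal ratio); the ratio direction is an assumption."""
    if not events:
        raise ValueError("empty event sequence")
    members = set(events)
    actions = sum(1 for e in events if e.is_intentional)
    # Count only causal relations whose target lies inside the sequence.
    causal_links = sum(1 for e in events for c in e.causes if c in members)
    return actions / len(events), causal_links / len(events)


# Invented example: three events, one action, two causal relations.
storm = Event("storm breaks")
flee = Event("hero flees", follows=storm, is_intentional=True)
wreck = Event("ship sinks", follows=flee)
storm.causes.extend([flee, wreck])

sequence = event_sequence(wreck)
action_ratio, causal_ratio = storiness(sequence)
```

Two sequences can then be compared by their ratio pairs; how the two ratios should be weighted against each other is not specified in the text and is left open here.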
The discours domain should take into account several discours parameters, or aspects of text description defined in structuralist narratology. Such discours parameters include for example the Order of the presented events (which might differ from the order in which they actually occur in the story) and the Frequency with which the same or similar event(s) are presented (Genette), or the Mediation-of-relatedness, to name but a few. Currently, we work with a set of twelve discours parameters.
Every parameter has a list of subconcepts, which represent the actual phenomena they subsume. For example, the Order parameter has the subclasses anachrony and synchrony, with anachrony comprising the phenomena flashback, flashforward, and achrony. One of the aims of the project is to identify a standard phenomenon (default) in every parameter class. For example, in unmarked narrative texts, events are most likely to be presented in their chronological order (Order-subclass: synchrony) and motivation relations such as 'causality' are most likely to be left implicit (Mediation-of-relatedness subclass: implicit).
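As a rough illustration, the parameter classes and their defaults could be held in a small data structure like the Python sketch below. The Order subclasses and the two defaults are taken from the text; the `explicit` subclass of Mediation-of-relatedness, the class name `DiscoursParameter`, and its methods are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class DiscoursParameter:
    name: str
    # subclass name -> phenomena it comprises (empty list if none are named)
    subclasses: dict[str, list[str]]
    default: str  # the standard (unmarked) subclass

    def __post_init__(self) -> None:
        if self.default not in self.subclasses:
            raise ValueError(f"unknown default {self.default!r} for {self.name}")

    def subclass_of(self, choice: str) -> str:
        """Find the subclass that a subclass name or phenomenon name belongs to."""
        for subclass, phenomena in self.subclasses.items():
            if choice == subclass or choice in phenomena:
                return subclass
        raise KeyError(choice)

    def is_marked(self, choice: str) -> bool:
        """A choice is marked when it falls outside the default subclass."""
        return self.subclass_of(choice) != self.default


order = DiscoursParameter(
    name="Order",
    subclasses={
        "anachrony": ["flashback", "flashforward", "achrony"],
        "synchrony": [],
    },
    default="synchrony",
)

mediation = DiscoursParameter(
    name="Mediation-of-relatedness",
    # "explicit" is an assumed counterpart; the text names only "implicit".
    subclasses={"implicit": [], "explicit": []},
    default="implicit",
)
```

With this layout, `order.is_marked("flashback")` reports a marked choice, while `order.is_marked("synchrony")` does not.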
Discours parameters operate on, or apply to, specific types of elements belonging to the histoire domain. For example, the Order parameter operates on events and event sequences. Furthermore, each discours parameter supplies a modification rule that states in which way it affects the story representation during the preparation of the discours. Thus the rule for flashforward, a phenomenon of the Order parameter, states that the affected story element (event) be moved one position earlier in the sequence of presented events, i.e. towards the start of the story.
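The flashforward rule described above can be sketched as a simple list operation in Python. This is a minimal sketch, assuming that the presented sequence is a list of event labels and that an event already in first position stays where it is; the function name and the example labels are hypothetical.

```python
def flashforward(presented: list[str], event: str) -> list[str]:
    """Return a new presentation order with `event` moved one position
    towards the start; the input list is left unchanged."""
    index = presented.index(event)  # raises ValueError if the event is absent
    reordered = list(presented)
    if index > 0:
        # Swap the event with its predecessor in the presented sequence.
        reordered[index - 1], reordered[index] = reordered[index], reordered[index - 1]
    return reordered


chronological = ["storm breaks", "hero flees", "ship sinks"]
presentation = flashforward(chronological, "ship sinks")
```

Returning a new list keeps the chronological histoire order intact, so that the discours order can be revised again if a later step needs to backtrack.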
Future work will include the representation of the aesthetic or communicative effect of each of the discours parameters and the study of application restrictions on these parameters and on combinations of them. It is important to note that those effects and restrictions are not absolute, but depend on the types of events and existents and their relations used in the histoire domain. Therefore, the 'communication' between operations performed in both domains, described as backtracking above, will be necessary in the generation process.
While the implementation of the hypothetical 'ideal' SGA outlined above may seem practically impossible, it proves to be a dream that raises fundamental questions, if nothing else by challenging our HC methodologies, which by and large have hitherto concentrated on managing static humanist data, yet shied away from tackling the conceptual threshold of dynamism and recursivity inherent in most semantic artefacts.