Fiction Markup Language

  1. Introduction
  2. Why Bother?
  3. Name
  4. Structure
  5. A Start
  6. Elements
  7. Resources


I work on an ongoing hypernovel called Troped that daily becomes more chaotic and harder to keep control of (for both me and my readers, I think). So my first priority here is to start developing a useful XML-based markup language for fiction. From what I gather, it is currently best to define this as an XML Schema rather than a DTD. Some of the better arguments on this matter I encountered in this article on extensibility. (I also found this brief introduction to XML Schema extremely helpful.)

Why Bother?

Hmm… if this is what you are thinking then why are you here? 🙂 I think that there are three primary reasons to do this: search, advancing literary criticism and advancing the way that people read nonlinear fiction.

The first reason is pretty obvious. If stories are marked up utilizing a language like FicML then computers will be able to do all sorts of neat things with them. They would be able to find us all the stories in which some type of object is mentioned. They would be able to seek stories that occur on certain dates. Why would people want to search through stories in this manner? Who knows? Why would people want to invent something called “hypertext” and put it on a “World Wide Web”?
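A minimal sketch of that kind of search, in Python, might look like the following. It assumes each story is an XML string whose root is `<story title="...">` and that objects are tagged as `<object type="...">`, loosely following the element list later in this post. The `LIBRARY` data and the function name `stories_mentioning` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-story library; the markup is illustrative, not a finished FicML.
LIBRARY = [
    '<story title="A Hypothetical Arthur Story">Arthur raised '
    '<object type="sword" name="Excalibur">Excalibur</object>.</story>',
    '<story title="Goldilocks and the Three Bears">She tried the '
    '<object type="bowl">third bowl</object> of porridge.</story>',
]

def stories_mentioning(library, object_type):
    """Return the titles of stories containing an <object> of the given type."""
    titles = []
    for xml_text in library:
        story = ET.fromstring(xml_text)
        # iter() walks every descendant, so nesting depth does not matter.
        if any(obj.get("type") == object_type for obj in story.iter("object")):
            titles.append(story.get("title"))
    return titles

sword_stories = stories_mentioning(LIBRARY, "sword")
print(sword_stories)
```

The same loop would work for dates or settings by changing the tag and attribute being tested.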

The second reason, advancing literary criticism, is to me probably one of the neatest reasons to do this. I have seen two very interesting papers that used computers to study literature: one in which the chronological order of Shakespeare’s plays was determined (Johnson 1997), and one in which Pynchon’s Gravity’s Rainbow was analyzed for the covariance of word occurrences in the novel (to look for themes, essentially) (Herman 2003). These kinds of analyses could only benefit from the development of an XML tagging structure for fiction.

Finally, advanced hypertexts suffer from the problem of not being “malleable” enough. By malleable, I mean to say that hypertexts should not just exist in the lone structure of the document and links that the author created. Users should be able to read everything that happens to one character in the order of the narrative, or everything that happens to a character in chronological order, or all the scenes involving two particular characters. That, in my opinion, would be a hypertext. And it is also the primary reason why virtually all other document types on the web are moving to this kind of extensible document structure.


Name

For the moment I am referring to this specification as FicML. ((Someone else has already tried to create a FicML for fan fiction, but it’s not clear to me how far along this effort is or whether it is still ongoing (there are no dates on the site and many of the links are broken). Technically, I think a name better suited to their project would be fanficML, but that’s up to them.)) If at any point in the future there is a serious objection to FicML, I’d be just as content calling the effort FictionML.

General Structure

The most important point I can make here is that this is not an effort to define a structure for stories or fiction. I really think that would be a hopeless pursuit given the incredible (arguably infinite) number of permutations a story could take. This language would operate orthogonally to something like FictionBook markup, which attempts to describe the concrete elements of the medium (e.g. title page, prologue, chapters, parts). More important (at least to me and Troped) is the fact that the elements of a story should be able to be rearranged into different narrative orders and yet still be identifiable as the same story. Part of what a fiction markup language should try to accomplish is making fiction machine-readable. ((Note that I do not mean machine-understandable; that’s totally different.)) A program should be able to look at two FicML documents and decide whether the stories are similar in various ways (e.g. do they have the same characters or settings? Do they take place at the same time?).
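To make the comparison idea concrete, here is a rough Python sketch. It assumes characters and settings are tagged `<char>` and `<setting>`, as in the Goldilocks example in this post, and it uses Jaccard overlap as one possible similarity measure. That choice, the sample stories, and the function names are illustrative assumptions, not part of any specification.

```python
import xml.etree.ElementTree as ET

# Two hypothetical marked-up stories sharing one character and one setting.
STORY_A = ('<story><char>Goldilocks</char> walked through the '
           '<setting>woods</setting> to a <setting>house</setting>.</story>')
STORY_B = ('<story>In the <setting>woods</setting>, <char>Goldilocks</char> '
           'met a <char>wolf</char>.</story>')

def story_features(xml_text):
    """Collect the lowercased text of every <char> and <setting> element."""
    root = ET.fromstring(xml_text)

    def names(tag):
        return {"".join(el.itertext()).strip().lower() for el in root.iter(tag)}

    return {"characters": names("char"), "settings": names("setting")}

def jaccard(a, b):
    """Shared items divided by all items; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def compare(xml_a, xml_b):
    """Return one overlap score per feature for two marked-up stories."""
    fa, fb = story_features(xml_a), story_features(xml_b)
    return {key: jaccard(fa[key], fb[key]) for key in fa}

scores = compare(STORY_A, STORY_B)
print(scores)
```

Matching on the tagged text alone is crude. A real comparison would more likely match on stable identifiers, which is one argument for giving every character and setting an `id` attribute.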

There is also no question in my mind that the way people would go about marking up stories is highly subjective. That is to say, two different people marking up Cannery Row would produce two very different marked-up versions of the text. So there is also a very great need to ensure that several different versions of a marked-up story could be aggregated by a computer. Again, this can be handled by using a schema rather than a DTD, and also by designing the schema so that it is highly extensible.
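One way a program might aggregate competing markups is to reduce each version to the plain text plus a list of tagged character ranges, then count how many versions agree on each range. The Python sketch below does that under the assumption that every version wraps exactly the same underlying text. The two sample versions and the function names `spans` and `aggregate` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Two hypothetical markups of the same sentence by different people.
VERSION_A = ('<story><char>Doc</char> walked down '
             '<setting>Cannery Row</setting>.</story>')
VERSION_B = '<story><char>Doc</char> walked down Cannery Row.</story>'

def spans(xml_text):
    """Return (plain_text, [(start, end, tag), ...]) for one marked-up version."""
    root = ET.fromstring(xml_text)
    parts, found = [], []
    pos = 0

    def walk(el):
        nonlocal pos
        start = pos
        if el.text:
            parts.append(el.text)
            pos += len(el.text)
        for child in el:
            walk(child)
            if child.tail:  # text that follows the child's closing tag
                parts.append(child.tail)
                pos += len(child.tail)
        if el is not root:
            found.append((start, pos, el.tag))

    walk(root)
    return "".join(parts), found

def aggregate(versions):
    """Count how many versions agree on each (start, end, tag) annotation."""
    texts, votes = set(), Counter()
    for version in versions:
        text, found = spans(version)
        texts.add(text)
        votes.update(set(found))
    if len(texts) != 1:
        raise ValueError("versions do not mark up the same underlying text")
    return votes

votes = aggregate([VERSION_A, VERSION_B])
for annotation, count in votes.items():
    print(annotation, count)
```

Annotations with a high count are the ones most readers agree on; the rest record where the markup is subjective.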

Shot Off The Bow

Here is what I think would be a good way to begin the conversation. This would be an example of the XML:

<story title="Goldilocks and the Three Bears" narration="third" narration-type="omniscient">Once there was a little girl named <char firstname="none" surname="none" nickname="goldilocks">Goldilocks</char>. She was a precocious little girl and would go wandering in the <setting>woods</setting>. One day she came upon a <setting contained-in="woods">house</setting> and decided to go in. It was the house where <char-group type="family" surname="bears">the three bears</char-group> lived. At first, in the <setting contained-in="house">kitchen</setting>, she discovered that there were three bowls of porridge. <dialogue type="internal-monologue">How strange</dialogue>, she thought. She decided to try the biggest bowl of porridge first but it was too cold. She tasted the second bowl of porridge but it was too hot! Finally she tried the third bowl of porridge and it was just right.</story>
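To show that markup like this is already usable by a program, here is a short Python sketch that reads a shortened copy of the example with the standard library's `xml.etree.ElementTree`. It lists the characters and follows the `contained-in` attributes to work out which setting sits inside which. The helper name `setting_paths` is hypothetical, and matching a setting by its text is an assumption made for brevity.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the Goldilocks example, with the same tags and attributes.
FICML = (
    '<story title="Goldilocks and the Three Bears" narration="third" '
    'narration-type="omniscient">Once there was a little girl named '
    '<char nickname="goldilocks">Goldilocks</char>. She would go wandering in '
    'the <setting>woods</setting>. One day she came upon a '
    '<setting contained-in="woods">house</setting>. It was the house where '
    '<char-group type="family" surname="bears">the three bears</char-group> '
    'lived. In the <setting contained-in="house">kitchen</setting> she found '
    'three bowls of porridge.</story>'
)

def setting_paths(story):
    """Map each setting to its chain of enclosing settings, outermost first."""
    parent = {s.text: s.get("contained-in") for s in story.iter("setting")}
    paths = {}
    for name in parent:
        chain, current = [], name
        # Follow contained-in links upward; stop at the top or on a cycle.
        while current is not None and current not in chain:
            chain.append(current)
            current = parent.get(current)
        paths[name] = list(reversed(chain))
    return paths

story = ET.fromstring(FICML)
characters = [c.text for c in story.iter("char")]
groups = [g.text for g in story.iter("char-group")]
paths = setting_paths(story)

print(characters)
print(groups)
print(" > ".join(paths["kitchen"]))
```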

And here is what I think the schema would look like ((This is not the full schema, just enough to fill in what’s going on with the example above.)):

<element name="character">
    <complexType mixed="true"> <!-- text plus optional nested elements -->
        <sequence>
            <any namespace="##targetNamespace" minOccurs="0" maxOccurs="unbounded"/>
        </sequence>
        <attribute name="id" type="ID" use="required"/>
        <!-- string, not ID: an element may carry only one ID-typed attribute -->
        <attribute name="narration-person" type="string" use="required"/>
    </complexType>
</element>

<element name="setting" type="string"/>

<element name="dialogue">
    <complexType mixed="true">
        <sequence>
            <any namespace="##targetNamespace" minOccurs="0" maxOccurs="unbounded"/>
        </sequence>
        <attribute name="id" type="ID" use="required"/>
        <attribute name="narration-person" type="string" use="required"/>
    </complexType>
</element>


Elements

Here is what I’ve worked out so far for the structure of the schema:



  • <Story>
  • This has to be the parent element and it would have several attributes
    • tense (present, past, etc.)
    • voice (First, Second, Third, etc.)
    • view (omniscient, limited)
    • And within story there are:
      • <Characters>
        • with attributes like role, name, surname, nickname, related-to, related-how
        • Characters have all kinds of tricky relationships, which is why I would advocate basing the entire character structure on XFN (the XHTML Friends Network). I think they’ve done a really good job of working out most of the details on this one.
      • <Setting>
        • with attributes like type and within (where one setting is within a previously mentioned setting)
      • <Object>
        • with attributes like type and name (e.g. Excalibur is a sword)
      • <Date>
        • This would include time down to the second. However, it might also be useful to develop a kind of relative dating system to include things like “the next day” with respect to the narrative story order.
      • <Dialogue>
        • This tag is trickier than the others because it is so intimately tied in with “structure” in stories. But nonetheless, I think a fiction markup language would need it. It would have attributes like type (as in internal or external dialogues) as well as who spoke it.
    • To Think About
      • Is there a need for a <plot> or <event> tag?
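The relative dating idea for the date element can be sketched in Python as follows. This is only one possible design. It assumes a lowercase `<date>` tag, a `when` attribute holding an absolute ISO 8601 timestamp, and an `offset-days` attribute meaning "this many days after the previous date in narrative order". Both attribute names, the sample story, and the function name `resolve_dates` are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

# Hypothetical story with one absolute date and one relative date.
STORY = (
    '<story>On <date when="1945-06-01T09:30:00">the first of June</date> '
    'she left. <date offset-days="1">The next day</date> she came back.</story>'
)

def resolve_dates(xml_text):
    """Return (text, datetime) pairs in narrative order, resolving offsets."""
    root = ET.fromstring(xml_text)
    resolved, previous = [], None
    for el in root.iter("date"):
        when = el.get("when")
        if when is not None:
            current = datetime.fromisoformat(when)  # absolute date
        elif previous is not None:
            # Relative date: count days from the preceding date in the text.
            current = previous + timedelta(days=int(el.get("offset-days", "0")))
        else:
            raise ValueError("relative date with no earlier absolute date")
        resolved.append((el.text, current))
        previous = current
    return resolved

resolved = resolve_dates(STORY)
for text, moment in sorted(resolved, key=lambda pair: pair[1]):
    print(moment.isoformat(), text)
```

Once every date resolves to a real timestamp, sorting by it gives the chronological order even when the narrative order is different.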


Resources

    1. Google Group
    2. Wikipedia Page