************DRAFT**********

Linking Fancy unto Fancy:

Towards a Semantic IDE for Cascading Summaries

 

Then, upon the velvet sinking, I betook myself to linking
Fancy unto fancy, thinking what this ominous bird of yore --
What this grim, ungainly, ghastly, gaunt and ominous bird of yore
Meant in croaking "Nevermore."

E. A. Poe “The Raven”

 

This talk is primarily an exploration of the semantics of what I call summary trees, cascading summaries, or Russian doll texts. I’ll approach the question by developing further an example I have presented before [***], the summary of a story by the Russian fabulist, Krylov, called “Trishka’s Caftan”. I’ll also present a prototype of a stratified browser that displays the cascading summary of that story. My browser implementation isn’t a general tool but basically a prop for this talk, which I hope will serve to make the discussion more concrete. (This is definitely an example of cottage industry computing.)

 

Neither cascading summaries, nor Russian doll texts, nor stratified browsers are household words. A Google search found absolutely nothing, so these terms will indeed require some explanation.  But first I would like to look at the Trishka example from a more literary and linguistic perspective, and consider what is involved in capturing and mapping out the meaning of this text in particular, and literary texts in general. Here is Krylov’s parable (trishkas_caftan.htm).

According to my interpretation, the parable describes the problem of compounding errors. This story has become a Russian proverb, and the fixed expression “wearing Trishka’s caftan” has been lexicalized in Russian, in part because of the simple yet emergent plot structure. We certainly have no trouble following the overall logic of the narrative. Much as in Poe’s poem “The Raven”, which is articulated around the raven’s repeated “nevermore”, the plot here constantly comes back to the same narrative crossroads. The repetition is compelling because of a kind of shimmering of meaning: by shimmering I am trying to capture the idea that in these texts the same theme or expression is viewed from different angles. In Poe’s poem, the word “nevermore”, though repeated mechanically, changes meaning depending on which question it answers. It means one thing, then another. The narrator asks the bird’s name and the answer is “Nevermore”. He asks when the raven will leave, and the answer is “Nevermore”; he asks whether he will get over his lost Lenore, and the answer is again “Nevermore”, and so on.

 

In a similar way, Trishka vainly tries to vary his response to a problem, but each seemingly different response he proposes inevitably means exactly the same thing, i.e. that his repairs are themselves a problem. Both stories have the same general compounding mechanism: one is about the compounding of error, the other about the compounding of despair. Repetition is foregrounded in both cases.

 

These texts are captivating because they display the emergent quality of meaning: the dialogue of events (i.e. the problems posed and the answers attempted) constitutes a collective meaning that is not found in each response separately. In other words, a single mistake does not make Trishka foolish, but not learning from his chain of mistakes does. Likewise, in “The Raven”, it is not any given answer that establishes the meaning, but rather the way all the narrator’s questions point unfailingly to the conclusion that his loss is permanent and profound. Emergent meaning is like compound interest at the bank: it has mathematical properties that are fascinating and disturbing. In these stories that general process is given form in repeated linguistic structures.

 

How meaning emerges is the crucial question, perhaps of life in general, but at least of reading and literary analysis. We can define any reading as the construction of derived meaning from given meaning. If we want software to effectively describe and map the meaning of a story, we must formalize the process of deriving meaning from meaning.

 

The Russian Formalists Yuri Shcheglov and Alexander Zholkovsky use a diagram to map out the meaning of the Trishka story (pe.htm).

 

Let me decipher a bit of this diagram to give you the flavor of their theory, called the Poetics of Expressiveness.

 

Overall, the diagram explains how the text in box 10 gets its meaning by transforming, step by step, the most general, abstract themes of box 3 into the final surface text of box 10.  Along the way themes are transformed by expressive devices, or EDs. According to Zholkovsky and Shcheglov:

 

The metalanguage of elementary EDs makes it possible to record independently every minimal effect of increase in expressiveness en route from Θ [the theme] to T [the text]. In writing out a derivation, the scholar, as it were, ‘counts the tricks’, moving from [the theme] to the T [the text] by gradual approximations, careful not to make any leaps, which would leave unexplained ‘how it [all]*** turned out so well’ [Themes and Text, 38]

 

The first block, numbered 3, gives us the overall thematic structure of the parable. There are three over-arching themes, which have been labeled inane activity, paradoxicality and symbolic character. There is nothing artistic about these themes as such, but they become artistic by the way they are combined and instantiated in the story. In the Poetics of Expressiveness, there is a radical distinction between thematic information and the way that information is presented via the expressive devices. An expressive device takes general thematic information and reconfigures it into either a derivative thematic configuration or artistic surface text. That foundational distinction between content and form, or data and display, is understandable because the goal of the theory of expressiveness is to explain how the two become inextricably intertwined.

 

An expressive device is something that transforms a theme into something more compelling than the “upstream” abstract formulation. Three expressive devices used in the Trishka map are combination (COMB), concretization (CONCR), and concordance (CONCD).

 

 

These expressive devices are part of our textual lexicon, just as words are part of our sentence lexicon. We know how to read stories because they share the same lexicon of expressive devices.

 

This map of the story is not simple, and even so it is far from complete, both because it is purely synchronic (giving one state of the interpretation and not how the interpretation comes into being with each line read) and because it doesn’t map all the meaning. There is almost no analysis of the fact that Trishka compounds his errors. Their analysis is more appropriate for a shortened version of the story. (trishkas_caftan_short.htm)

 

Cascading summaries offer perhaps a simpler and more intuitive way to understand the way themes are developed in a text. They are similar in approach in that they can serve as a kind of map of how we get “compound interest” from texts, but they are certainly less technical, since they don’t require the theorization that the Poetics of Expressiveness embodies. In fact, cascading summaries are not part of a theory, but rather a natural language competence we all share.

 

A cascading summary is a summary in outline form where subheadings constitute a more detailed and elaborate version  of the higher levels, until we arrive at the lowest level, the original text. One cascading summary of Trishka’s story, presented in my prototype stratified browser, is the following: (version2/t_browser_main.htm)

 

This story is summarized at 6 levels of increasing detail until we reach the original text. Because the summaries are in outline format, they are tightly linked. It would, of course, make more sense to generate the levels from the semantic description of the source text.
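
As a minimal sketch of the outline structure just described, a cascading summary can be modeled as a tree in which the children of each node restate it in more detail. The names used here (SummaryNode, build_branch, level, outline) are illustrative assumptions and are not taken from the prototype browser; the sentences are the single branch of the Trishka summary quoted later in this talk.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SummaryNode:
    """One line of the outline; its children restate it in more detail."""
    text: str
    children: List["SummaryNode"] = field(default_factory=list)


def build_branch(sentences: List[str]) -> SummaryNode:
    """Chain sentences (most abstract first) into a single branch."""
    node: Optional[SummaryNode] = None
    for sentence in reversed(sentences):
        node = SummaryNode(sentence, [node] if node is not None else [])
    if node is None:
        raise ValueError("at least one sentence is required")
    return node


def level(root: SummaryNode, depth: int) -> List[str]:
    """Return the texts found at a given depth (0 is the most abstract)."""
    nodes = [root]
    for _ in range(depth):
        nodes = [child for n in nodes for child in n.children]
    return [n.text for n in nodes]


def outline(node: SummaryNode, indent: int = 0) -> List[str]:
    """Render the tree as indented outline lines."""
    lines = ["  " * indent + node.text]
    for child in node.children:
        lines.extend(outline(child, indent + 1))
    return lines


# One branch of the Trishka cascade, from most abstract to the source text.
BRANCH = [
    "People tend to repeat their errors.",
    "Someone tries to correct an error, but repeats it.",
    "Trishka tried to repair his garment.",
    "Trishka's caftan needed mending.",
    "Trishka's caftan needed mending at the elbows.",
    "Trishka's caftan got torn at the elbows.",
]

tree = build_branch(BRANCH)
print("\n".join(outline(tree)))
```

Reading the cascade "by level" then amounts to calling level(tree, n) for increasing n, which is one way a stratified browser might page through blocks of meaning.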

 

There are a thousand questions to be answered concerning the ways these summaries can be read and where they might have practical use. It is clear that their principal advantage is that they offer a way to page through blocks of meaning rather than paging through blocks of signifiers, as ordinary texts do. However, the question I want to look at now is not so much how to display and read them, but rather how to create them by generating multiple summaries from the original story.

 

I took Word’s Autosummarize feature as a starting point for extracting summaries of different lengths and combining them into a cascade. The sentences of each summary level are simply selected from the pool of sentences that make up the original text. (trishkas_caftan_word.htm)
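
The sentence-selection approach can be sketched as follows. This is not Word's algorithm, which is not described here; it is a simple word-frequency scorer assumed purely for illustration, and the stopword list, function names and sample text (a rough toy paraphrase, not Krylov's text) are likewise assumptions. What it does show is how summaries of several lengths, all drawn from the same pool of sentences, stack into a cascade.

```python
import re
from collections import Counter
from typing import List

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "and", "at", "he", "his", "now", "than", "the", "to", "was", "were"}


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence: str) -> List[str]:
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def score_sentences(sentences: List[str]) -> List[float]:
    """Score each sentence by the average corpus frequency of its content words."""
    freq = Counter(w for s in sentences for w in content_words(s))
    scores = []
    for s in sentences:
        words = content_words(s)
        scores.append(sum(freq[w] for w in words) / max(len(words), 1))
    return scores


def extract(sentences: List[str], k: int) -> List[str]:
    """Keep the k best-scoring sentences, in their original order."""
    scores = score_sentences(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return [sentences[i] for i in sorted(ranked[:k])]


def cascade(sentences: List[str], sizes: List[int]) -> List[List[str]]:
    """One extractive summary per requested size, shortest first."""
    return [extract(sentences, k) for k in sorted(sizes)]


# Toy paraphrase for illustration only.
TOY_TEXT = (
    "Trishka's caftan got torn at the elbows. "
    "He cut the sleeves to patch the elbows. "
    "The sleeves were now too short. "
    "He cut the coat tails to lengthen the sleeves. "
    "The caftan was now shorter than a jacket."
)

if __name__ == "__main__":
    for summary in cascade(split_sentences(TOY_TEXT), [1, 3, 5]):
        print(" ".join(summary))
```

Because every level is cut from the same ranking, each shorter summary is contained in the longer ones, which gives the nesting a cascade needs. It also illustrates the limitation noted here: the upper levels can only reuse sentences of the source, never a more abstract formulation.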

 

This is a kind of computer-assisted summarization, but it does not give a very natural reading. We can certainly imagine other ways to generate summaries. I’m not considering here a fully automatic system, but rather a tool that will facilitate creating cascading summaries from a source text.  I will call that tool a semantic integrated development environment, or semantic IDE. One way to think of a semantic IDE is in terms of an Excel specifically designed for texts.  Generating cascading summaries in this textual Excel would be similar to the way we create spreadsheets by copying cells with formulas that refer to other cells with data or formulas. Blocks and columns of data can be treated as units. In place of the blocks and columns of formulas displayed as numbers, I would like to see blocks and columns that generate text.
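
To make the spreadsheet analogy concrete, here is a minimal sketch of a "textual Excel" in which a cell holds either literal text or a formula that builds text from other cells. The class name TextSheet, the cell names and the formula convention (a function receiving a lookup callable) are all assumptions made for this illustration.

```python
from typing import Callable, Dict, Tuple, Union

Lookup = Callable[[str], str]
Formula = Callable[[Lookup], str]
Cell = Union[str, Formula]


class TextSheet:
    """A sheet whose cells are literal strings or text-producing formulas."""

    def __init__(self) -> None:
        self.cells: Dict[str, Cell] = {}

    def set(self, name: str, content: Cell) -> None:
        self.cells[name] = content

    def value(self, name: str, _active: Tuple[str, ...] = ()) -> str:
        """Evaluate a cell, following references to other cells."""
        if name in _active:
            raise ValueError("circular reference at " + name)
        content = self.cells[name]
        if callable(content):
            return content(lambda ref: self.value(ref, _active + (name,)))
        return content


sheet = TextSheet()
sheet.set("A1", "Trishka")
sheet.set("A2", "caftan")
sheet.set("B1", lambda get: get("A1") + "'s " + get("A2"))
sheet.set("C1", lambda get: get("B1") + " needed mending.")

print(sheet.value("C1"))
```

As in a numeric spreadsheet, changing a data cell (for example setting A2 to "garment") changes every text cell that depends on it, with no need to edit the formulas.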

 

I won’t consider here some fairly interesting low-level issues, like the relation between grids, the fundamental data format of Excel, and hierarchies, the fundamental data format of linguistic structures. Rather I would like to focus in the time I have left on the notion of a semantic function, which the Poetics of Expressiveness seems to properly formalize in the expressive device.

 

An expressive device maintains a core of meaning while varying the way it is expressed.  For example, as we have seen, the expressive device Concretization replaces what is more abstract with what is less so. We see that function playing out in the cascading summaries: from “People” we move to “Someone” to “Trishka”.

 

So we could imagine a textual Excel function, CONCR, which would offer either a list of more concrete instances of a given word, or would arbitrarily select one instance. That function would be fairly simple to implement with WordNet, which is precisely designed to map out such relations between words. Obviously, it would be just as easy to move in the opposite direction, up the ontological hierarchy towards what is more abstract, but it would not be theme preserving.
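
A minimal sketch of such a CONCR function follows. To keep it self-contained, a small hand-built table stands in for WordNet; the table contents beyond the People / Someone / Trishka and garment / caftan steps of the summary (namely "coat") are hypothetical, as are the function names.

```python
from typing import Callable, Dict, List, Optional

# Toy stand-in for WordNet's "more concrete instance" relation.
# "coat" is a hypothetical alternative added only to show a real choice.
HYPONYMS: Dict[str, List[str]] = {
    "people": ["someone"],
    "someone": ["Trishka"],
    "garment": ["caftan", "coat"],
}


def concr_options(word: str) -> List[str]:
    """List the more concrete instances known for a word."""
    return list(HYPONYMS.get(word.lower(), []))


def concr(word: str, choose: Optional[Callable[[List[str]], str]] = None) -> str:
    """Return one more concrete instance, or the word itself if none is known.

    Without a chooser the first option is taken, which plays the role of
    the arbitrary selection; pass e.g. random.choice to vary it.
    """
    options = concr_options(word)
    if not options:
        return word
    return options[0] if choose is None else choose(options)


def concr_chain(word: str) -> List[str]:
    """Apply CONCR repeatedly until nothing more concrete is available."""
    chain = [word]
    seen = {word.lower()}
    while True:
        nxt = concr(chain[-1])
        if nxt.lower() in seen:
            return chain
        chain.append(nxt)
        seen.add(nxt.lower())


print(concr_options("garment"))
print(concr_chain("People"))
```

With a real lexical database the table lookup would be replaced by a hyponym query, and the list returned by concr_options would typically be much longer, which is why a selection step matters.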

 

 

For example, from the specific person Trishka we could move just as easily to the theme of a person as to the theme of foolishness, since Trishka is at the intersection of many different themes. In fact, that lack of a clean reversibility is how some theorists [*groupe Mu*] define the associative dimension of all metaphors.

 

Richard can be called “lion-hearted” because we can move from the specific courageous individual, Richard, not only to the abstract category of people, but also to the category of courageous things, and then concretize from that category to a lion. Richard and a lion share one ontological category, even if they are opposed in another. This is something that Aristotle noted long ago. [*where?*]

 

So it would be fairly simple, with the help of WordNet, to implement CONCR, reverse CONCR, and a metaphorical function. Metonymic functions could be built in a similar fashion, based on part/whole relations.
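
The reverse CONCR and metaphorical functions can be sketched in the same spirit. The tiny lexicon below is a hypothetical stand-in for WordNet: each entry records one ontological kind and a set of qualities, a simplification assumed for this example. Reverse CONCR returns every category an item belongs to, which shows why the reversal is not unique; the metaphor function generalizes to a shared quality and then concretizes to an item of a different kind.

```python
from typing import Dict, List, NamedTuple, Set


class Entry(NamedTuple):
    kind: str            # ontological category, e.g. "person"
    qualities: Set[str]  # thematic categories, e.g. {"courageous"}


# Hypothetical toy lexicon built from the examples in the text.
LEXICON: Dict[str, Entry] = {
    "Richard": Entry("person", {"courageous"}),
    "Trishka": Entry("person", {"foolish"}),
    "lion": Entry("animal", {"courageous"}),
}


def reverse_concr(word: str) -> List[str]:
    """All more abstract categories reachable from a word (not unique)."""
    entry = LEXICON[word]
    return [entry.kind] + sorted(entry.qualities)


def metaphor(word: str, ground: str) -> List[str]:
    """Items that share the quality `ground` but belong to another kind."""
    tenor = LEXICON[word]
    if ground not in tenor.qualities:
        return []
    return sorted(
        name
        for name, entry in LEXICON.items()
        if name != word and ground in entry.qualities and entry.kind != tenor.kind
    )


print(reverse_concr("Trishka"))
print(metaphor("Richard", "courageous"))
```

A metonymic function could follow the same pattern with a part/whole table in place of the shared-quality test.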

 

These simple lexical relations are easy targets for Excel functions, especially with applications like WordNet available. However, if we look at a complete branch of the Trishka example, some of the complications become apparent.

 

People tend to repeat their errors.

Someone tries to correct an error, but repeats it.

Trishka tried to repair his garment.

Trishka's caftan needed mending.

Trishka's caftan needed mending at the elbows.

Trishka's caftan got torn at the elbows.

 

Under the concept of “error” given at the top node, we see a trail of concretizations (in red): 0. error --> 1. correct an error --> 2. repair --> 3. needed mending --> 4. got torn. Each transition relies on different kinds of meaning relations.

 

The 0-1 relation depends on operations and actors surrounding an error: we can correct an error, but we can equally admit to an error, feel sorry for an error, avoid an error and so on. The notion of error can be “unpacked” into a set of elaborate possible narratives, any dimension of which can be foregrounded by concretization.

 

There are logical relations as well. The 3-4 relation depends on a simple kind of reasoning: if something is torn, it may well need mending. If something is neither torn nor damaged, it will not need mending.

 

Point of view is also a factor of concretization. The 2-3 transition shifts the theme from  “repair”, the action planned by the active subject, to “needed mending”, the object’s passive state.

 

Along with that shift in point of view, we note another shift as well: the subject of the sentence shifts from Trishka to Trishka’s caftan, and with that shift comes a shift in alignment of the sentences: the word “People” is leftmost in the uppermost outline, but “caftan” is leftmost in the lower levels. Summary trees are not column regular.
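
The point that each step of the trail relies on a different kind of relation can be recorded as data. In this sketch the relation labels are my shorthand for the three transitions discussed above; the 1-2 transition is not characterized in the text, so it is deliberately left unlabeled rather than guessed. All names are illustrative assumptions.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Transition(NamedTuple):
    source: str
    target: str
    relation: Optional[str]  # None where the text gives no characterization


# The trail of concretizations under "error", numbered 0 to 4.
CHAIN = ["error", "correct an error", "repair", "needed mending", "got torn"]

# Shorthand labels for the relations described in the text.
RELATIONS: Dict[Tuple[int, int], str] = {
    (0, 1): "narrative unpacking",
    (2, 3): "point of view",
    (3, 4): "logical inference",
}


def transitions(chain: List[str], relations: Dict[Tuple[int, int], str]) -> List[Transition]:
    """Pair each step with the next and attach the relation label, if any."""
    return [
        Transition(chain[i], chain[i + 1], relations.get((i, i + 1)))
        for i in range(len(chain) - 1)
    ]


for t in transitions(CHAIN, RELATIONS):
    print(t.source, "-->", t.target, ":", t.relation or "(not characterized)")
```

In a semantic IDE, each label would presumably name a different semantic function, so a single CONCR formula copied down a column would not be enough to generate this branch.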