Prose, papers and documentation

Tutorial
An overview that focuses on the syntax, but also goes into the data model. Read this first to get an idea of what LMNL is capable of.
Data model
LMNL Data Model
The data model is the core of LMNL. We're still working on some of the details, so your comments are very welcome.
Reified LMNL
The reified LMNL layer acts as a bridge between a syntax and the LMNL data model, by representing syntactic features as ranges and annotations.
Syntax
At the moment this only describes the LMNL syntax specification. Eventually we plan to show mappings from XML syntax(es) onto the LMNL data model.
APIs
These are very preliminary, since the data model hasn't quite been finalised.
LMNL Object Model (LOM) API
The LOM API is equivalent to the DOM API for XML: an object model that you can use to hold the entire document in memory.
Simple API for LMNL (SAL)
SAL is equivalent to SAX for XML: an event-based API useful for streaming documents.

Extreme Markup Languages 2002

Extreme abstract

We were fortunate to have a paper accepted as a late-breaking presentation at Extreme 2002, in Montreal. Jeni delivered a knock-down explanation of what we were doing. The paper abstract as it appeared in the conference program:

Representing multiple hierarchies within a single document has always been a problem for XML. To try to address the problems of representing multiple hierarchies and of annotating existing tree structures with type information (as in the PSVI), we have developed a layered data model based on the Core Range Algebra presented at Extreme 2002 by Gavin Nicol. This data model views documents as strings over which span a number of named ranges, each of which can themselves have associated metaranges with their own internal structure. To aid our experimentation with this data model, we developed a markup notation to reflect it, the Layered Markup and Annotation Language (LMNL), and have constructed several prototype applications to facilitate the extraction of single views, as XML structures, from LMNL documents. This paper outlines LMNL and discusses how its development has made us reflect on the nature of XML, schema and query languages.

Jeni's slides
Here are Jeni's slides for the presentation. We didn't go much beyond the slides in presenting LMNL to this extremely erudite group (there was no time). We did get some good questions and remarks.
.zip of MS Powerpoint slides
maybe someday an edited version

An explanation

If you want to know what LMNL is, imagine that XML is like stacking boxes in pyramids. The boxes are labelled (elements have names and attributes) but you are still limited to putting boxes on top of boxes (or, if you like, stacking them inside one another), without overlapping them.

LMNL, in contrast, lets you layer ranges over the text so that they overlap one another, the way the bricks in a wall overlap the bricks beneath them, though more often in less regular ways.

If that doesn't make any sense to you, then this won't either. LMNL has a syntax, but it is standardized not around that syntax but around the “LMNL data model” that we build from it (or from anything else). LMNL came about because Jeni and Wendell happened to be talking about “overlapping hierarchies” when Jeni thought of applying an approach that Gavin Nicol describes as a Core Range Algebra. Gavin suggests a generalized way to extract named “ranges” from text data.

Gavin's approach seemed very powerful but lacked a syntax in which to express or capture the ranges one might recognize in a document. (The ranges themselves, though, were apparently just the construct we needed.) Wendell and Jeni started experimenting with different syntaxes, developing a parser (LMNOP) that could interpret a text stream as a range model (a more concrete version of the range algebra data model). The model was developed as an abstraction of what we expected would be useful, while the syntax could be tailored to express fairly neatly the model's fundamental structure: a layered set of annotated ranges.
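To make "a layered set of annotated ranges" more concrete, here is a minimal Python sketch. It is not LMNOP, the LOM API, or the LMNL data model itself; the class names `Range` and `Document`, the character-offset representation, and the use of a dictionary for annotations are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Range:
    """A named span over the document text (hypothetical representation)."""
    name: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    annotations: dict = field(default_factory=dict)  # values may be nested

    def overlaps(self, other: "Range") -> bool:
        """True when the ranges share text but neither contains the other."""
        shares_text = self.start < other.end and other.start < self.end
        nested = (
            (self.start <= other.start and other.end <= self.end)
            or (other.start <= self.start and self.end <= other.end)
        )
        return shares_text and not nested


@dataclass
class Document:
    """A string plus the ranges layered over it."""
    text: str
    ranges: list = field(default_factory=list)

    def content(self, r: Range) -> str:
        """Return the text covered by a range."""
        return self.text[r.start:r.end]


# Two sentences and two lines over the same text. The line break falls
# in the middle of the second sentence, so the hierarchies overlap.
doc = Document(
    "He said hello. She waved back.",
    [
        Range("sentence", 0, 14),
        Range("sentence", 15, 30),
        Range("line", 0, 18, {"n": "1"}),
        Range("line", 19, 30, {"n": "2"}),
    ],
)

first_sentence, second_sentence, first_line, second_line = doc.ranges
print(doc.content(first_line))
print(second_sentence.overlaps(first_line))
```

In this sketch the second sentence and the first line share text without either containing the other, which is exactly the situation that cannot be expressed by nesting one element inside another.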

The first operation we are interested in is the extraction of systematic tree structures, such as XML documents conforming to different schemas. This lets us apply XML's strong tool set to a wider range of marked-up data than is conveniently handled in XML. In addition, since LMNL scales naturally (instead of attributes it allows annotations that may themselves be structured) and handles overlapping hierarchies natively, we anticipate an open-ended set of applications.
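The idea of extracting a single tree view can be sketched as follows. This is an illustrative assumption about how such an extraction might work, not a description of the prototype applications mentioned above: the function name `extract_view`, the `(name, start, end)` tuple representation, and the wrapping `<doc>` root element are all hypothetical, and annotations are left out for brevity.

```python
from xml.sax.saxutils import escape


def extract_view(text, ranges, names):
    """Serialize the ranges whose name is in `names` as an XML string.

    `ranges` is a list of (name, start, end) tuples with non-empty spans.
    Raises ValueError if the selected ranges overlap, since an overlapping
    selection cannot be written as a single tree.
    """
    # Outer ranges first: sort by start, then by longest span.
    selected = sorted(
        (r for r in ranges if r[0] in names),
        key=lambda r: (r[1], -r[2]),
    )
    out = []
    open_stack = []  # ranges currently open, innermost last
    pos = 0          # offset of the next character still to be written

    def close_innermost():
        nonlocal pos
        name, _, end = open_stack.pop()
        out.append(escape(text[pos:end]))
        out.append(f"</{name}>")
        pos = end

    for name, start, end in selected:
        # Close every open range that ends before this one begins.
        while open_stack and open_stack[-1][2] <= start:
            close_innermost()
        # Anything still open must fully contain the new range.
        if open_stack and end > open_stack[-1][2]:
            raise ValueError(f"range {name!r} overlaps {open_stack[-1][0]!r}")
        out.append(escape(text[pos:start]))
        out.append(f"<{name}>")
        pos = start
        open_stack.append((name, start, end))

    while open_stack:
        close_innermost()
    out.append(escape(text[pos:]))
    return "<doc>" + "".join(out) + "</doc>"


text = "He said hello. She waved back."
ranges = [
    ("sentence", 0, 14),
    ("sentence", 15, 30),
    ("line", 0, 18),
    ("line", 19, 30),
]

sentence_view = extract_view(text, ranges, {"sentence"})
line_view = extract_view(text, ranges, {"line"})
print(sentence_view)
print(line_view)
```

Each call yields one well-formed XML view of the same text. Asking for both names at once raises `ValueError` in this sketch, because the sentence and line hierarchies overlap and no single tree can hold them both.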