We were fortunate to have a paper accepted as a late-breaking presentation at Extreme 2002, in Montreal. Jeni delivered a knock-down explanation of what we were doing. The paper abstract as it appeared in the conference program:
Representing multiple hierarchies within a single document has always been a problem for XML. To try to address the problems of representing multiple hierarchies and of annotating existing tree structures with type information (as in the PSVI), we have developed a layered data model based on the Core Range Algebra presented at Extreme 2002 by Gavin Nicol. This data model views documents as strings over which span a number of named ranges, each of which can themselves have associated metaranges with their own internal structure. To aid our experimentation with this data model, we developed a markup notation to reflect it, the Layered Markup and Annotation Language (LMNL), and have constructed several prototype applications to facilitate the extraction of single views, as XML structures, from LMNL documents. This paper outlines LMNL and discusses how its development has made us reflect on the nature of XML, schema and query languages.
If you want to know what LMNL is, imagine that XML is like stacking boxes in pyramids. The boxes are labelled (elements have names and attributes) but you are still limited to putting boxes on top of boxes (or, if you like, stacking them inside one another), without overlapping them.
LMNL, in contrast, lets you layer the ranges that the text is made up of, so that they can overlap the way bricks in a wall do, or, more likely, in other ways.
If that doesn't make any sense to you, then this won't either. LMNL has a syntax, but it is not standardized around that syntax; it is standardized around the “LMNL data model” that we build from the syntax (or from any other source). LMNL came about because Jeni and Wendell happened to be talking about “overlapping hierarchies” when Jeni thought of applying an approach described by Gavin Nicol as a Core Range Algebra. Gavin suggests a generalized way to extract named “ranges” from text data.
Gavin's approach seemed very powerful but lacked a syntax in which to express or capture the ranges one might recognize in a document. Wendell and Jeni started experimenting with different syntaxes, developing a parser (LMNOP) that could interpret a text stream as a range model (a more concrete version of the range algebra data model). The model was developed as an abstraction of what we expected would be useful, while the syntax could be tailored to express fairly neatly the model's fundamental structure, a layered set of annotated ranges.
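To make "a layered set of annotated ranges" a little more concrete, here is a minimal sketch in Python. It is not LMNL syntax and not the LMNOP parser: the names `Range`, `Document` and `overlaps`, the half-open offsets, and the use of a plain dict for annotations are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Range:
    """A named span over the text, using half-open offsets [start, end)."""
    name: str
    start: int
    end: int
    # Assumption: annotations are a flat dict here. In LMNL they may
    # themselves be structured, which this sketch does not model.
    annotations: dict = field(default_factory=dict)


@dataclass
class Document:
    """A string plus the ranges that span it."""
    text: str
    ranges: list = field(default_factory=list)

    def content(self, r):
        """Return the stretch of text covered by range r."""
        return self.text[r.start:r.end]


def overlaps(a, b):
    """True when two ranges share text but neither contains the other."""
    intersect = a.start < b.end and b.start < a.end
    a_in_b = b.start <= a.start and a.end <= b.end
    b_in_a = a.start <= b.start and b.end <= a.end
    return intersect and not (a_in_b or b_in_a)


# Two sentences and two (hypothetical) printed lines over the same text.
doc = Document("He said go. She left.", [
    Range("sentence", 0, 11),
    Range("sentence", 12, 21),
    Range("line", 0, 15, {"n": "1"}),
    Range("line", 15, 21, {"n": "2"}),
])
```

In this example the second sentence starts on the first line and ends on the second, so `overlaps(doc.ranges[1], doc.ranges[2])` is true. That is the situation a single XML tree cannot express directly.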
The first operation we are interested in is the extraction of systematic tree structures, such as XML documents conforming to different schemas. This lets us apply XML's strong tool set to a wider range of marked-up data than is conveniently handled in XML itself. In addition, since LMNL scales naturally (instead of attributes it allows annotations that may themselves be structured) and handles overlapping hierarchies natively, we anticipate an open-ended set of applications.
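As a rough idea of what extracting a single view might look like, the sketch below picks ranges by name and serializes them as XML, refusing if the chosen ranges overlap. It is an illustration under stated assumptions, not one of the prototype applications: the function name `extract_view` and the `(name, start, end)` tuple representation are hypothetical, range names are assumed to be valid XML element names, and annotations are left out.

```python
from xml.sax.saxutils import escape


def extract_view(text, ranges, names):
    """Serialize the ranges whose names are in `names` as an XML string.

    `ranges` is a list of (name, start, end) tuples with half-open offsets.
    Raises ValueError if the selected ranges overlap, because they then
    cannot form a single tree.
    """
    # Outer ranges first: by start ascending, then by end descending.
    selected = sorted((r for r in ranges if r[0] in names),
                      key=lambda r: (r[1], -r[2]))
    out, stack, pos = [], [], 0

    def close_until(offset):
        """Close every open element that ends at or before `offset`."""
        nonlocal pos
        while stack and stack[-1][2] <= offset:
            name, _, end = stack.pop()
            out.append(escape(text[pos:end]))
            out.append(f"</{name}>")
            pos = end

    for name, start, end in selected:
        close_until(start)
        # The new range must end inside whatever element is still open.
        if stack and end > stack[-1][2]:
            raise ValueError(f"range {name!r} overlaps {stack[-1][0]!r}")
        out.append(escape(text[pos:start]))
        out.append(f"<{name}>")
        pos = start
        stack.append((name, start, end))

    close_until(len(text))
    out.append(escape(text[pos:]))
    return "".join(out)


# Hypothetical example data: sentences and printed lines over one text.
text = "He said go. She left."
ranges = [
    ("para", 0, 21),
    ("sentence", 0, 11),
    ("sentence", 12, 21),
    ("line", 0, 15),
    ("line", 15, 21),
]

sentence_view = extract_view(text, ranges, {"para", "sentence"})
line_view = extract_view(text, ranges, {"line"})
```

Each call yields one well-formed view of the same text. Asking for `{"sentence", "line"}` together raises `ValueError`, since the second sentence crosses the line boundary and the two hierarchies cannot share one tree.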
© 2002 by the authors and LMNL.org. All rights reserved.