The Artima Developer Community

Java Buzz Forum
A myriad of markup systems

Steve Conover

Posts: 37
Nickname: sgcjr
Registered: Feb, 2003

Steve Conover is a professional Java engineer and consultant
A myriad of markup systems Posted: Dec 13, 2005 1:17 PM

This post originated from an RSS feed registered with Java Buzz by Steve Conover.
Original Post: A myriad of markup systems
Feed Title: Steve Conover's Weblog
Feed URL: http://www.sonic.net/~conover/index.rdf
Feed Description: Mostly java-related.

It's hard to avoid the legions of custom markup systems out there these days. Every Wiki has its own syntactical quirks, while packages like Markdown, Textile, BBCode (in dozens of variants) and reStructuredText offer easy ways of hooking markup conversion into existing applications. When it comes to being totally over-implemented and infuriatingly inconsistent, markup systems are rapidly catching up with template packages. Never one to miss out on an opportunity to reinvent the wheel, I've worked on several of each ;)

My most recent markup handling attempt has just been published as part of my SitePoint article on Bookmarklets (cliché). It's a structured markup language in a bookmarklet: activate the bookmarklet to convert the text in any textarea on a page to XHTML. The syntax is ridiculously simple, and serves my limited needs just fine:


= This is a header

Here is a paragraph.

* This is a list of items
* Another item in the list

Converts to:


<h4>This is a header</h4>

<p>Here is a paragraph.</p>

<ul>
 <li>This is a list of items</li>
 <li>Another item in the list</li>
</ul>

The algorithm is simple, and easily portable to any language you care to mention:

  1. Normalise newlines to \n, for cross-platform consistency.
  2. Split the text up on double newlines, to create a list of blocks.
  3. For each block:
    1. If it starts with an equals sign, wrap it in header tags.
    2. If it starts with an asterisk, split it into lines, make each a list item (stripping off the asterisk at the start of the line if required) and glue them all together inside a <ul>.
    3. Otherwise, wrap it in a <p> tag provided it doesn't have one already.
  4. Glue everything back together again with a couple of newlines, to make the underlying XHTML look pretty.
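
As a rough illustration of how portable the steps above are, here is a minimal Python sketch. It is not the bookmarklet's actual code: the function name expand_shorthand is made up for this example, the <h4> header level is simply copied from the sample output, and no HTML escaping is attempted.

```python
import re


def expand_shorthand(text: str) -> str:
    """Convert the shorthand to XHTML (illustrative sketch, not the bookmarklet)."""
    # Step 1: normalise newlines to \n.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Step 2: split on runs of two or more newlines to get a list of blocks.
    blocks = [b.strip() for b in re.split(r"\n{2,}", text) if b.strip()]

    converted = []
    for block in blocks:
        if block.startswith("="):
            # Step 3.1: header block; drop the equals sign and wrap.
            converted.append("<h4>" + block[1:].strip() + "</h4>")
        elif block.startswith("*"):
            # Step 3.2: list block; one <li> per non-empty line.
            items = []
            for line in block.split("\n"):
                line = line.strip()
                if not line:
                    continue
                if line.startswith("*"):
                    line = line[1:].strip()
                items.append(" <li>" + line + "</li>")
            converted.append("<ul>\n" + "\n".join(items) + "\n</ul>")
        elif re.match(r"<p[\s>]", block):
            # Step 3.3: already a paragraph, so leave it alone.
            converted.append(block)
        else:
            converted.append("<p>" + block + "</p>")

    # Step 4: glue the blocks back together with blank lines between them.
    return "\n\n".join(converted)


sample = "= This is a header\r\n\r\nHere is a paragraph.\r\n\r\n* This is a list of items\r\n* Another item in the list"
print(expand_shorthand(sample))
```

Run on the sample text from earlier in the post (here with Windows-style newlines, to exercise step 1), this should print the same XHTML shown above.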

The bookmarklet comes in two flavours: Expand HTML Shorthand (the full version) and Expand HTML Shorthand IE, which loses header support in order to fit within IE's crippling 508 character limit. A more capable bookmarklet could be built using the import-script-stub method described in my article, but the implementation of such a thing is left as an exercise for the reader (I've always wanted to say that).

Incidentally, there's a very common bug in markup systems that allow inline styles, and it proves extremely difficult to fix: improperly nested tags. Say you have a system where *text* is bold and _text_ is italic; what happens when the user enters _italic*italic-bold_bold*? Most systems (and that includes Markdown, Textile and my home-rolled Python solution) use naive regular expressions for inline markup processing and will output badly formed XHTML: <em>italic<strong>italic-bold</em>bold</strong>. Truly solving this problem requires a context-sensitive parser, which is an unpleasantly large amount of effort for what looks like a simple bug.
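
The nesting bug is easy to reproduce. The sketch below uses two deliberately naive regular expressions; they are hypothetical patterns written for this example, not the ones any of the packages mentioned actually ship. The helper is_well_formed is likewise an assumption of this sketch: it wraps the fragment in a <p> and hands it to Python's standard XML parser.

```python
import re
import xml.etree.ElementTree as ET


def naive_inline(text: str) -> str:
    """Apply bold and italic markup with independent, context-free regexes."""
    # Each substitution is unaware of tags produced by the other one.
    text = re.sub(r"\*(.+?)\*", r"<strong>\1</strong>", text)
    text = re.sub(r"_(.+?)_", r"<em>\1</em>", text)
    return text


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in a <p>."""
    try:
        ET.fromstring("<p>" + fragment + "</p>")
        return True
    except ET.ParseError:
        return False


result = naive_inline("_italic*italic-bold_bold*")
print(result)                  # <em>italic<strong>italic-bold</em>bold</strong>
print(is_well_formed(result))  # False
```

Swapping the order of the two substitutions gives the same broken output for this input, which is part of why the bug is hard to patch with regular expressions alone.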
