I’ve never been very happy with BlueCloth, Ruby’s de facto
Markdown library. It was well-developed throughout 2005, reached a
fairly complete 1.0 release that year, and then… just… stopped. There
hasn’t been so much as a maintenance release since 2005 – and that’s
certainly not due to a lack of bugs and feature requests.
BlueCloth is slow. Really slow. Gruber’s Markdown.pl (which was never
designed for speed, if I remember correctly) can process the basic syntax
test document a full three times in the same amount of time it takes
BlueCloth to process it once.
BlueCloth is also broken:
$ echo "Oh _is_ it?" | Markdown.pl
Oh is it?
$ ruby -rbluecloth -e "puts BlueCloth.new('Oh _is_ it?').to_html"
Oh, _is_ it?
BlueCloth is broken, slow, and unmaintained. What to do?
Mislav Marohnić recently created a Git clone
of the BlueCloth Subversion repository to fix bugs. That’s a good start.
There’s also Maruku, another pure-Ruby implementation that’s a bit
faster and includes a variety of interesting extensions to the core
Markdown grammar.
Here’s another idea:
class BabyShitGreenCloth
  def initialize(text)
    @text = text
  end
  def to_html
    open("|perl Markdown.pl", 'r+') do |io|
      io.write(@text)
      io.close_write
      io.read
    end
  end
end
BlueCloth = BabyShitGreenCloth
You laugh!? Don’t. It would be funny if it were actually an inferior
implementation compared to BlueCloth. It isn’t.
Shrug. Just sayin…
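If you wanted to take the pipe approach half-seriously, a slightly more defensive sketch might pass the command as an argument array (so no shell is involved) and check the child's exit status. The class name `PipedMarkdown` and its `command` parameter are made up for illustration; the default assumes `perl` is on your PATH and that Markdown.pl sits in the current directory.

```ruby
# Hypothetical wrapper: pipes text through an external filter command.
class PipedMarkdown
  # command is an argv-style array. The default is an assumption: perl on
  # the PATH and Markdown.pl in the current directory.
  def initialize(text, command = %w[perl Markdown.pl])
    @text = text
    @command = command
  end

  def to_html
    output = IO.popen(@command, 'r+') do |io|
      io.write(@text)
      io.close_write   # signal EOF so the filter can finish reading
      io.read
    end
    # The block form of IO.popen sets $? to the child's exit status.
    raise "#{@command.first} exited with status #{$?.exitstatus}" unless $?.success?
    output
  end
end
```

Like the original, this writes all of the input before reading any output, which is fine for a filter that slurps stdin first (as Markdown.pl does) but could stall with a filter that streams large output while still reading.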
Announcing Two New Fast Markdown Libraries for Ruby
(Three if you include the pipe-to-perl implementation above.)
I have two experimental Ruby extension libraries: one that wraps John
MacFarlane’s peg-markdown and one that wraps David Loren Parsons’s
Discount. Both are complete implementations of core Markdown plus
SmartyPants in C.
Why two? Well, there are some pretty big differences between
implementations:
Discount has a BSD-style license; peg-markdown is GPL. The Ruby extensions
adopt the license of their parent work.
peg-markdown uses a PEG-based grammar definition and a parser generator
called leg. That’s just fucking cool. It stimulates both the
CompSci weeny and pirate areas of my brain simultaneously. Also, this
should – theoretically, of course – make peg-markdown easier to maintain
and extend and guarantee a high level of correctness, assuming the grammar
is defined properly.
Discount is thread-safe, has good memory management, and includes a
stable set of functions geared toward library use. peg-markdown has none
of that but the author is entertaining suggestions (in the form of
patches).
Discount is quite a bit faster than peg-markdown (~8x in my tests),
although either will blow the doors off BlueCloth (or Markdown.pl for that
matter) in raw performance.
Discount makes for a better Ruby extension presently but peg-markdown has
legs (hardy har har).
Installing, Using, Hacking
Git clones are available on GitHub for monitoring, hacking, and browsing
the source / documentation files: rdiscount
and rpeg-markdown.
GEMs have been released to RubyForge. Install as usual:
$ sudo gem install discount
$ sudo gem install rpeg-markdown
(If you have a spare moment, please consider installing either or both of
these and note any compilation errors along with your platform in the
comments.)
Both extensions implement the basic protocol popularized by RedCloth and
adopted by BlueCloth:
require 'discount'
markdown = Discount.new("Hello World!")
puts markdown.to_html
For rpeg-markdown:
require 'markdown'
markdown = Markdown.new("Hello World!")
puts markdown.to_html
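The protocol amounts to very little: a class whose constructor takes the source text and whose instances respond to `to_html`. As a rough sketch, code written against that shape can take the implementation as a parameter. `PlainTextCloth` and `render` are hypothetical names, not part of either library, and the stand-in class does no Markdown processing at all.

```ruby
# Hypothetical stand-in that follows the same new(text) / to_html shape.
# It only escapes HTML-significant characters and wraps the text in <p>.
class PlainTextCloth
  def initialize(text)
    @text = text
  end

  def to_html
    escaped = @text.gsub('&', '&amp;').gsub('<', '&lt;').gsub('>', '&gt;')
    "<p>#{escaped}</p>"
  end
end

# Hypothetical helper: works with any class that follows the protocol,
# so Discount, Markdown, or BlueCloth could be passed in instead.
def render(text, implementation = PlainTextCloth)
  implementation.new(text).to_html
end

puts render("Hello World!")
```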
Lastly, you can inject either library into your BlueCloth-using code by
replacing your bluecloth require statements with the following:
begin
  require 'discount'
  BlueCloth = Discount
rescue LoadError
  require 'bluecloth'
end
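The same begin/rescue trick extends to a longer preference list. Here's a minimal sketch, assuming you'd rather loop than nest rescues; `first_available` is a hypothetical helper, and the library and class names in the usage comment are simply the ones used in this post.

```ruby
# Hypothetical helper: try each [library, constant name] pair in order and
# return the constant for the first library that loads.
def first_available(candidates)
  candidates.each do |library, constant|
    begin
      require library
      return Object.const_get(constant)
    rescue LoadError
      next   # not installed; fall through to the next candidate
    end
  end
  names = candidates.map { |library, _| library }.join(', ')
  raise LoadError, "none of these could be loaded: #{names}"
end

# Possible usage, left commented out because it needs at least one of the
# gems installed:
#
#   MarkdownImpl = first_available([
#     %w[discount Discount],
#     %w[markdown Markdown],
#     %w[bluecloth BlueCloth]
#   ])
#   puts MarkdownImpl.new("Hello World!").to_html
```

Binding the result to a new constant (here the made-up `MarkdownImpl`) sidesteps the "already initialized constant" warning you would get by reassigning `BlueCloth` after the real bluecloth library has loaded.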
Benchmarks
Here are the results of processing the Basic Markdown Syntax test file over
100 iterations with BlueCloth, Maruku, Discount, and rpeg-markdown on my
2GHz MacBook Pro. All values are wall-clock time.
$ ruby -rubygems benchmark.rb
Results for 100 iterations
BlueCloth: 13.029987s total time, 00.130300s average
Maruku: 08.424132s total time, 00.084241s average
Discount: 00.082019s total time, 00.000820s average
Markdown: 00.715275s total time, 00.007153s average
Note: The Markdown above is rpeg-markdown. I will be changing the class
name to something else in the near future.
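To put those totals in relative terms, here is a quick bit of arithmetic on the numbers reported above (nothing is re-measured). By this reckoning Maruku comes out around 1.5x faster than BlueCloth, rpeg-markdown around 18x, and Discount around 159x, with Discount roughly 8.7x ahead of rpeg-markdown.

```ruby
# Total times in seconds, copied from the benchmark output above.
totals = {
  'BlueCloth' => 13.029987,
  'Maruku'    => 8.424132,
  'Discount'  => 0.082019,
  'Markdown'  => 0.715275   # rpeg-markdown
}

# Speed relative to BlueCloth: how many times it fits into BlueCloth's total.
baseline = totals['BlueCloth']
totals.each do |name, seconds|
  printf "%-10s %6.1fx the speed of BlueCloth\n", "#{name}:", baseline / seconds
end

# Discount versus rpeg-markdown
ratio = totals['Markdown'] / totals['Discount']
printf "Discount is about %.1fx faster than rpeg-markdown\n", ratio
```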
Here’s the code used to perform the benchmarks (benchmark.rb):
require 'rubygems'
require 'bluecloth'
require 'discount'
require 'maruku'
require 'markdown'

iterations = 100
test_file = 'test.txt'
implementations = [ BlueCloth, Maruku, Discount, Markdown ]

def benchmark(implementation, text, iterations)
  start = Time.now
  iterations.times do |i|
    implementation.new(text).to_html
  end
  Time.now - start
end

# read test file
test_data = File.read(test_file)

# prime the pump
implementations.each { |impl| benchmark(impl, test_data, 1) }

# gather results
results =
  implementations.inject([]) do |r,impl|
    GC.start
    r << [ impl, benchmark(impl, test_data, iterations) ]
  end

puts "Results for #{iterations} iterations"
results.each do |impl,time|
  printf "%10s %09.06fs total time, %09.06fs average\n",
    "#{impl}:", time, time / iterations
end
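One possible refinement, sketched here rather than taken from the benchmark above: `Time.now` follows the system clock, which can be adjusted mid-run, so a monotonic clock is the safer way to measure elapsed time on newer Rubies. `benchmark_monotonic` and the `NullCloth` stand-in are hypothetical names; the stand-in exists only so the sketch runs without any Markdown gem installed.

```ruby
# Hypothetical stand-in following the new(text) / to_html protocol.
class NullCloth
  def initialize(text)
    @text = text
  end

  def to_html
    @text.dup
  end
end

# Same shape as the benchmark method above, but timed with the monotonic
# clock (Process.clock_gettime, available in Ruby 2.1 and later).
def benchmark_monotonic(implementation, text, iterations)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { implementation.new(text).to_html }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

iterations = 100
elapsed = benchmark_monotonic(NullCloth, "Hello *World*!", iterations)
printf "%10s %09.06fs total time, %09.06fs average\n",
       "NullCloth:", elapsed, elapsed / iterations
```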