This post originated from an RSS feed registered with Ruby Buzz
by Red Handed.
Original Post: HTML Filtering For RedCloth
Feed Title: RedHanded
Feed URL: http://redhanded.hobix.com/index.xml
Feed Description: sneaking Ruby through the system
This isn’t a patch for RedCloth; it’s a method you can use to filter HTML in general, and it works nicely with RedCloth output. I use it for the comments on this site. It’s the best solution I can think of for world-writable files: any tag or attribute that isn’t on the allowed list gets dropped.
class String
  ## Dictionary describing allowable HTML
  ## tags and attributes.
  BASIC_TAGS = {
    'a' => ['href', 'title'],
    'img' => ['src', 'alt', 'title'],
    'br' => [],
    'i' => nil,
    'u' => nil,
    'b' => nil,
    'pre' => nil,
    'kbd' => nil,
    'code' => ['lang'],
    'cite' => nil,
    'strong' => nil,
    'em' => nil,
    'ins' => nil,
    'sup' => nil,
    'sub' => nil,
    'del' => nil,
    'table' => nil,
    'tr' => nil,
    'td' => nil,
    'th' => nil,
    'ol' => nil,
    'ul' => nil,
    'li' => nil,
    'p' => nil,
    'h1' => nil,
    'h2' => nil,
    'h3' => nil,
    'h4' => nil,
    'h5' => nil,
    'h6' => nil,
    'blockquote' => ['cite']
  }

  ## Method which cleans the String of HTML tags
  ## and attributes outside of the allowed list.
  def clean_html!( tags = BASIC_TAGS )
    gsub!( /<(\/*)(\w+)([^>]*)>/ ) do
      raw = $~
      tag = raw[2].downcase
      if tags.has_key? tag
        pcs = [tag]
        tags[tag].each do |prop|
          ## Try double-quoted, single-quoted, then bare values.
          ['"', "'", ''].each do |q|
            q2 = ( q != '' ? q : '\s' )
            if raw[3] =~ /#{prop}\s*=\s*#{q}([^#{q2}]+)#{q}/i
              ## A backslash does not escape a quote in HTML, so use an entity.
              pcs << "#{prop}=\"#{$1.gsub('"', '&quot;')}\""
              break
            end
          end
        end if tags[tag]
        "<#{raw[1]}#{pcs.join(' ')}>"
      else
        " "
      end
    end
  end
end
Be sure to use it after you convert your Textile to HTML.
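Since the allowed list is just an argument, a stricter table can be passed for something like comments. Here is a rough sketch of that. The COMMENT_TAGS table and the standalone clean_html function are made up for this example; the function is a condensed, non-destructive stand-in for the method above (with quotes escaped as &quot;), repeated only so the snippet runs on its own.

```ruby
# Hypothetical whitelist for comments: links plus a few inline tags.
COMMENT_TAGS = {
  'a'      => ['href', 'title'],
  'em'     => nil,
  'strong' => nil,
  'code'   => nil,
  'p'      => nil
}

# Condensed stand-in for String#clean_html!, returning a new string
# instead of modifying the receiver.
def clean_html(html, tags)
  html.gsub(/<(\/*)(\w+)([^>]*)>/) do
    raw = $~
    tag = raw[2].downcase
    next ' ' unless tags.key?(tag)   # unknown tag: replace it with a space
    pcs = [tag]
    (tags[tag] || []).each do |prop|
      # Try double-quoted, single-quoted, then bare attribute values.
      ['"', "'", ''].each do |q|
        q2 = (q != '' ? q : '\s')
        if raw[3] =~ /#{prop}\s*=\s*#{q}([^#{q2}]+)#{q}/i
          pcs << "#{prop}=\"#{$1.gsub('"', '&quot;')}\""
          break
        end
      end
    end
    "<#{raw[1]}#{pcs.join(' ')}>"
  end
end

comment = %q{<p onclick="evil()">Hi <a href="http://example.com" onmouseover="x()">link</a><script>alert(1)</script></p>}
puts clean_html(comment, COMMENT_TAGS)
```

The event handlers and the script tags are gone, but note that the text between the script tags is left in place: only tags and attributes are filtered, not their contents.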
I’d like to make RedCloth’s built-in filter allow this kind of customization. It may even be worthwhile to have it scan for allowed CSS within a style declaration. On a Wiki, it’s nice to allow people to come up with widths and floating directions and detailed colors, you know?
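To give a feel for what scanning a style declaration might look like, here is a minimal sketch. Nothing in it comes from RedCloth: the ALLOWED_CSS table, its value patterns, and the clean_style function are all assumptions made up for illustration, and a real filter would need a much more careful list.

```ruby
# Hypothetical whitelist: CSS property name => pattern its value must match.
ALLOWED_CSS = {
  'width'            => /\A\d+(?:px|em|%)\z/,
  'float'            => /\A(?:left|right|none)\z/,
  'color'            => /\A(?:#[0-9a-f]{3}|#[0-9a-f]{6}|[a-z]+)\z/i,
  'background-color' => /\A(?:#[0-9a-f]{3}|#[0-9a-f]{6}|[a-z]+)\z/i
}

# Keep only the declarations whose property is listed and whose value
# matches the pattern for that property; everything else is dropped.
def clean_style(style, allowed = ALLOWED_CSS)
  style.split(';').map do |decl|
    prop, value = decl.split(':', 2).map { |s| s.strip }
    next unless prop && value        # skip empty or malformed declarations
    prop = prop.downcase
    pattern = allowed[prop]
    "#{prop}: #{value}" if pattern && value =~ pattern
  end.compact.join('; ')
end

puts clean_style('width: 200px; float: left; position: fixed; color: expression(alert(1))')
```

Here the width and float survive, while the unlisted position property and the color value that doesn't look like a color are thrown away. Checking values as well as property names matters, since an allowed property can still carry something nasty.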