The Artima Developer Community
Ruby Buzz Forum
Searching Beast and WordPress from a Rails app

Scott Patten

Scott Patten is a freelance web developer and Ruby on Rails trainer based in Vancouver
Searching Beast and WordPress from a Rails app Posted: May 8, 2008 12:10 AM

This post originated from an RSS feed registered with Ruby Buzz by Scott Patten.
Original Post: Searching Beast and WordPress from a Rails app
Feed Title: Scott Patten's Blog
Feed URL: http://feeds.feedburner.com/scottpatten.ca
Feed Description: Scott Patten is the cofounder of Ruboss (http://ruboss.com) and Leanpub (http://leanpub.com), both based in Vancouver. He is also the author of The S3 Cookbook (http://leanpub.com/thes3cookbook). He blogs about Startups, Ruby, Rails, Javascript, CSS, Amazon Web Services and whatever else strikes his fancy.

I did some work for Ecolect recently, integrating search across their main site, a Beast forum and a WordPress blog. It was pretty straightforward once I had it figured out, but I couldn’t find a walkthrough on the net.

So, I decided to write one.

The Search Engine

Ecolect was already using acts_as_ferret for their main site search, so it was a no-brainer to keep using it. For a fun and thorough introduction to acts_as_ferret, see the Rails Envy Tutorial.

Searching the Beast Forum

(Note: the Beast forum search isn’t live at the moment, as the forums are not fully functioning yet.)

Beast setup

In Beast, each Forum has many Topics (saved in the topics table), and each Topic has many Posts (saved in the posts table).

The Beast forum was set up by Shanti Braford as described in option #3 of this great article on integrating a beast forum into a Rails app.

In this setup, the Beast tables are added to the main site’s database. So, the posts and topics tables that we want to search are already in the main site’s database. This made things pretty easy: you can search the Beast forum just like you would any database table.

Searching a non-integrated Beast forum

If you don’t have the Beast tables integrated into your main site’s database, you can still search them. You just need to point the Topic and Post models at the correct database. This is a two-step process.

First, set up a database entry for your Beast forum in config/database.yml. Something like this:

beast:
  adapter: mysql
  database: beast_forum
  username: app
  password: your_password
  host: localhost

Then, at the bottom of config/environment.rb, add the following lines. The argument is the name of the database.yml entry (beast), not the name of the database:

Post.establish_connection "beast"
Topic.establish_connection "beast"

I haven’t actually tried this, so let me know if you get it working or if you needed to make any changes to what I’ve written here.
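One detail that is easy to trip over: when establish_connection is given a string or symbol, Rails looks up the database.yml entry with that name, so the entry name and the database name play different roles. The plain-Ruby sketch below only parses a copy of the YAML above to show the two side by side. It does not open a connection, and the constant names are made up for the illustration.

```ruby
require "yaml"

# Same shape as the database.yml entry above; the values are placeholders.
DATABASE_YML = <<~CONFIG
  beast:
    adapter: mysql
    database: beast_forum
    username: app
    password: your_password
    host: localhost
CONFIG

CONFIGURATIONS = YAML.load(DATABASE_YML)

# The top-level key is the name of the entry ...
ENTRY_NAME = CONFIGURATIONS.keys.first
# ... and the database name is just one setting inside that entry.
DATABASE_NAME = CONFIGURATIONS[ENTRY_NAME]["database"]

puts "entry: #{ENTRY_NAME}, database: #{DATABASE_NAME}"
```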

The Topic Model

Even with the integrated setup there were a few wrinkles. First, although the Beast tables are in the database, there are no models associated with them. I wanted to search post bodies and topic titles, so I created Topic and Post models.

Here’s the Topic model. It’s only here so that the Post model can search a post’s topic title, so there’s not much to it.

class Topic < ActiveRecord::Base
  
  has_many :posts
    
end

The Post Model

The Post model is a bit more complicated.

class Post < ActiveRecord::Base
  
  belongs_to :topic
  
  SEARCH_FIELDS = [:scrubbed_body, :topic_title]
  
  acts_as_ferret :fields => { :scrubbed_body => {:boost => 0, :store => :yes}, 
                              :topic_title => {:boost => 3, :store => :yes}}  
  
  extend FullTextSearch
  
  # remove the html tags from body
  def scrubbed_body
    body_html.gsub(/<\/?[^>]*>/, "")
  end
  
  def topic_title
    topic.title
  end
      
  # Construct the url to the post    
  def url
    File.join("http://forums.ecolect.net", "forums", topic.forum_id.to_s, 
              "topics", topic.id.to_s + "#post_#{id}")
  end
  
end

acts_as_ferret

The acts_as_ferret declaration makes the Post model searchable. Notice that the actual fields being searched are not taken directly from the database; they are both manipulated in some way. acts_as_ferret doesn’t really care if the stuff it is indexing is coming directly from the database or from methods you have added to your model.

Scrubbing the HTML tags

The post body is stored with HTML tags in it, so I wanted to search and show the posts with the tags scrubbed out. This is done by the Post#scrubbed_body method, which is just an ugly regexp that removes anything between a < and the next > sign.
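To see what that regexp does in isolation, here is a small standalone sketch. The scrub_html helper and the TAG_PATTERN constant are names invented for this example; only the pattern itself comes from the model above.

```ruby
# The same pattern used in Post#scrubbed_body.
TAG_PATTERN = /<\/?[^>]*>/

# Hypothetical helper: returns a copy of the string with the tags removed.
def scrub_html(html)
  html.gsub(TAG_PATTERN, "")
end

puts scrub_html("<p>Hello <b>world</b></p>")
puts scrub_html('<a href="/forums/1">Forum</a> link')
```

This is a blunt tool: it does not decode entities such as &amp;, and it will also eat ordinary text that happens to sit between a < and a >. That is probably acceptable for building a search index, but it is not a general-purpose sanitizer.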

The url method

I also wanted to link to the posts, so I created a Post#url method which is used in the view.
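As a rough illustration of how that URL is assembled, the sketch below does the same File.join call outside of the model. The post_url helper, the example.com host and the ids are placeholders chosen for the example.

```ruby
# Hypothetical standalone version of Post#url. File.join puts a single "/"
# between the pieces; the "#post_<id>" anchor is appended to the last piece.
def post_url(base, forum_id, topic_id, post_id)
  File.join(base, "forums", forum_id.to_s,
            "topics", topic_id.to_s + "#post_#{post_id}")
end

puts post_url("http://forums.example.com", 3, 17, 42)
```

File.join is meant for file paths, so this only works because URL and Unix path separators happen to match.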

Finally, the actual search is done using the FullTextSearch mixin, which adds a class method Post::full_text_search to Post. The FullTextSearch mixin is described in more detail below.

Searching the WordPress Blog

The only table from the WordPress blog that you really care about is the wp_posts table. To get access to it in your Rails app, make a WpPost model and point it at your WordPress database.

First, create a database entry in config/database.yml that looks like this:

wordpress:
  adapter: mysql
  database: wordpress
  username: app
  password: your_password
  host: localhost

Then, add the following line to the bottom of config/environment.rb:

WpPost.establish_connection "wordpress"

Here’s the model:

class WpPost < ActiveRecord::Base
  
  self.primary_key = "ID"
  
  SEARCH_FIELDS = [:scrubbed_title, :scrubbed_content]
                   
  acts_as_ferret :fields => { :scrubbed_title => {:boost => 3, :store => :yes},
                              :scrubbed_content => {:boost => 0, :store => :yes}}
  
  extend FullTextSearch
  
  def id
    read_attribute(:ID)
  end
  
  # title and content need to have the html tags removed from them, and should
  # only be searchable if the post has been assigned a url (the guid) and the post 
  # is actually a post and not an asset.
  def scrubbed_title
    post_title.gsub(/<\/?[^>]*>/, "") if post_type == "post" and !guid.empty?
  end
  
  def scrubbed_content
    post_content.gsub(/<\/?[^>]*>/, "") if post_type == "post" and !guid.empty?
  end
end

There are a few things to note here:

WordPress uses ID, rather than id, as its primary key. The line self.primary_key = "ID" lets Rails know about that (the explicit self. receiver is required). You also need to add an id method that returns ID so that ferret indexes things properly.
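Why the receiver matters is plain Ruby rather than anything Rails-specific: inside a class body, name = value without a receiver creates a local variable instead of calling a name= method. The sketch below uses a tiny made-up FakeBase class as a stand-in for the class-level setting. It is not ActiveRecord and all class names are hypothetical.

```ruby
# Tiny stand-in for a class-level setting; this is NOT ActiveRecord.
class FakeBase
  class << self
    attr_writer :primary_key

    def primary_key
      @primary_key || "id"
    end
  end
end

class WithoutSelf < FakeBase
  primary_key = "ID"       # creates a local variable; the class setting is untouched
end

class WithSelf < FakeBase
  self.primary_key = "ID"  # calls the primary_key= class method
end

puts WithoutSelf.primary_key
puts WithSelf.primary_key
```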

You will need to scrub the HTML tags from the content and title; that’s what the scrubbed_title and scrubbed_content methods do.

Finally, you don’t want to index assets (which are stored in the wp_posts table as well) or any unpublished posts.
  • Real posts will have a post_type of 'post'.
  • An unpublished post won’t have its guid set.

This is taken care of by only returning titles or content if post_type == "post" and !guid.empty?.
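The effect of that guard can be checked without a database. In the sketch below, FakeWpRow is a made-up Struct standing in for a wp_posts row, and the sample rows (including the "attachment" post type) are invented for the illustration.

```ruby
# Hypothetical stand-in for a row of the WordPress posts table.
FakeWpRow = Struct.new(:post_title, :post_type, :guid) do
  # Same logic as WpPost#scrubbed_title: nil unless this is a real post with a guid.
  def scrubbed_title
    post_title.gsub(/<\/?[^>]*>/, "") if post_type == "post" && !guid.empty?
  end
end

published = FakeWpRow.new("<em>Hello</em> world", "post", "http://example.com/?p=1")
asset     = FakeWpRow.new("logo.png", "attachment", "http://example.com/logo.png")
no_guid   = FakeWpRow.new("Work in progress", "post", "")

p published.scrubbed_title
p asset.scrubbed_title
p no_guid.scrubbed_title
```

When the guard fails, the method returns nil, so there is simply no text for that field. Note that guid.empty? would raise if guid were nil rather than an empty string.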

The FullTextSearch mixin

This is based on code from Roman Mackovcak’s article on full text search in Rails. All I did was extract the method he provides into a mixin so I could use it in multiple models.

To use the mixin in a model, the model needs to define SEARCH_FIELDS and have an acts_as_ferret declaration. SEARCH_FIELDS is an array of symbols naming the model fields to be searched.
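The mechanism the mixin relies on is ordinary Ruby: extend turns a module's methods into class methods, and inside those methods self is the extending class, so const_get can read that class's own SEARCH_FIELDS. A minimal sketch of just that mechanism, with made-up module and class names and no ferret involved:

```ruby
# Hypothetical mixin that reads a constant from whichever class extends it.
module FieldLister
  def search_fields
    # self is the extending class here, so this finds its SEARCH_FIELDS.
    const_get(:SEARCH_FIELDS)
  end
end

class Article
  SEARCH_FIELDS = [:title, :body]
  extend FieldLister
end

class Comment
  SEARCH_FIELDS = [:text]
  extend FieldLister
end

p Article.search_fields
p Comment.search_fields
```

If a class extends the module without defining SEARCH_FIELDS, const_get raises a NameError the first time the method is called.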

module FullTextSearch
  
  ##
  # FERRET SEARCH METHOD
  # This method requires that you set the following in the model:
  # SEARCH_FIELDS: 
  # a list of symbols giving the fields to be searched.
  # E.g., SEARCH_FIELDS = [:post_title, :post_content]
  #
  # The acts_as_ferret declaration.  
  # Use :store => :yes for each field if you want to use highlighting for that field.
  # E.g., 
  # acts_as_ferret :fields => { :post_title => {:boost => 3, :store => :yes},
  #                             :post_content => {:boost => 0, :store => :yes}}
  ##
  DEFAULT_PER_PAGE = 10
  
  def full_text_search(q, options = {})
     return nil if q.nil? || q == ""
     
     default_options = {:limit => FullTextSearch::DEFAULT_PER_PAGE, 
                        :page => 1, 
                        :lazy => self.const_get(:SEARCH_FIELDS)
                       }
     options = default_options.merge options

     # Get the offset based on what page we're on
     options[:offset] = options[:limit] * (options.delete(:page).to_i-1)  
     # Now do the query with our options
     results = self.find_by_contents(q, options)

     return [results.total_hits, results]
  end  
  
  
end

You use it like this:

WpPost.full_text_search('test')
WpPost.full_text_search('test', :limit => 30, :page => 2)

The full_text_search method returns an array of length two. The first value in the array is the total number of hits, and the second value holds the actual search results for the requested page. If the query is nil or empty, it returns nil instead.
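The paging arithmetic and the two-element return value can be tried out on their own. In this sketch, offset_for and page_count are hypothetical helpers (page_count is not part of the mixin), and the array literal only mimics the shape of what full_text_search returns.

```ruby
# Same arithmetic as in full_text_search: page 1 starts at offset 0.
def offset_for(page, per_page)
  per_page * (page.to_i - 1)
end

# Hypothetical helper for a view: number of pages needed to show every hit.
def page_count(total_hits, per_page)
  (total_hits.to_f / per_page).ceil
end

# A made-up return value with the same [total_hits, results] shape.
total_hits, results = [45, ["first hit", "second hit"]]

puts offset_for(2, 30)
puts page_count(total_hits, 30)
puts results.length

# Destructuring a nil return (blank query) leaves both variables nil.
blank_total, blank_results = nil
p blank_total
p blank_results
```

Because a blank query gives nil rather than [0, []], a controller or view that destructures the result needs to handle nil values.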
