A full text indexing extension for MongoDB using Sphinx and Mongoid

radepal/mongoid-sphinx

This is a fork of github.com/burke/mongosphinx with many changes to simplify and support Mongoid.

The MongoidSphinx library implements an interface between MongoDB and Sphinx, using Mongoid to index objects in Sphinx automatically. It tries to be as transparent as possible: just one additional method call in a Mongoid model and some Sphinx configuration are needed to get going.

MongoidSphinx requires the Mongoid and Riddle gems, as well as a running Sphinx installation and a MongoDB installation.

sudo gem sources -a http://gems.github.com  # Only needed once!
sudo gem install riddle
sudo gem install mongoid
sudo gem install mongoidsphinx

No additional configuration is needed for interfacing with MongoDB: setup is complete once Mongoid is able to talk to the MongoDB server.

A proper “sphinx.conf” file and a script for retrieving index data have to be provided for interfacing with Sphinx: Sorry, no ThinkingSphinx-like magic… :-) Depending on the amount of data, more than one index may be used, and indexes may be consolidated from time to time.

This is a sample configuration for a single “main” index:

searchd {
  address = 0.0.0.0
  port = 3312

  log = ./sphinx/searchd.log
  query_log = ./sphinx/query.log
  pid_file = ./sphinx/searchd.pid
}

source mongoblog {
  type = xmlpipe2

  xmlpipe_command = rake sphinx:genxml --silent
}

index mongoblog {
  source = mongoblog

  charset_type = utf-8
  path = ./sphinx/sphinx_index_main
}

Notice the line “xmlpipe_command =”. This is the command the indexer runs to generate its input. You can change it to whatever works best for you, but I set it up as a rake task, with the following in “lib/tasks/sphinx.rake”.

Here, :fields is a list of fields to export. Performance tends to suffer if you export everything, so you’ll probably want to just list the fields you’re indexing.

namespace :sphinx do
  task :genxml => :environment do
    MongoidSphinx::Indexer::XMLDocset.stream(Food)
  end
end

This streams the collection with a MongoDB cursor rather than with offsets, which works better for large collections. See: groups.google.com/group/mongodb-user/browse_thread/thread/35f01db45ea3b0bd/96ebc49b511a6b41?lnk=gst&q=skip#96ebc49b511a6b41
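To make the indexer input more concrete, here is a minimal sketch of the kind of xmlpipe2 document set that Sphinx expects on standard output. It is not the MongoidSphinx implementation: the `stream_docset` helper, the `:sphinx_id` key and the plain hashes standing in for MongoDB documents are all assumptions made for illustration. The documents are consumed one at a time from an enumerator, which mirrors the cursor-style streaming described above.

```ruby
require "cgi"

# Hypothetical helper (not part of MongoidSphinx): writes an xmlpipe2 docset
# for any enumerable of hash-like documents, one document at a time.
def stream_docset(docs, fields, out = $stdout)
  out.puts '<?xml version="1.0" encoding="utf-8"?>'
  out.puts "<sphinx:docset>"

  # Declare the full-text fields once, up front.
  out.puts "<sphinx:schema>"
  fields.each { |field| out.puts %(<sphinx:field name="#{field}"/>) }
  out.puts "</sphinx:schema>"

  # Emit each document as it arrives; nothing is buffered in memory.
  docs.each do |doc|
    out.puts %(<sphinx:document id="#{doc[:sphinx_id]}">)
    fields.each do |field|
      out.puts "<#{field}>#{CGI.escapeHTML(doc[field].to_s)}</#{field}>"
    end
    out.puts "</sphinx:document>"
  end

  out.puts "</sphinx:docset>"
end

# Assumed sample data; :sphinx_id stands in for the numeric part of the ID.
posts = [
  { sphinx_id: 1, title: "First Post", body: "Hello & welcome" },
  { sphinx_id: 2, title: "Second Post", body: "More text" }
]
stream_docset(posts.each, [:title, :body])
```

Only the listed fields are written, which is why exporting just the fields you index keeps the output small.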

Use method search_index to enable indexing of a model. You must provide a list of attribute keys.

A side effect of calling this method is that MongoidSphinx overrides the default of letting MongoDB create new IDs: Sphinx only allows numeric IDs, so MongoidSphinx gives new objects an ID made of the class name, a hyphen and an integer (e.g. Post-38497238). Again: only objects with such IDs are indexed, due to internal restrictions of Sphinx.
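The ID format can be illustrated with a small sketch. The `SphinxIdSketch` module and its `build` and `parse` methods are hypothetical names chosen for this example and are not part of MongoidSphinx; they only show how a "ClassName-integer" ID could be assembled and how its numeric part, the only part Sphinx can store, could be recovered.

```ruby
# Hypothetical helpers illustrating the "ClassName-integer" ID format.
module SphinxIdSketch
  # Build an ID such as "Post-38497238" from a class name and an integer.
  def self.build(class_name, number)
    "#{class_name}-#{number}"
  end

  # Split an ID back into its class name and its numeric part.
  def self.parse(id)
    match = /\A(?<class_name>.+)-(?<number>\d+)\z/.match(id)
    raise ArgumentError, "not a ClassName-integer ID: #{id.inspect}" unless match

    [match[:class_name], Integer(match[:number], 10)]
  end
end

id = SphinxIdSketch.build("Post", 38497238)
class_name, number = SphinxIdSketch.parse(id)
puts "#{id} belongs to #{class_name}, numeric part #{number}"
```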

Sample:

class Post
  include Mongoid::Sphinx

  field :title
  field :body

  search_index :title, :body
end

You must also create a config/sphinx.yml file with the host and port of your searchd process like so:

development:
  address: localhost
  port: 3312
staging:
  address: localhost
  port: 3312
production:
  address: localhost
  port: 3312
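As a rough illustration of how such a file can be read, the sketch below parses the same structure with Ruby's standard YAML library and picks the settings for one environment. The `sphinx_settings` helper and the inline `SPHINX_YAML` string are assumptions made so the example is self-contained; MongoidSphinx may load the file differently.

```ruby
require "yaml"

# Inline stand-in for config/sphinx.yml so the sketch is self-contained.
SPHINX_YAML = <<~YAML
  development:
    address: localhost
    port: 3312
  production:
    address: localhost
    port: 3312
YAML

# Hypothetical helper: returns [address, port] for the given environment name.
def sphinx_settings(yaml_text, env)
  config = YAML.safe_load(yaml_text)
  settings = config.fetch(env) { raise KeyError, "no Sphinx settings for #{env}" }
  [settings.fetch("address"), settings.fetch("port")]
end

address, port = sphinx_settings(SPHINX_YAML, "development")
puts "Sphinx daemon expected at #{address}:#{port}"
```

Failing early on a missing environment key makes a misnamed or incomplete section easier to spot than a connection error later on.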

An additional class method search is added to each search-indexed model. This method takes a Sphinx query like “foo @title bar”, runs it within the context of the current class and returns an Array of matching MongoDB documents.

Samples:

Post.search('first')
=> [...]

post = Post.search('this is @title post').first
post.title
=> "First Post"
post.class
=> Post

Additional options :match_mode, :limit and :max_matches can be provided to customize the behaviour of Riddle. Option :raw can be set to true to do no lookup of the document IDs but return the raw IDs instead.

Sample:

Post.search('my post', :limit => 100)

Copyright © 2010 Matt Hodgson

Copyright © 2009 Burke Libbey, Ryan Neufeld

CouchSphinx Copyright © 2009 Holtzbrinck Digital GmbH, Jan Ulbrich

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
