Geodistance Searching With Ultrasphinx


I’m happy to announce a patch for Ultrasphinx that exposes the geographical distance search built into the Sphinx full-text search engine.

Why

Through my company Somebox, I recently led a team that launched TravelSkoot.com for NBC Digital Innovation. It is a Google Maps mash-up that lets people group travel destinations together (called “skoots”) along with comments, ratings, etc. The entire project was built in Rails in about four months, and is hosted at EngineYard.

One of the challenges was to come up with an efficient way to search a large number of points, along with other metadata and fulltext searching. Normally, we would use GeoKit for this kind of thing, but once you combine fulltext with lots of other filters, things get complicated (and slow). That’s where Sphinx really shines.

I knew that Sphinx had support for geodistance, but unfortunately none of the Rails plugins did at the time. Ultrasphinx offers more features than any of the other Rails/Sphinx plugins, and is based on Pat Allan’s excellent Riddle client, which already had the basics for geodistance baked in (and required only a tiny patch to make it work right). The rest of the changes went into the Ultrasphinx plugin itself to make the feature usable.

How it Works

To set up Ultrasphinx for geodistance searches, you need to declare your latitude and longitude columns as indexed fields in the model. Sphinx expects these values in radians, so if your columns store degrees, you can use the :function_sql option of is_indexed to do the conversion:

class Point < ActiveRecord::Base
  is_indexed :fields => [
    :title,
    :description,
    {:field => "lat", :function_sql => 'RADIANS(?)'},
    {:field => "lng", :function_sql => 'RADIANS(?)'}
  ]
end
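For reference, here is a minimal Ruby sketch of the arithmetic that the SQL RADIANS function performs on each column. The degrees_to_radians helper is a hypothetical name used only for illustration; it is not part of Ultrasphinx or Sphinx.

```ruby
# Hypothetical helper (not part of Ultrasphinx): the same conversion
# that RADIANS(?) applies in SQL, written out in plain Ruby.
def degrees_to_radians(degrees)
  degrees * Math::PI / 180.0
end

# The coordinates used in the search example later in this post.
lat_rad = degrees_to_radians(40.343)
lng_rad = degrees_to_radians(-74.233)

puts lat_rad
puts lng_rad
```

This is handy for spot-checking what ends up in the index against a known coordinate.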

The search itself ends up looking like this:

@search = Ultrasphinx::Search.new(
  :query     => 'pizza',
  :sort_mode => 'extended',
  :sort_by   => 'distance asc',
  :filters   => {'distance' => 0..10000},
  :location  => {
    :units => 'degrees',
    :lat   => 40.343,
    :long  => -74.233
  }
)
@search.run
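Since distances are reported in meters, the range passed to the 'distance' filter is presumably in meters as well. If your application thinks in miles, a small helper can build that range. This is only a sketch: distance_range_for_miles is a hypothetical name, and whether the filter accepts a float range is an assumption you should check against your Ultrasphinx version.

```ruby
# Hypothetical helper (not part of Ultrasphinx): build a meters range
# for the 'distance' filter from a radius given in miles.
METERS_PER_MILE = 1609.344

def distance_range_for_miles(miles)
  # Assumes the filter value is a Range measured in meters.
  0.0..(miles * METERS_PER_MILE)
end

range = distance_range_for_miles(5)
puts range.end
```

The result could then be passed as :filters => {'distance' => range} in the search above.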

The computed distance from the search location is then available on each result record (in meters):

@search.results.first.distance
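To sanity-check the numbers coming back, you can compute a great-circle distance yourself. The sketch below uses the haversine formula with an assumed mean Earth radius of 6,371,000 m; Sphinx's own calculation may use a different radius or formula, so expect small differences. The haversine_meters and to_rad names are hypothetical.

```ruby
# Assumed mean Earth radius in meters; Sphinx may use a different value.
EARTH_RADIUS_METERS = 6_371_000.0

# Hypothetical helper: degrees to radians.
def to_rad(degrees)
  degrees * Math::PI / 180.0
end

# Hypothetical helper: haversine great-circle distance in meters
# between two points given in degrees.
def haversine_meters(lat1, lng1, lat2, lng2)
  dlat = to_rad(lat2 - lat1)
  dlng = to_rad(lng2 - lng1)
  a = Math.sin(dlat / 2)**2 +
      Math.cos(to_rad(lat1)) * Math.cos(to_rad(lat2)) * Math.sin(dlng / 2)**2
  # Clamp to 1.0 to guard against floating-point overshoot before asin.
  2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt([a, 1.0].min))
end

# Distance from the search location used above to a nearby point.
puts haversine_meters(40.343, -74.233, 40.353, -74.233)
```

Comparing this figure with the distance on a known record is a quick way to confirm that the lat and lng fields were indexed in radians.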

You can filter and sort results by distance, and combine this with all the other features of Ultrasphinx (faceting, weighting, etc.).

Thanks

Open source rocks. Evan Weaver was very encouraging in helping this patch along, cleaning up the API, and guiding discussion. A lot of support came through Dr. Mark Lane, who helped guide me through the internals, rewrote the tests in his roll-up patches, and pushed me to finish. Also thanks to Michael Hill, Michael Burmann, and Jason Lee for testing and feedback.

Footnotes

To use the geodistance features of Ultrasphinx, you need version 1.10 or higher (which in turn requires at least Sphinx v1198). These geo features are still young, and not without some minor issues. Be sure to consult the forum for answers (or submit a patch!), or leave a comment here if you need help. And if you end up using this patch in your project, recommend me at WWR!
