Object-Oriented approach to Solr in Ruby.
- Installation
- Configuration
- Indexing
- Querying
- Deleting documents
- Active Support instrumentation
- Testing
- Running specs
Add solrb to your Gemfile:

gem 'solrb'

If you are going to use solrb with Solr Cloud:

gem 'zk' # required for solrb solr-cloud integration
gem 'solrb'

The simplest way to use Solrb is to set the SOLR_URL environment variable (the URL must include the core name):

ENV['SOLR_URL'] = 'http://localhost:8983/solr/demo'

You can also use Solr.configure to specify the Solr URL explicitly:

Solr.configure do |config|
  config.url = 'http://localhost:8983/solr/demo'
end

Fields that are not configured are passed to Solr as-is, so you only need to declare a field in the configuration if you want Solrb to modify it at runtime.
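To make the pass-through rule concrete, here is a minimal sketch in plain Ruby. It does not use Solrb's internals; the `CONFIGURED_FIELDS` table and the `solr_field_name` helper are hypothetical names that stand in for whatever Solrb builds from the field declarations in this README's configuration examples.

```ruby
# Hypothetical lookup table standing in for fields declared in the configuration.
# The two entries mirror mappings used elsewhere in this README:
#   title -> title_text (dynamic field '*_text'), tags -> tags_array (solr_name).
CONFIGURED_FIELDS = {
  title: 'title_text',
  tags: 'tags_array'
}.freeze

# Returns the mapped Solr name for a configured field,
# or the name unchanged when the field was never configured.
def solr_field_name(name)
  CONFIGURED_FIELDS.fetch(name.to_sym, name.to_s)
end

puts solr_field_name(:title) # configured: prints "title_text"
puts solr_field_name(:price) # not configured: prints "price"
```

Only the configured names are rewritten; everything else reaches Solr exactly as written.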
Use Solr.configure for additional configuration:

Solr.configure do |config|
  config.url = 'http://localhost:8983/solr/demo'
  # This gem uses faraday to make requests to Solr. You can specify additional faraday
  # options here.
  config.faraday_options = {}
  # Core's URL is 'http://localhost:8983/solr/demo'
  # Adding fields to work with
  config.define_core do |f|
    f.field :title, dynamic_field: :text
    f.dynamic_field :text, solr_name: '*_text'
  end
end

Solr.configure do |config|
  config.url = 'http://localhost:8983/solr'
  # Define a core with fields that will be used with Solr.
  # Core URL is 'http://localhost:8983/solr/listings'
  config.define_core(name: :listings) do |f|
    # When a dynamic_field is present, the field name will be mapped to match the dynamic field.
    # Here, "title" will be mapped to "title_text"
    # You must define a dynamic field to be able to use the dynamic_field option
    f.field :title, dynamic_field: :text
    # When solr_name is present, the field name will be mapped to the solr_name at runtime
    f.field :tags, solr_name: :tags_array
    # define a dynamic field
    f.dynamic_field :text, solr_name: '*_text'
  end
  # Pass `default: true` to use one core as a default.
  # Core's URL is 'http://localhost:8983/solr/cars'
  config.define_core(name: :cars, default: true) do |f|
    f.field :manufacturer, solr_name: :manuf_s
    f.field :model, solr_name: :model_s
  end
end

Warning: Solrb doesn't support fields with the same name. If you have two fields with the same name mapping to a single solr field, you'll have to rename one of the fields.
...
config.define_core do |f|
  ...
  # Not allowed: Two fields with same name 'title'
  f.field :title, solr_name: :article_title
  f.field :title, solr_name: :page_title
end
...

To enable Solr Cloud mode, define the ZooKeeper URLs in the Solr config block.
In Solr Cloud mode you don't need to provide a Solr URL (config.url or ENV['SOLR_URL']).
Solrb watches the ZooKeeper state to receive up-to-date information about active Solr nodes, including their URLs.
You can also specify the ACL credentials for ZooKeeper.
Solr.configure do |config|
  config.zookeeper_urls = ['localhost:2181', 'localhost:2182', 'localhost:2183']
  config.zookeeper_auth_user = 'zk_acl_user'
  config.zookeeper_auth_password = 'zk_acl_password'
end

If you are using the puma web server in clustered mode, you must call enable_solr_cloud! in the on_worker_boot
callback so that each puma worker connects to ZooKeeper.

on_worker_boot do
  Solr.enable_solr_cloud!
end

To enable master-slave mode, define a master URL and a slave URL in the Solr config block.
In master-slave mode you don't need to provide a Solr URL (config.url or ENV['SOLR_URL']).
Solr.configure do |config|
  config.master_url = 'localhost:8983'
  config.slave_url = 'localhost:8984'
  # Disable select queries from master:
  config.disable_read_from_master = true
  # Specify Gray-list service
  config.nodes_gray_list = Solr::MasterSlave::NodesGrayList::InMemory.new
end

If you are using the puma web server in clustered mode, you must call enable_master_slave! in the on_worker_boot
callback so that master-slave mode is enabled in each puma worker.

on_worker_boot do
  Solr.enable_master_slave!
end

Solrb provides two built-in gray-list services:
- Solr::MasterSlave::NodesGrayList::Disabled: disabled service (default). It does nothing.
- Solr::MasterSlave::NodesGrayList::InMemory: in-memory service. It stores failed URLs in an instance variable, so the list is not shared across threads or servers. URLs are marked as "gray" for 5 minutes, but if all URLs are gray, the policy will try to send requests to these URLs earlier.
You can implement your own service as long as it exposes the corresponding API.
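The interface Solrb expects from a custom gray-list service is not spelled out here, so the following is only a sketch of the behaviour described for the in-memory service: URLs stay gray for a fixed period, and when every URL is gray the full list is used anyway. The class name `ExpiringGrayList`, the methods `add`, `gray?` and `select_active`, and the injectable `clock` are all hypothetical and would need to be adapted to the API Solrb actually calls.

```ruby
# Illustrative only: names and method signatures are assumptions, not Solrb's API.
class ExpiringGrayList
  def initialize(removal_period: 5 * 60, clock: -> { Time.now })
    @removal_period = removal_period # seconds a URL stays gray
    @clock = clock                   # injectable for testing
    @gray = {}                       # url => time it was marked gray
  end

  # Mark a URL as gray, starting (or restarting) its timer.
  def add(url)
    @gray[url] = @clock.call
  end

  # True while the URL is still inside its gray period.
  def gray?(url)
    marked_at = @gray[url]
    return false unless marked_at
    return true if @clock.call - marked_at < @removal_period

    @gray.delete(url) # period elapsed, forget the URL
    false
  end

  # Prefer non-gray URLs; if every URL is gray, fall back to all of them.
  def select_active(urls)
    active = urls.reject { |url| gray?(url) }
    active.empty? ? urls : active
  end
end

list = ExpiringGrayList.new
list.add('http://localhost:8984')
p list.select_active(['http://localhost:8983', 'http://localhost:8984'])
```

Because the state lives in one object, this sketch has the same limitation as the in-memory service: it is not shared between processes or servers.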
You can force solrb to use a specific node URL with the with_node_url method:

Solr.with_node_url('http://localhost:9000') do
  Solr::Query::Request.new(search_term: 'example', query_fields: query_fields).run
end

Basic authentication is supported by solrb. You can enable it by providing auth_user and auth_password
in the config block.

Solr.configure do |config|
  config.auth_user = 'user'
  config.auth_password = 'password'
end

# creates a single document and commits it to index
doc = Solr::Update::Commands::Add.new
doc.add_field(:id, 1)
doc.add_field(:name, 'Solrb!!!')
commit = Solr::Update::Commands::Commit.new
request = Solr::Update::Request.new([doc, commit])
request.run

You can also create an indexing document directly from attributes:

doc = Solr::Update::Commands::Add.new(doc: { id: 5, name: 'John' })

query_field = Solr::Query::Request::QueryField.new(field: :name)
request = Solr::Query::Request.new(search_term: 'term', query_fields: [query_field])
request.run(page: 1, page_size: 10)

For multi-core configuration use the Solr.with_core block:
Solr.with_core(:models) do
  Solr.delete_by_id(3242343)
  Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
  Solr::Update::Request.new([doc])
end

query_fields = [
  # Use boost_magnitude argument to apply boost to a specific field that you query
  Solr::Query::Request::QueryField.new(field: :name, boost_magnitude: 16),
  Solr::Query::Request::QueryField.new(field: :title)
]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
request.run(page: 1, page_size: 10)

query_fields = [
  Solr::Query::Request::QueryField.new(field: :name),
  Solr::Query::Request::QueryField.new(field: :title)
]
filters = [Solr::Query::Request::Filter.new(type: :equal, field: :title, value: 'A title')]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields, filters: filters)
request.run(page: 1, page_size: 10)

usa_filter =
  Solr::Query::Request::AndFilter.new(
    Solr::Query::Request::Filter.new(type: :equal, field: :country, value: 'USA'),
    Solr::Query::Request::Filter.new(type: :equal, field: :region, value: 'Idaho')
  )
canada_filter =
  Solr::Query::Request::AndFilter.new(
    Solr::Query::Request::Filter.new(type: :equal, field: :country, value: 'Canada'),
    Solr::Query::Request::Filter.new(type: :equal, field: :region, value: 'Alberta')
  )
location_filters = Solr::Query::Request::OrFilter.new(usa_filter, canada_filter)
request = Solr::Query::Request.new(search_term: 'term', filters: location_filters)
request.run(page: 1, page_size: 10)

spatial_point = Solr::SpatialPoint.new(lat: 40.0, lon: -120.0)
filters = [
  Solr::Query::Request::Geofilt.new(field: :location, spatial_point: spatial_point, spatial_radius: 100)
]
request = Solr::Query::Request.new(search_term: 'term', filters: filters)
request.run(page: 1, page_size: 10)

spatial_rectangle = Solr::SpatialRectangle.new(
  top_left: Solr::SpatialPoint.new(lat: 40.0, lon: -120.0),
  bottom_right: Solr::SpatialPoint.new(lat: 30.0, lon: -110.0)
)
filters = [
  Solr::Query::Request::Filter.new(type: :equal, field: :location, value: spatial_rectangle)
]
request = Solr::Query::Request.new(search_term: 'term', filters: filters)
request.run(page: 1, page_size: 10)

query_fields = [
  Solr::Query::Request::QueryField.new(field: :name),
  Solr::Query::Request::QueryField.new(field: :title)
]
sort_fields = [Solr::Query::Request::Sorting::Field.new(name: :name, direction: :asc)]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
request.sorting = Solr::Query::Request::Sorting.new(fields: sort_fields)
request.run(page: 1, page_size: 10)

By default, sorting places nulls last and non-null values first.
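The "nulls last" ordering can be pictured with a small in-memory sketch in plain Ruby. This is not how Solrb or Solr implement it; `sort_nulls_last` is a hypothetical helper, and the sketch assumes that documents with a missing value stay at the end for both sort directions.

```ruby
# Hypothetical helper: sorts hashes by a field, always keeping nil values last.
def sort_nulls_last(docs, field, direction: :asc)
  present, missing = docs.partition { |doc| !doc[field].nil? }
  sorted = present.sort_by { |doc| doc[field] }
  sorted.reverse! if direction == :desc
  sorted + missing # documents without a value always trail the sorted ones
end

docs = [
  { id: 1, name: 'b' },
  { id: 2, name: nil },
  { id: 3, name: 'a' }
]

p sort_nulls_last(docs, :name).map { |doc| doc[:id] }                   # prints [3, 1, 2]
p sort_nulls_last(docs, :name, direction: :desc).map { |doc| doc[:id] } # prints [1, 3, 2]
```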
query_fields = [
  Solr::Query::Request::QueryField.new(field: :name)
]
sort_fields = [
  Solr::Query::Request::Sorting::Field.new(name: :is_featured, direction: :desc),
  Solr::Query::Request::Sorting::Function.new(function: "score desc")
]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
request.sorting = Solr::Query::Request::Sorting.new(fields: sort_fields)
request.run(page: 1, page_size: 10)

query_fields = [
  Solr::Query::Request::QueryField.new(field: :name),
  Solr::Query::Request::QueryField.new(field: :category)
]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
request.grouping = Solr::Query::Request::Grouping.new(field: :category, limit: 10)
request.run(page: 1, page_size: 10)

query_fields = [
  Solr::Query::Request::QueryField.new(field: :name),
  Solr::Query::Request::QueryField.new(field: :category)
]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
request.facets = [Solr::Query::Request::Facet.new(type: :terms, field: :category, options: { limit: 10 })]
request.run(page: 1, page_size: 10)

query_fields = [
  Solr::Query::Request::QueryField.new(field: :name),
  Solr::Query::Request::QueryField.new(field: :category)
]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
request.boosting = Solr::Query::Request::Boosting.new(
  multiplicative_boost_functions: [Solr::Query::Request::Boosting::RankingFieldBoostFunction.new(field: :name)],
  phrase_boosts: [Solr::Query::Request::Boosting::PhraseProximityBoost.new(field: :category, boost_magnitude: 4)]
)
request.run(page: 1, page_size: 10)

Sometimes you want dictionary-style boosting. For example, given a hash (dictionary)
{3025 => 2.0, 3024 => 1.5, 3023 => 1.2}
and a field of category_id,
the resulting boosting function will be:
if(eq(category_id_it, 3025), 2.0, if(eq(category_id_it, 3024), 1.5, if(eq(category_id_it, 3023), 1.2, 1)))
Note that spaces were added here for readability; real Solr query functions must always be written without spaces.
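The nesting can be reproduced with a few lines of plain Ruby. This is a sketch of how such a string could be assembled, not Solrb's implementation; `dictionary_boost_function` is a hypothetical helper, and it takes the already-mapped Solr field name (`category_id_it` in the example above) as input.

```ruby
# Hypothetical helper, not part of Solrb: builds a nested if(eq(...)) function string.
# Entries are folded from last to first so the first dictionary key ends up outermost.
def dictionary_boost_function(solr_field, dictionary, default: 1)
  dictionary.to_a.reverse.reduce(default.to_s) do |fallback, (value, boost)|
    "if(eq(#{solr_field},#{value}),#{boost},#{fallback})"
  end
end

puts dictionary_boost_function('category_id_it', { 3025 => 2.0, 3024 => 1.5, 3023 => 1.2 })
# prints:
# if(eq(category_id_it,3025),2.0,if(eq(category_id_it,3024),1.5,if(eq(category_id_it,3023),1.2,1)))
```

The innermost fallback is the `default` value, so documents whose field matches no dictionary key are left with a neutral multiplier of 1.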
Example of usage:

category_id_boosts = {3025 => 2.0, 3024 => 1.5, 3023 => 1.2}
request.boosting = Solr::Query::Request::Boosting.new(
  multiplicative_boost_functions: [
    Solr::Query::Request::Boosting::DictionaryBoostFunction.new(field: :category_id,
                                                               dictionary: category_id_boosts)
  ]
)

shards_preference = Solr::Query::Request::ShardsPreference.new(
  properties: [
    Solr::Query::Request::ShardsPreferences::Property.new(name: 'replica.type', value: 'PULL')
  ]
)
request = Solr::Query::Request.new(search_term: 'term', shards_preference: shards_preference)
request.run(page: 1, page_size: 10)

query_fields = [
  Solr::Query::Request::QueryField.new(field: :name),
  Solr::Query::Request::QueryField.new(field: :category)
]
request = Solr::Query::Request.new(search_term: 'term', query_fields: query_fields)
# Solr::Query::Request will return only :id field by default.
# Specify additional return fields (fl param) by setting the request field_list
request.field_list = [:name, :category]
request.run(page: 1, page_size: 10)

# Delete by document ID
Solr.delete_by_id(3242343)
Solr.delete_by_id(3242343, commit: true)
# Delete by query
Solr.delete_by_query('*:*')
Solr.delete_by_query('*:*', commit: true)
# Delete by filters
filters = [Solr::Query::Request::Filter.new(type: :equal, field: :country, value: 'Canada')]
commands = [Solr::Update::Commands::Delete.new(filters: filters)]
commands << Solr::Update::Commands::Commit.new if commit?
request = Solr::Update::Request.new(commands)
request.run

This gem publishes events via Active Support Instrumentation.
To subscribe to solrb events, you can add this code to an initializer:
ActiveSupport::Notifications.subscribe('request.solrb') do |*args|
  event = ActiveSupport::Notifications::Event.new(*args)
  if Logger::INFO == Rails.logger.level
    Rails.logger.info("Solrb #{event.duration.round(1)}ms")
  elsif Logger::DEBUG == Rails.logger.level && Rails.env.development?
    Pry::ColorPrinter.pp(event.payload)
  end
end

You can inspect the parameters of each Solr query request made through Solrb by requiring the
solr/testing file in your test suite. The query parameters are accessible via
Solr::Testing.last_solr_request after each request.
require 'solr/testing'

RSpec.describe MyTest do
  let(:query) { Solr::Query::Request.new(search_term: 'Solrb') }

  it 'returns the last solr request params' do
    query.run(page: 1, page_size: 10)
    expect(Solr::Testing.last_solr_request.body[:params]).to eq({ ... })
  end
end

This project is set up to use CI to run all specs against a real Solr instance.
If you want to run it locally, you have several options:
- Use CircleCI CLI
- Use Docker Compose (recommended)
- Manual setup with Docker commands
# Start Solr
docker-compose -f docker-compose.single.yml up -d
# Wait for Solr to be healthy
docker-compose -f docker-compose.single.yml ps
# Create test core
# First copy the default configset to the correct location
docker exec -u 0 solrb-solr-1 sh -c "mkdir -p /var/solr/data/configsets && cp -R /opt/solr/server/solr/configsets/_default /var/solr/data/configsets/ && chown -R solr:solr /var/solr/data/configsets"
# Then create the core
curl 'http://localhost:8983/solr/admin/cores?action=CREATE&name=test-core&configSet=_default'
# Disable field guessing
curl http://localhost:8983/solr/test-core/config -d '{"set-user-property": {"update.autoCreateFields":"false"}}'
# Run specs
SOLR_URL=http://localhost:8983/solr/test-core rspec
# Clean up
docker-compose -f docker-compose.single.yml down -v

If you prefer more control or need to debug the setup, you can use the manual Docker commands:
# Start Solr
docker run -it --name test-solr -p 8983:8983/tcp -t solr:9.7.0-slim
# Copy default configset to the correct location
docker exec -u 0 test-solr sh -c "mkdir -p /var/solr/data/configsets && cp -R /opt/solr/server/solr/configsets/_default /var/solr/data/configsets/ && chown -R solr:solr /var/solr/data/configsets"
# Create a core
curl 'http://localhost:8983/solr/admin/cores?action=CREATE&name=test-core&configSet=_default'
# Disable field guessing
curl http://localhost:8983/solr/test-core/config -d '{"set-user-property": {"update.autoCreateFields":"false"}}'
# Run specs
SOLR_URL=http://localhost:8983/solr/test-core rspec