pysolr

pysolr is a lightweight Python wrapper for Apache Solr. It provides an interface that queries the server and returns results based on the query.

Status

https://secure.travis-ci.org/django-haystack/pysolr.png

Changelog

Features

  • Basic operations such as selecting, updating & deleting.
  • Index optimization.
  • "More Like This" support (if set up in Solr).
  • Spelling correction (if set up in Solr).
  • Timeout support.
  • SolrCloud awareness.

Requirements

  • Python 2.7 - 3.5
  • Requests 2.0+
  • Optional - simplejson
  • Optional - kazoo for SolrCloud mode

Installation

Run sudo python setup.py install, or drop the pysolr.py file anywhere on your PYTHONPATH.

Usage

Basic usage looks like:

# If on Python 2.X
from __future__ import print_function
import pysolr

# Setup a Solr instance. The timeout is optional.
solr = pysolr.Solr('http://localhost:8983/solr/', timeout=10)

# How you'd index data.
solr.add([
    {
        "id": "doc_1",
        "title": "A test document",
    },
    {
        "id": "doc_2",
        "title": "The Banana: Tasty or Dangerous?",
    },
])

# Later, searching is easy. In the simple case, just a plain Lucene-style
# query is fine.
results = solr.search('bananas')

# The ``Results`` object stores total results found, by default the top
# ten most relevant results and any additional data like
# facets/highlighting/spelling/etc.
print("Saw {0} result(s).".format(len(results)))

# Just loop over it to access the results.
for result in results:
    print("The title is '{0}'.".format(result['title']))

# For a more advanced query, say involving highlighting, you can pass
# additional options to Solr.
results = solr.search('bananas', **{
    'hl': 'true',
    'hl.fragsize': 10,
})

# You can also perform More Like This searches, if your Solr is configured
# correctly.
similar = solr.more_like_this(q='id:doc_2', mltfl='text')

# Finally, you can delete either individual documents...
solr.delete(id='doc_1')

# ...or all documents.
solr.delete(q='*:*')

# For SolrCloud mode, initialize your Solr like this:
zookeeper = pysolr.ZooKeeper("zkhost1:2181,zkhost2:2181,zkhost3:2181")
solr = pysolr.SolrCloud(zookeeper, "collection1")
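
The highlighting call above passes its options as **{...} rather than as ordinary keyword arguments, most likely because Solr parameter names such as hl.fragsize contain dots, which are not valid in Python identifiers. The sketch below illustrates the mechanism with a stand-in function. fake_search is hypothetical: it only mimics the shape of a search call, does not contact Solr, and is not pysolr's implementation. It assumes Python 3 for urllib.parse.

```python
from urllib.parse import urlencode


def fake_search(q, **kwargs):
    """Hypothetical stand-in for a search call (not part of pysolr).

    Collects the query plus any extra Solr parameters and URL-encodes
    them into a query string.
    """
    params = {'q': q}
    params.update(kwargs)
    return urlencode(params)


# fake_search('bananas', hl.fragsize=10) would be a syntax error, so the
# dotted parameter names are unpacked from a dict instead.
query_string = fake_search('bananas', **{'hl': 'true', 'hl.fragsize': 10})
print(query_string)
```

Any Solr parameter whose name is a valid identifier (for example rows) can still be passed as a normal keyword argument alongside the unpacked dict.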

Multicore Index

Simply point the URL to the index core:

# Setup a Solr instance. The timeout is optional.
solr = pysolr.Solr('http://localhost:8983/solr/core_0/', timeout=10)

Custom Request Handlers

# Setup a Solr instance. The trailing slash is optional.
solr = pysolr.Solr('http://localhost:8983/solr/core_0/', search_handler='/autocomplete', use_qt_param=False)

If use_qt_param is True, the handler name must exactly match what is configured in solrconfig.xml, including the leading slash if any (although Solr itself does not require a leading slash when the qt parameter is used). If use_qt_param is False (the default), the leading and trailing slashes can be omitted.

If search_handler is not specified, pysolr will default to /select.
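
As a rough illustration of the two modes described above, the sketch below builds the URL a search request might end up using. build_search_url is a hypothetical helper written for this README, not a pysolr function, and the exact URL layout is an assumption. It assumes Python 3 for urllib.parse.

```python
from urllib.parse import urlencode


def build_search_url(base_url, search_handler=None, use_qt_param=False, **params):
    """Illustrative only: return the URL a search request might use."""
    handler = 'select'  # default when no search_handler is given
    query = dict(params)
    if search_handler:
        if use_qt_param:
            # The name is passed through untouched in the qt parameter,
            # so it has to match solrconfig.xml exactly.
            query['qt'] = search_handler
        else:
            # The name becomes a path segment, so leading and trailing
            # slashes make no difference.
            handler = search_handler.strip('/')
    return '{0}/{1}/?{2}'.format(base_url.rstrip('/'), handler, urlencode(query))


base = 'http://localhost:8983/solr/core_0/'
print(build_search_url(base, q='ban'))
print(build_search_url(base, '/autocomplete', q='ban'))
print(build_search_url(base, '/autocomplete', use_qt_param=True, q='ban'))
```

In the first two calls the handler appears in the URL path (select and autocomplete respectively). In the third call the path stays at select and the handler name travels in the qt parameter, slash included.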

The handlers for MoreLikeThis, Update, Terms etc. all default to the values set in the solrconfig.xml that Solr ships with: mlt, update, terms etc. The corresponding methods of pysolr's Solr class (such as more_like_this and suggest_terms) accept a handler kwarg to override that value. This includes the search method: setting a handler in search explicitly overrides the search_handler setting (if any).
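
The precedence described above can be sketched as a small lookup. resolve_handler and DEFAULT_HANDLERS are hypothetical names used only for illustration. The sketch also assumes that the search_handler setting affects the search method only, which the text above implies but does not state outright.

```python
# Default handler names, following the values named in the text above.
DEFAULT_HANDLERS = {
    'search': 'select',
    'more_like_this': 'mlt',
    'suggest_terms': 'terms',
}


def resolve_handler(method, handler=None, search_handler=None):
    """Hypothetical helper: pick the handler a given method call would use."""
    if handler is not None:
        # An explicit per-call handler always wins.
        return handler
    if method == 'search' and search_handler is not None:
        # The instance-wide setting is assumed to apply to search only.
        return search_handler
    return DEFAULT_HANDLERS[method]


print(resolve_handler('search'))
print(resolve_handler('search', search_handler='/autocomplete'))
print(resolve_handler('search', handler='/browse', search_handler='/autocomplete'))
print(resolve_handler('more_like_this', search_handler='/autocomplete'))
```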

License

pysolr is licensed under the New BSD license.

Running Tests

The run-tests.py script will automatically perform the steps below and is recommended for testing by default unless you need more control.

Running a test Solr instance

Downloading, configuring and running Solr 4 looks like this:

./start-solr-test-server.sh 

Running the tests

On Python 2, the test suite requires the unittest2 library. On Python 3, the commands below use the standard unittest module instead:

Python 2:

python -m unittest2 tests 

Python 3:

python3 -m unittest tests 

About

Pysolr 3.2.0. The official source.
