Durable writes: a document is available in real time for get requests even before it is indexed.
This is achieved by placing a transaction log between the client and the Lucene index.
Get requests are served from this layer, which makes gets real-time even though the document is not yet indexed.
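The idea can be illustrated with a small toy model. This is only a sketch of the concept, not Solr's implementation: the class ToyIndex and its methods are hypothetical names made up for this example, and the "log" and "index" are plain in-memory dictionaries.

```python
class ToyIndex:
    """Toy model of a transaction log sitting in front of a searchable index.

    Hypothetical illustration only; this is not how Solr is implemented.
    """

    def __init__(self):
        self.tlog = {}   # pending updates, keyed by document id
        self.index = {}  # documents that are visible to search

    def add(self, doc):
        # A write lands in the log first; it is not searchable yet.
        self.tlog[doc["id"]] = doc

    def get(self, doc_id):
        # Real-time get: check the log before falling back to the index.
        if doc_id in self.tlog:
            return self.tlog[doc_id]
        return self.index.get(doc_id)

    def search(self, field, value):
        # Search only sees documents that have reached the index.
        return [d for d in self.index.values() if d.get(field) == value]

    def commit(self):
        # Move pending updates into the index, making them searchable.
        self.index.update(self.tlog)
        self.tlog.clear()


store = ToyIndex()
store.add({"id": "1", "title": "solr"})
print(store.get("1"))                  # served from the log
print(store.search("title", "solr"))   # nothing searchable before commit
store.commit()
print(store.search("title", "solr"))   # searchable after commit
```

In this model a get by id succeeds immediately after the write, while a search finds the document only after commit() has moved it into the index.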
Automatic sharding and replication with Apache Zookeeper
Since 4.0, Solr provides a cloud mode out of the box that takes care of sharding, replication, and load balancing,
and scales linearly. This is referred to as SolrCloud.
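To give a feel for what sharding means, here is a deliberately simplified sketch of hash-based document routing. It is an assumption-laden illustration, not Solr's actual router: the functions pick_shard and route are hypothetical, and CRC32 with a modulo is used only as a convenient stand-in hash.

```python
import zlib


def pick_shard(doc_id: str, num_shards: int) -> int:
    # Hash the document id and map it onto one of the shards.
    # CRC32 is a stand-in chosen for this sketch only.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def route(docs, num_shards):
    # Group document ids by the shard they are assigned to.
    shards = {i: [] for i in range(num_shards)}
    for doc in docs:
        shards[pick_shard(doc["id"], num_shards)].append(doc["id"])
    return shards


docs = [{"id": f"doc-{n}"} for n in range(10)]
print(route(docs, num_shards=3))
```

Because the shard is derived from the document id alone, the same document always lands on the same shard, and any node can work out where a document lives without a central lookup table.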
When not to use Solr
Solr (or, more generally, any search engine) is not a good fit when:
A query returns thousands of documents (for example, bootstrapping another Solr instance by querying
the current one). Search engines store fields on disk in a format from which it is easy to retrieve only a few documents; retrieving thousands at once is slow.
Many hierarchical relations are expected in the design, along with queries of the same hierarchical kind.
Document-level security is desired in Solr.
A very large-scale index needs to be built.
Solr is not recommended for very large-scale inverted indexes, such as the web-scale inverted index used by Google.
For such cases, it is better to use Hadoop MapReduce to create the indexes.
Apache Nutch is one such project: it uses Hadoop MapReduce to process web links and feeds the resulting index to Solr.