
Solr Shard Splitting


The shard-splitting commands below assume that a three-node Solr Cloud is already set up and running on the fictional IPs 10.11.100.11, 20.22.200.22 and 30.33.300.33. All servers are assumed to be running on the default Solr ports: 8983 for Solr and 9983 for ZooKeeper.

With the above assumptions, the following command creates a new collection:
http://10.11.100.11:8983/solr/admin/collections?
  action=CREATE &
  name=mycollection &
  numShards=1 &
  replicationFactor=2
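
The command above is spread over several lines with spaces around each "&" for readability; the URL actually sent to Solr has to be a single string without that whitespace. A minimal Python sketch of building it follows. The helper name collections_api_url is hypothetical, and the host and port are the fictional ones assumed in this article.

```python
from urllib.parse import urlencode

def collections_api_url(host, action, **params):
    """Build a Collections API URL with no stray whitespace around '&' or '='."""
    # Hypothetical helper; port 8983 is the default Solr port assumed in the article.
    query = urlencode({"action": action, **params})
    return f"http://{host}:8983/solr/admin/collections?{query}"

create_url = collections_api_url(
    "10.11.100.11",          # fictional IP from the article
    "CREATE",
    name="mycollection",
    numShards=1,
    replicationFactor=2,
)
print(create_url)
```

The resulting string can be pasted into a browser or fetched with urllib.request.urlopen from a machine that can reach the node.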


The number of servers required by the above command is numShards * replicationFactor = 1 * 2 = 2.
So make sure the cloud has enough servers before issuing this command.
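
The server-count rule above can be checked before issuing the command. This is only a sketch of the arithmetic from the text (one core per server); the function names are hypothetical and the server list is the fictional one used in this article.

```python
def required_servers(num_shards, replication_factor):
    """Servers needed by CREATE, using the article's rule: numShards * replicationFactor."""
    return num_shards * replication_factor

def has_enough_servers(servers, num_shards, replication_factor):
    """True when the cloud has at least as many servers as the command needs."""
    return len(servers) >= required_servers(num_shards, replication_factor)

# Fictional IPs from the article.
servers = ["10.11.100.11", "20.22.200.22", "30.33.300.33"]

print(required_servers(1, 2))             # 2
print(has_enough_servers(servers, 1, 2))  # True
print(has_enough_servers(servers, 2, 2))  # False: 4 servers would be needed
```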
When this command runs on a Solr Cloud that is large enough (server-count >= 2), the following is the result:


Note the naming of the new collection’s cores.
  1. mycollection_shard1_replica1 is the core present on one server and
  2. mycollection_shard1_replica2 is the core present on another server.
  3. But the name of the collection is still "mycollection".
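
The naming pattern in the list above can be written out as a short sketch. The pattern is inferred only from the two core names shown here, so treat it as an assumption; other Solr versions may name cores differently. The function name core_names is hypothetical.

```python
def core_names(collection, num_shards, replication_factor):
    """List core names following the <collection>_shard<N>_replica<M> pattern seen above."""
    return [
        f"{collection}_shard{shard}_replica{replica}"
        for shard in range(1, num_shards + 1)
        for replica in range(1, replication_factor + 1)
    ]

names = core_names("mycollection", 1, 2)
print(names)  # ['mycollection_shard1_replica1', 'mycollection_shard1_replica2']
```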

This is somewhat different from the default "collection1", whose core is present on all the servers and is itself named "collection1".
Also, check out the cloud configuration picture at http://localhost:8983/solr/#/~cloud:



http://wiki.apache.org/solr/SolrCloud provides a complete reference of commands and attributes.


Shard splitting

Run the following command:
http://10.11.100.11:8983/solr/admin/collections?
  action=SPLITSHARD &
  collection=mycollection &
  shard=shard1
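
To get a feel for what the split does, here is a rough sketch. It assumes the usual SPLITSHARD behavior: the parent shard's hash range is cut into two halves and the sub-shards are named by appending _0 and _1 to the parent name. The full 32-bit range for a one-shard collection is also an assumption, since the article does not state it, and the function names are hypothetical.

```python
def split_hash_range(low, high):
    """Cut an inclusive integer range into two halves."""
    mid = low + (high - low) // 2
    return (low, mid), (mid + 1, high)

def to_hex(value):
    """Render a signed 32-bit value as unsigned hexadecimal."""
    return format(value & 0xFFFFFFFF, "x")

# Assumed range of the single parent shard: the full signed 32-bit space.
parent = (-2**31, 2**31 - 1)

sub_shards = {}
for suffix, (low, high) in enumerate(split_hash_range(*parent)):
    sub_shards[f"shard1_{suffix}"] = f"{to_hex(low)}-{to_hex(high)}"

print(sub_shards)  # {'shard1_0': '80000000-ffffffff', 'shard1_1': '0-7fffffff'}
```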
If there is no error, you should see the following (after refreshing the cloud config page):



With collections like these in place, it is easy to segregate data into multiple collections. This segregation is particularly helpful when multiple teams share the same Solr Cloud. Each team can create its own collection, decide the number of shards and replicas for itself, and manage it independently (for example by shard splitting) while remaining part of the larger Solr Cloud.

For writing, collections do not talk to each other. For reading, however, it is possible to query multiple collections at once by means of a Virtual Collection: a group of collections that is read as if it were a single collection.
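
The article does not say how a virtual collection is defined. One mechanism Solr offers for reading several collections through a single name is a collection alias (the CREATEALIAS action of the Collections API), so the sketch below assumes that is what is meant. The alias name all_teams and the collection names team_a and team_b are hypothetical.

```python
from urllib.parse import urlencode

def create_alias_url(host, alias, collections):
    """Build a CREATEALIAS URL that groups several collections under one alias."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(collections),
    }
    # Keep the comma literal so the collection list stays readable.
    return f"http://{host}:8983/solr/admin/collections?{urlencode(params, safe=',')}"

alias_url = create_alias_url("10.11.100.11", "all_teams", ["team_a", "team_b"])
print(alias_url)
```

Under this assumption, queries sent to the alias name would read from every collection in the group, while writes still go to each team's own collection.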


Adding cores to a collection

Cores can be added to a collection with the following command:
http://10.11.100.11:8983/solr/admin/cores?
  action=CREATE &
  name=mycore &
  collection=mycollection &
  shard=shard1
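
Note that this request goes to /admin/cores rather than /admin/collections, and the core is created on whichever node receives the request. A minimal sketch of building the URL for a chosen node follows; the helper name add_core_url is hypothetical and the IP is the fictional one from this article.

```python
from urllib.parse import urlencode

def add_core_url(node, core_name, collection, shard):
    """Build a CoreAdmin CREATE URL aimed at the node that should host the new core."""
    params = {
        "action": "CREATE",
        "name": core_name,
        "collection": collection,
        "shard": shard,
    }
    return f"http://{node}:8983/solr/admin/cores?{urlencode(params)}"

core_url = add_core_url("10.11.100.11", "mycore", "mycollection", "shard1")
print(core_url)
```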

For the cloud config below, this creates a new core in shard1, and the new core is used by "mycollection".

Since that collection already had a functional shard leader for shard1, the new core is simply added as another replica, giving the following configuration:










Site Owner: Sachin Goyal