If I have a single Solr index running on a Core, can I split it or migrate it into 2 shards?
Yes, you can split your index into multiple shards. More info on shards can be found here: http://lucidworks.lucidimagination.com/displa...
Thanks.
Regards,
Nitin Keswani
-----Original Message-----
From: michaelsever
Sent: Friday, May 04, 2012 9:44 AM
To:
Subject: Single Index to Shards
If I have a single Solr index running on a Core, can I split it or migrate it into 2 shards?
There's no way to split an _existing_ index into multiple shards, although
some of the work on SolrCloud is considering being able to do this. You
have a couple of choices here:
1> Just reindex everything from scratch into two shards
2> delete all the docs from your index that will go into shard 2 and just
index the docs for shard 2 in your new shard
But I want to be sure you're on the right track here. You only need to shard
if your index contains "too many" documents for your hardware to produce
decent query rates. If you are getting (and I'm picking this number out
of thin air) 50 QPS on your hardware (i.e. you're not stressing memory
etc) and just want to get to 150 QPS, use replication rather than sharding.
see: http://wiki.apache.org/solr/SolrReplication
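To make the replication arithmetic above concrete, here is a rough sketch. It assumes query throughput scales linearly with the number of full-index replicas, which is an idealization; the function name is made up for illustration.

```python
import math


def replicas_needed(per_node_qps: float, target_qps: float) -> int:
    """Estimate how many full copies of the index are needed.

    Assumes each replica serves per_node_qps independently and that
    load is spread evenly across replicas (an idealization).
    """
    if per_node_qps <= 0:
        raise ValueError("per_node_qps must be positive")
    return math.ceil(target_qps / per_node_qps)


# The thin-air numbers from the message: 50 QPS today, 150 QPS wanted.
print(replicas_needed(50, 150))  # 3
```

So with those numbers, three copies of the same index would be the starting point, with no need to split the documents at all.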
Best
Erick
On Fri, May 4, 2012 at 9:44 AM, michaelsever wrote:
> If I have a single Solr index running on a Core, can I split it or migrate it
> into 2 shards?
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Single-Ind...
> Sent from the Solr - User mailing list archive at Nabble.com.
You can also make a copy of your existing index, bring it up as a second instance/core and then send delete queries to both indexes.
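A minimal sketch of the two delete queries this implies, assuming the documents can be divided by some query. The field name shard_key and its range are hypothetical; the sketch only builds the standard Solr XML delete-by-query messages and does not send them anywhere.

```python
from typing import Tuple
from xml.sax.saxutils import escape


def delete_by_query(query: str) -> str:
    """Build a Solr XML delete-by-query message for the given query."""
    return "<delete><query>" + escape(query) + "</query></delete>"


def split_deletes(keep_in_first: str) -> Tuple[str, str]:
    """Return (message for the first copy, message for the second copy).

    The first copy keeps the documents matching keep_in_first, so
    everything else is deleted from it. The second copy keeps the rest,
    so the matching documents are deleted from it.
    """
    first = delete_by_query("*:* -(" + keep_in_first + ")")
    second = delete_by_query(keep_in_first)
    return first, second


# Hypothetical split criterion: a numeric field named shard_key.
first_msg, second_msg = split_deletes("shard_key:[0 TO 499]")
print(first_msg)
print(second_msg)
```

Each message would be posted to the update handler of its own copy, followed by a `<commit/>`.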
-----Original Message-----
From: Erick Erickson
Sent: Friday, May 04, 2012 8:37 AM
To:
Subject: Re: Single Index to Shards
If you are not using SolrCloud, splitting an index is simple:
1) copy the index
2) remove what you do not want via "delete-by-query"
3) Optimize!
#2 brings up a basic design question: you have to decide which documents go to which shards. Mostly people use a value generated by a hash on the actual id, which lets you assign docs evenly.
http://wiki.apache.org/solr/UniqueKey
On Fri, May 4, 2012 at 4:28 PM, Young, Cody wrote:
> You can also make a copy of your existing index, bring it up as a second instance/core and then send delete queries to both indexes.
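The hash-on-id assignment mentioned under #2 could look roughly like the sketch below. The choice of MD5 and the function names are assumptions made for illustration; any stable hash would do, as long as the same function is used every time a document is routed.

```python
import hashlib
from collections import defaultdict


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard number in [0, num_shards).

    Uses MD5 only as a stable, well-mixed hash (not for security).
    Python's built-in hash() is avoided because it varies between runs.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def partition(doc_ids, num_shards):
    """Group document ids by the shard they are assigned to."""
    shards = defaultdict(list)
    for doc_id in doc_ids:
        shards[shard_for(doc_id, num_shards)].append(doc_id)
    return dict(shards)


# Made-up ids, split across two shards.
ids = ["doc-%d" % i for i in range(1000)]
for shard, members in sorted(partition(ids, 2).items()):
    print(shard, len(members))
```

Note that a delete-by-query cannot evaluate this hash by itself, so the shard number would presumably have to be stored in an indexed field, or the deletes issued by id.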
Oh, isn't that easier! Need more coffee before suggesting things..
Thanks,
Erick
On Fri, May 4, 2012 at 8:16 PM, Lance Norskog wrote:
> If you are not using SolrCloud, splitting an index is simple:
> 1) copy the index
> 2) remove what you do not want via "delete-by-query"
> 3) Optimize!
>
> #2 brings up a basic design question: you have to decide which
> documents go to which shards. Mostly people use a value generated by a
> hash on the actual id- this allows you to assign docs evenly.
>
> http://wiki.apache.org/solr/UniqueKey
We did it at my last job. Took a few days to split a 500M-doc index.
On Sat, May 5, 2012 at 9:55 AM, Erick Erickson wrote:
> Oh, isn't that easier! Need more coffee before suggesting things..
>
> Thanks,
> Erick