Solr Restore Space Considerations in Kubernetes #726

Closed
mchennupati opened this issue Oct 11, 2024 · 2 comments
Comments

@mchennupati

I am restoring a large index (655G) that is currently on Google Cloud Storage to a new SolrCloud-on-Kubernetes instance. I am trying to understand how much space I need to allocate to each of my node PVCs.

I am currently using the Collections API, with async, to restore a collection saved in GCS.

When I check my disk usage for /var/solr/data on each of the nodes, it looks like the output below. Each node appears to be downloading the entire index. I initially allocated 500G to each of the PVCs, but that turned out to be too little; I am now doing it with 700G.

Is this expected behaviour, or am I doing something wrong? One would have expected the backup metadata to contain enough information to download the index in parts rather than fetching 655G x 3. It has already cost me a fair bit in network charges as I retry :)

In general, how would one restore a large index? I did not find a SolrRestore CRD similar to SolrBackup in the Solr Operator CRDs.

So I ran an async job using the Solr Collections API, roughly as sketched below.
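For reference, the restore and status-check requests looked along these lines. The backup name, collection name, and request id are placeholders for my actual setup, and the repository name just mirrors the gcs-backups mount visible in the du output further down:

# Kick off the restore asynchronously against the GCS-backed repository
# (a location parameter may also be needed, depending on how the repository is configured)
curl "http://localhost:8983/solr/admin/collections?action=RESTORE&name=mybackup&collection=mycoll&repository=gcs-backups&async=restore-mycoll-1"

# Poll the async request until its state reports "completed"
curl "http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=restore-mycoll-1"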

Thanks!

/var/solr/data$ du
4 ./userfiles
4 ./backup-restore/gcs-backups/gcscredential/..2024_10_11_06_16_24.1266852566
4 ./backup-restore/gcs-backups/gcscredential
8 ./backup-restore/gcs-backups
12 ./backup-restore
4 ./filestore
4 ./mycoll_shard3_replica_n3/data/tlog
4 ./mycoll_shard3_replica_n3/data/snapshot_metadata
8 ./mycoll_shard3_replica_n3/data/index
85744132 ./mycoll_shard3_replica_n3/data/restore.20241011062904489
85744152 ./mycoll_shard3_replica_n3/data
85744160 ./mycoll_shard3_replica_n3
85744192 .
solr@mycoll-solrcloud-0:/var/solr/data$ du -sh

@gerlowskija
Contributor

(Hi @mchennupati - it looks like the formatting on your post mangled a few things, so apologies if I'm missing something.)

AFAICT your question isn't necessarily related to using the operator for restores; it's just a question about the disk and network costs of restoring a Solr collection? Assuming I've got that right, a better place to ask in the future would be our project's "user" mailing list: [email protected]. Please subscribe and ask similar questions there going forward!

To your specific question: if you're restoring data to an existing collection, Solr will have each replica fetch data from the backup repository. (So if you have three replicas each fetching a 100GB index, you'll pull 300GB from GCS.) Restores to a new collection work slightly differently, with only one replica fetching the index and then distributing it within your Solr cluster as needed. So the network impact of restores can be tuned a little bit.
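As a rough, untested sketch of that second path (the backup name, collection name, repository name, and request id are placeholders): restoring into a collection name that doesn't exist yet lets Solr create the collection as part of the restore, so only one replica fetches the index from GCS and the remaining replicas are filled in by intra-cluster replication:

# Restore into a brand-new collection; Solr creates it during the restore
# (collection-creation params such as replicationFactor can be passed on the RESTORE call)
curl "http://localhost:8983/solr/admin/collections?action=RESTORE&name=mybackup&collection=mycoll_restored&repository=gcs-backups&replicationFactor=3&async=restore-new-1"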

In terms of disk space though - ultimately all replicas of a shard will need a full copy of that shard's data, which sounds like 655GB in your case.

@mchennupati
Author

mchennupati commented Dec 10, 2024 via email
