Role Assignment for SOLR Replicas #618
Comments
This is a very good question, but ultimately it would be a fairly big change to the way that the operator works. Currently, the Solr Operator uses a single StatefulSet to manage the pods for each SolrCloud instance. The StatefulSet has a single pod spec, so pods are meant to be identical. That means we can't give different envVars (to set the node role) to different pods. One option would be to have the Solr Operator manage a SolrCloud in multiple StatefulSets, so that each StatefulSet could have a different podSpec to . This is definitely something that could be done, but it would be a fairly big undertaking. If you list your use case, and how you imagine you would want to configure the node roles, then it's something we could think about for |
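To make the multi-StatefulSet idea concrete, here is a minimal sketch, not the operator's actual design. It builds one StatefulSet-shaped dictionary per role group, each with its own `SOLR_OPTS` carrying the `solr.node.roles` startup property. The group names, the `<cloud>-solrcloud-<group>` naming scheme, the container name, and the replica counts are all assumptions made for illustration.

```python
# Hypothetical role groups; names, role strings, and replica counts are illustrative only.
ROLE_GROUPS = {
    "data": {"roles": "data:on,overseer:allowed", "replicas": 3},
    "coordinator": {"roles": "data:off,coordinator:on", "replicas": 2},
}


def stateful_set(cloud_name, group, roles, replicas):
    """Build a minimal StatefulSet-shaped dict for a single role group."""
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        # Assumed naming scheme: one StatefulSet per role group.
        "metadata": {"name": f"{cloud_name}-solrcloud-{group}"},
        "spec": {
            "replicas": replicas,
            "template": {
                "spec": {
                    "containers": [
                        {
                            # Assumed container name, for illustration.
                            "name": "solrcloud-node",
                            # All pods of this StatefulSet share this env,
                            # so the role is fixed per group, not per pod.
                            "env": [
                                {
                                    "name": "SOLR_OPTS",
                                    "value": f"-Dsolr.node.roles={roles}",
                                }
                            ],
                        }
                    ]
                }
            },
        },
    }


def build_all(cloud_name, groups=ROLE_GROUPS):
    """Return one StatefulSet dict per role group, sorted by group name."""
    return [
        stateful_set(cloud_name, group, cfg["roles"], cfg["replicas"])
        for group, cfg in sorted(groups.items())
    ]


if __name__ == "__main__":
    for sts in build_all("example"):
        env = sts["spec"]["template"]["spec"]["containers"][0]["env"][0]
        print(sts["metadata"]["name"], env["value"])
```

Because each group gets its own pod template, the same structure could also carry different CPU and memory requests per group, which a single shared pod spec cannot express.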
In SIP-14 we discuss adding a |
Thank you @janhoy and @HoustonPutman for your interest in this question. The primary idea is to create a SolrCloud environment where we maintain a large dataset across 3+ distributed Solr instances with the backup-restore feature at hand. This mechanism is experimental and we are conducting extensive tests on it. Additionally, I want to experiment with another scenario where shard leaders and replicas reside in separate, isolated instances within the cluster. In this setup, shard leaders are responsible for updates and backup-restore tasks, while replicas handle heavy user requests. However, I'm still unsure whether this is best practice, or even feasible, given the SolrCloud concept. Returning to my initial question: although my understanding of the logic behind assigning node roles is not solid at the moment, I was expecting a Solr API that assigns new roles to running instances in the cluster at runtime, since there is a Roles API that allows us to list some node-level details at the cluster level. It seems that adding a different startup flag to each Solr instance using the Solr Operator is not a trivial task, as Houston mentioned. |
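For reference, the read-only Roles API mentioned above can be queried roughly as follows. This is a sketch under assumptions: the `/api/cluster/node-roles` path and its `/node/<nodeName>` variant are given as I recall them from the Solr reference guide and should be checked against your Solr version, and the helper names are hypothetical. It only lists roles; it does not assign them.

```python
import json
import urllib.parse
import urllib.request


def node_roles_url(base_url, node=None):
    """Build the URL for listing node roles (path is an assumption; verify it)."""
    base = base_url.rstrip("/") + "/api/cluster/node-roles"
    if node is None:
        return base
    # Node names look like "host:8983_solr"; keep ":" readable in the path.
    return f"{base}/node/{urllib.parse.quote(node, safe=':')}"


def fetch_node_roles(base_url, node=None, timeout=10):
    """GET the roles listing and decode the JSON body (needs a running Solr)."""
    with urllib.request.urlopen(node_roles_url(base_url, node), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the URLs; call fetch_node_roles() against a live cluster.
    print(node_roles_url("http://localhost:8983"))
    print(node_roles_url("http://localhost:8983", "localhost:8983_solr"))
```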
I believe the role assignment is a startup property on purpose, designed to be a more static feature of a node that you don't want to be messed with at runtime. It would also be much more complex to, say, remove the "data" role from a node in-flight: would it then instantly evict all its cores, etc.? |
Why would someone mess things up if they have the right API calls to arrange their cluster's node roles before populating any data and releasing it to production? Apart from this, one can also mess things up with some of the existing Solr APIs. For instance, the Restore API lets us restore a backup even if there is a collection with the same name in the cluster. My point is that, whichever approach is taken, you might need to do some pre-checks to avoid messing up the whole cluster.
So what happens "if I restart a |
There was a fairly thorough discussion of the design here: https://lists.apache.org/thread/39q48fbtcxcl9btg24twj9bwowlro16d - perhaps you will find the reasoning behind that particular decision there. All in all, it would be great to have support in the operator! |
Indeed, it looks like a long, back-and-forth discussion, and I also noticed that some of you shared my opinion. Well, at this point, it seems you have already reached a conclusion about this SIP; what is done is done. However, it would be excellent if you considered another option for defining an instance's role, alongside envVars, in the future. Regarding the main question, we'll likely have to rely on the operator to support specific envVars declarations in the |
When do you think this enhancement could be part of one of the next releases, @HoustonPutman? |
Maybe for v0.9.0. Using the roles API, it would be more feasible. We already do some defaulting when Pods are created (no API calls though). So it is feasible to add another defaulting method, which works with a readinessCondition. Basically when a pod is started, the How do you imagine this would be integrated into the SolrCloud CRD? |
Sorry that I totally missed your message. I'm unsure whether this could be applicable from the implementation perspective. Would it be possible to introduce another handy CRD for custom Solr settings that interacts with the SolrCloud CRD at pod startup? |
Having this would really help our scenario as well. We have very heavy indexing, which requires completely different CPU and RAM compared to our search. Having the ability to spin up ingestion nodes and search nodes with different specs would be really helpful @HoustonPutman |
Hey there! I wanted to ask whether this could technically work for this problem
This would of course be a breaking change; it could also be a separate CRD (e.g. SolrCloudWithRoles or so) |
Hi Team!
I was wondering whether it is possible to assign different roles to replicas, each of which is essentially a Solr node, using Solr's Node Roles feature?