Design document to improve the SQL storage performance by using a bulk save API #194
Conversation
Updated bulksave.md with the Spinnaker issue number.
I'm unclear about when this would be used. Are we changing Deck to save multiple pipelines, and if so, when does that occur?
@robzienert Hi there! @sanopsmx, his team, and Salesforce are working together on this new feature. Basically, we templatize all our pipelines, and we have macros that could apply to all our pipelines (over 100K). Therefore, we need to update them all. This is why we need a bulk API to speed up the writing of the pipelines. If we do not have this feature, we need to update one pipeline at a time, which makes the process extremely slow; one update could take over 5 hours for those 100K pipelines. Let me know if I can address more questions.
I added a comment on one of the PRs (I believe), but there's some prior art in this area. On the other hand, I think with this intended volume of discrete pipelines you're trying to fit a square peg into a round hole. What some folks have done on our side is build a pipeline generator outside of Spinnaker. The generator itself is parameterized and generates pipeline execution JSON that is fed into Spinnaker.
com.netflix.spinnaker.gate.controllers.PipelineController.bulksavePipeline(Body pipeline..) |
Not sure why we need a dedicated endpoint for this ... can it not just be submitted asynchronously as a task and polled? We should avoid taskService.createAndWaitForCompletion() wherever possible.
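For reference, a minimal sketch of the submit-and-poll pattern being suggested here; the TaskService/Task types are hypothetical stand-ins for Gate's task API, not its real signatures:

import java.util.Map;

// Sketch only: submit-and-poll instead of taskService.createAndWaitForCompletion().
// All types here are hypothetical stand-ins for Gate's task API.
interface TaskService {
  Task create(Map<String, Object> operation); // returns immediately with a task reference
  Task get(String taskId);                    // fetches the current task state
}

interface Task {
  String getId();
  String getStatus(); // e.g. "RUNNING", "SUCCEEDED", "TERMINAL"
}

class BulkSaveSubmitter {
  private final TaskService taskService;

  BulkSaveSubmitter(TaskService taskService) {
    this.taskService = taskService;
  }

  // Kick off the bulk save and hand the task id back to the client immediately.
  String submit(Map<String, Object> operation) {
    return taskService.create(operation).getId();
  }

  // The client (e.g. Deck) polls this until the task reaches a terminal state.
  boolean isComplete(String taskId) {
    String status = taskService.get(taskId).getStatus();
    return status.equals("SUCCEEDED") || status.equals("TERMINAL");
  }
}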
We kept the dedicated endpoint for a REST call. We were able to submit 3000+ pipelines through a JSON payload and got a response time of ~2 minutes. Hence, we made the call synchronous. It can be called from Deck as well.
We are adding only two REST endpoints: one in Gate and the other in Front50. We are not adding any REST endpoint in Orca.
What happens if there are SpEL expressions? I believe the current SavePipelineTask deals with this by having the pipeline contents be Base64 encoded.
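A minimal sketch of that Base64 approach, under the assumption that the raw pipeline JSON is encoded before entering the execution context (where SpEL would otherwise expand ${...} expressions) and decoded inside the save task:

import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch only: shield pipeline JSON from SpEL evaluation by Base64-encoding it
// before it enters the stage context, and decoding it again inside the task.
class PipelineEncoding {
  static String encode(String pipelineJson) {
    return Base64.getEncoder()
        .encodeToString(pipelineJson.getBytes(StandardCharsets.UTF_8));
  }

  static String decode(String encoded) {
    return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // The ${...} expression survives verbatim because the evaluator only ever
    // sees an opaque Base64 string.
    String json = "{\"name\":\"pipe1001\",\"trigger\":\"${parameters.env}\"}";
    String safe = encode(json);
    System.out.println(decode(safe));
  }
}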
Response response = front50Service.savePipelineList(List<pipeline>); |
savePipelineList would be better as savePipelines.
done
com.netflix.spinnaker.front50.controllers.OperationsController.bulksave( |
I don't know what OperationsController is; I can only assume it's actually going to call the PipelineController.batchUpdate() endpoint and we'll be handling whether it's admin vs. non-admin?
Caveat that the batchUpdate() endpoint wasn't expected to be exposed and lacks the validation provided by save().
It should be PipelineController and not OperationsController. Corrected the document too.
To repeat my earlier comment: if the intention is to batch update 10k+ pipelines, I don't think this is the right approach.
Agree with @ajordens. My current vote is -1.
This proposal is written to support a use case which I think is misusing the system; a square peg through a round hole as mentioned before. It's unclear to me how this change would be beneficial to the community as a whole.
This seems like something I would rather see an RFC for: adding extension points so you can do whatever you want. In that case, you already have support for creating a new bulk pipeline save stage in Orca, and then you'd just need some extension points in Front50 to service this request.
This being said, I do think the underlying use case is approaching the problem incorrectly.
Are you suggesting multipart/form-data files for batch updating 10k+ pipelines?
Re: use case fit: I don't believe it's unreasonable to expect a pipeline orchestration system to scale to support a large number of actual pipelines, which may be created relatively dynamically. I'm concerned about cases that we considered critical to our choice of Spinnaker as a pipeline orchestration system being considered odd or abnormal. I would expect some degree of flexibility in how users use the platform. I don't see bulk creation of pipelines as an unusual pattern for a business with a very large footprint.
First and foremost, Spinnaker is a platform that provides a set of building blocks. In many cases, complete solutions can be built from these building blocks. Investments have been made towards supporting extensibility for use cases that, at present, may fall outside of what's natively provided. Speaking as a Netflix engineer, we've been using these points of extension for many years, as have other large enterprises in the community. They are intended specifically for situations like this where a targeted solution is needed quickly. Ignoring the merits of this RFC for a moment, the proposed PRs backing this RFC need significant improvement before they could be considered mergeable.
Supporting pipeline CRUD operations should be one of the core functions of Spinnaker as a platform. In many deployments, pipelines are managed with templates; pipeline JSON structures are generated from these templates based on enterprise requirements and then saved to Spinnaker using the API. This operation occurs frequently enough that there is a requirement for efficiency in saving bulk pipelines in Spinnaker. Changes to pipelines can occur in a Spinnaker deployment for a number of reasons.
Applying changes one pipeline at a time with the currently available API takes hours to complete. Exposing the batchUpdate API currently available in Front50 can perform an order of magnitude faster. It seems to me that saving pipelines, whether one at a time or in bulk, should belong to the Spinnaker core platform.
Let's step back a bit. I'm not against the idea of a bulk API, but the particular implementation details called out in this proposal were problematic. Proposals that have accompanying implementation details are challenging in that the very literal implementation often distracts from the underlying idea. In the future, a governance RFC like this would benefit from focusing on the idea, motivations, alternatives, and any potential problems that might fall out, and less on "it's going to change this file, etc.".
We do need a bulk method of saving multiple pipelines in one go.
You mentioned extensions in your previous comments. Would a plugin approach to saving multiple pipelines work here?
Raised a new PR; hence, closing this PR.
Improve the SQL storage performance by using a bulk save API
Overview
spinnaker/spinnaker#6147
This is valid only for SQL storage.
When we have multiple pipelines to save, we currently need to call the Front50 save API serially, one call after the other, which limits the performance of the Front50 save API.
We improve Front50 save performance by using a bulk insert query. This saves many pipelines at once, which greatly improves the performance of Front50.
We need to write a bulk save API that accepts a list of pipelines in a single REST API call and saves them in Front50 using a bulk insert query.
Goals and Non-Goals
Goal: provide a bulk save API that accepts a list of pipelines in a single REST API call and saves them in Front50 using a bulk insert query, greatly improving Front50 save performance.
Non-goal: storage backends other than SQL are out of scope.
Motivation and Rationale
To improve the performance of saving multiple pipelines at once by using a bulk insert SQL query.
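The performance claim rests on replacing one INSERT round trip per pipeline with a single batched statement. A minimal JDBC sketch of the idea; the pipelines table and its columns are hypothetical, not Front50's actual schema:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Sketch only: one batched INSERT instead of one statement per pipeline.
class BulkInsertSketch {
  record PipelineRow(String id, String application, String body) {}

  static void saveAll(Connection conn, List<PipelineRow> rows) throws SQLException {
    String sql = "INSERT INTO pipelines (id, application, body) VALUES (?, ?, ?)";
    try (PreparedStatement ps = conn.prepareStatement(sql)) {
      for (PipelineRow row : rows) {
        ps.setString(1, row.id());
        ps.setString(2, row.application());
        ps.setString(3, row.body());
        ps.addBatch();   // accumulate rows instead of executing one at a time
      }
      ps.executeBatch(); // a single round trip for the whole batch
    }
  }
}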
Timeline
Already Implemented
Design
Deck calls the Gate API:
1.1 url: http://<gate-url>:8084/pipelines/bulksave
method: POST
payload: [{"keepWaitingPipelines":false,"limitConcurrent":true,"application":"apptest","spelEvaluator":"v4",
"lastModifiedBy":"admin","name":"pipeline_test","stages":[{"refId":"1","requisiteStageRefIds":[],
"type":"wait","name":"Wait","waitTime":30}],"index":3,"id":"66a5c77b-3998-495d-ad61-117313c5800c",
"triggers":[],"updateTs":"1599652266000"},{"keepWaitingPipelines":false,"limitConcurrent":true,
"application":"apptest","spelEvaluator":"v4","lastModifiedBy":"admin","name":"pipeline_test1",
"stages":[{"refId":"1","requisiteStageRefIds":[],"type":"wait","name":"Wait","waitTime":30}],
"index":4,"id":"66a5c77b-3998-495d-ad61-117313c5800c","triggers":[],"updateTs":"1599652266000"}......]
Gate API flow:
2.1 The POST request sent by Deck will hit the PipelineController REST controller endpoint.
2.2 TaskService will create a new task (synchronous job) for saving the pipelines.
taskService.createAndWaitForCompletion(operation)
2.3 The above new task will call the Orca service POST API to save the pipelines.
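A rough sketch of what steps 2.1-2.3 could look like on the Gate side; the controller shape and operation payload keys are assumptions based on the endpoint and task flow described above:

import java.util.List;
import java.util.Map;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

// Sketch only: the rough shape of steps 2.1-2.3 on the Gate side.
// TaskService and the operation keys are assumptions, not Gate's exact API.
@RestController
class PipelineBulkSaveController {
  interface TaskService {
    Map<String, Object> createAndWaitForCompletion(Map<String, Object> operation);
  }

  private final TaskService taskService;

  PipelineBulkSaveController(TaskService taskService) {
    this.taskService = taskService;
  }

  @PostMapping("/pipelines/bulksave")
  Map<String, Object> bulksavePipeline(@RequestBody List<Map<String, Object>> pipelines) {
    // 2.2: wrap the pipeline list in a single save operation and run it as a
    // synchronous task; 2.3: the resulting task posts to Orca to do the save.
    Map<String, Object> operation = Map.of(
        "description", "Bulk save pipelines",
        "job", List.of(Map.of("type", "savePipeline", "pipelines", pipelines)));
    return taskService.createAndWaitForCompletion(operation);
  }
}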
Orca API flow:
3.1 The POST request sent by Gate will hit the OperationsController REST controller endpoint.
3.2 The above method will start a new task with QueueExecutionRunner.kt.
3.3 The QueueExecutionRunner will call StartExecutionHandler.kt.
3.4 StartExecutionHandler.kt will call SavePipelineTask.java.
3.5 SavePipelineTask will call the Front50 REST API.
Response response = front50Service.savePipelineList(List<pipeline>);
url: http://<front50-url>:8080/pipelines/bulksave
method: POST
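A sketch of step 3.5; Front50Service here is a simplified stand-in for Orca's Retrofit client, reduced to the single bulk call:

import java.util.List;
import java.util.Map;

// Sketch only: step 3.5, handing the whole list to Front50 in one call.
// Front50Service is a simplified stand-in for Orca's Retrofit client.
class SavePipelineTaskSketch {
  interface Front50Service {
    Map<String, Object> savePipelineList(List<Map<String, Object>> pipelines);
  }

  private final Front50Service front50Service;

  SavePipelineTaskSketch(Front50Service front50Service) {
    this.front50Service = front50Service;
  }

  Map<String, Object> execute(List<Map<String, Object>> pipelines) {
    // One REST call for the whole batch instead of one call per pipeline.
    return front50Service.savePipelineList(pipelines);
  }
}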
Fiat API flow:
4.1 Front50 will check with Fiat for permissions.
4.2 storeWholePermission() will check that the permissions object is not null.
4.3 hasPermission() will check whether the particular resource (APPLICATION) has WRITE permission to save or not.
4.4 The authorizationSupport.hasRunAsUserPermission() check happens in Front50, which calls Fiat to check whether the particular resource (APPLICATION) has EXECUTE permission or not.
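A sketch of how the per-application checks in 4.2-4.4 could apply to a bulk payload; PermissionEvaluator is a hypothetical stand-in for the Fiat client:

import java.util.List;
import java.util.Map;

// Sketch only: the Fiat checks in steps 4.2-4.4 applied per pipeline.
// PermissionEvaluator is a hypothetical stand-in for the Fiat client.
class BulkSaveAuthorizer {
  interface PermissionEvaluator {
    boolean hasPermission(String user, String application, String authorization);
  }

  private final PermissionEvaluator fiat;

  BulkSaveAuthorizer(PermissionEvaluator fiat) {
    this.fiat = fiat;
  }

  // Keep only the pipelines the caller may save: WRITE covers the save itself,
  // EXECUTE covers the runAsUser check in 4.4.
  List<Map<String, Object>> authorized(String user, List<Map<String, Object>> pipelines) {
    return pipelines.stream()
        .filter(p -> fiat.hasPermission(user, (String) p.get("application"), "WRITE")
            && fiat.hasPermission(user, (String) p.get("application"), "EXECUTE"))
        .toList();
  }
}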
Front50 API flow:
5.1 The POST request sent by Orca will hit the PipelineController REST controller endpoint.
5.2 PipelineDAO will then save the pipeline list and return the response.
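A sketch of step 5.2 and the response shape described under "Output" below; PipelineDAO.bulkImport is an assumption standing in for a DAO method backed by the batched INSERT shown earlier:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch only: step 5.2 in Front50. PipelineDAO.bulkImport is an assumed DAO
// method backed by the batched INSERT shown earlier.
class Front50BulkSaveSketch {
  interface PipelineDAO {
    void bulkImport(List<Map<String, Object>> pipelines);
  }

  private final PipelineDAO pipelineDAO;

  Front50BulkSaveSketch(PipelineDAO pipelineDAO) {
    this.pipelineDAO = pipelineDAO;
  }

  // Save everything in one DAO call and report counts in the "Output" shape below.
  Map<String, Object> bulksave(List<Map<String, Object>> pipelines) {
    List<Map<String, Object>> failed = new ArrayList<>();
    try {
      pipelineDAO.bulkImport(pipelines);
    } catch (RuntimeException e) {
      // Simplistic: a real implementation would record per-pipeline failures.
      failed.addAll(pipelines);
    }
    return Map.of(
        "Successful", pipelines.size() - failed.size(),
        "Failed", failed.size(),
        "Failed_list", failed);
  }
}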
Usage:
The bulk API is invoked using Gate. The invocation is:
curl -X POST -H "Content-type: application/json" -d '[{
"keepWaitingPipelines": false,
"limitConcurrent": true,
"application": "test_004",
"spelEvaluator": "v4",
"name": "pipe1001",
"stages": [
{
"requisiteStageRefIds": [],
"name": "Wait",
"refId": "1",
"type": "wait",
"waitTime": 6
}
],
"index": 1000,
"triggers": []
},
{
"keepWaitingPipelines": false,
"limitConcurrent": true,
"application": "test_005",
"spelEvaluator": "v4",
"name": "pipe1002",
"stages": [
{
"requisiteStageRefIds": [],
"name": "Wait",
"refId": "1",
"type": "wait",
"waitTime": 6
}
],
"index": 1001,
"triggers": []
}]' http://<gate-url>:8084/pipelines/bulksave
Output:
{
  "Successful": <number of pipelines saved successfully>,
  "Failed": <number of pipelines that failed to save>,
  "Failed_list": [<array of failed pipelines (application, pipeline name, etc.) and the error message>]
}
Assumptions:
This solution works only for an RDBMS data store.
Pipeline save tasks run in the context of an application, irrespective of the content of the pipeline.
In the case of bulk save, we will use the application name "bulk save".
The implication is that the tasks for bulk save will not show up under the tasks of their respective applications.
Sample JSON payload:
An array of valid Spinnaker pipeline jsons.
Example:
[
{
"keepWaitingPipelines": false,
"limitConcurrent": true,
"application": "test_004",
"spelEvaluator": "v4",
"name": "pipe1001",
"stages": [
{
"requisiteStageRefIds": [],
"name": "Wait",
"refId": "1",
"type": "wait",
"waitTime": 6
}
],
"index": 1000,
"triggers": []
},
{
"keepWaitingPipelines": false,
"limitConcurrent": true,
"application": "test_005",
"spelEvaluator": "v4",
"name": "pipe1002",
"stages": [
{
"requisiteStageRefIds": [],
"name": "Wait",
"refId": "1",
"type": "wait",
"waitTime": 6
}
],
"index": 1001,
"triggers": []
},
{
"keepWaitingPipelines": false,
"limitConcurrent": true,
"application": "test_006",
"spelEvaluator": "v4",
"name": "pipe1003",
"stages": [
{
"requisiteStageRefIds": [],
"name": "Wait",
"refId": "1",
"type": "wait",
"waitTime": 6
}
],
"index": 1002,
"triggers": []
}
]