Commit d33be52

ani1311 and aalur authored
Feature: Make slack request rate configurable (#42)
If multiple pods of this app are deployed, together they will not respect the Slack API request rate of 1 RPS. To have multiple pods of this app use the same Slack bot token, we need to make SlackRequestRPS configurable.

Co-authored-by: aalur <[email protected]>
1 parent 001cd36 commit d33be52

4 files changed

+108
-52
lines changed

README.md

+27-24
@@ -19,6 +19,7 @@ Many applications posting messages to Slack either overlook Slack's rate limits
 By being a 1:1 forwarding proxy, you simply POST to this application instead, and it will get forwarded to Slack.
 
 Furthermore, by adding observability, we can have a much clearer picture of:
+
 - Requests per second
 - To which channel?
 - Are there failures and at what rate?
@@ -29,7 +30,6 @@ These type of insights are currently not possible to know via Slack, and only vi
 
 We don't try to 'mock' the Slack API. We make a fair assumption that the message you post to the proxy **is already tested** and meets the API spec. In other words, if you got a new (custom) application where you are testing the API, I would highly recommend you do that to Slack directly. Once you have 'battletested' your implementation, you then simply convert the URL to this proxy and gain out of the box retries and rate limit behaviour with included metrics.
 
-
 ## Features
 
 ### SlackProxy Metrics
@@ -65,7 +65,6 @@ The `slackproxy` service provides several metrics to monitor and gauge the perfo
 - Metric: `slackproxy_queue_size`
 - Description: The current size of the proxy's queue.
 
-
 ### Queue
 
 Monitor the queue size with the `slackproxy_queue_size` metric. This isn't a persistent queue. If the application crashes abruptly, the queue is lost. However, during a clean application shutdown, the queue processes, given adequate time. If, for instance, there's a prolonged Slack outage or if you face an outage, the queue might be lost. While the queue size is configurable, remember that the processing rate is a maximum of 1 message per second. If the queue consistently reaches its limit, consider horizontal scaling.
@@ -87,10 +86,10 @@ Permanent errors are logged in detail, including the complete POST request. Conc
 - How to run multiple replicas with each their own API key?
 - Add some basic sanity check if the basics are part of the request (channel, some body, etc)
 
-
 ## Slack Application manifest
 
 This manifest is required when making an application that can:
+
 - Use a single token
 - Post to any (public) channel
 - Change it's name
@@ -121,41 +120,45 @@ settings:
 
 ### Required
 
-- `--token` : Bearer token for the Slack API.
-- Example: `--token=YOUR_BEARER_TOKEN`
+- `--token` : Bearer token for the Slack API.
+- Example: `--token=YOUR_BEARER_TOKEN`
 
-### Optional
+### Optional
 
 > I would recommend not altering these values until you have a good understanding how it performs for your workload
 
 - `--maxRetries` : Maximum number of retries for posting a message.
-- Default: *`3`*
-- Example: `--maxRetries=5`
+- Default: *`3`*
+- Example: `--maxRetries=5`
 
 - `--initialBackoffMs` : Initial backoff in milliseconds for retries.
-- Default: *`1000`*
-- Example: `--initialBackoffMs=2000`
+- Default: *`1000`*
+- Example: `--initialBackoffMs=2000`
 
 - `--slackURL` : Slack Post Message API URL.
-- Default: *`https://slack.com/api/chat.postMessage`*
-- Example: `--slackURL=https://api.slack.com/your-endpoint`
+- Default: *`https://slack.com/api/chat.postMessage`*
+- Example: `--slackURL=https://api.slack.com/your-endpoint`
 
 - `--queueSize` : Maximum number of messages in the queue.
-- Default: *`100`*
-- Example: `--queueSize=200`
+- Default: *`100`*
+- Example: `--queueSize=200`
 
 - `--burst` : Maximum number of burst messages to allow.
-- Default: *`3`*
-- Example: `--burst=2`
+- Default: *`3`*
+- Example: `--burst=2`
 
 - `--metricsPort` : Port used for the `/metrics` endpoint
-- Default: *`:9090`*
-- Example: `--metricsPort :9090`
+- Default: *`:9090`*
+- Example: `--metricsPort :9090`
 
 - `--applicationPort` : Port used for the application endpoint (where you send your requests to)
-- Default: *`:8080`*
-- Example: `--applicationPort :8080`
-
-- `--channelOverride` : Override on sending _all_ messages to this defined channel. This is useful for debugging or if you want to force to use a single channel
-- Default: *``*
-- Example: `--channelOverride #debug-notifications`
+- Default: *`:8080`*
+- Example: `--applicationPort :8080`
+
+- `--channelOverride` : Override on sending *all* messages to this defined channel. This is useful for debugging or if you want to force to use a single channel
+- Default: *``*
+- Example: `--channelOverride #debug-notifications`
+
+- `--slackRequestRate` : Request rate for slack requests in milliseconds.
+- Default: *`1000`*
+- Example: `--slackRequestRate=500`

app.go

+11-9
@@ -146,12 +146,14 @@ func (s *SlackClient) PostMessage(request SlackPostMessageRequest, url string, t
 	return nil
 }
 
-func NewApp(queueSize int, httpClient *http.Client, metrics *Metrics, channelOverride string) *App {
+func NewApp(queueSize int, httpClient *http.Client, metrics *Metrics, channelOverride, slackPostMessageURL, slackToken string) *App {
 	return &App{
-		slackQueue:      make(chan SlackPostMessageRequest, queueSize),
-		messenger:       &SlackClient{client: httpClient},
-		metrics:         metrics,
-		channelOverride: channelOverride,
+		slackQueue:          make(chan SlackPostMessageRequest, queueSize),
+		messenger:           &SlackClient{client: httpClient},
+		SlackPostMessageURL: slackPostMessageURL,
+		SlackToken:          slackToken,
+		metrics:             metrics,
+		channelOverride:     channelOverride,
 	}
 }
 
@@ -162,11 +164,11 @@ func (app *App) Shutdown() {
 }
 
 //nolint:gocognit // but could probably use a refactor.
-func (app *App) processQueue(ctx context.Context, maxRetries int, initialBackoffMs int, slackPostMessageURL string, tokenFlag string, burst int) {
+func (app *App) processQueue(ctx context.Context, maxRetries int, initialBackoff time.Duration, burst int, slackRequestRate time.Duration) {
 	// This is the rate limiter, which will block until it is allowed to continue on r.Wait(ctx).
 	// I kept the rate at 1 per second, as doing more than that will cause Slack to reject the messages anyways. We can burst however.
 	// Do note that this is best effort, in case of failures, we will exponentially backoff and retry, which will cause the rate to be lower than 1 per second due to obvious reasons.
-	r := rate.NewLimiter(rate.Every(1*time.Second), burst)
+	r := rate.NewLimiter(rate.Every(slackRequestRate), burst)
 
 	for {
 		select {
@@ -204,7 +206,7 @@ func (app *App) processQueue(ctx context.Context, maxRetries int, initialBackoff
 					}
 				}
 
-				err := app.messenger.PostMessage(msg, slackPostMessageURL, tokenFlag)
+				err := app.messenger.PostMessage(msg, app.SlackPostMessageURL, app.SlackToken)
 				//nolint:nestif // but simplify by not having else at least.
 				if err != nil {
 					retryable, pause, description := CheckError(err.Error())
@@ -232,7 +234,7 @@ func (app *App) processQueue(ctx context.Context, maxRetries int, initialBackoff
 
 					if retryCount < maxRetries {
 						retryCount++
-						backoffDuration := time.Duration(initialBackoffMs*int(math.Pow(2, float64(retryCount-1)))) * time.Millisecond
+						backoffDuration := initialBackoff * time.Duration(math.Pow(2, float64(retryCount-1)))
 						time.Sleep(backoffDuration)
 					} else {
 						log.S(log.Error, "Message failed after retries", log.Any("err", err), log.Int("retryCount", retryCount))

app_test.go

+57-10
@@ -30,15 +30,17 @@ func TestApp_singleBurst_Success(t *testing.T) {
 
 	messenger := &MockSlackMessenger{}
 	app := &App{
-		slackQueue: make(chan SlackPostMessageRequest, 2),
-		messenger:  messenger,
-		metrics:    metrics,
+		slackQueue:          make(chan SlackPostMessageRequest, 2),
+		messenger:           messenger,
+		metrics:             metrics,
+		SlackPostMessageURL: "http://mock.url",
+		SlackToken:          "mockToken",
 	}
 
 	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
 	defer cancel()
 
-	go app.processQueue(ctx, 3, 1000, "http://mock.url", "mockToken", 1)
+	go app.processQueue(ctx, 3, 1000, 1, 1000)
 
 	startTime := time.Now()
 
@@ -59,7 +61,7 @@ func TestApp_singleBurst_Success(t *testing.T) {
 	diffInSeconds := endTime.Sub(startTime).Seconds()
 	log.S(log.Debug, "diffInSeconds", log.Float64("diffInSeconds", diffInSeconds))
 
-	// The sum is always: (Amount of messages * delay in seconds) minus burst. In this case 10 * 1 - 1 = 9 seconds.
+	// The sum is always: (Amount of messages * RPS * delay in seconds) minus burst. In this case 20 * 1 - 10 = 10 seconds.
 	if math.RoundToEven(diffInSeconds) != 9 {
 		t.Fatal("Expected processQueue finish the job in ~9 seconds, give or take. Got", diffInSeconds)
 	}
@@ -71,15 +73,17 @@ func TestApp_MultiBurst_Success(t *testing.T) {
 
 	messenger := &MockSlackMessenger{}
 	app := &App{
-		slackQueue: make(chan SlackPostMessageRequest, 2),
-		messenger:  messenger,
-		metrics:    metrics,
+		slackQueue:          make(chan SlackPostMessageRequest, 2),
+		messenger:           messenger,
+		metrics:             metrics,
+		SlackPostMessageURL: "http://mock.url",
+		SlackToken:          "mockToken",
 	}
 
 	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
 	defer cancel()
 
-	go app.processQueue(ctx, 3, 1000, "http://mock.url", "mockToken", 10)
+	go app.processQueue(ctx, 3, 1000, 10, 1000)
 
 	startTime := time.Now()
 
@@ -100,8 +104,51 @@ func TestApp_MultiBurst_Success(t *testing.T) {
 	diffInSeconds := endTime.Sub(startTime).Seconds()
 	log.S(log.Debug, "diffInSeconds", log.Float64("diffInSeconds", diffInSeconds))
 
-	// The sum is always: (Amount of messages * delay in seconds) minus burst. In this case 20 * 1 - 10 = 10 seconds.
+	// The sum is always: (Amount of messages * RPS * delay in seconds) minus burst. In this case 20 * 1 - 10 = 10 seconds.
 	if math.RoundToEven(diffInSeconds) != 10 {
 		t.Fatal("Expected processQueue finish the job in ~9 seconds, give or take. Got", diffInSeconds)
 	}
 }
+
+func TestApp_TestSlackRequestRate(t *testing.T) {
+	r := prometheus.NewRegistry()
+	metrics := NewMetrics(r)
+
+	messenger := &MockSlackMessenger{}
+	app := &App{
+		slackQueue:          make(chan SlackPostMessageRequest, 2),
+		messenger:           messenger,
+		metrics:             metrics,
+		SlackPostMessageURL: "http://mock.url",
+		SlackToken:          "mockToken",
+	}
+
+	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+	defer cancel()
+
+	go app.processQueue(ctx, 3, 1000, 1, 250)
+
+	startTime := time.Now()
+
+	count := 20
+	for i := 0; i < count; i++ {
+		app.wg.Add(1)
+		app.slackQueue <- SlackPostMessageRequest{
+			Channel: "mockChannel",
+		}
+	}
+
+	log.S(log.Debug, "Posting messages done")
+
+	app.wg.Wait()
+
+	endTime := time.Now()
+
+	diffInSeconds := endTime.Sub(startTime).Seconds()
+	log.S(log.Debug, "diffInSeconds", log.Float64("diffInSeconds", diffInSeconds))
+
+	// The sum is always: (Amount of messages * RPS * delay in seconds) minus burst. In this case 20 * 4 * 1 - 10 = 5 seconds.
+	if math.RoundToEven(diffInSeconds) != 5 {
+		t.Fatal("Expected processQueue finish the job in ~5 seconds, give or take. Got", diffInSeconds)
+	}
+}

main.go

+13-9
@@ -48,11 +48,13 @@ type SlackPostMessageRequest struct {
 }
 
 type App struct {
-	slackQueue      chan SlackPostMessageRequest
-	wg              sync.WaitGroup
-	messenger       SlackMessenger
-	metrics         *Metrics
-	channelOverride string
+	slackQueue          chan SlackPostMessageRequest
+	wg                  sync.WaitGroup
+	messenger           SlackMessenger
+	SlackPostMessageURL string
+	SlackToken          string
+	metrics             *Metrics
+	channelOverride     string
 }
 
 // podIndex retrieves the index of the current pod based on the HOSTNAME environment variable.
@@ -91,18 +93,20 @@ func getSlackTokens() []string {
 func main() {
 	var (
 		maxRetries          = 2
-		initialBackoffMs    = 1000
+		initialBackoff      = 1000 * time.Millisecond
 		slackPostMessageURL = "https://slack.com/api/chat.postMessage"
 		maxQueueSize        = 100
 		burst               = 3
 		metricsPort         = ":9090"
 		applicationPort     = ":8080"
 		channelOverride     string
+		slackRequestRate    = 1000 * time.Millisecond
 	)
 
 	// Define the flags with the default values // TODO: move the ones that can change to dflag
 	flag.IntVar(&maxRetries, "maxRetries", maxRetries, "Maximum number of retries for posting a message")
-	flag.IntVar(&initialBackoffMs, "initialBackoffMs", initialBackoffMs, "Initial backoff in milliseconds for retries")
+	flag.Duration("initialBackoffMs", initialBackoff, "Initial backoff in milliseconds for retries")
+	flag.Duration("slackRequestRate", slackRequestRate, "Rate limit for slack requests in milliseconds")
 	flag.StringVar(&slackPostMessageURL, "slackURL", slackPostMessageURL, "Slack Post Message API URL")
 	flag.IntVar(&maxQueueSize, "queueSize", maxQueueSize, "Maximum number of messages in the queue")
 	flag.IntVar(&burst, "burst", burst, "Maximum number of burst to allow")
@@ -142,7 +146,7 @@ func main() {
 	// Initialize the app, metrics are passed along so they are accessible
 	app := NewApp(maxQueueSize, &http.Client{
 		Timeout: 10 * time.Second,
-	}, metrics, channelOverride)
+	}, metrics, channelOverride, slackPostMessageURL, token)
 
 	log.Infof("Starting metrics server.")
 	StartMetricServer(r, metricsPort)
@@ -156,7 +160,7 @@ func main() {
 	defer serverCancel()
 
 	log.Infof("Starting main app logic")
-	go app.processQueue(ctx, maxRetries, initialBackoffMs, slackPostMessageURL, token, burst)
+	go app.processQueue(ctx, maxRetries, initialBackoff, burst, slackRequestRate)
 	log.Infof("Starting receiver server")
 	// Check error return of app.StartServer in go routine anon function:
 	go func() {