Add the forwarding server with updated documentation
lolipopshock committed Apr 10, 2024
1 parent 69b2138 commit 9601e7a
Showing 2 changed files with 38 additions and 0 deletions.
3 changes: 3 additions & 0 deletions README.md
@@ -139,6 +139,9 @@ It uses the `scripts/evaluation/generic/deferral_generate.sh` script with three
2. Evaluate the accuracy and pick the best deferral threshold
3. Run Co-LLM generation with the optimal threshold.

We also need to start a JavaScript-based batch request forwarding server by running the `forward.js` script: `node forward.js`.
Sending batched async requests to the `vllm` server from Python is very slow during the threshold-search stage, so we use this JavaScript forwarding server to speed up the process.
The dependencies are minimal: you only need `express` and `axios`, which you can install with `npm install express axios`.
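
For illustration, here is a minimal sketch of how a client might talk to the forwarding server once it is running. The field names match what the `/batch_generate` handler in `forward.js` reads from the request body. The helper names `buildBatchRequest` and `sendBatch`, the example URLs, and the use of Node's built-in `fetch` (Node 18+) are assumptions for this sketch and are not part of the repository.

```javascript
// Sketch of a client for the forwarding server (hypothetical helpers).
// Field names mirror what forward.js destructures from req.body.

// Build the JSON body for POST /batch_generate.
function buildBatchRequest(forwardUrl, promptTokenIdList, sampling = {}) {
  if (!Array.isArray(promptTokenIdList)) {
    throw new TypeError('promptTokenIdList must be an array of token-id arrays');
  }
  const { n, top_p, top_k, temperature, max_tokens, stop, logprobs } = sampling;
  return {
    forward_url: forwardUrl,                 // where each prompt is forwarded
    prompt_token_id_list: promptTokenIdList, // one token-id array per prompt
    n, top_p, top_k, temperature, max_tokens, stop, logprobs,
  };
}

// Send the batch with Node's built-in fetch (assumes Node 18+).
async function sendBatch(serverUrl, body) {
  const res = await fetch(`${serverUrl}/batch_generate`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`forwarding server returned ${res.status}`);
  }
  return res.json(); // array with one entry per prompt, in request order
}
```

A call could look like `sendBatch('http://localhost:8003', buildBatchRequest(vllmUrl, [[1, 2], [3]], { n: 1, max_tokens: 8 }))`, where `vllmUrl` is a placeholder for your `vllm` generation endpoint and `8003` is the default port in `forward.js`.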

## Acknowledgement

35 changes: 35 additions & 0 deletions forward.js
@@ -0,0 +1,35 @@
const express = require('express');
const axios = require('axios');
const app = express();

app.use(express.json());

app.post('/batch_generate', async (req, res) => {
  const { forward_url, prompt_token_id_list, n, top_p, top_k, temperature, max_tokens, stop, logprobs } = req.body;

  console.log("forward_url", forward_url);
  try {
    const promises = prompt_token_id_list.map(prompt_token_id =>
      axios.post(forward_url, {
        "prompt_token_ids": prompt_token_id,
        n,
        top_p,
        top_k,
        temperature,
        max_tokens,
        stop,
        logprobs,
      })
    );

    const responses = await Promise.all(promises);
    const combinedResults = responses.map(r => r.data);

    res.json(combinedResults);
  } catch (error) {
    res.status(500).send(error.toString());
  }
});

const PORT = process.env.PORT || 8003;
app.listen(PORT, () => console.log(`Server is running on port ${PORT}`));
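
The fan-out step in the handler can be exercised without a running `vllm` server. The sketch below pulls that step into a standalone function and feeds it a fake backend. The names `fanOut` and `fakePost` and the injected `post` parameter (a stand-in for `axios.post`) are hypothetical and exist only for this illustration. It shows that `Promise.all` returns results in request order even when the backend answers out of order.

```javascript
// Sketch: the fan-out step from forward.js as a standalone function.
// `post` is an injected stand-in for axios.post (hypothetical parameter).
async function fanOut(body, post) {
  const { forward_url, prompt_token_id_list, n, top_p, top_k, temperature, max_tokens, stop, logprobs } = body;
  const promises = prompt_token_id_list.map(promptTokenIds =>
    post(forward_url, {
      prompt_token_ids: promptTokenIds,
      n, top_p, top_k, temperature, max_tokens, stop, logprobs,
    })
  );
  // Promise.all resolves in input order, regardless of completion order.
  const responses = await Promise.all(promises);
  return responses.map(r => r.data);
}

// Fake backend: longer prompts answer sooner, so completion order is reversed.
function fakePost(url, payload) {
  const delayMs = 30 - 10 * payload.prompt_token_ids.length;
  return new Promise(resolve =>
    setTimeout(() => resolve({ data: { echoed: payload.prompt_token_ids } }), delayMs)
  );
}
```

With these definitions, `fanOut({ forward_url: 'http://example.invalid', prompt_token_id_list: [[1], [2, 3]], n: 1 }, fakePost)` resolves to `[{ echoed: [1] }, { echoed: [2, 3] }]`, although the second request finishes first. Note that all requests are issued at once, and a single rejected request rejects the whole batch, which the handler reports as a 500 response.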
