This is the first prompt engineering task for this "mostly prompt engineering" project.
Build a library of prompts that reliably perform the following tasks when working with llama3.2vision-11b or llama3.3vision-11b:
extract or calculate the grand total on a receipt
extract the nth row off a receipt
based on a description/sub-prompt for a set of categories - classify which category a particular row belongs to
based on a description/sub-prompt for a set of categories - extract all rows that match a given category
based on a description/sub-prompt for a set of receipt types - classify which type a receipt belongs to (Uber, FamilyMart, Didi, Tsco, Khetlaji, Reliance and so on)
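One way the five tasks above could be organised is as a small registry of parameterised templates. The sketch below is illustrative only: the `PROMPTS` dictionary, the task keys, the `render` helper, and the prompt wording are all assumptions and not part of this issue. It uses Python's standard `string.Template`.

```python
from string import Template

# Hypothetical prompt templates, one per task listed above.
# The wording is a placeholder; real prompts would need tuning against the model.
PROMPTS = {
    "grand_total": Template(
        "Extract the grand total from the receipt. "
        "If it is not printed, calculate it from the line items."
    ),
    "nth_row": Template("Extract row number $n from the receipt's line items."),
    "classify_row": Template(
        "Categories:\n$categories\nClassify which category row $n belongs to."
    ),
    "rows_in_category": Template(
        "Categories:\n$categories\n"
        "Extract all rows that match the category '$category'."
    ),
    "classify_receipt": Template(
        "Receipt types:\n$receipt_types\n"
        "Classify which type this receipt belongs to."
    ),
}


def render(task: str, **params: object) -> str:
    """Fill in one template; raises KeyError for an unknown task or a missing parameter."""
    return PROMPTS[task].substitute(**params)
```

For example, `render("nth_row", n=3)` produces the row-extraction prompt for the third row, and leaving out `n` raises `KeyError` rather than sending a half-filled prompt to the model.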
The prompts in the library must be designed to be stackable - able to be combined and used one after another - in one single completion call to the LLM.
The prompts in the library must be designed to be composable as much as possible - able to be combined and used in one single call to perform a complex task -- but not necessarily combined in a stacked-linear fashion.
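As a rough illustration of stacking, several instructions can be merged into one numbered prompt that asks for a single JSON reply, so everything runs in one completion call. The `stack` and `ref` helpers, the `[key]` labelling, and the JSON reply format are assumptions made for this sketch, not requirements from the issue. `ref` hints at non-linear composition: a later step can point back at any earlier result by key.

```python
def ref(key: str) -> str:
    """Phrase that lets one instruction refer to the result of another task by key."""
    return f"the result of task [{key}]"


def stack(steps: list[tuple[str, str]]) -> str:
    """Combine (key, instruction) pairs into one prompt for a single completion call."""
    if not steps:
        raise ValueError("at least one step is required")
    keys = [key for key, _ in steps]
    if len(set(keys)) != len(keys):
        raise ValueError("step keys must be unique")

    lines = ["Perform the following tasks on the attached receipt, in order."]
    for number, (key, instruction) in enumerate(steps, start=1):
        lines.append(f"{number}. [{key}] {instruction}")
    # One JSON object keeps the answers to all stacked tasks separable.
    lines.append(
        "Reply with a single JSON object with exactly these keys: "
        + ", ".join(keys)
        + "."
    )
    return "\n".join(lines)
```

A possible use: `stack([("receipt_type", "Classify the receipt type."), ("rows", f"Using {ref('receipt_type')}, extract all rows in the matching category.")])` yields one prompt in which the second task depends on the first.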
Make sure you have unit tests for every single prompt in the library. Reliability is defined as a pass rate of at least 70% on a test dataset of 10 diverse receipts. (Let's use 10 for this; of course you can propose 100 if you like.)
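The 70% criterion could be checked with a small harness like the one below. This is a sketch under assumptions: `run_prompt` stands in for whatever function sends a receipt to the model and returns its answer, exact string matching is only one possible notion of "pass", and the names `pass_rate` and `is_reliable` are hypothetical.

```python
from typing import Callable, Sequence

PASS_THRESHOLD_PERCENT = 70  # reliability bar stated in this issue


def pass_rate(
    run_prompt: Callable[[str], str],
    cases: Sequence[tuple[str, str]],
) -> tuple[int, int]:
    """Return (passed, total) for (receipt, expected_answer) test cases."""
    if not cases:
        raise ValueError("the test dataset must not be empty")
    passed = sum(
        1 for receipt, expected in cases if run_prompt(receipt).strip() == expected
    )
    return passed, len(cases)


def is_reliable(
    run_prompt: Callable[[str], str],
    cases: Sequence[tuple[str, str]],
) -> bool:
    """True when at least 70% of the cases pass (integer math avoids float rounding)."""
    passed, total = pass_rate(run_prompt, cases)
    return passed * 100 >= PASS_THRESHOLD_PERCENT * total
```

In a unit test, `run_prompt` would wrap the real model call for one prompt from the library, and `cases` would hold the 10 (or 100) receipts with their expected answers.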
Note that there is no limit to the length and complexity of the prompts in the library.
Looking for actual PoC implementation. Welcoming PRs.