I would like you to add an option that allows me to specify AmazonOpenSearch for the Knowledge (Vector DB). #33

Open
tatsuyoshi-mashiko opened this issue Feb 6, 2025 · 4 comments

Comments

@tatsuyoshi-mashiko
Contributor

The vector DB is RDS by default, but I would like an option to select Amazon OpenSearch Serverless instead.

OpenSearch Serverless is a very useful choice of vector DB because it supports high-quality hybrid search.

It is definitely a valid option when building an AI execution environment (RAG) on AWS.

At present, OpenSearch Serverless cannot be selected as an option, so the only way seems to be to set up Amazon Bedrock separately and connect it through Dify's custom tool. That is time-consuming and labor-intensive, so I would like to be able to select Amazon OpenSearch Serverless when deploying this CDK project.

That said, I am not even sure whether OpenSearch Serverless can be configured for the Knowledge (vector DB) of Dify's Community Edition...

@tatsuyoshi-mashiko tatsuyoshi-mashiko changed the title I want to be able to specify Amazon OpenSearch for the vector DB. I would like you to add an option that allows me to specify AmazonOpenSearch for the vector DB. Feb 6, 2025
@tatsuyoshi-mashiko tatsuyoshi-mashiko changed the title I would like you to add an option that allows me to specify AmazonOpenSearch for the vector DB. I would like you to add an option that allows me to specify AmazonOpenSearch for the Knowledge (Vector DB). Feb 6, 2025
@tmokmss
Contributor

tmokmss commented Feb 7, 2025

Hi @tatsuyoshi-mashiko, thank you for the feature request!

Do you have a particular example where OpenSearch works better and pgvector falls short?

A current workaround is to use Bedrock Knowledge Base, which automatically provisions OpenSearch without configuration and can be referenced from Dify both as knowledge and as a tool.

@tatsuyoshi-mashiko
Contributor Author

@tmokmss Thank you for your reply.
I have no objective, quantitative data or evidence showing that Amazon OpenSearch is superior to Postgres, which I understand is what you are asking for.

We assume that Amazon OpenSearch will be effective for hybrid search because Elasticsearch, on which Amazon OpenSearch is based, has excellent full-text search accuracy. This is therefore a qualitative evaluation based on our past development projects.
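
For readers unfamiliar with the term, hybrid search combines a full-text (keyword) ranking with a vector-similarity ranking for the same query. A minimal sketch of one common way to merge the two, reciprocal rank fusion, is below. It only illustrates the idea and is not claimed to be what OpenSearch or Dify does internally; the document IDs, the function name, and the constant `k=60` are assumptions chosen for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in
    (rank starts at 1); documents are returned by descending total score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score (highest first); break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for one query from two retrievers.
keyword_hits = ["doc_a", "doc_b", "doc_c"]  # full-text order
vector_hits = ["doc_b", "doc_c", "doc_d"]   # embedding-similarity order

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists (`doc_b`, `doc_c`) move ahead of documents found by only one retriever, which is the behavior we expect hybrid search to provide.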

@tatsuyoshi-mashiko
Contributor Author

@tmokmss
I feel it is not productive to argue about which is better, since the answer depends on the use case, so it should simply be selectable as an option. What do you think?

@tmokmss
Contributor

tmokmss commented Feb 7, 2025

Yes, it is definitely better to have OpenSearch available. Since the current Postgres implementation works and users can use Bedrock Knowledge Base with OpenSearch, I am giving this issue lower priority. Of course, a PR is welcome! Thanks.

remote-swe-user pushed a commit to remote-swe-user/dify-self-hosted-on-aws that referenced this issue Feb 20, 2025