Commit 564325b

Merge pull request #1507 from zc277584121/master
mindsdb integration
2 parents 4b404ca + 4fd3003 commit 564325b

1 file changed

+148
-0
lines changed

@@ -0,0 +1,148 @@
# Integrate Milvus with MindsDB

[MindsDB](https://docs.mindsdb.com/what-is-mindsdb) is a tool for integrating AI applications with diverse enterprise data sources. It acts as a federated query engine that brings order to data sprawl and answers queries across both structured and unstructured data. Whether your data is scattered across SaaS applications, databases, or data warehouses, MindsDB can connect to it and query it all using standard SQL. It offers autonomous RAG systems through Knowledge Bases, supports hundreds of data sources, and provides flexible deployment options from local development to cloud environments.

This tutorial demonstrates how to integrate Milvus with MindsDB, so that you can combine MindsDB's AI capabilities with Milvus's vector database functionality and manage and query vector embeddings through SQL-like operations.

> This tutorial is mainly based on the official documentation of the [MindsDB Milvus Handler](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/milvus_handler). If you find any part of this tutorial outdated, follow the official documentation first and create an issue for us.

## Install MindsDB

Before you start, install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop).

This tutorial assumes that you are familiar with the fundamental concepts and operations of both MindsDB and Milvus.

## Arguments Introduction

The required arguments to establish a connection are:

* `uri`: URI of the Milvus database. It can be set to a local `.db` file, a Docker deployment, or a cloud service.
* `token`: token for authenticating with a Docker deployment or cloud service, depending on the `uri` option.

The optional arguments to establish a connection fall into two groups.

These are used for `SELECT` queries:

* `search_default_limit`: default limit passed in `SELECT` statements (default=100)
* `search_metric_type`: metric type used for searches (default="L2")
* `search_ignore_growing`: whether to ignore growing segments during similarity searches (default=False)
* `search_params`: search parameters specific to the `search_metric_type` (default={"nprobe": 10})

These are used for `CREATE` queries:

* `create_auto_id`: whether to auto-generate the ID when inserting records with no ID (default=False)
* `create_id_max_len`: maximum length of the ID field when creating a table (default=64)
* `create_embedding_dim`: embedding dimension used when creating a table (default=8)
* `create_dynamic_field`: whether the created tables have dynamic fields (default=True)
* `create_content_max_len`: maximum length of the content column (default=200)
* `create_content_default_value`: default value of the content column (default='')
* `create_schema_description`: description of the created schemas (default='')
* `create_alias`: alias of the created schemas (default='default')
* `create_index_params`: parameters of the index created on the embeddings column (default={})
* `create_index_metric_type`: metric used to create the index (default='L2')
* `create_index_type`: the type of index (default='AUTOINDEX')

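As a rough illustration of how these arguments are supplied, the sketch below passes a few optional settings alongside the required ones in the `PARAMETERS` clause of `CREATE DATABASE`. The data source name `milvus_tuned` and the specific values are assumptions chosen for illustration; any argument you leave out keeps the default listed above.

```sql
-- Hypothetical data source name and values, for illustration only.
CREATE DATABASE milvus_tuned
WITH
  ENGINE = 'milvus',
  PARAMETERS = {
    "uri": "./milvus_local.db",
    "token": "",
    -- Used by SELECT queries
    "search_default_limit": 20,
    "search_metric_type": "L2",
    -- Used by CREATE queries
    "create_embedding_dim": 3,
    "create_index_metric_type": "L2"
  };
```

The search metric and the index metric are set to the same value here because Milvus expects a search to use the metric that the index was built with.
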
## Usage

Before continuing, make sure that your `pymilvus` version matches this [pinned version](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/milvus_handler/requirements.txt). If you run into version compatibility issues, you can roll back your version of `pymilvus` or customize it in this [requirement file](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/milvus_handler).

### Creating connection

To use this handler and connect to a Milvus server in MindsDB, use the following syntax:

```sql
CREATE DATABASE milvus_datasource
WITH
  ENGINE = 'milvus',
  PARAMETERS = {
    "uri": "./milvus_local.db",
    "token": "",
    "create_embedding_dim": 3,
    "create_auto_id": true
  };
```

> - If you only need a local vector database for small-scale data or prototyping, setting the `uri` to a local file, e.g. `./milvus.db`, is the most convenient method, as it automatically uses [Milvus Lite](https://milvus.io/docs/milvus_lite.md) to store all data in this file.
> - For larger-scale data and production traffic, you can set up a Milvus server on [Docker or Kubernetes](https://milvus.io/docs/install-overview.md). In this setup, use the server address and port as your `uri`, e.g. `http://localhost:19530`. If you enable the authentication feature on Milvus, set the `token` to `"<your_username>:<your_password>"`; otherwise, there is no need to set the token.
> - You can also use fully managed Milvus on [Zilliz Cloud](https://zilliz.com/cloud). Simply set the `uri` and `token` to the [Public Endpoint and API key](https://docs.zilliz.com/docs/on-zilliz-cloud-console#cluster-details) of your Zilliz Cloud instance.

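For the server-based option described above, the connection might look like the following sketch. The data source name `milvus_server_datasource` is an assumption, the `uri` is the example address from the note above, and the `token` placeholders must be replaced with real credentials (or the token left empty if authentication is disabled).

```sql
-- Hypothetical data source name; uri and token follow the server setup note.
CREATE DATABASE milvus_server_datasource
WITH
  ENGINE = 'milvus',
  PARAMETERS = {
    "uri": "http://localhost:19530",
    "token": "<your_username>:<your_password>",
    "create_embedding_dim": 3,
    "create_auto_id": true
  };
```
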
### Dropping connection

To drop the connection, use this command:

```sql
DROP DATABASE milvus_datasource;
```

### Creating tables

To create a table and insert data from a pre-existing table, use `CREATE TABLE`:

```sql
CREATE TABLE milvus_datasource.test
(SELECT * FROM sqlitedb.test);
```

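If the source table contains more columns than the collection should receive, one option is to select the columns explicitly. The sketch below assumes that `sqlitedb.test` has columns named `id`, `content`, `metadata`, and `embeddings`, matching the column names used in the `INSERT` example later in this tutorial; `test_subset` is a hypothetical collection name.

```sql
-- Assumes sqlitedb.test exposes id, content, metadata, and embeddings columns.
CREATE TABLE milvus_datasource.test_subset
(SELECT id, content, metadata, embeddings FROM sqlitedb.test);
```
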
### Dropping collections

Dropping a collection is not supported.

### Querying and selecting

To query the database using a search vector, use `search_vector` in the `WHERE` clause. The search vector must have the same dimension as the embeddings stored in the collection.

Caveats:
- If you omit `LIMIT`, the `search_default_limit` is used, since Milvus requires a limit.
- The metadata column is not supported, but if the collection has dynamic schema enabled, you can query it like normal; see the example below.
- Dynamic fields cannot be displayed but can be queried.

```sql
SELECT * FROM milvus_datasource.test
WHERE search_vector = '[3.0, 1.0, 2.0]'
LIMIT 10;
```

If you omit `search_vector`, this becomes a basic search that returns up to `LIMIT` entries from the collection, or up to `search_default_limit` entries if no `LIMIT` is given.

```sql
SELECT * FROM milvus_datasource.test;
```

You can use a `WHERE` clause on dynamic fields like in normal SQL:

```sql
SELECT * FROM milvus_datasource.createtest
WHERE category = 'science';
```

### Deleting records

You can delete entries using `DELETE`, just like in SQL.

Caveats:
- Milvus only supports deleting entities with clearly specified primary keys.
- You can only use the `IN` operator.

```sql
DELETE FROM milvus_datasource.test
WHERE id IN (1, 2, 3);
```

### Inserting records

You can also insert individual rows like so:

```sql
INSERT INTO milvus_datasource.testable (id, content, metadata, embeddings)
VALUES ('id3', 'this is a test', '{"test": "test"}', '[1.0, 8.0, 9.0]');
```

### Updating

Updating records is not supported by the Milvus API. As a workaround, you can use a combination of `DELETE` and `INSERT`.

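The `DELETE` and `INSERT` combination might be sketched as follows: delete the old record by primary key, then insert the replacement under the same key. This assumes a collection named `testable` with string primary keys that was created without `create_auto_id`, reuses the column layout from the `INSERT` example above, and uses made-up values. The two statements run separately, so a failure between them leaves the record missing.

```sql
-- Step 1: delete the outdated record by its primary key (only IN is supported).
DELETE FROM milvus_datasource.testable
WHERE id IN ('id3');

-- Step 2: insert the replacement record under the same primary key.
INSERT INTO milvus_datasource.testable (id, content, metadata, embeddings)
VALUES ('id3', 'this is an updated test', '{"test": "updated"}', '[1.0, 8.0, 9.0]');
```
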
---

For more details and examples, please refer to the [MindsDB Official Documentation](https://docs.mindsdb.com/what-is-mindsdb).
