# Integrate Milvus with MindsDB

[MindsDB](https://docs.mindsdb.com/what-is-mindsdb) connects AI applications to diverse enterprise data sources. It acts as a federated query engine that answers queries across both structured and unstructured data. Whether your data lives in SaaS applications, databases, or data warehouses, MindsDB can connect to it and query it using standard SQL. It also provides autonomous RAG systems through Knowledge Bases, supports hundreds of data sources, and offers deployment options ranging from local development to cloud environments.

This tutorial demonstrates how to integrate Milvus with MindsDB so that you can manage and query vector embeddings stored in Milvus through SQL-like statements while using MindsDB's AI capabilities.

> This tutorial is based on the official documentation of the [MindsDB Milvus Handler](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/milvus_handler). If any part of this tutorial is outdated, follow the official documentation first and open an issue for us.

## Install MindsDB

Before you start, install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop).

This tutorial also assumes that you are familiar with the fundamental concepts and operations of both MindsDB and Milvus.

## Arguments Introduction

The required arguments to establish a connection are:

* `uri`: URI of the Milvus database. It can point to a local `.db` file, a Milvus server running on Docker, or a cloud service.
* `token`: token used to authenticate against a Docker or cloud deployment, depending on the `uri` option.

The optional arguments to establish a connection are listed below.

These are used for `SELECT` queries:

* `search_default_limit`: default limit applied to `SELECT` statements that have no `LIMIT` (default=100)
* `search_metric_type`: metric type used for searches (default="L2")
* `search_ignore_growing`: whether to ignore growing segments during similarity searches (default=False)
* `search_params`: search parameters specific to the `search_metric_type` (default={"nprobe": 10})

These are used for `CREATE` queries:

* `create_auto_id`: whether to auto-generate an ID when inserting records with no ID (default=False)
* `create_id_max_len`: maximum length of the ID field when creating a table (default=64)
* `create_embedding_dim`: embedding dimension used when creating a table (default=8)
* `create_dynamic_field`: whether created tables have dynamic fields (default=True)
* `create_content_max_len`: maximum length of the content column (default=200)
* `create_content_default_value`: default value of the content column (default='')
* `create_schema_description`: description of the created schemas (default='')
* `create_alias`: alias of the created schemas (default='default')
* `create_index_params`: parameters of the index created on the embeddings column (default={})
* `create_index_metric_type`: metric used to create the index (default='L2')
* `create_index_type`: type of the index (default='AUTOINDEX')

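As a rough illustration of how the optional arguments fit together, the sketch below passes a few of them in `PARAMETERS` next to `uri` and `token`. The datasource name `milvus_tuned` and every value are illustrative assumptions rather than recommendations; only the argument names come from the lists above.

```sql
-- Hypothetical datasource name and values, chosen only for illustration.
-- Arguments that are left out fall back to the defaults listed above.
CREATE DATABASE milvus_tuned
WITH
  ENGINE = 'milvus',
  PARAMETERS = {
    "uri": "./milvus_local.db",
    "token": "",
    "search_default_limit": 20,
    "search_metric_type": "L2",
    "create_embedding_dim": 3,
    "create_index_type": "AUTOINDEX",
    "create_index_metric_type": "L2"
};
```

Milvus expects a search to use the metric the index was built with, so `search_metric_type` and `create_index_metric_type` are kept identical in this sketch.
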
## Usage

Before continuing, make sure that your `pymilvus` version matches this [pinned version](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/milvus_handler/requirements.txt). If you run into version compatibility issues, you can roll back your version of `pymilvus` or customize it in this [requirement file](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/milvus_handler).

### Creating connection

To use this handler and connect to a Milvus server in MindsDB, use the following syntax:

```sql
CREATE DATABASE milvus_datasource
WITH
  ENGINE = 'milvus',
  PARAMETERS = {
    "uri": "./milvus_local.db",
    "token": "",
    "create_embedding_dim": 3,
    "create_auto_id": true
};
```

> - If you only need a local vector database for small-scale data or prototyping, setting the `uri` to a local file, e.g. `./milvus.db`, is the most convenient method, as it automatically uses [Milvus Lite](https://milvus.io/docs/milvus_lite.md) to store all data in this file.
> - For larger-scale data and traffic in production, you can set up a Milvus server on [Docker or Kubernetes](https://milvus.io/docs/install-overview.md). In this setup, use the server address and port as your `uri`, e.g. `http://localhost:19530`. If you enable the authentication feature on Milvus, set the `token` to `"<your_username>:<your_password>"`. Otherwise, there is no need to set the token.
> - You can also use fully managed Milvus on [Zilliz Cloud](https://zilliz.com/cloud). Simply set the `uri` and `token` to the [Public Endpoint and API key](https://docs.zilliz.com/docs/on-zilliz-cloud-console#cluster-details) of your Zilliz Cloud instance.

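For the server-based setup described in the notes above, the connection could look roughly like the following sketch. It assumes a Milvus server reachable at `http://localhost:19530` with authentication enabled. The datasource name `milvus_server_datasource` is hypothetical, and the credentials are placeholders to be replaced with your own.

```sql
-- Assumes a Milvus server at localhost:19530 with authentication enabled.
-- The datasource name is hypothetical; replace the placeholders with real credentials.
CREATE DATABASE milvus_server_datasource
WITH
  ENGINE = 'milvus',
  PARAMETERS = {
    "uri": "http://localhost:19530",
    "token": "<your_username>:<your_password>"
};
```

If MindsDB itself runs inside a Docker container, `localhost` resolves to that container, so the `uri` may need to use an address at which the Milvus server is reachable from inside it.
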
### Dropping connection

To drop the connection, use this command:

```sql
DROP DATABASE milvus_datasource;
```

### Creating tables

To create a table populated with data from a pre-existing table, use `CREATE TABLE`:

```sql
CREATE TABLE milvus_datasource.test
(SELECT * FROM sqlitedb.test);
```

### Dropping collections

Dropping a collection is not supported.

### Querying and selecting

To query the database with a search vector, use `search_vector` in the `WHERE` clause. The search vector must have the same dimension as the embeddings stored in the collection (3 in the connection example above).

Caveats:
- If you omit `LIMIT`, `search_default_limit` is used, because Milvus requires a limit.
- The metadata column is not supported. However, if the collection has dynamic schema enabled, you can filter on dynamic fields as usual, as in the last example of this section.
- Dynamic fields cannot be displayed but can be queried.

```sql
SELECT * FROM milvus_datasource.test
WHERE search_vector = '[3.0, 1.0, 2.0]'
LIMIT 10;
```

If you omit `search_vector`, the statement becomes a basic query that returns up to `LIMIT` entries from the collection, or up to `search_default_limit` entries when no `LIMIT` is given:

```sql
SELECT * FROM milvus_datasource.test;
```

You can use a `WHERE` clause on dynamic fields as in regular SQL:

```sql
SELECT * FROM milvus_datasource.createtest
WHERE category = 'science';
```

### Deleting records

You can delete entries using `DELETE`, just like in SQL.

Caveats:
- Milvus only supports deleting entities with explicitly specified primary keys.
- Only the `IN` operator is supported.

```sql
DELETE FROM milvus_datasource.test
WHERE id IN (1, 2, 3);
```

### Inserting records

You can also insert individual rows:

```sql
INSERT INTO milvus_datasource.test (id, content, metadata, embeddings)
VALUES ('id3', 'this is a test', '{"test": "test"}', '[1.0, 8.0, 9.0]');
```

### Updating

Updating records is not supported by the Milvus API. As a workaround, you can combine `DELETE` and `INSERT`.

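The `DELETE` plus `INSERT` workaround could be sketched as follows. This assumes a table `milvus_datasource.test` with a string primary key `id` and 3-dimensional embeddings, and it reuses the `id3` row from the insert example. The replacement text is a hypothetical value. The two statements are not atomic, so if the `INSERT` fails after the `DELETE` succeeds, the record is gone until it is inserted again.

```sql
-- Step 1: remove the old version of the record by its primary key.
-- Assumes a string primary key; only the IN operator is used, as required above.
DELETE FROM milvus_datasource.test
WHERE id IN ('id3');

-- Step 2: insert the new version of the record under the same id.
INSERT INTO milvus_datasource.test (id, content, metadata, embeddings)
VALUES ('id3', 'this is the updated text', '{"test": "test"}', '[1.0, 8.0, 9.0]');
```
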
---

For more details and examples, please refer to the [MindsDB Official Documentation](https://docs.mindsdb.com/what-is-mindsdb).