README.md
+20 -10
@@ -1,10 +1,11 @@
1
1
# Hnswlib - fast approximate nearest neighbor search
2
-
Header-only C++ HNSW implementation with python bindings. Paper's code for the HNSW 200M SIFT experiment
2
+
Header-only C++ HNSW implementation with python bindings.
3
3
4
4
**NEWS:**
5
5
6
+
***Hnswlib is now 0.5.2**. Bugfixes - thanks [@marekhanus](https://github.com/marekhanus) for fixing the missing arguments, adding support for python 3.8, 3.9 in Travis, improving python wrapper and fixing typos/code style; [@apoorv-sharma](https://github.com/apoorv-sharma) for fixing the bug in the insertion/deletion logic; [@shengjun1985](https://github.com/shengjun1985) for simplifying the memory reallocation logic; [@TakaakiFuruse](https://github.com/TakaakiFuruse) for improved description of `add_items`; [@psobot](https://github.com/psobot) for improving error handling; [@ShuAiii](https://github.com/ShuAiii) for reporting the bug in the python interface.
6
7
7
-
***hnswlib is now 0.5.0. Added support for pickling indices, support for PEP-517 and PEP-518 building, small speedups, bug and documentation fixes. Many thanks to [@dbespalov](https://github.com/dbespalov), [@dyashuni](https://github.com/dyashuni), [@groodt](https://github.com/groodt),[@uestc-lfs](https://github.com/uestc-lfs), [@vinnitu](https://github.com/vinnitu), [@fabiencastan](https://github.com/fabiencastan), [@JinHai-CN](https://github.com/JinHai-CN), [@js1010](https://github.com/js1010)!**
8
+
***Hnswlib is now 0.5.0**. Added support for pickling indices, support for PEP-517 and PEP-518 building, small speedups, bug and documentation fixes. Many thanks to [@dbespalov](https://github.com/dbespalov), [@dyashuni](https://github.com/dyashuni), [@groodt](https://github.com/groodt), [@uestc-lfs](https://github.com/uestc-lfs), [@vinnitu](https://github.com/vinnitu), [@fabiencastan](https://github.com/fabiencastan), [@JinHai-CN](https://github.com/JinHai-CN), [@js1010](https://github.com/js1010)!
8
9
9
10
***Thanks to Apoorv Sharma [@apoorv-sharma](https://github.com/apoorv-sharma), hnswlib now supports true element updates (the interface remained the same, but the performance/memory should not degrade as you update the element embeddings).**
10
11
@@ -41,18 +42,18 @@ For other spaces use the nmslib library https://github.com/nmslib/nmslib.
41
42
*`hnswlib.Index(space, dim)` creates a non-initialized HNSW index in space `space` with integer dimension `dim`.
42
43
43
44
`hnswlib.Index` methods:
44
-
*`init_index(max_elements, ef_construction = 200, M = 16, random_seed = 100)` initializes the index from with no elements.
45
+
*`init_index(max_elements, M = 16, ef_construction = 200, random_seed = 100)` initializes the index with no elements.
45
46
*`max_elements` defines the maximum number of elements that can be stored in the structure (can be increased/shrunk).
46
47
*`ef_construction` defines a construction time/accuracy trade-off (see [ALGO_PARAMS.md](ALGO_PARAMS.md)).
47
48
*`M` defines the maximum number of outgoing connections in the graph ([ALGO_PARAMS.md](ALGO_PARAMS.md)).
48
49
49
-
*`add_items(data, data_labels, num_threads = -1)` - inserts the `data`(numpy array of vectors, shape:`N*dim`) into the structure.
50
-
*`labels` is an optional N-size numpy array of integer labels for all elements in `data`.
50
+
*`add_items(data, ids, num_threads = -1)` - inserts the `data` (numpy array of vectors, shape: `N*dim`) into the structure.
51
51
*`num_threads` sets the number of cpu threads to use (-1 means use default).
52
-
*`data_labels` specifies the labels for the data. If index already has the elements with the same labels, their features will be updated. Note that update procedure is slower than insertion of a new element, but more memory- and query-efficient.
52
+
*`ids` is an optional N-size numpy array of integer labels for all elements in `data`.
53
+
- If the index already has elements with the same labels, their features will be updated. Note that the update procedure is slower than inserting a new element, but more memory- and query-efficient.
53
54
* Thread-safe with other `add_items` calls, but not with `knn_query`.
54
55
55
-
*`mark_deleted(data_label)` - marks the element as deleted, so it will be omitted from search results.
56
+
*`mark_deleted(label)` - marks the element as deleted, so it will be omitted from search results.
56
57
57
58
*`resize_index(new_size)` - changes the maximum capacity of the index. Not thread safe with `add_items` and `knn_query`.
58
59
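The label semantics described in the method list above (re-adding an existing label updates it, `mark_deleted` hides it from search, `resize_index` changes capacity) can be illustrated with a toy stand-in. This is a brute-force sketch in pure Python, not hnswlib's implementation: the `ToyIndex` class, its internals, the default-label rule, and the error types are all assumptions made for illustration, and its `knn_query` returns only a list of labels.

```python
import math


class ToyIndex:
    """Brute-force stand-in that mimics the label semantics only (no HNSW graph)."""

    def __init__(self, dim, max_elements):
        self.dim = dim
        self.max_elements = max_elements
        self.vectors = {}     # label -> vector
        self.deleted = set()  # labels hidden from search results

    def add_items(self, data, ids=None):
        if ids is None:
            # Assumption for this sketch: default labels continue from the current size.
            start = len(self.vectors)
            ids = range(start, start + len(data))
        ids = list(ids)
        if len(ids) != len(data):
            raise ValueError("ids must have one label per vector")
        for label, vec in zip(ids, data):
            if len(vec) != self.dim:
                raise ValueError("vector has the wrong dimension")
            is_new = label not in self.vectors
            if is_new and len(self.vectors) >= self.max_elements:
                raise RuntimeError("index is full; call resize_index first")
            # An existing label is overwritten, i.e. its features are updated.
            self.vectors[label] = list(vec)

    def mark_deleted(self, label):
        if label not in self.vectors:
            raise KeyError(label)
        self.deleted.add(label)

    def resize_index(self, new_size):
        if new_size < len(self.vectors):
            raise ValueError("new_size is smaller than the current element count")
        self.max_elements = new_size

    def knn_query(self, query, k=1):
        # Exact L2 search over all live (non-deleted) elements.
        live = [
            (math.dist(query, vec), label)
            for label, vec in self.vectors.items()
            if label not in self.deleted
        ]
        live.sort()
        return [label for _, label in live[:k]]


index = ToyIndex(dim=2, max_elements=3)
index.add_items([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]], ids=[10, 11, 12])
index.add_items([[4.0, 4.0]], ids=[11])  # label 11 already exists, so it is updated
nearest_before = index.knn_query([0.1, 0.1], k=2)
index.mark_deleted(10)                   # label 10 is now omitted from results
nearest_after = index.knn_query([0.1, 0.1], k=2)
```

Updating label 11 does not consume extra capacity in this sketch, which is why the second `add_items` call succeeds even though the toy index is already at `max_elements`.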
@@ -113,7 +114,7 @@ num_elements = 10000
113
114
114
115
# Generating sample data
115
116
data = np.float32(np.random.random((num_elements, dim)))
116
-
data_labels= np.arange(num_elements)
117
+
ids= np.arange(num_elements)
117
118
118
119
# Declaring index
119
120
p = hnswlib.Index(space='l2', dim= dim) # possible options are l2, cosine or ip
@@ -122,7 +123,7 @@ p = hnswlib.Index(space = 'l2', dim = dim) # possible options are l2, cosine or
# Element insertion (can be called several times):
125
-
p.add_items(data, data_labels)
126
+
p.add_items(data, ids)
126
127
127
128
# Controlling the recall by setting ef:
128
129
p.set_ef(50) # ef should always be > k
@@ -295,4 +296,13 @@ To run test **with** updates (from `build` directory)
295
296
296
297
### References
297
298
298
-
Malkov, Yu A., and D. A. Yashunin. "Efficient and robust approximate nearest neighbor search using Hierarchical Navigable Small World graphs." TPAMI, preprint: https://arxiv.org/abs/1603.09320
299
+
@article{malkov2018efficient,
300
+
title={Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs},
301
+
author={Malkov, Yu A and Yashunin, Dmitry A},
302
+
journal={IEEE transactions on pattern analysis and machine intelligence},