Commit 3cb7304

fix style
1 parent c5aa10c commit 3cb7304

1 file changed

+6
-6
lines changed

docs/user_guides/fs/vector_similarity_search.md

@@ -17,7 +17,7 @@ from hsfs import embedding
1717
emb = embedding.EmbeddingIndex(index_name="news_fg")
1818
```
1919

20-
Then, add one or more embedding features to the index. Name and dimension of the embedding features are required for identifying which features should be indexed for k-nearest neighbor (KNN) search. In this example, we get the dimension of the embedding by taking the length of the value of the `embedding_heading` column in the first row of the dataframe `df`. Optionally, you can specify the [similarity function](TODO: add link).
20+
Then, add one or more embedding features to the index. Name and dimension of the embedding features are required for identifying which features should be indexed for k-nearest neighbor (KNN) search. In this example, we get the dimension of the embedding by taking the length of the value of the `embedding_heading` column in the first row of the dataframe `df`. Optionally, you can specify the similarity function.
2121
```aidl
2222
# Add embedding feature to the index
2323
emb.add_embedding("embedding_heading", len(df["embedding_heading"][0]))
@@ -62,7 +62,7 @@ news_fg.as_of(time_in_past).read()
6262
# Querying Similar Embeddings with Additional features
6363

6464
You can also use similarity search for vector embedding features in feature views.
65-
In the code snippet below, we create a feature view by selecting features from the earlier `news_fg` and a new feature group `view_fg`. If you include a feature group with vector embedding features in a feature view, *regardless* if the vector embedding features are selected or not, you can call `find_neighbors` on the feature view, and it will return rows containing all the feature values in the feature view. In the example below, a list of `heading` and `view_cnt` will be returned for the news articles which are closet to provided `news_description`.
65+
In the code snippet below, we create a feature view by selecting features from the earlier `news_fg` and a new feature group `view_fg`. If you include a feature group with vector embedding features in a feature view, **whether or not the vector embedding features are selected**, you can call `find_neighbors` on the feature view, and it will return rows containing all the feature values in the feature view. In the example below, a list of `heading` and `view_cnt` will be returned for the news articles which are closet to provided `news_description`.
6666

6767
```aidl
6868
view_fg = fs.get_or_create_feature_group(
@@ -80,7 +80,7 @@ fv = fs.get_or_create_feature_view(
8080
fv.find_neighbors(model.encode(news_description), k=5)
8181
```
8282

83-
Note that you can use similarity search from the feature view only if the feature group which you are querying with `find_neighbors` has all the primary keys of the other feature groups. In the example above, you are querying against the feature group `news_fg` which has the vector embedding features, and it has the feature "news_id" which is the primary key of the feature group `view_fg`. But if `page_fg` is used as illustrated below, `find_neighbors` will fail to return any features because primary key `page_id` does not exist in `news_fg`.
83+
Note that you can use similarity search from the feature view **only if** the feature group which you are querying with `find_neighbors` has **all** the primary keys of the other feature groups. In the example above, you are querying against the feature group `news_fg` which has the vector embedding features, and it has the feature "news_id" which is the primary key of the feature group `view_fg`. But if `page_fg` is used as illustrated below, `find_neighbors` will fail to return any features because primary key `page_id` does not exist in `news_fg`.
8484

8585
<p align="center">
8686
<figure>
@@ -95,16 +95,16 @@ fv.get_feature_vector({"news_id": 1})
9595
```
9696

9797
# Best Practices
98-
1. Choose the Appropriate Online Feature Stores
98+
## Choose the Appropriate Online Feature Stores
9999

100100
There are 2 types of online feature stores in Hopsworks: online store (RonDB) and vector store (Opensearch). Online store is designed for retrieving feature vectors efficiently with low latency. Vector store is designed for finding similar embedding efficiently. If similarity search is not required, using online store is recommended for low latency retrieval of feature values including embedding.
101101

102102
# Performance considerations for Feature Groups with Embeddings
103-
1. Choose features for vector store
103+
## Choose Features for Vector Store
104104

105105
While it is possible to update feature value in vector store, updating feature value in online store is more efficient. If you have features which are frequently being updated and do not require for filtering, consider storing them separately in a different feature group. As shown in the previous example, `view_cnt` is updated frequently and stored separately. You can then get all the required features by using feature view.
106106

107-
2. Use new index per feature group
107+
## Use New Index per Feature Group
108108

109109
Create a new index per feature group to optimize retrieval performance.
110110
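
The primary-key condition described in the changed paragraph (the feature group queried with `find_neighbors` must contain all primary keys of the other feature groups in the feature view) can be illustrated with a small check. This is a sketch in plain Python, not part of the Hopsworks API: the helper `missing_primary_keys` and the column list assumed for `news_fg` are hypothetical and only mirror the names used in the guide.

```python
def missing_primary_keys(queried_features, other_primary_keys):
    """Return, per feature group, the primary keys absent from the queried group.

    queried_features: feature names of the group that holds the embeddings.
    other_primary_keys: mapping of feature group name -> list of its primary keys.
    An empty result means the condition from the guide is satisfied.
    """
    available = set(queried_features)
    missing = {}
    for fg_name, keys in other_primary_keys.items():
        absent = [key for key in keys if key not in available]
        if absent:
            missing[fg_name] = absent
    return missing


# Assumed columns of `news_fg`; only the names come from the guide.
news_fg_features = ["news_id", "heading", "embedding_heading"]

# `view_fg` is keyed by `news_id`, which `news_fg` contains: nothing is missing.
print(missing_primary_keys(news_fg_features, {"view_fg": ["news_id"]}))

# `page_fg` is keyed by `page_id`, which `news_fg` lacks: the key is reported.
print(missing_primary_keys(news_fg_features, {"page_fg": ["page_id"]}))
```

An empty dictionary corresponds to the `view_fg` case, where similarity search from the feature view works. A non-empty one corresponds to the `page_fg` case, where the guide says `find_neighbors` fails to return any features.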
