Commit 20503e5

committed
feedback fix
1 parent a585081 commit 20503e5

File tree

4 files changed

+16
-3
lines changed

docs/concepts/fs/feature_group/external_fg.md

+1-1
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
External feature groups are offline feature groups where their data is stored in an external table. An external table requires a storage connector, defined with the Connector API (or more typically in the user interface), to enable HSFS to retrieve data from the external table. An external table includes a user-defined SQL string for retrieving data, but you also perform SQL operations, including projections, aggregations, and so on. The SQL query is executed on-demand when HSFS retrieves data from the external Feature Group, for example, when creating training data using features in the external table.
1+
External feature groups are offline feature groups where their data is stored in an external table. An external table requires a storage connector, defined with the Connector API (or more typically in the user interface), to enable HSFS to retrieve data from the external table. An external feature group doesn't allow for offline data ingestion or modification; instead, it includes a user-defined SQL string for retrieving data. You can also perform SQL operations, including projections, aggregations, and so on. The SQL query is executed on-demand when HSFS retrieves data from the external Feature Group, for example, when creating training data using features in the external table.
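The on-demand behaviour described in the added line can be illustrated with a small, self-contained sketch. This is not the HSFS API: `ExternalTableSketch` is a hypothetical stand-in, and an in-memory SQLite database is assumed to play the role of the external data source reached through a storage connector. It only shows that the object keeps a SQL string and runs it each time data is read, rather than ingesting rows.

```python
import sqlite3


class ExternalTableSketch:
    """Hypothetical stand-in for an external feature group (not the HSFS API)."""

    def __init__(self, connection, query):
        # Only the connection and the SQL string are stored; no data is copied.
        self.connection = connection
        self.query = query

    def read(self):
        # The query runs against the external table each time data is requested.
        return self.connection.execute(self.query).fetchall()


# An in-memory SQLite database plays the role of the external data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)", [(1, 10.0), (1, 5.0), (2, 7.5)]
)

# The user-defined SQL string may include projections and aggregations.
fg = ExternalTableSketch(
    conn,
    "SELECT store_id, SUM(amount) AS total_amount "
    "FROM sales GROUP BY store_id ORDER BY store_id",
)

first_read = fg.read()

# Rows added to the source later are visible on the next read,
# because nothing was materialized when the sketch object was created.
conn.execute("INSERT INTO sales VALUES (2, 2.5)")
second_read = fg.read()
```

Because the sketch never copies data, the second read reflects the row inserted after the object was created, which mirrors why an external feature group offers no offline ingestion or modification path of its own.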
22

33
In the image below, we can see that HSFS currently supports a large number of data sources, including any JDBC-enabled source, Snowflake, Data Lake, Redshift, BigQuery, S3, ADLS, GCS, and Kafka
44

docs/concepts/fs/feature_group/fg_overview.md

+11-1
@@ -7,6 +7,16 @@ A feature group is a table of features, where each feature group has a primary k
77

88
### Online and offline Storage
99

10-
Feature groups can be stored in a low-latency "online" database and/or in low cost, high throughput "offline" storage, typically a data lake or data warehouse. The online store stores only the latest values of features for a feature group. It is used to serve pre-computed features to models at runtime. The offline store stores the historical values of features for a feature group, so it may store many times more data than the online store. Offline feature groups are used, typically, to create training data for models, but also to retrieve data for batch scoring of models:
10+
Feature groups can be stored in a low-latency "online" database and/or in low cost, high throughput "offline" storage, typically a data lake or data warehouse.
1111

1212
<img src="../../../../assets/images/concepts/fs/feature-storage.svg">
13+
14+
#### Online Storage
15+
16+
The online store stores only the latest values of features for a feature group. It is used to serve pre-computed features to models at runtime.
17+
18+
#### Offline Storage
19+
20+
The offline store holds the historical values of features for a feature group, so it may contain much more data than the online store. Offline feature groups are typically used to create training data for models, but also to retrieve data for batch scoring of models.
21+
22+
In most cases, offline data is stored in Hopsworks, but with storage connectors it can reside in an external file system. Externally stored data can be managed by Hopsworks by defining ordinary feature groups, or it can be used read-only by defining an [External Feature Group](external_fg.md).

docs/user_guides/fs/feature_group/create.md

+1-1
@@ -87,7 +87,7 @@ When you create a feature group, you can specify the table format you want to us
8787

8888
##### Storage connector
8989

90-
During the creation of a feature group, it is possible to define the `storage_connector` parameter, this allows for saving of offline data in the desired table format outside the Hopsworks cluster. Currently, only [S3](../storage_connector/creation/s3.md) connectors are supported.
90+
During the creation of a feature group, it is possible to define the `storage_connector` parameter. This allows offline data to be managed in the desired table format outside the Hopsworks cluster. Currently, only [S3](../storage_connector/creation/s3.md) connectors and the "DELTA" `time_travel_format` are supported.
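The restriction stated in the line above (S3 connectors only, "DELTA" format only) can be expressed as a small pre-flight check. This is a hedged sketch: `check_external_storage_options` and the two constant sets are hypothetical helpers, not part of `hsfs`, and the connector type is assumed to be available as a plain string. Only the parameter names `storage_connector` and `time_travel_format` come from the text.

```python
# Values taken from the documentation line above; both sets are assumptions
# about how the restriction could be encoded, not hsfs constants.
SUPPORTED_CONNECTOR_TYPES = {"S3"}
SUPPORTED_TIME_TRAVEL_FORMATS = {"DELTA"}


def check_external_storage_options(connector_type, time_travel_format):
    """Hypothetical pre-flight check; not part of hsfs.

    Returns the keyword arguments that could be forwarded to feature group
    creation, or raises ValueError for an unsupported combination.
    """
    if connector_type.upper() not in SUPPORTED_CONNECTOR_TYPES:
        raise ValueError(
            f"Unsupported connector type {connector_type!r}; "
            f"expected one of {sorted(SUPPORTED_CONNECTOR_TYPES)}"
        )
    if time_travel_format.upper() not in SUPPORTED_TIME_TRAVEL_FORMATS:
        raise ValueError(
            f"Unsupported time_travel_format {time_travel_format!r}; "
            f"expected one of {sorted(SUPPORTED_TIME_TRAVEL_FORMATS)}"
        )
    return {"time_travel_format": time_travel_format.upper()}


options = check_external_storage_options("s3", "delta")

# With a live feature store handle (assumed here, not created in this sketch),
# the validated options would be passed alongside the connector object, e.g.:
#   fs.create_feature_group(..., storage_connector=connector, **options)
```

Failing early in client code gives a clearer error than a rejected request, but the supported combinations may change between releases, so the sets above should be treated as a snapshot of this documentation line.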
9191

9292

9393
#### Streaming Write API

docs/user_guides/integrations/databricks/configuration.md

+3
@@ -80,6 +80,9 @@ During the cluster configuration the following steps will be taken:
8080
- Install `hsfs` python library
8181
- Configure the necessary Spark properties to authenticate and communicate with the Feature Store
8282

83+
!!! note "HopsFS configuration"
84+
It is not necessary to configure HopsFS if data is stored outside the Hopsworks file system. To do this, define [Storage Connectors](../../fs/storage_connector/index.md) and link them to [Feature Groups](../../fs/feature_group/create.md) and [Training Datasets](../../fs/feature_view/training-data.md).
85+
8386
When a cluster is configured for a specific project user, all the operations with the Hopsworks Feature Store will be executed as that project user. If another user needs to re-use the same cluster, the cluster can be reconfigured by following the same steps above.
8487

8588
## Connecting to the Feature Store
