Commit a6bfb01

[HWORKS-742] Update documentation on external feature groups limitations (#363)
1 parent d96f13b commit a6bfb01

1 file changed

+30
-4
lines changed

docs/user_guides/fs/feature_group/create_external.md

@@ -93,13 +93,39 @@ In the snippet above it's important that the created metadata object gets regist
9393
fg.save()
9494
```
9595

96-
### Limitations
96+
### Enable online storage
9797

98-
Hopsworks Feature Store does not support time-travel capabilities for external feature groups. Moreover, as the data resides on external systems, external feature groups cannot be made available online for low latency serving. To make data from an external feature group available online, users need to define an online enabled feature group and have a job that periodically reads data from the external feature group and writes in the online feature group.
98+
You can enable online storage for external feature groups. However, the sync from the external storage to the Hopsworks online storage is not automatic and needs to be set up manually. For an external feature group to be available online, the `online_enabled` option must be set to `True` when the feature group is created.
9999

100-
!!! warning "Python support"
100+
=== "Python"
101+
102+
```python
103+
external_fg = fs.create_external_feature_group(
104+
name="sales",
105+
version=1,
106+
description="Physical shop sales features",
107+
query=query,
108+
storage_connector=connector,
109+
primary_key=['ss_store_sk'],
110+
event_time='sale_date',
111+
online_enabled=True)
112+
external_fg.save()
113+
114+
# read from external storage and filter data to sync to online
115+
df = external_fg.filter(external_fg.customer_status == "active").read()
116+
117+
# insert to online storage
118+
external_fg.insert(df)
119+
```
120+
121+
The `insert()` method takes a DataFrame as a parameter and writes it _only_ to the online feature store. Users can select which subset of the feature group data they want to make available in the online feature store by using the [query APIs](https://docs.hopsworks.ai/feature-store-api/{{{ hopsworks_version }}}/generated/api/query_api/).
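
Because the sync is manual, the read-and-insert steps are typically wrapped in a function that a scheduled job can call. The following is a minimal sketch, not part of the committed documentation: `sync_to_online` is a hypothetical helper name, and it assumes the feature group object exposes `filter()`, `select_all()` and `insert()`, and that the resulting query can be `read()` in the current engine.

```python
def sync_to_online(external_fg, condition=None):
    """Copy rows from an external feature group into its online storage.

    `external_fg` is assumed to be an online-enabled external feature group.
    `condition` is an optional filter such as
    `external_fg.customer_status == "active"`.
    """
    # Build a query: restrict the rows if a condition was given,
    # otherwise select every feature of the feature group.
    if condition is not None:
        query = external_fg.filter(condition)
    else:
        query = external_fg.select_all()

    # Read the selected rows from the external storage into a DataFrame.
    df = query.read()

    # For external feature groups, insert() writes only to the online store.
    external_fg.insert(df)
    return df
```

Calling `sync_to_online(external_fg, external_fg.customer_status == "active")` from a periodically scheduled job keeps the online store aligned with the chosen subset of the external data.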
122+
123+
### Limitations
124+
125+
Hopsworks Feature Store does not support time-travel queries on external feature groups.
101126

102-
Currently the HSFS library does not support calling the read() or show() methods on external feature groups. Likewise it is not possible to call the read() or show() methods on queries containing external feature groups. Nevertheless, external feature groups can be used from a Python engine to create training datasets.
127+
Additionally, support for the `.read()` and `.show()` methods when using the Python engine is limited to external feature groups defined on BigQuery and Snowflake, and only when using the [Feature Query Service](../../../setup_installation/common/arrow_flight_duckdb.md).
128+
Nevertheless, external feature groups defined on top of any storage connector can be used to create a training dataset from a Python environment by invoking one of the following methods: [create_training_data](https://docs.hopsworks.ai/feature-store-api/{{{ hopsworks_version }}}/generated/api/feature_view_api/#create_training_data), [create_train_test_split](https://docs.hopsworks.ai/feature-store-api/{{{ hopsworks_version }}}/generated/api/feature_view_api/#create_train_test_split) or [create_train_validation_test_split](https://docs.hopsworks.ai/feature-store-api/{{{ hopsworks_version }}}/generated/api/feature_view_api/#create_train_validation_test_split).
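
As an illustration of this path, the sketch below builds a feature view on top of an external feature group and asks Hopsworks to materialize a train/test split. It is written under assumptions rather than taken from the commit: `create_split_from_external` and the view name `sales_view` are hypothetical, and the keyword arguments to `get_or_create_feature_view` and `create_train_test_split` should be checked against the linked API reference for your HSFS version.

```python
def create_split_from_external(fs, external_fg, test_size=0.2):
    """Materialize a train/test split from an external feature group.

    `fs` is assumed to be a feature store handle and `external_fg`
    an external feature group registered in it.
    """
    # A feature view is defined by a query; here it selects all features.
    feature_view = fs.get_or_create_feature_view(
        name="sales_view",  # hypothetical name, for illustration only
        version=1,
        query=external_fg.select_all(),
    )

    # The split is materialized by Hopsworks, so the data does not need
    # to be read into the Python client with read() or show().
    return feature_view.create_train_test_split(test_size=test_size)
```

The function returns whatever `create_train_test_split` returns, so the caller can inspect the created training dataset according to the API reference.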
103129

104130

105131
### API Reference
