Commit 50550bf

[HWORKS-1104] Explicit storage connector provenance (#382)
1 parent 8cead6a commit 50550bf

File tree

3 files changed

+73
-4
lines changed

docs/user_guides/fs/provenance/provenance.md

+73-4
@@ -2,11 +2,80 @@
22

33
## Introduction
44

5-
Hopsworks feature store allows users to track provenance (lineage) between feature groups, feature views and training dataset. Tracking lineage allows users to determine where/if a feature group is being used. You can track if feature groups are being used to create additional (derived) feature groups or feature views.
5+
Hopsworks feature store allows users to track provenance (lineage) between storage connectors, feature groups, feature views, training datasets and models. Tracking lineage allows users to determine where/if a feature group is being used. You can track if feature groups are being used to create additional (derived) feature groups or feature views.
66

77
You can interact with the provenance graph using the UI and the APIs.
88

9-
## Step 1: Feature group lineage
9+
## Step 1: Storage connector lineage
10+
11+
The relationship between storage connectors and feature groups is captured automatically when you create an external feature group. You can inspect the relationship between storage connectors and feature groups using the APIs.
12+
13+
=== "Python"
14+
15+
```python
16+
# Retrieve the storage connector
17+
snowflake_sc = fs.get_storage_connector("snowflake_sc")
18+
19+
# Create the user profiles feature group
20+
user_profiles_fg = fs.create_external_feature_group(
21+
name="user_profiles",
22+
version=1,
23+
storage_connector=snowflake_sc,
24+
query="SELECT * FROM USER_PROFILES"
25+
)
26+
user_profiles_fg.save()
27+
```
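
The snippet above uses `fs` without showing where it comes from. A minimal sketch of obtaining a feature store handle follows, assuming the `hopsworks` Python client is installed and credentials for a reachable Hopsworks cluster are available. The helper name `connect_feature_store` is hypothetical and not part of this commit.

```python
def connect_feature_store():
    """Log in to Hopsworks and return the project's feature store handle."""
    # Imported lazily so this helper can be defined without the client installed
    import hopsworks

    # Assumption: credentials are supplied via the environment or an interactive prompt
    project = hopsworks.login()
    return project.get_feature_store()
```

Calling `fs = connect_feature_store()` would then provide the `fs` object used when retrieving the storage connector.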
28+
29+
### Using the APIs
30+
31+
Starting from a feature group metadata object, you can traverse the provenance graph upstream to retrieve the metadata objects of the storage connectors linked to the feature group. To do so, use the [get_storage_connector_provenance](https://docs.hopsworks.ai/feature-store-api/{{{ hopsworks_version }}}/generated/api/feature_group_api/#get_storage_connector_provenance) method.
32+
33+
=== "Python"
34+
35+
```python
36+
# Returns all storage connectors linked to the provided feature group
37+
lineage = user_profiles_fg.get_storage_connector_provenance()
38+
39+
# List all accessible parent storage connectors
40+
lineage.accessible
41+
42+
# List all deleted parent storage connectors
43+
lineage.deleted
44+
45+
# List all the inaccessible parent storage connectors
46+
lineage.inaccessible
47+
```
48+
49+
=== "Python"
50+
51+
```python
52+
# Returns an accessible storage connector linked to the feature group (if it exists)
53+
user_profiles_fg.get_storage_connector()
54+
```
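
The three lists on the lineage object can be combined into a quick health check for a feature group's upstream storage connectors. The sketch below is illustrative only: `LinksStandIn` is a hypothetical stand-in that mirrors just the `accessible`, `deleted` and `inaccessible` attributes shown above, so the example runs without a Hopsworks cluster. The object actually returned by `get_storage_connector_provenance()` may expose more than this.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class LinksStandIn:
    """Hypothetical stand-in exposing only the three lists shown above."""
    accessible: List[Any] = field(default_factory=list)
    deleted: List[Any] = field(default_factory=list)
    inaccessible: List[Any] = field(default_factory=list)


def summarize_upstream(lineage) -> dict:
    """Count the parent storage connectors in each lineage category."""
    return {
        "accessible": len(lineage.accessible),
        "deleted": len(lineage.deleted),
        "inaccessible": len(lineage.inaccessible),
    }


def has_usable_parent(lineage) -> bool:
    """Return True when at least one parent storage connector is accessible."""
    return len(lineage.accessible) > 0


# Placeholder string instead of a real storage connector metadata object
example = LinksStandIn(accessible=["snowflake_sc"])
summary = summarize_upstream(example)
```

With a real feature group, the same helpers could be applied to the result of `user_profiles_fg.get_storage_connector_provenance()`, assuming it exposes these three attributes as lists.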
55+
56+
To traverse the provenance graph in the opposite direction (i.e. from the storage connector to the feature group), you can use the [get_feature_groups_provenance](https://docs.hopsworks.ai/feature-store-api/{{{ hopsworks_version }}}/generated/api/storage_connector_api/#get_feature_groups_provenance) method. When navigating the provenance graph downstream, deleted feature groups are not tracked by provenance, so the `deleted` property always returns an empty list.
57+
58+
=== "Python"
59+
60+
```python
61+
# Returns all feature groups linked to the provided storage connector
62+
lineage = snowflake_sc.get_feature_groups_provenance()
63+
64+
# List all accessible downstream feature groups
65+
lineage.accessible
66+
67+
# List all the inaccessible downstream feature groups
68+
lineage.inaccessible
69+
```
70+
71+
=== "Python"
72+
73+
```python
74+
# Returns all accessible feature groups linked to the storage connector (if any exists)
75+
snowflake_sc.get_feature_groups()
76+
```
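
One possible use of the downstream lineage is to check whether any feature group still depends on a storage connector before retiring it. The sketch below is an assumption-laden illustration: `can_retire_connector` is a hypothetical helper, and `SimpleNamespace` stands in for the lineage object so that the example runs without a cluster. It ignores `deleted` because, as described above, that list is always empty when traversing downstream.

```python
from types import SimpleNamespace


def can_retire_connector(lineage) -> bool:
    """Return True only when no downstream feature group is linked.

    Inaccessible feature groups count as dependents too: they exist,
    but the current user is not allowed to read them.
    """
    return len(lineage.accessible) == 0 and len(lineage.inaccessible) == 0


# Placeholder lineage objects; real ones would come from
# snowflake_sc.get_feature_groups_provenance()
in_use = SimpleNamespace(accessible=["user_profiles"], deleted=[], inaccessible=[])
hidden_use = SimpleNamespace(accessible=[], deleted=[], inaccessible=["restricted_fg"])
unused = SimpleNamespace(accessible=[], deleted=[], inaccessible=[])
```

Treating inaccessible feature groups as dependents is a conservative design choice here: a connector that looks unused to one user may still back feature groups in projects that user cannot see.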
77+
78+
## Step 2: Feature group lineage
1079

1180
### Assign parents to a feature group
1281

@@ -96,7 +165,7 @@ To traverse the provenance graph in the opposite direction (i.e. from the parent
96165
lineage.inaccessible
97166
```
98167

99-
You can also visualize the relationship between the parent and child feature groups in the UI. In each feature group overview page you can find a provenance section with the graph of parent feature groups and child feature groups/feature views.
168+
You can also visualize the relationship between the parent and child feature groups in the UI. In each feature group overview page you can find a provenance section with the graph of parent storage connectors/feature groups and child feature groups/feature views.
100169

101170
<p align="center">
102171
<figure>
@@ -105,7 +174,7 @@ You can also visualize the relationship between the parent and child feature gro
105174
</figure>
106175
</p>
107176

108-
## Step 2: Feature view lineage
177+
## Step 3: Feature view lineage
109178

110179
The relationship between feature groups and feature views is captured automatically when you create a feature view. You can inspect the relationship between feature groups and feature views using the APIs or the UI.
111180
