Commit 289e2dd

Author: davitbzh
Merge remote-tracking branch 'upstream/main' into FSTORE-1008-java-engine
2 parents 503246c + 26bc581 commit 289e2dd

11 files changed: +19 -27 lines changed

-54.2 KB

docs/setup_installation/admin/audit/audit-logs.md

+2 -3
@@ -28,8 +28,7 @@ To edit a configuration variable, you can click on the edit button (:material-pe
 
 | Name | Description |
 | --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| audit_log_count | the number of files to keep when rotating logs (java.util.logging.FileHandler.count) |
-| audit_log_dir | the path where audit logs are saved |
+| audit_log_count | the number of files to keep when rotating logs (java.util.logging.FileHandler count) | |
 | audit_log_file_format | log file name pattern. (java.util.logging.FileHandler.pattern) |
 | audit_log_file_type | the output format of the log file. Can be one of java.util.logging.SimpleFormatter (default), io.hops.hopsworks.audit.helper.JSONLogFormatter, or io.hops.hopsworks.audit.helper.HtmlLogFormatter. |
 | audit_log_size_limit | the maximum number of bytes to write to any one file. (java.util.logging.FileHandler.limit) |
@@ -40,7 +39,7 @@ To edit a configuration variable, you can click on the edit button (:material-pe
 
 ## Step 2: Access the Logs
 
-To access the audit logs, SSH into the **head node** of your Hopsworks cluster and navigate to the path set in the _audit\_log\_dir_ configuration variable.
+To access the audit logs, SSH into the **instance pod** of your Hopsworks cluster and navigate to the path ```/opt/payara/appserver/glassfish/nodes/<node name>/<instance name>/logs/audit```.
 
 Audit logs follow the format set in the _audit\_log\_file\_type_ configuration variable.
 
docs/tutorials/index.md

+9 -17
@@ -24,19 +24,19 @@ This is a batch use case variant of the fraud tutorial, it will give you a high
 
 | Notebooks | |
 | ----------- | ------------------------------------ |
-| 1. How to load, engineer and create feature groups | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/fraud_batch/1_fraud_batch_feature_pipeline.ipynb){:target="_blank"} |
-| 2. How to create training datasets | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/fraud_batch/2_fraud_batch_training_pipeline.ipynb){:target="_blank"} |
-| 3. How to train a model from the feature store | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/fraud_batch/3_fraud_batch_inference.ipynb){:target="_blank"} |
+| 1. [How to load, engineer and create feature groups](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/1_fraud_batch_feature_pipeline.ipynb){:target="_blank"} |
+| 2. [How to create training datasets](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/2_fraud_batch_training_pipeline.ipynb){:target="_blank"} |
+| 3. [How to train a model from the feature store](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/3_fraud_batch_inference.ipynb){:target="_blank"} |
 
 ### Online
 This is a online use case variant of the fraud tutorial, it is similar to the batch use case, however, in this tutorial you will get introduced to the usage of Feature Groups which are kept in online storage, and how to access single feature vectors from the online storage
 at low latency. Additionally, the model will be deployed as a model serving instance, to provide a REST endpoint for real time serving.
 
 | Notebooks | |
 | ----------- | ------------------------------------ |
-| 1. How to load, engineer and create feature groups | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/fraud_online/1_fraud_online_feature_pipeline.ipynb){:target="_blank"} |
-| 2. How to create training datasets | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/fraud_online/2_fraud_online_training_pipeline.ipynb){:target="_blank"} |
-| 3. How to train a model from the feature store and deploying it as a serving instance together with the online feature store | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/fraud_online/3_fraud_online_inference_pipeline.ipynb){:target="_blank"} |
+| 1. [How to load, engineer and create feature groups](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/real-time-ai-systems/fraud_online/1_fraud_online_feature_pipeline.ipynb){:target="_blank"} |
+| 2. [How to create training datasets](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/real-time-ai-systems/fraud_online/2_fraud_online_training_pipeline.ipynb){:target="_blank"} |
+| 3. [How to train a model from the feature store and deploying it as a serving instance together with the online feature store](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/real-time-ai-systems/fraud_online/3_fraud_online_inference_pipeline.ipynb){:target="_blank"} |
 
 ## Churn Tutorial
 
@@ -45,17 +45,9 @@ at low latency. Additionally, the model will be deployed as a model serving inst
 
 | Notebooks | |
 | ----------- | ------------------------------------ |
-| 1. How to load, engineer and create feature groups | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/churn/1_churn_feature_pipeline.ipynb){:target="_blank"} |
-| 2. How to create training datasets | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/churn/2_churn_training_pipeline.ipynb){:target="_blank"} |
-| 3. How to train a model from the feature store and deploying it as a serving instance together with the online feature store | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/churn/3_churn_batch_inference.ipynb){:target="_blank"} |
-
-## Iris Tutorial
-
-In this tutorial you will learn how to create an online prediction service for the Iris flower prediction problem.
-
-| Notebooks | |
-| ----------- | ------------------------------------ |
-| 1. All-in-one notebook, showing how to create the needed feature groups, train the model and deploy it as a serving instance | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/iris/iris_tutorial.ipynb){:target="_blank"} |
+| 1. How to load, engineer and create feature groups | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/churn/1_churn_feature_pipeline.ipynb){:target="_blank"} |
+| 2. How to create training datasets | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/churn/2_churn_training_pipeline.ipynb){:target="_blank"} |
+| 3. How to train a model from the feature store and deploying it as a serving instance together with the online feature store | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/churn/3_churn_batch_inference.ipynb){:target="_blank"} |
 
 ## Integration Tutorials
 
docs/user_guides/fs/feature_group/feature_monitoring.md

+1 -1
@@ -9,7 +9,7 @@ Before continuing with this guide, see the [Feature monitoring guide](../feature
 
 ## Code
 
-In this section, we show you how to setup feature monitoring in a Feature Group using the ==Hopsworks Python library==. Alternatively, you can get started quickly by running our [tutorial for feature monitoring](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/integrations/feature-monitoring/feature-monitoring.ipynb).
+In this section, we show you how to setup feature monitoring in a Feature Group using the ==Hopsworks Python library==. Alternatively, you can get started quickly by running our [tutorial for feature monitoring](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/feature_monitoring.ipynb).
 
 First, checkout the pre-requisite and Hopsworks setup to follow the guide below. Create a project, install the [Hopsworks Python library](https://pypi.org/project/hopsworks) in your environment, connect via the generated API key. The second step is to start a new configuration for feature monitoring.
 
docs/user_guides/fs/feature_monitoring/feature_monitoring_advanced.md

+1 -1
@@ -1,6 +1,6 @@
 # Advanced guide
 
-An introduction to Feature Monitoring can be found in the guides for [Feature Groups](../feature_group/feature_monitoring.md) and [Feature Views](../feature_view/feature_monitoring.md). In addition, you can get started quickly by running our [tutorial for feature monitoring](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/integrations/feature-monitoring/feature-monitoring.ipynb).
+An introduction to Feature Monitoring can be found in the guides for [Feature Groups](../feature_group/feature_monitoring.md) and [Feature Views](../feature_view/feature_monitoring.md). In addition, you can get started quickly by running our [tutorial for feature monitoring](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/feature_monitoring.ipynb).
 
 ## Retrieve feature monitoring configurations
 
docs/user_guides/fs/feature_view/feature_monitoring.md

+1 -1
@@ -9,7 +9,7 @@ Before continuing with this guide, see the [Feature monitoring guide](../feature
 
 ## Code
 
-In this section, we show you how to setup feature monitoring in a Feature View using the ==Hopsworks Python library==. Alternatively, you can get started quickly by running our [tutorial for feature monitoring](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/integrations/feature-monitoring/feature-monitoring.ipynb).
+In this section, we show you how to setup feature monitoring in a Feature View using the ==Hopsworks Python library==. Alternatively, you can get started quickly by running our [tutorial for feature monitoring](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/feature_monitoring.ipynb).
 
 First, checkout the pre-requisite and Hopsworks setup to follow the guide below. Create a project, install the [Hopsworks Python library](https://pypi.org/project/hopsworks) in your environment and connect via the generated API key. The second step is to start a new configuration for feature monitoring.
 
docs/user_guides/fs/feature_view/overview.md

+1 -1
@@ -44,7 +44,7 @@ If you want to understand more about the concept of feature view, you can refer
 .build();
 ```
 
-You can refer to [query](./query.md) and [transformation function](./model-dependent-transformations.md) for creating `query` and `transformation_function`. To see a full example of how to create a feature view, you can read [this notebook](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/fraud_batch/2_feature_view_creation.ipynb).
+You can refer to [query](./query.md) and [transformation function](./model-dependent-transformations.md) for creating `query` and `transformation_function`. To see a full example of how to create a feature view, you can read [this notebook](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/2_fraud_batch_training_pipeline.ipynb).
 
 ## Retrieval
 Once you have created a feature view, you can retrieve it by its name and version.

docs/user_guides/fs/feature_view/training-data.md

+2 -2
@@ -2,7 +2,7 @@
 
 Training data can be created from the feature view and used by different ML libraries for training different models.
 
-You can read [training data concepts](../../../concepts/fs/feature_view/offline_api.md) for more details. To see a full example of how to create training data, you can read [this notebook](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/fraud_batch/2_feature_view_creation.ipynb).
+You can read [training data concepts](../../../concepts/fs/feature_view/offline_api.md) for more details. To see a full example of how to create training data, you can read [this notebook](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/2_fraud_batch_training_pipeline.ipynb).
 
 For Python-clients, handling small or moderately-sized data, we recommend enabling the [ArrowFlight Server with DuckDB](../../../setup_installation/common/arrow_flight_duckdb.md) service,
 which will provide significant speedups over Spark/Hive for reading and creating in-memory training datasets.
@@ -29,7 +29,7 @@ print(job.id) # get the job's id and view the job status in the UI
 ### Extra filters
 Sometimes data scientists need to train different models using subsets of a dataset. For example, there can be different models for different countries, seasons, and different groups. One way is to create different feature views for training different models. Another way is to add extra filters on top of the feature view when creating training data.
 
-In the [transaction fraud example](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/fraud_batch/1_feature_groups.ipynb), there are different transaction categories, for example: "Health/Beauty", "Restaurant/Cafeteria", "Holliday/Travel" etc. Examples below show how to create training data for different transaction categories.
+In the [transaction fraud example](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/batch-ai-systems/fraud_batch/1_fraud_batch_feature_pipeline.ipynb), there are different transaction categories, for example: "Health/Beauty", "Restaurant/Cafeteria", "Holliday/Travel" etc. Examples below show how to create training data for different transaction categories.
 ```python
 # Create a training dataset for Health/Beauty
 df_health = feature_view.training_data(

docs/user_guides/fs/storage_connector/creation/s3.md

+1
@@ -17,6 +17,7 @@ When you're finished, you'll be able to read files using Spark through HSFS APIs
 Before you begin this guide you'll need to retrieve the following information from your AWS S3 account and bucket:
 
 - **Bucket:** You will need a S3 bucket that you have access to. The bucket is identified by its name.
+- **Path (Optional):** If needed, a path can be defined to ensure that all operations are restricted to a specific location within the bucket.
 - **Region (Optional):** You will need an S3 region to have complete control over data when managing the feature group that relies on this storage connector. The region is identified by its code.
 - **Authentication Method:** You can authenticate using Access Key/Secret, or use IAM roles. If you want to use an IAM role it either needs to be attached to the entire Hopsworks cluster or Hopsworks needs to be able to assume the role. See [IAM role documentation](../../../../setup_installation/admin/roleChaining.md) for more information.
 - **Server Side Encryption details:** If your bucket has server side encryption (SSE) enabled, make sure you know which algorithm it is using (AES256 or SSE-KMS). If you are using SSE-KMS, you need the resource ARN of the managed key.

docs/user_guides/fs/vector_similarity_search.md

+1 -1
@@ -108,4 +108,4 @@ There are 2 types of online feature stores in Hopsworks: online store (RonDB) an
 Create a new index per feature group to optimize retrieval performance.
 
 # Next step
-Explore the [news search example](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/hsfs/knn_search/news-search-knn.ipynb), demonstrating how to use Hopsworks for implementing a news search application using natural language in the application. Additionally, you can see the application of querying similar embeddings with additional features in this [news rank example](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/hsfs/knn_search/news-search-rank-view.ipynb).
+Explore the [news search example](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/vector_similarity_search/1_feature_group_embeddings_api.ipynb), demonstrating how to use Hopsworks for implementing a news search application using natural language in the application. Additionally, you can see the application of querying similar embeddings with additional features in this [news rank example](https://github.com/logicalclocks/hopsworks-tutorials/blob/master/api_examples/vector_similarity_search/2_feature_view_embeddings_api.ipynb).
