Commit 426f730

aversey authored and SirOibaf committed
Replace hsfs with hopsworks where it is possible in docs (logicalclocks#374)
1 parent 40167cb commit 426f730

7 files changed, +84 -16 lines changed


python/hopsworks_common/client/online_store_rest_client.py

Lines changed: 1 addition & 1 deletion
@@ -305,7 +305,7 @@ def _check_hopsworks_connection(self) -> None:
  assert (
  client.get_instance() is not None and client.get_instance()._connected
  ), """Hopsworks Client is not connected. Please connect to Hopsworks cluster
- via hopsworks.login or hsfs.connection before initialising the Online Store REST Client.
+ via hopsworks.login before initialising the Online Store REST Client.
  """
  _logger.debug("Hopsworks connection is active.")

python/hopsworks_common/connection.py

Lines changed: 68 additions & 1 deletion
@@ -477,7 +477,74 @@ def connection(
  api_key_file: Optional[str] = None,
  api_key_value: Optional[str] = None,
  ) -> Connection:
- """Connection factory method, accessible through `hopsworks.connection()`."""
+ """Connection factory method, accessible through `hopsworks.connection()`.
+
+ This class provides convenience classmethods accessible from the `hopsworks`-module:
+
+ !!! example "Connection factory"
+ For convenience, `hopsworks` provides a factory method, accessible from the top level
+ module, so you don't have to import the `Connection` class manually:
+
+ ```python
+ import hopsworks
+ conn = hopsworks.connection()
+ ```
+
+ !!! hint "Save API Key as File"
+ To get started quickly, you can simply create a file with the previously
+ created Hopsworks API Key and place it on the environment from which you
+ wish to connect to Hopsworks.
+
+ You can then connect by simply passing the path to the key file when
+ instantiating a connection:
+
+ ```python hl_lines="6"
+ import hopsworks
+ conn = hopsworks.connection(
+ 'my_instance', # DNS of your Hopsworks instance
+ 443, # Port to reach your Hopsworks instance, defaults to 443
+ api_key_file='hopsworks.key', # The file containing the API key generated above
+ hostname_verification=True) # Disable for self-signed certificates
+ )
+ project = conn.get_project("my_project")
+ ```
+
+ Clients in external clusters need to connect to the Hopsworks using an
+ API key. The API key is generated inside the Hopsworks platform, and requires at
+ least the "project" scope to be able to access a project.
+ For more information, see the [integration guides](../setup.md).
+
+ # Arguments
+ host: The hostname of the Hopsworks instance in the form of `[UUID].cloud.hopsworks.ai`,
+ defaults to `None`. Do **not** use the url including `https://` when connecting
+ programatically.
+ port: The port on which the Hopsworks instance can be reached,
+ defaults to `443`.
+ project: The name of the project to connect to. When running on Hopsworks, this
+ defaults to the project from where the client is run from.
+ Defaults to `None`.
+ engine: Which engine to use, `"spark"`, `"python"` or `"training"`. Defaults to `None`,
+ which initializes the engine to Spark if the environment provides Spark, for
+ example on Hopsworks and Databricks, or falls back on Hive in Python if Spark is not
+ available, e.g. on local Python environments or AWS SageMaker. This option
+ allows you to override this behaviour. `"training"` engine is useful when only
+ feature store metadata is needed, for example training dataset location and label
+ information when Hopsworks training experiment is conducted.
+ hostname_verification: Whether or not to verify Hopsworks' certificate, defaults
+ to `True`.
+ trust_store_path: Path on the file system containing the Hopsworks certificates,
+ defaults to `None`.
+ cert_folder: The directory to store retrieved HopsFS certificates, defaults to
+ `"/tmp"`. Only required when running without a Spark environment.
+ api_key_file: Path to a file containing the API Key, defaults to `None`.
+ api_key_value: API Key as string, if provided, `api_key_file` will be ignored,
+ however, this should be used with care, especially if the used notebook or
+ job script is accessible by multiple parties. Defaults to `None`.
+
+ # Returns
+ `Connection`. Connection handle to perform operations on a
+ Hopsworks project.
+ """
  return cls(
  host,
  port,
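
Note on the `api_key_value` / `api_key_file` arguments documented in this hunk: the docstring states that `api_key_file` is ignored when `api_key_value` is provided. A minimal sketch of that precedence follows. `resolve_api_key` is a hypothetical helper written for illustration, not part of `hopsworks`, and stripping trailing whitespace from the key file is an assumption.

```python
from pathlib import Path
from typing import Optional


def resolve_api_key(
    api_key_value: Optional[str] = None,
    api_key_file: Optional[str] = None,
) -> Optional[str]:
    """Hypothetical helper: pick the API key the way the docstring describes.

    An explicit `api_key_value` wins; `api_key_file` is only read otherwise.
    """
    if api_key_value is not None:
        # The key was passed inline, so the file (if any) is ignored.
        return api_key_value
    if api_key_file is not None:
        # Assumption: the file holds only the key, possibly followed by a newline.
        return Path(api_key_file).read_text().strip()
    # Neither argument was given.
    return None
```

Keeping the inline value out of shared notebooks, as the docstring warns, is the main reason to prefer the file-based path.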

python/hopsworks_common/project.py

Lines changed: 1 addition & 1 deletion
@@ -129,7 +129,7 @@ def get_feature_store(
  name: Project name of the feature store.
  engine: Which engine to use, `"spark"`, `"python"` or `"training"`.
  Defaults to `"python"` when connected to [Serverless Hopsworks](https://app.hopsworks.ai).
- See hsfs.Connection.connection documentation for more information.
+ See [`hopsworks.connection`](connection.md#connection) documentation for more information.
  # Returns
  `hsfs.feature_store.FeatureStore`: The Feature Store API
  # Raises
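
Note on the `engine` argument referenced here: the `hopsworks.connection` docstring describes the default as Spark when the environment provides it, otherwise Python. The sketch below illustrates that fallback only. `resolve_engine` and `VALID_ENGINES` are hypothetical names, detecting Spark by checking for an importable `pyspark` module is an assumption, and the Serverless Hopsworks default mentioned in this hunk is not modelled.

```python
import importlib.util
from typing import Optional

# Engine names taken from the docstring.
VALID_ENGINES = ("spark", "python", "training")


def resolve_engine(
    engine: Optional[str] = None,
    spark_available: Optional[bool] = None,
) -> str:
    """Hypothetical helper: an explicit engine wins, otherwise fall back."""
    if engine is not None:
        if engine not in VALID_ENGINES:
            raise ValueError(f"Unknown engine: {engine!r}")
        return engine
    if spark_available is None:
        # Assumption: Spark counts as available when pyspark can be imported.
        spark_available = importlib.util.find_spec("pyspark") is not None
    return "spark" if spark_available else "python"
```

Passing `spark_available` explicitly keeps the helper easy to exercise without installing Spark.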

python/hsfs/feature_store.py

Lines changed: 1 addition & 1 deletion
@@ -456,7 +456,7 @@ def sql(
  For spark engine: Dictionary of read options for Spark.
  For python engine:
  If running queries on the online feature store, users can provide an entry `{'external': True}`,
- this instructs the library to use the `host` parameter in the [`hsfs.connection()`](connection_api.md#connection) to establish the connection to the online feature store.
+ this instructs the library to use the `host` parameter in the [`hopsworks.login()`](login.md#login) to establish the connection to the online feature store.
  If not set, or set to False, the online feature store storage connector is used which relies on
  the private ip.
  Defaults to `{}`.

python/hsfs/feature_view.py

Lines changed: 7 additions & 7 deletions
@@ -332,7 +332,7 @@ def init_serving(
  Transformation statistics are fetched from training dataset and applied to the feature vector.
  external: boolean, optional. If set to True, the connection to the
  online feature store is established using the same host as
- for the `host` parameter in the [`hsfs.connection()`](connection_api.md#connection) method.
+ for the `host` parameter in the [`hopsworks.login()`](login.md#login) method.
  If set to False, the online feature store storage connector is used which relies on the private IP.
  Defaults to True if connection to Hopsworks is established from external environment (e.g AWS
  Sagemaker or Google Colab), otherwise to False.
@@ -587,7 +587,7 @@ def get_feature_vector(
  providing feature values which are not available in the feature store.
  external: boolean, optional. If set to True, the connection to the
  online feature store is established using the same host as
- for the `host` parameter in the [`hsfs.connection()`](connection_api.md#connection) method.
+ for the `host` parameter in the [`hopsworks.login()`](login.md#login) method.
  If set to False, the online feature store storage connector is used
  which relies on the private IP. Defaults to True if connection to Hopsworks is established from
  external environment (e.g AWS Sagemaker or Google Colab), otherwise to False.
@@ -700,7 +700,7 @@ def get_feature_vectors(
  providing feature values which are not available in the feature store.
  external: boolean, optional. If set to True, the connection to the
  online feature store is established using the same host as
- for the `host` parameter in the [`hsfs.connection()`](connection_api.md#connection) method.
+ for the `host` parameter in the [`hopsworks.login()`](login.md#login) method.
  If set to False, the online feature store storage connector is used
  which relies on the private IP. Defaults to True if connection to Hopsworks is established from
  external environment (e.g AWS Sagemaker or Google Colab), otherwise to False.
@@ -772,7 +772,7 @@ def get_inference_helper(
  Set of required primary keys is [`feature_view.primary_keys`](#primary_keys)
  external: boolean, optional. If set to True, the connection to the
  online feature store is established using the same host as
- for the `host` parameter in the [`hsfs.connection()`](connection_api.md#connection) method.
+ for the `host` parameter in the [`hopsworks.login()`](login.md#login) method.
  If set to False, the online feature store storage connector is used
  which relies on the private IP. Defaults to True if connection to Hopsworks is established from
  external environment (e.g AWS Sagemaker or Google Colab), otherwise to False.
@@ -830,7 +830,7 @@ def get_inference_helpers(
  Set of required primary keys is [`feature_view.primary_keys`](#primary_keys)
  external: boolean, optional. If set to True, the connection to the
  online feature store is established using the same host as
- for the `host` parameter in the [`hsfs.connection()`](connection_api.md#connection) method.
+ for the `host` parameter in the [`hopsworks.login()`](login.md#login) method.
  If set to False, the online feature store storage connector is used
  which relies on the private IP. Defaults to True if connection to Hopsworks is established from
  external environment (e.g AWS Sagemaker or Google Colab), otherwise to False.
@@ -907,7 +907,7 @@ def find_neighbors(
  filter: A filter expression to restrict the search space (optional).
  external: boolean, optional. If set to True, the connection to the
  online feature store is established using the same host as
- for the `host` parameter in the [`hsfs.connection()`](connection_api.md#connection) method.
+ for the `host` parameter in the [`hopsworks.login()`](login.md#login) method.
  If set to False, the online feature store storage connector is used
  which relies on the private IP. Defaults to True if connection to Hopsworks is established from
  external environment (e.g AWS Sagemaker or Google Colab), otherwise to False.
@@ -3562,7 +3562,7 @@ def transform(
  feature_vector: `Union[List[Any], List[List[Any]], pd.DataFrame, pl.DataFrame]`. The feature vector to be transformed.
  external: boolean, optional. If set to True, the connection to the
  online feature store is established using the same host as
- for the `host` parameter in the [`hsfs.connection()`](connection_api.md#connection) method.
+ for the `host` parameter in the [`hopsworks.login()`](login.md#login) method.
  If set to False, the online feature store storage connector is used
  which relies on the private IP. Defaults to True if connection to Hopsworks is established from
  external environment (e.g AWS Sagemaker or Google Colab), otherwise to False.
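
Note on the `external` parameter repeated across these hunks: the docstrings say that `True` reuses the `host` given at login, `False` goes through the storage connector's private IP, and the default depends on whether the client runs outside Hopsworks. A rough sketch of that decision follows. `online_store_host` and all of its parameter names are hypothetical and do not mirror the actual `hsfs` internals.

```python
from typing import Optional


def online_store_host(
    connection_host: str,
    connector_private_ip: str,
    external: Optional[bool] = None,
    client_is_external: bool = False,
) -> str:
    """Hypothetical helper: choose the host used for the online feature store."""
    # When `external` is left unset, default to True only for external clients
    # (for example AWS SageMaker or Google Colab), as the docstrings describe.
    use_external = client_is_external if external is None else external
    return connection_host if use_external else connector_private_ip
```

For example, `online_store_host("my_instance", "10.0.0.5", client_is_external=True)` returns `"my_instance"`, while passing `external=False` returns the private IP.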

python/hsfs/training_dataset.py

Lines changed: 3 additions & 3 deletions
@@ -1003,7 +1003,7 @@ def init_prepared_statement(
  initialised for retrieving serving vectors as a batch.
  external: boolean, optional. If set to True, the connection to the
  online feature store is established using the same host as
- for the `host` parameter in the [`hsfs.connection()`](connection_api.md#connection) method.
+ for the `host` parameter in the [`hopsworks.login()`](login.md#login) method.
  If set to False, the online feature store storage connector is used
  which relies on the private IP. Defaults to True if connection to Hopsworks is established from
  external environment (e.g AWS Sagemaker or Google Colab), otherwise to False.
@@ -1020,7 +1020,7 @@ def get_serving_vector(
  serving application.
  external: boolean, optional. If set to True, the connection to the
  online feature store is established using the same host as
- for the `host` parameter in the [`hsfs.connection()`](connection_api.md#connection) method.
+ for the `host` parameter in the [`hopsworks.login()`](login.md#login) method.
  If set to False, the online feature store storage connector is used
  which relies on the private IP. Defaults to True if connection to Hopsworks is established from
  external environment (e.g AWS Sagemaker or Google Colab), otherwise to False.
@@ -1042,7 +1042,7 @@ def get_serving_vectors(
  serving application.
  external: boolean, optional. If set to True, the connection to the
  online feature store is established using the same host as
- for the `host` parameter in the [`hsfs.connection()`](connection_api.md#connection) method.
+ for the `host` parameter in the [`hopsworks.login()`](login.md#login) method.
  If set to False, the online feature store storage connector is used
  which relies on the private IP. Defaults to True if connection to Hopsworks is established from
  external environment (e.g AWS Sagemaker or Google Colab), otherwise to False.

python/hsml/core/dataset_api.py

Lines changed: 3 additions & 2 deletions
@@ -61,10 +61,11 @@ def upload(
  """Upload a file to the Hopsworks filesystem.

  ```python
+ import hopsworks

- conn = hsml.connection(project="my-project")
+ project = hopsworks.login(project="my-project")

- dataset_api = conn.get_dataset_api()
+ dataset_api = project.get_dataset_api()


  uploaded_file_path = dataset_api.upload("my_local_file.txt", "Resources")
