Commit 76aa4a0

Merge README and remove hsfs and hsml subdirectories
1 parent cf0e015 commit 76aa4a0

5 files changed: +79 -759 lines changed

README.md

+79-15
@@ -17,6 +17,10 @@
1717
src="https://img.shields.io/pypi/v/hopsworks?color=blue"
1818
alt="PyPiStatus"
1919
/></a>
20+
<a href="https://archiva.hops.works/#artifact/com.logicalclocks/hopsworks"><img
21+
src="https://img.shields.io/badge/java-HOPSWORKS-green"
22+
alt="Scala/Java Artifacts"
23+
/></a>
2024
<a href="https://pepy.tech/project/hopsworks/month"><img
2125
src="https://pepy.tech/badge/hopsworks/month"
2226
alt="Downloads"
@@ -32,9 +36,10 @@
3236
</p>
3337

3438
*hopsworks* is the python API for interacting with a Hopsworks cluster. Don't have a Hopsworks cluster just yet? Register an account on [Hopsworks Serverless](https://app.hopsworks.ai/) and get started for free. Once connected to your project, you can:
35-
- Insert dataframes into the online or offline Store, create training datasets or *serve real-time* feature vectors in the Feature Store via the [Feature Store API](https://github.com/logicalclocks/feature-store-api). Already have data somewhere you want to import, checkout our [Storage Connectors](https://docs.hopsworks.ai/latest/user_guides/fs/storage_connector/) documentation.
36-
- register ML models in the model registry and *deploy* them via model serving via the [Machine Learning API](https://gitub.com/logicalclocks/machine-learning-api).
37-
- manage environments, executions, kafka topics and more once you deploy your own Hopsworks cluster, either on-prem or in the cloud. Hopsworks is open-source and has its own [Community Edition](https://github.com/logicalclocks/hopsworks).
39+
40+
- Insert dataframes into the online or offline Store, create training datasets or *serve real-time* feature vectors in the Feature Store via the Feature Store API. Already have data somewhere that you want to import? Check out our [Storage Connectors](https://docs.hopsworks.ai/latest/user_guides/fs/storage_connector/) documentation.
41+
- Register ML models in the model registry and *deploy* them with model serving via the Machine Learning API.
42+
- Manage environments, executions, Kafka topics and more once you deploy your own Hopsworks cluster, either on-prem or in the cloud. Hopsworks is open-source and has its own [Community Edition](https://github.com/logicalclocks/hopsworks).
3843

3944
Our [tutorials](https://github.com/logicalclocks/hopsworks-tutorials) cover a wide range of use cases and examples of what *you* can build using Hopsworks.
4045

@@ -43,26 +48,29 @@ Our [tutorials](https://github.com/logicalclocks/hopsworks-tutorials) cover a wi
4348
Once you have created a project on [Hopsworks Serverless](https://app.hopsworks.ai) and a new [API Key](https://docs.hopsworks.ai/latest/user_guides/projects/api_key/create_api_key/), use your favourite virtualenv and package manager to install the library:
4449

4550
```bash
46-
pip install hopsworks
51+
pip install "hopsworks[python]"
4752
```
4853

4954
Fire up a notebook and connect to your project, you will be prompted to enter your newly created API key:
55+
5056
```python
5157
import hopsworks
5258

5359
project = hopsworks.login()
5460
```
5561

62+
### Feature Store API
63+
5664
Access the Feature Store of your project to use as a central repository for your feature data. Use *your* favourite data engineering library (pandas, polars, Spark, etc.) to insert data into the Feature Store, create training datasets or serve real-time feature vectors. Want to predict the likelihood of e-scooter accidents in real time? Here's how you can do it:
5765

5866
```python
5967
fs = project.get_feature_store()
6068

6169
# Write to Feature Groups
6270
bike_ride_fg = fs.get_or_create_feature_group(
63-
name="bike_rides",
64-
version=1,
65-
primary_key=["ride_id"],
71+
name="bike_rides",
72+
version=1,
73+
primary_key=["ride_id"],
6674
event_time="activation_time",
6775
online_enabled=True,
6876
)
@@ -73,13 +81,13 @@ fg.insert(bike_rides_df)
7381
profile_fg = fs.get_feature_group("user_profile", version=1)
7482

7583
bike_ride_fv = fs.get_or_create_feature_view(
76-
name="bike_rides_view",
77-
version=1,
84+
name="bike_rides_view",
85+
version=1,
7886
query=bike_ride_fg.select_except(["ride_id"]).join(profile_fg.select(["age", "has_license"]), on="user_id")
7987
)
8088

8189
bike_rides_Q1_2021_df = bike_ride_fv.get_batch_data(
82-
start_date="2021-01-01",
90+
start_date="2021-01-01",
8391
end_date="2021-01-31"
8492
)
8593

@@ -97,22 +105,68 @@ bike_ride_fv.init_serving()
97105
while True:
98106
new_ride_vector = poll_ride_queue()
99107
feature_vector = bike_ride_fv.get_online_feature_vector(
100-
{"user_id": new_ride_vector["user_id"]},
108+
{"user_id": new_ride_vector["user_id"]},
101109
passed_features=new_ride_vector
102110
)
103111
accident_probability = model.predict(feature_vector)
104112
```
105113

106-
Or you can use the Machine Learning API to register models and deploy them for serving:
114+
The API enables interaction with the Hopsworks Feature Store. It makes creating new features, feature groups and training datasets easy.
115+
116+
The API is environment independent and can be used in two modes:
117+
118+
- Spark mode: For data engineering jobs that create and write features into the feature store or generate training datasets. It requires a Spark environment such as the one provided in the Hopsworks platform or Databricks. In Spark mode, HSFS provides bindings both for Python and JVM languages.
119+
120+
- Python mode: For data science jobs to explore the features available in the feature store, generate training datasets and feed them into a training pipeline. Python mode requires just a Python interpreter and can be used in Hopsworks from Python Jobs/Jupyter Kernels, as well as in Amazon SageMaker or KubeFlow.
121+
122+
A Scala API is also available. Here is a short sample:
123+
124+
```scala
125+
import com.logicalclocks.hsfs._
126+
val connection = HopsworksConnection.builder().build()
127+
val fs = connection.getFeatureStore();
128+
val attendances_features_fg = fs.getFeatureGroup("games_features", 1);
129+
attendances_features_fg.show(1)
130+
```
131+
132+
### Machine Learning API
133+
134+
Use the Machine Learning API to interact with the Hopsworks Model Registry and Model Serving. The API makes it easy to export, manage and deploy models. For example, to register models and deploy them for serving:
135+
107136
```python
108137
mr = project.get_model_registry()
109138
# or
110-
ms = project.get_model_serving()
139+
ms = project.get_model_serving()
140+
141+
# Create a new model:
142+
model = mr.tensorflow.create_model(name="mnist",
143+
version=1,
144+
metrics={"accuracy": 0.94},
145+
description="mnist model description")
146+
model.save("/tmp/model_directory") # or /tmp/model_file
147+
148+
# Download a model:
149+
model = mr.get_model("mnist", version=1)
150+
model_path = model.download()
151+
152+
# Delete the model:
153+
model.delete()
154+
155+
# Get the best-performing model
156+
best_model = mr.get_best_model('mnist', 'accuracy', 'max')
157+
158+
# Deploy the model:
159+
deployment = model.deploy()
160+
deployment.start()
161+
162+
# Make predictions with a deployed model
163+
data = { "instances": [ model.input_example ] }
164+
predictions = deployment.predict(data)
111165
```
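
The `get_best_model('mnist', 'accuracy', 'max')` call above takes a model name, a metric and a direction. As a rough illustration of that selection rule, and not the registry's actual implementation, the sketch below picks the best entry from a plain list of dictionaries. The function `pick_best_model` and the record layout are hypothetical, and the example needs no Hopsworks connection.

```python
from typing import Any, Dict, List, Optional


def pick_best_model(
    models: List[Dict[str, Any]], metric: str, direction: str
) -> Optional[Dict[str, Any]]:
    """Return the record with the highest ("max") or lowest ("min") metric value.

    Hypothetical helper for illustration only; it is not part of the Hopsworks API.
    """
    if direction not in ("max", "min"):
        raise ValueError("direction must be 'max' or 'min'")
    # Ignore records that never reported the requested metric.
    candidates = [m for m in models if metric in m.get("metrics", {})]
    if not candidates:
        return None
    choose = max if direction == "max" else min
    return choose(candidates, key=lambda m: m["metrics"][metric])


# Made-up records standing in for registered model versions.
versions = [
    {"name": "mnist", "version": 1, "metrics": {"accuracy": 0.94}},
    {"name": "mnist", "version": 2, "metrics": {"accuracy": 0.97}},
    {"name": "mnist", "version": 3, "metrics": {}},
]
best = pick_best_model(versions, "accuracy", "max")
print(best["version"])
```

Records without the requested metric are skipped here rather than treated as zero, which is a design choice of this sketch and may differ from how the registry handles missing metrics.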
112166

113167
## Tutorials
114168

115-
Need more inspiration or want to learn more about the Hopsworks platform? Check out our [tutorials](https://github.com/logicalclocks/hopsworks-tutorials).
169+
Need more inspiration or want to learn more about the Hopsworks platform? Check out our [tutorials](https://github.com/logicalclocks/hopsworks-tutorials).
116170

117171
## Documentation
118172

@@ -124,7 +178,17 @@ For general questions about the usage of Hopsworks and the Feature Store please
124178

125179
Please report any issue using [Github issue tracking](https://github.com/logicalclocks/hopsworks-api/issues).
126180

181+
### Related to Feature Store API
182+
183+
If your issue is related to the Feature Store API, please attach the client environment printed by the snippet below:
184+
185+
```python
186+
import hopsworks
187+
import hsfs
188+
hopsworks.login().get_feature_store()
189+
print(hsfs.get_env())
190+
```
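
Interpreter and operating system details can also help when reproducing an issue. The sketch below is an optional supplement that uses only the Python standard library; `collect_runtime_info` is a hypothetical helper, not something the project requires or provides.

```python
import json
import platform
import sys


def collect_runtime_info() -> dict:
    """Gather basic interpreter and OS details (hypothetical helper, standard library only)."""
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "executable": sys.executable,
    }


info = collect_runtime_info()
# Print as JSON so the block can be pasted straight into an issue.
print(json.dumps(info, indent=2))
```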
191+
127192
## Contributing
128193

129194
If you would like to contribute to this library, please see the [Contribution Guidelines](CONTRIBUTING.md).
130-

hsfs/LICENSE

-201
This file was deleted.
