|
9 | 9 | src="https://img.shields.io/badge/docs-HOPSWORKS-orange"
|
10 | 10 | alt="Hopsworks Documentation"
|
11 | 11 | /></a>
|
| 12 | + <a><img |
| 13 | + src="https://img.shields.io/badge/python-3.8+-blue" |
| 14 | + alt="python" |
| 15 | + /></a> |
12 | 16 | <a href="https://pypi.org/project/hopsworks/"><img
|
13 | 17 | src="https://img.shields.io/pypi/v/hopsworks?color=blue"
|
14 | 18 | alt="PyPiStatus"
|
|
17 | 21 | src="https://pepy.tech/badge/hopsworks/month"
|
18 | 22 | alt="Downloads"
|
19 | 23 | /></a>
|
20 |
| - <a href="https://github.com/psf/black"><img |
21 |
| - src="https://img.shields.io/badge/code%20style-black-000000.svg" |
22 |
| - alt="CodeStyle" |
| 24 | + <a href="https://github.com/astral-sh/ruff"><img |
| 25 | + src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json" |
| 26 | + alt="Ruff" |
23 | 27 | /></a>
|
24 | 28 | <a><img
|
25 | 29 | src="https://img.shields.io/pypi/l/hopsworks?color=green"
|
26 | 30 | alt="License"
|
27 | 31 | /></a>
|
28 | 32 | </p>
|
29 | 33 |
|
30 |
| -*hopsworks* is the python API for interacting with a Hopsworks cluster. |
| 34 | +*hopsworks* is the Python API for interacting with a Hopsworks cluster. Don't have a Hopsworks cluster yet? Register an account on [Hopsworks Serverless](https://app.hopsworks.ai/) and get started for free. Once connected to your project, you can: |
| 35 | + - Insert dataframes into the online or offline store, create training datasets or *serve real-time* feature vectors in the Feature Store via the [Feature Store API](https://github.com/logicalclocks/feature-store-api). Already have data somewhere that you want to import? Check out our [Storage Connectors](https://docs.hopsworks.ai/latest/user_guides/fs/storage_connector/) documentation. |
| 36 | + - Register ML models in the model registry and *deploy* them with model serving via the [Machine Learning API](https://github.com/logicalclocks/machine-learning-api). |
| 37 | + - Manage environments, executions, Kafka topics and more once you deploy your own Hopsworks cluster, either on-prem or in the cloud. Hopsworks is open-source and has its own [Community Edition](https://github.com/logicalclocks/hopsworks). |
31 | 38 |
|
32 |
| -## Getting Started On Hopsworks |
| 39 | +Our [tutorials](https://github.com/logicalclocks/hopsworks-tutorials) cover a wide range of use cases and examples of what *you* can build using Hopsworks. |
33 | 40 |
|
34 |
| -Instantiate a connection and get the project object |
35 |
| -```python |
36 |
| -import hopsworks |
| 41 | +## Getting Started On Hopsworks |
37 | 42 |
|
38 |
| -connection = hopsworks.connection() |
| 43 | +Once you have created a project on [Hopsworks Serverless](https://app.hopsworks.ai) and a new [API key](https://docs.hopsworks.ai/latest/user_guides/projects/api_key/create_api_key/), use your favourite virtualenv and package manager to install the library: |
39 | 44 |
|
40 |
| -project = connection.get_project("my_project") |
| 45 | +```bash |
| 46 | +pip install hopsworks |
| 47 | +``` |
41 | 48 |
|
| 49 | +Fire up a notebook and connect to your project. You will be prompted to enter your newly created API key: |
| 50 | +```python |
| 51 | +import hopsworks |
42 | 52 |
|
| 53 | +project = hopsworks.login() |
43 | 54 | ```
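For scripts and scheduled jobs where nobody is around to answer the prompt, the key can be passed explicitly instead. The snippet below is a minimal sketch, assuming the key is stored in an environment variable; the variable name `HOPSWORKS_API_KEY` and the helpers `read_api_key` and `connect` are choices made for this example, while `api_key_value` is an argument of `hopsworks.login()`.

```python
import os


def read_api_key(env_var="HOPSWORKS_API_KEY"):
    """Return the API key stored in `env_var`, or None if it is unset or blank.

    The variable name is an assumption of this sketch; use whatever your
    deployment or secret manager provides.
    """
    key = os.environ.get(env_var, "").strip()
    return key or None


def connect():
    """Log in without a prompt when a key is available, else fall back to the prompt."""
    import hopsworks  # imported lazily so read_api_key() works without the package

    key = read_api_key()
    if key is None:
        return hopsworks.login()
    return hopsworks.login(api_key_value=key)
```

Calling `project = connect()` then behaves like `hopsworks.login()` in a notebook, but skips the prompt whenever the variable is set.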
|
44 | 55 |
|
45 |
| -Create a new project |
| 56 | +Access the Feature Store of your project to use as a central repository for your feature data. Use *your* favourite data engineering library (pandas, polars, Spark, etc.) to insert data into the Feature Store, create training datasets or serve real-time feature vectors. Want to predict the likelihood of e-scooter accidents in real time? Here's how you can do it: |
| 57 | + |
46 | 58 | ```python
|
47 |
| -project = connection.create_project("my_project") |
| 59 | +fs = project.get_feature_store() |
| 60 | + |
| 61 | +# Write to Feature Groups |
| 62 | +bike_ride_fg = fs.get_or_create_feature_group( |
| 63 | +    name="bike_rides", |
| 64 | +    version=1, |
| 65 | +    primary_key=["ride_id"], |
| 66 | +    event_time="activation_time", |
| 67 | +    online_enabled=True, |
| 68 | +) |
| 69 | + |
| 70 | +bike_ride_fg.insert(bike_rides_df) |
| 71 | + |
| 72 | +# Read from Feature Views |
| 73 | +profile_fg = fs.get_feature_group("user_profile", version=1) |
| 74 | + |
| 75 | +bike_ride_fv = fs.get_or_create_feature_view( |
| 76 | +    name="bike_rides_view", |
| 77 | +    version=1, |
| 78 | +    query=bike_ride_fg.select_except(["ride_id"]).join(profile_fg.select(["age", "has_license"]), on="user_id") |
| 79 | +) |
| 80 | + |
| 81 | +bike_rides_Q1_2021_df = bike_ride_fv.get_batch_data( |
| 82 | +    start_date="2021-01-01", |
| 83 | +    end_date="2021-03-31" |
| 84 | +) |
| 85 | + |
| 86 | +# Create a training dataset |
| 87 | +version, job = bike_ride_fv.create_train_test_split( |
| 88 | +    test_size=0.2, |
| 89 | +    description='Description of a dataset', |
| 90 | +    # you can have different data formats such as csv, tsv, tfrecord, parquet and others |
| 91 | +    data_format='csv' |
| 92 | +) |
| 93 | + |
| 94 | +# Predict the probability of accident in real-time using new data + context data |
| 95 | +bike_ride_fv.init_serving() |
| 96 | + |
| 97 | +while True: |
| 98 | +    new_ride_vector = poll_ride_queue() |
| 99 | +    feature_vector = bike_ride_fv.get_online_feature_vector( |
| 100 | +        {"user_id": new_ride_vector["user_id"]}, |
| 101 | +        passed_features=new_ride_vector |
| 102 | +    ) |
| 103 | +    accident_probability = model.predict(feature_vector) |
48 | 104 | ```
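The serving loop above relies on `poll_ride_queue()` and `model`, neither of which is defined in the snippet. The sketch below shows one way the loop could be wired up and tried locally, assuming ride events arrive on an in-process `queue.Queue`. Everything here is hypothetical scaffolding for illustration: `ride_queue`, `poll_ride_queue`, `serve`, and the `get_features` / `predict` callables stand in for the feature view lookup and the model, and none of them are part of the Hopsworks API.

```python
import queue

# Hypothetical in-process event source; in production this might be Kafka or similar.
ride_queue = queue.Queue()


def poll_ride_queue(timeout=1.0):
    """Return the next ride event as a dict, or None if nothing arrives in time."""
    try:
        return ride_queue.get(timeout=timeout)
    except queue.Empty:
        return None


def serve(get_features, predict, max_idle_polls=1, poll_timeout=1.0):
    """Score ride events until the queue has been idle for `max_idle_polls` polls.

    `get_features(entry, passed_features=...)` stands in for the feature view
    lookup and `predict(vector)` for the model; both are supplied by the caller.
    """
    predictions = []
    idle_polls = 0
    while idle_polls < max_idle_polls:
        ride = poll_ride_queue(timeout=poll_timeout)
        if ride is None:
            idle_polls += 1
            continue
        idle_polls = 0
        vector = get_features({"user_id": ride["user_id"]}, passed_features=ride)
        predictions.append(predict(vector))
    return predictions
```

Unlike `while True`, this version stops once the queue stays empty, which makes it easier to exercise with stub callables before pointing it at a real feature view and model.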
|
49 | 105 |
|
50 |
| -Upload data to a project |
| 106 | +Or you can use the Machine Learning API to register models and deploy them for serving: |
51 | 107 | ```python
|
52 |
| -dataset_api = project.get_dataset_api() |
53 |
| - |
54 |
| -dataset_api.upload("data.csv", "Resources") |
| 108 | +mr = project.get_model_registry() |
| 109 | +# or |
| 110 | +ms = project.get_model_serving() |
55 | 111 | ```
|
56 | 112 |
|
| 113 | +## Tutorials |
57 | 114 |
|
58 |
| - |
59 |
| - |
60 |
| - |
61 |
| -You can find more examples on how to use the library in our [hops-examples](https://github.com/logicalclocks/hops-examples) repository. |
| 115 | +Need more inspiration or want to learn more about the Hopsworks platform? Check out our [tutorials](https://github.com/logicalclocks/hopsworks-tutorials). |
62 | 116 |
|
63 | 117 | ## Documentation
|
64 | 118 |
|