[FSTORE-1371][FSTORE-1346][FSTORE-1212] Migrate to pyproject, ruff and upgrade docs #200

Merged
merged 11 commits into from
May 6, 2024
9 changes: 4 additions & 5 deletions .github/workflows/mkdocs-main.yml
@@ -7,21 +7,21 @@ jobs:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
with:
fetch-depth: 0

- name: set dev version
working-directory: ./java
run: echo "DEV_VERSION=$(mvn org.apache.maven.plugins:maven-help-plugin:2.1.1:evaluate -Dexpression=project.version | grep -Ev 'Download|INFO|WARNING')" >> $GITHUB_ENV

- uses: actions/setup-python@v2
- uses: actions/setup-python@v5
with:
python-version: '3.8'
python-version: "3.10"

- name: install deps
working-directory: ./python
run: cp ../README.md . && pip3 install 'git+https://github.com/logicalclocks/feature-store-api@master#egg=hsfs[python]&subdirectory=python' && pip3 install -e .[dev,docs]
run: cp ../README.md . && pip3 install 'hsfs[python] @ git+https://github.com/logicalclocks/feature-store-api@master#subdirectory=python' && pip3 install -e .[dev,docs]

- name: generate autodoc
run: python3 auto_doc.py
@@ -33,4 +33,3 @@ jobs:

- name: mike deploy docs
run: mike deploy ${{ env.DEV_VERSION }} dev -u
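
Note on the `install deps` change above: the new command swaps the legacy `#egg=` URL fragment for a PEP 508 direct reference of the form `name[extras] @ url`. The following is a minimal sketch of how such a requirement string is put together. The helper name `direct_reference` is hypothetical and exists only for illustration; it is not part of pip or of this repository.

```python
def direct_reference(name, url, extras=(), subdirectory=None):
    """Build a PEP 508 direct-reference requirement string (illustrative helper)."""
    # Extras go in square brackets right after the distribution name.
    extras_part = f"[{','.join(extras)}]" if extras else ""
    # pip reads the package location inside the repository from the URL fragment.
    fragment = f"#subdirectory={subdirectory}" if subdirectory else ""
    return f"{name}{extras_part} @ {url}{fragment}"


requirement = direct_reference(
    "hsfs",
    "git+https://github.com/logicalclocks/feature-store-api@master",
    extras=["python"],
    subdirectory="python",
)
print(requirement)
```

The printed string is what the workflow passes to `pip3 install` in quotes, so the shell does not interpret the brackets or the `@`.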

14 changes: 5 additions & 9 deletions .github/workflows/mkdocs-release.yml
@@ -2,14 +2,14 @@ name: mkdocs-release

on:
push:
branches: [ branch-* ]
branches: [branch-*]

jobs:
publish-release:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v4
with:
fetch-depth: 0

@@ -20,16 +20,13 @@ jobs:
- name: set major/minor release version
run: echo "MAJOR_VERSION=$(echo $RELEASE_VERSION | sed 's/^\([0-9]*\.[0-9]*\).*$/\1/')" >> $GITHUB_ENV

- uses: actions/setup-python@v2
- uses: actions/setup-python@v5
with:
python-version: '3.8'
python-version: "3.10"

- name: install deps
working-directory: ./python
run: cp ../README.md . && pip3 install pip==22.0.3 && pip3 install -e .[dev,docs]

- name: use dev mike
run: pip3 uninstall -y mike && pip3 install git+'https://github.com/jimporter/mike.git'
run: cp ../README.md . && pip3 install -e .[dev,docs]

- name: generate autodoc
run: python3 auto_doc.py
@@ -43,4 +40,3 @@ jobs:
run: |
mike deploy ${{ env.RELEASE_VERSION }} ${{ env.MAJOR_VERSION }} -u --push
mike alias ${{ env.RELEASE_VERSION }} latest -u --push
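
Note on the `set major/minor release version` step above: the `sed` expression keeps the leading `<digits>.<digits>` of `RELEASE_VERSION` and drops everything after it. A rough Python equivalent is sketched below. The function name `major_minor` and the sample version strings are assumptions for illustration; how `RELEASE_VERSION` itself is set is not visible in this diff.

```python
import re


def major_minor(release_version):
    """Mirror the sed expression: keep the leading "<digits>.<digits>" prefix."""
    # Same pattern as the workflow step; a string without that prefix is returned unchanged.
    return re.sub(r"^([0-9]*\.[0-9]*).*$", r"\1", release_version)


for sample in ("3.7.1", "3.7.0-SNAPSHOT", "4.0"):
    print(sample, "->", major_minor(sample))
```

The result is the value that `mike deploy` receives as the alias for the full release version.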

48 changes: 33 additions & 15 deletions .github/workflows/python-lint.yml
@@ -11,21 +11,39 @@ jobs:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v3
- uses: actions/setup-python@v4
with:
python-version: '3.8'
- name: install deps
run: pip install flake8==3.9.0 black==22.3.0 pre-commit-hooks==2.4.0

- name: black
run: black --check python
- uses: actions/checkout@v4

- name: flake8
run: flake8 --config python/.flake8 python
- uses: actions/setup-python@v5
with:
python-version: "3.11"

- name: trailing-whitespace-fixer
run: trailing-whitespace-fixer $(find python -name "*.py" -type f) || exit 1
- name: Get all changed files
id: get-changed-files
uses: tj-actions/changed-files@v44
with:
files_yaml: |
src:
- 'python/**/*.py'
- '!python/tests/**/*.py'
test:
- 'python/tests/**/*.py'

- name: end-of-file-fixer
run: end-of-file-fixer $(find python -name "*.py" -type f) || exit 1
- name: install deps
run: pip install ruff==0.4.2

- name: ruff on python files
if: steps.get-changed-files.outputs.src_any_changed == 'true'
env:
SRC_ALL_CHANGED_FILES: ${{ steps.get-changed-files.outputs.src_all_changed_files }}
run: ruff check --output-format=github $SRC_ALL_CHANGED_FILES

- name: ruff on test files
if: steps.get-changed-files.outputs.test_any_changed == 'true'
env:
TEST_ALL_CHANGED_FILES: ${{ steps.get-changed-files.outputs.test_all_changed_files }}
run: ruff check --output-format=github $TEST_ALL_CHANGED_FILES

- name: ruff format --check $ALL_CHANGED_FILES
env:
ALL_CHANGED_FILES: ${{ steps.get-changed-files.outputs.all_changed_files }}
run: ruff format --check $ALL_CHANGED_FILES
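
Note on the `files_yaml` filter above: changed files are split into a `src` group (Python files under `python/` except tests) and a `test` group (Python files under `python/tests/`), and each group gets its own `ruff check` step. The sketch below approximates that split with plain path logic. It is not the implementation used by `tj-actions/changed-files`; the function name `classify_changed_files` is hypothetical, and it assumes `**` also matches zero directories.

```python
from pathlib import PurePosixPath


def classify_changed_files(paths):
    """Approximate the src/test split described by the files_yaml patterns."""
    groups = {"src": [], "test": []}
    for raw in paths:
        path = PurePosixPath(raw)
        # Only Python files under the top-level python/ directory are linted.
        if path.suffix != ".py" or path.parts[:1] != ("python",):
            continue
        if path.parts[:2] == ("python", "tests"):
            groups["test"].append(raw)
        else:
            groups["src"].append(raw)
    return groups


changed = [
    "python/hopsworks/project.py",
    "python/tests/test_project.py",
    "README.md",
    "java/pom.xml",
]
print(classify_changed_files(changed))
```

An empty group corresponds to the `*_any_changed` output not being `'true'`, in which case the matching step is skipped.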
1 change: 1 addition & 0 deletions .gitignore
@@ -50,6 +50,7 @@ coverage.xml
*.cover
.hypothesis/
.pytest_cache/
.ruff_cache/

# Translations
*.mo
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -74,7 +74,7 @@ We use `mkdocs` together with `mike` ([for versioning](https://github.com/jimpor
1. Currently we are using our own version of `keras-autodoc`

```bash
pip install git+https://github.com/moritzmeister/keras-autodoc@split-tags-properties
pip install git+https://github.com/logicalclocks/keras-autodoc
```

2. Install HOPSWORKS with `docs` extras:
94 changes: 74 additions & 20 deletions README.md
@@ -9,6 +9,10 @@
src="https://img.shields.io/badge/docs-HOPSWORKS-orange"
alt="Hopsworks Documentation"
/></a>
<a><img
src="https://img.shields.io/badge/python-3.8+-blue"
alt="python"
/></a>
<a href="https://pypi.org/project/hopsworks/"><img
src="https://img.shields.io/pypi/v/hopsworks?color=blue"
alt="PyPiStatus"
@@ -17,48 +21,98 @@
src="https://pepy.tech/badge/hopsworks/month"
alt="Downloads"
/></a>
<a href="https://github.com/psf/black"><img
src="https://img.shields.io/badge/code%20style-black-000000.svg"
alt="CodeStyle"
<a href="https://github.com/astral-sh/ruff"><img
src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json"
alt="Ruff"
/></a>
<a><img
src="https://img.shields.io/pypi/l/hopsworks?color=green"
alt="License"
/></a>
</p>

*hopsworks* is the python API for interacting with a Hopsworks cluster.
*hopsworks* is the python API for interacting with a Hopsworks cluster. Don't have a Hopsworks cluster just yet? Register an account on [Hopsworks Serverless](https://app.hopsworks.ai/) and get started for free. Once connected to your project, you can:
- Insert dataframes into the online or offline Store, create training datasets or *serve real-time* feature vectors in the Feature Store via the [Feature Store API](https://github.com/logicalclocks/feature-store-api). Already have data somewhere you want to import? Check out our [Storage Connectors](https://docs.hopsworks.ai/latest/user_guides/fs/storage_connector/) documentation.
- Register ML models in the model registry and *deploy* them through model serving via the [Machine Learning API](https://github.com/logicalclocks/machine-learning-api).
- Manage environments, executions, Kafka topics and more once you deploy your own Hopsworks cluster, either on-prem or in the cloud. Hopsworks is open-source and has its own [Community Edition](https://github.com/logicalclocks/hopsworks).

## Getting Started On Hopsworks
Our [tutorials](https://github.com/logicalclocks/hopsworks-tutorials) cover a wide range of use cases and examples of what *you* can build using Hopsworks.

Instantiate a connection and get the project object
```python
import hopsworks
## Getting Started On Hopsworks

connection = hopsworks.connection()
Once you have created a project on [Hopsworks Serverless](https://app.hopsworks.ai) and a new [API key](https://docs.hopsworks.ai/latest/user_guides/projects/api_key/create_api_key/), use your favourite virtualenv and package manager to install the library:

project = connection.get_project("my_project")
```bash
pip install hopsworks
```

Fire up a notebook and connect to your project. You will be prompted to enter your newly created API key:
```python
import hopsworks

project = hopsworks.login()
```

Create a new project
Access the Feature Store of your project to use as a central repository for your feature data. Use *your* favourite data engineering library (pandas, polars, Spark, etc.) to insert data into the Feature Store, create training datasets or serve real-time feature vectors. Want to predict the likelihood of e-scooter accidents in real time? Here's how you can do it:

```python
project = connection.create_project("my_project")
fs = project.get_feature_store()

# Write to Feature Groups
bike_ride_fg = fs.get_or_create_feature_group(
name="bike_rides",
version=1,
primary_key=["ride_id"],
event_time="activation_time",
online_enabled=True,
)

bike_ride_fg.insert(bike_rides_df)

# Read from Feature Views
profile_fg = fs.get_feature_group("user_profile", version=1)

bike_ride_fv = fs.get_or_create_feature_view(
name="bike_rides_view",
version=1,
query=bike_ride_fg.select_except(["ride_id"]).join(profile_fg.select(["age", "has_license"]), on="user_id")
)

bike_rides_Q1_2021_df = bike_ride_fv.get_batch_data(
start_date="2021-01-01",
end_date="2021-01-31"
)

# Create a training dataset
version, job = bike_ride_fv.create_train_test_split(
test_size=0.2,
description='Description of a dataset',
# you can have different data formats such as csv, tsv, tfrecord, parquet and others
data_format='csv'
)

# Predict the probability of accident in real-time using new data + context data
bike_ride_fv.init_serving()

while True:
new_ride_vector = poll_ride_queue()
feature_vector = bike_ride_fv.get_online_feature_vector(
{"user_id": new_ride_vector["user_id"]},
passed_features=new_ride_vector
)
accident_probability = model.predict(feature_vector)
```

Upload data to a project
Or you can use the Machine Learning API to register models and deploy them for serving:
```python
dataset_api = project.get_dataset_api()

dataset_api.upload("data.csv", "Resources")
mr = project.get_model_registry()
# or
ms = project.get_model_serving()
```

## Tutorials




You can find more examples on how to use the library in our [hops-examples](https://github.com/logicalclocks/hops-examples) repository.
Need more inspiration or want to learn more about the Hopsworks platform? Check out our [tutorials](https://github.com/logicalclocks/hopsworks-tutorials).

## Documentation

2 changes: 1 addition & 1 deletion docs/CONTRIBUTING.md
@@ -74,7 +74,7 @@ We use `mkdocs` together with `mike` ([for versioning](https://github.com/jimpor
1. Currently we are using our own version of `keras-autodoc`

```bash
pip install git+https://github.com/moritzmeister/keras-autodoc@split-tags-properties
pip install git+https://github.com/logicalclocks/keras-autodoc
```

2. Install HOPSWORKS with `docs` extras:
7 changes: 6 additions & 1 deletion docs/css/custom.css
@@ -1,4 +1,4 @@
:root {
[data-md-color-scheme="hopsworks"] {
--md-primary-fg-color: #1EB382;
--md-secondary-fg-color: #188a64;
--md-tertiary-fg-color: #0d493550;
@@ -24,6 +24,11 @@
box-shadow: 0 0 0 0;
}

.md-tabs__item {
min-width: 2.25rem;
min-height: 1.5rem;
}

.md-tabs__item:hover {
background-color: var(--md-tertiary-fg-color);
transition: background-color 450ms;
21 changes: 9 additions & 12 deletions docs/css/dropdown.css
@@ -1,5 +1,5 @@
/* Style The Dropdown Button */
.dropbtn-api {
.dropbtn {
color: white;
border: none;
cursor: pointer;
@@ -9,20 +9,17 @@
contain: inherit;
}
.md-tabs {
overflow: inherit;
}
.md-header {
z-index: 1000 !important;
overflow: inherit;
}

/* The container <div> - needed to position the dropdown content */
.dropdown-api {
position: relative;
.dropdown {
position: absolute;
display: inline-block;
}

/* Dropdown Content (Hidden by Default) */
.dropdown-content-api {
.dropdown-content {
display:none;
font-size: 13px;
position: absolute;
@@ -35,21 +32,21 @@ overflow: inherit;
}

/* Links inside the dropdown */
.dropdown-content-api a {
.dropdown-content a {
color: black;
padding: 12px 16px;
text-decoration: none;
display: block;
}

/* Change color of dropdown links on hover */
.dropdown-content-api a:hover {background-color: #f1f1f1}
.dropdown-content a:hover {background-color: #f1f1f1}

/* Show the dropdown menu on hover */
.dropdown-api:hover .dropdown-content-api {
.dropdown:hover .dropdown-content {
display: block;
}

/* Change the background color of the dropdown button when the dropdown content is shown */
.dropdown-api:hover .dropbtn-api {
.dropdown:hover .dropbtn {
}
2 changes: 2 additions & 0 deletions docs/js/dropdown.js
@@ -0,0 +1,2 @@
document.getElementsByClassName("md-tabs__link")[7].style.display = "none";
document.getElementsByClassName("md-tabs__link")[9].style.display = "none";
9 changes: 7 additions & 2 deletions docs/js/inject-api-links.js
Original file line number Diff line number Diff line change
@@ -1,14 +1,19 @@
window.addEventListener("DOMContentLoaded", function () {
var windowPathNameSplits = window.location.pathname.split("/");
var majorVersionRegex = new RegExp("(\\d+[.]\\d+)")
var latestRegex = new RegExp("latest");
if (majorVersionRegex.test(windowPathNameSplits[1])) { // On landing page docs.hopsworks.ai/3.0 - URL contains major version
// Version API dropdown
document.getElementById("hopsworks_api_link").href = "https://docs.hopsworks.ai/hopsworks-api/" + windowPathNameSplits[1] + "/generated/api/login/";
document.getElementById("hsfs_api_link").href = "https://docs.hopsworks.ai/feature-store-api/" + windowPathNameSplits[1] + "/generated/api/connection_api/";
document.getElementById("hsml_api_link").href = "https://docs.hopsworks.ai/machine-learning-api/" + windowPathNameSplits[1] + "/generated/connection_api/";
} else { // on docs.hopsworks.ai/feature-store-api/3.0 / docs.hopsworks.ai/hopsworks-api/3.0 / docs.hopsworks.ai/machine-learning-api/3.0
var apiVersion = windowPathNameSplits[2];
var majorVersion = apiVersion.match(majorVersionRegex)[0];
if (latestRegex.test(windowPathNameSplits[2]) || latestRegex.test(windowPathNameSplits[1])) {
var majorVersion = "latest";
} else {
var apiVersion = windowPathNameSplits[2];
var majorVersion = apiVersion.match(majorVersionRegex)[0];
}
// Version main navigation
document.getElementsByClassName("md-tabs__link")[0].href = "https://docs.hopsworks.ai/" + majorVersion;
document.getElementsByClassName("md-tabs__link")[1].href = "https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/quickstart.ipynb";
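
Note on the `inject-api-links.js` change above: the new branch lets the script resolve `latest` as a version when neither of the first two path segments contains a `major.minor` number. The decision logic can be sketched in Python as follows. The function name `resolve_docs_version` and the sample paths are assumptions for illustration. Unlike the JavaScript, which would throw when no version is found, the sketch raises an explicit `ValueError`.

```python
import re

MAJOR_VERSION = re.compile(r"\d+[.]\d+")


def resolve_docs_version(pathname):
    """Return the version segment used to build the docs links for a URL path."""
    parts = pathname.split("/")
    first = parts[1] if len(parts) > 1 else ""
    second = parts[2] if len(parts) > 2 else ""
    if MAJOR_VERSION.search(first):
        # Landing page, e.g. /3.0/: the script uses the first path segment as is.
        return first
    if "latest" in second or "latest" in first:
        # New in this change: fall back to the "latest" alias.
        return "latest"
    match = MAJOR_VERSION.search(second)
    if match is None:
        raise ValueError(f"no version found in path: {pathname!r}")
    # API docs page, e.g. /feature-store-api/3.0.1/...: keep only major.minor.
    return match.group(0)


for path in ("/3.0/", "/hopsworks-api/latest/generated/api/login/", "/feature-store-api/3.0.1/generated/"):
    print(path, "->", resolve_docs_version(path))
```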