Slightly reworks the SDF deployment docs page.
1. Moves the paragraph about cloud worker creation under the explanations of
creating Host & Remote workers.
2. Adjusts wording & adds a couple of links.
Still to do:
1. Proof the `Remote Worker` section and the `Managing Dataflows` section.
2. Clarify the difference in use case between the user-managed "Remote" worker
and the InfinyOn-managed "Cloud" worker.
sdf/deployment.mdx (52 additions, 79 deletions)
@@ -4,33 +4,29 @@ description: Deployment of dataflow via a Worker
  sidebar_position: 60
  ---

- # Introduction
+ ## Introduction

  When you use the `run` command to execute a dataflow, it runs within the same process as the CLI. This is useful for development and testing because it's easy to start without needing to manage additional resources. It allows for quick testing and validation of the dataflow, and you can easily load and integrate development packages.

- For production deployment, the `deploy` command is used to deploy the dataflow on a worker. All operations available in `run` also apply to deploy, with the following differences:
+ For production deployment, the `deploy` command is used to deploy the dataflow on a `worker`. All operations available in `run` also apply to deploy, with the following differences:

  - The dataflow is executed on the worker, not within the CLI process. The CLI communicates with the worker on the user's behalf.
- - The dataflow continues running even if the CLI is shut down. It will only terminate if the worker is stopped, shut down, or the dataflow is explicitly stopped or deleted.
+ - The dataflow continues running even if the CLI is shut down. It will only terminate if the worker is stopped or shut down, or if the dataflow is explicitly stopped or deleted.
  - Dataflows in the worker only have access to published packages, unlike `run` mode, which allows access to local packages. If you need to use a package, you must publish it first.
  - Multiple dataflows can be deployed on the worker, with each dataflow isolated from the others. They do not share any state or memory but can communicate via Fluvio topics.

+ To use deployment mode, it's essential to understand what a worker is, and [how to manage a dataflow inside a worker](#managing-dataflows).
- To use deployment mode, it's essential to understand the following concepts:
- - Workers
- - Deploying dataflows to workers
- - Dataflow lifecycle within a worker
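For orientation, a minimal sketch of the two modes described above (the `sdf deploy` invocation appears later on this page; `sdf run` is assumed here to be the corresponding development-mode command):

```bash
# Development: execute the dataflow inside the CLI process
$> sdf run

# Production: deploy the dataflow to the currently selected worker;
# it keeps running after the CLI exits
$> sdf deploy
```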
- # Workers
+ ## Workers

  A worker is the deployment target for a dataflow and must be created and provisioned before deploying a dataflow. The worker can run anywhere as long as it can connect to the same Fluvio cluster. If you're using InfinyOn Cloud, the worker is automatically provisioned.

  There is no limit to the number of dataflows you can run on each worker, apart from CPU, memory, and disk constraints. For optimal performance, it is recommended to run a single worker per machine.

- There are two types of workers: `Host` and `Remote`. `Host` is a simple worker designed for local deployment without requiring any additional infrastructure. It is not designed for robust production deployment.
- For typical production deployment, you will use `Remote` worker. It is designed to run in the cloud, data center, or edge device. If you are using InfinyOn Cloud, the `remote` cloud worker is automatically provisioned and registered in your profile.
+ There are two types of workers: `host` and `remote`. A host worker is a simple worker designed for local deployment without requiring any additional infrastructure. It is not designed for robust production deployments. For typical production deployments, you will use remote workers. Remote workers are designed to run in the cloud, in a data center, or on edge devices. If you are using InfinyOn Cloud, the remote cloud worker is automatically provisioned and registered in your profile.

+ A worker "profile" is maintained for each Fluvio cluster. The worker profile maintains a list of UUIDs of the cluster's workers, as well as the currently selected worker. When you switch the Fluvio profile, the corresponding worker profile is used automatically. Together, the worker profile and Fluvio profile allow the [SDF CLI] to issue commands to the selected worker. Once a worker is selected, it will be used for all dataflow operations until you choose a different worker. Each worker also has a human-readable name which is used to easily identify the worker in the CLI.
- Each worker has a unique identifier for the Fluvio cluster. The worker profile is stored in the local machine and is used to match the worker with the Fluvio cluster. When you switch the Fluvio profile, the worker profile is also switched. Once a worker is selected, it will be used for all dataflow operations until you choose a different worker.
- The worker also human-readable name that is used to identify the worker in the CLI.

  ### Host Workers
@@ -39,20 +35,20 @@ To create host worker, you can use the following command.
  $> sdf worker create <name>
  ```

- This will creates and register a new worker in your machine. It will run in the background until you shutdown the worker or machine is rebooted. The name can be anything as long as it is unique for your machine since profile are not shared across different machines.
+ This creates and registers a new worker on your machine. It will run in the background until you shut down the worker or the machine is rebooted. The name can be anything.

- Once you have created the worker, You can list them.
+ Once you have created a worker, you can view the list of workers on your Fluvio cluster.

  ```bash
  $> sdf worker create main
  Worker `main` created for cluster: `local`
  $> sdf worker list
-    NAME  TYPE  CLUSTER  WORKER ID
-  * main  Host  local    7fd7eda3-2738-41ef-8edc-9f04e500b919
+    NAME  TYPE  CLUSTER  WORKER ID                             VERSION
+  * main  Host  local    7fd7eda3-2738-41ef-8edc-9f04e500b919  <your SDF version>
  ```
- The `*` indicates the current selected worker.
+ The `*` indicates the currently selected worker.

- SDF only support running a single HOST worker for each machine since a single worker can support many dataflow. If you try to create another worker, you will get an error message.
+ SDF only supports running a single host worker for each machine since a single worker can support many dataflows. If you try to create another worker, you will get an error message.

  ```bash
  $ sdf worker create main2
@@ -61,26 +57,16 @@ There is already a host worker with pid 20686 running. Please terminate it firs
  ```

  Shutting down a worker will terminate all running dataflow and worker processes.

  ```bash
  $> sdf worker shutdown main
- sdf worker shutdown main
  Shutting down pid: 20688
- Shutting down pid: 20686
  Host worker: main has been shutdown
  ```

- Even though host worker is shutdown and removed from the profile, the dataflow files and state are still persisted. You can restart the worker and the dataflow will resume.
+ Even though the host worker is shut down and removed from the profile, the dataflow files and state are still persisted. You can restart the worker and the dataflow will resume.

- For example, if you have dataflow `fraud-detector` and `car-processor` running in the worker and you shut down the worker, the dataflow process will be terminated. But you can resume by recreating the HOST worker.
-
- ```bash
- $> sdf worker create main
- ```
-
- The local worker stores the dataflow state in the local file system. The dataflow state is stored in the `~/.sdf/<cluster>/worker/<dataflow>`.
- For the `local` cluster, files will be stored in `~/.sdf/local/worker/dataflows`.
-
- if you have deleted the fluvio cluster, the worker needs to be manually shutdown and created again. This limitation will be removed in a future release
+ Host workers store the dataflow state in the local file system at `~/.sdf/local/worker/dataflows`. If you have deleted your local Fluvio cluster, the worker needs to be manually shut down and created again. This limitation will be removed in a future release.
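For example (a sketch using only the commands shown above; `main` is the worker name from the earlier example), shutting down the host worker stops its dataflow processes, and recreating it resumes them from the persisted state:

```bash
# Stop the worker; running dataflows are terminated but their state stays on disk
$> sdf worker shutdown main

# Recreate the host worker; previously deployed dataflows resume
$> sdf worker create main
```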
  ### Remote Workers
@@ -93,7 +79,7 @@ Typical lifecycle for using remote worker:
  Note that there are many ways to manage the remote worker. You can use Kubernetes, Docker, Systemd, Terraform, Ansible, or any other tool that can manage the server process and ensure it can restart when server is rebooted. Please contact InfinyOn support for more information.

- InfinyOn cloud is a simplest way to use the remote worker. When you create a cluster in InfinyOn cloud, it will automatically provision and sync worker for you.
+ InfinyOn Cloud is the simplest way to use a remote worker. When you create a cluster in InfinyOn Cloud, it will automatically provision and sync a worker for you.

  The worker is automatically register when you create the cluster. By default, worker is name as cluster name.
@@ -134,7 +120,7 @@ To unregister the worker after you are done with and no longer need, you can us
  $> sdf worker unregister <name>
  ```

- ## Managing workers
+ ### Managing workers

  Workers must be registered before deploying a dataflow. The CLI provides commands to manage workers, including creating, listing, switching, and deleting them.
@@ -160,8 +146,30 @@ finding all available workers:
  With `-all` option, it will display `version` of the discovered worker.

+ // check if this is true for remote workers
+ The dataflow state is stored in `~/.sdf/<cluster>/worker/<dataflow>`.

- # Deploying dataflow
+ ### Workers on InfinyOn Cloud

+ With InfinyOn Cloud, there is no need to manage the worker. It provisions the worker for you. It also syncs the profile when the cluster is created.

+ For example, creating a cloud cluster will automatically provision a worker and create an SDF worker profile.

+ ```bash
+ $> fluvio cloud login --use-oauth2
+ $> fluvio cloud cluster create
+ Creating cluster...
+ Done!
+ Downloading cluster config
+ Registered sdf worker: jellyfish
+ Switched to new profile: jellyfish
+ ```

+ You can unregister the cloud worker like any other remote worker.
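To illustrate (a sketch reusing the `unregister` subcommand shown earlier; `jellyfish` is the worker name from the example output above):

```bash
# Remove the cloud worker registration from the local worker profile
$> sdf worker unregister jellyfish
```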
+ ## Managing Dataflows

+ ### Deploying Dataflows to Workers

  Once worker is selected, you can deploy the dataflow using `deploy` command:
@@ -175,16 +183,14 @@ The deploy command is similar to the run command. It deploys the dataflow and st
  Error: No workers. run `sdf worker create` to create one.
  ```

- ## Managing dataflow in worker
-
  When you are running dataflow in the worker, it will indicate name of the worker in the prompt:

  ```bash
  $> sdf deploy
  [main] >> show state
  ```

- ## Listing and selecting dataflow
+ ### Listing and selecting dataflow

  To list all dataflows running in the worker, you can use the `show dataflow` command which shows the fully qualified name of the dataflow and its status.
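A minimal sketch of listing dataflows from the worker prompt (illustrative only; the exact output columns are not shown here, but per the text above each entry includes the fully qualified name and status):

```bash
$> sdf deploy
# lists each dataflow by fully qualified name (e.g. myorg/wordcount-simple@0.10) with its status
[main] >> show dataflow
```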
@@ -212,22 +218,7 @@ To select the dataflow, you can use `dataflow select` with the fully qualified d
  dataflow switched to: myorg/wordcount-simple@0.10
  ```

- ## Deleting dataflow
-
- To delete the dataflow, you can use the `dataflow delete` command.
-
- After you delete the dataflow, it will no longer be listed in the dataflow list.