Commit 45b7b06

Update docs and examples for new standalone CLI (#524)
* Update docs for standalone CLI usage
* Update examples for standalone CLI usage
* Refine some expressions to make docs clearer
1 parent a37eddc commit 45b7b06

File tree

15 files changed

+61
-151
lines changed

docs/docs/core/cli.mdx

Lines changed: 21 additions & 42 deletions
Original file line numberDiff line numberDiff line change
@@ -8,60 +8,39 @@ import TabItem from '@theme/TabItem';
88

99
# CocoIndex CLI
1010

11-
CocoIndex CLI embeds CLI functionality in your program.
12-
It provides a bunch of commands for easily managing and inspecting your flows and indexes.
11+
CocoIndex CLI is a standalone tool for easily managing and inspecting your flows and indexes.
1312

14-
## Enable CocoIndex CLI
13+
## Invoking the CLI
1514

16-
### Use Packaged Main
15+
Once CocoIndex is installed, you can invoke the CLI directly using the `cocoindex` command. Most commands require an `APP_TARGET` argument, which tells the CLI where your flow definitions are located.
1716

18-
The easiest way is to use a packaged main function:
17+
**APP_TARGET Format:**
1918

20-
<Tabs>
21-
<TabItem value="python" label="Python" default>
19+
The `APP_TARGET` can be:
20+
1. A **path to a Python file** defining your flows (e.g., `main.py`, `path/to/my_flows.py`).
21+
2. An **installed Python module name** that contains your flow definitions (e.g., `my_package.flows`).
22+
3. For commands that operate on a *specific flow* (like `show`, `update`, `evaluate`), you can combine the application reference with a flow name:
23+
* `path/to/my_flows.py:MyFlow`
24+
* `my_package.flows:MyFlow`
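
As a rough illustration of the `APP_TARGET` forms listed above, the sketch below splits a target string into its application reference and optional flow name. `AppTarget` and `parse_app_target` are hypothetical names introduced here, not part of CocoIndex, and the `.py` suffix check is only an assumed heuristic for telling a file path from a module name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AppTarget:
    app_ref: str              # file path or module name
    flow_name: Optional[str]  # set only when ":FlowName" is present
    is_file: bool             # True when app_ref looks like a Python file


def parse_app_target(target: str) -> AppTarget:
    """Split 'app_ref[:FlowName]' (illustrative only, not the CLI's own code)."""
    app_ref, sep, flow_name = target.rpartition(":")
    if not sep:
        # No colon: the whole string is the application reference.
        app_ref, flow_name = target, ""
    if not app_ref:
        raise ValueError(f"missing application reference in {target!r}")
    # Assumed heuristic: a ".py" suffix means a file path, else a module name.
    is_file = app_ref.endswith(".py")
    return AppTarget(app_ref, flow_name or None, is_file)
```

For example, `parse_app_target("my_package.flows:MyFlow")` yields the module reference `my_package.flows` with flow name `MyFlow`. The sketch does not handle paths that themselves contain a colon, such as Windows drive letters.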
2225

23-
```python title="main.py"
24-
import cocoindex
26+
**Global Options:**
2527

26-
@cocoindex.main_fn()
27-
def main():
28-
    ...
29-
```
28+
* `--env-file <path>`: Load environment variables from a specified `.env` file. If not provided, `.env` in the current directory is loaded if it exists.
29+
* `--version`: Show the CocoIndex version and exit.
30+
* `--help`: Show the main help message and exit.
3031

31-
</TabItem>
32-
</Tabs>
33-
34-
With this, when the program is executed with `cocoindex` as its first argument, CocoIndex CLI will take over the control. For example:
35-
36-
```sh
37-
$ python main.py cocoindex ls # Run "ls" subcommand: list all flows
38-
```
39-
40-
You may also provide a `cocoindex_cmd` argument to the `main_fn` decorator to change the command from `cocoindex` to something else.
41-
42-
### Explicitly CLI Invoke
43-
44-
An alternative way is to use `cocoindex.cli.cli` (with type [`click.Group`](https://click.palletsprojects.com/en/stable/api/#click.Group)).
45-
For example, you may invoke the CLI explicitly with additional arguments:
46-
47-
<Tabs>
48-
<TabItem value="python" label="Python" default>
49-
50-
```python
51-
cocoindex.cli.cli.main(args)
52-
```
53-
54-
</TabItem>
55-
</Tabs>
32+
:::caution Deprecated Usage
33+
The old method of invoking the CLI using `python main.py cocoindex ...` via the `@cocoindex.main_fn()` decorator is now deprecated. Please remove `@cocoindex.main_fn()` from your scripts and use the standalone `cocoindex` command as described above.
34+
:::
5635

5736
## Subcommands
5837

5938
The following subcommands are available:
6039

6140
| Subcommand | Description |
6241
| ---------- | ----------- |
63-
| `ls` | List all flows present in the current process. Or list all persisted flows under the current app namespace if `--all` is specified. |
64-
| `show` | Show the spec for a specific flow. |
42+
| `ls` | List all flows present in the given file/module, or all persisted flows under the current app namespace if no file/module is specified. |
43+
| `show` | Show the spec and schema for a specific flow. |
6544
| `setup` | Check and apply backend setup changes for flows, including the internal and target storage (to export). |
6645
| `drop` | Drop the backend setup for specified flows. |
6746
| `update` | Update the index defined by the flow. |
@@ -71,6 +50,6 @@ The following subcommands are available:
7150
Use `--help` to see the full list of subcommands, and `subcommand --help` to see the usage of a specific one.
7251

7352
```sh
74-
python main.py cocoindex --help # Show all subcommands
75-
python main.py cocoindex show --help # Show usage of "show" subcommand
53+
cocoindex --help # Show all subcommands
54+
cocoindex show --help # Show usage of "show" subcommand
7655
```
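
The command changes in this commit follow one pattern: the script path moves from before `cocoindex` to after the subcommand, and help requests drop the script entirely. A minimal sketch of that rewrite follows. `migrate_command` is a hypothetical helper written for illustration, not something CocoIndex ships, and it assumes the old command has the shape `python <script> cocoindex <subcommand> [args]`.

```python
import shlex


def migrate_command(old: str) -> str:
    """Rewrite an old-style invocation into the standalone CLI form."""
    parts = shlex.split(old)
    is_old_form = (
        len(parts) >= 4
        and parts[0].startswith("python")
        and parts[2] == "cocoindex"
    )
    if not is_old_form:
        # Not an old-style CLI call (e.g. "python quickstart.py"): keep as-is.
        return old
    script, subcommand, rest = parts[1], parts[3], parts[4:]
    if subcommand.startswith("-") or "--help" in rest:
        # Help requests take no script argument in the examples above.
        return shlex.join(["cocoindex", subcommand, *rest])
    return shlex.join(["cocoindex", subcommand, script, *rest])
```

Commands that do not match the old shape, such as plain `python quickstart.py`, are returned unchanged, which matches how the quickstart still runs its query loop directly.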

docs/docs/core/flow_methods.mdx

Lines changed: 3 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -30,17 +30,7 @@ def demo_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataSco
3030
```
3131

3232
It creates a `demo_flow` object of type `cocoindex.Flow`.
33-
To enable CLI, you also need to make sure you have a main function decorated with `@cocoindex.main_fn()`:
3433

35-
36-
```python title="main.py"
37-
@cocoindex.main_fn()
38-
def main():
39-
    ...
40-
41-
if __name__ == "__main__":
42-
    main()
43-
```
4434
</TabItem>
4535
</Tabs>
4636

@@ -78,7 +68,7 @@ The `cocoindex update` subcommand creates/updates data in the target storage.
7868
Once it's done, the target data is fresh up to the moment when the function is called.
7969

8070
```sh
81-
python main.py cocoindex update
71+
cocoindex update main.py
8272
```
8373

8474
#### Library API
@@ -115,7 +105,7 @@ Change capture mechanisms enable CocoIndex to continuously capture changes from
115105
To perform live update, run the `cocoindex update` subcommand with `-L` option:
116106

117107
```sh
118-
python main.py cocoindex update -L
108+
cocoindex update main.py -L
119109
```
120110

121111
If there's at least one data source with change capture mechanism enabled, it will keep running until aborted (e.g. by `Ctrl-C`).
@@ -232,7 +222,7 @@ It takes the following options:
232222
Example:
233223

234224
```sh
235-
python main.py cocoindex evaluate --output-dir ./eval_output
225+
cocoindex evaluate main.py --output-dir ./eval_output
236226
```
237227

238228
### Library API

docs/docs/core/initialization.mdx

Lines changed: 29 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -15,42 +15,42 @@ We'll talk about the code skeleton to initialize the library in your code, and t
1515

1616
There are two options to initialize in your code:
1717

18-
* Use packaged main function. It's easier to start with.
18+
* Use the CocoIndex CLI. It's easier to start with.
1919
* Explicit initialization. It's more flexible.
2020

21-
### Packaged Main
21+
### CLI-Based Initialization
2222

23-
The easiest way is to use a packaged main function:
23+
When you use the `cocoindex` command-line tool, the library is automatically initialized for you:
2424

25-
<Tabs>
26-
<TabItem value="python" label="Python" default>
25+
1. **Environment File Loading**:
26+
* By default, the `cocoindex` CLI searches upward from the current directory for a `.env` file.
27+
* You can use `--env-file <path>` to specify one explicitly:
2728

28-
The `@cocoindex.main_fn` decorator wraps your main function for CocoIndex:
29-
30-
```python
31-
import cocoindex
29+
```sh
30+
cocoindex --env-file path/to/custom.env <COMMAND> ...
31+
```
3232

33-
@cocoindex.main_fn()
34-
def main():
35-
    ...
33+
* If no file is found, only existing system environment variables are used.
34+
* Loaded variables do **not** override existing system ones.
3635

37-
if __name__ == "__main__":
38-
    main()
39-
```
36+
2. **Automatic Library Initialization**:
37+
* Then, the CLI automatically prepares everything using loaded environment variables — no manual setup required.
38+
* Your script (e.g. `main.py`) is just used to discover defined flows.
4039

41-
</TabItem>
42-
</Tabs>
40+
See [Environment Variables](#environment-variables) for supported variables.
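
The environment-file behavior described above (search upward for `.env`, never override existing variables) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the CLI's actual implementation: `find_env_file` and `load_env_file` are hypothetical names, and the parser only handles simple `KEY=VALUE` lines.

```python
from pathlib import Path
from typing import MutableMapping, Optional


def find_env_file(start: Path) -> Optional[Path]:
    """Walk from `start` up to the filesystem root looking for a `.env` file."""
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / ".env"
        if candidate.is_file():
            return candidate
    return None


def load_env_file(path: Path, environ: MutableMapping[str, str]) -> None:
    """Add KEY=VALUE pairs from `path` without overriding existing keys."""
    for raw in path.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        # setdefault keeps any value that is already present.
        environ.setdefault(key.strip(), value.strip().strip("\"'"))
```

Passing `os.environ` as `environ` would apply the file to the current process. Using `setdefault` is what keeps values already set in the system environment from being replaced.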
4341

44-
This takes care of the following effects:
42+
You interact with CocoIndex via CLI commands that operate on your script:
4544

46-
1. Initialize the library with settings loaded from environment variables, if not explicitly provided.
47-
2. If the program is executed with the `cocoindex` command, CocoIndex CLI will take over the control.
48-
It provides a bunch of commands for easily managing and inspecting indexes.
49-
See [CocoIndex CLI](/docs/core/cli) for more details.
50-
3. Otherwise, it will run the main function.
45+
```sh
46+
# Example: List flows defined in my_app.py
47+
cocoindex ls my_app.py
5148

52-
See [Environment Variables](#environment-variables) for supported environment variables.
49+
# Example: Update a specific flow in my_app.py
50+
cocoindex update my_app.py:MyFlowName
51+
```
5352

53+
See [CocoIndex CLI](/docs/core/cli) for more details.
5454

5555
### Explicit Initialization
5656

@@ -123,7 +123,11 @@ If you use the Postgres database hosted by [Supabase](https://supabase.com/), pl
123123

124124
## Environment Variables
125125

126-
When you use the packaged main function, settings will be loaded from environment variables.
126+
When using the CLI, settings are primarily loaded from environment variables. The CLI will:
127+
128+
* Use the `--env-file` option if provided.
129+
* Otherwise, try to locate a `.env` file by searching upward from the current directory.
130+
127131
Each setting field has a corresponding environment variable:
128132

129133
| environment variable | corresponding field in `Settings` | required? |

docs/docs/getting_started/quickstart.md

Lines changed: 8 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -46,19 +46,15 @@ We'll need to install a bunch of dependencies for this project.
4646
2. Prepare input files for the index. Put them in a directory, e.g. `markdown_files`.
4747
If you don't have any files at hand, you may download the example [markdown_files.zip](markdown_files.zip) and unzip it in the current directory.
4848
49-
## Step 2: Create the Python file `quickstart.py`
49+
## Step 2: Define the indexing flow
5050
5151
Create a new file `quickstart.py` and import the `cocoindex` library:
5252
5353
```python title="quickstart.py"
5454
import cocoindex
5555
```
5656
57-
Then we'll create the indexing flow.
58-
59-
### Step 2.1: Define the indexing flow
60-
61-
Starting from the indexing flow:
57+
Then we'll create the indexing flow as follows.
6258

6359
```python title="quickstart.py"
6460
@cocoindex.flow_def(name="TextEmbedding")
@@ -117,24 +113,6 @@ Notes:
117113

118114
6. In CocoIndex, a *collector* collects multiple entries of data together. In this example, the `doc_embeddings` collector collects data from all `chunk`s across all `doc`s, and uses the collected data to build a vector index `"doc_embeddings"`, using `Postgres`.
119115

120-
### Step 2.2: Define the main function
121-
122-
We can provide an empty main function for now, with a `@cocoindex.main_fn()` decorator:
123-
124-
```python title="quickstart.py"
125-
@cocoindex.main_fn()
126-
def _main():
127-
    pass
128-
129-
if __name__ == "__main__":
130-
    _main()
131-
```
132-
133-
The `@cocoindex.main_fn` declares a function as the main function for an indexing application. This achieves the following effects:
134-
135-
* Initialize the CocoIndex library states. Settings (e.g. database URL) are loaded from environment variables by default.
136-
* When the CLI is invoked with `cocoindex` subcommand, `cocoindex CLI` takes over the control, which provides convenient ways to manage the index. See the next step for more details.
137-
138116
## Step 3: Run the indexing pipeline and queries
139117

140118
Specify the database URL by environment variable:
@@ -148,7 +126,7 @@ export COCOINDEX_DATABASE_URL="postgresql://cocoindex:cocoindex@localhost:5432/c
148126
We need to set up the index:
149127

150128
```bash
151-
python quickstart.py cocoindex setup
129+
cocoindex setup quickstart.py
152130
```
153131

154132
Enter `yes` and it will automatically create a few tables in the database.
@@ -160,7 +138,7 @@ Now we have tables needed by this CocoIndex flow.
160138
Now we're ready to build the index:
161139
162140
```bash
163-
python quickstart.py cocoindex update
141+
cocoindex update quickstart.py
164142
```
165143
166144
It will run for a few seconds and output the following statistics:
@@ -260,13 +238,12 @@ There're two CocoIndex-specific logic:
260238
It's done by the `eval()` method of the transform flow `text_to_embedding`.
261239
The return type of this method is `list[float]` as declared in the `text_to_embedding()` function (`cocoindex.DataSlice[list[float]]`).
262240
263-
### Step 4.3: Update the main function
241+
### Step 4.3: Add the main script logic
264242
265-
Now we can update the main function to use the query function we just defined:
243+
Now we can add the main logic to the program. It uses the query function we just defined:
266244
267245
```python title="quickstart.py"
268-
@cocoindex.main_fn()
269-
def _run():
246+
if __name__ == "__main__":
270247
    # Initialize the database connection pool.
271248
    pool = ConnectionPool(os.getenv("COCOINDEX_DATABASE_URL"))
272249
    # Run queries in a loop to demonstrate the query capabilities.
@@ -291,7 +268,7 @@ It interacts with users and search the database by calling the `search()` method
291268
292269
### Step 4.4: Run queries against the index
293270
294-
Now we can run the same Python file, which will run the new main function:
271+
Now we can run the same Python file, which will run the newly added main logic:
295272
296273
```bash
297274
python quickstart.py

examples/amazon_s3_embedding/main.py

Lines changed: 0 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,3 @@
1-
from dotenv import load_dotenv
2-
31
import cocoindex
42
import os
53

@@ -52,7 +50,6 @@ def amazon_s3_text_embedding_flow(flow_builder: cocoindex.FlowBuilder, data_scop
5250
model="sentence-transformers/all-MiniLM-L6-v2")),
5351
default_similarity_metric=cocoindex.VectorSimilarityMetric.COSINE_SIMILARITY)
5452

55-
@cocoindex.main_fn()
5653
def _run():
5754
    # Use a `FlowLiveUpdater` to keep the flow data updated.
5855
    with cocoindex.FlowLiveUpdater(amazon_s3_text_embedding_flow):
@@ -73,5 +70,4 @@ def _run():
7370
break
7471

7572
if __name__ == "__main__":
76-
    load_dotenv(override=True)
7773
    _run()

examples/code_embedding/main.py

Lines changed: 0 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,3 @@
1-
from dotenv import load_dotenv
2-
31
import cocoindex
42
import os
53

@@ -54,7 +52,6 @@ def code_embedding_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoind
5452
query_transform_flow=code_to_embedding,
5553
default_similarity_metric=cocoindex.VectorSimilarityMetric.COSINE_SIMILARITY)
5654

57-
@cocoindex.main_fn()
5855
def _run():
5956
    # Run queries in a loop to demonstrate the query capabilities.
6057
    while True:
@@ -73,5 +70,4 @@ def _run():
7370
break
7471

7572
if __name__ == "__main__":
76-
    load_dotenv(override=True)
7773
    _run()

examples/docs_to_knowledge_graph/main.py

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,6 @@
22
This example shows how to extract relationships from documents and build a knowledge graph.
33
"""
44
import dataclasses
5-
from dotenv import load_dotenv
65
import cocoindex
76

87
@dataclasses.dataclass
@@ -148,10 +147,8 @@ def docs_to_kg_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.D
148147
primary_key_fields=["id"],
149148
)
150149

151-
@cocoindex.main_fn()
152150
def _run():
153151
    pass
154152

155153
if __name__ == "__main__":
156-
    load_dotenv(override=True)
157154
    _run()

examples/fastapi_server_docker/main.py

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,6 @@
22
import uvicorn
33

44
from fastapi import FastAPI
5-
from dotenv import load_dotenv
65

76
from src.cocoindex_funs import code_embedding_flow, code_to_embedding
87

@@ -21,10 +20,8 @@ def query_endpoint(string: str):
2120
    results, _ = query_handler.search(string, 10)
2221
    return results
2322

24-
@cocoindex.main_fn()
2523
def _run():
2624
    uvicorn.run(fastapi_app, host="0.0.0.0", port=8080)
2725

2826
if __name__ == "__main__":
29-
    load_dotenv(override=True)
3027
    _run()
