Commit 2108cb0

Update documentations for the evaluation and dump functionality (#200)
1 parent ec5f39b commit 2108cb0

5 files changed: +32 -6 lines changed
docs/docs/core/cli.mdx

Lines changed: 1 addition & 0 deletions
@@ -65,6 +65,7 @@ The following subcommands are available:
 | `setup` | Check and apply setup changes for flows, including the internal and target storage (to export). |
 | `show` | Show the spec for a specific flow. |
 | `update` | Update the index defined by the flow. |
+| `evaluate` | Evaluate the flow and dump flow outputs to files. Instead of updating the index, it dumps what should be indexed to files. Mainly used for evaluation purpose. |
 
 Use `--help` to see the full list of subcommands, and `subcommand --help` to see the usage of a specific one.
 

docs/docs/core/flow_methods.mdx

Lines changed: 21 additions & 1 deletion
@@ -12,7 +12,7 @@ After a flow is defined as discussed in [Flow Definition](/docs/core/flow_def),
 
 ## update
 
-The `update()` method will update will update the index defined by the flow.
+The `update()` method will update the index defined by the flow.
 
 Once the function returns, the indice is fresh up to the moment when the function is called.
 
@@ -23,5 +23,25 @@ Once the function returns, the indice is fresh up to the moment when the functio
 flow.update()
 ```
 
+</TabItem>
+</Tabs>
+
+## evaluate_and_dump
+
+The `evaluate_and_dump()` method evaluates the flow and dump flow outputs to files.
+
+It takes a `EvaluateAndDumpOptions` dataclass as input to configure, with the following fields:
+
+* `output_dir` (type: `str`, required): The directory to dump the result to.
+* `use_cache` (type: `bool`, default: `True`): Use already-cached intermediate data if available.
+Note that we only reuse existing cached data without updating the cache even if it's turned on.
+
+<Tabs>
+<TabItem value="python" label="Python" default>
+
+```python
+flow.evaluate_and_dump(EvaluateAndDumpOptions(output_dir="./eval_output"))
+```
+
 </TabItem>
 </Tabs>
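
The options documented in this hunk can be mirrored with a small stand-in dataclass. This is only a sketch built from the two fields listed above: `EvaluateAndDumpOptionsSketch` is a hypothetical name, not the class exported by `cocoindex`, and the real class may carry details this diff does not show.

```python
from dataclasses import dataclass, asdict


@dataclass
class EvaluateAndDumpOptionsSketch:
    """Hypothetical stand-in mirroring the documented fields; not the real cocoindex class."""
    output_dir: str          # required: the directory to dump the result to
    use_cache: bool = True   # reuse already-cached intermediate data if available


# Only output_dir has to be supplied; use_cache falls back to its documented default.
default_opts = EvaluateAndDumpOptionsSketch(output_dir="./eval_output")

# Explicitly opting out of cached intermediate data.
no_cache_opts = EvaluateAndDumpOptionsSketch(output_dir="./eval_output", use_cache=False)

print(asdict(default_opts))
print(asdict(no_cache_opts))
```

With the real package, the equivalent call would presumably be the one shown in the docs hunk, `flow.evaluate_and_dump(EvaluateAndDumpOptions(output_dir="./eval_output"))`.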

python/cocoindex/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@
 Cocoindex is a framework for building and running indexing pipelines.
 """
 from . import flow, functions, query, sources, storages, cli
-from .flow import FlowBuilder, DataScope, DataSlice, Flow, flow_def
+from .flow import FlowBuilder, DataScope, DataSlice, Flow, flow_def, EvaluateAndDumpOptions
 from .llm import LlmSpec, LlmApiType
 from .vector import VectorSimilarityMetric
 from .lib import *

python/cocoindex/cli.py

Lines changed: 8 additions & 3 deletions
@@ -57,13 +57,18 @@ def update(flow_name: str | None):
 @click.argument("flow_name", type=str, required=False)
 @click.option(
     "-o", "--output-dir", type=str, required=False,
-    help="The directory to dump the evaluation output to.")
+    help="The directory to dump the output to.")
 @click.option(
     "-c", "--use-cache", is_flag=True, show_default=True, default=True,
-    help="Use cached evaluation results if available.")
+    help="Use already-cached intermediate data if available. "
+         "Note that we only reuse existing cached data without updating the cache "
+         "even if it's turned on.")
 def evaluate(flow_name: str | None, output_dir: str | None, use_cache: bool = True):
     """
-    Evaluate and dump the flow.
+    Evaluate the flow and dump flow outputs to files.
+
+    Instead of updating the index, it dumps what should be indexed to files.
+    Mainly used for evaluation purpose.
     """
     fl = _flow_by_name(flow_name)
     if output_dir is None:
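
The new `--use-cache` help text describes a read-only use of the cache: existing entries are reused, but nothing is written back. A minimal sketch of that behaviour follows. It is an illustration only; `evaluate_readonly_cache`, the dict-based cache, and the sample keys are all hypothetical and say nothing about how the engine actually stores intermediate data.

```python
from typing import Callable, Dict


def evaluate_readonly_cache(
    key: str,
    compute: Callable[[str], str],
    cache: Dict[str, str],
    use_cache: bool = True,
) -> str:
    """Return a cached value when allowed and present; never write to the cache."""
    if use_cache and key in cache:
        return cache[key]
    # Cache miss (or cache disabled): compute fresh, but do not store the result.
    return compute(key)


cache = {"doc1": "cached-value"}

hit = evaluate_readonly_cache("doc1", str.upper, cache)            # reuses the entry
miss = evaluate_readonly_cache("doc2", str.upper, cache)           # computed, not stored
bypass = evaluate_readonly_cache("doc1", str.upper, cache, use_cache=False)
```

The design point is that an evaluation run leaves the cache exactly as it found it, so repeated evaluations cannot change what a later `update` sees.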

python/cocoindex/flow.py

Lines changed: 1 addition & 1 deletion
@@ -372,7 +372,7 @@ def update(self):
 
     def evaluate_and_dump(self, options: EvaluateAndDumpOptions):
         """
-        Evaluate and dump the flow.
+        Evaluate the flow and dump flow outputs to files.
         """
         return self._lazy_engine_flow().evaluate_and_dump(_dump_engine_object(options))
 