Polish document for quickstart #40

Merged
merged 2 commits into from
Mar 5, 2025
42 changes: 24 additions & 18 deletions docs/docs/getting_started/quickstart.md
@@ -11,22 +11,18 @@ This guide will help you get up and running with CocoIndex in just a few minutes
* loads the data into a vector store (PG Vector)


## Step 1: Set up the CocoIndex environment
## Prerequisite: Install CocoIndex environment

1. Open the terminal and create a new directory for your project:
We'll need to install a bunch of dependencies for this project.

```bash
mkdir cocoindex-quickstart
cd cocoindex-quickstart
```

2. Install CocoIndex:
1. Install CocoIndex:

```bash
pip install cocoindex
```

3. You need to setup a Postgres database with pgvector extension installed; or bring up a Postgres database using docker compose:

2. You can skip this step if you already have a Postgres database with pgvector extension installed.
If not, the easiest way is to bring up a Postgres database using docker compose:

- Make sure Docker Compose is installed: [docs](https://docs.docker.com/compose/install/)
- Start a Postgres SQL database for cocoindex using our docker compose config:
@@ -35,11 +31,23 @@ This guide will help you get up and running with CocoIndex in just a few minutes
docker compose -f <(curl -L https://raw.githubusercontent.com/cocoindex-io/cocoindex/refs/heads/main/dev/postgres.yaml) up -d
```
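
Once the container is up, it can help to confirm that something is listening before moving on. The following is a minimal sketch using only the Python standard library. It assumes the compose config exposes Postgres on `localhost:5432` (the Postgres default; the actual port mapping in `postgres.yaml` is not shown here), and `port_is_open` is a hypothetical helper name, not part of CocoIndex.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 5432, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    The defaults are assumptions: 5432 is the standard Postgres port, but the
    compose config may map a different one.
    """
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_is_open()` only shows that a TCP listener is present; it does not verify credentials or that the pgvector extension is installed.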

## Step 1: Prepare directory for your project

1. Open the terminal and create a new directory for your project:

```bash
mkdir cocoindex-quickstart
cd cocoindex-quickstart
```

2. Prepare input files for the index. Put them in a directory, e.g. `markdown_files`.
If you don't have any files at hand, you may download the example [markdown_files.zip](markdown_files.zip) and unzip it in the current directory.
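
If you would rather not download the archive, a placeholder input is enough to exercise the flow. The sketch below creates the `markdown_files` directory with a single file; the helper name `make_sample_inputs`, the file name `sample.md`, and its contents are all hypothetical and chosen only for illustration.

```python
from pathlib import Path


def make_sample_inputs(root: Path = Path("markdown_files")) -> list[Path]:
    """Create the input directory with one placeholder Markdown file.

    Returns the sorted list of Markdown files found in the directory.
    """
    root.mkdir(parents=True, exist_ok=True)
    sample = root / "sample.md"  # hypothetical file name
    if not sample.exists():
        # Do not overwrite a file the user may already have put there.
        sample.write_text(
            "# Sample\n\nPlaceholder text for the quickstart index.\n",
            encoding="utf-8",
        )
    return sorted(root.glob("*.md"))
```

Running `make_sample_inputs()` from the project directory leaves any existing files in `markdown_files` untouched and adds `sample.md` only if it is missing.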

## Step 2: Create the Python file `quickstart.py`

Create a new file `quickstart.py` and import the `cocoindex` library:

```python
```python title="quickstart.py"
import cocoindex
```

@@ -53,11 +61,12 @@ Then we'll put the following pieces into the file:

Starting from the indexing flow:

```python
```python title="quickstart.py"
@cocoindex.flow_def(name="TextEmbedding")
def text_embedding_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope):
# Add a data source to read files from a directory
data_scope["documents"] = flow_builder.add_source(cocoindex.sources.LocalFile(path="markdown_files"))
data_scope["documents"] = flow_builder.add_source(
cocoindex.sources.LocalFile(path="markdown_files"))

# Add a collector for data to be exported to the vector index
doc_embeddings = data_scope.add_collector()
@@ -109,7 +118,7 @@ Notes:

Starting from the query handler:

```python
```python title="quickstart.py"
query_handler = cocoindex.query.SimpleSemanticsQueryHandler(
name="SemanticsSearch",
flow=text_embedding_flow,
@@ -127,7 +136,7 @@ This handler queries the vector index `"doc_embeddings"`, and uses the same embe

The main function is used to interact with users and run queries using the query handler above.

```python
```python title="quickstart.py"
@cocoindex.main_fn()
def _main():
# Run queries to demonstrate the query capabilities.
@@ -178,9 +187,6 @@ Now we have tables needed by this CocoIndex flow.

### Step 3.2: Build the index

Before building the index, make sure input markdown files are in the `markdown_files` directory.
You may download the example [markdown_files.zip](markdown_files.zip) and unzip it in the current directory.

Now we're ready to build the index:

```bash