examples/text_embedding/README.md
Lines changed: 22 additions & 5 deletions
@@ -1,6 +1,24 @@
-Simple example for cocoindex: build embedding index based on local files.
-
+# Build text embedding and semantic search 🔍
 [](https://colab.research.google.com/github/cocoindex-io/cocoindex/blob/main/examples/text_embedding/Text_Embedding.ipynb)
+In this example, we will build an indexing flow that creates text embeddings from local Markdown files, and then query the index.
+
+We appreciate a star ⭐ at [CocoIndex GitHub](https://github.com/cocoindex-io/cocoindex) if this is helpful.
+
+## Steps:
+🌱 A detailed step-by-step tutorial can be found here: [Get Started Documentation](https://cocoindex.io/docs/getting_started/quickstart)
+
+### Indexing Flow:
+<img width="461" alt="Screenshot 2025-05-19 at 5 48 28 PM" src="https://github.com/user-attachments/assets/b6825302-a0c7-4b86-9a2d-52da8286b4bd" />
+
+1. We will ingest a list of local files.
+2. For each file, we split the text into chunks (Recursive Split) and then compute an embedding for each chunk.
+3. We will save the embeddings and the metadata in Postgres with PGVector.
+
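Editor's note: the chunking in step 2 can be sketched in plain Python. This is a minimal sketch of the general recursive-split idea, not CocoIndex's implementation; the function name `recursive_split`, its `max_chars` parameter, and the separator list are assumptions made for illustration.

```python
def recursive_split(text, max_chars, separators=("\n\n", "\n", " ")):
    """Split text into chunks of at most max_chars characters.

    Coarse separators (paragraphs) are tried first; finer ones (lines, words)
    are used only for pieces that are still too long.
    Hypothetical helper, not part of the CocoIndex API.
    """
    text = text.strip()
    if not text:
        return []
    if len(text) <= max_chars:
        return [text]
    if not separators:
        # Nothing left to split on: fall back to a hard cut.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = current + sep + piece if current else piece
        if len(candidate) <= max_chars:
            current = candidate  # keep packing pieces into the current chunk
        else:
            # Flush what we have; oversized pieces are re-split more finely.
            chunks.extend(recursive_split(current, max_chars, finer))
            current = piece
    chunks.extend(recursive_split(current, max_chars, finer))
    return chunks


if __name__ == "__main__":
    sample = "# Title\n\nFirst paragraph here.\n\nSecond paragraph is a bit longer than the first one."
    for chunk in recursive_split(sample, max_chars=40):
        print(repr(chunk))
```

Trying paragraph breaks before line breaks and spaces tends to keep related sentences in the same chunk, which is the usual motivation for splitting recursively rather than cutting at a fixed width.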
+### Query:
+We will match user-provided text against the index with a SQL query, reusing the embedding operation from the indexing flow.
+
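Editor's note: the point of reusing the embedding operation is that the query text and the indexed chunks land in the same vector space. The sketch below illustrates this in plain Python under loud assumptions: `embed` is a toy letter-frequency stand-in for a real embedding model, the ranking is done in memory instead of in Postgres, and the table and column names in `QUERY_SQL` (`doc_embeddings`, `filename`, `text`, `embedding`) are hypothetical. The only real-library detail used is pgvector's `<=>` cosine-distance operator, shown in the SQL string but not executed here.

```python
import math
import string


def embed(text):
    """Toy embedding (assumption): L2-normalized letter counts over a-z.

    A real flow would call an embedding model here; what matters is that
    indexing and querying call the *same* function.
    """
    counts = [0.0] * 26
    for ch in text.lower():
        if ch in string.ascii_lowercase:
            counts[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in counts))
    return [v / norm for v in counts] if norm else counts


def cosine_distance(a, b):
    # Inputs are unit vectors, so cosine distance is 1 minus the dot product.
    return 1.0 - sum(x * y for x, y in zip(a, b))


# Shape of the query one might run against Postgres with pgvector.
# Table and column names are hypothetical; `<=>` is pgvector's cosine distance.
QUERY_SQL = """
SELECT filename, text, embedding <=> %s::vector AS distance
FROM doc_embeddings
ORDER BY distance
LIMIT %s
"""


def search(index, query, top_k=3):
    """In-memory equivalent of QUERY_SQL: embed the query, rank by distance."""
    query_vec = embed(query)  # same operation as at indexing time
    ranked = sorted(index, key=lambda row: cosine_distance(row["embedding"], query_vec))
    return ranked[:top_k]


if __name__ == "__main__":
    chunks = ["postgres stores vectors with pgvector", "cats sleep all day"]
    index = [{"text": c, "embedding": embed(c)} for c in chunks]
    for row in search(index, "vectors in postgres", top_k=2):
        print(row["text"])
```

If the query were embedded with a different model or different preprocessing than the indexed chunks, the distances would no longer be comparable, which is why the flow shares one embedding step between indexing and querying.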

 ## Prerequisite

@@ -34,9 +52,8 @@ python main.py

 ## CocoInsight

-CocoInsight is in Early Access now (Free) 😊 You found us! A quick 3 minute video tutorial about CocoInsight: [Watch on YouTube](https://youtu.be/ZnmyoHslBSc?si=pPLXWALztkA710r9).
-
-Run CocoInsight to understand your RAG data pipeline:
+I used CocoInsight (free beta now) to troubleshoot the index generation and understand the data lineage of the pipeline.
+It just connects to your local CocoIndex server, with zero pipeline data retention. Run the following command to start CocoInsight: