|
| 1 | +# Structured Data Extraction from PDF with Ollama and CocoIndex |
| 2 | + |
| 3 | + |
| 4 | + |
| 5 | + |
1 | 6 | In this example, we
|
2 | 7 |
|
3 | 8 | * Convert PDFs (generated from a few Python docs) into Markdown.
|
4 | 9 | * Extract structured information from the Markdown using an LLM.
|
5 | 10 | * Use a custom function to further extract information from the structured output.
|
6 | 11 |
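As a rough illustration of the last step above, a custom function can derive a small summary record from the structured output. The sketch below is a minimal, framework-free example: the `ModuleInfo` and `ModuleSummary` shapes and the `summarize_module` name are assumptions made for illustration, not the example's confirmed schema, and the wiring into a CocoIndex flow is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class ModuleInfo:
    """Hypothetical shape of the structured output produced by the LLM step."""
    title: str
    description: str
    classes: list[str] = field(default_factory=list)
    methods: list[str] = field(default_factory=list)


@dataclass
class ModuleSummary:
    """Hypothetical derived record computed from a ModuleInfo."""
    num_classes: int
    num_methods: int


def summarize_module(info: ModuleInfo) -> ModuleSummary:
    """Count the classes and methods listed in the structured output."""
    return ModuleSummary(
        num_classes=len(info.classes),
        num_methods=len(info.methods),
    )
```

Keeping the derivation in plain Python like this makes it easy to unit-test separately from the LLM extraction step.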
|
| 12 | +Please give [CocoIndex on GitHub](https://github.com/cocoindex-io/cocoindex) a star to support us if you like our work. Thank you so much with a warm coconut hug 🥥🤗. |
| 13 | + |
7 | 14 | ## Prerequisite
|
8 | 15 |
|
9 | 16 | Before running the example, you need to:
|
@@ -47,14 +54,21 @@ And run the SQL query:
|
47 | 54 | ```sql
|
48 | 55 | SELECT filename, module_info->'title' AS title, module_summary FROM modules_info;
|
49 | 56 | ```
|
| 57 | +You should see results like: |
| 58 | + |
| 59 | + |
| 60 | + |
50 | 61 |
|
51 | 62 | ## CocoInsight
|
52 |
| -CocoInsight is in Early Access now (Free) 😊 You found us! A quick 3 minute video tutorial about CocoInsight: [Watch on YouTube](https://youtu.be/ZnmyoHslBSc?si=pPLXWALztkA710r9). |
| 63 | +CocoInsight is a tool to help you understand your data pipeline and data index. CocoInsight is in Early Access now (Free) 😊 You found us! A quick 3-minute video tutorial about CocoInsight: [Watch on YouTube](https://youtu.be/ZnmyoHslBSc?si=pPLXWALztkA710r9). |
53 | 64 |
|
54 | 65 | Run CocoInsight to understand your RAG data pipeline:
|
55 | 66 |
|
56 | 67 | ```
|
57 | 68 | python main.py cocoindex server -c https://cocoindex.io
|
58 | 69 | ```
|
59 | 70 |
|
60 |
| -Then open the CocoInsight UI at [https://cocoindex.io/cocoinsight](https://cocoindex.io/cocoinsight). |
| 71 | +Then open the CocoInsight UI at [https://cocoindex.io/cocoinsight](https://cocoindex.io/cocoinsight). It connects to your local CocoIndex server with zero data retention. |
| 72 | + |
| 73 | +You can view the pipeline flow and the data preview in the CocoInsight UI: |
| 74 | + |