[BUG] GraphRAG integration issue #367
Comments
Thanks a lot @ronchengang, indexing with GraphRAG worked fine here (running on Mac, using Docker). I'm now running into an issue with the artifacts path. It looks like the chat pipeline is trying to retrieve the GraphRAG artifacts from this path:
which causes this error/stack:
Maybe during settings creation we're losing some config for the artifacts path? By the way, as a test/workaround, I moved my artifacts to match the expected path and it worked fine, which shows that indexing with your changes actually works.
Thanks @joaoaugustogrobe! I forgot to mention that if you are using OpenAI, it should work without any modification, but people like me who use a private model in Ollama may hit the same issue I did. Regarding the query problem you mentioned, I also ran into it. My approach is as follows. Search for the following code snippet in libs/ktem/ktem/index/file/graph/pipelines.py:
output_path = root_path / "output"
child_paths = sorted(
    list(output_path.iterdir()), key=lambda x: x.stem, reverse=True
)
# get the latest child path
assert child_paths, "GraphRAG index output not found"
latest_child_path = Path(child_paths[0]) / "artifacts"
INPUT_DIR = latest_child_path
and change it like this:
output_path = root_path / "output"
#child_paths = sorted(
# list(output_path.iterdir()), key=lambda x: x.stem, reverse=True
#)
# get the latest child path
# assert child_paths, "GraphRAG index output not found"
# latest_child_path = Path(child_paths[0]) / "artifacts"
INPUT_DIR = output_path
The change here is related to a setting in GraphRAG's settings.yaml. In the old version of GraphRAG's settings.yaml, this setting looked like this:
Note that the value of base_dir contains ${timestamp} and artifacts. The code I commented out above finds the latest timestamp in the output directory and appends artifacts to it to build the path to the data files. But in the new version of GraphRAG, this setting has been changed to this:
Here, base_dir only retains output; ${timestamp} and artifacts are no longer there. This is why I commented out the related lines of code and kept just output_path. So the query error happens because GraphRAG changed its configuration file. I don't know in which version GraphRAG made this change, but it did cause the query problem in Kotaemon. It seems that the author has not noticed this change yet, so maybe I can create a PR to point it out.
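For anyone who wants to support both GraphRAG layouts rather than hard-coding one, here is a minimal sketch of a lookup that prefers the old output/<timestamp>/artifacts layout when it exists and otherwise falls back to the flat output directory. The helper name resolve_graphrag_input_dir is hypothetical and is not part of Kotaemon or GraphRAG; the only things taken from the thread are the two directory layouts.

```python
from pathlib import Path


def resolve_graphrag_input_dir(root_path: Path) -> Path:
    """Return the directory holding the GraphRAG index files.

    Hypothetical helper (not part of Kotaemon or GraphRAG).
    """
    output_path = root_path / "output"
    if not output_path.is_dir():
        raise FileNotFoundError(f"GraphRAG index output not found: {output_path}")

    # Old layout: output/<timestamp>/artifacts. Collect every run that has
    # an artifacts folder and sort by the timestamp directory name, newest first.
    timestamped = sorted(
        (p / "artifacts" for p in output_path.iterdir() if (p / "artifacts").is_dir()),
        key=lambda p: p.parent.name,
        reverse=True,
    )
    if timestamped:
        return timestamped[0]

    # New layout: index files are written directly into output/.
    return output_path
```

With something like this, INPUT_DIR could be set from the helper's return value and the same code would work before and after the base_dir change, assuming the timestamp directory names sort chronologically as plain strings.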
@vip-china Are you sure your GraphRAG indexing process completed successfully? My guess is that it did not finish. Check which files are in your output directory and make sure the index is complete and no files are missing. Here are mine for your reference.
Description
The GraphRAG integration is an amazing job!
However, it is not easy to make it work smoothly. To get it working on my Mac, I had to change a lot of things before I finally succeeded.
I found that the main problem is GraphRAG's settings.yaml. This settings.yaml file is generated here:
libs/ktem/ktem/index/file/graph/pipelines.py@call_graphrag_index
However, it cannot be used directly, because some system variables set in Kotaemon's .env are not reflected in settings.yaml, like these three.
What I did was to divide this step into three parts.
After doing this, a default settings.yaml file will be generated,
and then I add the following code to write these values into settings.yaml.
By the way, here are all the GraphRAG env vars for your reference.
https://microsoft.github.io/graphrag/posts/config/env_vars/
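As an illustration of the update step described above, here is a minimal sketch that copies values from environment variables into an already-parsed settings dictionary. It is not the code used in this issue. The three variables chosen (GRAPHRAG_API_BASE, GRAPHRAG_LLM_MODEL, GRAPHRAG_EMBEDDING_MODEL), the key paths they map to, and the function name apply_env_overrides are all assumptions made for the example; check them against your generated settings.yaml. Reading and writing the YAML file itself is left out and would need a YAML library, for example PyYAML's yaml.safe_load and yaml.safe_dump.

```python
import os
from typing import Mapping

# Assumed mapping from environment variables to key paths in settings.yaml.
# Adjust it to the variables and keys your setup actually uses.
ENV_TO_SETTING = {
    "GRAPHRAG_API_BASE": ("llm", "api_base"),
    "GRAPHRAG_LLM_MODEL": ("llm", "model"),
    "GRAPHRAG_EMBEDDING_MODEL": ("embeddings", "llm", "model"),
}


def apply_env_overrides(settings: dict, env: Mapping[str, str] = os.environ) -> dict:
    """Write env values into the parsed settings dict (modified in place)."""
    for var, key_path in ENV_TO_SETTING.items():
        value = env.get(var)
        if not value:
            continue  # variable unset or empty: keep the generated default
        node = settings
        # Walk down to the parent of the final key, creating levels as needed.
        for key in key_path[:-1]:
            node = node.setdefault(key, {})
        node[key_path[-1]] = value
    return settings
```

The intended flow would be: let GraphRAG generate the default settings.yaml, load it into a dictionary, call apply_env_overrides on it, and write the result back before running the index. Variables that are not set leave the generated defaults untouched.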
Reproduction steps
Screenshots

Logs
No response
Browsers
Chrome
OS
MacOS
Additional information
No response