fix: TypeScript templates do not create a new LlamaCloud index or upload a file to an existing index. #356

Merged
14 commits merged on Oct 10, 2024

Conversation

leehuwuj (Collaborator) commented Oct 9, 2024

Summary by CodeRabbit

  • New Features

    • Introduced a streamlined initialization process for LlamaCloudConfig, enhancing the handling of environment variables.
    • Added error handling in the uploadDocument function for clearer compatibility feedback.
    • Enhanced error handling in the LlamaCloudSelector component for improved user feedback during fetch operations.
    • Implemented HTTP exception handling in the chat configuration router to validate the presence of the API key.
  • Bug Fixes

    • Removed the deprecated from_env method, ensuring a more straightforward configuration setup.
  • Refactor

    • Updated the IndexConfig and get_client functions to utilize the new initialization method for LlamaCloudConfig.
    • Enhanced the loadAndIndex function with improved error handling for file uploads.
    • Streamlined import statements in the generate.py file for clarity.
  • Chores

    • Modified the E2E testing workflow to include the --llamacloud option in the data sources.
    • Enhanced test suite to skip incompatible tests based on Node.js version and configuration.


changeset-bot bot commented Oct 9, 2024

🦋 Changeset detected

Latest commit: 0366c34

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:
  • create-llama (Patch)



coderabbitai bot commented Oct 9, 2024

Caution

Review failed

The head commit changed during the review from aa5b3a9 to 0366c34.

Walkthrough

The changes focus on the LlamaCloudConfig class in index.py. A new constructor initializes the class attributes from environment variables, replacing the from_env class method, which has been removed. The IndexConfig class now instantiates LlamaCloudConfig directly, and the get_client function is updated accordingly. The validation logic for required fields is unchanged. In addition, the GitHub Actions workflow for end-to-end tests gains a new option in its datasources parameter.
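The walkthrough above describes the new constructor only in prose. Below is a minimal sketch of the pattern, assuming a Pydantic-style model; the field names, defaults, and the IndexConfig shape are illustrative, not the template's exact code.

import os

from pydantic import BaseModel, Field


class LlamaCloudConfig(BaseModel):
    api_key: str = ""
    index_name: str = ""
    project_name: str = ""

    def __init__(self, **kwargs):
        # Fall back to environment variables for any value not passed explicitly.
        kwargs.setdefault("api_key", os.getenv("LLAMA_CLOUD_API_KEY", ""))
        kwargs.setdefault("index_name", os.getenv("LLAMA_CLOUD_INDEX_NAME", ""))
        kwargs.setdefault("project_name", os.getenv("LLAMA_CLOUD_PROJECT_NAME", ""))
        super().__init__(**kwargs)


class IndexConfig(BaseModel):
    # default_factory defers construction, so the environment variables are read
    # when IndexConfig is instantiated rather than at import time.
    llama_cloud_pipeline_config: LlamaCloudConfig = Field(
        default_factory=lambda: LlamaCloudConfig()
    )

Because the defaults come from os.getenv at construction time, both the default_factory above and a plain LlamaCloudConfig() call in get_client() pick up the environment without a separate from_env step.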

Changes

  • templates/components/vectordbs/python/llamacloud/index.py
    - Added __init__(self, **kwargs) method in LlamaCloudConfig class.
    - Removed from_env(cls) class method in LlamaCloudConfig.
    - Updated llama_cloud_pipeline_config to use a lambda for instantiation instead of from_env.
    - Modified internal logic of get_client() to create a LlamaCloudConfig instance directly.
  • .github/workflows/e2e.yml
    - Updated datasources in E2E test jobs to include the --llamacloud option.
  • e2e/shared/streaming_template.spec.ts
    - Added conditional checks to skip tests based on Node.js version and data source.
  • templates/components/llamaindex/typescript/documents/upload.ts
    - Wrapped LLamaCloudFileService.addFileToPipeline in a try-catch for error handling.
  • templates/components/vectordbs/typescript/llamacloud/generate.ts
    - Introduced a try-catch for error handling in loadAndIndex when calling addFileToPipeline.
    - Updated the options object passed to addFileToPipeline to use a boolean for private and added filename.
  • templates/components/vectordbs/python/llamacloud/generate.py
    - Reorganized import statements and removed the custom metadata configuration in add_file_to_pipeline.
  • templates/components/vectordbs/python/llamacloud/query_filter.py
    - Updated the comment in generate_filters for clarity on public documents.
  • templates/types/streaming/fastapi/app/api/routers/chat_config.py
    - Added HTTP exception handling for a missing LLAMA_CLOUD_API_KEY.
  • templates/types/streaming/nextjs/app/components/ui/chat/widgets/LlamaCloudSelector.tsx
    - Enhanced error handling during the fetch of the LlamaCloud configuration.
  • templates/components/vectordbs/typescript/llamacloud/queryFilter.ts
    - Updated the comment in generateFilters for clarity on public documents.


Suggested reviewers

  • marcusschiesser

Poem

In the cloud where llamas play,
A constructor leads the way.
From env no more, we now create,
With simpler paths, we celebrate!
Hops of joy, let changes ring,
For every line, a new spring! 🐇✨



coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between a47d778 and 006830d.

📒 Files selected for processing (1)
  • templates/components/vectordbs/python/llamacloud/index.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
templates/components/vectordbs/python/llamacloud/index.py (1)

Pattern templates/**: For files under the templates folder, do not report 'Missing Dependencies Detected' errors.

🔇 Additional comments (2)
templates/components/vectordbs/python/llamacloud/index.py (2)

67-70: LGTM

The default_factory correctly initializes LlamaCloudConfig.


94-94: LGTM

The get_client() function correctly initializes LlamaCloudConfig as expected.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
templates/components/vectordbs/typescript/llamacloud/generate.ts (1)

35-38: Approved changes with a minor suggestion.

The modifications improve compatibility and align better with the LLamaCloudFileService.addFileToPipeline API. The use of Blob instead of File is a good choice for better cross-environment support.

A minor suggestion to improve code clarity:

Consider using object property shorthand for filename, which is already assigned from path.basename(filePath). Here's a suggested refactor:

-    const filename = path.basename(filePath);
-    const blob = new Blob([buffer], { type: "application/octet-stream" });
-    await LLamaCloudFileService.addFileToPipeline(projectId, pipelineId, blob, {
-      private: false,
-      filename: filename,
-    });
+    const filename = path.basename(filePath);
+    const blob = new Blob([buffer], { type: "application/octet-stream" });
+    await LLamaCloudFileService.addFileToPipeline(projectId, pipelineId, blob, {
+      private: false,
+      filename,
+    });

This change uses object property shorthand, which is a common JavaScript convention when the property name matches the variable name.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 2f529c4 and 81b5d0b.

📒 Files selected for processing (1)
  • templates/components/vectordbs/typescript/llamacloud/generate.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
templates/components/vectordbs/typescript/llamacloud/generate.ts (1)

Pattern templates/**: For files under the templates folder, do not report 'Missing Dependencies Detected' errors.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (2)
templates/components/llamaindex/typescript/documents/upload.ts (1)

19-38: Improved error handling, but consider refining the error message.

The addition of error handling is a significant improvement to the function's robustness. The try-catch block effectively captures potential errors, and the specific handling of the ReferenceError for the File class is particularly useful.

Consider refining the error message to be more flexible about Node.js versions. Instead of specifying "Node.js 20 or higher", you could say "a Node.js version that supports the File API". This allows for compatibility with any version that includes this feature, not just version 20 and above.

Here's a suggested modification:

- "File class is not supported in the current Node.js version. Please use Node.js 20 or higher.",
+ "File class is not supported in the current Node.js version. Please use a Node.js version that supports the File API.",

This change makes the error message more adaptable to future Node.js releases and avoids potential confusion if the File API becomes available in versions prior to 20.

e2e/shared/streaming_template.spec.ts (1)

30-35: LGTM! Consider extracting the skip condition for better maintainability.

The conditional skip for Node.js 18 and LlamaCloud data source is well-implemented and clearly explained. This change improves the test suite by avoiding incompatible test scenarios.

Consider extracting the skip condition into a separate function for better maintainability:

function shouldSkipTest(nodeVersion: string, dataSource: string): boolean {
  return nodeVersion.startsWith("v18") && dataSource === "--llamacloud";
}

// Usage
test.skip(shouldSkipTest(process.version, dataSource), "Skipping tests for Node 18 and LlamaCloud data source");

This refactoring would make it easier to update or extend the skip conditions in the future.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 81b5d0b and fc3f457.

📒 Files selected for processing (2)
  • e2e/shared/streaming_template.spec.ts (1 hunks)
  • templates/components/llamaindex/typescript/documents/upload.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
templates/components/llamaindex/typescript/documents/upload.ts (1)

Pattern templates/**: For files under the templates folder, do not report 'Missing Dependencies Detected' errors.

leehuwuj changed the title from "fix: fix LlamaCloud don't use env value with constructor" to "fix: TypeScript templates do not create a new LlamaCloud index or upload a file to an existing index." on Oct 9, 2024
coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (4)
templates/components/vectordbs/typescript/llamacloud/queryFilter.ts (1)

4-4: Improved clarity in comment for public documents definition.

The updated comment provides more specific and accurate information about how public documents are created. This change enhances the understanding for developers working with this code, aligning well with the subsequent logic that checks for the absence of the "private" field.

Consider adding a brief explanation of how this distinction between public and private documents affects the filter generation, to further improve the documentation.

templates/types/streaming/fastapi/app/api/routers/chat_config.py (2)

29-32: Approve with suggestion: Consider using a more appropriate HTTP status code.

The new error handling for missing LLAMA_CLOUD_API_KEY is a good addition that enhances the robustness of the API. However, consider using HTTP status code 503 (Service Unavailable) instead of 500 (Internal Server Error) for this scenario. Status code 503 better represents that the service is temporarily unavailable due to a configuration issue.

Here's a suggested improvement:

 if not os.getenv("LLAMA_CLOUD_API_KEY"):
     raise HTTPException(
-        status_code=500, detail="LlamaCloud API KEY is not configured"
+        status_code=503, detail="LlamaCloud service is unavailable: API key is not configured"
     )

4-4: Overall: Solid improvement in error handling and API robustness.

The changes introduce proper error handling for the LlamaCloud configuration, which aligns with the PR objectives. This enhancement improves the reliability of the API and provides clear feedback when the necessary configuration is missing. The implementation follows FastAPI best practices and integrates well with the existing code structure.

Consider implementing similar error handling for other environment variables used in this file (e.g., CONVERSATION_STARTERS, LLAMA_CLOUD_INDEX_NAME, LLAMA_CLOUD_PROJECT_NAME) to ensure consistent behavior across all configuration-dependent features.

Also applies to: 29-32
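A hypothetical helper (not part of this PR) showing how the suggested pattern could be extended to the other environment variables named above; the function name is made up here, and the 503 status follows the reviewer's earlier suggestion:

import os

from fastapi import HTTPException


def require_env(name: str) -> str:
    # Fail fast with a clear message when a required variable is missing.
    value = os.getenv(name)
    if not value:
        raise HTTPException(
            status_code=503,
            detail=f"{name} is not configured",
        )
    return value


# Example usage inside a route handler:
# api_key = require_env("LLAMA_CLOUD_API_KEY")
# project_name = require_env("LLAMA_CLOUD_PROJECT_NAME")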

templates/types/streaming/nextjs/app/components/ui/chat/widgets/LlamaCloudSelector.tsx (1)

69-78: Improved error handling for fetch operation.

The changes enhance the robustness of the API call by properly handling non-OK responses and providing user feedback. This is a good improvement.

Consider using a more user-friendly error message format. Instead of displaying raw JSON, you could extract specific fields from the error response for a cleaner presentation. For example:

if (!response.ok) {
  return response.json().then((errorData) => {
    const errorMessage = errorData.message || 'Unknown error occurred';
    window.alert(`Error: ${errorMessage}`);
  });
}

This assumes the error response has a message field. Adjust according to your API's error response structure.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between c71a6e8 and 83f6e03.

📒 Files selected for processing (6)
  • templates/components/vectordbs/python/llamacloud/generate.py (1 hunks)
  • templates/components/vectordbs/python/llamacloud/query_filter.py (1 hunks)
  • templates/components/vectordbs/typescript/llamacloud/generate.ts (1 hunks)
  • templates/components/vectordbs/typescript/llamacloud/queryFilter.ts (1 hunks)
  • templates/types/streaming/fastapi/app/api/routers/chat_config.py (2 hunks)
  • templates/types/streaming/nextjs/app/components/ui/chat/widgets/LlamaCloudSelector.tsx (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • templates/components/vectordbs/python/llamacloud/query_filter.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • templates/components/vectordbs/typescript/llamacloud/generate.ts
🧰 Additional context used
📓 Path-based instructions (4)
templates/components/vectordbs/python/llamacloud/generate.py (1)

Pattern templates/**: For files under the templates folder, do not report 'Missing Dependencies Detected' errors.

templates/components/vectordbs/typescript/llamacloud/queryFilter.ts (1)

Pattern templates/**: For files under the templates folder, do not report 'Missing Dependencies Detected' errors.

templates/types/streaming/fastapi/app/api/routers/chat_config.py (1)

Pattern templates/**: For files under the templates folder, do not report 'Missing Dependencies Detected' errors.

templates/types/streaming/nextjs/app/components/ui/chat/widgets/LlamaCloudSelector.tsx (1)

Pattern templates/**: For files under the templates folder, do not report 'Missing Dependencies Detected' errors.

🔇 Additional comments (5)
templates/components/vectordbs/typescript/llamacloud/queryFilter.ts (1)

Line range hint 1-30: LGTM: Function logic is sound and comment update improves clarity.

The generateFilters function correctly handles both public and private documents based on the presence of documentIds. The updated comment improves understanding without affecting the logic. The rest of the file remains unchanged and functions as expected.

templates/types/streaming/fastapi/app/api/routers/chat_config.py (1)

4-4: LGTM: Appropriate import for error handling.

The addition of HTTPException import from fastapi is correct and necessary for the new error handling mechanism implemented in the /llamacloud route.

templates/components/vectordbs/python/llamacloud/generate.py (3)

3-3: LGTM: Import statements reorganized for better readability

The reorganization of import statements improves code readability by grouping standard library imports, third-party imports, and local imports. The addition of PipelineType import suggests its usage in the code, which enhances type safety.

Also applies to: 8-8, 12-15


Line range hint 1-95: LGTM: Overall structure and logic maintained

The core functionality of the ensure_index and generate_datasource functions remains intact. The script continues to use environment variables and initialize settings correctly. The main purpose and workflow of the script have been preserved, maintaining its integrity and functionality.


Line range hint 89-92: Verify the removal of custom metadata

The custom_metadata parameter has been removed from the add_file_to_pipeline method call. This change removes the explicit setting of the document's visibility ("private": "false").

Please confirm if this change is intentional and consider its potential impact on document visibility and searchability in the LlamaCloud index.

To check for any related changes or discussions, you can run:

✅ Verification successful

Removal of custom_metadata Verified

The custom_metadata parameter has been successfully removed from all instances of the add_file_to_pipeline method calls. Document visibility and searchability are now managed directly through doc.metadata["private"], ensuring consistent functionality without the custom_metadata parameter.

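For illustration only (not the template's actual code), the metadata-based approach described above might look like the following, where a document is treated as public simply by never setting the "private" key:

from llama_index.core import Document

# A document with no "private" key in its metadata is treated as public.
public_doc = Document(text="...", metadata={"file_name": "report.pdf"})

# Marking a document as private is done directly on its metadata,
# rather than via a custom_metadata argument at upload time.
private_doc = Document(text="...", metadata={"file_name": "notes.pdf"})
private_doc.metadata["private"] = "true"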
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for related changes or discussions about custom metadata
rg --type md --type python 'custom_metadata|document visibility|private'

Length of output: 105


Script:

#!/bin/bash
# Search for related changes or discussions about custom metadata
rg --type md --type py 'custom_metadata|document visibility|private'

Length of output: 3481


Script:

#!/bin/bash
# Find all calls to add_file_to_pipeline and check for custom_metadata parameter
rg --type py 'add_file_to_pipeline\s*\('

# Find all instances where 'private' metadata is set without custom_metadata
rg --type py 'doc.metadata\s*\[\s*["\']private["\']\s*\]'

Length of output: 448


Script:

#!/bin/bash
# Find all calls to add_file_to_pipeline and check for custom_metadata parameter
rg --type py 'add_file_to_pipeline\s*\('

# Find all instances where "private" metadata is set without custom_metadata
rg --type py "doc.metadata\s*\[\s*[\"']private[\"']\s*\]"

Length of output: 704


Script:

#!/bin/bash
# Find all calls to add_file_to_pipeline that include the custom_metadata parameter
rg --type py 'add_file_to_pipeline\s*\(.*custom_metadata\s*='

Length of output: 63
