Commit 5fb261c

community: Google Vertex AI Search now returns the website title as part of the document metadata (#30688)
Google Vertex AI Search now returns the title of the matched website as part of the document metadata, when available.

- **Description**: Vertex AI Search can index websites, which makes it possible to build chatbots that answer questions from those sites. At present, the document metadata includes an `id` and a `source` (the URL). The URL is enough to create a link, but the ID is not descriptive enough to show to users. This change therefore also returns `title` when it is available (it is not available, for example, for `.txt` documents found during website indexing).
- **Issue**: No specific bug; this is a small enhancement.
- **Dependencies**: None
- I do not use Twitter.

Format, lint, and test all appear to pass.
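To illustrate the motivation, here is a minimal sketch of how a caller might turn the returned metadata into a user-facing link. The helper name `format_citation` and the Markdown link format are assumptions made for this example; they are not part of the commit or of LangChain. The sketch only relies on the `id`, `source`, and optional `title` keys described above.

```python
from typing import Any, Dict


def format_citation(metadata: Dict[str, Any]) -> str:
    """Hypothetical helper: build a Markdown link from retriever metadata.

    Prefers the optional `title`, falls back to the URL in `source`,
    and finally to the opaque `id` when nothing better exists.
    """
    source = metadata.get("source", "")
    label = metadata.get("title") or source or metadata.get("id", "")
    # Without a URL there is nothing to link to, so return the bare label.
    return f"[{label}]({source})" if source else str(label)


# An HTML page that carries a title, and a .txt file that does not.
titled = format_citation(
    {"id": "a1", "source": "https://example.com/guide", "title": "Getting Started"}
)
untitled = format_citation({"id": "b2", "source": "https://example.com/notes.txt"})
```

Because `title` is only set when the search result provides one, callers should read it with `.get("title")` rather than indexing it directly.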
1 parent 636d831 commit 5fb261c

1 file changed: +2 -0 lines changed
libs/community/langchain_community/retrievers/google_vertex_ai_search.py

Lines changed: 2 additions & 0 deletions
@@ -167,6 +167,8 @@ def _convert_website_search_response(
             doc_metadata = document_dict.get("struct_data", {})
             doc_metadata["id"] = document_dict["id"]
             doc_metadata["source"] = derived_struct_data.get("link", "")
+            if derived_struct_data.get("title") is not None:
+                doc_metadata["title"] = derived_struct_data.get("title")
 
             if chunk_type not in derived_struct_data:
                 continue
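The effect of the two added lines can be sketched outside the retriever with plain dictionaries. This is a simplified, standalone approximation, not the library's method: `build_website_metadata` is a hypothetical name, the input dicts are made-up sample data shaped like the `document_dict` used in the diff, and the sketch copies `struct_data` instead of mutating it in place as the original code does.

```python
from typing import Any, Dict


def build_website_metadata(document_dict: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper mirroring the patched metadata logic."""
    derived_struct_data = document_dict.get("derived_struct_data", {})
    # Copy struct_data so the caller's dict is left untouched (an assumption
    # of this sketch; the code in the diff updates the dict in place).
    doc_metadata = dict(document_dict.get("struct_data", {}))
    doc_metadata["id"] = document_dict["id"]
    doc_metadata["source"] = derived_struct_data.get("link", "")
    # New behaviour: include the title only when the result provides one.
    title = derived_struct_data.get("title")
    if title is not None:
        doc_metadata["title"] = title
    return doc_metadata


# Sample inputs (invented for illustration): an HTML page and a .txt file.
html_page = {
    "id": "doc-1",
    "derived_struct_data": {
        "link": "https://example.com/guide",
        "title": "Getting Started",
    },
}
txt_file = {
    "id": "doc-2",
    "derived_struct_data": {"link": "https://example.com/robots.txt"},
}

with_title = build_website_metadata(html_page)
without_title = build_website_metadata(txt_file)
```

Checking `is not None` rather than truthiness means an empty-string title would still be copied into the metadata, which matches the condition used in the diff.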
