
Commit 05fc58b

Merge branch 'master' into openai-embeddings-not-respecting-chunk-size
2 parents f45afa4 + 63c16f5 commit 05fc58b


83 files changed (+803 -503 lines)


.github/workflows/codspeed.yml

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+name: CodSpeed
+
+on:
+  push:
+    branches:
+      - master
+  pull_request:
+    paths:
+      - 'libs/core/**'
+  # `workflow_dispatch` allows CodSpeed to trigger backtest
+  # performance analysis in order to generate initial data.
+  workflow_dispatch:
+
+jobs:
+  codspeed:
+    name: Run benchmarks
+    runs-on: codspeed-macro
+    steps:
+      - uses: actions/checkout@v4
+
+      # We have to use 3.12, 3.13 is not yet supported
+      - name: Install uv
+        uses: astral-sh/setup-uv@v5
+        with:
+          python-version: "3.12"
+
+      # Using this action is still necessary for CodSpeed to work
+      - uses: actions/setup-python@v3
+        with:
+          python-version: "3.12"
+
+      - name: install deps
+        run: uv sync --group test
+        working-directory: ./libs/core
+
+      - name: Run benchmarks
+        uses: CodSpeedHQ/action@v3
+        with:
+          token: ${{ secrets.CODSPEED_TOKEN }}
+          run: |
+            cd libs/core
+            uv run --no-sync pytest ./tests/benchmarks --codspeed
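
For context, this workflow runs whatever lives under libs/core/tests/benchmarks through pytest-codspeed. A minimal sketch of what such a benchmark might look like, assuming pytest-codspeed's `@pytest.mark.benchmark` marker (the test name and body are illustrative, not taken from this commit):

import pytest

from langchain_core.prompts import ChatPromptTemplate


@pytest.mark.benchmark
def test_prompt_formatting() -> None:
    # CodSpeed measures the cost of the marked test body.
    prompt = ChatPromptTemplate.from_messages(
        [("human", "Tell me a joke about {topic}")]
    )
    for _ in range(100):
        prompt.format_messages(topic="bears")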

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -59,6 +59,7 @@ coverage.xml
 *.py,cover
 .hypothesis/
 .pytest_cache/
+.codspeed/

 # Translations
 *.mo

README.md

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@
 [![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode&style=flat-square)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
 [<img src="https://github.com/codespaces/badge.svg" title="Open in Github Codespace" width="150" height="20">](https://codespaces.new/langchain-ai/langchain)
 [![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai)
+[![CodSpeed Badge](https://img.shields.io/endpoint?url=https://codspeed.io/badge.json)](https://codspeed.io/langchain-ai/langchain)

 > [!NOTE]
 > Looking for the JS/TS library? Check out [LangChain.js](https://github.com/langchain-ai/langchainjs).

docs/docs/concepts/retrieval.mdx

Lines changed: 1 addition & 1 deletion
@@ -92,7 +92,7 @@ structured_model = model.with_structured_output(Questions)

 # Define the system prompt
 system = """You are a helpful assistant that generates multiple sub-questions related to an input question. \n
-The goal is to break down the input into a set of sub-problems / sub-questions that can be answers in isolation. \n"""
+The goal is to break down the input into a set of sub-problems / sub-questions that can be answered independently. \n"""

 # Pass the question to the model
 question = """What are the main components of an LLM-powered autonomous agent system?"""

docs/docs/how_to/code_splitter.ipynb

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@
 "\n",
 "To view the list of separators for a given language, pass a value from this enum into\n",
 "```python\n",
-"RecursiveCharacterTextSplitter.get_separators_for_language`\n",
+"RecursiveCharacterTextSplitter.get_separators_for_language\n",
 "```\n",
 "\n",
 "To instantiate a splitter that is tailored for a specific language, pass a value from the enum into\n",

docs/docs/versions/migrating_chains/map_rerank_docs_chain.ipynb

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@
 "- Map a process to the set of documents, where the process includes generating a score;\n",
 "- Rank the results by score and return the maximum.\n",
 "\n",
-"A common process in this scenario is question-answering using pieces of context from a document. Forcing the model to generate score along with its answer helps to select for answers generated only by relevant context.\n",
+"A common process in this scenario is question-answering using pieces of context from a document. Forcing the model to generate a score along with its answer helps to select for answers generated only by relevant context.\n",
 "\n",
 "An [LangGraph](https://langchain-ai.github.io/langgraph/) implementation allows for the incorporation of [tool calling](/docs/concepts/tool_calling) and other features for this problem. Below we will go through both `MapRerankDocumentsChain` and a corresponding LangGraph implementation on a simple example for illustrative purposes."
 ]

libs/community/langchain_community/retrievers/google_vertex_ai_search.py

Lines changed: 2 additions & 0 deletions
@@ -167,6 +167,8 @@ def _convert_website_search_response(
             doc_metadata = document_dict.get("struct_data", {})
             doc_metadata["id"] = document_dict["id"]
             doc_metadata["source"] = derived_struct_data.get("link", "")
+            if derived_struct_data.get("title") is not None:
+                doc_metadata["title"] = derived_struct_data.get("title")

             if chunk_type not in derived_struct_data:
                 continue
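
The practical effect, sketched below under the assumption that the retriever ultimately packs this metadata into langchain_core Document objects (values are illustrative): a "title" key now travels with each website result whenever the search response carries one.

from langchain_core.documents import Document

# Illustrative only: the shape of metadata a website search result now yields.
doc = Document(
    page_content="...result snippet...",
    metadata={
        "id": "doc-123",                        # document id (always set)
        "source": "https://example.com/page",   # derived link (always set)
        "title": "Example page title",          # included only when present
    },
)
print(doc.metadata["title"])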

libs/community/langchain_community/vectorstores/azure_cosmos_db_no_sql.py

Lines changed: 6 additions & 0 deletions
@@ -6,6 +6,7 @@
 from typing import TYPE_CHECKING, Any, Dict, Iterable, List, Optional, Tuple

 import numpy as np
+from langchain_core._api import deprecated
 from langchain_core.documents import Document
 from langchain_core.embeddings import Embeddings
 from langchain_core.vectorstores import VectorStore
@@ -40,6 +41,11 @@ class CosmosDBQueryType(str, Enum):
     HYBRID = "hybrid"


+@deprecated(
+    since="0.3.22",
+    removal="1.0",
+    alternative_import="langchain_azure_ai.vectorstores.AzureCosmosDBNoSqlVectorSearch",
+)
 class AzureCosmosDBNoSqlVectorSearch(VectorStore):
     """`Azure Cosmos DB for NoSQL` vector store.
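
Assuming langchain_core's deprecated decorator behaves here as it does elsewhere, instantiating the class now emits a deprecation warning pointing at the langchain_azure_ai replacement. A toy sketch of that behavior (the stand-in class below is not the real vector store):

import warnings

from langchain_core._api import deprecated


@deprecated(
    since="0.3.22",
    removal="1.0",
    alternative_import="langchain_azure_ai.vectorstores.AzureCosmosDBNoSqlVectorSearch",
)
class OldStore:
    """Stand-in used only to demonstrate the decorator."""


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    OldStore()
    # Expect a LangChainDeprecationWarning naming the alternative import.
    print(caught[0].category.__name__, caught[0].message)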

libs/core/Makefile

Lines changed: 3 additions & 0 deletions
@@ -64,6 +64,9 @@ spell_check:
 spell_fix:
 	uv run --all-groups codespell --toml pyproject.toml -w

+benchmark:
+	uv run pytest tests/benchmarks --codspeed
+
 ######################
 # HELP
 ######################

libs/core/langchain_core/_api/beta_decorator.py

Lines changed: 2 additions & 2 deletions
@@ -124,7 +124,7 @@ async def awarning_emitting_wrapper(*args: Any, **kwargs: Any) -> Any:
         _name = _name or obj.__qualname__
         old_doc = obj.__doc__

-        def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:
+        def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:  # noqa: ARG001
             """Finalize the annotation of a class."""
             # Can't set new_doc on some extension objects.
             with contextlib.suppress(AttributeError):
@@ -190,7 +190,7 @@ def __set_name__(self, owner: Union[type, None], set_name: str) -> None:
             if _name == "<lambda>":
                 _name = set_name

-            def finalize(wrapper: Callable[..., Any], new_doc: str) -> Any:
+            def finalize(wrapper: Callable[..., Any], new_doc: str) -> Any:  # noqa: ARG001
                 """Finalize the property."""
                 return _BetaProperty(
                     fget=obj.fget, fset=obj.fset, fdel=obj.fdel, doc=new_doc
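
For readers unfamiliar with the suppression: ARG001 is Ruff's unused-function-argument rule, and `wrapper` is part of the expected `finalize` callback signature even though these implementations never touch it; the same pattern recurs in deprecation.py below. A stripped-down illustration (names are placeholders, not the library's code):

from typing import Any, Callable


def finalize(wrapper: Callable[..., Any], new_doc: str) -> str:  # noqa: ARG001
    # `wrapper` is accepted to satisfy the callback signature but is
    # intentionally ignored; the noqa silences Ruff's ARG001 check.
    return new_doc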

libs/core/langchain_core/_api/deprecation.py

Lines changed: 4 additions & 4 deletions
@@ -204,7 +204,7 @@ async def awarning_emitting_wrapper(*args: Any, **kwargs: Any) -> Any:
         _name = _name or obj.__qualname__
         old_doc = obj.__doc__

-        def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:
+        def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:  # noqa: ARG001
             """Finalize the deprecation of a class."""
             # Can't set new_doc on some extension objects.
             with contextlib.suppress(AttributeError):
@@ -234,7 +234,7 @@ def warn_if_direct_instance(
             raise ValueError(msg)
         old_doc = obj.description

-        def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:
+        def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:  # noqa: ARG001
             return cast(
                 "T",
                 FieldInfoV1(
@@ -255,7 +255,7 @@ def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:
             raise ValueError(msg)
         old_doc = obj.description

-        def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:
+        def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:  # noqa: ARG001
             return cast(
                 "T",
                 FieldInfoV2(
@@ -315,7 +315,7 @@ def __set_name__(self, owner: Union[type, None], set_name: str) -> None:
             if _name == "<lambda>":
                 _name = set_name

-            def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:
+            def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:  # noqa: ARG001
                 """Finalize the property."""
                 return cast(
                     "T",

libs/core/langchain_core/caches.py

Lines changed: 4 additions & 0 deletions
@@ -27,6 +27,8 @@
 from collections.abc import Sequence
 from typing import Any, Optional

+from typing_extensions import override
+
 from langchain_core.outputs import Generation
 from langchain_core.runnables import run_in_executor

@@ -194,6 +196,7 @@ def update(self, prompt: str, llm_string: str, return_val: RETURN_VAL_TYPE) -> N
             del self._cache[next(iter(self._cache))]
         self._cache[(prompt, llm_string)] = return_val

+    @override
     def clear(self, **kwargs: Any) -> None:
         """Clear cache."""
         self._cache = {}
@@ -227,6 +230,7 @@ async def aupdate(
         """
         self.update(prompt, llm_string, return_val)

+    @override
     async def aclear(self, **kwargs: Any) -> None:
         """Async clear cache."""
         self.clear()
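
The @override markers added throughout this commit come from typing_extensions and are purely static: a type checker uses them to flag a subclass method that no longer matches anything on the base class. A small self-contained sketch, independent of the cache classes above:

from typing_extensions import override


class BaseCache:
    def clear(self) -> None: ...


class InMemory(BaseCache):
    @override
    def clear(self) -> None:  # OK: genuinely overrides BaseCache.clear
        pass

    # @override
    # def claer(self) -> None:  # a type checker would reject this typo
    #     pass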

libs/core/langchain_core/callbacks/file.py

Lines changed: 11 additions & 4 deletions
@@ -5,6 +5,8 @@
 from pathlib import Path
 from typing import TYPE_CHECKING, Any, Optional, TextIO, cast

+from typing_extensions import override
+
 from langchain_core.callbacks import BaseCallbackHandler
 from langchain_core.utils.input import print_text

@@ -38,6 +40,7 @@ def __del__(self) -> None:
         """Destructor to cleanup when done."""
         self.file.close()

+    @override
     def on_chain_start(
         self, serialized: dict[str, Any], inputs: dict[str, Any], **kwargs: Any
     ) -> None:
@@ -50,17 +53,17 @@ def on_chain_start(
         """
         if "name" in kwargs:
             name = kwargs["name"]
+        elif serialized:
+            name = serialized.get("name", serialized.get("id", ["<unknown>"])[-1])
         else:
-            if serialized:
-                name = serialized.get("name", serialized.get("id", ["<unknown>"])[-1])
-            else:
-                name = "<unknown>"
+            name = "<unknown>"
         print_text(
             f"\n\n\033[1m> Entering new {name} chain...\033[0m",
             end="\n",
             file=self.file,
         )

+    @override
     def on_chain_end(self, outputs: dict[str, Any], **kwargs: Any) -> None:
         """Print out that we finished a chain.

@@ -70,6 +73,7 @@ def on_chain_end(self, outputs: dict[str, Any], **kwargs: Any) -> None:
         """
         print_text("\n\033[1m> Finished chain.\033[0m", end="\n", file=self.file)

+    @override
     def on_agent_action(
         self, action: AgentAction, color: Optional[str] = None, **kwargs: Any
     ) -> Any:
@@ -83,6 +87,7 @@
         """
         print_text(action.log, color=color or self.color, file=self.file)

+    @override
     def on_tool_end(
         self,
         output: str,
@@ -109,6 +114,7 @@ def on_tool_end(
         if llm_prefix is not None:
             print_text(f"\n{llm_prefix}", file=self.file)

+    @override
     def on_text(
         self, text: str, color: Optional[str] = None, end: str = "", **kwargs: Any
     ) -> None:
@@ -123,6 +129,7 @@ def on_text(
         """
         print_text(text, color=color or self.color, end=end, file=self.file)

+    @override
     def on_agent_finish(
         self, finish: AgentFinish, color: Optional[str] = None, **kwargs: Any
     ) -> None:
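
As a usage sketch (not part of the diff): the handler writes its chain banners to the file it is constructed with, and attaching it to any runnable via the callbacks config is enough to exercise on_chain_start/on_chain_end. The constructor is assumed to take a filename:

from langchain_core.callbacks.file import FileCallbackHandler
from langchain_core.runnables import RunnableLambda

# Assumed constructor argument: a path the handler appends chain logs to.
handler = FileCallbackHandler("chain.log")

chain = RunnableLambda(lambda text: text.upper())
# The "Entering new ... chain" / "Finished chain" banners land in chain.log.
chain.invoke("hello", config={"callbacks": [handler]})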

libs/core/langchain_core/callbacks/manager.py

Lines changed: 19 additions & 19 deletions
@@ -22,7 +22,7 @@
 from uuid import UUID

 from langsmith.run_helpers import get_tracing_context
-from typing_extensions import Self
+from typing_extensions import Self, override

 from langchain_core.callbacks.base import (
     BaseCallbackHandler,
@@ -364,19 +364,16 @@ async def _ahandle_event_for_handler(
             event = getattr(handler, event_name)
             if asyncio.iscoroutinefunction(event):
                 await event(*args, **kwargs)
+            elif handler.run_inline:
+                event(*args, **kwargs)
             else:
-                if handler.run_inline:
-                    event(*args, **kwargs)
-                else:
-                    await asyncio.get_event_loop().run_in_executor(
-                        None,
-                        cast(
-                            "Callable",
-                            functools.partial(
-                                copy_context().run, event, *args, **kwargs
-                            ),
-                        ),
-                    )
+                await asyncio.get_event_loop().run_in_executor(
+                    None,
+                    cast(
+                        "Callable",
+                        functools.partial(copy_context().run, event, *args, **kwargs),
+                    ),
+                )
     except NotImplementedError as e:
         if event_name == "on_chat_model_start":
             message_strings = [get_buffer_string(m) for m in args[1]]
@@ -1401,6 +1398,7 @@ def on_chain_start(
             inheritable_metadata=self.inheritable_metadata,
         )

+    @override
     def on_tool_start(
         self,
         serialized: Optional[dict[str, Any]],
@@ -1456,6 +1454,7 @@ def on_tool_start(
             inheritable_metadata=self.inheritable_metadata,
         )

+    @override
     def on_retriever_start(
         self,
         serialized: Optional[dict[str, Any]],
@@ -1927,6 +1926,7 @@ async def on_chain_start(
             inheritable_metadata=self.inheritable_metadata,
         )

+    @override
     async def on_tool_start(
         self,
         serialized: Optional[dict[str, Any]],
@@ -2017,6 +2017,7 @@ async def on_custom_event(
             metadata=self.metadata,
         )

+    @override
     async def on_retriever_start(
         self,
         serialized: Optional[dict[str, Any]],
@@ -2422,12 +2423,11 @@ def _configure(
                 for handler in callback_manager.handlers
             ):
                 callback_manager.add_handler(var_handler, inheritable)
-            else:
-                if not any(
-                    isinstance(handler, handler_class)
-                    for handler in callback_manager.handlers
-                ):
-                    callback_manager.add_handler(var_handler, inheritable)
+            elif not any(
+                isinstance(handler, handler_class)
+                for handler in callback_manager.handlers
+            ):
+                callback_manager.add_handler(var_handler, inheritable)
     return callback_manager