Releases: deepset-ai/haystack-experimental
v0.9.0
🔧 Updates to Experiments
Adding breakpoints to components in a Pipeline
It's now possible to set breakpoints on any component in a pipeline. Execution stops before that component runs, and a JSON file with the complete pipeline state at that point is generated.
Usage Examples
# Setting breakpoints
pipeline.run(
    data={"input": "value"},
    breakpoints={("component_name", 0)},  # break at the first visit of "component_name"
    debug_path="debug_states/"
)
This will generate a JSON file with the complete pipeline state before the next component runs, i.e. the one receiving the output of the component set as the breakpoint.
# Resuming from a saved state
state = Pipeline.load_state("debug_states/component_state.json")
pipeline.run(
    data={"input": "value"},
    resume_state=state
)
🧑🍳 See an example notebook here
💬 Share your feedback in this discussion
✅ Adopted Experiments
- chore: Remove Agent after Haystack 2.12 release (#263) @julian-risch
- chore: Remove AutoMergingRetriever after Haystack 2.12 release (#265) @davidsbatista
Other Updates
- Proposal for changing internal working of Agent (#245) @sjrl
- refactor: Streamline super components input and output mapping logic (#243) @sjrl
- refactor: Small updates to Agent. Make pipeline internal, add check for warm_up (#244) @sjrl
- feat: Updates to insertion of values into State (#239) @sjrl
- feat: Add unclassified to output of MultiFileConverter (#240) @julian-risch
- feat: Enhance tool error logs and some refactoring (#235) @sjrl
Full Changelog: v0.8.0...v0.9.0
v0.8.0
🔧 Updates to Experiments
Stream ChatGenerator responses with Agent
The Agent component now allows setting a streaming callback at init and run time. This way, an Agent's response can be streamed in chunks, enabling faster feedback for developers and end users. #233
agent = Agent(chat_generator=chat_generator, tools=[weather_tool])
response = agent.run([ChatMessage.from_user("Hello")], streaming_callback=streaming_callback)
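For reference, a minimal callback could look like the sketch below; it assumes the standard Haystack StreamingChunk dataclass and shows the init-time variant (the callback, generator, and tool names are placeholders):
from haystack.dataclasses import StreamingChunk

def streaming_callback(chunk: StreamingChunk) -> None:
    # Print each streamed chunk as soon as it arrives
    print(chunk.content, end="", flush=True)

# Setting the callback at init time instead of run time
agent = Agent(chat_generator=chat_generator, tools=[weather_tool], streaming_callback=streaming_callback)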
🐛 Bug Fixes
- We fixed a bug that prevented ComponentTool from working with Jinja2-based components (PromptBuilder, ChatPromptBuilder, ConditionalRouter, OutputAdapter); see the sketch after this list. #234
- The Agent component now deserializes Tools with the right class and uses deserialize_tools_inplace. #213 #222
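As an illustration of the first fix, a Jinja2-based component can now be wrapped like any other component; a minimal sketch reusing the ComponentTool import path shown in the v0.7.0 notes (the template and variable names are placeholders):
from haystack.components.builders import PromptBuilder
from haystack_experimental.tools.component_tool import ComponentTool

# Jinja2-based component: its inputs are defined by the template variables
summary_builder = PromptBuilder(template="Summarize the following text:\n{{ text }}")

# Previously this raised an error; ComponentTool can now expose template-driven
# components such as PromptBuilder to an LLM as a callable tool.
summary_tool = ComponentTool(component=summary_builder)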
✅ Adopted Experiments
- chore: remove LLMMetadataExtractor by @davidsbatista in #227
- chore: Remove some missed utility functions from previous experiments by @sjrl in #232
- chore: removing async version of InMemoryDocumentStore, DocumentWriter, OpenAIChatGenerator, InMemory Retrievers by @davidsbatista in #220
- chore: remove pipeline experiments by @mathislucka in #214
🛑 Discontinued Experiments
- chore: remove evaluation harness experiment by @julian-risch in #231
Full Changelog: v0.7.0...v0.8.0
v0.7.0
🧪 New Experiments
New Agent component
The Agent component enables tool-calling functionality with provider-agnostic chat model support and can be used as a standalone component or within a pipeline.
👉 See the Agent in action: 🧑🍳 Build a GitHub Issue Resolver Agent
from haystack.dataclasses import ChatMessage
from haystack.components.websearch import SerperDevWebSearch
from haystack_experimental.tools.component_tool import ComponentTool
from haystack_experimental.components.agents import Agent
from haystack.components.generators.chat import OpenAIChatGenerator

web_tool = ComponentTool(
    component=SerperDevWebSearch(),
)

agent = Agent(
    chat_generator=OpenAIChatGenerator(model="gpt-4o-mini"),
    tools=[web_tool],
    exit_condition="text",
)

result = agent.run(
    messages=[ChatMessage.from_user("Find information about Haystack")]
)
Improved ComponentTool and @tool Decorator
The ComponentTool and the @tool decorator have been extended for better integration with the new Agent component.
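For illustration, a plain function can be turned into a tool and handed to the Agent; this is a minimal sketch that assumes the decorator is importable from haystack_experimental.tools, so adjust the import to your installed version:
from haystack.dataclasses import ChatMessage
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack_experimental.components.agents import Agent
from haystack_experimental.tools import tool  # assumed import path for the decorator

@tool
def get_weather(city: str) -> str:
    """Return a short weather report for the given city."""
    return f"The weather in {city} is sunny."

# The decorated function is passed to the Agent like any other tool
agent = Agent(chat_generator=OpenAIChatGenerator(model="gpt-4o-mini"), tools=[get_weather])
result = agent.run(messages=[ChatMessage.from_user("How is the weather in Paris?")])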
New Ready-Made SuperComponents
Introducing new SuperComponents that bundle commonly used components and logic for indexing pipelines: MultiFileConverter, SentenceTransformersDocumentIndexer, DocumentPreprocessor.
from haystack_experimental.super_components.converters import MultiFileConverter
# process all common file types (.csv, .docx, .html, .json, .md, .txt, .pdf, .pptx, .xlsx) with one component
converter = MultiFileConverter()
converter.run(sources=["test.txt", "test.pdf"], meta={})
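The other two SuperComponents can be used in a similar way; the following sketch assumes module paths analogous to the converter above and default constructor arguments, so treat the exact names and parameters as placeholders:
from haystack import Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack_experimental.super_components.preprocessors import DocumentPreprocessor  # assumed path
from haystack_experimental.super_components.indexers import SentenceTransformersDocumentIndexer  # assumed path

docs = [Document(content="Haystack is an open-source framework for building LLM applications.")]

# Clean and split documents with one component
processed = DocumentPreprocessor().run(documents=docs)["documents"]

# Embed the documents and write them into a document store
store = InMemoryDocumentStore()
indexer = SentenceTransformersDocumentIndexer(document_store=store)
indexer.run(documents=processed)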
What's Changed
- docs: add Supercomponent pydoc, delete outdated by @dfokina in #193
- docs: updating trace comparison tool README.md by @davidsbatista in #195
- chore: Create issue templates for adding, removing, moving an experiment by @julian-risch in #192
- chore: remove OpenSearch from experimental by @anakin87 in #200
- fix: fixing auto-merging tests, removing hard-coded doc ids by @davidsbatista in #202
- chore: add tool related code to prepare Agent PR by @mathislucka in #203
- feat: add file and indexing related super components by @mathislucka in #184
- docs: Add SuperComponent to catalog by @julian-risch in #190
- feat: Introduce Agent by @mathislucka in #175
- docs: add pydoc config for Agent component by @julian-risch in #208
- docs: Notebook for Agent component by @mathislucka in #204
- docs: add MultiFileConverter, SentenceTransformersDocumentIndexer, and DocumentPreprocessor to docs by @dfokina in #210
Full Changelog: v0.6.0...v0.7.0
v0.6.0
New Experiments
- New SuperComponent abstraction that lets you wrap any pipeline into a friendly component interface and create your own super components 1
from haystack_experimental import SuperComponent

# rag_pipeline = basic RAG pipeline with retriever, prompt builder, generator and answer builder components

input_mapping = {
    "search_query": ["retriever.query", "prompt_builder.query", "answer_builder.query"]
}
output_mapping = {
    "answer_builder.answers": "final_answers"
}

wrapper = SuperComponent(
    pipeline=rag_pipeline,
    input_mapping=input_mapping,
    output_mapping=output_mapping
)

result = wrapper.run(search_query="What is the capital of France?")
print(result["final_answers"][0])
- New AsyncPipeline that can schedule components to run concurrently 2
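A minimal usage sketch, assuming AsyncPipeline mirrors the regular Pipeline API and exposes an awaitable run_async method (the method name and data keys are assumptions):
import asyncio
from haystack_experimental import AsyncPipeline

# async_pipeline = AsyncPipeline() with components added and connected as with a regular Pipeline

async def main():
    # Components without pending dependencies can be scheduled concurrently
    result = await async_pipeline.run_async(data={"prompt_builder": {"query": "What is the capital of France?"}})
    print(result)

asyncio.run(main())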
Other Updates:
- Added a debug/tracing script to compare two pipeline runs with the old and new pipeline run logic 3
- Changed LLMMetadataExtractor to use ChatGenerator instead of Generator 4
Full Changelog: v0.5.0...v0.6.0
v0.5.0
New Experiments
- New Pipeline class with new pipeline run logic - Pipeline example
Full Changelog: v0.4.0...v0.5.0
🧬 New Pipeline Logic
This release introduces a reimplementation of the pipeline-run logic to resolve multiple issues, improving reliability and performance. These changes will also be included in Haystack 2.10.
Fixed Issues:
- Exceptions in pipelines with two cycles
  - Pipelines with two cycles sharing an optional (like in PromptBuilder) or a greedy variadic edge (e.g., in BranchJoiner) might raise exceptions. Details here.
- Incorrect execution in cycles with multiple optional or variadic edges
  - Entry points for cycles were non-deterministic, causing components to run with unexpected inputs or multiple times. This impacted execution time and final outputs.
- Missing intermediate outputs in cycles
  - Outputs produced within a cycle were overwritten, preventing downstream components from receiving them.
- Premature execution of lazy variadic components
  - Components like DocumentJoiner sometimes executed before receiving all inputs, leading to repeated partial executions that affected downstream results.
- Order-sensitive behavior in add_component and connect
  - Some bugs above occurred due to specific orderings of add_component and connect in pipeline creation, causing non-deterministic behavior in cyclic pipelines.
Am I Affected by this Change?
- Non-cyclic pipelines without lazy variadic components: No impact, your pipelines should function as before.
- Non-cyclic pipelines with lazy variadic components: Check inputs and outputs of components like DocumentJoiner for issues #4 and #5. Use LoggingTracer with content tracing to validate behavior (see the sketch after this list). Component execution order now uses lexicographical sorting; rename upstream components if necessary.
- Pipelines with cycles: Review your pipeline outputs as well as the component inputs and outputs to ensure expected behavior, as you may encounter any of the above issues.
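For reference, content tracing with the LoggingTracer can be enabled like this (standard Haystack tracing API):
import logging

from haystack import tracing
from haystack.tracing.logging_tracer import LoggingTracer

logging.basicConfig(format="%(levelname)s - %(name)s - %(message)s", level=logging.WARNING)
logging.getLogger("haystack").setLevel(logging.DEBUG)

# Include component inputs and outputs in the traces so you can compare
# behavior before and after upgrading to the new pipeline run logic
tracing.tracer.is_content_tracing_enabled = True
tracing.enable_tracing(LoggingTracer())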
Share your comments in discussion #177
v0.4.0
New Experiments
- AsyncPipeline and async-enabled components - AsyncPipeline example
Full Changelog: v0.3.0...v0.4.0
v0.3.0
New Experiments
- Metadata extraction with LLM - LLMMetadataExtractor
- Support for tools in OllamaChatGenerator, HuggingFaceAPIChatGenerator, AnthropicChatGenerator
Full Changelog: v0.2.0...v0.3.0
v0.2.0
New experiments
- Tool Calling - refactored ChatMessage dataclass, Tool dataclass, refactored OpenAIChatGenerator, ToolInvoker component (see the sketch after this list)
- Memory Components - ChatMessageWriter, ChatMessageRetriever, InMemoryChatMessageStore
- Document Splitting & Retrieval Techniques - AutoMergingRetriever, HierarchicalDocumentSplitter
- Metadata extraction with LLM - LLMMetadataExtractor
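For orientation, a tool can be defined with the Tool dataclass and executed with ToolInvoker; this sketch assumes the experimental import paths of that release and a tools parameter on the refactored OpenAIChatGenerator, so adjust names to your installed version:
from haystack_experimental.dataclasses import ChatMessage, Tool  # assumed import paths
from haystack_experimental.components.generators.chat import OpenAIChatGenerator  # assumed import path
from haystack_experimental.components.tools import ToolInvoker  # assumed import path

def get_weather(city: str) -> dict:
    """Dummy tool function used for illustration."""
    return {"city": city, "weather": "sunny"}

weather_tool = Tool(
    name="get_weather",
    description="Return the current weather for a city",
    parameters={
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
    function=get_weather,
)

# The refactored generator can produce tool calls, which ToolInvoker then executes
generator = OpenAIChatGenerator(model="gpt-4o-mini", tools=[weather_tool])
replies = generator.run(messages=[ChatMessage.from_user("What's the weather in Berlin?")])["replies"]

tool_results = ToolInvoker(tools=[weather_tool]).run(messages=replies)
print(tool_results)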
Full Changelog: v0.1.1...v0.2.0
v0.1.1
What's Changed
- Add google colab link to the example harness nb by @bilgeyucel in #36
- doc: Fix links to API docs and pip packages in RAG eval harness notebook by @shadeMe in #38
- Improve OpenAPITool corner cases handling (missing operationId, servers under paths, etc) by @vblagoje in #37
- fix: Centralize OpenAPI schema reference resolution by @vblagoje in #40
- feat: Add telemetry to RAG eval harness by @shadeMe in #42
- docs: Add section about telemetry to the readme by @shadeMe in #44
- docs: add OpenAPITool notebook by @anakin87 in #45
New Contributors
- @bilgeyucel made their first contribution in #36
- @anakin87 made their first contribution in #45
Full Changelog: v0.1.0...v0.1.1
v0.1.0
New Experiments 🚀
- An evaluation harness for RAG pipelines
- A function calling component for OpenAI.
- An OpenAPI-based function calling component for OpenAI, Cohere and Anthropic.
Full Changelog: v0.0.1...v0.1.0