WriteHERE: An Open Writing Project Based on Heterogeneous Recursive Planning
WriteHERE is an open-source framework that revolutionizes long-form writing through human-like adaptive planning. Unlike traditional AI writing tools that follow rigid workflows, WriteHERE dynamically decomposes writing tasks and integrates three fundamental capabilities:
- Recursive Planning: Breaks down complex writing tasks into manageable subtasks
- Heterogeneous Integration: Seamlessly combines retrieval, reasoning, and composition
- Dynamic Adaptation: Adjusts the writing process in real-time based on context
Our evaluations show that this approach consistently outperforms state-of-the-art methods in both fiction writing and technical report generation.
WriteHERE is developed with these core principles:
- Fully Open Source: All code is freely available for use, modification, and distribution under the MIT License
- Non-Commercial: Developed for research and educational purposes without commercial interests
- Full Transparency: The entire system architecture and decision-making processes are transparent to users
- Community-Driven: We welcome contributions, feedback, and collaborative improvements from the community
- Python 3.6+
- Node.js 14+ (for the frontend)
- API keys for:
- OpenAI (GPT models)
- Anthropic (Claude models)
- SerpAPI (for search functionality in report generation)
You can use WriteHERE in two ways: with or without the visualization interface.
This is the simpler approach when you don't need real-time visualization or want to use the engine for batch processing.
- Set up the environment:
python -m venv venv
source venv/bin/activate
pip install -v -e .
# Create api_key.env file based on example
cp recursive/api_key.env.example recursive/api_key.env
# Edit the file to add your keys
nano recursive/api_key.env
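The variable names below are hypothetical, shown only to illustrate the shape of the file; consult recursive/api_key.env.example for the exact names the engine expects:

```shell
# Hypothetical variable names -- check recursive/api_key.env.example
# for the actual keys used by WriteHERE.
OPENAI_API_KEY=your-openai-key
CLAUDE_API_KEY=your-anthropic-key
SERPAPI_KEY=your-serpapi-key
```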
- Run the engine directly:
cd recursive
python engine.py --filename <input_file> --output-filename <output_file> --done-flag-file <done_file> --model <model_name> --mode <story|report>
Example for generating a story:
python engine.py --filename ../test_data/meta_fiction.jsonl --output-filename ./project/story/output.jsonl --done-flag-file ./project/story/done.txt --model gpt-4o --mode story
Example for generating a report:
python engine.py --filename ../test_data/qa_test.jsonl --output-filename ./project/qa/result.jsonl --done-flag-file ./project/qa/done.txt --model claude-3-sonnet --mode report
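For batch processing, it can be convenient to wrap the CLI in a small driver script. The sketch below simply assembles the argument list from the flags documented above and, optionally, blocks until the done-flag file appears; the file paths and helper names are illustrative, not part of the framework:

```python
import subprocess
import time
from pathlib import Path

def engine_command(input_file, output_file, done_file, model, mode):
    """Build the engine.py argument list from the documented flags."""
    assert mode in ("story", "report")
    return [
        "python", "engine.py",
        "--filename", str(input_file),
        "--output-filename", str(output_file),
        "--done-flag-file", str(done_file),
        "--model", model,
        "--mode", mode,
    ]

def run_engine(input_file, output_file, done_file, model, mode, poll=5):
    """Launch the engine and wait for the done-flag file to appear."""
    cmd = engine_command(input_file, output_file, done_file, model, mode)
    proc = subprocess.Popen(cmd)
    while proc.poll() is None and not Path(done_file).exists():
        time.sleep(poll)
    return proc
```

A driver like this makes it easy to queue several inputs sequentially, checking each done-flag file before starting the next run.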
This option provides a web interface to visualize and monitor the writing process in real-time.
- One-step setup and launch:
./setup_env.sh # One-time setup of the environment
./start.sh # Start the application
This will:
- Create a clean Python virtual environment
- Install all required dependencies
- Start the backend server on port 5001
- Start the frontend on port 3000
- Open your browser at http://localhost:3000
You can customize the ports using command-line arguments:
./start.sh --backend-port 8080 --frontend-port 8000
If you're using Anaconda and encounter dependency conflicts, use:
./run_with_anaconda.sh
This script creates a dedicated Anaconda environment called 'writehere' with the correct dependencies and runs both servers.
You can customize ports with this script:
./run_with_anaconda.sh --backend-port 8080 --frontend-port 8000
If you prefer to set up the components manually:
- Create a Python virtual environment:
python -m venv venv
source venv/bin/activate
- Install main dependencies:
pip install -v -e .
- Install backend server dependencies:
pip install -r backend/requirements.txt
- Start the backend server:
cd backend
python server.py
To use a custom port:
python server.py --port 8080
- Install frontend dependencies:
cd frontend
npm install
- Start the frontend development server:
npm start
To use a custom port:
PORT=8000 npm start
If you encounter any issues, please check the Troubleshooting Guide for common problems and solutions.
- Recursive Task Decomposition: Breaks down complex writing tasks into manageable subtasks
- Dynamic Integration: Seamlessly combines retrieval, reasoning, and composition tasks
- Adaptive Workflow: Flexibly adjusts the writing process based on context and requirements
- Versatile Applications: Supports both creative fiction and technical report generation
- User-Friendly Interface: Intuitive web interface for easy interaction
- Real-Time Visualization: See the agent's "thinking process" as it works
- Transparent Operation: All agent decisions and processes are visible to users
- Fully Customizable: Modify prompts, parameters, and workflows to suit your needs
.
├── backend/              # Backend Flask server
├── frontend/             # React frontend
├── recursive/            # Core engine implementation
│   ├── agent/            # Agent implementation and prompts
│   ├── executor/         # Task execution modules
│   ├── llm/              # Language model integrations
│   ├── utils/            # Utility functions and helpers
│   ├── cache.py          # Caching for improved efficiency
│   ├── engine.py         # Core planning and execution engine
│   ├── graph.py          # Task graph representation
│   ├── memory.py         # Memory management
│   ├── test_run_report.sh # Script for generating reports
│   └── test_run_story.sh # Script for generating stories
├── test_data/            # Example data for testing
└── start.sh              # All-in-one startup script
When using the visualization interface, you can see the task execution process in real-time. As the agent works on generating content, you can observe:
- The hierarchical decomposition of tasks
- Which tasks are currently being worked on
- The status of each task (ready, in progress, completed)
- The type of each task (retrieval, reasoning, composition)
This visualization provides insight into the agent's "thinking process" and helps you understand how complex writing tasks are broken down and solved step by step.
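As an illustration only (not the actual graph.py implementation), the task types and statuses listed above can be modeled roughly like this:

```python
from dataclasses import dataclass, field
from enum import Enum

class TaskType(Enum):
    RETRIEVAL = "retrieval"
    REASONING = "reasoning"
    COMPOSITION = "composition"

class TaskStatus(Enum):
    READY = "ready"
    IN_PROGRESS = "in progress"
    COMPLETED = "completed"

@dataclass
class TaskNode:
    """One node in the hierarchical task decomposition."""
    goal: str
    type: TaskType
    status: TaskStatus = TaskStatus.READY
    subtasks: list["TaskNode"] = field(default_factory=list)

    def is_done(self) -> bool:
        # A task counts as done once it and all of its subtasks are completed.
        return self.status is TaskStatus.COMPLETED and all(
            t.is_done() for t in self.subtasks
        )
```

A real task graph would also track dependencies between sibling tasks, which is what allows the engine to interleave planning and execution rather than following a fixed outline.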
We welcome contributions from the community to help improve WriteHERE! Here's how you can contribute:
- Fork the repository and create your feature branch from main
- Set up your development environment following the installation instructions above
- Make your changes, ensuring they follow the project's coding style and conventions
- Add tests for any new functionality
- Ensure all tests pass by running the test suite
- Submit a pull request with a clear description of your changes and their benefits
- Use the Issues tab to report bugs or suggest new features
- For bugs, include detailed steps to reproduce, expected behavior, and actual behavior
- For feature requests, describe the functionality you'd like to see and how it would benefit the project
- Help improve our documentation by fixing errors, adding examples, or clarifying instructions
- Documentation changes can be submitted through pull requests just like code changes
- Answer questions from other users in the Issues section
- Share your experiences and use cases with the community
- Follow the existing code style and architecture
- Document new functions, classes, and modules
- Write clear commit messages that explain the purpose of your changes
- Keep pull requests focused on a single feature or bug fix
By contributing to WriteHERE, you agree that your contributions will be licensed under the project's MIT License.
If you use this code in your research, please cite our paper:
@misc{xiong2025heterogeneousrecursiveplanning,
  title={Beyond Outlining: Heterogeneous Recursive Planning for Adaptive Long-form Writing with Language Models},
  author={Ruibin Xiong and Yimeng Chen and Dmitrii Khizbullin and Mingchen Zhuge and Jürgen Schmidhuber},
  year={2025},
  eprint={2503.08275},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2503.08275}
}
This project is open-source. You are free to use, modify, and distribute the code for research, educational, and personal purposes.