Releases: databricks/compose-rl

v0.6.0

02 Jun 21:45

Full Changelog: v0.5.0...v0.6.0

v0.5.0

15 May 04:43
d65335f

What's new

  • Online RL Algorithms: We now support PPO and GRPO for online RL training
  • RL with Verifiable Rewards: We've added support for verifiable rewards with online RL algorithms, along with evaluations during training.
  • Registries for extensible and composable design
  • Robust vLLM support for efficient inference during online RL training

Full Changelog: v0.4.0...v0.5.0

v0.4.0

09 Apr 06:39
c733544

What's Changed

  • Move non optional deps to non optional by @dakinggg in #23

Full Changelog: v0.3.0...v0.4.0

v0.3.0

08 Apr 23:16
9d3d63f

Full Changelog: v0.2.1...v0.3.0

v0.2.1

07 Mar 19:27
9f8237f

Cutting a new release to include cpu-release

v0.2.0

06 Mar 22:02
db44681

Added support for classifiers in LLMs, fixed some tests, and updated release dependencies.

v0.1.0

05 Feb 00:14
36c7a85

This is the first release of Databricks' Compose-RL, a library designed to streamline the integration of various reinforcement learning from human feedback (RLHF) techniques.