Add Karcher Mean Merge Method #546

Merged
merged 9 commits into from
Mar 31, 2025

Conversation

win10ogod
Contributor

@win10ogod win10ogod commented Mar 29, 2025

Add Karcher Mean Merge Method

Description

This PR adds a new merge method based on the Riemannian (Karcher) mean concept. The Karcher mean provides a geometrically meaningful way to average points on a manifold, which is particularly useful for merging model weights that can be interpreted as points on a hypersphere.

Features

  • Implements the Karcher mean algorithm for weight fusion
  • Supports configurable parameters:
    • max_iter: Maximum iterations for the Karcher mean algorithm (default: 10)
    • tol: Convergence tolerance (default: 1e-5)
  • Includes comprehensive tests for various scenarios
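To make the role of max_iter and tol concrete, here is a minimal NumPy sketch of a Karcher mean iteration on the unit hypersphere. It is an illustration only, not the code added in this PR: the function name karcher_mean_sphere, the weights argument, and the choice of starting point are assumptions, and the sketch does not handle degenerate inputs such as exactly antipodal points.

```python
import numpy as np

def karcher_mean_sphere(points, weights=None, max_iter=10, tol=1e-5):
    """Weighted Karcher mean of vectors projected onto the unit sphere (sketch)."""
    x = np.asarray(points, dtype=np.float64)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)  # project inputs onto the sphere
    n = x.shape[0]
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, dtype=np.float64)
    w = w / w.sum()

    # Assumed starting point: the normalized Euclidean weighted mean.
    mu = w @ x
    mu = mu / np.linalg.norm(mu)

    for _ in range(max_iter):
        cos = np.clip(x @ mu, -1.0, 1.0)
        theta = np.arccos(cos)                        # geodesic distance to each point
        # Log map: component of each point orthogonal to mu, rescaled to length theta.
        residual = x - cos[:, None] * mu
        res_norm = np.linalg.norm(residual, axis=1)
        scale = np.where(res_norm > 1e-12, theta / np.maximum(res_norm, 1e-12), 0.0)
        tangent = (w * scale) @ residual              # weighted mean tangent vector
        step = np.linalg.norm(tangent)
        if step < tol:                                # converged: mean tangent is ~0
            break
        # Exp map: move along the geodesic in the direction of the mean tangent.
        mu = np.cos(step) * mu + np.sin(step) * tangent / step
        mu = mu / np.linalg.norm(mu)                  # guard against numerical drift
    return mu

# Three points on a circle at angles 0, 0.2 and 1.0 rad.
angles = np.array([0.0, 0.2, 1.0])
pts = np.stack([np.cos(angles), np.sin(angles)], axis=1)
mean_dir = karcher_mean_sphere(pts)
```

For points confined to a short arc of a circle, as in the usage lines above, the result should point at the arithmetic mean of the angles (0.4 rad here). A smaller tol or a larger max_iter trades run time for a tighter fixed point.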

Implementation Details

  • The implementation follows the existing merge method patterns in mergekit
  • Properly handles tensor normalization and scaling
  • Includes detailed documentation and type hints
  • Adds tests for basic functionality, custom parameters, multi-model merging, and VLM compatibility
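One plausible reading of "tensor normalization and scaling" is that each weight tensor is split into a magnitude and a unit direction, the directions are averaged on the sphere, and the result is scaled back by a combined magnitude. The sketch below shows only that split and rescale step. The helper names split_norm_direction and rescale, and the use of a weighted mean of the norms, are assumptions made for illustration and are not taken from the PR.

```python
import numpy as np

def split_norm_direction(tensor, eps=1e-8):
    """Return (norm, unit direction) of a tensor flattened to one vector."""
    flat = np.asarray(tensor, dtype=np.float64).ravel()
    norm = float(np.linalg.norm(flat))
    if norm < eps:
        # A (near-)zero tensor has no meaningful direction.
        return 0.0, np.zeros_like(flat)
    return norm, flat / norm

def rescale(direction, norms, weights, shape):
    """Scale a merged unit direction by the weighted mean of the input norms."""
    target_norm = float(np.dot(weights, norms))
    return (target_norm * np.asarray(direction)).reshape(shape)

a = np.array([[3.0, 0.0], [0.0, 4.0]])  # Frobenius norm 5
b = np.array([[6.0, 0.0], [0.0, 8.0]])  # Frobenius norm 10, same direction as a
norm_a, dir_a = split_norm_direction(a)
norm_b, dir_b = split_norm_direction(b)

# Both inputs share one direction, so any spherical mean of them is that direction.
merged = rescale(dir_a, [norm_a, norm_b], [0.5, 0.5], a.shape)
```

Separating magnitude from direction keeps the averaging step on the unit sphere, where geodesic distances are well defined, while still returning a tensor on the same scale as the inputs.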

References


github-actions bot commented Mar 29, 2025

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@6DammK9

6DammK9 commented Mar 29, 2025

Warning: abstract math

I think this may coincide with the "Geometric Median" in #345, which has been included in a merged PR in another repo.
Mathematically (well, I'm not a math major), the geometric median would be a Euclidean representation of the Karcher mean, which is also the Riemannian center of mass on a Riemannian manifold. The iterative Weiszfeld's algorithm closely resembles the proposed algorithm, although I'm not sure whether the "trigonometric functions with norm" are actually equivalent to the L2 norm.

Anyway nice work!
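For readers comparing the two approaches, here is a minimal sketch of Weiszfeld's algorithm for the Euclidean geometric median. It is not code from #345 or from this PR, and the function name and defaults are assumptions. Note that Weiszfeld's iteration minimizes the sum of Euclidean distances, whereas a Karcher mean minimizes the sum of squared geodesic distances, so the two are related but not the same objective.

```python
import numpy as np

def geometric_median(points, max_iter=100, tol=1e-7, eps=1e-12):
    """Weiszfeld's algorithm: minimize the sum of Euclidean distances (sketch)."""
    x = np.asarray(points, dtype=np.float64)
    m = x.mean(axis=0)                            # start from the arithmetic mean
    for _ in range(max_iter):
        dist = np.linalg.norm(x - m, axis=1)
        inv = 1.0 / np.maximum(dist, eps)         # clamp to avoid division by zero
        new_m = (inv[:, None] * x).sum(axis=0) / inv.sum()
        if np.linalg.norm(new_m - m) < tol:
            return new_m
        m = new_m
    return m

# Three nearby points plus one far outlier.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [100.0, 100.0]])
median = geometric_median(pts)   # stays near the cluster, close to (0.5, 0.5)
mean = pts.mean(axis=0)          # dragged toward the outlier: (25.25, 25.25)
```

The inverse-distance reweighting is what makes each Weiszfeld step look like a weighted average, which is the resemblance noted above.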

@win10ogod
Contributor Author

It is slightly different: this algorithm assumes that the model's intrinsic space is nonlinear, and it uses the Karcher mean iteration.

@win10ogod
Contributor Author

This is my recent work, and there is no paper yet.
I hope you can merge this PR, @cg123.

@cg123
Collaborator

cg123 commented Mar 31, 2025

Thanks for the PR, this is quite interesting! I'd love to merge it in. Could you run the pre-commit hook to get the formatting standardized?

@win10ogod
Contributor Author

Could you please guide me? This is my first time submitting a PR.

@cg123
Collaborator

cg123 commented Mar 31, 2025

First make sure you have the dev dependencies installed, like so:

pip install -e .[dev]

And then run this command:

pre-commit run --all-files

That will autoformat the code, then you can add those changes and push them.

@win10ogod
Contributor Author

What should I do when I see "1 workflow awaiting approval"?
Do I just need to wait?

@win10ogod
Contributor Author

1 workflow awaiting approval

@win10ogod
Contributor Author

@cg123 Could you take a look? It should work now.

@cg123
Collaborator

cg123 commented Mar 31, 2025

Thanks for the PR! Merged.

@cg123 cg123 merged commit 09bbb0a into arcee-ai:main Mar 31, 2025
5 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Mar 31, 2025