⚡️ Improve SequenceSet#xor performance by ~2x #463

Closed
wants to merge 1 commit into from

Conversation

nevans
Collaborator

@nevans nevans commented Apr 29, 2025

Obviously, the performance improvement is highly dependent on what data you're using, whether YJIT is enabled, etc. I saw results ranging from 1.5x faster to 2.7x faster. The benchmark script is included.

For a benchmark run using sets with 10k members:

```
new impl   79.061 (± 8.9%) i/s   (12.65 ms/i) -    392.000 in   5.004322s
old impl   32.736 (±15.3%) i/s   (30.55 ms/i) -    162.000 in   5.052839s
```

The old implementation was ~2.42x slower.

For a benchmark run using very sparse sets with 100 members:

```
new impl   4.295k (±13.5%) i/s  (232.81 μs/i) -     21.476k in   5.102536s
old impl   2.459k (±11.3%) i/s  (406.69 μs/i) -     12.095k in   5.000148s
```

This time, the old implementation was ~1.75x slower.
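For reference, the "~2.42x" and "~1.75x" figures are just the ratio of the two iterations-per-second values in each run. A minimal sketch of that arithmetic follows; the `RESULTS` table and the `speedup` helper are hypothetical names used only for illustration and are not part of the benchmark script:

```ruby
# Iterations per second, copied from the benchmark output quoted above.
# (4.295k and 2.459k are written out as 4295.0 and 2459.0.)
RESULTS = {
  "10k members"         => { new: 79.061, old: 32.736 },
  "sparse, 100 members" => { new: 4295.0, old: 2459.0 },
}.freeze

# Hypothetical helper: how many times faster the new implementation ran.
def speedup(new_ips, old_ips)
  (new_ips / old_ips).round(2)
end

RESULTS.each do |label, ips|
  puts format("%-20s %.2fx", label, speedup(ips[:new], ips[:old]))
end
```

Given the error margins reported (±9% to ±15%), these ratios are best read as rough estimates.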

I have some other (much bigger) PRs that should give even bigger performance (and memory use) improvements, but this is simple and effective.

Obviously, the performance improvement is highly dependent on what data
you're using, whether YJIT is enabled, etc.  I saw results ranging from
1.7x faster to 2.6x faster.  The benchmark script is included.

For a benchmark run using sets with 10k members:
```
new impl   79.061 (± 8.9%) i/s   (12.65 ms/i) -    392.000 in   5.004322s
old impl   32.736 (±15.3%) i/s   (30.55 ms/i) -    162.000 in   5.052839s
```
The old implementation was ~2.42x slower.

For a benchmark run using very sparse sets with 100 members:
```
new impl   4.295k (±13.5%) i/s  (232.81 μs/i) -     21.476k in   5.102536s
old impl   2.459k (±11.3%) i/s  (406.69 μs/i) -     12.095k in   5.000148s
```
This time, the old implementation was ~1.75x slower.

I have some other (much bigger) PRs that should give even bigger
performance improvements, but this is simple and effective.
@nevans nevans closed this Apr 29, 2025
Collaborator Author

nevans commented Apr 29, 2025

I didn't have proper tests over #xor and had accidentally pasted in an alternate implementation for #union! D'oh!
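As an illustration of the kind of check that would catch this mix-up, here is a minimal sketch that uses Ruby's standard `Set` as a stand-in. It does not use `Net::IMAP::SequenceSet` and is not the project's actual test. The `reference_xor` helper and the sample values are hypothetical. The key point is that the inputs must overlap, since xor and union only differ on shared members:

```ruby
require "set"

# Reference definition of xor: members that are in exactly one of the sets.
def reference_xor(a, b)
  (a | b) - (a & b)
end

# Overlapping inputs: 3 and 4 appear in both sets.
a = Set[1, 2, 3, 4]
b = Set[3, 4, 5, 6]

xor_result   = a ^ b   # Set#^ is the standard library's exclusive-or
union_result = a | b

puts "xor:   #{xor_result.to_a.sort.inspect}"
puts "union: #{union_result.to_a.sort.inspect}"
puts "xor matches reference? #{xor_result == reference_xor(a, b)}"
puts "xor differs from union? #{xor_result != union_result}"
```

With disjoint inputs the two operations return the same members, so a test that only uses non-overlapping sets would pass even with a union implementation pasted in by mistake.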
