Fix docs and test for unsafe_indicies=true #566

Merged
merged 1 commit into from
Feb 14, 2025

vchuravy
Member

No description provided.

@vchuravy vchuravy marked this pull request as ready for review February 14, 2025 00:10
Member Author

vchuravy commented Feb 14, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

Contributor

Benchmark Results

| Benchmark | main | 0a3bd30... | main/0a3bd307e59a8e... |
|---|---|---|---|
| saxpy/default/Float16/1024 | 0.732 ± 0.0086 μs | 0.738 ± 0.0079 μs | 0.992 |
| saxpy/default/Float16/1048576 | 0.193 ± 0.017 ms | 0.176 ± 0.0094 ms | 1.1 |
| saxpy/default/Float16/16384 | 3.34 ± 0.039 μs | 3.34 ± 0.054 μs | 1 |
| saxpy/default/Float16/2048 | 0.913 ± 0.013 μs | 0.915 ± 0.012 μs | 0.998 |
| saxpy/default/Float16/256 | 0.591 ± 0.0076 μs | 0.591 ± 0.0071 μs | 1 |
| saxpy/default/Float16/262144 | 0.0455 ± 0.0026 ms | 0.0444 ± 0.00067 ms | 1.02 |
| saxpy/default/Float16/32768 | 6.02 ± 0.065 μs | 6.01 ± 0.11 μs | 1 |
| saxpy/default/Float16/4096 | 1.31 ± 0.026 μs | 1.3 ± 0.026 μs | 1 |
| saxpy/default/Float16/512 | 0.649 ± 0.0078 μs | 0.649 ± 0.0067 μs | 0.999 |
| saxpy/default/Float16/64 | 0.562 ± 0.0071 μs | 0.559 ± 0.0061 μs | 1.01 |
| saxpy/default/Float16/65536 | 11.8 ± 0.28 μs | 11.7 ± 0.21 μs | 1.01 |
| saxpy/default/Float32/1024 | 0.64 ± 0.0098 μs | 0.662 ± 0.0091 μs | 0.967 |
| saxpy/default/Float32/1048576 | 0.248 ± 0.038 ms | 0.219 ± 0.025 ms | 1.13 |
| saxpy/default/Float32/16384 | 2.82 ± 0.19 μs | 2.92 ± 0.6 μs | 0.965 |
| saxpy/default/Float32/2048 | 0.761 ± 0.066 μs | 0.768 ± 0.019 μs | 0.99 |
| saxpy/default/Float32/256 | 0.577 ± 0.0063 μs | 0.592 ± 0.0082 μs | 0.975 |
| saxpy/default/Float32/262144 | 0.0485 ± 0.0057 ms | 0.0458 ± 0.0037 ms | 1.06 |
| saxpy/default/Float32/32768 | 5.44 ± 0.38 μs | 5.56 ± 1 μs | 0.979 |
| saxpy/default/Float32/4096 | 1.15 ± 0.082 μs | 1.13 ± 0.068 μs | 1.02 |
| saxpy/default/Float32/512 | 0.609 ± 0.0069 μs | 0.628 ± 0.01 μs | 0.97 |
| saxpy/default/Float32/64 | 0.569 ± 0.0059 μs | 0.577 ± 0.0087 μs | 0.985 |
| saxpy/default/Float32/65536 | 12.1 ± 1.5 μs | 12.5 ± 1.7 μs | 0.971 |
| saxpy/default/Float64/1024 | 0.765 ± 0.075 μs | 0.749 ± 0.022 μs | 1.02 |
| saxpy/default/Float64/1048576 | 0.607 ± 0.058 ms | 0.519 ± 0.05 ms | 1.17 |
| saxpy/default/Float64/16384 | 5.36 ± 0.43 μs | 5.37 ± 0.61 μs | 0.999 |
| saxpy/default/Float64/2048 | 1.15 ± 0.1 μs | 1.14 ± 0.1 μs | 1.01 |
| saxpy/default/Float64/256 | 0.589 ± 0.006 μs | 0.585 ± 0.0076 μs | 1.01 |
| saxpy/default/Float64/262144 | 0.104 ± 0.015 ms | 0.0978 ± 0.014 ms | 1.06 |
| saxpy/default/Float64/32768 | 13.8 ± 1.8 μs | 12.5 ± 1.4 μs | 1.1 |
| saxpy/default/Float64/4096 | 1.71 ± 0.13 μs | 1.73 ± 0.26 μs | 0.986 |
| saxpy/default/Float64/512 | 0.637 ± 0.0095 μs | 0.637 ± 0.0093 μs | 1 |
| saxpy/default/Float64/64 | 0.569 ± 0.006 μs | 0.56 ± 0.0066 μs | 1.02 |
| saxpy/default/Float64/65536 | 25.8 ± 4.2 μs | 24.2 ± 2.7 μs | 1.07 |
| saxpy/static workgroup=(1024,)/Float16/1024 | 2.17 ± 0.03 μs | 2.17 ± 0.027 μs | 1 |
| saxpy/static workgroup=(1024,)/Float16/1048576 | 0.18 ± 0.017 ms | 0.162 ± 0.011 ms | 1.11 |
| saxpy/static workgroup=(1024,)/Float16/16384 | 4.42 ± 0.12 μs | 4.4 ± 0.076 μs | 1 |
| saxpy/static workgroup=(1024,)/Float16/2048 | 2.34 ± 0.031 μs | 2.35 ± 0.029 μs | 0.994 |
| saxpy/static workgroup=(1024,)/Float16/256 | 2.8 ± 0.04 μs | 2.81 ± 0.033 μs | 0.996 |
| saxpy/static workgroup=(1024,)/Float16/262144 | 0.0448 ± 0.0033 ms | 0.0426 ± 0.002 ms | 1.05 |
| saxpy/static workgroup=(1024,)/Float16/32768 | 6.89 ± 0.21 μs | 6.84 ± 0.18 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float16/4096 | 2.65 ± 0.039 μs | 2.66 ± 0.037 μs | 0.997 |
| saxpy/static workgroup=(1024,)/Float16/512 | 3.25 ± 0.041 μs | 3.27 ± 0.044 μs | 0.997 |
| saxpy/static workgroup=(1024,)/Float16/64 | 2.5 ± 0.22 μs | 2.51 ± 0.22 μs | 0.995 |
| saxpy/static workgroup=(1024,)/Float16/65536 | 12.9 ± 0.58 μs | 12.7 ± 0.4 μs | 1.02 |
| saxpy/static workgroup=(1024,)/Float32/1024 | 2.21 ± 0.041 μs | 2.2 ± 0.03 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float32/1048576 | 0.242 ± 0.029 ms | 0.201 ± 0.02 ms | 1.2 |
| saxpy/static workgroup=(1024,)/Float32/16384 | 4.36 ± 0.24 μs | 4.36 ± 0.38 μs | 0.999 |
| saxpy/static workgroup=(1024,)/Float32/2048 | 2.37 ± 0.061 μs | 2.35 ± 0.057 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float32/256 | 2.69 ± 0.052 μs | 2.67 ± 0.043 μs | 1 |
| saxpy/static workgroup=(1024,)/Float32/262144 | 0.0514 ± 0.0061 ms | 0.0491 ± 0.0039 ms | 1.05 |
| saxpy/static workgroup=(1024,)/Float32/32768 | 7.52 ± 0.49 μs | 7.47 ± 0.87 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float32/4096 | 2.66 ± 0.082 μs | 2.62 ± 0.071 μs | 1.01 |
| saxpy/static workgroup=(1024,)/Float32/512 | 2.7 ± 0.038 μs | 2.69 ± 0.032 μs | 1 |
| saxpy/static workgroup=(1024,)/Float32/64 | 2.7 ± 4.3 μs | 2.84 ± 5.5 μs | 0.952 |
| saxpy/static workgroup=(1024,)/Float32/65536 | 16 ± 1.9 μs | 14.9 ± 1.5 μs | 1.07 |
| saxpy/static workgroup=(1024,)/Float64/1024 | 2.33 ± 0.059 μs | 2.32 ± 0.056 μs | 1 |
| saxpy/static workgroup=(1024,)/Float64/1048576 | 0.658 ± 0.073 ms | 0.492 ± 0.048 ms | 1.34 |
| saxpy/static workgroup=(1024,)/Float64/16384 | 7.5 ± 0.53 μs | 7.28 ± 0.6 μs | 1.03 |
| saxpy/static workgroup=(1024,)/Float64/2048 | 2.6 ± 0.07 μs | 2.6 ± 0.082 μs | 0.998 |
| saxpy/static workgroup=(1024,)/Float64/256 | 2.63 ± 0.059 μs | 2.65 ± 0.057 μs | 0.993 |
| saxpy/static workgroup=(1024,)/Float64/262144 | 0.11 ± 0.014 ms | 0.0963 ± 0.0089 ms | 1.14 |
| saxpy/static workgroup=(1024,)/Float64/32768 | 16.9 ± 2.2 μs | 14.8 ± 1.6 μs | 1.14 |
| saxpy/static workgroup=(1024,)/Float64/4096 | 3.15 ± 0.15 μs | 3.16 ± 0.22 μs | 0.999 |
| saxpy/static workgroup=(1024,)/Float64/512 | 2.64 ± 0.072 μs | 2.68 ± 0.072 μs | 0.987 |
| saxpy/static workgroup=(1024,)/Float64/64 | 2.59 ± 0.057 μs | 2.61 ± 0.061 μs | 0.992 |
| saxpy/static workgroup=(1024,)/Float64/65536 | 28.6 ± 3.6 μs | 26.9 ± 2.9 μs | 1.06 |
| time_to_load | 0.332 ± 0.0042 s | 0.319 ± 0.0094 s | 1.04 |
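
Judging by its header, the last column of the results above appears to be the main timing divided by the timing for this PR's commit, so values above 1 would mean the PR ran faster. The sketch below recomputes that ratio for two rows as a sanity check. The `ratio` helper and the `rows` dictionary are illustrative names, not part of the benchmark tooling, and the inputs are the rounded means as printed, so the last digit may differ for other rows.

```python
def ratio(main_time: float, pr_time: float) -> float:
    """Return main / PR; under the assumed column meaning, > 1 means the PR was faster."""
    return main_time / pr_time


# Mean timings copied from two rows of the results (units cancel within a row).
rows = {
    "saxpy/default/Float16/1024": (0.732, 0.738),     # μs
    "saxpy/default/Float64/1048576": (0.607, 0.519),  # ms
}

for name, (main_time, pr_time) in rows.items():
    # Three significant digits, matching the precision shown in the results.
    print(f"{name}: {ratio(main_time, pr_time):.3g}")
# saxpy/default/Float16/1024: 0.992
# saxpy/default/Float64/1048576: 1.17
```

Most ratios sit within a few percent of 1 and the reported ± spreads of the two columns overlap, so small deviations are probably measurement noise rather than an effect of this change.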

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

@vchuravy vchuravy merged commit dff7afe into main Feb 14, 2025
36 of 39 checks passed
@vchuravy vchuravy deleted the 02-14-fix_docs_and_test_for_unsafe_indicies_true branch February 14, 2025 08:11