[Feature] capture_non_tensor_stack #1221

Merged (5 commits) on Feb 19, 2025

@vmoens (Contributor) commented Feb 19, 2025

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 19, 2025
ghstack-source-id: 86a79f8b5aad255f3c1cffb821e71b9d06378fdb
Pull Request resolved: #1221
@facebook-github-bot added the CLA Signed label Feb 19, 2025

github-actions bot commented Feb 19, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 217. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}35$.
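
The summary counts seem to cover only the benchmarks whose throughput moved by more than about 5% in either direction. In the detailed CPU results, the bolded rows are exactly those beyond ±5%, and there are 10 bolded improvements and 35 bolded regressions. The sketch below illustrates that classification. The 5% cutoff is an assumption inferred from the table, not a documented setting, and `classify` is a hypothetical helper rather than part of the benchmark tooling.

```python
# Assumed cutoff, inferred from which rows are bolded in the results table.
THRESHOLD_PCT = 5.0


def classify(change_pct: float, threshold: float = THRESHOLD_PCT) -> str:
    """Label a benchmark by its relative throughput change (hypothetical helper)."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# A few (name, change %) pairs copied from the CPU results.
sample = [
    ("test_plain_set_nested", -4.09),
    ("test_plain_set_stack_nested", -5.08),
    ("test_values_stack_nested", 11.67),
    ("test_serialize_weights_pickle", -43.85),
    ("test_items_nested_leaf", 0.27),
]

counts = {"improved": 0, "worsened": 0, "unchanged": 0}
for name, change in sample:
    counts[classify(change)] += 1

print(counts)
```

Under this reading, a row such as `test_plain_set_nested` (-4.09%) is shown in red but does not count toward the "Worsened" total.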

Expand to view detailed results
Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 92.3720μs 20.9779μs 47.6691 KOps/s 49.7036 KOps/s $\color{#d91a1a}-4.09\%$
test_plain_set_stack_nested 67.7760μs 21.3307μs 46.8807 KOps/s 49.3906 KOps/s $\textbf{\color{#d91a1a}-5.08\%}$
test_plain_set_nested_inplace 0.1067ms 23.2987μs 42.9208 KOps/s 44.9494 KOps/s $\color{#d91a1a}-4.51\%$
test_plain_set_stack_nested_inplace 72.7950μs 23.0154μs 43.4491 KOps/s 44.8541 KOps/s $\color{#d91a1a}-3.13\%$
test_items 58.4190μs 4.2513μs 235.2210 KOps/s 238.3685 KOps/s $\color{#d91a1a}-1.32\%$
test_items_nested 0.7302ms 0.4206ms 2.3777 KOps/s 2.4438 KOps/s $\color{#d91a1a}-2.70\%$
test_items_nested_locked 0.4998ms 0.4193ms 2.3850 KOps/s 2.4927 KOps/s $\color{#d91a1a}-4.32\%$
test_items_nested_leaf 0.1641ms 77.4657μs 12.9089 KOps/s 12.8746 KOps/s $\color{#35bf28}+0.27\%$
test_items_stack_nested 0.5228ms 0.4198ms 2.3819 KOps/s 2.4485 KOps/s $\color{#d91a1a}-2.72\%$
test_items_stack_nested_leaf 0.1567ms 77.4926μs 12.9045 KOps/s 12.3810 KOps/s $\color{#35bf28}+4.23\%$
test_items_stack_nested_locked 0.7379ms 0.4231ms 2.3637 KOps/s 2.4453 KOps/s $\color{#d91a1a}-3.34\%$
test_keys 58.6890μs 3.5331μs 283.0398 KOps/s 281.5838 KOps/s $\color{#35bf28}+0.52\%$
test_keys_nested 0.2981ms 0.1658ms 6.0324 KOps/s 6.0504 KOps/s $\color{#d91a1a}-0.30\%$
test_keys_nested_locked 0.7066ms 0.1686ms 5.9310 KOps/s 5.8004 KOps/s $\color{#35bf28}+2.25\%$
test_keys_nested_leaf 0.2313ms 0.1423ms 7.0282 KOps/s 6.9759 KOps/s $\color{#35bf28}+0.75\%$
test_keys_stack_nested 0.2508ms 0.1635ms 6.1148 KOps/s 6.1250 KOps/s $\color{#d91a1a}-0.17\%$
test_keys_stack_nested_leaf 0.2233ms 0.1425ms 7.0188 KOps/s 7.1769 KOps/s $\color{#d91a1a}-2.20\%$
test_keys_stack_nested_locked 0.2965ms 0.1689ms 5.9195 KOps/s 5.8963 KOps/s $\color{#35bf28}+0.39\%$
test_values 10.8062μs 1.0415μs 960.1772 KOps/s 943.5435 KOps/s $\color{#35bf28}+1.76\%$
test_values_nested 0.1689ms 63.7501μs 15.6862 KOps/s 15.9462 KOps/s $\color{#d91a1a}-1.63\%$
test_values_nested_locked 0.1256ms 63.2777μs 15.8033 KOps/s 15.8605 KOps/s $\color{#d91a1a}-0.36\%$
test_values_nested_leaf 0.1354ms 72.6930μs 13.7565 KOps/s 13.9319 KOps/s $\color{#d91a1a}-1.26\%$
test_values_stack_nested 0.1162ms 63.0182μs 15.8684 KOps/s 14.2105 KOps/s $\textbf{\color{#35bf28}+11.67\%}$
test_values_stack_nested_leaf 0.3073ms 73.3010μs 13.6424 KOps/s 14.0251 KOps/s $\color{#d91a1a}-2.73\%$
test_values_stack_nested_locked 0.1329ms 63.8472μs 15.6624 KOps/s 15.6755 KOps/s $\color{#d91a1a}-0.08\%$
test_membership 30.1560μs 0.9273μs 1.0784 MOps/s 1.1572 MOps/s $\textbf{\color{#d91a1a}-6.81\%}$
test_membership_nested 53.0180μs 2.8840μs 346.7383 KOps/s 352.4050 KOps/s $\color{#d91a1a}-1.61\%$
test_membership_nested_leaf 35.1150μs 2.9150μs 343.0504 KOps/s 352.4026 KOps/s $\color{#d91a1a}-2.65\%$
test_membership_stacked_nested 52.4670μs 2.9298μs 341.3181 KOps/s 349.0682 KOps/s $\color{#d91a1a}-2.22\%$
test_membership_stacked_nested_leaf 45.3450μs 2.9113μs 343.4946 KOps/s 349.6518 KOps/s $\color{#d91a1a}-1.76\%$
test_membership_nested_last 67.5360μs 4.3791μs 228.3598 KOps/s 235.3199 KOps/s $\color{#d91a1a}-2.96\%$
test_membership_nested_leaf_last 93.8250μs 4.3743μs 228.6101 KOps/s 234.4156 KOps/s $\color{#d91a1a}-2.48\%$
test_membership_stacked_nested_last 61.9250μs 4.4188μs 226.3078 KOps/s 177.9424 KOps/s $\textbf{\color{#35bf28}+27.18\%}$
test_membership_stacked_nested_leaf_last 56.5450μs 4.3656μs 229.0642 KOps/s 179.3022 KOps/s $\textbf{\color{#35bf28}+27.75\%}$
test_nested_getleaf 51.3760μs 10.6188μs 94.1727 KOps/s 93.3230 KOps/s $\color{#35bf28}+0.91\%$
test_nested_get 62.8060μs 10.1861μs 98.1727 KOps/s 99.2909 KOps/s $\color{#d91a1a}-1.13\%$
test_stacked_getleaf 60.6030μs 10.5111μs 95.1379 KOps/s 95.8502 KOps/s $\color{#d91a1a}-0.74\%$
test_stacked_get 48.6510μs 9.9854μs 100.1465 KOps/s 99.9886 KOps/s $\color{#35bf28}+0.16\%$
test_nested_getitemleaf 71.6440μs 11.3237μs 88.3102 KOps/s 90.3394 KOps/s $\color{#d91a1a}-2.25\%$
test_nested_getitem 83.5660μs 10.6780μs 93.6504 KOps/s 95.1606 KOps/s $\color{#d91a1a}-1.59\%$
test_stacked_getitemleaf 61.3150μs 11.1545μs 89.6501 KOps/s 88.6774 KOps/s $\color{#35bf28}+1.10\%$
test_stacked_getitem 58.0480μs 10.6775μs 93.6550 KOps/s 94.9536 KOps/s $\color{#d91a1a}-1.37\%$
test_lock_nested 0.8790ms 0.4289ms 2.3315 KOps/s 2.3957 KOps/s $\color{#d91a1a}-2.68\%$
test_lock_stack_nested 0.7385ms 0.4425ms 2.2597 KOps/s 2.3620 KOps/s $\color{#d91a1a}-4.33\%$
test_unlock_nested 0.5544ms 0.3477ms 2.8757 KOps/s 2.9640 KOps/s $\color{#d91a1a}-2.98\%$
test_unlock_stack_nested 0.5495ms 0.3589ms 2.7863 KOps/s 2.9584 KOps/s $\textbf{\color{#d91a1a}-5.82\%}$
test_flatten_speed 0.1794ms 99.7119μs 10.0289 KOps/s 9.9302 KOps/s $\color{#35bf28}+0.99\%$
test_unflatten_speed 0.8997ms 0.5360ms 1.8656 KOps/s 1.9153 KOps/s $\color{#d91a1a}-2.60\%$
test_common_ops 1.0787ms 0.8337ms 1.1995 KOps/s 1.2437 KOps/s $\color{#d91a1a}-3.55\%$
test_creation 61.4140μs 2.5446μs 392.9851 KOps/s 392.6011 KOps/s $\color{#35bf28}+0.10\%$
test_creation_empty 40.6360μs 12.3504μs 80.9689 KOps/s 88.4268 KOps/s $\textbf{\color{#d91a1a}-8.43\%}$
test_creation_nested_1 47.8490μs 15.6417μs 63.9318 KOps/s 70.9587 KOps/s $\textbf{\color{#d91a1a}-9.90\%}$
test_creation_nested_2 58.7200μs 19.9858μs 50.0356 KOps/s 53.3997 KOps/s $\textbf{\color{#d91a1a}-6.30\%}$
test_clone 51.5760μs 14.1734μs 70.5549 KOps/s 70.0920 KOps/s $\color{#35bf28}+0.66\%$
test_getitem[int] 0.6972ms 13.3273μs 75.0342 KOps/s 77.7925 KOps/s $\color{#d91a1a}-3.55\%$
test_getitem[slice_int] 0.1368ms 25.5358μs 39.1607 KOps/s 41.5452 KOps/s $\textbf{\color{#d91a1a}-5.74\%}$
test_getitem[range] 0.1839ms 51.7622μs 19.3191 KOps/s 19.2805 KOps/s $\color{#35bf28}+0.20\%$
test_getitem[tuple] 0.1305ms 20.9979μs 47.6237 KOps/s 49.7469 KOps/s $\color{#d91a1a}-4.27\%$
test_getitem[list] 0.3900ms 46.2364μs 21.6280 KOps/s 21.4294 KOps/s $\color{#35bf28}+0.93\%$
test_setitem_dim[int] 74.9990μs 27.4102μs 36.4827 KOps/s 37.5631 KOps/s $\color{#d91a1a}-2.88\%$
test_setitem_dim[slice_int] 95.1780μs 52.6474μs 18.9943 KOps/s 18.9736 KOps/s $\color{#35bf28}+0.11\%$
test_setitem_dim[range] 0.1377ms 78.1932μs 12.7888 KOps/s 12.6155 KOps/s $\color{#35bf28}+1.37\%$
test_setitem_dim[tuple] 89.3570μs 41.4940μs 24.0999 KOps/s 23.8817 KOps/s $\color{#35bf28}+0.91\%$
test_setitem 0.1677ms 21.5650μs 46.3715 KOps/s 47.8969 KOps/s $\color{#d91a1a}-3.18\%$
test_set 0.1807ms 21.3205μs 46.9031 KOps/s 49.5791 KOps/s $\textbf{\color{#d91a1a}-5.40\%}$
test_set_shared 0.3909ms 0.1857ms 5.3842 KOps/s 5.2289 KOps/s $\color{#35bf28}+2.97\%$
test_update 0.2211ms 24.1791μs 41.3581 KOps/s 44.0066 KOps/s $\textbf{\color{#d91a1a}-6.02\%}$
test_update_nested 0.2244ms 36.1277μs 27.6796 KOps/s 29.2430 KOps/s $\textbf{\color{#d91a1a}-5.35\%}$
test_update__nested 0.4176ms 35.4319μs 28.2231 KOps/s 29.5758 KOps/s $\color{#d91a1a}-4.57\%$
test_set_nested 0.1699ms 23.1552μs 43.1869 KOps/s 44.7068 KOps/s $\color{#d91a1a}-3.40\%$
test_set_nested_new 66.0320μs 28.5559μs 35.0191 KOps/s 36.8876 KOps/s $\textbf{\color{#d91a1a}-5.07\%}$
test_select 0.2371ms 44.6503μs 22.3963 KOps/s 23.3167 KOps/s $\color{#d91a1a}-3.95\%$
test_select_nested 0.1134ms 66.4989μs 15.0378 KOps/s 15.6303 KOps/s $\color{#d91a1a}-3.79\%$
test_exclude_nested 0.1894ms 85.1503μs 11.7439 KOps/s 11.5425 KOps/s $\color{#35bf28}+1.75\%$
test_empty[True] 0.6200ms 0.4139ms 2.4162 KOps/s 2.4013 KOps/s $\color{#35bf28}+0.62\%$
test_empty[False] 8.7890μs 1.4707μs 679.9632 KOps/s 737.8024 KOps/s $\textbf{\color{#d91a1a}-7.84\%}$
test_unbind_speed 0.3743ms 0.2822ms 3.5441 KOps/s 3.6233 KOps/s $\color{#d91a1a}-2.19\%$
test_unbind_speed_stack0 0.4458ms 0.2823ms 3.5424 KOps/s 3.7723 KOps/s $\textbf{\color{#d91a1a}-6.09\%}$
test_unbind_speed_stack1 0.1140s 0.7697ms 1.2992 KOps/s 1.3702 KOps/s $\textbf{\color{#d91a1a}-5.19\%}$
test_split 0.1202s 1.8511ms 540.2227 Ops/s 561.1470 Ops/s $\color{#d91a1a}-3.73\%$
test_chunk 0.1186s 1.8559ms 538.8264 Ops/s 562.9485 Ops/s $\color{#d91a1a}-4.28\%$
test_consolidate_njt[False-None] 10.5743ms 8.7455ms 114.3439 Ops/s 121.0864 Ops/s $\textbf{\color{#d91a1a}-5.57\%}$
test_creation[device0] 3.9719ms 94.9106μs 10.5362 KOps/s 10.5543 KOps/s $\color{#d91a1a}-0.17\%$
test_creation_from_tensor 0.4130ms 96.5721μs 10.3550 KOps/s 10.3883 KOps/s $\color{#d91a1a}-0.32\%$
test_add_one[memmap_tensor0] 0.1235ms 4.8800μs 204.9166 KOps/s 186.2515 KOps/s $\textbf{\color{#35bf28}+10.02\%}$
test_contiguous[memmap_tensor0] 16.1100μs 0.4976μs 2.0096 MOps/s 1.9724 MOps/s $\color{#35bf28}+1.89\%$
test_stack[memmap_tensor0] 38.6120μs 3.4789μs 287.4470 KOps/s 269.5724 KOps/s $\textbf{\color{#35bf28}+6.63\%}$
test_memmaptd_index 1.0308ms 0.2413ms 4.1436 KOps/s 4.2077 KOps/s $\color{#d91a1a}-1.52\%$
test_memmaptd_index_astensor 0.5847ms 0.3254ms 3.0728 KOps/s 2.7098 KOps/s $\textbf{\color{#35bf28}+13.40\%}$
test_memmaptd_index_op 0.8447ms 0.6050ms 1.6529 KOps/s 1.7022 KOps/s $\color{#d91a1a}-2.90\%$
test_serialize_model 0.2462s 0.1409s 7.0992 Ops/s 8.5459 Ops/s $\textbf{\color{#d91a1a}-16.93\%}$
test_serialize_model_pickle 0.4471s 0.3905s 2.5608 Ops/s 2.4884 Ops/s $\color{#35bf28}+2.91\%$
test_serialize_weights 0.1247s 0.1184s 8.4482 Ops/s 8.6464 Ops/s $\color{#d91a1a}-2.29\%$
test_serialize_weights_returnearly 0.1872s 0.1655s 6.0407 Ops/s 5.9903 Ops/s $\color{#35bf28}+0.84\%$
test_serialize_weights_pickle 1.2353s 0.7213s 1.3863 Ops/s 2.4691 Ops/s $\textbf{\color{#d91a1a}-43.85\%}$
test_serialize_weights_filesystem 0.1541s 0.1493s 6.6987 Ops/s 6.1985 Ops/s $\textbf{\color{#35bf28}+8.07\%}$
test_serialize_model_filesystem 0.1601s 0.1495s 6.6883 Ops/s 6.3605 Ops/s $\textbf{\color{#35bf28}+5.15\%}$
test_reshape_pytree 75.1200μs 27.0395μs 36.9830 KOps/s 37.5751 KOps/s $\color{#d91a1a}-1.58\%$
test_reshape_td 68.8790μs 33.0684μs 30.2404 KOps/s 28.9574 KOps/s $\color{#35bf28}+4.43\%$
test_view_pytree 62.1550μs 26.6457μs 37.5294 KOps/s 37.8661 KOps/s $\color{#d91a1a}-0.89\%$
test_view_td 0.1544ms 44.1301μs 22.6603 KOps/s 24.6909 KOps/s $\textbf{\color{#d91a1a}-8.22\%}$
test_unbind_pytree 75.5810μs 30.5172μs 32.7684 KOps/s 33.5698 KOps/s $\color{#d91a1a}-2.39\%$
test_unbind_td 0.3457ms 41.1722μs 24.2882 KOps/s 25.0403 KOps/s $\color{#d91a1a}-3.00\%$
test_split_pytree 77.6740μs 29.8445μs 33.5070 KOps/s 34.1712 KOps/s $\color{#d91a1a}-1.94\%$
test_split_td 0.5309ms 47.4534μs 21.0733 KOps/s 22.1205 KOps/s $\color{#d91a1a}-4.73\%$
test_add_pytree 87.7230μs 35.4433μs 28.2141 KOps/s 27.4444 KOps/s $\color{#35bf28}+2.80\%$
test_add_td 0.1766ms 60.5309μs 16.5205 KOps/s 16.9271 KOps/s $\color{#d91a1a}-2.40\%$
test_compile_add_one_nested[tensordict-compile] 0.1263ms 66.7716μs 14.9764 KOps/s 14.8850 KOps/s $\color{#35bf28}+0.61\%$
test_compile_add_one_nested[tensordict-eager] 0.3624ms 0.1725ms 5.7980 KOps/s 5.7442 KOps/s $\color{#35bf28}+0.94\%$
test_compile_add_one_nested[pytree-compile] 0.1228ms 46.5122μs 21.4997 KOps/s 21.5562 KOps/s $\color{#d91a1a}-0.26\%$
test_compile_add_one_nested[pytree-eager] 0.2266ms 0.1203ms 8.3095 KOps/s 8.2591 KOps/s $\color{#35bf28}+0.61\%$
test_compile_copy_nested[tensordict-compile] 93.6750μs 28.5579μs 35.0166 KOps/s 35.9567 KOps/s $\color{#d91a1a}-2.61\%$
test_compile_copy_nested[tensordict-eager] 0.1263ms 61.7581μs 16.1922 KOps/s 17.1984 KOps/s $\textbf{\color{#d91a1a}-5.85\%}$
test_compile_copy_nested[pytree-compile] 0.1849ms 82.2752μs 12.1543 KOps/s 12.5432 KOps/s $\color{#d91a1a}-3.10\%$
test_compile_copy_nested[pytree-eager] 0.1308ms 68.5236μs 14.5935 KOps/s 15.0579 KOps/s $\color{#d91a1a}-3.08\%$
test_compile_add_one_flat[tensordict-compile] 0.2328ms 0.1095ms 9.1341 KOps/s 9.2918 KOps/s $\color{#d91a1a}-1.70\%$
test_compile_add_one_flat[tensordict-eager] 0.4864ms 0.2212ms 4.5218 KOps/s 4.6240 KOps/s $\color{#d91a1a}-2.21\%$
test_compile_add_one_flat[tensorclass-compile] 0.2745ms 47.9934μs 20.8362 KOps/s 20.9754 KOps/s $\color{#d91a1a}-0.66\%$
test_compile_add_one_flat[tensorclass-eager] 0.1528ms 69.7276μs 14.3415 KOps/s 14.6849 KOps/s $\color{#d91a1a}-2.34\%$
test_compile_add_one_flat[pytree-compile] 0.2253ms 0.1024ms 9.7693 KOps/s 9.7332 KOps/s $\color{#35bf28}+0.37\%$
test_compile_add_one_flat[pytree-eager] 0.3718ms 0.2025ms 4.9372 KOps/s 4.7671 KOps/s $\color{#35bf28}+3.57\%$
test_compile_add_self_flat[tensordict-eager] 0.3832ms 0.2364ms 4.2301 KOps/s 4.2973 KOps/s $\color{#d91a1a}-1.56\%$
test_compile_add_self_flat[tensordict-compile] 0.2042ms 0.1099ms 9.0954 KOps/s 9.0775 KOps/s $\color{#35bf28}+0.20\%$
test_compile_add_self_flat[tensorclass-eager] 0.3202ms 64.7996μs 15.4322 KOps/s 15.7773 KOps/s $\color{#d91a1a}-2.19\%$
test_compile_add_self_flat[tensorclass-compile] 0.2511ms 50.5980μs 19.7636 KOps/s 19.9450 KOps/s $\color{#d91a1a}-0.91\%$
test_compile_add_self_flat[pytree-eager] 0.2930ms 0.1576ms 6.3445 KOps/s 6.1934 KOps/s $\color{#35bf28}+2.44\%$
test_compile_add_self_flat[pytree-compile] 0.1816ms 0.1014ms 9.8664 KOps/s 9.8197 KOps/s $\color{#35bf28}+0.48\%$
test_compile_copy_flat[tensordict-compile] 86.2520μs 22.4257μs 44.5918 KOps/s 47.9233 KOps/s $\textbf{\color{#d91a1a}-6.95\%}$
test_compile_copy_flat[tensordict-eager] 0.1321ms 69.2812μs 14.4339 KOps/s 14.6355 KOps/s $\color{#d91a1a}-1.38\%$
test_compile_copy_flat[pytree-compile] 0.1602ms 81.6160μs 12.2525 KOps/s 12.3001 KOps/s $\color{#d91a1a}-0.39\%$
test_compile_copy_flat[pytree-eager] 0.1303ms 67.9637μs 14.7137 KOps/s 14.8160 KOps/s $\color{#d91a1a}-0.69\%$
test_compile_assign_and_add[tensordict-compile] 0.4428ms 0.2215ms 4.5157 KOps/s 4.7158 KOps/s $\color{#d91a1a}-4.24\%$
test_compile_assign_and_add[tensordict-eager] 2.5650ms 1.4267ms 700.9410 Ops/s 716.5211 Ops/s $\color{#d91a1a}-2.17\%$
test_compile_assign_and_add[pytree-compile] 0.4207ms 0.2151ms 4.6500 KOps/s 4.7926 KOps/s $\color{#d91a1a}-2.97\%$
test_compile_assign_and_add[pytree-eager] 1.0768ms 0.8295ms 1.2056 KOps/s 1.1734 KOps/s $\color{#35bf28}+2.74\%$
test_compile_assign_and_add_stack[compile] 0.9995ms 0.4758ms 2.1015 KOps/s 2.1973 KOps/s $\color{#d91a1a}-4.36\%$
test_compile_assign_and_add_stack[eager] 4.7404ms 2.7969ms 357.5448 Ops/s 382.0838 Ops/s $\textbf{\color{#d91a1a}-6.42\%}$
test_compile_indexing[tensor-tensordict-compile] 0.1210ms 39.3418μs 25.4182 KOps/s 26.2020 KOps/s $\color{#d91a1a}-2.99\%$
test_compile_indexing[tensor-tensordict-eager] 0.5722ms 33.7454μs 29.6337 KOps/s 29.4322 KOps/s $\color{#35bf28}+0.68\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1089ms 30.7110μs 32.5616 KOps/s 31.5989 KOps/s $\color{#35bf28}+3.05\%$
test_compile_indexing[tensor-tensorclass-eager] 97.9430μs 23.6778μs 42.2337 KOps/s 42.4657 KOps/s $\color{#d91a1a}-0.55\%$
test_compile_indexing[tensor-pytree-compile] 93.4340μs 32.0957μs 31.1568 KOps/s 30.8185 KOps/s $\color{#35bf28}+1.10\%$
test_compile_indexing[tensor-pytree-eager] 83.0240μs 23.5009μs 42.5516 KOps/s 42.1691 KOps/s $\color{#35bf28}+0.91\%$
test_compile_indexing[slice-tensordict-compile] 0.1288ms 54.7178μs 18.2756 KOps/s 19.1375 KOps/s $\color{#d91a1a}-4.50\%$
test_compile_indexing[slice-tensordict-eager] 0.4075ms 21.3935μs 46.7432 KOps/s 49.9751 KOps/s $\textbf{\color{#d91a1a}-6.47\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1219ms 45.7885μs 21.8395 KOps/s 21.6724 KOps/s $\color{#35bf28}+0.77\%$
test_compile_indexing[slice-tensorclass-eager] 70.4110μs 18.7015μs 53.4717 KOps/s 53.5799 KOps/s $\color{#d91a1a}-0.20\%$
test_compile_indexing[slice-pytree-compile] 0.1310ms 47.3947μs 21.0994 KOps/s 21.4136 KOps/s $\color{#d91a1a}-1.47\%$
test_compile_indexing[slice-pytree-eager] 76.3330μs 18.7908μs 53.2175 KOps/s 53.3768 KOps/s $\color{#d91a1a}-0.30\%$
test_compile_indexing[int-tensordict-compile] 0.1302ms 56.0135μs 17.8528 KOps/s 18.6940 KOps/s $\color{#d91a1a}-4.50\%$
test_compile_indexing[int-tensordict-eager] 1.1280ms 21.4332μs 46.6566 KOps/s 50.5109 KOps/s $\textbf{\color{#d91a1a}-7.63\%}$
test_compile_indexing[int-tensorclass-compile] 0.1235ms 46.6933μs 21.4163 KOps/s 21.2519 KOps/s $\color{#35bf28}+0.77\%$
test_compile_indexing[int-tensorclass-eager] 72.4450μs 18.9108μs 52.8799 KOps/s 54.0245 KOps/s $\color{#d91a1a}-2.12\%$
test_compile_indexing[int-pytree-compile] 0.1086ms 47.1199μs 21.2225 KOps/s 21.3282 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_indexing[int-pytree-eager] 81.7420μs 18.6572μs 53.5986 KOps/s 53.1358 KOps/s $\color{#35bf28}+0.87\%$
test_mod_add[eager] 88.4350μs 36.3221μs 27.5314 KOps/s 27.7826 KOps/s $\color{#d91a1a}-0.90\%$
test_mod_add[compile] 0.1252ms 66.6497μs 15.0038 KOps/s 15.8624 KOps/s $\textbf{\color{#d91a1a}-5.41\%}$
test_mod_add[compile-overhead] 0.1412ms 66.5912μs 15.0170 KOps/s 15.6815 KOps/s $\color{#d91a1a}-4.24\%$
test_mod_wrap[eager] 0.4528ms 0.2264ms 4.4165 KOps/s 4.3668 KOps/s $\color{#35bf28}+1.14\%$
test_mod_wrap[compile] 2.0968ms 0.2346ms 4.2626 KOps/s 4.3292 KOps/s $\color{#d91a1a}-1.54\%$
test_mod_wrap[compile-overhead] 0.3653ms 0.2290ms 4.3661 KOps/s 4.4600 KOps/s $\color{#d91a1a}-2.11\%$
test_mod_wrap_and_backward[eager] 15.0182ms 13.4314ms 74.4523 Ops/s 78.4721 Ops/s $\textbf{\color{#d91a1a}-5.12\%}$
test_mod_wrap_and_backward[compile] 13.9426ms 11.7551ms 85.0693 Ops/s 78.4431 Ops/s $\textbf{\color{#35bf28}+8.45\%}$
test_mod_wrap_and_backward[compile-overhead] 13.5605ms 11.6265ms 86.0102 Ops/s 86.3346 Ops/s $\color{#d91a1a}-0.38\%$
test_seq_add[eager] 0.2212ms 0.1213ms 8.2430 KOps/s 8.3988 KOps/s $\color{#d91a1a}-1.85\%$
test_seq_add[compile] 0.1907ms 80.7149μs 12.3893 KOps/s 13.1419 KOps/s $\textbf{\color{#d91a1a}-5.73\%}$
test_seq_add[compile-overhead] 0.2117ms 79.3563μs 12.6014 KOps/s 13.5606 KOps/s $\textbf{\color{#d91a1a}-7.07\%}$
test_seq_wrap[eager] 1.0997ms 0.4599ms 2.1746 KOps/s 2.2300 KOps/s $\color{#d91a1a}-2.49\%$
test_seq_wrap[compile] 0.7175ms 0.2575ms 3.8838 KOps/s 4.1174 KOps/s $\textbf{\color{#d91a1a}-5.67\%}$
test_seq_wrap[compile-overhead] 0.5026ms 0.2519ms 3.9704 KOps/s 4.1989 KOps/s $\textbf{\color{#d91a1a}-5.44\%}$
test_func_call_runtime[False-eager] 0.8495ms 0.5619ms 1.7796 KOps/s 1.8163 KOps/s $\color{#d91a1a}-2.02\%$
test_func_call_runtime[False-compile] 0.6404ms 0.4535ms 2.2049 KOps/s 2.2541 KOps/s $\color{#d91a1a}-2.18\%$
test_func_call_runtime[False-compile-overhead] 0.6645ms 0.4514ms 2.2155 KOps/s 2.2657 KOps/s $\color{#d91a1a}-2.21\%$
test_func_call_runtime[True-eager] 1.9117ms 0.8062ms 1.2403 KOps/s 1.3261 KOps/s $\textbf{\color{#d91a1a}-6.47\%}$
test_func_call_runtime[True-compile] 0.6618ms 0.4743ms 2.1086 KOps/s 2.1730 KOps/s $\color{#d91a1a}-2.97\%$
test_func_call_runtime[True-compile-overhead] 0.6847ms 0.4781ms 2.0918 KOps/s 2.1353 KOps/s $\color{#d91a1a}-2.04\%$
test_func_call_cm_runtime[False-eager] 0.7784ms 0.5744ms 1.7409 KOps/s 1.8573 KOps/s $\textbf{\color{#d91a1a}-6.26\%}$
test_func_call_cm_runtime[False-compile] 0.6860ms 0.4458ms 2.2432 KOps/s 2.2732 KOps/s $\color{#d91a1a}-1.32\%$
test_func_call_cm_runtime[False-compile-overhead] 0.6231ms 0.4467ms 2.2388 KOps/s 2.2657 KOps/s $\color{#d91a1a}-1.19\%$
test_func_call_cm_runtime[True-eager] 1.5525ms 0.9392ms 1.0647 KOps/s 1.1198 KOps/s $\color{#d91a1a}-4.92\%$
test_func_call_cm_runtime[True-compile] 1.1515ms 0.8269ms 1.2094 KOps/s 1.2623 KOps/s $\color{#d91a1a}-4.19\%$
test_func_call_cm_runtime[True-compile-overhead] 1.0760ms 0.8323ms 1.2015 KOps/s 1.2536 KOps/s $\color{#d91a1a}-4.15\%$
test_vmap_func_call_cm_runtime[eager] 3.1765ms 1.9855ms 503.6484 Ops/s 516.2960 Ops/s $\color{#d91a1a}-2.45\%$
test_vmap_func_call_cm_runtime[compile] 0.9305ms 0.5576ms 1.7934 KOps/s 1.8403 KOps/s $\color{#d91a1a}-2.55\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.7552ms 0.5529ms 1.8087 KOps/s 1.8545 KOps/s $\color{#d91a1a}-2.47\%$
test_distributed 0.2568ms 0.1281ms 7.8055 KOps/s 7.7316 KOps/s $\color{#35bf28}+0.96\%$
test_tdmodule 59.1900μs 27.2048μs 36.7582 KOps/s 37.1141 KOps/s $\color{#d91a1a}-0.96\%$
test_tdmodule_dispatch 0.1099ms 52.8709μs 18.9140 KOps/s 20.8602 KOps/s $\textbf{\color{#d91a1a}-9.33\%}$
test_tdseq 64.8010μs 29.3346μs 34.0895 KOps/s 33.8440 KOps/s $\color{#35bf28}+0.73\%$
test_tdseq_dispatch 95.4580μs 55.2698μs 18.0931 KOps/s 17.7815 KOps/s $\color{#35bf28}+1.75\%$
test_instantiation_functorch 1.7761ms 1.5438ms 647.7314 Ops/s 637.3794 Ops/s $\color{#35bf28}+1.62\%$
test_exec_functorch 0.4016ms 0.1804ms 5.5447 KOps/s 5.5072 KOps/s $\color{#35bf28}+0.68\%$
test_exec_functional_call 0.3249ms 0.1757ms 5.6931 KOps/s 5.6987 KOps/s $\color{#d91a1a}-0.10\%$
test_exec_td_decorator 0.5823ms 0.2403ms 4.1621 KOps/s 4.2975 KOps/s $\color{#d91a1a}-3.15\%$
test_vmap_mlp_speed_decorator[True-True] 0.9214ms 0.6656ms 1.5024 KOps/s 1.4753 KOps/s $\color{#35bf28}+1.83\%$
test_vmap_mlp_speed_decorator[True-False] 1.0246ms 0.6642ms 1.5055 KOps/s 1.5299 KOps/s $\color{#d91a1a}-1.59\%$
test_vmap_mlp_speed_decorator[False-True] 0.9267ms 0.5416ms 1.8463 KOps/s 1.8499 KOps/s $\color{#d91a1a}-0.19\%$
test_vmap_mlp_speed_decorator[False-False] 0.8526ms 0.5432ms 1.8408 KOps/s 1.8604 KOps/s $\color{#d91a1a}-1.05\%$
test_to_module_speed[True] 2.5566ms 1.3891ms 719.8748 Ops/s 749.5396 Ops/s $\color{#d91a1a}-3.96\%$
test_to_module_speed[False] 2.1061ms 1.3334ms 749.9800 Ops/s 759.0140 Ops/s $\color{#d91a1a}-1.19\%$
test_tc_init 0.1065ms 47.9298μs 20.8638 KOps/s 20.9647 KOps/s $\color{#d91a1a}-0.48\%$
test_tc_init_nested 0.1815ms 95.6915μs 10.4502 KOps/s 10.6365 KOps/s $\color{#d91a1a}-1.75\%$
test_tc_first_layer_tensor 36.1070μs 1.5591μs 641.3809 KOps/s 650.1975 KOps/s $\color{#d91a1a}-1.36\%$
test_tc_first_layer_nontensor 33.5020μs 4.6797μs 213.6886 KOps/s 214.1823 KOps/s $\color{#d91a1a}-0.23\%$
test_tc_second_layer_tensor 30.9180μs 2.8828μs 346.8905 KOps/s 352.8782 KOps/s $\color{#d91a1a}-1.70\%$
test_tc_second_layer_nontensor 32.8110μs 6.1219μs 163.3485 KOps/s 162.8304 KOps/s $\color{#35bf28}+0.32\%$
test_unbind 0.2707s 15.2394ms 65.6194 Ops/s 68.4239 Ops/s $\color{#d91a1a}-4.10\%$
test_full_like 12.3197ms 10.5573ms 94.7214 Ops/s 110.2471 Ops/s $\textbf{\color{#d91a1a}-14.08\%}$
test_zeros_like 5.6675ms 3.6320ms 275.3281 Ops/s 284.9023 Ops/s $\color{#d91a1a}-3.36\%$
test_ones_like 5.5379ms 4.5698ms 218.8275 Ops/s 241.0863 Ops/s $\textbf{\color{#d91a1a}-9.23\%}$
test_clone 14.8106ms 7.7020ms 129.8369 Ops/s 161.8701 Ops/s $\textbf{\color{#d91a1a}-19.79\%}$
test_squeeze 90.9190μs 12.6046μs 79.3360 KOps/s 77.7566 KOps/s $\color{#35bf28}+2.03\%$
test_unsqueeze 0.1805ms 96.2508μs 10.3895 KOps/s 10.6351 KOps/s $\color{#d91a1a}-2.31\%$
test_split 0.3638ms 0.2004ms 4.9892 KOps/s 5.1743 KOps/s $\color{#d91a1a}-3.58\%$
test_permute 0.3242ms 0.2045ms 4.8894 KOps/s 5.0136 KOps/s $\color{#d91a1a}-2.48\%$
test_stack 29.1887ms 26.3323ms 37.9762 Ops/s 36.4714 Ops/s $\color{#35bf28}+4.13\%$
test_cat 34.0061ms 26.1292ms 38.2714 Ops/s 35.1470 Ops/s $\textbf{\color{#35bf28}+8.89\%}$
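
The columns above appear to be related in a simple way: Ops is the reciprocal of the mean run time, and Change is the relative difference between Ops on this PR and Ops on Repo HEAD. The sketch below checks both relations against rows from the table. The function names are hypothetical and chosen for illustration, and the formulas are inferred from the reported numbers rather than taken from the benchmark tooling.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput as the reciprocal of the mean run time."""
    return 1.0 / mean_seconds


def percent_change(ops: float, ops_head: float) -> float:
    """Relative change of this PR's throughput versus the repo HEAD, in percent."""
    return (ops - ops_head) / ops_head * 100.0


# test_plain_set_nested: mean 20.9779 µs, 47.6691 KOps/s vs 49.7036 KOps/s on HEAD.
kops = ops_per_second(20.9779e-6) / 1e3
print(f"{kops:.2f} KOps/s")
print(f"{percent_change(47.6691, 49.7036):+.2f}%")

# test_values_stack_nested: 15.8684 KOps/s vs 14.2105 KOps/s on HEAD.
print(f"{percent_change(15.8684, 14.2105):+.2f}%")
```

Small discrepancies in the last digit are expected, since the table reports rounded means.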


github-actions bot commented Feb 19, 2025

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 229. Improved: $\large\color{#35bf28}14$. Worsened: $\large\color{#d91a1a}13$.

Expand to view detailed results
Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 34.1410μs 13.1100μs 76.2778 KOps/s 76.5418 KOps/s $\color{#d91a1a}-0.34\%$
test_plain_set_stack_nested 47.3410μs 13.1945μs 75.7892 KOps/s 75.6787 KOps/s $\color{#35bf28}+0.15\%$
test_plain_set_nested_inplace 42.2510μs 14.1823μs 70.5104 KOps/s 70.1240 KOps/s $\color{#35bf28}+0.55\%$
test_plain_set_stack_nested_inplace 44.3400μs 14.1119μs 70.8621 KOps/s 71.1976 KOps/s $\color{#d91a1a}-0.47\%$
test_items 85.9820μs 2.8424μs 351.8213 KOps/s 345.1006 KOps/s $\color{#35bf28}+1.95\%$
test_items_nested 0.4688ms 0.3660ms 2.7323 KOps/s 2.6976 KOps/s $\color{#35bf28}+1.28\%$
test_items_nested_locked 0.4266ms 0.3693ms 2.7078 KOps/s 2.6741 KOps/s $\color{#35bf28}+1.26\%$
test_items_nested_leaf 0.1204ms 67.1649μs 14.8887 KOps/s 14.8433 KOps/s $\color{#35bf28}+0.31\%$
test_items_stack_nested 0.4227ms 0.3656ms 2.7350 KOps/s 2.7122 KOps/s $\color{#35bf28}+0.84\%$
test_items_stack_nested_leaf 0.1302ms 68.4824μs 14.6023 KOps/s 14.6240 KOps/s $\color{#d91a1a}-0.15\%$
test_items_stack_nested_locked 0.4068ms 0.3649ms 2.7408 KOps/s 2.6912 KOps/s $\color{#35bf28}+1.84\%$
test_keys 37.3110μs 3.7094μs 269.5869 KOps/s 291.7881 KOps/s $\textbf{\color{#d91a1a}-7.61\%}$
test_keys_nested 0.1183ms 88.7748μs 11.2645 KOps/s 11.2938 KOps/s $\color{#d91a1a}-0.26\%$
test_keys_nested_locked 0.7740ms 94.6061μs 10.5701 KOps/s 10.5784 KOps/s $\color{#d91a1a}-0.08\%$
test_keys_nested_leaf 0.1593ms 80.3353μs 12.4478 KOps/s 12.6043 KOps/s $\color{#d91a1a}-1.24\%$
test_keys_stack_nested 0.1237ms 88.9075μs 11.2476 KOps/s 11.2063 KOps/s $\color{#35bf28}+0.37\%$
test_keys_stack_nested_leaf 0.1094ms 79.8610μs 12.5218 KOps/s 12.3113 KOps/s $\color{#35bf28}+1.71\%$
test_keys_stack_nested_locked 0.1276ms 94.5882μs 10.5721 KOps/s 10.4531 KOps/s $\color{#35bf28}+1.14\%$
test_values 5.6768μs 0.8550μs 1.1696 MOps/s 1.1792 MOps/s $\color{#d91a1a}-0.81\%$
test_values_nested 67.0810μs 37.4057μs 26.7339 KOps/s 26.6677 KOps/s $\color{#35bf28}+0.25\%$
test_values_nested_locked 70.6810μs 39.1910μs 25.5161 KOps/s 25.4531 KOps/s $\color{#35bf28}+0.25\%$
test_values_nested_leaf 0.2262ms 42.4971μs 23.5310 KOps/s 23.5179 KOps/s $\color{#35bf28}+0.06\%$
test_values_stack_nested 68.4510μs 38.1276μs 26.2277 KOps/s 26.0903 KOps/s $\color{#35bf28}+0.53\%$
test_values_stack_nested_leaf 0.1053ms 42.6322μs 23.4565 KOps/s 23.3160 KOps/s $\color{#35bf28}+0.60\%$
test_values_stack_nested_locked 63.2910μs 39.8644μs 25.0850 KOps/s 24.8861 KOps/s $\color{#35bf28}+0.80\%$
test_membership 2.7296μs 0.5036μs 1.9858 MOps/s 1.9963 MOps/s $\color{#d91a1a}-0.53\%$
test_membership_nested 15.9605μs 2.0005μs 499.8808 KOps/s 496.0456 KOps/s $\color{#35bf28}+0.77\%$
test_membership_nested_leaf 15.9505μs 2.0163μs 495.9570 KOps/s 496.3671 KOps/s $\color{#d91a1a}-0.08\%$
test_membership_stacked_nested 56.6510μs 2.0677μs 483.6195 KOps/s 482.1015 KOps/s $\color{#35bf28}+0.31\%$
test_membership_stacked_nested_leaf 35.2610μs 2.0603μs 485.3637 KOps/s 481.7970 KOps/s $\color{#35bf28}+0.74\%$
test_membership_nested_last 27.0500μs 3.0791μs 324.7685 KOps/s 322.2496 KOps/s $\color{#35bf28}+0.78\%$
test_membership_nested_leaf_last 40.2500μs 3.1159μs 320.9348 KOps/s 322.1221 KOps/s $\color{#d91a1a}-0.37\%$
test_membership_stacked_nested_last 41.5200μs 8.5396μs 117.1014 KOps/s 279.1475 KOps/s $\textbf{\color{#d91a1a}-58.05\%}$
test_membership_stacked_nested_leaf_last 35.4610μs 8.4476μs 118.3769 KOps/s 276.1373 KOps/s $\textbf{\color{#d91a1a}-57.13\%}$
test_nested_getleaf 35.8100μs 6.5170μs 153.4456 KOps/s 155.8826 KOps/s $\color{#d91a1a}-1.56\%$
test_nested_get 32.8500μs 6.1929μs 161.4741 KOps/s 159.7780 KOps/s $\color{#35bf28}+1.06\%$
test_stacked_getleaf 75.3220μs 6.4559μs 154.8974 KOps/s 156.3088 KOps/s $\color{#d91a1a}-0.90\%$
test_stacked_get 38.6210μs 5.9837μs 167.1204 KOps/s 166.4831 KOps/s $\color{#35bf28}+0.38\%$
test_nested_getitemleaf 38.2810μs 6.6505μs 150.3645 KOps/s 150.6346 KOps/s $\color{#d91a1a}-0.18\%$
test_nested_getitem 35.2800μs 6.2813μs 159.2027 KOps/s 157.8667 KOps/s $\color{#35bf28}+0.85\%$
test_stacked_getitemleaf 40.2910μs 6.6207μs 151.0415 KOps/s 151.6943 KOps/s $\color{#d91a1a}-0.43\%$
test_stacked_getitem 34.5200μs 6.2009μs 161.2668 KOps/s 160.6900 KOps/s $\color{#35bf28}+0.36\%$
test_lock_nested 0.4164ms 0.3359ms 2.9773 KOps/s 2.9471 KOps/s $\color{#35bf28}+1.02\%$
test_lock_stack_nested 0.3959ms 0.3396ms 2.9447 KOps/s 2.8838 KOps/s $\color{#35bf28}+2.11\%$
test_unlock_nested 0.3745ms 0.2805ms 3.5656 KOps/s 3.5861 KOps/s $\color{#d91a1a}-0.57\%$
test_unlock_stack_nested 0.3359ms 0.2757ms 3.6268 KOps/s 3.5416 KOps/s $\color{#35bf28}+2.41\%$
test_flatten_speed 0.1124ms 85.2033μs 11.7366 KOps/s 11.8695 KOps/s $\color{#d91a1a}-1.12\%$
test_unflatten_speed 0.3660ms 0.3253ms 3.0743 KOps/s 3.0899 KOps/s $\color{#d91a1a}-0.51\%$
test_common_ops 0.8081ms 0.6232ms 1.6046 KOps/s 1.6033 KOps/s $\color{#35bf28}+0.08\%$
test_creation 0.1305ms 1.7306μs 577.8307 KOps/s 566.1271 KOps/s $\color{#35bf28}+2.07\%$
test_creation_empty 35.9900μs 8.8491μs 113.0055 KOps/s 109.4116 KOps/s $\color{#35bf28}+3.28\%$
test_creation_nested_1 37.0710μs 10.4615μs 95.5890 KOps/s 95.1912 KOps/s $\color{#35bf28}+0.42\%$
test_creation_nested_2 48.3010μs 13.2249μs 75.6151 KOps/s 75.1427 KOps/s $\color{#35bf28}+0.63\%$
test_clone 60.2910μs 10.8580μs 92.0979 KOps/s 93.7342 KOps/s $\color{#d91a1a}-1.75\%$
test_getitem[int] 1.3205ms 10.3289μs 96.8155 KOps/s 95.3511 KOps/s $\color{#35bf28}+1.54\%$
test_getitem[slice_int] 0.1729ms 20.5568μs 48.6457 KOps/s 49.1522 KOps/s $\color{#d91a1a}-1.03\%$
test_getitem[range] 0.1323ms 36.7445μs 27.2150 KOps/s 27.2817 KOps/s $\color{#d91a1a}-0.24\%$
test_getitem[tuple] 0.1371ms 17.8270μs 56.0947 KOps/s 56.3951 KOps/s $\color{#d91a1a}-0.53\%$
test_getitem[list] 0.1339ms 32.5643μs 30.7084 KOps/s 30.4265 KOps/s $\color{#35bf28}+0.93\%$
test_setitem_dim[int] 39.3510μs 18.5084μs 54.0294 KOps/s 52.7522 KOps/s $\color{#35bf28}+2.42\%$
test_setitem_dim[slice_int] 62.5010μs 37.9831μs 26.3275 KOps/s 26.8807 KOps/s $\color{#d91a1a}-2.06\%$
test_setitem_dim[range] 82.3010μs 50.8953μs 19.6482 KOps/s 19.5674 KOps/s $\color{#35bf28}+0.41\%$
test_setitem_dim[tuple] 54.7510μs 31.7541μs 31.4920 KOps/s 32.1182 KOps/s $\color{#d91a1a}-1.95\%$
test_setitem 79.0620μs 15.2100μs 65.7461 KOps/s 63.9478 KOps/s $\color{#35bf28}+2.81\%$
test_set 84.1110μs 14.9871μs 66.7242 KOps/s 66.5575 KOps/s $\color{#35bf28}+0.25\%$
test_set_shared 0.6231ms 0.1566ms 6.3840 KOps/s 6.3765 KOps/s $\color{#35bf28}+0.12\%$
test_update 0.3847ms 18.4415μs 54.2255 KOps/s 53.7452 KOps/s $\color{#35bf28}+0.89\%$
test_update_nested 70.7410μs 23.5828μs 42.4037 KOps/s 39.3979 KOps/s $\textbf{\color{#35bf28}+7.63\%}$
test_update__nested 0.6204ms 25.3862μs 39.3916 KOps/s 35.7991 KOps/s $\textbf{\color{#35bf28}+10.04\%}$
test_set_nested 63.0010μs 16.2093μs 61.6930 KOps/s 56.0930 KOps/s $\textbf{\color{#35bf28}+9.98\%}$
test_set_nested_new 0.2220ms 18.3595μs 54.4678 KOps/s 49.9221 KOps/s $\textbf{\color{#35bf28}+9.11\%}$
test_select 0.1120ms 30.0320μs 33.2978 KOps/s 33.1459 KOps/s $\color{#35bf28}+0.46\%$
test_select_nested 0.1085ms 44.3755μs 22.5349 KOps/s 22.3178 KOps/s $\color{#35bf28}+0.97\%$
test_exclude_nested 0.1119ms 63.8555μs 15.6604 KOps/s 15.7501 KOps/s $\color{#d91a1a}-0.57\%$
test_empty[True] 0.3600ms 0.2972ms 3.3645 KOps/s 3.3643 KOps/s $+0.01\%$
test_empty[False] 18.3733μs 0.8270μs 1.2092 MOps/s 1.2111 MOps/s $\color{#d91a1a}-0.16\%$
test_to 91.3620μs 55.4381μs 18.0381 KOps/s 18.2650 KOps/s $\color{#d91a1a}-1.24\%$
test_to_nonblocking 0.1836ms 47.5076μs 21.0493 KOps/s 21.5422 KOps/s $\color{#d91a1a}-2.29\%$
test_unbind_speed 0.2854ms 0.2420ms 4.1327 KOps/s 4.1286 KOps/s $\color{#35bf28}+0.10\%$
test_unbind_speed_stack0 0.2932ms 0.2325ms 4.3012 KOps/s 4.0311 KOps/s $\textbf{\color{#35bf28}+6.70\%}$
test_unbind_speed_stack1 0.1080s 0.7417ms 1.3483 KOps/s 1.2974 KOps/s $\color{#35bf28}+3.92\%$
test_split 0.1190s 1.6632ms 601.2637 Ops/s 584.5910 Ops/s $\color{#35bf28}+2.85\%$
test_chunk 0.1097s 1.6336ms 612.1429 Ops/s 565.0844 Ops/s $\textbf{\color{#35bf28}+8.33\%}$
test_consolidate[False-None] 2.7164ms 2.6281ms 380.5002 Ops/s 375.1722 Ops/s $\color{#35bf28}+1.42\%$
test_consolidate[default-None] 1.7796ms 1.6885ms 592.2416 Ops/s 580.9256 Ops/s $\color{#35bf28}+1.95\%$
test_consolidate[reduce-overhead-None] 1.8547ms 1.7150ms 583.0871 Ops/s 567.8051 Ops/s $\color{#35bf28}+2.69\%$
test_consolidate_njt[False-None] 6.7415ms 6.3621ms 157.1799 Ops/s 155.2339 Ops/s $\color{#35bf28}+1.25\%$
test_to[False-False-None] 1.9124ms 1.7343ms 576.6083 Ops/s 574.2382 Ops/s $\color{#35bf28}+0.41\%$
test_to[True-False-None] 1.5138ms 1.2901ms 775.1316 Ops/s 759.9633 Ops/s $\color{#35bf28}+2.00\%$
test_to[within-False-None] 4.2502ms 4.0383ms 247.6319 Ops/s 245.5147 Ops/s $\color{#35bf28}+0.86\%$
test_to[True-default-None] 5.4018ms 5.0936ms 196.3261 Ops/s 194.8867 Ops/s $\color{#35bf28}+0.74\%$
test_to_njt[False-False-None] 7.1106ms 6.8726ms 145.5063 Ops/s 144.7037 Ops/s $\color{#35bf28}+0.55\%$
test_to_njt[True-False-None] 5.8024ms 5.4132ms 184.7341 Ops/s 182.8684 Ops/s $\color{#35bf28}+1.02\%$
test_to_njt[within-False-None] 12.4192ms 12.0430ms 83.0360 Ops/s 82.9050 Ops/s $\color{#35bf28}+0.16\%$
test_creation[device0] 0.4678ms 81.1045μs 12.3298 KOps/s 12.7089 KOps/s $\color{#d91a1a}-2.98\%$
test_creation_from_tensor 0.5462ms 81.7932μs 12.2260 KOps/s 12.1728 KOps/s $\color{#35bf28}+0.44\%$
test_add_one[memmap_tensor0] 0.2575ms 6.6024μs 151.4593 KOps/s 148.3446 KOps/s $\color{#35bf28}+2.10\%$
test_contiguous[memmap_tensor0] 1.6715μs 0.4028μs 2.4827 MOps/s 2.4883 MOps/s $\color{#d91a1a}-0.23\%$
test_stack[memmap_tensor0] 24.4700μs 4.2437μs 235.6411 KOps/s 237.1434 KOps/s $\color{#d91a1a}-0.63\%$
test_memmaptd_index 1.7616ms 0.2448ms 4.0855 KOps/s 4.1553 KOps/s $\color{#d91a1a}-1.68\%$
test_memmaptd_index_astensor 0.4260ms 0.3027ms 3.3041 KOps/s 3.3196 KOps/s $\color{#d91a1a}-0.47\%$
test_memmaptd_index_op 0.7966ms 0.5833ms 1.7145 KOps/s 1.7063 KOps/s $\color{#35bf28}+0.48\%$
test_serialize_model 0.1334s 0.1325s 7.5474 Ops/s 7.5428 Ops/s $\color{#35bf28}+0.06\%$
test_serialize_model_pickle 1.3486s 1.1887s 0.8412 Ops/s 0.8217 Ops/s $\color{#35bf28}+2.37\%$
test_serialize_weights 0.1321s 0.1312s 7.6239 Ops/s 7.6262 Ops/s $\color{#d91a1a}-0.03\%$
test_serialize_weights_returnearly 0.3714s 56.2635ms 17.7735 Ops/s 23.5085 Ops/s $\textbf{\color{#d91a1a}-24.40\%}$
test_serialize_weights_pickle 1.3476s 1.2118s 0.8252 Ops/s 0.8136 Ops/s $\color{#35bf28}+1.42\%$
test_reshape_pytree 0.1717ms 22.1001μs 45.2487 KOps/s 45.6574 KOps/s $\color{#d91a1a}-0.90\%$
test_reshape_td 0.1481ms 25.8639μs 38.6640 KOps/s 37.2234 KOps/s $\color{#35bf28}+3.87\%$
test_view_pytree 0.1295ms 21.7433μs 45.9912 KOps/s 45.7852 KOps/s $\color{#35bf28}+0.45\%$
test_view_td 64.5710μs 29.9707μs 33.3659 KOps/s 30.1901 KOps/s $\textbf{\color{#35bf28}+10.52\%}$
test_unbind_pytree 0.1431ms 27.8668μs 35.8850 KOps/s 36.0182 KOps/s $\color{#d91a1a}-0.37\%$
test_unbind_td 0.5977ms 36.2112μs 27.6157 KOps/s 27.8963 KOps/s $\color{#d91a1a}-1.01\%$
test_split_pytree 0.1508ms 29.6385μs 33.7399 KOps/s 33.8709 KOps/s $\color{#d91a1a}-0.39\%$
test_split_td 0.1891s 52.3198μs 19.1132 KOps/s 25.8219 KOps/s $\textbf{\color{#d91a1a}-25.98\%}$
test_add_pytree 0.1650ms 34.4229μs 29.0505 KOps/s 28.5886 KOps/s $\color{#35bf28}+1.62\%$
test_add_td 0.1843ms 52.8920μs 18.9065 KOps/s 19.8205 KOps/s $\color{#d91a1a}-4.61\%$
test_compile_add_one_nested[tensordict-compile] 0.2690ms 0.1203ms 8.3110 KOps/s 7.8212 KOps/s $\textbf{\color{#35bf28}+6.26\%}$
test_compile_add_one_nested[tensordict-eager] 0.2863ms 0.1351ms 7.4008 KOps/s 7.3106 KOps/s $\color{#35bf28}+1.23\%$
test_compile_add_one_nested[pytree-compile] 0.2438ms 96.0161μs 10.4149 KOps/s 10.3208 KOps/s $\color{#35bf28}+0.91\%$
test_compile_add_one_nested[pytree-eager] 1.2366ms 0.1464ms 6.8311 KOps/s 6.7876 KOps/s $\color{#35bf28}+0.64\%$
test_compile_copy_nested[tensordict-compile] 0.4977ms 33.1980μs 30.1223 KOps/s 45.1957 KOps/s $\textbf{\color{#d91a1a}-33.35\%}$
test_compile_copy_nested[tensordict-eager] 0.4128ms 29.4377μs 33.9701 KOps/s 33.5711 KOps/s $\color{#35bf28}+1.19\%$
test_compile_copy_nested[pytree-compile] 0.4894ms 63.6755μs 15.7046 KOps/s 15.4159 KOps/s $\color{#35bf28}+1.87\%$
test_compile_copy_nested[pytree-eager] 0.4978ms 49.8324μs 20.0673 KOps/s 19.9877 KOps/s $\color{#35bf28}+0.40\%$
test_compile_add_one_flat[tensordict-compile] 0.2784ms 0.1425ms 7.0157 KOps/s 7.0599 KOps/s $\color{#d91a1a}-0.63\%$
test_compile_add_one_flat[tensordict-eager] 0.6324ms 0.2200ms 4.5451 KOps/s 4.5827 KOps/s $\color{#d91a1a}-0.82\%$
test_compile_add_one_flat[tensorclass-compile] 0.3011ms 97.7331μs 10.2319 KOps/s 10.1777 KOps/s $\color{#35bf28}+0.53\%$
test_compile_add_one_flat[tensorclass-eager] 0.2650ms 54.7244μs 18.2734 KOps/s 17.6981 KOps/s $\color{#35bf28}+3.25\%$
test_compile_add_one_flat[pytree-compile] 0.2890ms 0.1350ms 7.4051 KOps/s 7.0487 KOps/s $\textbf{\color{#35bf28}+5.06\%}$
test_compile_add_one_flat[pytree-eager] 0.6263ms 0.4671ms 2.1409 KOps/s 2.1028 KOps/s $\color{#35bf28}+1.81\%$
test_compile_add_self_flat[tensordict-eager] 0.4371ms 0.2649ms 3.7753 KOps/s 3.8091 KOps/s $\color{#d91a1a}-0.89\%$
test_compile_add_self_flat[tensordict-compile] 0.2687ms 0.1429ms 6.9979 KOps/s 6.7605 KOps/s $\color{#35bf28}+3.51\%$
test_compile_add_self_flat[tensorclass-eager] 0.2157ms 66.6354μs 15.0070 KOps/s 14.1843 KOps/s $\textbf{\color{#35bf28}+5.80\%}$
test_compile_add_self_flat[tensorclass-compile] 0.2397ms 98.6144μs 10.1405 KOps/s 9.6332 KOps/s $\textbf{\color{#35bf28}+5.27\%}$
test_compile_add_self_flat[pytree-eager] 0.5473ms 0.3950ms 2.5314 KOps/s 2.5050 KOps/s $\color{#35bf28}+1.05\%$
test_compile_add_self_flat[pytree-compile] 0.2827ms 0.1345ms 7.4367 KOps/s 7.1780 KOps/s $\color{#35bf28}+3.60\%$
test_compile_copy_flat[tensordict-compile] 0.1527ms 17.8529μs 56.0133 KOps/s 56.0979 KOps/s $\color{#d91a1a}-0.15\%$
test_compile_copy_flat[tensordict-eager] 0.4272ms 32.2029μs 31.0531 KOps/s 31.7432 KOps/s $\color{#d91a1a}-2.17\%$
test_compile_copy_flat[pytree-compile] 0.4649ms 69.7740μs 14.3320 KOps/s 14.3486 KOps/s $\color{#d91a1a}-0.12\%$
test_compile_copy_flat[pytree-eager] 0.4526ms 52.4838μs 19.0535 KOps/s 18.9732 KOps/s $\color{#35bf28}+0.42\%$
test_compile_assign_and_add[tensordict-compile] 1.6532ms 0.4540ms 2.2027 KOps/s 2.2285 KOps/s $\color{#d91a1a}-1.16\%$
test_compile_assign_and_add[tensordict-eager] 2.9066ms 2.6227ms 381.2897 Ops/s 380.9208 Ops/s $\color{#35bf28}+0.10\%$
test_compile_assign_and_add[pytree-compile] 1.5884ms 0.4318ms 2.3160 KOps/s 2.2604 KOps/s $\color{#35bf28}+2.46\%$
test_compile_assign_and_add[pytree-eager] 3.1344ms 2.5859ms 386.7171 Ops/s 388.5016 Ops/s $\color{#d91a1a}-0.46\%$
test_compile_indexing[tensor-tensordict-compile] 0.8842ms 0.1182ms 8.4609 KOps/s 8.7410 KOps/s $\color{#d91a1a}-3.20\%$
test_compile_indexing[tensor-tensordict-eager] 0.5843ms 81.5750μs 12.2587 KOps/s 12.3695 KOps/s $\color{#d91a1a}-0.90\%$
test_compile_indexing[tensor-tensorclass-compile] 0.8031ms 0.1101ms 9.0792 KOps/s 9.1591 KOps/s $\color{#d91a1a}-0.87\%$
test_compile_indexing[tensor-tensorclass-eager] 0.2526ms 70.8249μs 14.1193 KOps/s 14.4069 KOps/s $\color{#d91a1a}-2.00\%$
test_compile_indexing[tensor-pytree-compile] 0.2884ms 0.1123ms 8.9014 KOps/s 9.0411 KOps/s $\color{#d91a1a}-1.54\%$
test_compile_indexing[tensor-pytree-eager] 0.2642ms 70.8321μs 14.1179 KOps/s 14.1720 KOps/s $\color{#d91a1a}-0.38\%$
test_compile_indexing[slice-tensordict-compile] 0.2869ms 0.1052ms 9.5081 KOps/s 10.0282 KOps/s $\textbf{\color{#d91a1a}-5.19\%}$
test_compile_indexing[slice-tensordict-eager] 0.1800ms 17.7473μs 56.3465 KOps/s 56.9541 KOps/s $\color{#d91a1a}-1.07\%$
test_compile_indexing[slice-tensorclass-compile] 0.2400ms 96.3051μs 10.3837 KOps/s 10.0366 KOps/s $\color{#35bf28}+3.46\%$
test_compile_indexing[slice-tensorclass-eager] 0.1643ms 15.5015μs 64.5098 KOps/s 64.6219 KOps/s $\color{#d91a1a}-0.17\%$
test_compile_indexing[slice-pytree-compile] 0.2872ms 99.6354μs 10.0366 KOps/s 10.3175 KOps/s $\color{#d91a1a}-2.72\%$
test_compile_indexing[slice-pytree-eager] 0.1973ms 15.4841μs 64.5825 KOps/s 65.0724 KOps/s $\color{#d91a1a}-0.75\%$
test_compile_indexing[int-tensordict-compile] 0.3147ms 0.1052ms 9.5095 KOps/s 9.9667 KOps/s $\color{#d91a1a}-4.59\%$
test_compile_indexing[int-tensordict-eager] 0.5976ms 17.1486μs 58.3138 KOps/s 58.7899 KOps/s $\color{#d91a1a}-0.81\%$
test_compile_indexing[int-tensorclass-compile] 0.2866ms 0.1011ms 9.8888 KOps/s 10.2596 KOps/s $\color{#d91a1a}-3.61\%$
test_compile_indexing[int-tensorclass-eager] 0.1515ms 15.4049μs 64.9144 KOps/s 64.6911 KOps/s $\color{#35bf28}+0.35\%$
test_compile_indexing[int-pytree-compile] 0.2733ms 96.8859μs 10.3214 KOps/s 9.7900 KOps/s $\textbf{\color{#35bf28}+5.43\%}$
test_compile_indexing[int-pytree-eager] 0.2073ms 15.4996μs 64.5177 KOps/s 64.8718 KOps/s $\color{#d91a1a}-0.55\%$
test_mod_add[eager] 0.1899ms 38.0485μs 26.2823 KOps/s 26.2749 KOps/s $\color{#35bf28}+0.03\%$
test_mod_add[compile] 0.3011ms 79.5013μs 12.5784 KOps/s 12.3373 KOps/s $\color{#35bf28}+1.95\%$
test_mod_add[compile-overhead] 0.3873ms 0.1867ms 5.3572 KOps/s 5.5401 KOps/s $\color{#d91a1a}-3.30\%$
test_mod_wrap[eager] 0.4164ms 0.2478ms 4.0360 KOps/s 4.0505 KOps/s $\color{#d91a1a}-0.36\%$
test_mod_wrap[compile] 0.5048ms 0.2803ms 3.5677 KOps/s 3.5370 KOps/s $\color{#35bf28}+0.87\%$
test_mod_wrap[compile-overhead] 7.1692ms 3.7983ms 263.2756 Ops/s 256.1896 Ops/s $\color{#35bf28}+2.77\%$
test_mod_wrap_and_backward[eager] 1.5750ms 1.3813ms 723.9404 Ops/s 699.6335 Ops/s $\color{#35bf28}+3.47\%$
test_mod_wrap_and_backward[compile] 1.4736ms 1.2664ms 789.6148 Ops/s 800.6139 Ops/s $\color{#d91a1a}-1.37\%$
test_mod_wrap_and_backward[compile-overhead] 1.4315ms 0.9265ms 1.0793 KOps/s 1.0524 KOps/s $\color{#35bf28}+2.56\%$
test_seq_add[eager] 0.5471ms 0.1143ms 8.7452 KOps/s 8.6166 KOps/s $\color{#35bf28}+1.49\%$
test_seq_add[compile] 0.5058ms 87.0956μs 11.4816 KOps/s 11.6263 KOps/s $\color{#d91a1a}-1.24\%$
test_seq_add[compile-overhead] 0.2739ms 0.1286ms 7.7733 KOps/s 7.5580 KOps/s $\color{#35bf28}+2.85\%$
test_seq_wrap[eager] 0.8715ms 0.4282ms 2.3353 KOps/s 2.3186 KOps/s $\color{#35bf28}+0.72\%$
test_seq_wrap[compile] 0.6865ms 0.3007ms 3.3252 KOps/s 3.3851 KOps/s $\color{#d91a1a}-1.77\%$
test_seq_wrap[compile-overhead] 0.4134ms 0.2264ms 4.4165 KOps/s 4.4639 KOps/s $\color{#d91a1a}-1.06\%$
test_func_call_runtime[False-eager] 1.2155ms 0.7846ms 1.2745 KOps/s 1.3761 KOps/s $\textbf{\color{#d91a1a}-7.38\%}$
test_func_call_runtime[False-compile] 1.1847ms 0.7583ms 1.3187 KOps/s 1.3666 KOps/s $\color{#d91a1a}-3.50\%$
test_func_call_runtime[False-compile-overhead] 0.7806ms 0.3614ms 2.7667 KOps/s 2.7464 KOps/s $\color{#35bf28}+0.74\%$
test_func_call_runtime[True-eager] 1.3215ms 0.9149ms 1.0930 KOps/s 1.1083 KOps/s $\color{#d91a1a}-1.38\%$
test_func_call_runtime[True-compile] 1.1809ms 0.7906ms 1.2648 KOps/s 1.3234 KOps/s $\color{#d91a1a}-4.42\%$
test_func_call_runtime[True-compile-overhead] 0.8219ms 0.3874ms 2.5815 KOps/s 2.5905 KOps/s $\color{#d91a1a}-0.34\%$
test_func_call_cm_runtime[False-eager] 1.1833ms 0.7646ms 1.3079 KOps/s 1.3665 KOps/s $\color{#d91a1a}-4.29\%$
test_func_call_cm_runtime[False-compile] 1.1694ms 0.7559ms 1.3230 KOps/s 1.3506 KOps/s $\color{#d91a1a}-2.04\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4983ms 0.3642ms 2.7454 KOps/s 2.7396 KOps/s $\color{#35bf28}+0.21\%$
test_func_call_cm_runtime[True-eager] 1.4337ms 1.0132ms 986.9366 Ops/s 998.9880 Ops/s $\color{#d91a1a}-1.21\%$
test_func_call_cm_runtime[True-compile] 1.4070ms 1.0002ms 999.8276 Ops/s 1.0091 KOps/s $\color{#d91a1a}-0.92\%$
test_func_call_cm_runtime[True-compile-overhead] 1.4120ms 1.0012ms 998.7986 Ops/s 1.0093 KOps/s $\color{#d91a1a}-1.04\%$
test_vmap_func_call_cm_runtime[eager] 2.5262ms 2.1204ms 471.6114 Ops/s 475.0406 Ops/s $\color{#d91a1a}-0.72\%$
test_vmap_func_call_cm_runtime[compile] 1.2261ms 0.8245ms 1.2129 KOps/s 1.2429 KOps/s $\color{#d91a1a}-2.41\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5705ms 0.4164ms 2.4015 KOps/s 2.3771 KOps/s $\color{#35bf28}+1.03\%$
test_distributed 2.5495ms 0.1791ms 5.5821 KOps/s 8.7119 KOps/s $\textbf{\color{#d91a1a}-35.93\%}$
test_tdmodule 0.1860ms 21.1175μs 47.3541 KOps/s 48.2634 KOps/s $\color{#d91a1a}-1.88\%$
test_tdmodule_dispatch 0.2315ms 37.3240μs 26.7924 KOps/s 28.1922 KOps/s $\color{#d91a1a}-4.97\%$
test_tdseq 43.0210μs 22.1488μs 45.1492 KOps/s 48.0266 KOps/s $\textbf{\color{#d91a1a}-5.99\%}$
test_tdseq_dispatch 0.2122ms 40.4741μs 24.7072 KOps/s 24.8114 KOps/s $\color{#d91a1a}-0.42\%$
test_instantiation_functorch 1.9439ms 1.5476ms 646.1584 Ops/s 647.9153 Ops/s $\color{#d91a1a}-0.27\%$
test_exec_functorch 0.5618ms 0.1460ms 6.8485 KOps/s 6.8497 KOps/s $\color{#d91a1a}-0.02\%$
test_exec_functional_call 0.3307ms 0.1373ms 7.2820 KOps/s 7.2573 KOps/s $\color{#35bf28}+0.34\%$
test_exec_td_decorator 0.5993ms 0.1887ms 5.3007 KOps/s 5.1941 KOps/s $\color{#35bf28}+2.05\%$
test_vmap_mlp_speed_decorator[True-True] 1.0878ms 0.6885ms 1.4524 KOps/s 1.4498 KOps/s $\color{#35bf28}+0.17\%$
test_vmap_mlp_speed_decorator[True-False] 0.9336ms 0.6901ms 1.4490 KOps/s 1.4577 KOps/s $\color{#d91a1a}-0.59\%$
test_vmap_mlp_speed_decorator[False-True] 0.7686ms 0.6008ms 1.6644 KOps/s 1.6376 KOps/s $\color{#35bf28}+1.64\%$
test_vmap_mlp_speed_decorator[False-False] 0.9994ms 0.6014ms 1.6628 KOps/s 1.6712 KOps/s $\color{#d91a1a}-0.51\%$
test_vmap_transformer_speed_decorator[True-True] 19.9775ms 19.3799ms 51.6000 Ops/s 52.0165 Ops/s $\color{#d91a1a}-0.80\%$
test_vmap_transformer_speed_decorator[True-False] 19.9814ms 19.3839ms 51.5891 Ops/s 52.0417 Ops/s $\color{#d91a1a}-0.87\%$
test_vmap_transformer_speed_decorator[False-True] 19.5500ms 19.1879ms 52.1163 Ops/s 52.2612 Ops/s $\color{#d91a1a}-0.28\%$
test_vmap_transformer_speed_decorator[False-False] 19.4976ms 19.2174ms 52.0363 Ops/s 52.5283 Ops/s $\color{#d91a1a}-0.94\%$
test_to_module_speed[True] 1.2350ms 0.9963ms 1.0037 KOps/s 984.1275 Ops/s $\color{#35bf28}+1.99\%$
test_to_module_speed[False] 1.1923ms 0.9833ms 1.0169 KOps/s 1.0057 KOps/s $\color{#35bf28}+1.11\%$
test_tc_init 96.6520μs 37.1083μs 26.9481 KOps/s 27.2903 KOps/s $\color{#d91a1a}-1.25\%$
test_tc_init_nested 0.1775ms 74.7785μs 13.3728 KOps/s 13.1924 KOps/s $\color{#35bf28}+1.37\%$
test_tc_first_layer_tensor 22.6710μs 0.8123μs 1.2311 MOps/s 1.1860 MOps/s $\color{#35bf28}+3.80\%$
test_tc_first_layer_nontensor 20.6710μs 2.2817μs 438.2694 KOps/s 420.7372 KOps/s $\color{#35bf28}+4.17\%$
test_tc_second_layer_tensor 12.3225μs 1.4121μs 708.1884 KOps/s 669.0214 KOps/s $\textbf{\color{#35bf28}+5.85\%}$
test_tc_second_layer_nontensor 58.5110μs 3.0153μs 331.6433 KOps/s 321.5717 KOps/s $\color{#35bf28}+3.13\%$
test_unbind 0.2304s 12.6061ms 79.3266 Ops/s 141.1797 Ops/s $\textbf{\color{#d91a1a}-43.81\%}$
test_full_like 11.3575ms 10.3736ms 96.3984 Ops/s 90.2700 Ops/s $\textbf{\color{#35bf28}+6.79\%}$
test_zeros_like 10.1420ms 7.6132ms 131.3515 Ops/s 204.0316 Ops/s $\textbf{\color{#d91a1a}-35.62\%}$
test_ones_like 5.3697ms 4.6556ms 214.7956 Ops/s 206.0561 Ops/s $\color{#35bf28}+4.24\%$
test_clone 13.1081ms 10.1338ms 98.6794 Ops/s 125.2921 Ops/s $\textbf{\color{#d91a1a}-21.24\%}$
test_squeeze 0.1114ms 9.6817μs 103.2875 KOps/s 104.1994 KOps/s $\color{#d91a1a}-0.88\%$
test_unsqueeze 0.1740ms 72.6407μs 13.7664 KOps/s 13.3700 KOps/s $\color{#35bf28}+2.96\%$
test_split 0.6528ms 0.1576ms 6.3435 KOps/s 6.3293 KOps/s $\color{#35bf28}+0.22\%$
test_permute 0.3532ms 0.1891ms 5.2893 KOps/s 5.3428 KOps/s $\color{#d91a1a}-1.00\%$
test_stack 54.4714ms 52.8744ms 18.9127 Ops/s 18.6521 Ops/s $\color{#35bf28}+1.40\%$
test_cat 53.7535ms 52.5316ms 19.0362 Ops/s 18.4552 Ops/s $\color{#35bf28}+3.15\%$

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Feb 19, 2025
ghstack-source-id: 27805b68d4663d51f4ecd67f0495de8f83c90c41
Pull Request resolved: #1221
[ghstack-poisoned]
@vmoens vmoens added the enhancement New feature or request label Feb 19, 2025
[ghstack-poisoned]
@vmoens vmoens merged commit a0a9486 into gh/vmoens/48/base Feb 19, 2025
19 of 37 checks passed
vmoens added a commit that referenced this pull request Feb 19, 2025
ghstack-source-id: 8667892d782a5904e2c5117a1b039edcdaacb9e0
Pull Request resolved: #1221
@vmoens vmoens deleted the gh/vmoens/48/head branch February 19, 2025 16:30