[Feature] capture_non_tensor_stack #1221
Merged
vmoens added a commit that referenced this pull request Feb 19, 2025
ghstack-source-id: 86a79f8b5aad255f3c1cffb821e71b9d06378fdb Pull Request resolved: #1221
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 92.3720μs | 20.9779μs | 47.6691 KOps/s | 49.7036 KOps/s | |
test_plain_set_stack_nested | 67.7760μs | 21.3307μs | 46.8807 KOps/s | 49.3906 KOps/s | |
test_plain_set_nested_inplace | 0.1067ms | 23.2987μs | 42.9208 KOps/s | 44.9494 KOps/s | |
test_plain_set_stack_nested_inplace | 72.7950μs | 23.0154μs | 43.4491 KOps/s | 44.8541 KOps/s | |
test_items | 58.4190μs | 4.2513μs | 235.2210 KOps/s | 238.3685 KOps/s | |
test_items_nested | 0.7302ms | 0.4206ms | 2.3777 KOps/s | 2.4438 KOps/s | |
test_items_nested_locked | 0.4998ms | 0.4193ms | 2.3850 KOps/s | 2.4927 KOps/s | |
test_items_nested_leaf | 0.1641ms | 77.4657μs | 12.9089 KOps/s | 12.8746 KOps/s | |
test_items_stack_nested | 0.5228ms | 0.4198ms | 2.3819 KOps/s | 2.4485 KOps/s | |
test_items_stack_nested_leaf | 0.1567ms | 77.4926μs | 12.9045 KOps/s | 12.3810 KOps/s | |
test_items_stack_nested_locked | 0.7379ms | 0.4231ms | 2.3637 KOps/s | 2.4453 KOps/s | |
test_keys | 58.6890μs | 3.5331μs | 283.0398 KOps/s | 281.5838 KOps/s | |
test_keys_nested | 0.2981ms | 0.1658ms | 6.0324 KOps/s | 6.0504 KOps/s | |
test_keys_nested_locked | 0.7066ms | 0.1686ms | 5.9310 KOps/s | 5.8004 KOps/s | |
test_keys_nested_leaf | 0.2313ms | 0.1423ms | 7.0282 KOps/s | 6.9759 KOps/s | |
test_keys_stack_nested | 0.2508ms | 0.1635ms | 6.1148 KOps/s | 6.1250 KOps/s | |
test_keys_stack_nested_leaf | 0.2233ms | 0.1425ms | 7.0188 KOps/s | 7.1769 KOps/s | |
test_keys_stack_nested_locked | 0.2965ms | 0.1689ms | 5.9195 KOps/s | 5.8963 KOps/s | |
test_values | 10.8062μs | 1.0415μs | 960.1772 KOps/s | 943.5435 KOps/s | |
test_values_nested | 0.1689ms | 63.7501μs | 15.6862 KOps/s | 15.9462 KOps/s | |
test_values_nested_locked | 0.1256ms | 63.2777μs | 15.8033 KOps/s | 15.8605 KOps/s | |
test_values_nested_leaf | 0.1354ms | 72.6930μs | 13.7565 KOps/s | 13.9319 KOps/s | |
test_values_stack_nested | 0.1162ms | 63.0182μs | 15.8684 KOps/s | 14.2105 KOps/s | |
test_values_stack_nested_leaf | 0.3073ms | 73.3010μs | 13.6424 KOps/s | 14.0251 KOps/s | |
test_values_stack_nested_locked | 0.1329ms | 63.8472μs | 15.6624 KOps/s | 15.6755 KOps/s | |
test_membership | 30.1560μs | 0.9273μs | 1.0784 MOps/s | 1.1572 MOps/s | |
test_membership_nested | 53.0180μs | 2.8840μs | 346.7383 KOps/s | 352.4050 KOps/s | |
test_membership_nested_leaf | 35.1150μs | 2.9150μs | 343.0504 KOps/s | 352.4026 KOps/s | |
test_membership_stacked_nested | 52.4670μs | 2.9298μs | 341.3181 KOps/s | 349.0682 KOps/s | |
test_membership_stacked_nested_leaf | 45.3450μs | 2.9113μs | 343.4946 KOps/s | 349.6518 KOps/s | |
test_membership_nested_last | 67.5360μs | 4.3791μs | 228.3598 KOps/s | 235.3199 KOps/s | |
test_membership_nested_leaf_last | 93.8250μs | 4.3743μs | 228.6101 KOps/s | 234.4156 KOps/s | |
test_membership_stacked_nested_last | 61.9250μs | 4.4188μs | 226.3078 KOps/s | 177.9424 KOps/s | |
test_membership_stacked_nested_leaf_last | 56.5450μs | 4.3656μs | 229.0642 KOps/s | 179.3022 KOps/s | |
test_nested_getleaf | 51.3760μs | 10.6188μs | 94.1727 KOps/s | 93.3230 KOps/s | |
test_nested_get | 62.8060μs | 10.1861μs | 98.1727 KOps/s | 99.2909 KOps/s | |
test_stacked_getleaf | 60.6030μs | 10.5111μs | 95.1379 KOps/s | 95.8502 KOps/s | |
test_stacked_get | 48.6510μs | 9.9854μs | 100.1465 KOps/s | 99.9886 KOps/s | |
test_nested_getitemleaf | 71.6440μs | 11.3237μs | 88.3102 KOps/s | 90.3394 KOps/s | |
test_nested_getitem | 83.5660μs | 10.6780μs | 93.6504 KOps/s | 95.1606 KOps/s | |
test_stacked_getitemleaf | 61.3150μs | 11.1545μs | 89.6501 KOps/s | 88.6774 KOps/s | |
test_stacked_getitem | 58.0480μs | 10.6775μs | 93.6550 KOps/s | 94.9536 KOps/s | |
test_lock_nested | 0.8790ms | 0.4289ms | 2.3315 KOps/s | 2.3957 KOps/s | |
test_lock_stack_nested | 0.7385ms | 0.4425ms | 2.2597 KOps/s | 2.3620 KOps/s | |
test_unlock_nested | 0.5544ms | 0.3477ms | 2.8757 KOps/s | 2.9640 KOps/s | |
test_unlock_stack_nested | 0.5495ms | 0.3589ms | 2.7863 KOps/s | 2.9584 KOps/s | |
test_flatten_speed | 0.1794ms | 99.7119μs | 10.0289 KOps/s | 9.9302 KOps/s | |
test_unflatten_speed | 0.8997ms | 0.5360ms | 1.8656 KOps/s | 1.9153 KOps/s | |
test_common_ops | 1.0787ms | 0.8337ms | 1.1995 KOps/s | 1.2437 KOps/s | |
test_creation | 61.4140μs | 2.5446μs | 392.9851 KOps/s | 392.6011 KOps/s | |
test_creation_empty | 40.6360μs | 12.3504μs | 80.9689 KOps/s | 88.4268 KOps/s | |
test_creation_nested_1 | 47.8490μs | 15.6417μs | 63.9318 KOps/s | 70.9587 KOps/s | |
test_creation_nested_2 | 58.7200μs | 19.9858μs | 50.0356 KOps/s | 53.3997 KOps/s | |
test_clone | 51.5760μs | 14.1734μs | 70.5549 KOps/s | 70.0920 KOps/s | |
test_getitem[int] | 0.6972ms | 13.3273μs | 75.0342 KOps/s | 77.7925 KOps/s | |
test_getitem[slice_int] | 0.1368ms | 25.5358μs | 39.1607 KOps/s | 41.5452 KOps/s | |
test_getitem[range] | 0.1839ms | 51.7622μs | 19.3191 KOps/s | 19.2805 KOps/s | |
test_getitem[tuple] | 0.1305ms | 20.9979μs | 47.6237 KOps/s | 49.7469 KOps/s | |
test_getitem[list] | 0.3900ms | 46.2364μs | 21.6280 KOps/s | 21.4294 KOps/s | |
test_setitem_dim[int] | 74.9990μs | 27.4102μs | 36.4827 KOps/s | 37.5631 KOps/s | |
test_setitem_dim[slice_int] | 95.1780μs | 52.6474μs | 18.9943 KOps/s | 18.9736 KOps/s | |
test_setitem_dim[range] | 0.1377ms | 78.1932μs | 12.7888 KOps/s | 12.6155 KOps/s | |
test_setitem_dim[tuple] | 89.3570μs | 41.4940μs | 24.0999 KOps/s | 23.8817 KOps/s | |
test_setitem | 0.1677ms | 21.5650μs | 46.3715 KOps/s | 47.8969 KOps/s | |
test_set | 0.1807ms | 21.3205μs | 46.9031 KOps/s | 49.5791 KOps/s | |
test_set_shared | 0.3909ms | 0.1857ms | 5.3842 KOps/s | 5.2289 KOps/s | |
test_update | 0.2211ms | 24.1791μs | 41.3581 KOps/s | 44.0066 KOps/s | |
test_update_nested | 0.2244ms | 36.1277μs | 27.6796 KOps/s | 29.2430 KOps/s | |
test_update__nested | 0.4176ms | 35.4319μs | 28.2231 KOps/s | 29.5758 KOps/s | |
test_set_nested | 0.1699ms | 23.1552μs | 43.1869 KOps/s | 44.7068 KOps/s | |
test_set_nested_new | 66.0320μs | 28.5559μs | 35.0191 KOps/s | 36.8876 KOps/s | |
test_select | 0.2371ms | 44.6503μs | 22.3963 KOps/s | 23.3167 KOps/s | |
test_select_nested | 0.1134ms | 66.4989μs | 15.0378 KOps/s | 15.6303 KOps/s | |
test_exclude_nested | 0.1894ms | 85.1503μs | 11.7439 KOps/s | 11.5425 KOps/s | |
test_empty[True] | 0.6200ms | 0.4139ms | 2.4162 KOps/s | 2.4013 KOps/s | |
test_empty[False] | 8.7890μs | 1.4707μs | 679.9632 KOps/s | 737.8024 KOps/s | |
test_unbind_speed | 0.3743ms | 0.2822ms | 3.5441 KOps/s | 3.6233 KOps/s | |
test_unbind_speed_stack0 | 0.4458ms | 0.2823ms | 3.5424 KOps/s | 3.7723 KOps/s | |
test_unbind_speed_stack1 | 0.1140s | 0.7697ms | 1.2992 KOps/s | 1.3702 KOps/s | |
test_split | 0.1202s | 1.8511ms | 540.2227 Ops/s | 561.1470 Ops/s | |
test_chunk | 0.1186s | 1.8559ms | 538.8264 Ops/s | 562.9485 Ops/s | |
test_consolidate_njt[False-None] | 10.5743ms | 8.7455ms | 114.3439 Ops/s | 121.0864 Ops/s | |
test_creation[device0] | 3.9719ms | 94.9106μs | 10.5362 KOps/s | 10.5543 KOps/s | |
test_creation_from_tensor | 0.4130ms | 96.5721μs | 10.3550 KOps/s | 10.3883 KOps/s | |
test_add_one[memmap_tensor0] | 0.1235ms | 4.8800μs | 204.9166 KOps/s | 186.2515 KOps/s | |
test_contiguous[memmap_tensor0] | 16.1100μs | 0.4976μs | 2.0096 MOps/s | 1.9724 MOps/s | |
test_stack[memmap_tensor0] | 38.6120μs | 3.4789μs | 287.4470 KOps/s | 269.5724 KOps/s | |
test_memmaptd_index | 1.0308ms | 0.2413ms | 4.1436 KOps/s | 4.2077 KOps/s | |
test_memmaptd_index_astensor | 0.5847ms | 0.3254ms | 3.0728 KOps/s | 2.7098 KOps/s | |
test_memmaptd_index_op | 0.8447ms | 0.6050ms | 1.6529 KOps/s | 1.7022 KOps/s | |
test_serialize_model | 0.2462s | 0.1409s | 7.0992 Ops/s | 8.5459 Ops/s | |
test_serialize_model_pickle | 0.4471s | 0.3905s | 2.5608 Ops/s | 2.4884 Ops/s | |
test_serialize_weights | 0.1247s | 0.1184s | 8.4482 Ops/s | 8.6464 Ops/s | |
test_serialize_weights_returnearly | 0.1872s | 0.1655s | 6.0407 Ops/s | 5.9903 Ops/s | |
test_serialize_weights_pickle | 1.2353s | 0.7213s | 1.3863 Ops/s | 2.4691 Ops/s | |
test_serialize_weights_filesystem | 0.1541s | 0.1493s | 6.6987 Ops/s | 6.1985 Ops/s | |
test_serialize_model_filesystem | 0.1601s | 0.1495s | 6.6883 Ops/s | 6.3605 Ops/s | |
test_reshape_pytree | 75.1200μs | 27.0395μs | 36.9830 KOps/s | 37.5751 KOps/s | |
test_reshape_td | 68.8790μs | 33.0684μs | 30.2404 KOps/s | 28.9574 KOps/s | |
test_view_pytree | 62.1550μs | 26.6457μs | 37.5294 KOps/s | 37.8661 KOps/s | |
test_view_td | 0.1544ms | 44.1301μs | 22.6603 KOps/s | 24.6909 KOps/s | |
test_unbind_pytree | 75.5810μs | 30.5172μs | 32.7684 KOps/s | 33.5698 KOps/s | |
test_unbind_td | 0.3457ms | 41.1722μs | 24.2882 KOps/s | 25.0403 KOps/s | |
test_split_pytree | 77.6740μs | 29.8445μs | 33.5070 KOps/s | 34.1712 KOps/s | |
test_split_td | 0.5309ms | 47.4534μs | 21.0733 KOps/s | 22.1205 KOps/s | |
test_add_pytree | 87.7230μs | 35.4433μs | 28.2141 KOps/s | 27.4444 KOps/s | |
test_add_td | 0.1766ms | 60.5309μs | 16.5205 KOps/s | 16.9271 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1263ms | 66.7716μs | 14.9764 KOps/s | 14.8850 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3624ms | 0.1725ms | 5.7980 KOps/s | 5.7442 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1228ms | 46.5122μs | 21.4997 KOps/s | 21.5562 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2266ms | 0.1203ms | 8.3095 KOps/s | 8.2591 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 93.6750μs | 28.5579μs | 35.0166 KOps/s | 35.9567 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1263ms | 61.7581μs | 16.1922 KOps/s | 17.1984 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1849ms | 82.2752μs | 12.1543 KOps/s | 12.5432 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1308ms | 68.5236μs | 14.5935 KOps/s | 15.0579 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2328ms | 0.1095ms | 9.1341 KOps/s | 9.2918 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4864ms | 0.2212ms | 4.5218 KOps/s | 4.6240 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2745ms | 47.9934μs | 20.8362 KOps/s | 20.9754 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1528ms | 69.7276μs | 14.3415 KOps/s | 14.6849 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2253ms | 0.1024ms | 9.7693 KOps/s | 9.7332 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3718ms | 0.2025ms | 4.9372 KOps/s | 4.7671 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3832ms | 0.2364ms | 4.2301 KOps/s | 4.2973 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2042ms | 0.1099ms | 9.0954 KOps/s | 9.0775 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.3202ms | 64.7996μs | 15.4322 KOps/s | 15.7773 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2511ms | 50.5980μs | 19.7636 KOps/s | 19.9450 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2930ms | 0.1576ms | 6.3445 KOps/s | 6.1934 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1816ms | 0.1014ms | 9.8664 KOps/s | 9.8197 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 86.2520μs | 22.4257μs | 44.5918 KOps/s | 47.9233 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1321ms | 69.2812μs | 14.4339 KOps/s | 14.6355 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1602ms | 81.6160μs | 12.2525 KOps/s | 12.3001 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1303ms | 67.9637μs | 14.7137 KOps/s | 14.8160 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.4428ms | 0.2215ms | 4.5157 KOps/s | 4.7158 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.5650ms | 1.4267ms | 700.9410 Ops/s | 716.5211 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.4207ms | 0.2151ms | 4.6500 KOps/s | 4.7926 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.0768ms | 0.8295ms | 1.2056 KOps/s | 1.1734 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.9995ms | 0.4758ms | 2.1015 KOps/s | 2.1973 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.7404ms | 2.7969ms | 357.5448 Ops/s | 382.0838 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1210ms | 39.3418μs | 25.4182 KOps/s | 26.2020 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5722ms | 33.7454μs | 29.6337 KOps/s | 29.4322 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1089ms | 30.7110μs | 32.5616 KOps/s | 31.5989 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 97.9430μs | 23.6778μs | 42.2337 KOps/s | 42.4657 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 93.4340μs | 32.0957μs | 31.1568 KOps/s | 30.8185 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 83.0240μs | 23.5009μs | 42.5516 KOps/s | 42.1691 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1288ms | 54.7178μs | 18.2756 KOps/s | 19.1375 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4075ms | 21.3935μs | 46.7432 KOps/s | 49.9751 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1219ms | 45.7885μs | 21.8395 KOps/s | 21.6724 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 70.4110μs | 18.7015μs | 53.4717 KOps/s | 53.5799 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1310ms | 47.3947μs | 21.0994 KOps/s | 21.4136 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 76.3330μs | 18.7908μs | 53.2175 KOps/s | 53.3768 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1302ms | 56.0135μs | 17.8528 KOps/s | 18.6940 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.1280ms | 21.4332μs | 46.6566 KOps/s | 50.5109 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1235ms | 46.6933μs | 21.4163 KOps/s | 21.2519 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 72.4450μs | 18.9108μs | 52.8799 KOps/s | 54.0245 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1086ms | 47.1199μs | 21.2225 KOps/s | 21.3282 KOps/s | |
test_compile_indexing[int-pytree-eager] | 81.7420μs | 18.6572μs | 53.5986 KOps/s | 53.1358 KOps/s | |
test_mod_add[eager] | 88.4350μs | 36.3221μs | 27.5314 KOps/s | 27.7826 KOps/s | |
test_mod_add[compile] | 0.1252ms | 66.6497μs | 15.0038 KOps/s | 15.8624 KOps/s | |
test_mod_add[compile-overhead] | 0.1412ms | 66.5912μs | 15.0170 KOps/s | 15.6815 KOps/s | |
test_mod_wrap[eager] | 0.4528ms | 0.2264ms | 4.4165 KOps/s | 4.3668 KOps/s | |
test_mod_wrap[compile] | 2.0968ms | 0.2346ms | 4.2626 KOps/s | 4.3292 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3653ms | 0.2290ms | 4.3661 KOps/s | 4.4600 KOps/s | |
test_mod_wrap_and_backward[eager] | 15.0182ms | 13.4314ms | 74.4523 Ops/s | 78.4721 Ops/s | |
test_mod_wrap_and_backward[compile] | 13.9426ms | 11.7551ms | 85.0693 Ops/s | 78.4431 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 13.5605ms | 11.6265ms | 86.0102 Ops/s | 86.3346 Ops/s | |
test_seq_add[eager] | 0.2212ms | 0.1213ms | 8.2430 KOps/s | 8.3988 KOps/s | |
test_seq_add[compile] | 0.1907ms | 80.7149μs | 12.3893 KOps/s | 13.1419 KOps/s | |
test_seq_add[compile-overhead] | 0.2117ms | 79.3563μs | 12.6014 KOps/s | 13.5606 KOps/s | |
test_seq_wrap[eager] | 1.0997ms | 0.4599ms | 2.1746 KOps/s | 2.2300 KOps/s | |
test_seq_wrap[compile] | 0.7175ms | 0.2575ms | 3.8838 KOps/s | 4.1174 KOps/s | |
test_seq_wrap[compile-overhead] | 0.5026ms | 0.2519ms | 3.9704 KOps/s | 4.1989 KOps/s | |
test_func_call_runtime[False-eager] | 0.8495ms | 0.5619ms | 1.7796 KOps/s | 1.8163 KOps/s | |
test_func_call_runtime[False-compile] | 0.6404ms | 0.4535ms | 2.2049 KOps/s | 2.2541 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6645ms | 0.4514ms | 2.2155 KOps/s | 2.2657 KOps/s | |
test_func_call_runtime[True-eager] | 1.9117ms | 0.8062ms | 1.2403 KOps/s | 1.3261 KOps/s | |
test_func_call_runtime[True-compile] | 0.6618ms | 0.4743ms | 2.1086 KOps/s | 2.1730 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6847ms | 0.4781ms | 2.0918 KOps/s | 2.1353 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7784ms | 0.5744ms | 1.7409 KOps/s | 1.8573 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6860ms | 0.4458ms | 2.2432 KOps/s | 2.2732 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6231ms | 0.4467ms | 2.2388 KOps/s | 2.2657 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.5525ms | 0.9392ms | 1.0647 KOps/s | 1.1198 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.1515ms | 0.8269ms | 1.2094 KOps/s | 1.2623 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.0760ms | 0.8323ms | 1.2015 KOps/s | 1.2536 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 3.1765ms | 1.9855ms | 503.6484 Ops/s | 516.2960 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9305ms | 0.5576ms | 1.7934 KOps/s | 1.8403 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.7552ms | 0.5529ms | 1.8087 KOps/s | 1.8545 KOps/s | |
test_distributed | 0.2568ms | 0.1281ms | 7.8055 KOps/s | 7.7316 KOps/s | |
test_tdmodule | 59.1900μs | 27.2048μs | 36.7582 KOps/s | 37.1141 KOps/s | |
test_tdmodule_dispatch | 0.1099ms | 52.8709μs | 18.9140 KOps/s | 20.8602 KOps/s | |
test_tdseq | 64.8010μs | 29.3346μs | 34.0895 KOps/s | 33.8440 KOps/s | |
test_tdseq_dispatch | 95.4580μs | 55.2698μs | 18.0931 KOps/s | 17.7815 KOps/s | |
test_instantiation_functorch | 1.7761ms | 1.5438ms | 647.7314 Ops/s | 637.3794 Ops/s | |
test_exec_functorch | 0.4016ms | 0.1804ms | 5.5447 KOps/s | 5.5072 KOps/s | |
test_exec_functional_call | 0.3249ms | 0.1757ms | 5.6931 KOps/s | 5.6987 KOps/s | |
test_exec_td_decorator | 0.5823ms | 0.2403ms | 4.1621 KOps/s | 4.2975 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9214ms | 0.6656ms | 1.5024 KOps/s | 1.4753 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0246ms | 0.6642ms | 1.5055 KOps/s | 1.5299 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9267ms | 0.5416ms | 1.8463 KOps/s | 1.8499 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8526ms | 0.5432ms | 1.8408 KOps/s | 1.8604 KOps/s | |
test_to_module_speed[True] | 2.5566ms | 1.3891ms | 719.8748 Ops/s | 749.5396 Ops/s | |
test_to_module_speed[False] | 2.1061ms | 1.3334ms | 749.9800 Ops/s | 759.0140 Ops/s | |
test_tc_init | 0.1065ms | 47.9298μs | 20.8638 KOps/s | 20.9647 KOps/s | |
test_tc_init_nested | 0.1815ms | 95.6915μs | 10.4502 KOps/s | 10.6365 KOps/s | |
test_tc_first_layer_tensor | 36.1070μs | 1.5591μs | 641.3809 KOps/s | 650.1975 KOps/s | |
test_tc_first_layer_nontensor | 33.5020μs | 4.6797μs | 213.6886 KOps/s | 214.1823 KOps/s | |
test_tc_second_layer_tensor | 30.9180μs | 2.8828μs | 346.8905 KOps/s | 352.8782 KOps/s | |
test_tc_second_layer_nontensor | 32.8110μs | 6.1219μs | 163.3485 KOps/s | 162.8304 KOps/s | |
test_unbind | 0.2707s | 15.2394ms | 65.6194 Ops/s | 68.4239 Ops/s | |
test_full_like | 12.3197ms | 10.5573ms | 94.7214 Ops/s | 110.2471 Ops/s | |
test_zeros_like | 5.6675ms | 3.6320ms | 275.3281 Ops/s | 284.9023 Ops/s | |
test_ones_like | 5.5379ms | 4.5698ms | 218.8275 Ops/s | 241.0863 Ops/s | |
test_clone | 14.8106ms | 7.7020ms | 129.8369 Ops/s | 161.8701 Ops/s | |
test_squeeze | 90.9190μs | 12.6046μs | 79.3360 KOps/s | 77.7566 KOps/s | |
test_unsqueeze | 0.1805ms | 96.2508μs | 10.3895 KOps/s | 10.6351 KOps/s | |
test_split | 0.3638ms | 0.2004ms | 4.9892 KOps/s | 5.1743 KOps/s | |
test_permute | 0.3242ms | 0.2045ms | 4.8894 KOps/s | 5.0136 KOps/s | |
test_stack | 29.1887ms | 26.3323ms | 37.9762 Ops/s | 36.4714 Ops/s | |
test_cat | 34.0061ms | 26.1292ms | 38.2714 Ops/s | 35.1470 Ops/s | |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 34.1410μs | 13.1100μs | 76.2778 KOps/s | 76.5418 KOps/s | |
test_plain_set_stack_nested | 47.3410μs | 13.1945μs | 75.7892 KOps/s | 75.6787 KOps/s | |
test_plain_set_nested_inplace | 42.2510μs | 14.1823μs | 70.5104 KOps/s | 70.1240 KOps/s | |
test_plain_set_stack_nested_inplace | 44.3400μs | 14.1119μs | 70.8621 KOps/s | 71.1976 KOps/s | |
test_items | 85.9820μs | 2.8424μs | 351.8213 KOps/s | 345.1006 KOps/s | |
test_items_nested | 0.4688ms | 0.3660ms | 2.7323 KOps/s | 2.6976 KOps/s | |
test_items_nested_locked | 0.4266ms | 0.3693ms | 2.7078 KOps/s | 2.6741 KOps/s | |
test_items_nested_leaf | 0.1204ms | 67.1649μs | 14.8887 KOps/s | 14.8433 KOps/s | |
test_items_stack_nested | 0.4227ms | 0.3656ms | 2.7350 KOps/s | 2.7122 KOps/s | |
test_items_stack_nested_leaf | 0.1302ms | 68.4824μs | 14.6023 KOps/s | 14.6240 KOps/s | |
test_items_stack_nested_locked | 0.4068ms | 0.3649ms | 2.7408 KOps/s | 2.6912 KOps/s | |
test_keys | 37.3110μs | 3.7094μs | 269.5869 KOps/s | 291.7881 KOps/s | |
test_keys_nested | 0.1183ms | 88.7748μs | 11.2645 KOps/s | 11.2938 KOps/s | |
test_keys_nested_locked | 0.7740ms | 94.6061μs | 10.5701 KOps/s | 10.5784 KOps/s | |
test_keys_nested_leaf | 0.1593ms | 80.3353μs | 12.4478 KOps/s | 12.6043 KOps/s | |
test_keys_stack_nested | 0.1237ms | 88.9075μs | 11.2476 KOps/s | 11.2063 KOps/s | |
test_keys_stack_nested_leaf | 0.1094ms | 79.8610μs | 12.5218 KOps/s | 12.3113 KOps/s | |
test_keys_stack_nested_locked | 0.1276ms | 94.5882μs | 10.5721 KOps/s | 10.4531 KOps/s | |
test_values | 5.6768μs | 0.8550μs | 1.1696 MOps/s | 1.1792 MOps/s | |
test_values_nested | 67.0810μs | 37.4057μs | 26.7339 KOps/s | 26.6677 KOps/s | |
test_values_nested_locked | 70.6810μs | 39.1910μs | 25.5161 KOps/s | 25.4531 KOps/s | |
test_values_nested_leaf | 0.2262ms | 42.4971μs | 23.5310 KOps/s | 23.5179 KOps/s | |
test_values_stack_nested | 68.4510μs | 38.1276μs | 26.2277 KOps/s | 26.0903 KOps/s | |
test_values_stack_nested_leaf | 0.1053ms | 42.6322μs | 23.4565 KOps/s | 23.3160 KOps/s | |
test_values_stack_nested_locked | 63.2910μs | 39.8644μs | 25.0850 KOps/s | 24.8861 KOps/s | |
test_membership | 2.7296μs | 0.5036μs | 1.9858 MOps/s | 1.9963 MOps/s | |
test_membership_nested | 15.9605μs | 2.0005μs | 499.8808 KOps/s | 496.0456 KOps/s | |
test_membership_nested_leaf | 15.9505μs | 2.0163μs | 495.9570 KOps/s | 496.3671 KOps/s | |
test_membership_stacked_nested | 56.6510μs | 2.0677μs | 483.6195 KOps/s | 482.1015 KOps/s | |
test_membership_stacked_nested_leaf | 35.2610μs | 2.0603μs | 485.3637 KOps/s | 481.7970 KOps/s | |
test_membership_nested_last | 27.0500μs | 3.0791μs | 324.7685 KOps/s | 322.2496 KOps/s | |
test_membership_nested_leaf_last | 40.2500μs | 3.1159μs | 320.9348 KOps/s | 322.1221 KOps/s | |
test_membership_stacked_nested_last | 41.5200μs | 8.5396μs | 117.1014 KOps/s | 279.1475 KOps/s | |
test_membership_stacked_nested_leaf_last | 35.4610μs | 8.4476μs | 118.3769 KOps/s | 276.1373 KOps/s | |
test_nested_getleaf | 35.8100μs | 6.5170μs | 153.4456 KOps/s | 155.8826 KOps/s | |
test_nested_get | 32.8500μs | 6.1929μs | 161.4741 KOps/s | 159.7780 KOps/s | |
test_stacked_getleaf | 75.3220μs | 6.4559μs | 154.8974 KOps/s | 156.3088 KOps/s | |
test_stacked_get | 38.6210μs | 5.9837μs | 167.1204 KOps/s | 166.4831 KOps/s | |
test_nested_getitemleaf | 38.2810μs | 6.6505μs | 150.3645 KOps/s | 150.6346 KOps/s | |
test_nested_getitem | 35.2800μs | 6.2813μs | 159.2027 KOps/s | 157.8667 KOps/s | |
test_stacked_getitemleaf | 40.2910μs | 6.6207μs | 151.0415 KOps/s | 151.6943 KOps/s | |
test_stacked_getitem | 34.5200μs | 6.2009μs | 161.2668 KOps/s | 160.6900 KOps/s | |
test_lock_nested | 0.4164ms | 0.3359ms | 2.9773 KOps/s | 2.9471 KOps/s | |
test_lock_stack_nested | 0.3959ms | 0.3396ms | 2.9447 KOps/s | 2.8838 KOps/s | |
test_unlock_nested | 0.3745ms | 0.2805ms | 3.5656 KOps/s | 3.5861 KOps/s | |
test_unlock_stack_nested | 0.3359ms | 0.2757ms | 3.6268 KOps/s | 3.5416 KOps/s | |
test_flatten_speed | 0.1124ms | 85.2033μs | 11.7366 KOps/s | 11.8695 KOps/s | |
test_unflatten_speed | 0.3660ms | 0.3253ms | 3.0743 KOps/s | 3.0899 KOps/s | |
test_common_ops | 0.8081ms | 0.6232ms | 1.6046 KOps/s | 1.6033 KOps/s | |
test_creation | 0.1305ms | 1.7306μs | 577.8307 KOps/s | 566.1271 KOps/s | |
test_creation_empty | 35.9900μs | 8.8491μs | 113.0055 KOps/s | 109.4116 KOps/s | |
test_creation_nested_1 | 37.0710μs | 10.4615μs | 95.5890 KOps/s | 95.1912 KOps/s | |
test_creation_nested_2 | 48.3010μs | 13.2249μs | 75.6151 KOps/s | 75.1427 KOps/s | |
test_clone | 60.2910μs | 10.8580μs | 92.0979 KOps/s | 93.7342 KOps/s | |
test_getitem[int] | 1.3205ms | 10.3289μs | 96.8155 KOps/s | 95.3511 KOps/s | |
test_getitem[slice_int] | 0.1729ms | 20.5568μs | 48.6457 KOps/s | 49.1522 KOps/s | |
test_getitem[range] | 0.1323ms | 36.7445μs | 27.2150 KOps/s | 27.2817 KOps/s | |
test_getitem[tuple] | 0.1371ms | 17.8270μs | 56.0947 KOps/s | 56.3951 KOps/s | |
test_getitem[list] | 0.1339ms | 32.5643μs | 30.7084 KOps/s | 30.4265 KOps/s | |
test_setitem_dim[int] | 39.3510μs | 18.5084μs | 54.0294 KOps/s | 52.7522 KOps/s | |
test_setitem_dim[slice_int] | 62.5010μs | 37.9831μs | 26.3275 KOps/s | 26.8807 KOps/s | |
test_setitem_dim[range] | 82.3010μs | 50.8953μs | 19.6482 KOps/s | 19.5674 KOps/s | |
test_setitem_dim[tuple] | 54.7510μs | 31.7541μs | 31.4920 KOps/s | 32.1182 KOps/s | |
test_setitem | 79.0620μs | 15.2100μs | 65.7461 KOps/s | 63.9478 KOps/s | |
test_set | 84.1110μs | 14.9871μs | 66.7242 KOps/s | 66.5575 KOps/s | |
test_set_shared | 0.6231ms | 0.1566ms | 6.3840 KOps/s | 6.3765 KOps/s | |
test_update | 0.3847ms | 18.4415μs | 54.2255 KOps/s | 53.7452 KOps/s | |
test_update_nested | 70.7410μs | 23.5828μs | 42.4037 KOps/s | 39.3979 KOps/s | |
test_update__nested | 0.6204ms | 25.3862μs | 39.3916 KOps/s | 35.7991 KOps/s | |
test_set_nested | 63.0010μs | 16.2093μs | 61.6930 KOps/s | 56.0930 KOps/s | |
test_set_nested_new | 0.2220ms | 18.3595μs | 54.4678 KOps/s | 49.9221 KOps/s | |
test_select | 0.1120ms | 30.0320μs | 33.2978 KOps/s | 33.1459 KOps/s | |
test_select_nested | 0.1085ms | 44.3755μs | 22.5349 KOps/s | 22.3178 KOps/s | |
test_exclude_nested | 0.1119ms | 63.8555μs | 15.6604 KOps/s | 15.7501 KOps/s | |
test_empty[True] | 0.3600ms | 0.2972ms | 3.3645 KOps/s | 3.3643 KOps/s | |
test_empty[False] | 18.3733μs | 0.8270μs | 1.2092 MOps/s | 1.2111 MOps/s | |
test_to | 91.3620μs | 55.4381μs | 18.0381 KOps/s | 18.2650 KOps/s | |
test_to_nonblocking | 0.1836ms | 47.5076μs | 21.0493 KOps/s | 21.5422 KOps/s | |
test_unbind_speed | 0.2854ms | 0.2420ms | 4.1327 KOps/s | 4.1286 KOps/s | |
test_unbind_speed_stack0 | 0.2932ms | 0.2325ms | 4.3012 KOps/s | 4.0311 KOps/s | |
test_unbind_speed_stack1 | 0.1080s | 0.7417ms | 1.3483 KOps/s | 1.2974 KOps/s | |
test_split | 0.1190s | 1.6632ms | 601.2637 Ops/s | 584.5910 Ops/s | |
test_chunk | 0.1097s | 1.6336ms | 612.1429 Ops/s | 565.0844 Ops/s | |
test_consolidate[False-None] | 2.7164ms | 2.6281ms | 380.5002 Ops/s | 375.1722 Ops/s | |
test_consolidate[default-None] | 1.7796ms | 1.6885ms | 592.2416 Ops/s | 580.9256 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.8547ms | 1.7150ms | 583.0871 Ops/s | 567.8051 Ops/s | |
test_consolidate_njt[False-None] | 6.7415ms | 6.3621ms | 157.1799 Ops/s | 155.2339 Ops/s | |
test_to[False-False-None] | 1.9124ms | 1.7343ms | 576.6083 Ops/s | 574.2382 Ops/s | |
test_to[True-False-None] | 1.5138ms | 1.2901ms | 775.1316 Ops/s | 759.9633 Ops/s | |
test_to[within-False-None] | 4.2502ms | 4.0383ms | 247.6319 Ops/s | 245.5147 Ops/s | |
test_to[True-default-None] | 5.4018ms | 5.0936ms | 196.3261 Ops/s | 194.8867 Ops/s | |
test_to_njt[False-False-None] | 7.1106ms | 6.8726ms | 145.5063 Ops/s | 144.7037 Ops/s | |
test_to_njt[True-False-None] | 5.8024ms | 5.4132ms | 184.7341 Ops/s | 182.8684 Ops/s | |
test_to_njt[within-False-None] | 12.4192ms | 12.0430ms | 83.0360 Ops/s | 82.9050 Ops/s | |
test_creation[device0] | 0.4678ms | 81.1045μs | 12.3298 KOps/s | 12.7089 KOps/s | |
test_creation_from_tensor | 0.5462ms | 81.7932μs | 12.2260 KOps/s | 12.1728 KOps/s | |
test_add_one[memmap_tensor0] | 0.2575ms | 6.6024μs | 151.4593 KOps/s | 148.3446 KOps/s | |
test_contiguous[memmap_tensor0] | 1.6715μs | 0.4028μs | 2.4827 MOps/s | 2.4883 MOps/s | |
test_stack[memmap_tensor0] | 24.4700μs | 4.2437μs | 235.6411 KOps/s | 237.1434 KOps/s | |
test_memmaptd_index | 1.7616ms | 0.2448ms | 4.0855 KOps/s | 4.1553 KOps/s | |
test_memmaptd_index_astensor | 0.4260ms | 0.3027ms | 3.3041 KOps/s | 3.3196 KOps/s | |
test_memmaptd_index_op | 0.7966ms | 0.5833ms | 1.7145 KOps/s | 1.7063 KOps/s | |
test_serialize_model | 0.1334s | 0.1325s | 7.5474 Ops/s | 7.5428 Ops/s | |
test_serialize_model_pickle | 1.3486s | 1.1887s | 0.8412 Ops/s | 0.8217 Ops/s | |
test_serialize_weights | 0.1321s | 0.1312s | 7.6239 Ops/s | 7.6262 Ops/s | |
test_serialize_weights_returnearly | 0.3714s | 56.2635ms | 17.7735 Ops/s | 23.5085 Ops/s | |
test_serialize_weights_pickle | 1.3476s | 1.2118s | 0.8252 Ops/s | 0.8136 Ops/s | |
test_reshape_pytree | 0.1717ms | 22.1001μs | 45.2487 KOps/s | 45.6574 KOps/s | |
test_reshape_td | 0.1481ms | 25.8639μs | 38.6640 KOps/s | 37.2234 KOps/s | |
test_view_pytree | 0.1295ms | 21.7433μs | 45.9912 KOps/s | 45.7852 KOps/s | |
test_view_td | 64.5710μs | 29.9707μs | 33.3659 KOps/s | 30.1901 KOps/s | |
test_unbind_pytree | 0.1431ms | 27.8668μs | 35.8850 KOps/s | 36.0182 KOps/s | |
test_unbind_td | 0.5977ms | 36.2112μs | 27.6157 KOps/s | 27.8963 KOps/s | |
test_split_pytree | 0.1508ms | 29.6385μs | 33.7399 KOps/s | 33.8709 KOps/s | |
test_split_td | 0.1891s | 52.3198μs | 19.1132 KOps/s | 25.8219 KOps/s | |
test_add_pytree | 0.1650ms | 34.4229μs | 29.0505 KOps/s | 28.5886 KOps/s | |
test_add_td | 0.1843ms | 52.8920μs | 18.9065 KOps/s | 19.8205 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2690ms | 0.1203ms | 8.3110 KOps/s | 7.8212 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2863ms | 0.1351ms | 7.4008 KOps/s | 7.3106 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2438ms | 96.0161μs | 10.4149 KOps/s | 10.3208 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.2366ms | 0.1464ms | 6.8311 KOps/s | 6.7876 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.4977ms | 33.1980μs | 30.1223 KOps/s | 45.1957 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.4128ms | 29.4377μs | 33.9701 KOps/s | 33.5711 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.4894ms | 63.6755μs | 15.7046 KOps/s | 15.4159 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.4978ms | 49.8324μs | 20.0673 KOps/s | 19.9877 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2784ms | 0.1425ms | 7.0157 KOps/s | 7.0599 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.6324ms | 0.2200ms | 4.5451 KOps/s | 4.5827 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.3011ms | 97.7331μs | 10.2319 KOps/s | 10.1777 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2650ms | 54.7244μs | 18.2734 KOps/s | 17.6981 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2890ms | 0.1350ms | 7.4051 KOps/s | 7.0487 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6263ms | 0.4671ms | 2.1409 KOps/s | 2.1028 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4371ms | 0.2649ms | 3.7753 KOps/s | 3.8091 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2687ms | 0.1429ms | 6.9979 KOps/s | 6.7605 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2157ms | 66.6354μs | 15.0070 KOps/s | 14.1843 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2397ms | 98.6144μs | 10.1405 KOps/s | 9.6332 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5473ms | 0.3950ms | 2.5314 KOps/s | 2.5050 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2827ms | 0.1345ms | 7.4367 KOps/s | 7.1780 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1527ms | 17.8529μs | 56.0133 KOps/s | 56.0979 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.4272ms | 32.2029μs | 31.0531 KOps/s | 31.7432 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.4649ms | 69.7740μs | 14.3320 KOps/s | 14.3486 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.4526ms | 52.4838μs | 19.0535 KOps/s | 18.9732 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.6532ms | 0.4540ms | 2.2027 KOps/s | 2.2285 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.9066ms | 2.6227ms | 381.2897 Ops/s | 380.9208 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5884ms | 0.4318ms | 2.3160 KOps/s | 2.2604 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.1344ms | 2.5859ms | 386.7171 Ops/s | 388.5016 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.8842ms | 0.1182ms | 8.4609 KOps/s | 8.7410 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5843ms | 81.5750μs | 12.2587 KOps/s | 12.3695 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.8031ms | 0.1101ms | 9.0792 KOps/s | 9.1591 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.2526ms | 70.8249μs | 14.1193 KOps/s | 14.4069 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2884ms | 0.1123ms | 8.9014 KOps/s | 9.0411 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2642ms | 70.8321μs | 14.1179 KOps/s | 14.1720 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2869ms | 0.1052ms | 9.5081 KOps/s | 10.0282 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1800ms | 17.7473μs | 56.3465 KOps/s | 56.9541 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2400ms | 96.3051μs | 10.3837 KOps/s | 10.0366 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1643ms | 15.5015μs | 64.5098 KOps/s | 64.6219 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2872ms | 99.6354μs | 10.0366 KOps/s | 10.3175 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1973ms | 15.4841μs | 64.5825 KOps/s | 65.0724 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.3147ms | 0.1052ms | 9.5095 KOps/s | 9.9667 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5976ms | 17.1486μs | 58.3138 KOps/s | 58.7899 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2866ms | 0.1011ms | 9.8888 KOps/s | 10.2596 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1515ms | 15.4049μs | 64.9144 KOps/s | 64.6911 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2733ms | 96.8859μs | 10.3214 KOps/s | 9.7900 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.2073ms | 15.4996μs | 64.5177 KOps/s | 64.8718 KOps/s | |
test_mod_add[eager] | 0.1899ms | 38.0485μs | 26.2823 KOps/s | 26.2749 KOps/s | |
test_mod_add[compile] | 0.3011ms | 79.5013μs | 12.5784 KOps/s | 12.3373 KOps/s | |
test_mod_add[compile-overhead] | 0.3873ms | 0.1867ms | 5.3572 KOps/s | 5.5401 KOps/s | |
test_mod_wrap[eager] | 0.4164ms | 0.2478ms | 4.0360 KOps/s | 4.0505 KOps/s | |
test_mod_wrap[compile] | 0.5048ms | 0.2803ms | 3.5677 KOps/s | 3.5370 KOps/s | |
test_mod_wrap[compile-overhead] | 7.1692ms | 3.7983ms | 263.2756 Ops/s | 256.1896 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5750ms | 1.3813ms | 723.9404 Ops/s | 699.6335 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.4736ms | 1.2664ms | 789.6148 Ops/s | 800.6139 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.4315ms | 0.9265ms | 1.0793 KOps/s | 1.0524 KOps/s | |
test_seq_add[eager] | 0.5471ms | 0.1143ms | 8.7452 KOps/s | 8.6166 KOps/s | |
test_seq_add[compile] | 0.5058ms | 87.0956μs | 11.4816 KOps/s | 11.6263 KOps/s | |
test_seq_add[compile-overhead] | 0.2739ms | 0.1286ms | 7.7733 KOps/s | 7.5580 KOps/s | |
test_seq_wrap[eager] | 0.8715ms | 0.4282ms | 2.3353 KOps/s | 2.3186 KOps/s | |
test_seq_wrap[compile] | 0.6865ms | 0.3007ms | 3.3252 KOps/s | 3.3851 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4134ms | 0.2264ms | 4.4165 KOps/s | 4.4639 KOps/s | |
test_func_call_runtime[False-eager] | 1.2155ms | 0.7846ms | 1.2745 KOps/s | 1.3761 KOps/s | |
test_func_call_runtime[False-compile] | 1.1847ms | 0.7583ms | 1.3187 KOps/s | 1.3666 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.7806ms | 0.3614ms | 2.7667 KOps/s | 2.7464 KOps/s | |
test_func_call_runtime[True-eager] | 1.3215ms | 0.9149ms | 1.0930 KOps/s | 1.1083 KOps/s | |
test_func_call_runtime[True-compile] | 1.1809ms | 0.7906ms | 1.2648 KOps/s | 1.3234 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8219ms | 0.3874ms | 2.5815 KOps/s | 2.5905 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.1833ms | 0.7646ms | 1.3079 KOps/s | 1.3665 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.1694ms | 0.7559ms | 1.3230 KOps/s | 1.3506 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4983ms | 0.3642ms | 2.7454 KOps/s | 2.7396 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4337ms | 1.0132ms | 986.9366 Ops/s | 998.9880 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.4070ms | 1.0002ms | 999.8276 Ops/s | 1.0091 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.4120ms | 1.0012ms | 998.7986 Ops/s | 1.0093 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5262ms | 2.1204ms | 471.6114 Ops/s | 475.0406 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.2261ms | 0.8245ms | 1.2129 KOps/s | 1.2429 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5705ms | 0.4164ms | 2.4015 KOps/s | 2.3771 KOps/s | |
test_distributed | 2.5495ms | 0.1791ms | 5.5821 KOps/s | 8.7119 KOps/s | |
test_tdmodule | 0.1860ms | 21.1175μs | 47.3541 KOps/s | 48.2634 KOps/s | |
test_tdmodule_dispatch | 0.2315ms | 37.3240μs | 26.7924 KOps/s | 28.1922 KOps/s | |
test_tdseq | 43.0210μs | 22.1488μs | 45.1492 KOps/s | 48.0266 KOps/s | |
test_tdseq_dispatch | 0.2122ms | 40.4741μs | 24.7072 KOps/s | 24.8114 KOps/s | |
test_instantiation_functorch | 1.9439ms | 1.5476ms | 646.1584 Ops/s | 647.9153 Ops/s | |
test_exec_functorch | 0.5618ms | 0.1460ms | 6.8485 KOps/s | 6.8497 KOps/s | |
test_exec_functional_call | 0.3307ms | 0.1373ms | 7.2820 KOps/s | 7.2573 KOps/s | |
test_exec_td_decorator | 0.5993ms | 0.1887ms | 5.3007 KOps/s | 5.1941 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0878ms | 0.6885ms | 1.4524 KOps/s | 1.4498 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9336ms | 0.6901ms | 1.4490 KOps/s | 1.4577 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7686ms | 0.6008ms | 1.6644 KOps/s | 1.6376 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9994ms | 0.6014ms | 1.6628 KOps/s | 1.6712 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.9775ms | 19.3799ms | 51.6000 Ops/s | 52.0165 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.9814ms | 19.3839ms | 51.5891 Ops/s | 52.0417 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.5500ms | 19.1879ms | 52.1163 Ops/s | 52.2612 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.4976ms | 19.2174ms | 52.0363 Ops/s | 52.5283 Ops/s | |
test_to_module_speed[True] | 1.2350ms | 0.9963ms | 1.0037 KOps/s | 984.1275 Ops/s | |
test_to_module_speed[False] | 1.1923ms | 0.9833ms | 1.0169 KOps/s | 1.0057 KOps/s | |
test_tc_init | 96.6520μs | 37.1083μs | 26.9481 KOps/s | 27.2903 KOps/s | |
test_tc_init_nested | 0.1775ms | 74.7785μs | 13.3728 KOps/s | 13.1924 KOps/s | |
test_tc_first_layer_tensor | 22.6710μs | 0.8123μs | 1.2311 MOps/s | 1.1860 MOps/s | |
test_tc_first_layer_nontensor | 20.6710μs | 2.2817μs | 438.2694 KOps/s | 420.7372 KOps/s | |
test_tc_second_layer_tensor | 12.3225μs | 1.4121μs | 708.1884 KOps/s | 669.0214 KOps/s | |
test_tc_second_layer_nontensor | 58.5110μs | 3.0153μs | 331.6433 KOps/s | 321.5717 KOps/s | |
test_unbind | 0.2304s | 12.6061ms | 79.3266 Ops/s | 141.1797 Ops/s | |
test_full_like | 11.3575ms | 10.3736ms | 96.3984 Ops/s | 90.2700 Ops/s | |
test_zeros_like | 10.1420ms | 7.6132ms | 131.3515 Ops/s | 204.0316 Ops/s | |
test_ones_like | 5.3697ms | 4.6556ms | 214.7956 Ops/s | 206.0561 Ops/s | |
test_clone | 13.1081ms | 10.1338ms | 98.6794 Ops/s | 125.2921 Ops/s | |
test_squeeze | 0.1114ms | 9.6817μs | 103.2875 KOps/s | 104.1994 KOps/s | |
test_unsqueeze | 0.1740ms | 72.6407μs | 13.7664 KOps/s | 13.3700 KOps/s | |
test_split | 0.6528ms | 0.1576ms | 6.3435 KOps/s | 6.3293 KOps/s | |
test_permute | 0.3532ms | 0.1891ms | 5.2893 KOps/s | 5.3428 KOps/s | |
test_stack | 54.4714ms | 52.8744ms | 18.9127 Ops/s | 18.6521 Ops/s | |
test_cat | 53.7535ms | 52.5316ms | 19.0362 Ops/s | 18.4552 Ops/s |
vmoens added a commit that referenced this pull request Feb 19, 2025
ghstack-source-id: 27805b68d4663d51f4ecd67f0495de8f83c90c41
Pull Request resolved: #1221
vmoens added a commit that referenced this pull request Feb 19, 2025
ghstack-source-id: 8667892d782a5904e2c5117a1b039edcdaacb9e0
Pull Request resolved: #1221
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
enhancement: New feature or request
Stack from ghstack (oldest at bottom):