commit be480be
Author: Beinsezii <beinsezii@gmail.com>
Date: Fri Jun 7 14:39:37 2024 -0700
More FMA3 ops
Average 2-4% perf gain, should also be more accurate.
Code is a little less readable, but with all the formulae being open source
documents it shouldn't be too bad
commit 0f14713
Author: Beinsezii <beinsezii@gmail.com>
Date: Thu Jun 6 18:28:20 2024 -0700
TESTS use f64 for everything except NaN checks
Lowered epsilons to accommodate this.
commit cd658e2
Author: Beinsezii <beinsezii@gmail.com>
Date: Thu Jun 6 17:47:59 2024 -0700
Fold
commit 5733c9d
Author: Beinsezii <beinsezii@gmail.com>
Date: Thu Jun 6 17:46:29 2024 -0700
Move UTs to separate file
commit efb769d
Author: Beinsezii <beinsezii@gmail.com>
Date: Thu Jun 6 17:42:41 2024 -0700
F64 Part 5: convert_space
Leaving FFI as-is
commit 83d8153
Author: Beinsezii <beinsezii@gmail.com>
Date: Thu Jun 6 17:37:03 2024 -0700
F64 Part 4: More C FFI
commit b09da2b
Author: Beinsezii <beinsezii@gmail.com>
Date: Thu Jun 6 17:28:07 2024 -0700
F64 Part 3: Backward functions
commit 75c7df7
Author: Beinsezii <beinsezii@gmail.com>
Date: Thu Jun 6 17:07:55 2024 -0700
F64 Part 2: Forward functions
commit 6990067
Author: Beinsezii <beinsezii@gmail.com>
Date: Thu Jun 6 16:27:36 2024 -0700
F64 Part 1: Transfer and util functions
commit b28529f
Author: Beinsezii <beinsezii@gmail.com>
Date: Mon Jun 3 18:12:15 2024 -0700
Use macros to quickly define external C fns
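A minimal sketch of what macro-generated C wrappers could look like. The names `c_export`, `double_px`, and `invert_px` are hypothetical and not taken from this repository, and the pointer-plus-length signature is an assumption. A real export would also need a no-mangle attribute so the symbol keeps its name; it is left out here to keep the sketch independent of the Rust edition.

```rust
/// Hypothetical in-place pixel operations standing in for real conversions.
fn double_px(p: &mut [f64; 3]) {
    p.iter_mut().for_each(|c| *c *= 2.0);
}

fn invert_px(p: &mut [f64; 3]) {
    p.iter_mut().for_each(|c| *c = 1.0 - *c);
}

/// Generates an `extern "C"` wrapper that applies `$rust_fn` to every pixel
/// in a caller-provided buffer.
macro_rules! c_export {
    ($c_name:ident, $rust_fn:path) => {
        /// Safety: `pixels` must point to `len` valid, writable `[f64; 3]` values.
        pub unsafe extern "C" fn $c_name(pixels: *mut [f64; 3], len: usize) {
            let slice = unsafe { std::slice::from_raw_parts_mut(pixels, len) };
            slice.iter_mut().for_each($rust_fn);
        }
    };
}

// One line per exported function instead of a hand-written wrapper each time.
c_export!(c_double_px, double_px);
c_export!(c_invert_px, invert_px);
```

Each `c_export!` line expands to a full wrapper, so adding another exported function costs one line.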
commit 7a31d4a
Author: Beinsezii <beinsezii@gmail.com>
Date: Mon Jun 3 17:44:19 2024 -0700
Use custom traits instead of Into<f32>
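As an illustration of the idea only, here is a sketch of a small conversion trait replacing an `Into<f32>` bound. The trait name `Channel` and the function `scale_pixel` are hypothetical and not from this repository; the sketch assumes the internal math runs in `f64`.

```rust
/// Hypothetical trait: any channel type that can round-trip through f64.
pub trait Channel: Copy {
    fn to_f64(self) -> f64;
    fn from_f64(v: f64) -> Self;
}

impl Channel for f32 {
    fn to_f64(self) -> f64 {
        self as f64
    }
    fn from_f64(v: f64) -> Self {
        v as f32
    }
}

impl Channel for f64 {
    fn to_f64(self) -> f64 {
        self
    }
    fn from_f64(v: f64) -> Self {
        v
    }
}

/// Generic over the storage type; the arithmetic itself is done in f64.
pub fn scale_pixel<T: Channel>(pixel: &mut [T; 3], factor: f64) {
    for c in pixel.iter_mut() {
        *c = T::from_f64(c.to_f64() * factor);
    }
}
```

Unlike `Into<f32>`, a custom trait can define the conversion in both directions, so results can be written back in the caller's own type.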
commit e96880b
Author: Beinsezii <beinsezii@gmail.com>
Date: Mon Jun 3 17:18:25 2024 -0700
DT with only f64
Looks clean. Though I might want slices and arrays too? .into() won't
work for that. New trait?
commit ad60e5d
Author: Beinsezii <beinsezii@gmail.com>
Date: Mon Jun 3 16:42:26 2024 -0700
Attempt proper SIMD unweave
Fail. 1000x slower
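For reference, "unweave" here means splitting interleaved RGBRGB... data into one plane per channel so each plane can be processed lane-wise. Below is a plain scalar sketch of that transform and its inverse. It is not the SIMD attempt from this commit, and the names `unweave` and `weave` are assumptions.

```rust
/// Split interleaved [r, g, b, r, g, b, ...] data into three channel planes.
/// Trailing values that do not form a whole pixel are ignored.
pub fn unweave(pixels: &[f64]) -> [Vec<f64>; 3] {
    let n = pixels.len() / 3;
    let mut planes = [
        Vec::with_capacity(n),
        Vec::with_capacity(n),
        Vec::with_capacity(n),
    ];
    for px in pixels.chunks_exact(3) {
        planes[0].push(px[0]);
        planes[1].push(px[1]);
        planes[2].push(px[2]);
    }
    planes
}

/// Inverse of `unweave`: interleave three planes back into pixel order.
/// Stops at the shortest plane.
pub fn weave(planes: &[Vec<f64>; 3]) -> Vec<f64> {
    let n = planes[0].len().min(planes[1].len()).min(planes[2].len());
    let mut out = Vec::with_capacity(n * 3);
    for i in 0..n {
        out.push(planes[0][i]);
        out.push(planes[1][i]);
        out.push(planes[2][i]);
    }
    out
}
```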
commit 20e5927
Author: Beinsezii <beinsezii@gmail.com>
Date: Mon Jun 3 03:58:35 2024 -0700
Gate mul_add() behind FMA3 check
It's crazy slow without FMA3 and the compiler won't auto change between
FMA or not because *technically* it changes the results and that's
a sin in the Rust bible
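A minimal sketch of such a gate, assuming a compile-time check on the `fma` target feature. The helper name `fmadd` is hypothetical, and the actual check used in this commit may differ.

```rust
/// Fused multiply-add only when the build target has hardware FMA.
/// Without it, `mul_add` falls back to a slow software routine, so the
/// plain `a * b + c` form is used instead.
#[inline]
pub fn fmadd(a: f64, b: f64, c: f64) -> f64 {
    if cfg!(target_feature = "fma") {
        // Single rounding step, maps to one FMA instruction.
        a.mul_add(b, c)
    } else {
        // Two rounding steps, but fast on every target.
        a * b + c
    }
}
```

Because `cfg!` is resolved at compile time, the unused branch is removed and there is no runtime cost. Building with `-C target-cpu=native` or `-C target-feature=+fma` selects the fused path on supporting x86-64 hardware.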
commit f51e096
Author: Beinsezii <beinsezii@gmail.com>
Date: Mon Jun 3 03:20:23 2024 -0700
Rustfmt
commit bf2c194
Merge: 34d3a86 f9d3e65
Author: Beinsezii <beinsezii@gmail.com>
Date: Mon Jun 3 03:11:52 2024 -0700
Merge branch 'master' into portable_simd
commit 34d3a86
Author: Beinsezii <beinsezii@gmail.com>
Date: Sun Jun 2 23:31:28 2024 -0700
Convert lrgb_to_xyz to DType
Should be a best case scenario. Literally just element-wise FMA.
Almost +30%: 107µs to 77µs on arch=native
It's *cool* yes but the code quality degrades so much I wonder if it's
even worth it. Then when you factor in the complex 3-dimension
deinterleave that'll be needed to use it properly...
I still have to test it of course, but I just feel it'll eat what little
perf I get. I have AVX512 as well, so AVX≤2 will probably end up hurting
even more.
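To show what "element-wise FMA" means for this conversion, here is a scalar sketch of a linear RGB to XYZ matrix multiply written as chained `mul_add` calls. The matrix is the commonly published sRGB/D65 one and is an assumption; the constants in this repository may differ. The commit itself uses SIMD types rather than this scalar form.

```rust
/// Commonly published linear sRGB -> XYZ (D65) matrix. Assumed values.
const RGB_TO_XYZ: [[f64; 3]; 3] = [
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
];

/// Convert one linear RGB pixel to XYZ in place.
/// Each output channel is r*m0 + g*m1 + b*m2, computed as two fused
/// multiply-adds on top of one plain multiply.
pub fn lrgb_to_xyz(pixel: &mut [f64; 3]) {
    let [r, g, b] = *pixel;
    for (out, row) in pixel.iter_mut().zip(RGB_TO_XYZ.iter()) {
        *out = r.mul_add(row[0], g.mul_add(row[1], b * row[2]));
    }
}
```

With the pixels deinterleaved into channel planes, the same three operations apply to whole vectors of `r`, `g`, and `b` at once, which is why this function is described as a best case.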
commit 138a072
Author: Beinsezii <beinsezii@gmail.com>
Date: Sun Jun 2 22:29:14 2024 -0700
Attempt using `portable_simd`
It's actually no faster than the autovectorized version?
commit 37882b9
Author: Beinsezii <beinsezii@gmail.com>
Date: Sun Jun 2 20:55:42 2024 -0700
Initial F64 + Autovectorize test
Disappointing. Like 10% faster at most. Probably from the branch