crates/core_simd/src/core_simd_docs.md (+35)
@@ -2,3 +2,38 @@ Portable SIMD module.
2
2
3
3
This module offers a portable abstraction for SIMD operations
4
4
that is not bound to any particular hardware architecture.
5
+
6
+
# What is "portable"?
7
+
8
+
This module provides a SIMD implementation that is fast and predictable on any target.
9
+
10
+
### Portable SIMD works on every target
11
+
12
+
Unlike target-specific SIMD in `std::arch`, portable SIMD compiles for every target.
13
+
In this regard, it is just like "regular" Rust.
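As a rough illustration of the contrast (a sketch, not part of this module), code written directly against `std::arch` has to be gated per architecture and paired with a fallback. The function name `add4` is hypothetical; the intrinsics are the stable ones from `std::arch::x86_64`.

```rust
/// Hypothetical helper: adds four `f32` lanes using x86_64 SSE intrinsics.
/// This version only exists when compiling for x86_64.
#[cfg(target_arch = "x86_64")]
fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    use std::arch::x86_64::{_mm_add_ps, _mm_loadu_ps, _mm_storeu_ps};
    let mut out = [0.0f32; 4];
    // SAFETY: SSE is part of the x86_64 baseline, and each pointer
    // refers to four contiguous, initialized `f32` values.
    unsafe {
        let sum = _mm_add_ps(_mm_loadu_ps(a.as_ptr()), _mm_loadu_ps(b.as_ptr()));
        _mm_storeu_ps(out.as_mut_ptr(), sum);
    }
    out
}

/// Fallback for every other target: without it, callers of `add4`
/// would fail to compile anywhere except x86_64.
#[cfg(not(target_arch = "x86_64"))]
fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    std::array::from_fn(|i| a[i] + b[i])
}
```

With portable SIMD, a single definition is expected to cover both cases, so neither the `cfg` gates nor the hand-written fallback are needed.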
14
+
15
+
### Portable SIMD is consistent between targets
16
+
17
+
A program using portable SIMD can expect identical behavior on any target.
18
+
In most regards, [`Simd<T, N>`] can be thought of as a parallelized `[T; N]` and operates like a sequence of `T`.
19
+
20
+
This has one notable exception: a handful of older architectures (e.g. `armv7` and `powerpc`) flush [subnormal](`f32::is_subnormal`) `f32` values to zero.
21
+
On these architectures, subnormal `f32` input values are replaced with zeros, and any operation producing subnormal `f32` values produces zeros instead.
22
+
This doesn't affect most architectures or programs.
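The lane-wise model and the flush-to-zero exception can be sketched in plain scalar Rust. This is only a model of the behavior described above, not how the module is implemented: `mul_lanes`, `flush`, and `mul_lanes_ftz` are hypothetical names, and preserving the sign of a flushed zero is an assumption.

```rust
/// Model of multiplying two `Simd<f32, N>` values: every lane behaves
/// like an independent scalar `f32` multiplication.
fn mul_lanes<const N: usize>(a: [f32; N], b: [f32; N]) -> [f32; N] {
    std::array::from_fn(|i| a[i] * b[i])
}

/// Hypothetical helper: replaces a subnormal value with zero.
/// Keeping the sign of the input is an assumption of this sketch.
fn flush(x: f32) -> f32 {
    if x.is_subnormal() { 0.0f32.copysign(x) } else { x }
}

/// Model of the same multiplication on a flush-to-zero target:
/// subnormal inputs are read as zero, and subnormal results become zero.
fn mul_lanes_ftz<const N: usize>(a: [f32; N], b: [f32; N]) -> [f32; N] {
    std::array::from_fn(|i| flush(flush(a[i]) * flush(b[i])))
}
```

For inputs such as `[1.0, f32::MIN_POSITIVE, 1.0e-40, 2.0]` and `[3.0, 0.5, 1.0e30, 4.0]`, the two models agree on the first and last lanes and differ only where a subnormal appears as an input or a result.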
23
+
24
+
### Operations use the best instructions available
25
+
26
+
Operations provided by this module compile to the best available SIMD instructions.
27
+
28
+
Portable SIMD is not a low-level vendor library, and operations in portable SIMD _do not_ necessarily map to a single instruction.
29
+
Instead, they map to a reasonable implementation of the operation for the target.
30
+
31
+
Consistency between targets is not compromised to use faster or fewer instructions.
32
+
In some cases, `std::arch` will provide a faster function that has slightly different behavior than the `std::simd` equivalent.
33
+
For example, `_mm_min_ps`[^1] can be slightly faster than [`SimdFloat::simd_min`](`num::SimdFloat::simd_min`), but does not conform to the IEEE 754 standard also used by [`f32::min`].
34
+
When necessary, [`Simd<T, N>`] can be converted to the types provided by `std::arch` to make use of target-specific functions.
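The behavioral difference shows up with NaN inputs. The following scalar sketch models one lane of `_mm_min_ps` using the equivalence given in the footnote (`x < y ? x : y`) and compares it with `f32::min`. The name `min_ps_lane` is hypothetical, and the sketch does not call the real intrinsic.

```rust
/// Scalar model of a single lane of `_mm_min_ps(x, y)`:
/// returns `x` when `x < y`, otherwise `y`.
/// Any comparison involving NaN is false, so the second operand wins.
fn min_ps_lane(x: f32, y: f32) -> f32 {
    if x < y { x } else { y }
}

/// Applies the lane model across a whole array, like a vector operation.
fn min_ps_model<const N: usize>(x: [f32; N], y: [f32; N]) -> [f32; N] {
    std::array::from_fn(|i| min_ps_lane(x[i], y[i]))
}
```

In this model `min_ps_lane(1.0, f32::NAN)` is NaN while `min_ps_lane(f32::NAN, 1.0)` is `1.0`, so the result depends on operand order. `f32::min` returns `1.0` in both cases, ignoring the NaN.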
35
+
36
+
Many targets simply don't have SIMD, or don't support SIMD for a particular element type.
37
+
In those cases, regular scalar operations are generated instead.
38
+
39
+
[^1]: `_mm_min_ps(x, y)` is equivalent to `x.simd_lt(y).select(x, y)`