Commit da2926f

Remove link to core::arch::x86_64
1 parent 6e00c7b commit da2926f

1 file changed, +35 -0 lines changed

crates/core_simd/src/core_simd_docs.md

@@ -2,3 +2,38 @@ Portable SIMD module.
This module offers a portable abstraction for SIMD operations
that is not bound to any particular hardware architecture.

# What is "portable"?

This module provides a SIMD implementation that is fast and predictable on any target.

### Portable SIMD works on every target

Unlike target-specific SIMD in `std::arch`, portable SIMD compiles for every target.
In this regard, it is just like "regular" Rust.

### Portable SIMD is consistent between targets

A program using portable SIMD can expect identical behavior on any target.
In most regards, [`Simd<T, N>`] can be thought of as a parallelized `[T; N]` and operates like a sequence of `T`.
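
As a rough illustration of the "parallelized `[T; N]`" mental model, the sketch below writes a lane-wise addition over plain arrays in stable Rust. It does not use `std::simd`; `lanewise_add` is a hypothetical helper introduced here only to show that each output lane depends solely on the matching input lanes.

```rust
/// Hypothetical scalar model of a lane-wise addition (not part of `std::simd`).
/// Lane `i` of the result is computed only from lane `i` of each input,
/// which is how a vector add behaves when viewed as a sequence of `f32`.
fn lanewise_add<const N: usize>(a: [f32; N], b: [f32; N]) -> [f32; N] {
    let mut out = [0.0_f32; N];
    for i in 0..N {
        out[i] = a[i] + b[i];
    }
    out
}
```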

This has one notable exception: a handful of older architectures (e.g. `armv7` and `powerpc`) flush [subnormal](`f32::is_subnormal`) `f32` values to zero.
On these architectures, subnormal `f32` input values are replaced with zeros, and any operation producing subnormal `f32` values produces zeros instead.
This doesn't affect most architectures or programs.
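
To make the flushing behavior concrete, here is a minimal scalar sketch of what "replaced with zeros" means for a single `f32`. The helper `flush_subnormal` is hypothetical and only models the description above; it is not how any target implements flushing, and it does not try to model the sign of the resulting zero.

```rust
/// Hypothetical model of flush-to-zero for one `f32` value.
/// Subnormal values become zero; every other value passes through unchanged.
/// Assumption: the sign of the flushed zero is not modeled here.
fn flush_subnormal(x: f32) -> f32 {
    if x.is_subnormal() {
        0.0
    } else {
        x
    }
}

/// Returns a subnormal `f32`: half of the smallest positive normal value.
fn example_subnormal() -> f32 {
    f32::MIN_POSITIVE / 2.0
}
```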

### Operations use the best instructions available

Operations provided by this module compile to the best available SIMD instructions.

Portable SIMD is not a low-level vendor library, and operations in portable SIMD _do not_ necessarily map to a single instruction.
Instead, they map to a reasonable implementation of the operation for the target.

Consistency between targets is not compromised to use faster or fewer instructions.
In some cases, `std::arch` will provide a faster function that has slightly different behavior than the `std::simd` equivalent.
For example, `_mm_min_ps`[^1] can be slightly faster than [`SimdFloat::simd_min`](`num::SimdFloat::simd_min`), but does not conform to the IEEE standard also used by [`f32::min`].
When necessary, [`Simd<T, N>`] can be converted to the types provided by `std::arch` to make use of target-specific functions.
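
The difference in behavior shows up with NaN inputs. The sketch below is a scalar, single-lane model of the footnote's expression `x.simd_lt(y).select(x, y)`, compared against [`f32::min`]. The name `select_min` is hypothetical; the block does not call `_mm_min_ps` or `std::simd`, so it runs on stable Rust on any target.

```rust
/// Hypothetical one-lane model of `x.simd_lt(y).select(x, y)`:
/// pick `x` when `x < y` holds, otherwise pick `y`.
/// Any comparison involving NaN is false, so `y` is returned
/// whenever either input is NaN.
fn select_min(x: f32, y: f32) -> f32 {
    if x < y {
        x
    } else {
        y
    }
}

/// Compares the two approaches on one pair of inputs.
/// Returns `(select_min result, f32::min result)`.
fn compare_min(x: f32, y: f32) -> (f32, f32) {
    (select_min(x, y), x.min(y))
}
```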

Many targets simply don't have SIMD, or don't support SIMD for a particular element type.
In those cases, regular scalar operations are generated instead.

[^1]: `_mm_min_ps(x, y)` is equivalent to `x.simd_lt(y).select(x, y)`
