From a539558158fd35b28a526671dcf41ffc113c4faa Mon Sep 17 00:00:00 2001 From: lcnr Date: Thu, 29 Feb 2024 10:46:28 +0100 Subject: [PATCH 01/12] add implied bounds doc (#1915) * add implied bounds doc * lazy type aliases also have explicit implied bounds --- src/SUMMARY.md | 1 + src/traits/implied-bounds.md | 84 ++++++++++++++++++++++++++++++++++++ 2 files changed, 85 insertions(+) create mode 100644 src/traits/implied-bounds.md diff --git a/src/SUMMARY.md b/src/SUMMARY.md index 82e0d79aa..8140dc9f0 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -120,6 +120,7 @@ - [Interactions with turbofishing](./turbofishing-and-early-late-bound.md) - [Higher-ranked trait bounds](./traits/hrtb.md) - [Caching subtleties](./traits/caching.md) + - [Implied bounds](./traits/implied-bounds.md) - [Specialization](./traits/specialization.md) - [Chalk-based trait solving](./traits/chalk.md) - [Lowering to logic](./traits/lowering-to-logic.md) diff --git a/src/traits/implied-bounds.md b/src/traits/implied-bounds.md new file mode 100644 index 000000000..911553ad3 --- /dev/null +++ b/src/traits/implied-bounds.md @@ -0,0 +1,84 @@ +# Implied bounds + +We currently add implied region bounds to avoid explicit annotations. e.g. +`fn foo<'a, T>(x: &'a T)` can freely assume that `T: 'a` holds without specifying it. + +There are two kinds of implied bounds: explicit and implicit. Explicit implied bounds +get added to the `fn predicates_of` of the relevant item while implicit ones are +handled... well... implicitly. + +## explicit implied bounds + +The explicit implied bounds are computed in [`fn inferred_outlives_of`]. Only ADTs and +lazy type aliases have explicit implied bounds which are computed via a fixpoint algorithm +in the [`fn inferred_outlives_crate`] query. + +We use [`fn insert_required_predicates_to_be_wf`] on all fields of all ADTs in the crate. +This function computes the outlives bounds for each component of the field using a +separate implementation. 
+ +For ADTs, trait objects, and associated types the initially required predicates are +computed in [`fn check_explicit_predicates`]. This simply uses `fn explicit_predicates_of` +without elaborating them. + +Region predicates are added via [`fn insert_outlives_predicate`]. This function takes +an outlives predicate, decomposes it and adds the components as explicit predicates only +if the outlived region is a region parameter. [It does not add `'static` requirements][nostatic]. + + [`fn inferred_outlives_of`]: https://github.com/rust-lang/rust/blob/5b8bc568d28b2e922290c9a966b3231d0ce9398b/compiler/rustc_hir_analysis/src/outlives/mod.rs#L20 + [`fn inferred_outlives_crate`]: https://github.com/rust-lang/rust/blob/5b8bc568d28b2e922290c9a966b3231d0ce9398b/compiler/rustc_hir_analysis/src/outlives/mod.rs#L83 + [`fn insert_required_predicates_to_be_wf`]: https://github.com/rust-lang/rust/blob/5b8bc568d28b2e922290c9a966b3231d0ce9398b/compiler/rustc_hir_analysis/src/outlives/implicit_infer.rs#L89 + [`fn check_explicit_predicates`]: https://github.com/rust-lang/rust/blob/5b8bc568d28b2e922290c9a966b3231d0ce9398b/compiler/rustc_hir_analysis/src/outlives/implicit_infer.rs#L238 + [`fn insert_outlives_predicate`]: https://github.com/rust-lang/rust/blob/5b8bc568d28b2e922290c9a966b3231d0ce9398b/compiler/rustc_hir_analysis/src/outlives/utils.rs#L15 + [nostatic]: https://github.com/rust-lang/rust/blob/5b8bc568d28b2e922290c9a966b3231d0ce9398b/compiler/rustc_hir_analysis/src/outlives/utils.rs#L159-L165 + +## implicit implied bounds + +As we are unable to handle implications in binders yet, we cannot simply add the outlives +requirements of impls and functions as explicit predicates. + +### using implicit implied bounds as assumptions + +These bounds are not added to the `ParamEnv` of the affected item itself. For lexical +region resolution they are added using [`fn OutlivesEnvironment::with_bounds`]. 
+Similarly, during MIR borrowck we add them using +[`fn UniversalRegionRelationsBuilder::add_implied_bounds`]. + +[We add implied bounds for the function signature and impl header in MIR borrowck][mir]. +Outside of MIR borrowck we add the outlives requirements for the types returned by the +[`fn assumed_wf_types`] query. + +The assumed outlives constraints for implicit bounds are computed using the +[`fn implied_outlives_bounds`] query. This directly +[extracts the required outlives bounds from `fn wf::obligations`][boundsfromty]. + +MIR borrowck adds the outlives constraints for both the normalized and unnormalized types, +while lexical region resolution [only uses the unnormalized types][notnorm]. + +[`fn OutlivesEnvironment::with_bounds`]: https://github.com/rust-lang/rust/blob/5b8bc568d28b2e922290c9a966b3231d0ce9398b/compiler/rustc_infer/src/infer/outlives/env.rs#L90-L97 +[`fn UniversalRegionRelationsBuilder::add_implied_bounds`]: https://github.com/rust-lang/rust/blob/5b8bc568d28b2e922290c9a966b3231d0ce9398b/compiler/rustc_borrowck/src/type_check/free_region_relations.rs#L316 +[mir]: https://github.com/rust-lang/rust/blob/91cae1dcdcf1a31bd8a92e4a63793d65cfe289bb/compiler/rustc_borrowck/src/type_check/free_region_relations.rs#L258-L332 +[`fn assumed_wf_types`]: https://github.com/rust-lang/rust/blob/5b8bc568d28b2e922290c9a966b3231d0ce9398b/compiler/rustc_ty_utils/src/implied_bounds.rs#L21 +[`fn implied_outlives_bounds`]: https://github.com/rust-lang/rust/blob/5b8bc568d28b2e922290c9a966b3231d0ce9398b/compiler/rustc_traits/src/implied_outlives_bounds.rs#L18C4-L18C27 +[boundsfromty]: https://github.com/rust-lang/rust/blob/5b8bc568d28b2e922290c9a966b3231d0ce9398b/compiler/rustc_trait_selection/src/traits/query/type_op/implied_outlives_bounds.rs#L95-L96 +[notnorm]: https://github.com/rust-lang/rust/blob/91cae1dcdcf1a31bd8a92e4a63793d65cfe289bb/compiler/rustc_trait_selection/src/traits/engine.rs#L227-L250 + +### proving implicit implied bounds + +As the implicit implied
bounds are not included in `fn predicates_of` we have to +separately make sure they actually hold. We generally handle this by checking that +all used types are well formed by emitting `WellFormed` predicates. + +We cannot emit `WellFormed` predicates when instantiating impls, as this would result +in - currently often inductive - trait solver cycles. We also do not emit constraints +involving higher ranked regions as we're lacking the implied bounds from their binder. + +This results in multiple unsoundnesses: +- by using subtyping: [#25860] +- by using super trait upcasting for a higher ranked trait bound: [#84591] +- by being able to normalize a projection when using an impl while not being able + to normalize it when checking the impl: [#100051] + +[#25860]: https://github.com/rust-lang/rust/issues/25860 +[#84591]: https://github.com/rust-lang/rust/issues/84591 +[#100051]: https://github.com/rust-lang/rust/issues/100051 \ No newline at end of file From 231c30fcfacf047d79fcb848f63aa5830821a63a Mon Sep 17 00:00:00 2001 From: lcnr Date: Fri, 1 Mar 2024 13:11:55 +0100 Subject: [PATCH 02/12] opaque types in new solver (#1918) * add opaque types doc * summary --- src/SUMMARY.md | 1 + src/solve/opaque-types.md | 119 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 120 insertions(+) create mode 100644 src/solve/opaque-types.md diff --git a/src/SUMMARY.md b/src/SUMMARY.md index 8140dc9f0..e7cba82c7 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -134,6 +134,7 @@ - [Coinduction](./solve/coinduction.md) - [Proof trees](./solve/proof-trees.md) - [Normalization](./solve/normalization.md) + - [Opaque types](./solve/opaque-types.md) - [`Unsize` and `CoerceUnsized` traits](./traits/unsize.md) - [Type checking](./type-checking.md) - [Method Lookup](./method-lookup.md) diff --git a/src/solve/opaque-types.md b/src/solve/opaque-types.md new file mode 100644 index 000000000..5a9920a49 --- /dev/null +++ b/src/solve/opaque-types.md @@ -0,0 +1,119 @@ +# Opaque types in the 
new solver + +The way opaque types are handled in the new solver differs from the old implementation. +This should be a self-contained explanation of the behavior in the new solver. + +## opaques are alias types + +Opaque types are treated the same as other aliases, most notably associated types, +whenever possible. There should be as few divergences in behavior as possible. + +This is desirable, as they are very similar to other alias types, in that they can be +normalized to their hidden type and also have the same requirements for completeness. +Treating them this way also reduces the complexity of the type system by sharing code. +Having to deal with opaque types separately results in more complex rules and new kinds +of interactions. As we need to treat them like other aliases in the implicit-negative +mode, having significant differences between modes also adds complexity. + +*open question: is there an alternative approach here, maybe by treating them more like rigid +types with more limited places to instantiate them? they would still have to be ordinary +aliases during coherence* + +### `normalizes-to` for opaques + +[source][norm] + +`normalizes-to` is used to define the one-step normalization behavior for aliases in the new +solver: `<<T as IdInner>::Assoc as IdOuter>::Assoc` first normalizes to `<T as IdInner>::Assoc` +which then normalizes to `T`. It takes both the `AliasTy` which is getting normalized and the +expected `Term`. To use `normalizes-to` for actual normalization, the expected term can simply +be an unconstrained inference variable. + +For opaque types in the defining scope and in the implicit-negative coherence mode, this is +always done in two steps. Outside of the defining scope `normalizes-to` for opaques always +returns `Err(NoSolution)`. + +We start by trying to assign the expected type as a hidden type. + +In the implicit-negative coherence mode, this currently always results in ambiguity without +interacting with the opaque types storage.
We could instead allow 'defining' all opaque types, +discarding their inferred types at the end, changing the behavior if an opaque type is used +multiple times during coherence: [example][coherence-example] + +Inside of the defining scope we start by checking whether the type and const arguments of the +opaque are all placeholders: [source][placeholder-ck]. If this check is ambiguous, +return ambiguity; if it fails, return `Err(NoSolution)`. This check ignores regions which are +only checked at the end of borrowck. If it succeeds, continue. + +We then check whether we're able to *semantically* unify the generic arguments of the opaque +with the arguments of any opaque type already in the opaque types storage. If so, we unify the +previously stored type with the expected type of this `normalizes-to` call: [source][eq-prev][^1]. + +If not, we insert the expected type in the opaque types storage: [source][insert-storage][^2]. +Finally, we check whether the item bounds of the opaque hold for the expected type: [source][item-bounds-ck].
+ +[norm]: https://github.com/rust-lang/rust/blob/384d26fc7e3bdd7687cc17b2662b091f6017ec2a/compiler/rustc_trait_selection/src/solve/normalizes_to/opaque_types.rs#L13 +[coherence-example]: https://github.com/rust-lang/rust/blob/master/tests/ui/type-alias-impl-trait/coherence_different_hidden_ty.rs +[placeholder-ck]: https://github.com/rust-lang/rust/blob/384d26fc7e3bdd7687cc17b2662b091f6017ec2a/compiler/rustc_trait_selection/src/solve/normalizes_to/opaque_types.rs#L33 +[check-storage]: https://github.com/rust-lang/rust/blob/384d26fc7e3bdd7687cc17b2662b091f6017ec2a/compiler/rustc_trait_selection/src/solve/normalizes_to/opaque_types.rs#L51-L52 +[eq-prev]: https://github.com/rust-lang/rust/blob/384d26fc7e3bdd7687cc17b2662b091f6017ec2a/compiler/rustc_trait_selection/src/solve/normalizes_to/opaque_types.rs#L51-L59 +[insert-storage]: https://github.com/rust-lang/rust/blob/384d26fc7e3bdd7687cc17b2662b091f6017ec2a/compiler/rustc_trait_selection/src/solve/normalizes_to/opaque_types.rs#L68 +[item-bounds-ck]: https://github.com/rust-lang/rust/blob/384d26fc7e3bdd7687cc17b2662b091f6017ec2a/compiler/rustc_trait_selection/src/solve/normalizes_to/opaque_types.rs#L69-L74 +[^1]: FIXME: this should ideally only result in a unique candidate given that we require the args to be placeholders and regions are always inference vars +[^2]: FIXME: why do we check whether the expected type is rigid for this. + +### using alias-bounds of normalizable aliases + +https://github.com/rust-lang/trait-system-refactor-initiative/issues/77 + +Using an `AliasBound` candidate for normalizable aliases is generally not possible as an +associated type can have stronger bounds than the resulting type when normalizing via a +`ParamEnv` candidate. + +These candidates would change our exact normalization strategy to be user-facing. It is otherwise +pretty much unobservable whether we eagerly normalize.
Where we normalize is something we likely +want to change after removing support for the old solver, so that would be undesirable. + +## opaque types can be defined anywhere + +Opaque types in their defining-scope can be defined anywhere, whether when simply relating types +or in the trait solver. This removes order dependence and incompleteness. Without this the result +of a goal can differ due to subtle reasons, e.g. whether we try to evaluate a goal using the +opaque before the first defining use of the opaque. + +## higher ranked opaque types in their defining scope + +These are not supported and trying to define them right now should always error. + +FIXME: Because looking up opaque types in the opaque type storage can now unify regions, +we have to eagerly check that the opaque type does not reference placeholders. We otherwise +end up leaking placeholders. + +## member constraints + +The handling of member constraints does not change in the new solver. See the +[relevant existing chapter][member-constraints] for that. + +[member-constraints]: https://rustc-dev-guide.rust-lang.org/borrow_check/region_inference/member_constraints.html + +## calling methods on opaque types + +FIXME: We need to continue to support calling methods on still unconstrained +opaque types in their defining scope. It's unclear how to best do this. +```rust +use std::future::Future; +use futures::FutureExt; + +fn go(i: usize) -> impl Future + Send + 'static { + async move { + if i != 0 { + // This returns `impl Future` in its defining scope, + // we don't know the concrete type of that opaque at this point. + // Currently treats the opaque as a known type and succeeds, but + // from the perspective of "easiest to soundly implement", it would + // be good for this to be ambiguous.
+ go(i - 1).boxed().await; + } + } +} +``` \ No newline at end of file From 9ef55c55db824923b02c69c1545e069cd311f42c Mon Sep 17 00:00:00 2001 From: Nilstrieb <48135649+Nilstrieb@users.noreply.github.com> Date: Fri, 1 Mar 2024 21:26:19 +0100 Subject: [PATCH 03/12] make shell.nix better (#1858) * make shell.nix better * Mention using RUST_BOOTSTRAP_CONFIG * Move things to `buildInputs` and add `glibc.out glibc.static` This fixes the nofile-limit.rs UI test. * short lines for the short line fans * Fix pkgs --- src/building/suggested.md | 81 ++++++++++++--------------------------- 1 file changed, 24 insertions(+), 57 deletions(-) diff --git a/src/building/suggested.md b/src/building/suggested.md index d4938cbf8..cb415a198 100644 --- a/src/building/suggested.md +++ b/src/building/suggested.md @@ -276,67 +276,34 @@ If you're using nix, you can use the following nix-shell to work on Rust: ```nix { pkgs ? import {} }: - -# This file contains a development shell for working on rustc. -let - # Build configuration for rust-lang/rust. Based on `config.example.toml` (then called - # `config.toml.example`) from `1bd30ce2aac40c7698aa4a1b9520aa649ff2d1c5` - config = pkgs.writeText "rustc-config" '' - profile = "compiler" # you may want to choose a different profile, like `library` or `tools` - - [build] - patch-binaries-for-nix = true - # The path to (or name of) the GDB executable to use. This is only used for - # executing the debuginfo test suite. - gdb = "${pkgs.gdb}/bin/gdb" - python = "${pkgs.python3Full}/bin/python" - - [rust] - debug = true - incremental = true - deny-warnings = false - - # Indicates whether some LLVM tools, like llvm-objdump, will be made available in the - # sysroot. - llvm-tools = true - - # Print backtrace on internal compiler errors during bootstrap - backtrace-on-ice = true - ''; - - ripgrepConfig = - let - # Files that are ignored by ripgrep when searching. 
- ignoreFile = pkgs.writeText "rustc-rgignore" '' - configure - config.example.toml - x.py - LICENSE-MIT - LICENSE-APACHE - COPYRIGHT - **/*.txt - **/*.toml - **/*.yml - **/*.nix - *.md - src/ci - src/etc/ - src/llvm-emscripten/ - src/llvm-project/ - src/rtstartup/ - src/rustllvm/ - src/stdsimd/ - src/tools/rls/rls-analysis/test_data/ - ''; - in - pkgs.writeText "rustc-ripgreprc" "--ignore-file=${ignoreFile}"; -in pkgs.mkShell { name = "rustc"; nativeBuildInputs = with pkgs; [ - gcc_multi binutils cmake ninja openssl pkgconfig python39 git curl cacert patchelf nix psutils + binutils cmake ninja pkg-config python3 git curl cacert patchelf nix + ]; + buildInputs = with pkgs; [ + openssl glibc.out glibc.static ]; - RIPGREP_CONFIG_PATH = ripgrepConfig; + # Avoid creating text files for ICEs. + RUSTC_ICE = "0"; +} +``` + +Note that when using nix on a non-NixOS distribution, it may be necessary to set +**`patch-binaries-for-nix = true` in `config.toml`**. +Bootstrap tries to detect whether it's running in nix and enable patching automatically, +but this detection can have false negatives. + +You can also use your nix shell to manage `config.toml`: + +```nix +let + config = pkgs.writeText "rustc-config" '' + # Your config.toml content goes here + '' +pkgs.mkShell { + /* ... */ + # This environment variable tells bootstrap where our config.toml is.
RUST_BOOTSTRAP_CONFIG = config; } ``` From cf9fb8804962ec1375b5e0bf70a0b0ba1835cf05 Mon Sep 17 00:00:00 2001 From: Christopher Smyth Date: Fri, 1 Mar 2024 17:20:06 -0500 Subject: [PATCH 04/12] Add some more details on feature gating (#1891) * Add some more details on feature gating * Apply suggestions from code review --------- Co-authored-by: Ross Smyth Co-authored-by: Nilstrieb <48135649+Nilstrieb@users.noreply.github.com> --- src/implementing_new_features.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/src/implementing_new_features.md b/src/implementing_new_features.md index 0140c09bb..1b9170e49 100644 --- a/src/implementing_new_features.md +++ b/src/implementing_new_features.md @@ -123,6 +123,8 @@ a new unstable feature: 1. Add the feature name to `rustc_span/src/symbol.rs` in the `Symbols {...}` block. + Note that this block must be in alphabetical order. + 1. Add a feature gate declaration to `rustc_feature/src/unstable.rs` in the unstable `declare_features` block. @@ -171,9 +173,13 @@ a new unstable feature: For an example of adding an error, see [#81015]. For features introducing new syntax, pre-expansion gating should be used instead. - To do so, extend the [`GatedSpans`] struct, add spans to it during parsing, - and then finally feature-gate all the spans in - [`rustc_ast_passes::feature_gate::check_crate`]. + During parsing, when the new syntax is parsed, the symbol must be inserted into the + current crate's [`GatedSpans`] via `self.sess.gated_spans.gate(sym::my_feature, span)`. + + After being inserted into the gated spans, the span must be checked in the + [`rustc_ast_passes::feature_gate::check_crate`] function, which actually denies + features. Exactly how it is gated depends on the exact type of feature, but most + likely will use the `gate_all!()` macro. 1. Add a test to ensure the feature cannot be used without a feature gate, by creating `tests/ui/feature-gates/feature-gate-$feature_name.rs`.
From a9ab50ba686a82591bd4bcd4f75d9548c0ddf1cf Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E8=AE=B8=E6=9D=B0=E5=8F=8B=20Jieyou=20Xu=20=28Joe=29?= <39484203+jieyouxu@users.noreply.github.com> Date: Fri, 1 Mar 2024 22:22:50 +0000 Subject: [PATCH 05/12] Update run-make test description (#1920) --- src/tests/compiletest.md | 40 +++++++++++++++++++++++++++++++++++----- src/tests/headers.md | 3 ++- 2 files changed, 37 insertions(+), 6 deletions(-) diff --git a/src/tests/compiletest.md b/src/tests/compiletest.md index 46dfe1c00..aa8d45d91 100644 --- a/src/tests/compiletest.md +++ b/src/tests/compiletest.md @@ -63,7 +63,8 @@ The following test suites are available, with links for more information: - [`codegen-units`](#codegen-units-tests) — tests for codegen unit partitioning - [`assembly`](#assembly-tests) — verifies assembly output - [`mir-opt`](#mir-opt-tests) — tests for MIR generation -- [`run-make`](#run-make-tests) — general purpose tests using a Makefile +- [`run-make`](#run-make-tests) — general purpose tests using Rust programs (or + Makefiles (legacy)) - `run-make-fulldeps` — `run-make` tests which require a linkable build of `rustc`, or the rust demangler - [`run-pass-valgrind`](#valgrind-tests) — tests run with Valgrind @@ -368,15 +369,43 @@ your test, causing separate files to be generated for 32bit and 64bit systems. ### `run-make` tests -The tests in [`tests/run-make`] are general-purpose tests using Makefiles -which provide the ultimate in flexibility. -These should be used as a last resort. -If possible, you should use one of the other test suites. +> NOTE: +> We are planning to migrate all existing Makefile-based `run-make` tests +> to Rust recipes. You should not be adding new Makefile-based `run-make` +> tests. + +The tests in [`tests/run-make`] are general-purpose tests using Rust *recipes*, +which are small programs allowing arbitrary Rust code such as `rustc` +invocations, and is supported by a [`run_make_support`] library. 
Using Rust +recipes provide the ultimate in flexibility. + +*These should be used as a last resort*. If possible, you should use one of the +other test suites. + If there is some minor feature missing which you need for your test, consider extending compiletest to add a header command for what you need. However, if running a bunch of commands is really what you need, `run-make` is here to the rescue! +#### Using Rust recipes + +Each test should be in a separate directory with a `rmake.rs` Rust program, +called the *recipe*. A recipe will be compiled and executed by compiletest +with the `run_make_support` library linked in. + +If you need new utilities or functionality, consider extending and improving +the [`run_make_support`] library. + +Two `run-make` tests are ported over to Rust recipes as examples: + +- +- + +#### Using Makefiles (legacy) + +> NOTE: +> You should avoid writing new Makefile-based `run-make` tests. + Each test should be in a separate directory with a `Makefile` indicating the commands to run. There is a [`tools.mk`] Makefile which you can include which provides a bunch of @@ -385,6 +414,7 @@ Take a look at some of the other tests for some examples on how to get started. [`tools.mk`]: https://github.com/rust-lang/rust/blob/master/tests/run-make/tools.mk [`tests/run-make`]: https://github.com/rust-lang/rust/tree/master/tests/run-make +[`run_make_support`]: https://github.com/rust-lang/rust/tree/master/src/tools/run-make-support ### Valgrind tests diff --git a/src/tests/headers.md b/src/tests/headers.md index 207716836..3964ba9f8 100644 --- a/src/tests/headers.md +++ b/src/tests/headers.md @@ -5,7 +5,8 @@ Header commands are special comments that tell compiletest how to build and interpret a test. They must appear before the Rust source in the test. -They may also appear in Makefiles for [run-make tests](compiletest.md#run-make-tests). +They may also appear in legacy Makefiles for +[run-make tests](compiletest.md#run-make-tests). 
They are normally put after the short comment that explains the point of this test. Compiletest test suites use `//@` to signal that a comment is a header. From d866c3863c4a3b71ed406b1d92e0b26f45f2a74f Mon Sep 17 00:00:00 2001 From: Arthur Milchior Date: Fri, 1 Mar 2024 23:26:16 +0100 Subject: [PATCH 06/12] Use different type in an example (#1908) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * Use different type in an example Sentences such as «without the argument u32» were ambiguous, as there were two distinct u32. Having a single one, the one in the monomorphization of the type, removes the ambiguity. * Update src/ty.md --------- Co-authored-by: Nilstrieb <48135649+Nilstrieb@users.noreply.github.com> --- src/ty.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/ty.md b/src/ty.md index 80a72b116..c1bf6315b 100644 --- a/src/ty.md +++ b/src/ty.md @@ -281,7 +281,7 @@ modules choose to import a larger or smaller set of names explicitly. Let's consider the example of a type like `MyStruct<u32>`, where `MyStruct` is defined like so: ```rust,ignore -struct MyStruct<T> { x: u32, y: T } +struct MyStruct<T> { x: u8, y: T } ``` The type `MyStruct<u32>` would be an instance of `TyKind::Adt`: From fbea74600240689222d103f66ddabb17b44153a3 Mon Sep 17 00:00:00 2001 From: Stuart Cook Date: Sat, 2 Mar 2024 09:30:17 +1100 Subject: [PATCH 07/12] Add compiletest docs for FileCheck prefixes and `//@ filecheck-flags:` (#1914) This patch also adds docs for `//@ llvm-cov-flags:`, and notes that coverage tests support revisions (though none of the current tests actually do so).
--- src/tests/compiletest.md | 16 ++++++++++++++++ src/tests/headers.md | 16 ++++++++++++++++ 2 files changed, 32 insertions(+) diff --git a/src/tests/compiletest.md b/src/tests/compiletest.md index aa8d45d91..175e23b12 100644 --- a/src/tests/compiletest.md +++ b/src/tests/compiletest.md @@ -598,6 +598,21 @@ fn test_foo() { } ``` +In test suites that use the LLVM [FileCheck] tool, the current revision name is +also registered as an additional prefix for FileCheck directives: + +```rust,ignore +//@ revisions: NORMAL COVERAGE +//@ [COVERAGE] compile-flags: -Cinstrument-coverage +//@ [COVERAGE] needs-profiler-support + +// COVERAGE: @__llvm_coverage_mapping +// NORMAL-NOT: @__llvm_coverage_mapping + +// CHECK: main +fn main() {} +``` + Note that not all headers have meaning when customized to a revision. For example, the `ignore-test` header (and all "ignore" headers) currently only apply to the test as a whole, not to particular @@ -609,6 +624,7 @@ Following is classes of tests that support revisions: - UI - assembly - codegen +- coverage - debuginfo - rustdoc UI tests - incremental (these are special in that they inherently cannot be run in parallel) diff --git a/src/tests/headers.md b/src/tests/headers.md index 3964ba9f8..c64982670 100644 --- a/src/tests/headers.md +++ b/src/tests/headers.md @@ -95,6 +95,9 @@ found in [`header.rs`] from the compiletest source. for a known bug that has not yet been fixed * [Assembly](compiletest.md#assembly-tests) headers * `assembly-output` — the type of assembly output to check +* [Tool-specific headers](#tool-specific-headers) + * `filecheck-flags` - passes extra flags to the `FileCheck` tool + * `llvm-cov-flags` - passes extra flags to the `llvm-cov` tool ### Ignoring tests @@ -231,6 +234,19 @@ test suites. to be loaded by the host compiler. 
+### Tool-specific headers + +The following headers affect how certain command-line tools are invoked, +in test suites that use those tools: + +* `filecheck-flags` adds extra flags when running LLVM's `FileCheck` tool. + - Used by [codegen tests](compiletest.md#codegen-tests), + [assembly tests](compiletest.md#assembly-tests), and + [MIR-opt tests](compiletest.md#mir-opt-tests). +* `llvm-cov-flags` adds extra flags when running LLVM's `llvm-cov` tool. + - Used by [coverage tests](compiletest.md#coverage-tests) in `coverage-run` mode. + + ## Substitutions Headers values support substituting a few variables which will be replaced From 3af8b74e5456d044a30fba8dd2d42812025c55e4 Mon Sep 17 00:00:00 2001 From: lcnr Date: Mon, 4 Mar 2024 17:26:06 +0100 Subject: [PATCH 08/12] next-solver: document caching (#1923) --- src/SUMMARY.md | 1 + src/solve/caching.md | 111 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 112 insertions(+) create mode 100644 src/solve/caching.md diff --git a/src/SUMMARY.md b/src/SUMMARY.md index e7cba82c7..5ac26f150 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -132,6 +132,7 @@ - [The solver](./solve/the-solver.md) - [Canonicalization](./solve/canonicalization.md) - [Coinduction](./solve/coinduction.md) + - [Caching](./solve/caching.md) - [Proof trees](./solve/proof-trees.md) - [Normalization](./solve/normalization.md) - [Opaque types](./solve/opaque-types.md) diff --git a/src/solve/caching.md b/src/solve/caching.md new file mode 100644 index 000000000..cbe96757c --- /dev/null +++ b/src/solve/caching.md @@ -0,0 +1,111 @@ +# Caching in the new trait solver + +Caching results of the trait solver is necessary for performance. +We have to make sure that it is sound. 
Caching is handled by the +[`SearchGraph`] + +[`SearchGraph`]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L102-L117 + +## The global cache + +At its core, the cache is fairly straightforward. When evaluating a goal, we +check whether it's in the global cache. If so, we reuse that entry. If not, we +compute the goal and then store its result in the cache. + +To handle incremental compilation the computation of a goal happens inside of +[`DepGraph::with_anon_task`] which creates a new `DepNode` which depends on all queries +used inside of this computation. When accessing the global cache we then read this +`DepNode`, manually adding a dependency edge to all the queries used: [source][wdn]. + +### Dealing with overflow + +Hitting the recursion limit is not fatal in the new trait solver but instead simply +causes it to return ambiguity: [source][overflow]. Whether we hit the recursion limit +can therefore change the result without resulting in a compilation failure. This +means we must consider the remaining available depth when accessing a cache result. + +We do this by storing more information in the cache entry. For goals whose evaluation +did not reach the recursion limit, we simply store its reached depth: [source][req-depth]. +These results can freely be used as long as the current `available_depth` is higher than +its `reached_depth`: [source][req-depth-ck]. We then update the reached depth of the +current goal to make sure that whether we've used the global cache entry is not +observable: [source][update-depth]. + +For goals which reach the recursion limit we currently only use the cached result if the +available depth *exactly matches* the depth of the entry. The cache entry for each goal +therefore contains a separate result for each remaining depth: [source][rem-depth].[^1] + +## Handling cycles + +The trait solver has to support cycles. 
These cycles are either inductive or coinductive, +depending on the participating goals. See the [chapter on coinduction] for more details. +We distinguish between the cycle heads and the cycle root: a stack entry is a +cycle head if it is recursively accessed. The *root* is the deepest goal on the stack which +is involved in any cycle. Given the following dependency tree, `A` and `B` are both cycle +heads, while only `A` is a root. + +```mermaid +graph TB + A --> B + B --> C + C --> B + C --> A +``` + +The result of cycle participants depends on the result of goals still on the stack. +However, we are still computing those goals, so their results are not yet known. This is +handled by evaluating cycle heads until we reach a fixpoint. In the first iteration, we +return either success or overflow with no constraints, depending on whether the cycle is +coinductive: [source][initial-prov-result]. After evaluating the head of a cycle, we +check whether its [`provisional_result`] is equal to the result of this iteration. If so, +we've finished evaluating this cycle and return its result. If not, we update the provisional +result and reevaluate the goal: [source][fixpoint]. After the first iteration it does not +matter whether cycles are coinductive or inductive. We always use the provisional result. + +### Only caching cycle roots + +We cannot move the result of any cycle participant to the global cache until we've +finished evaluating the cycle root. However, even after we've completely evaluated the +cycle, we are still forced to discard the result of all participants apart from the root +itself. + +We track the query dependencies of all global cache entries. This causes the caching of +cycle participants to be non-trivial. We cannot simply reuse the `DepNode` of the cycle +root.[^2] If we have a cycle `A -> B -> A`, then the `DepNode` for `A` contains a dependency +from `A -> B`. Reusing this entry for `B` may break if the source is changed.
The `B -> A` +edge may not exist anymore and `A` may have been completely removed. This can easily result +in an ICE. + +However, it's even worse as the result of a cycle can change depending on which goal is +the root: [example][unstable-result-ex]. This forces us to weaken caching even further. +We must not use a cache entry of a cycle root if there exists a stack entry which was +a participant of its cycle involving that root. We do this by storing all cycle participants +of a given root in its global cache entry and checking that it contains no element of the +stack: [source][cycle-participants]. + +### The provisional cache + +TODO: write this :3 + +- stack dependence of provisional results +- edge case: provisional cache impacts behavior + + +[`DepGraph::with_anon_task`]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L391 +[wdn]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_middle/src/traits/solve/cache.rs#L78 +[overflow]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L276 +[req-depth]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_middle/src/traits/solve/cache.rs#L102 +[req-depth-ck]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_middle/src/traits/solve/cache.rs#L76-L86 +[update-depth]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L308 +[rem-depth]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_middle/src/traits/solve/cache.rs#L124 +[^1]: This is overly restrictive: if all nested goals return the overflow response with some +available depth `n`, then their result should be the same for any depths smaller
than `n`. +We can implement this optimization in the future. +[chapter on coinduction]: ./coinduction.md +[`provisional_result`]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L57 +[initial-prov-result]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L366-L370 +[fixpoint]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L425-L446 +[^2]: summarizing the relevant [zulip thread] +[zulip thread]: https://rust-lang.zulipchat.com/#narrow/stream/364551-t-types.2Ftrait-system-refactor/topic/global.20cache +[unstable-result-ex]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/tests/ui/traits/next-solver/cycles/coinduction/incompleteness-unstable-result.rs#L4-L16 +[cycle-participants]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_middle/src/traits/solve/cache.rs#L72-L74 \ No newline at end of file From 0d04d711d4564241d2d27180cdead5480deacbda Mon Sep 17 00:00:00 2001 From: lcnr Date: Mon, 4 Mar 2024 17:31:01 +0100 Subject: [PATCH 09/12] unfk links --- src/solve/caching.md | 1 + 1 file changed, 1 insertion(+) diff --git a/src/solve/caching.md b/src/solve/caching.md index cbe96757c..92b27c2f4 100644 --- a/src/solve/caching.md +++ b/src/solve/caching.md @@ -101,6 +101,7 @@ TODO: write this :3 [^1]: This is overly restrictive: if all nested goals return the overflow response with some available depth `n`, then their result should be the same for any depths smaller than `n`. We can implement this optimization in the future.
+ [chapter on coinduction]: ./coinduction.md [`provisional_result`]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L57 [initial-prov-result]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L366-L370 From e082dc6f003e88a3cefaf2da307d5c9990b3045c Mon Sep 17 00:00:00 2001 From: lcnr Date: Mon, 4 Mar 2024 17:33:19 +0100 Subject: [PATCH 10/12] and again --- src/solve/caching.md | 1 + 1 file changed, 1 insertion(+) diff --git a/src/solve/caching.md b/src/solve/caching.md index 92b27c2f4..0ef8b7bd1 100644 --- a/src/solve/caching.md +++ b/src/solve/caching.md @@ -107,6 +107,7 @@ We can implement this optimization in the future. [initial-prov-result]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L366-L370 [fixpoint]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_trait_selection/src/solve/search_graph.rs#L425-L446 [^2]: summarizing the relevant [zulip thread] + [zulip thread]: https://rust-lang.zulipchat.com/#narrow/stream/364551-t-types.2Ftrait-system-refactor/topic/global.20cache [unstable-result-ex]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/tests/ui/traits/next-solver/cycles/coinduction/incompleteness-unstable-result.rs#L4-L16 [cycle-participants]: https://github.com/rust-lang/rust/blob/7606c13961ddc1174b70638e934df0439b7dc515/compiler/rustc_middle/src/traits/solve/cache.rs#L72-L74 \ No newline at end of file From d43fff786b27988d74263556ea9202077de3f47a Mon Sep 17 00:00:00 2001 From: Tbkhi <157125900+Tbkhi@users.noreply.github.com> Date: Mon, 4 Mar 2024 16:00:53 -0400 Subject: [PATCH 11/12] Update overview.md (#1898) * Update overview.md Various link addition and minor edits for clarity. 
* generic improvements * fix line lengths for ci/cd --------- Co-authored-by: Tbkhi Co-authored-by: Oliver Dechant --- src/overview.md | 328 +++++++++++++++++++++++++----------------------- 1 file changed, 172 insertions(+), 156 deletions(-) diff --git a/src/overview.md b/src/overview.md index fb0b07e63..6708d860c 100644 --- a/src/overview.md +++ b/src/overview.md @@ -6,25 +6,24 @@ This chapter is about the overall process of compiling a program -- how everything fits together. The Rust compiler is special in two ways: it does things to your code that -other compilers don't do (e.g. borrow checking) and it has a lot of +other compilers don't do (e.g. borrow-checking) and it has a lot of unconventional implementation choices (e.g. queries). We will talk about these -in turn in this chapter, and in the rest of the guide, we will look at all the +in turn in this chapter, and in the rest of the guide, we will look at the individual pieces in more detail. ## What the compiler does to your code So first, let's look at what the compiler does to your code. For now, we will -avoid mentioning how the compiler implements these steps except as needed; -we'll talk about that later. +avoid mentioning how the compiler implements these steps except as needed. ### Invocation -Compilation begins when a user writes a Rust source program in text -and invokes the `rustc` compiler on it. The work that the compiler needs to -perform is defined by command-line options. For example, it is possible to -enable nightly features (`-Z` flags), perform `check`-only builds, or emit -LLVM-IR rather than executable machine code. The `rustc` executable call may -be indirect through the use of `cargo`. +Compilation begins when a user writes a Rust source program in text and invokes +the `rustc` compiler on it. The work that the compiler needs to perform is +defined by command-line options. 
For example, it is possible to enable nightly +features (`-Z` flags), perform `check`-only builds, or emit the LLVM +Intermediate Representation (`LLVM-IR`) rather than executable machine code. +The `rustc` executable call may be indirect through the use of `cargo`. Command line argument parsing occurs in the [`rustc_driver`]. This crate defines the compile configuration that is requested by the user and passes it @@ -34,140 +33,151 @@ to the rest of the compilation process as a [`rustc_interface::Config`]. The raw Rust source text is analyzed by a low-level *lexer* located in [`rustc_lexer`]. At this stage, the source text is turned into a stream of -atomic source code units known as _tokens_. The lexer supports the +atomic source code units known as _tokens_. The `lexer` supports the Unicode character encoding. The token stream passes through a higher-level lexer located in [`rustc_parse`] to prepare for the next stage of the compile process. The -[`StringReader`] struct is used at this stage to perform a set of validations +[`StringReader`] `struct` is used at this stage to perform a set of validations and turn strings into interned symbols (_interning_ is discussed later). [String interning] is a way of storing only one immutable copy of each distinct string value. -The lexer has a small interface and doesn't depend directly on the -diagnostic infrastructure in `rustc`. Instead it provides diagnostics as plain -data which are emitted in `rustc_parse::lexer` as real diagnostics. -The lexer preserves full fidelity information for both IDEs and proc macros. +The lexer has a small interface and doesn't depend directly on the diagnostic +infrastructure in `rustc`. Instead it provides diagnostics as plain data which +are emitted in [`rustc_parse::lexer`] as real diagnostics. The `lexer` +preserves full fidelity information for both IDEs and procedural macros +(sometimes referred to as "proc-macros"). 
-The *parser* [translates the token stream from the lexer into an Abstract Syntax +The *parser* [translates the token stream from the `lexer` into an Abstract Syntax Tree (AST)][parser]. It uses a recursive descent (top-down) approach to syntax -analysis. The crate entry points for the parser are the +analysis. The crate entry points for the `parser` are the [`Parser::parse_crate_mod()`][parse_crate_mod] and [`Parser::parse_mod()`][parse_mod] methods found in [`rustc_parse::parser::Parser`]. The external module parsing entry point is [`rustc_expand::module::parse_external_mod`][parse_external_mod]. -And the macro parser entry point is [`Parser::parse_nonterminal()`][parse_nonterminal]. +And the macro-`parser` entry point is [`Parser::parse_nonterminal()`][parse_nonterminal]. -Parsing is performed with a set of `Parser` utility methods including `bump`, -`check`, `eat`, `expect`, `look_ahead`. +Parsing is performed with a set of [`parser`] utility methods including [`bump`], +[`check`], [`eat`], [`expect`], [`look_ahead`]. Parsing is organized by semantic construct. Separate `parse_*` methods can be found in the [`rustc_parse`][rustc_parse_parser_dir] directory. The source file name follows the construct name. For example, the -following files are found in the parser: - -- `expr.rs` -- `pat.rs` -- `ty.rs` -- `stmt.rs` - -This naming scheme is used across many compiler stages. You will find -either a file or directory with the same name across the parsing, lowering, -type checking, THIR lowering, and MIR building sources. - -Macro expansion, AST validation, name resolution, and early linting also take place -during this stage. - -The parser uses the standard `DiagnosticBuilder` API for error handling, but we -try to recover, parsing a superset of Rust's grammar, while also emitting an error. -`rustc_ast::ast::{Crate, Mod, Expr, Pat, ...}` AST nodes are returned from the parser. 
- -### HIR lowering - -Next, we take the AST and convert it to [High-Level Intermediate -Representation (HIR)][hir], a more compiler-friendly representation of the -AST. This process is called "lowering". It involves a lot of desugaring of things -like loops and `async fn`. - -We then use the HIR to do [*type inference*] (the process of automatic -detection of the type of an expression), [*trait solving*] (the process -of pairing up an impl with each reference to a trait), and [*type -checking*]. Type checking is the process of converting the types found in the HIR -([`hir::Ty`]), which represent what the user wrote, -into the internal representation used by the compiler ([`Ty<'tcx>`]). -That information is used to verify the type safety, correctness and -coherence of the types used in the program. - -### MIR lowering - -The HIR is then [lowered to Mid-level Intermediate Representation (MIR)][mir], -which is used for [borrow checking]. - -Along the way, we also construct the THIR, which is an even more desugared HIR. -THIR is used for pattern and exhaustiveness checking. It is also more -convenient to convert into MIR than HIR is. - -We do [many optimizations on the MIR][mir-opt] because it is still -generic and that improves the code we generate later, improving compilation -speed too. -MIR is a higher level (and generic) representation, so it is easier to do -some optimizations at MIR level than at LLVM-IR level. For example LLVM -doesn't seem to be able to optimize the pattern the [`simplify_try`] mir -opt looks for. - -Rust code is _monomorphized_, which means making copies of all the generic -code with the type parameters replaced by concrete types. To do -this, we need to collect a list of what concrete types to generate code for. -This is called _monomorphization collection_ and it happens at the MIR level. 
+following files are found in the `parser`: + +- [`expr.rs`](https://github.com/rust-lang/rust/blob/master/compiler/rustc_parse/src/parser/expr.rs) +- [`pat.rs`](https://github.com/rust-lang/rust/blob/master/compiler/rustc_parse/src/parser/pat.rs) +- [`ty.rs`](https://github.com/rust-lang/rust/blob/master/compiler/rustc_parse/src/parser/ty.rs) +- [`stmt.rs`](https://github.com/rust-lang/rust/blob/master/compiler/rustc_parse/src/parser/stmt.rs) + +This naming scheme is used across many compiler stages. You will find either a +file or directory with the same name across the parsing, lowering, type +checking, [Typed High-level Intermediate Representation (`THIR`)] lowering, and +[Mid-level Intermediate Representation (`MIR`)][mir] building sources. + +Macro-expansion, `AST`-validation, name-resolution, and early linting also take +place during the lexing and parsing stage. + +The [`rustc_ast::ast`]::{[`Crate`], [`Expr`], [`Pat`], ...} `AST` nodes are +returned from the parser while the standard [`DiagnosticBuilder`] API is used +for error handling. Generally Rust's compiler will try to recover from errors +by parsing a superset of Rust's grammar, while also emitting an error type. + +### `HIR` lowering + +Next the `AST` is converted into [High-Level Intermediate Representation +(`HIR`)][hir], a more compiler-friendly representation of the `AST`. This process +is called "lowering" and involves a lot of desugaring (the expansion and +formalizing of shortened or abbreviated syntax constructs) of things like loops +and `async fn`. + +We then use the `HIR` to do [*type inference*] (the process of automatic +detection of the type of an expression), [*trait solving*] (the process of +pairing up an impl with each reference to a `trait`), and [*type checking*]. Type +checking is the process of converting the types found in the `HIR` ([`hir::Ty`]), +which represent what the user wrote, into the internal representation used by +the compiler ([`Ty<'tcx>`]). 
It's called type checking because the information +is used to verify the type safety, correctness and coherence of the types used +in the program. + +### `MIR` lowering + +The `HIR` is further lowered to `MIR` +(used for [borrow checking]) by constructing the `THIR` (an even more desugared `HIR` used for +pattern and exhaustiveness checking) to convert into `MIR`. + +We do [many optimizations on the MIR][mir-opt] because it is generic and that +improves later code generation and compilation speed. It is easier to do some +optimizations at `MIR` level than at `LLVM-IR` level. For example LLVM doesn't seem +to be able to optimize the pattern the [`simplify_try`] `MIR`-opt looks for. + +Rust code is also [_monomorphized_] during code generation, which means making +copies of all the generic code with the type parameters replaced by concrete +types. To do this, we need to collect a list of what concrete types to generate +code for. This is called _monomorphization collection_ and it happens at the +`MIR` level. + +[_monomorphized_]: https://en.wikipedia.org/wiki/Monomorphization ### Code generation -We then begin what is vaguely called _code generation_ or _codegen_. -The [code generation stage][codegen] is when higher level -representations of source are turned into an executable binary. `rustc` -uses LLVM for code generation. The first step is to convert the MIR -to LLVM Intermediate Representation (LLVM IR). This is where the MIR -is actually monomorphized, according to the list we created in the -previous step. -The LLVM IR is passed to LLVM, which does a lot more optimizations on it. -It then emits machine code. It is basically assembly code with additional -low-level types and annotations added (e.g. an ELF object or WASM). -The different libraries/binaries are then linked together to produce the final -binary. +We then begin what is simply called _code generation_ or _codegen_. 
The [code +generation stage][codegen] is when higher-level representations of source are +turned into an executable binary. Since `rustc` uses LLVM for code generation, +the first step is to convert the `MIR` to `LLVM-IR`. This is where the `MIR` is +actually monomorphized. The `LLVM-IR` is passed to LLVM, which does a lot more +optimizations on it, emitting machine code which is basically assembly code +with additional low-level types and annotations added (e.g. an ELF object or +`WASM`). The different libraries/binaries are then linked together to produce +the final binary. -[String interning]: https://en.wikipedia.org/wiki/String_interning -[`rustc_lexer`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/index.html +[*trait solving*]: traits/resolution.md +[*type checking*]: type-checking.md +[*type inference*]: type-inference.md +[`bump`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.bump +[`check`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.check +[`Crate`]: https://doc.rust-lang.org/beta/nightly-rustc/rustc_ast/ast/struct.Crate.html +[`DiagnosticBuilder`]: https://doc.rust-lang.org/beta/nightly-rustc/rustc_errors/struct.DiagnosticBuilder.html +[`eat`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.eat +[`expect`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.expect +[`Expr`]: https://doc.rust-lang.org/beta/nightly-rustc/rustc_ast/ast/struct.Expr.html +[`hir::Ty`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/struct.Ty.html +[`look_ahead`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.look_ahead +[`Parser`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html +[`Pat`]: https://doc.rust-lang.org/beta/nightly-rustc/rustc_ast/ast/struct.Pat.html 
+[`rustc_ast::ast`]: https://doc.rust-lang.org/beta/nightly-rustc/rustc_ast/index.html [`rustc_driver`]: rustc-driver.md [`rustc_interface::Config`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_interface/interface/struct.Config.html -[lex]: the-parser.md -[`StringReader`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/lexer/struct.StringReader.html +[`rustc_lexer`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_lexer/index.html +[`rustc_parse::lexer`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/lexer/index.html +[`rustc_parse::parser::Parser`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html [`rustc_parse`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/index.html -[parser]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/index.html -[hir]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html -[*type inference*]: type-inference.md -[*trait solving*]: traits/resolution.md -[*type checking*]: type-checking.md -[mir]: mir/index.md -[borrow checking]: borrow_check.md -[mir-opt]: mir/optimizations.md [`simplify_try`]: https://github.com/rust-lang/rust/pull/66282 +[`StringReader`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/lexer/struct.StringReader.html +[`Ty<'tcx>`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Ty.html +[borrow checking]: borrow_check.md [codegen]: backend/codegen.md -[parse_nonterminal]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.parse_nonterminal +[hir]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html +[lex]: the-parser.md +[mir-opt]: mir/optimizations.md +[mir]: mir/index.md [parse_crate_mod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.parse_crate_mod -[parse_mod]: 
https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.parse_mod -[`rustc_parse::parser::Parser`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html [parse_external_mod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/module/fn.parse_external_mod.html +[parse_mod]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.parse_mod +[parse_nonterminal]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/parser/struct.Parser.html#method.parse_nonterminal +[parser]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_parse/index.html [rustc_parse_parser_dir]: https://github.com/rust-lang/rust/tree/master/compiler/rustc_parse/src/parser -[`hir::Ty`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/hir/struct.Ty.html -[`Ty<'tcx>`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Ty.html +[String interning]: https://en.wikipedia.org/wiki/String_interning +[Typed High-level Intermediate Representation (`THIR`)]: https://rustc-dev-guide.rust-lang.org/thir.html ## How it does it -Ok, so now that we have a high-level view of what the compiler does to your -code, let's take a high-level view of _how_ it does all that stuff. There are a -lot of constraints and conflicting goals that the compiler needs to +Now that we have a high-level view of what the compiler does to your code, +let's take a high-level view of _how_ it does all that stuff. There are a lot +of constraints and conflicting goals that the compiler needs to satisfy/optimize for. For example, -- Compilation speed: how fast is it to compile a program. More/better +- Compilation speed: how fast is it to compile a program? More/better compile-time analyses often means compilation is slower. - Also, we want to support incremental compilation, so we need to take that into account. 
How can we keep track of what work needs to be redone and @@ -190,17 +200,17 @@ satisfy/optimize for. For example, the input programs says they do, and should continue to do so despite the tremendous amount of change constantly going on. - Integration: a number of other tools need to use the compiler in - various ways (e.g. cargo, clippy, miri) that must be supported. + various ways (e.g. `cargo`, `clippy`, `MIRI`) that must be supported. - Compiler stability: the compiler should not crash or fail ungracefully on the stable channel. - Rust stability: the compiler must respect Rust's stability guarantees by not breaking programs that previously compiled despite the many changes that are always going on to its implementation. -- Limitations of other tools: rustc uses LLVM in its backend, and LLVM has some - strengths we leverage and some limitations/weaknesses we need to work around. +- Limitations of other tools: `rustc` uses LLVM in its backend, and LLVM has some + strengths we leverage and some aspects we need to work around. -So, as you read through the rest of the guide, keep these things in mind. They -will often inform decisions that we make. +So, as you continue your journey through the rest of the guide, keep these +things in mind. They will often inform decisions that we make. ### Intermediate representations @@ -217,31 +227,32 @@ for different purposes: - Token stream: the lexer produces a stream of tokens directly from the source code. This stream of tokens is easier for the parser to deal with than raw text. -- Abstract Syntax Tree (AST): the abstract syntax tree is built from the stream +- Abstract Syntax Tree (`AST`): the abstract syntax tree is built from the stream of tokens produced by the lexer. It represents pretty much exactly what the user wrote. It helps to do some syntactic sanity checking (e.g. checking that a type is expected where the user wrote one). -- High-level IR (HIR): This is a sort of desugared AST. 
It's still close +- High-level IR (HIR): This is a sort of desugared `AST`. It's still close to what the user wrote syntactically, but it includes some implicit things such as some elided lifetimes, etc. This IR is amenable to type checking. -- Typed HIR (THIR): This is an intermediate between HIR and MIR, and used to be called - High-level Abstract IR (HAIR). It is like the HIR but it is fully typed and a bit - more desugared (e.g. method calls and implicit dereferences are made fully explicit). - Moreover, it is easier to lower to MIR from THIR than from HIR. -- Middle-level IR (MIR): This IR is basically a Control-Flow Graph (CFG). A CFG +- Typed `HIR` (THIR) _formerly High-level Abstract IR (HAIR)_: This is an + intermediate between `HIR` and MIR. It is like the `HIR` but it is fully typed + and a bit more desugared (e.g. method calls and implicit dereferences are + made fully explicit). As a result, it is easier to lower to `MIR` from `THIR` than + from HIR. +- Middle-level IR (`MIR`): This IR is basically a Control-Flow Graph (CFG). A CFG is a type of diagram that shows the basic blocks of a program and how control - flow can go between them. Likewise, MIR also has a bunch of basic blocks with + flow can go between them. Likewise, `MIR` also has a bunch of basic blocks with simple typed statements inside them (e.g. assignment, simple computations, etc) and control flow edges to other basic blocks (e.g., calls, dropping - values). MIR is used for borrow checking and other + values). `MIR` is used for borrow checking and other important dataflow-based checks, such as checking for uninitialized values. It is also used for a series of optimizations and for constant evaluation (via - MIRI). Because MIR is still generic, we can do a lot of analyses here more + `MIRI`). Because `MIR` is still generic, we can do a lot of analyses here more efficiently than after monomorphization. -- LLVM IR: This is the standard form of all input to the LLVM compiler. 
LLVM IR +- `LLVM-IR`: This is the standard form of all input to the LLVM compiler. `LLVM-IR` is a sort of typed assembly language with lots of annotations. It's a standard format that is used by all compilers that use LLVM (e.g. the clang - C compiler also outputs LLVM IR). LLVM IR is designed to be easy for other + C compiler also outputs `LLVM-IR`). `LLVM-IR` is designed to be easy for other compilers to emit and also rich enough for LLVM to run a bunch of optimizations on it. @@ -258,25 +269,25 @@ representations are interned. ### Queries -The first big implementation choice is the _query_ system. The Rust compiler -uses a query system which is unlike most textbook compilers, which are -organized as a series of passes over the code that execute sequentially. The -compiler does this to make incremental compilation possible -- that is, if the -user makes a change to their program and recompiles, we want to do as little -redundant work as possible to produce the new binary. +The first big implementation choice is Rust's use of the _query_ system in its +compiler. The Rust compiler _is not_ organized as a series of passes over the +code which execute sequentially. The Rust compiler does this to make +incremental compilation possible -- that is, if the user makes a change to +their program and recompiles, we want to do as little redundant work as +possible to output the new binary. In `rustc`, all the major steps above are organized as a bunch of queries that call each other. For example, there is a query to ask for the type of something -and another to ask for the optimized MIR of a function. These -queries can call each other and are all tracked through the query system. -The results of the queries are cached on disk so that we can tell which -queries' results changed from the last compilation and only redo those. This is -how incremental compilation works. +and another to ask for the optimized `MIR` of a function. 
These queries can call +each other and are all tracked through the query system. The results of the +queries are cached on disk so that the compiler can tell which queries' results +changed from the last compilation and only redo those. This is how incremental +compilation works. In principle, for the query-fied steps, we do each of the above for each item -individually. For example, we will take the HIR for a function and use queries -to ask for the LLVM IR for that HIR. This drives the generation of optimized -MIR, which drives the borrow checker, which drives the generation of MIR, and +individually. For example, we will take the `HIR` for a function and use queries +to ask for the `LLVM-IR` for that HIR. This drives the generation of optimized +`MIR`, which drives the borrow checker, which drives the generation of `MIR`, and so on. ... except that this is very over-simplified. In fact, some queries are not @@ -295,8 +306,8 @@ Moreover, the compiler wasn't originally built to use a query system; the query system has been retrofitted into the compiler, so parts of it are not query-fied yet. Also, LLVM isn't our code, so that isn't querified either. The plan is to eventually query-fy all of the steps listed in the previous section, -but as of November 2022, only the steps between HIR and -LLVM IR are query-fied. That is, lexing, parsing, name resolution, and macro +but as of November 2022, only the steps between `HIR` and +`LLVM-IR` are query-fied. That is, lexing, parsing, name resolution, and macro expansion are done all at once for the whole program. One other thing to mention here is the all-important "typing context", @@ -308,7 +319,7 @@ queries are defined as methods on the [`TyCtxt`] type, and the in-memory query cache is stored there too. In the code, there is usually a variable called `tcx` which is a handle on the typing context. 
You will also see lifetimes with the name `'tcx`, which means that something is tied to the lifetime of the -`TyCtxt` (usually it is stored or interned there). +[`TyCtxt`] (usually it is stored or interned there). [`TyCtxt`]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.TyCtxt.html @@ -320,9 +331,10 @@ program) is [`rustc_middle::ty::Ty`][ty]. This is so important that we have a wh on [`ty::Ty`][ty], but for now, we just want to mention that it exists and is the way `rustc` represents types! -Also note that the `rustc_middle::ty` module defines the `TyCtxt` struct we mentioned before. +Also note that the [`rustc_middle::ty`] module defines the [`TyCtxt`] struct we mentioned before. [ty]: https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.Ty.html +[`rustc_middle::ty`]: https://doc.rust-lang.org/beta/nightly-rustc/rustc_middle/ty/index.html ### Parallelism @@ -330,17 +342,21 @@ Compiler performance is a problem that we would like to improve on (and are always working on). One aspect of that is parallelizing `rustc` itself. -Currently, there is only one part of rustc that is parallel by default: codegen. +Currently, there is only one part of rustc that is parallel by default: +[code generation](./parallel-rustc.md#Codegen). However, the rest of the compiler is still not yet parallel. There have been lots of efforts spent on this, but it is generally a hard problem. The current -approach is to turn `RefCell`s into `Mutex`s -- that is, we +approach is to turn [`RefCell`]s into [`Mutex`]s -- that is, we switch to thread-safe internal mutability. However, there are ongoing challenges with lock contention, maintaining query-system invariants under concurrency, and the complexity of the code base. One can try out the current work by enabling parallel compilation in `config.toml`. It's still early days, but there are already some promising performance improvements. 
+[`RefCell`]: https://doc.rust-lang.org/std/cell/struct.RefCell.html +[`Mutex`]: https://doc.rust-lang.org/std/sync/struct.Mutex.html + ### Bootstrapping `rustc` itself is written in Rust. So how do we compile the compiler? We use an @@ -362,7 +378,7 @@ For more details on bootstrapping, see - Does LLVM ever do optimizations in debug builds? - How do I explore phases of the compile process in my own sources (lexer, parser, HIR, etc)? - e.g., `cargo rustc -- -Z unpretty=hir-tree` allows you to - view HIR representation + view `HIR` representation - What is the main source entry point for `X`? - Where do phases diverge for cross-compilation to machine code across different platforms? @@ -387,16 +403,16 @@ For more details on bootstrapping, see - [Entry point for first file in crate](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_interface/passes/fn.parse.html) - [Entry point for outline module parsing](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/module/fn.parse_external_mod.html) - [Entry point for macro fragments][parse_nonterminal] - - AST definition: [`rustc_ast`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/ast/index.html) + - `AST` definition: [`rustc_ast`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_ast/ast/index.html) - Feature gating: **TODO** - Early linting: **TODO** - The High Level Intermediate Representation (HIR) - Guide: [The HIR](hir.md) - Guide: [Identifiers in the HIR](hir.md#identifiers-in-the-hir) - - Guide: [The HIR Map](hir.md#the-hir-map) - - Guide: [Lowering AST to HIR](lowering.md) - - How to view HIR representation for your code `cargo rustc -- -Z unpretty=hir-tree` - - Rustc HIR definition: [`rustc_hir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html) + - Guide: [The `HIR` Map](hir.md#the-hir-map) + - Guide: [Lowering `AST` to HIR](lowering.md) + - How to view `HIR` representation for your code `cargo rustc -- -Z unpretty=hir-tree` + - Rustc `HIR` definition: 
[`rustc_hir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_hir/index.html) - Main entry point: **TODO** - Late linting: **TODO** - Type Inference @@ -406,21 +422,21 @@ For more details on bootstrapping, see - Main entry point (type checking bodies): [the `typeck` query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/struct.TyCtxt.html#method.typeck) - These two functions can't be decoupled. - The Mid Level Intermediate Representation (MIR) - - Guide: [The MIR (Mid level IR)](mir/index.md) + - Guide: [The `MIR` (Mid level IR)](mir/index.md) - Definition: [`rustc_middle/src/mir`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/mir/index.html) - Definition of sources that manipulates the MIR: [`rustc_mir_build`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_build/index.html), [`rustc_mir_dataflow`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_dataflow/index.html), [`rustc_mir_transform`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/index.html) - The Borrow Checker - Guide: [MIR Borrow Check](borrow_check.md) - Definition: [`rustc_borrowck`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/index.html) - Main entry point: [`mir_borrowck` query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_borrowck/fn.mir_borrowck.html) -- MIR Optimizations +- `MIR` Optimizations - Guide: [MIR Optimizations](mir/optimizations.md) - Definition: [`rustc_mir_transform`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/index.html) - Main entry point: [`optimized_mir` query](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_mir_transform/fn.optimized_mir.html) - Code Generation - Guide: [Code Generation](backend/codegen.md) - - Generating Machine Code from LLVM IR with LLVM - **TODO: reference?** + - Generating Machine Code from `LLVM-IR` with LLVM - **TODO: reference?** - Main entry point: 
[`rustc_codegen_ssa::base::codegen_crate`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_crate.html) - - This monomorphizes and produces LLVM IR for one codegen unit. It then + - This monomorphizes and produces `LLVM-IR` for one codegen unit. It then starts a background thread to run LLVM, which must be joined later. - Monomorphization happens lazily via [`FunctionCx::monomorphize`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/mir/struct.FunctionCx.html#method.monomorphize) and [`rustc_codegen_ssa::base::codegen_instance `](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/base/fn.codegen_instance.html) From f8631011aab31e48148ce191dc4bf35f3919f542 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E8=AE=B8=E6=9D=B0=E5=8F=8B=20Jieyou=20Xu=20=28Joe=29?= <39484203+jieyouxu@users.noreply.github.com> Date: Fri, 8 Mar 2024 20:44:41 +0000 Subject: [PATCH 12/12] Document that test names cannot contain dots (#1927) --- src/tests/ui.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/tests/ui.md b/src/tests/ui.md index cffecda82..3f8502640 100644 --- a/src/tests/ui.md +++ b/src/tests/ui.md @@ -64,6 +64,9 @@ The general form is: *test-name*`.`*revision*`.`*compare_mode*`.`*extension* +* *test-name* cannot contain dots. This is so that the general form of test + output filenames have a predictable form we can pattern match on in order to + track stray test output files. * *revision* is the [revision](#cfg-revisions) name. This is not included when not using revisions. * *compare_mode* is the [compare mode](#compare-modes).