preprocessor to skip recompiling tests on non-interface source file changes #252

Draft · wants to merge 32 commits into base: main
Changes from 24 commits
Commits (32)
abd521f  wip (klkvr, Sep 11, 2024)
fb64ca7  wip (klkvr, Sep 11, 2024)
0876b01  add preprocessor (klkvr, Sep 11, 2024)
314f284  wip (klkvr, Sep 11, 2024)
86f0562  wip (klkvr, Sep 11, 2024)
af6550f  fixes (klkvr, Sep 13, 2024)
276bc94  helper libs (klkvr, Sep 13, 2024)
fa5d7bf  some docs (klkvr, Sep 13, 2024)
15ce120  clean up (klkvr, Sep 13, 2024)
a379db3  fixes (klkvr, Sep 16, 2024)
25e5ef9  Merge branch 'main' into klkvr/pp-test-rebase (grandizzy, Feb 18, 2025)
3faaa02  Port to solar (grandizzy, Feb 19, 2025)
0f78197  Clippy (grandizzy, Feb 19, 2025)
594c09b  Changes after review (grandizzy, Feb 19, 2025)
3105272  Patch solar to get HIR visitor impl (grandizzy, Feb 19, 2025)
7d0edb8  Review changes (grandizzy, Feb 19, 2025)
4025e5a  reuse replace source fn (grandizzy, Feb 19, 2025)
0c940d1  Clippy (grandizzy, Feb 19, 2025)
79d0a0f  Cleanup, move Update type in lib (grandizzy, Feb 19, 2025)
d37cafb  Change replace_source_content sig, move apply_updates (grandizzy, Feb 20, 2025)
c526df6  Port to solar HIR, refactor (grandizzy, Feb 26, 2025)
e691a96  add preprocessing parse constructors (grandizzy, Feb 26, 2025)
4241d2b  Contract id cleanup (grandizzy, Feb 26, 2025)
c86c4aa  Cleanup, filter collected dependencies to be source contracts (grandizzy, Feb 27, 2025)
f4a109d  Cleanup, remove Hir:: usage, use ContractIds (grandizzy, Feb 28, 2025)
f0818ea  Optional preprocessed cache (grandizzy, Mar 3, 2025)
2d1e59f  Review cleanup (grandizzy, Mar 3, 2025)
c354a7c  Simplify find and remove branches (grandizzy, Mar 3, 2025)
719d4e8  Autodetect and recompile mocks (grandizzy, Mar 4, 2025)
4342f82  Fix description (grandizzy, Mar 4, 2025)
d8ece14  Cleanup autodetect and update cached mocks (grandizzy, Mar 5, 2025)
c4ec52c  Invalidate cache on preprocess option toggle (grandizzy, Mar 5, 2025)
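The mechanism this PR builds on, visible in the cache.rs changes as `interface_representation_hash`, is to hash only a source file's interface (signatures rather than function bodies), so that files under the test and script directories are recompiled only when an imported contract's interface actually changes. The sketch below is a simplified, line-based illustration of that idea; the function names and the string-based body stripping are assumptions for illustration only, as the real implementation works on solar's parsed representation, not raw text:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Reduce a Solidity source to a rough "interface representation" by
// dropping function bodies. Deliberately naive and line-based: it keeps
// each `function ...` signature, emits an empty body in its place, and
// skips lines until the original body's braces balance out.
fn interface_representation(source: &str) -> String {
    let mut out = String::new();
    let mut body_depth = 0usize; // brace depth while inside a function body
    for line in source.lines() {
        let opens = line.matches('{').count();
        let closes = line.matches('}').count();
        if body_depth > 0 {
            // Inside a function body: keep tracking braces, drop the line.
            body_depth = (body_depth + opens).saturating_sub(closes);
            continue;
        }
        if line.trim_start().starts_with("function") && opens > closes {
            // Keep the signature, replace the body with an empty one.
            out.push_str(line.split('{').next().unwrap().trim_end());
            out.push_str(" {}\n");
            body_depth = opens - closes;
        } else {
            out.push_str(line);
            out.push('\n');
        }
    }
    out
}

// Hash the interface representation: two sources that differ only in
// function bodies map to the same hash.
fn interface_representation_hash(source: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    interface_representation(source).hash(&mut hasher);
    hasher.finish()
}
```

Under this scheme, changing `return 1;` to `return 2;` inside a body leaves the hash unchanged, while renaming a function changes it, which is exactly the signal the cache changes below use to decide whether dependents in test/script directories are dirty.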
6 changes: 6 additions & 0 deletions Cargo.toml
@@ -55,6 +55,7 @@
 serde = { version = "1", features = ["derive", "rc"] }
 serde_json = "1.0"
 similar-asserts = "1"
 solar-parse = { version = "=0.1.1", default-features = false }
+solar-sema = { version = "=0.1.1", default-features = false }
 svm = { package = "svm-rs", version = "0.5", default-features = false }
 tempfile = "3.9"
 thiserror = "2"
@@ -67,3 +68,8 @@
 futures-util = "0.3"
 tokio = { version = "1.35", features = ["rt-multi-thread"] }

 snapbox = "0.6.9"
+
+[patch.crates-io]
+solar-parse = { git = "https://github.com/paradigmxyz/solar", rev = "964b054" }
+solar-sema = { git = "https://github.com/paradigmxyz/solar", rev = "964b054" }
2 changes: 1 addition & 1 deletion crates/artifacts/solc/src/ast/misc.rs
@@ -4,7 +4,7 @@
 use std::{fmt, fmt::Write, str::FromStr};
 /// Represents the source location of a node: `<start byte>:<length>:<source index>`.
 ///
 /// The `start`, `length` and `index` can be -1 which is represented as `None`
-#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize, Deserialize)]
+#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, Serialize, Deserialize)]
 pub struct SourceLocation {
     pub start: Option<usize>,
     pub length: Option<usize>,
1 change: 1 addition & 0 deletions crates/compilers/Cargo.toml
@@ -33,6 +33,7 @@
 thiserror.workspace = true
 path-slash.workspace = true
 yansi.workspace = true
 solar-parse.workspace = true
+solar-sema.workspace = true
 futures-util = { workspace = true, optional = true }
 tokio = { workspace = true, optional = true }
165 changes: 107 additions & 58 deletions crates/compilers/src/cache.rs
@@ -4,6 +4,7 @@
use crate::{
buildinfo::RawBuildInfo,
compilers::{Compiler, CompilerSettings, Language},
output::Builds,
preprocessor::interface_representation_hash,
resolver::GraphEdges,
ArtifactFile, ArtifactOutput, Artifacts, ArtifactsMap, Graph, OutputContext, Project,
ProjectPaths, ProjectPathsConfig, SourceCompilationKind,
@@ -411,6 +412,8 @@
pub struct CacheEntry {
pub last_modification_date: u64,
/// hash to identify whether the content of the file changed
pub content_hash: String,
/// hash of the interface representation of the file, if it's a source file
pub interface_repr_hash: Option<String>,
Comment on lines +421 to +422

Member: nit: noticed interfaceReprHash is sometimes null in the cache, would it make sense to skip this field when it is None during serialization w/ #[serde(skip_serializing_if = "Option::is_none")]?

(screenshot: cache JSON showing "interfaceReprHash": null)

Contributor (author): yep, that's null for all files under test / script dirs, will check if any side effect if we don't serialize them but shouldn't
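The reviewer's suggestion is the standard serde pattern for omitting empty optional fields during serialization. As a dependency-free illustration of what `#[serde(skip_serializing_if = "Option::is_none")]` would do here (the struct and the hand-rolled serializer below are hypothetical stand-ins, not the PR's code):

```rust
// Hand-rolled JSON writing that mimics the effect of annotating a field
// with #[serde(skip_serializing_if = "Option::is_none")]: a None field
// is omitted entirely instead of being serialized as `null`.
struct CacheEntryLite {
    content_hash: String,
    interface_repr_hash: Option<String>,
}

fn to_json(entry: &CacheEntryLite) -> String {
    let mut fields = vec![format!("\"contentHash\":\"{}\"", entry.content_hash)];
    // Skip the key entirely when the hash is None (test/script files).
    if let Some(hash) = &entry.interface_repr_hash {
        fields.push(format!("\"interfaceReprHash\":\"{}\"", hash));
    }
    format!("{{{}}}", fields.join(","))
}
```

With the attribute in place, cache entries for files under the test and script directories would serialize without an `interfaceReprHash` key rather than carrying `"interfaceReprHash": null`.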

/// identifier name see [`foundry_compilers_core::utils::source_name()`]
pub source_name: PathBuf,
/// fully resolved imports of the file
@@ -654,11 +657,20 @@
pub(crate) struct ArtifactsCacheInner<

/// The file hashes.
pub content_hashes: HashMap<PathBuf, String>,

/// The interface representations for source files.
pub interface_repr_hashes: HashMap<PathBuf, String>,
}

impl<T: ArtifactOutput<CompilerContract = C::CompilerContract>, C: Compiler>
ArtifactsCacheInner<'_, T, C>
{
/// Whether given file is a source file or a test/script file.
fn is_source_file(&self, file: &Path) -> bool {
!file.starts_with(&self.project.paths.tests)
&& !file.starts_with(&self.project.paths.scripts)
}

/// Creates a new cache entry for the file
fn create_cache_entry(&mut self, file: PathBuf, source: &Source) {
let imports = self
@@ -668,10 +680,14 @@
impl<T: ArtifactOutput<CompilerContract = C::CompilerContract>, C: Compiler>
.map(|import| strip_prefix(import, self.project.root()).into())
.collect();

let interface_repr_hash =
self.is_source_file(&file).then(|| interface_representation_hash(source, &file));

let entry = CacheEntry {
last_modification_date: CacheEntry::read_last_modification_date(&file)
.unwrap_or_default(),
content_hash: source.content_hash(),
interface_repr_hash,
source_name: strip_prefix(&file, self.project.root()).into(),
imports,
version_requirement: self.edges.version_requirement(&file).map(|v| v.to_string()),
@@ -765,26 +781,23 @@
impl<T: ArtifactOutput<CompilerContract = C::CompilerContract>, C: Compiler>
return true;
}

// If any requested extra files are missing for any artifact, mark source as dirty to
// generate them
for artifacts in self.cached_artifacts.values() {
for artifacts in artifacts.values() {
for artifact_file in artifacts {
if self.project.artifacts_handler().is_dirty(artifact_file).unwrap_or(true) {
return true;
}
}
}
}

false
}

// Walks over all cache entries, detects dirty files and removes them from cache.
fn find_and_remove_dirty(&mut self) {
fn populate_dirty_files<D>(
file: &Path,
dirty_files: &mut HashSet<PathBuf>,
edges: &GraphEdges<D>,
) {
for file in edges.importers(file) {
// If file is marked as dirty we either have already visited it or it was marked as
// dirty initially and will be visited at some point later.
if !dirty_files.contains(file) {
dirty_files.insert(file.to_path_buf());
populate_dirty_files(file, dirty_files, edges);
}
}
}

let existing_profiles = self.project.settings_profiles().collect::<BTreeMap<_, _>>();

let mut dirty_profiles = HashSet::new();
@@ -821,76 +834,103 @@
impl<T: ArtifactOutput<CompilerContract = C::CompilerContract>, C: Compiler>
}
}

// Iterate over existing cache entries.
let files = self.cache.files.keys().cloned().collect::<HashSet<_>>();

let mut sources = Sources::new();

// Read all sources, marking entries as dirty on I/O errors.
for file in &files {
let Ok(source) = Source::read(file) else {
self.dirty_sources.insert(file.clone());
// Read all sources, removing entries on I/O errors.
for file in self.cache.files.keys().cloned().collect::<Vec<_>>() {
let Ok(source) = Source::read(&file) else {
self.cache.files.remove(&file);
continue;
};
sources.insert(file.clone(), source);
}

// Build a temporary graph for walking imports. We need this because `self.edges`
// only contains graph data for in-scope sources but we are operating on cache entries.
if let Ok(graph) = Graph::<C::ParsedSource>::resolve_sources(&self.project.paths, sources) {
let (sources, edges) = graph.into_sources();
// Calculate content hashes for later comparison.
self.fill_hashes(&sources);

// Calculate content hashes for later comparison.
self.fill_hashes(&sources);
// Pre-add all sources that are guaranteed to be dirty
for file in self.cache.files.keys() {
if self.is_dirty_impl(file, false) {
self.dirty_sources.insert(file.clone());
}
}

// Pre-add all sources that are guaranteed to be dirty
for file in sources.keys() {
if self.is_dirty_impl(file) {
// Build a temporary graph for populating cache. We want to ensure that we preserve all just
// removed entries with updated data. We need separate graph for this because
// `self.edges` only contains graph data for in-scope sources but we are operating on cache
// entries.
let Ok(graph) = Graph::<C::ParsedSource>::resolve_sources(&self.project.paths, sources)
else {
// Purge all sources on graph resolution error.
self.cache.files.clear();
return;
};

let (sources, edges) = graph.into_sources();

// Mark sources as dirty based on their imports
for file in sources.keys() {
if self.dirty_sources.contains(file) {
continue;
}
let is_src = self.is_source_file(file);
for import in edges.imports(file) {
// Any source file importing dirty source file is dirty.
if is_src && self.dirty_sources.contains(import) {
self.dirty_sources.insert(file.clone());
break;
// For non-src files we mark them as dirty only if they import dirty non-src file
// or src file for which interface representation changed.
} else if !is_src
&& self.dirty_sources.contains(import)
&& (!self.is_source_file(import) || self.is_dirty_impl(import, true))
{
self.dirty_sources.insert(file.clone());
}
}

// Perform DFS to find direct/indirect importers of dirty files.
for file in self.dirty_sources.clone().iter() {
populate_dirty_files(file, &mut self.dirty_sources, &edges);
}
} else {
// Purge all sources on graph resolution error.
self.dirty_sources.extend(files);
}

// Remove all dirty files from cache.
for file in &self.dirty_sources {
debug!("removing dirty file from cache: {}", file.display());
self.cache.remove(file);
}
}

fn is_dirty_impl(&self, file: &Path) -> bool {
let Some(hash) = self.content_hashes.get(file) else {
trace!("missing content hash");
return true;
};
// Create new entries for all source files
for (file, source) in sources {
if self.cache.files.contains_key(&file) {
continue;
}

self.create_cache_entry(file.clone(), &source);
}
}

fn is_dirty_impl(&self, file: &Path, use_interface_repr: bool) -> bool {
let Some(entry) = self.cache.entry(file) else {
trace!("missing cache entry");
return true;
};

if entry.content_hash != *hash {
trace!("content hash changed");
return true;
}
if use_interface_repr {
let Some(interface_hash) = self.interface_repr_hashes.get(file) else {
trace!("missing interface hash");
return true;
};

// If any requested extra files are missing for any artifact, mark source as dirty to
// generate them
for artifacts in self.cached_artifacts.values() {
for artifacts in artifacts.values() {
for artifact_file in artifacts {
if self.project.artifacts_handler().is_dirty(artifact_file).unwrap_or(true) {
return true;
}
}
if entry.interface_repr_hash.as_ref() != Some(interface_hash) {
trace!("interface hash changed");
return true;
};
} else {
let Some(content_hash) = self.content_hashes.get(file) else {
trace!("missing content hash");
return true;
};

if entry.content_hash != *content_hash {
trace!("content hash changed");
return true;
}
}

@@ -904,6 +944,14 @@
impl<T: ArtifactOutput<CompilerContract = C::CompilerContract>, C: Compiler>
if let hash_map::Entry::Vacant(entry) = self.content_hashes.entry(file.clone()) {
entry.insert(source.content_hash());
}
// Fill interface representation hashes for source files
if self.is_source_file(file) {
if let hash_map::Entry::Vacant(entry) =
self.interface_repr_hashes.entry(file.clone())
{
entry.insert(interface_representation_hash(source, file));
}
}
}
}
}
@@ -993,6 +1041,7 @@
impl<'a, T: ArtifactOutput<CompilerContract = C::CompilerContract>, C: Compiler>
dirty_sources: Default::default(),
content_hashes: Default::default(),
sources_in_scope: Default::default(),
interface_repr_hashes: Default::default(),
};

ArtifactsCache::Cached(cache)