Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate benchmarks by enabled features #1538

Merged
merged 6 commits into from
Jan 11, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
56 changes: 56 additions & 0 deletions candle-core/benches/bench_utils.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
use candle_core::{Device, Result};

pub(crate) trait BenchDevice {
fn sync(&self) -> Result<()>;
}

impl BenchDevice for Device {
fn sync(&self) -> Result<()> {
match self {
Device::Cpu => Ok(()),
Device::Cuda(device) => {
#[cfg(feature = "cuda")]
return Ok(device.synchronize()?);
#[cfg(not(feature = "cuda"))]
panic!("Cuda device without cuda feature enabled: {:?}", device)
}
Device::Metal(device) => {
#[cfg(feature = "metal")]
return Ok(device.wait_until_completed()?);
#[cfg(not(feature = "metal"))]
panic!("Metal device without metal feature enabled: {:?}", device)
}
}
}
}

#[allow(dead_code)]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this allow pragma? Maybe there is a way to avoid it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but I don't know a way to get around it. Criterion doesn't run the code in the same way as the standard rust testing framework, so it doesn't register as being used.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I guess you googled a bit on the criterion side and haven't found anything (I haven't checked), could you add a comment to explain why these are needed? (and link to any criterion resource that would document this if any)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, cargo isn't complaining about unused code anymore 👍

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure? I just tried running the benchmarks with no feature enabled and got the warnings?
image

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I tried after cargo clean and still get no warnings 🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is related to the /benches folder file structure https://users.rust-lang.org/t/how-to-properly-add-test-cases-for-benches/102859/4

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the warnings still present? If they are still there for any of you I'll try a different approach.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Just remove the utils file. Bench will walk all files including the utils which is why it will complain. Just but them in the core file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By "core" file, do you mean matmul.rs?
I was thinking we could have several benchmarks matmul.rs, random.rs, etc.
If I place these in their own module and load them from a bench_main.rs file we won't be able to specify which benchmark to run.

pub(crate) fn device() -> Result<Device> {
return if cfg!(feature = "metal") {
Device::new_metal(0)
} else if cfg!(feature = "cuda") {
Device::new_cuda(0)
} else {
Ok(Device::Cpu)
};
}

#[allow(dead_code)]
pub(crate) fn bench_name<S: Into<String>>(name: S) -> String {
format!("{}_{}", device_variant(), name.into())
}

#[allow(dead_code)]
const fn device_variant() -> &'static str {
return if cfg!(feature = "metal") {
"metal"
} else if cfg!(feature = "cuda") {
"cuda"
} else if cfg!(feature = "accelerate") {
"accelerate"
} else if cfg!(feature = "mkl") {
"mkl"
} else {
"cpu"
};
}
16 changes: 8 additions & 8 deletions candle-core/benches/matmul.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,8 @@
use candle_core::{DType, Device, Tensor};
mod bench_utils;

use crate::bench_utils::bench_name;
use bench_utils::{device, BenchDevice};
use candle_core::{DType, Tensor};
use criterion::{black_box, criterion_group, criterion_main, Criterion, Throughput};
use std::time::Instant;

Expand All @@ -12,26 +16,22 @@ fn criterion_benchmark(c: &mut Criterion) {
let n = 2048;
let k = 2048;

let device = Device::new_metal(0).unwrap();
let device = device().unwrap();
let dtype = DType::F32;
let lhs = Tensor::zeros((b, m, k), dtype, &device).unwrap();
let rhs = Tensor::zeros((b, n, k), dtype, &device).unwrap();

let flops = b * m * n * k;

let mut group = c.benchmark_group("matmul_metal");
let mut group = c.benchmark_group(bench_name("matmul"));
group.throughput(Throughput::Bytes(flops as u64));
group.bench_function("iter", move |b| {
b.iter_custom(|iters| {
let start = Instant::now();
for _i in 0..iters {
run(black_box(&lhs), black_box(&rhs));
}
if let Device::Metal(device) = &device {
device.wait_until_completed().unwrap();
} else {
panic!("Expected metal device");
}
device.sync().unwrap();
start.elapsed()
})
});
Expand Down
Loading