Evaluation, Reproducibility, Benchmarks Meeting 32
Date: 26th February, 2025
The MICCAI deadline is coming up, so we might have fewer attendees today.
- Carole
- Annika
- Lena
- Olivier
- Nick
- "False Promises" project
- Already have some comments from Olivier
- Make the contributions so far more visible
- The initiatives could be phrased better; we don't really work on developing new metrics
- Include links/summaries of papers we've published
- Closer to top of page
- Carole, Nick, and Annika will sit together to flesh out new wording and send it to the group
Original presentation
- Confidence intervals (CIs) are always required by regulatory bodies, and are generally good practice in science
- Not often used at MICCAI (reference)
- There are several methods to use
- Parametric vs nonparametric, etc.
- Paper describes the five most common methods in use
- Characteristics of a good method
- Coverage (can only be tested via simulation, since the true value is unknown on real data)
- First conclusion: There is no parametric distribution that appears to be a good fit
- Second conclusion: The mean is not a robust summary statistic
- Looking at CIs for the median instead is interesting: SciPy's default method, the BCa bootstrap, performs poorly
- Should use the percentile bootstrap instead (see the sketch after this list)
- Concludes with a flowchart of recommendations based on the presence/absence of outliers and the sample size
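
A minimal sketch (not from the paper or the presentation) of the two points above: testing a CI method's coverage by simulation, and choosing the percentile bootstrap in SciPy instead of the default BCa method. The lognormal toy population, sample size, and trial count are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def empirical_coverage(method, n_samples=30, n_trials=200):
    """Fraction of simulated trials whose 95% bootstrap CI for the
    median contains the true population median."""
    true_median = 1.0  # median of a lognormal with mean=0 is exp(0) = 1
    hits = 0
    for _ in range(n_trials):
        sample = rng.lognormal(mean=0.0, sigma=1.0, size=n_samples)
        res = stats.bootstrap(
            (sample,),                # data is passed as a sequence of samples
            np.median,
            confidence_level=0.95,
            method=method,            # "percentile" vs SciPy's default "BCa"
            n_resamples=999,
            random_state=rng,
        )
        ci = res.confidence_interval
        hits += ci.low <= true_median <= ci.high
    return hits / n_trials

# A good method's empirical coverage should be close to the nominal 0.95.
for method in ("BCa", "percentile"):
    print(f"{method}: coverage = {empirical_coverage(method):.3f}")
```

The exact numbers depend on the assumed population and sample size; the point is the coverage criterion itself, which can only be checked in simulation because the true median must be known.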
Feedback
- Is the decision between mean and median specifically about outliers, or are other factors, like skewness, applicable here?
- Should maybe defer the guidelines for deciding mean vs median to prior publications
- Is MICCAI an appropriate venue for proposing guidelines?
- Yes, we think so, but we can make clear that they are provisional