Investigate whether dispersion should be fixed or estimated from the data #255

Closed
jr-leary7 opened this issue Oct 17, 2024 · 1 comment
Labels
enhancement (New feature or request), GEE (related to the GEE model backend)

Comments

@jr-leary7
Owner

  • related to "Speed up GEE mode" (#241): not estimating the scale parameter probably speeds up fitting by reducing the number of scoring iterations
  • in extreme cases, very high dispersion estimates lead to wildly inflated standard errors and thus deflated test statistics and odd-looking plots
  • dispersion is currently estimated via:

$$ \hat{\phi} = \left(-p + \sum_{i=1}^n n_i\right)^{-1} \sum_{i=1}^n \sum_{t=1}^{n_i} \hat{r}_{it}^2 $$

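where $\hat{r}_{it}$ is the estimated residual for subject $i$ at timepoint $t$, $n_i$ is the number of observations for subject $i$, $n$ is the number of subjects, and $p$ is the number of estimated regression coefficients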
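
For illustration only, the estimator above can be sketched in a few lines of Python. This is not the package's code: the function name `estimate_dispersion`, its arguments, and the toy residuals are all hypothetical, and the residuals are assumed to already be on the scale the formula expects.

```python
import numpy as np


def estimate_dispersion(residuals_by_subject, n_params):
    """Moment estimate of the dispersion (hypothetical helper).

    residuals_by_subject: one array of residuals r_it per subject i.
    n_params: number of estimated regression coefficients, p.
    """
    # Total number of observations: sum over subjects of n_i
    total_obs = sum(len(r) for r in residuals_by_subject)
    dof = total_obs - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    # Double sum of squared residuals over subjects and timepoints
    sum_sq = sum(float(np.sum(np.square(r))) for r in residuals_by_subject)
    return sum_sq / dof


# Toy data (made up): two subjects with 3 and 2 observations, p = 2
well_behaved = [np.array([1.0, -1.0, 2.0]), np.array([0.5, -0.5])]
phi_hat = estimate_dispersion(well_behaved, n_params=2)

# A single extreme residual is enough to blow up the estimate
with_outlier = [np.array([1.0, -1.0, 20.0]), np.array([0.5, -0.5])]
phi_hat_outlier = estimate_dispersion(with_outlier, n_params=2)

print(phi_hat, phi_hat_outlier)
```

Because the estimate is a scaled sum of squared residuals, a handful of poorly fit observations can dominate it. Under a model-based covariance, standard errors scale with $\sqrt{\hat{\phi}}$, which is one way the inflated standard errors described above could arise.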

@jr-leary7 added the enhancement and GEE labels on Oct 17, 2024
@jr-leary7
Owner Author

commit 990bee8 removed the gee.scale.fix argument in favor of strictly fixing dispersion to be equal to 1 throughout. this decision was based on benchmarking performed on simulated data, in which we saw that fixing the scale led to slightly better dynamic gene classification as well as lower runtimes
