Mislabelled output for BayesFactor::regressionBF() models? #1084

Open
profandyfield opened this issue Apr 17, 2025 · 16 comments
Labels
Bug 🐛: Something isn't working
Enhancement 💥: Implemented features can be improved or revised
Low priority 😴: This issue does not impact package functionality much

Comments

@profandyfield

With regressionBF, if I inspect a model directly I get

require(discovr)
require(BayesFactor)

album_tib <- discovr::album_sales
album_bf <- BayesFactor::regressionBF(sales ~ adverts + airplay + image, rscaleCont = "medium", data = album_tib)
album_bf


  |==================================================================================================| 100%
Bayes factor analysis
--------------
[1] adverts                   : 1.320123e+16 ±0%
[2] airplay                   : 4.723817e+17 ±0.01%
[3] image                     : 6039.289     ±0%
[4] adverts + airplay         : 5.65038e+39  ±0%
[5] adverts + image           : 2.65494e+20  ±0%
[6] airplay + image           : 1.034464e+20 ±0%
[7] adverts + airplay + image : 7.746101e+42 ±0%

Against denominator:
  Intercept only 
---
Bayes factor type: BFlinearModel, JZS

but with model_parameters(album_bf) I get:

model_parameters(album_bf)


Multiple `BFBayesFactor` models detected - posteriors are extracted from the first numerator
  model.
  See help("get_parameters", package = "insight").
# Extra Parameters 

Parameter |  Median |             95% CI |   pd |       BF
----------------------------------------------------------
mu        |  193.03 | [ 183.63,  202.51] | 100% | 1.32e+16
adverts   |    0.09 | [   0.08,    0.11] | 100% | 4.72e+17
sig2      | 4389.28 | [3629.69, 5366.11] | 100% | 6.04e+03
g         |    0.42 | [   0.08,   12.32] | 100% | 5.65e+39

# Fixed Effects 

Parameter |       BF
--------------------
adverts   | 2.65e+20
airplay   | 1.03e+20
image     | 7.75e+42

The values in column BF map onto the output of album_bf but the labels in Parameter do not. Am I misunderstanding the labels, or is model_parameters() mis-labelling? [For the record I'm using model_parameters() to get nice output and because I want students to learn a consistent workflow with all models.]
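For cross-checking which Bayes factor belongs to which model, the labels can be read straight from the BayesFactor object. A minimal sketch, assuming the BayesFactor package is installed; it uses the built-in mtcars data as a stand-in for the album data so that it runs without discovr, and the variable names are illustrative only:

```r
library(BayesFactor)

# Three-predictor regression on built-in data (stand-in for album_sales)
bf <- regressionBF(mpg ~ wt + hp + qsec, data = mtcars, progress = FALSE)

# extractBF() returns a data frame with one row per model;
# the row names carry the model labels
bf_tab <- extractBF(bf)
bf_tab[, c("bf", "error")]
```

Each row name is a model (a combination of predictors), not a single parameter, which is why pairing these values with parameter names is misleading.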

@strengejacke
Member

What about the bayestestR functions?

@profandyfield
Author

You mean, what about using them instead? I don't have space to properly talk about priors, so I need something that has defaults comparable to BayesFactor (which I'm using as a gateway drug for the reader!). I don't have space to get into stuff like brms or stan. If you can point me to something that shows how to mimic BayesFactor functions using bayestestR then I'll take a look. I couldn't find anything obvious on the bayestestR website.

I should also add that @DominiqueMakowski is exerting considerable pressure to do everything with model_parameters() (and I think his point is quite compelling in terms of students having to learn a workflow that they can apply to almost any model).

@DominiqueMakowski
Member

> What about the bayestestR functions?

Haha I literally said to andy "use parameters instead of bayestestR for that"

But regardless, same issue:

> bayestestR::describe_posterior(album_bf)
Multiple `BFBayesFactor` models detected - posteriors are extracted from
  the first numerator model.
  See help("get_parameters", package = "insight").
Summary of Posterior Distribution

Parameter       |  Median |             95% CI |   pd |          ROPE
---------------------------------------------------------------------
mu              |  193.14 | [ 183.86,  202.58] | 100% | [-8.07, 8.07]
adverts         |    0.09 | [   0.08,    0.11] | 100% | [-8.07, 8.07]
sig2            | 4389.83 | [3631.90, 5373.08] | 100% | [-8.07, 8.07]
g               |    0.44 | [   0.08,   11.72] | 100% | [-8.07, 8.07]
adverts-adverts |         |                    |      |              
airplay-airplay |         |                    |      |              
image-image     |         |                    |      |              

Parameter       | % in ROPE |       BF | Prior
----------------------------------------------
mu              |        0% | 1.32e+16 |      
adverts         |      100% | 1.32e+16 |      
sig2            |        0% | 1.32e+16 |      
g               |    98.76% | 1.32e+16 |      
adverts-adverts |           |          |      
airplay-airplay |           |          |      
image-image     |           |          |

@DominiqueMakowski
Member

We probably never implemented full support for RegressionBF in insight:

> insight::find_parameters(album_bf)
$conditional
[1] "adverts-adverts" "airplay-airplay" "image-image"    

$extra
[1] "mu"      "adverts" "sig2"    "g"
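Since the messages above say that posteriors are taken from the first numerator model, one way to see what a specific model's parameters look like is to select that model explicitly and sample from it with BayesFactor directly. A minimal sketch, assuming the BayesFactor package is installed and again using mtcars as a stand-in; whether model_parameters() labels a single selected model correctly is not something this sketch checks:

```r
library(BayesFactor)

bf <- regressionBF(mpg ~ wt + hp + qsec, data = mtcars, progress = FALSE)

# With three predictors there are seven numerator models;
# the seventh is the one containing all three predictors
full <- bf[7]

# Draw posterior samples for that single model only
post <- posterior(full, iterations = 2000, progress = FALSE)
colnames(post)
```

The column names of the samples are the parameters of the selected model, including the nuisance terms such as g.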

@strengejacke
Member

Yes, I think so, too

@mattansb
Member

This is definitely wrong. I'll take a look.

(However, @profandyfield might I suggest not teaching the {BayesFactor} package for inference on anything more complex than a correlation/contingency table/t test 📛)

@profandyfield
Author

I think the only context - other than those you list - in which I use it is comparing linear models (as in the example above). For this case what would you suggest instead? (Bearing in mind the aim is as a gateway to more sophisticated approaches should the user be convinced to find out more about Bayesian methods.)

@strengejacke
Member

The bf_*() functions in bayestestR should work. But testing parameters only works for Bayesian models.

@DominiqueMakowski
Member

Sidetracking the original issue, but I also use {BayesFactor} exclusively for t-tests & correlations, and find BayesFactor::regressionBF confusing: you would expect it to be consistent with other "regression" functions (lm, glm) and return parameters of the specified model, but it does something quite different and the output is quite confusing.

I would also just not use BFs for anything other than simple tests, and simply signpost that doing Bayesian regressions requires a bit more thought / a different approach and is outside the scope of the module...

@strengejacke
Member

I just saw that the BF function combines all variables and indeed tests models.

In this case, you could fit the single models and use bf_models().
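To make "all combinations of variables" concrete, the set of candidate models can be enumerated in base R. This is only a sketch of the idea, not how BayesFactor builds its models internally; the predictor and response names are taken from the example above:

```r
predictors <- c("adverts", "airplay", "image")

# All non-empty subsets of the predictors, from size 1 up to size 3
subsets <- unlist(
  lapply(seq_along(predictors), function(k) combn(predictors, k, simplify = FALSE)),
  recursive = FALSE
)

# One formula per subset, e.g. sales ~ adverts + airplay
formulas <- lapply(subsets, reformulate, response = "sales")
vapply(formulas, function(f) paste(deparse(f), collapse = ""), character(1))
```

With three predictors this gives 2^3 - 1 = 7 formulas, one per row of the regressionBF() output. Each formula could then be passed to lm() with the data to fit the individual models.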

@strengejacke
Member

(and that might be the reason why parameters or insight fail, because of the dynamic output which we haven't taken into consideration yet)

@strengejacke
Member

album_tib <- discovr::album_sales

lm0 <- lm(sales ~ 1, data = album_tib)
lm1 <- lm(sales ~ adverts, data = album_tib)
lm2 <- lm(sales ~ airplay, data = album_tib)
lm3 <- lm(sales ~ image, data = album_tib)
lm4 <- lm(sales ~ adverts + airplay, data = album_tib)
lm5 <- lm(sales ~ adverts + image, data = album_tib)
lm6 <- lm(sales ~ airplay + image, data = album_tib)
lm7 <- lm(sales ~ adverts + airplay + image, data = album_tib)

bayestestR::bf_models(lm0, lm1, lm2, lm3, lm4, lm5, lm6, lm7, denominator = 1)
#> Bayes Factors for Model Comparison
#> 
#>       Model                           BF
#> [lm1] adverts                   3.50e+16
#> [lm2] airplay                   1.39e+18
#> [lm3] image                     5.40e+03
#> [lm4] adverts + airplay         6.24e+40
#> [lm5] adverts + image           7.11e+20
#> [lm6] airplay + image           2.67e+20
#> [lm7] adverts + airplay + image 1.00e+44
#> 
#> * Against Denominator: [lm0] (Intercept only)
#> *   Bayes Factor Type: BIC approximation

Created on 2025-04-18 with reprex v2.1.1
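For readers wondering what "BIC approximation" means in the output above, the usual approximation compares the BIC of two fitted models. A minimal base R sketch, assuming the standard formula BF10 ≈ exp((BIC0 - BIC1) / 2) and using the built-in mtcars data as a stand-in for the album data:

```r
# Built-in data as a stand-in; any pair of lm() fits on the same data works the same way
m0 <- lm(mpg ~ 1, data = mtcars)   # intercept-only (denominator) model
m1 <- lm(mpg ~ wt, data = mtcars)  # model with one predictor

# BIC approximation: BF10 is roughly exp((BIC(m0) - BIC(m1)) / 2)
bf10 <- exp((BIC(m0) - BIC(m1)) / 2)
bf10
```

Values above 1 favour the model with the predictor over the intercept-only model. Because this only needs BIC(), it applies to any model class that has a BIC method, which is part of what makes it a soft entry point.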

@mattansb
Member

@profandyfield For a soft entry into Bayesian estimation, I would avoid {BayesFactor} because the parameterization there is non-standard, and it has very little support (plotting, emmeans/marginaleffects, etc...) - instead I would teach {rstanarm} (though I typically skip straight to {brms}).
For Bayesian model comparisons, I would also avoid {BayesFactor} because it is limited only to linear models (and also there is much debate regarding the validity of "default priors" for testing in complex models). A very soft entry would be the BIC approximations, as demonstrated above by @strengejacke - it is easy to already do if you know how to fit models, and widely applicable. For BFs for specific hypotheses, I would also switch to {rstanarm}/{brms} + {bridgesampling} (which we wrap in bayestestR with bayesfactor_models()).


Regarding your issue, parameters::model_parameters() and bayestestR::describe_posterior() both have a BF column, but it is not the same: in bayestestR::describe_posterior() it repeats the BF of the selected (first) model compared to the null, while in parameters::model_parameters() it is the BF table.

I think we discussed this elsewhere, but I think in both cases the BF column is inappropriate as it implies these are parameter-specific Bayes factors (such as those given by bayesfactor_parameters()), which they are not.

We should be returning information for all the parameters provided by BayesFactor::posterior() other than g or those starting with g_.
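The filtering rule described here can be sketched in base R. The column names below are hypothetical (in particular "g_continuous" is made up for illustration), and this is not the code used in insight or parameters:

```r
# Hypothetical column names, as they might appear in posterior samples
cols <- c("mu", "adverts", "airplay", "sig2", "g", "g_continuous")

# Drop "g" itself and anything starting with "g_", keep everything else
keep <- cols[!grepl("^g(_|$)", cols)]
keep
```

The pattern is anchored so that a predictor whose name merely begins with the letter g (for example "gender") would be kept.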

@profandyfield
Author

Thanks for the useful comments @mattansb. Given my time constraints, I will use the BIC approximations in the first instance but have made a note-to-self to look at rstanarm (which I haven't used but looks interesting) in more detail and come back to it later in the writing cycle if there's time.

@strengejacke added the Bug 🐛, Enhancement 💥, and Low priority 😴 labels on Apr 24, 2025
@bwiernik
Contributor

@profandyfield I used the Regression and Other Stories textbook when I taught intro regression. It uses rstanarm for all of its code examples and I think it's a great text for folks new to modeling in general or to Bayesian modeling in particular.

@profandyfield
Author

That's a convenient recommendation @bwiernik 😀 [image]
