Skip to content

Discussion: merging survival packages #44

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
RaphaelS1 opened this issue Jul 15, 2022 · 5 comments
Open

Discussion: merging survival packages #44

RaphaelS1 opened this issue Jul 15, 2022 · 5 comments

Comments

@RaphaelS1
Copy link

In similar spirit to #1 I have my own package (originally just hobby code to learn Julia, pls don't judge too harsh) for survival analysis. As @ararslan and I discussed on Slack I don't want to be 'competing' or causing duplicative code to be written so just wanted to open this issue to see if there are any parts I can merge in or if I should just archive it/keep for hobby code only.

So far I have:

  1. KaplanMeier (fitting, confidence intervals, plotting)
  2. NelsonAalen (fitting, confidence intervals, plotting)
  3. Surv - Generic survival object for left, right, or interval censoring so far. Has fields for pulling out key stats (e.g. risk sets, event times, etc.)
  4. WIP code for parametric PH models

My future plans were going to be:

  1. CoxPH
  2. Parameter PH and AFT (again with Distributions.jl for parametrisations)
  3. Transformers between distribution, ranking, and linear predictor predictions

My plan was then to hook this up between Turing.jl and mlr3proba for cross-language probabilistic ML in R.

If useful happy to go into detail about features/methods but won't for now.

I was curious about some comparisons, these might be of interest:

  1. Kaplan-Meier is 100x faster than R (as expected) and same results:
using RCall
using Random: seed!
using Distributions
using BenchmarkTools
using Survival
seed!(1)
n = 1000
T = round.(rand(Uniform(1, 10), n));
Δ = rand(Binomial(), n) .== 1;
surv = Surv(T, Δ, "right");
R"
    library(survival)
    time = $T
    status = $Δ
    surv = Surv(time, status)
";
@benchmark R"
    km = survfit(surv ~ 1)
"
julia> @benchmark R"
           km = survfit(surv ~ 1)
       "
BenchmarkTools.Trial: 6946 samples with 1 evaluation.
 Range (min  max):  599.208 μs  67.817 ms  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     625.708 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   718.383 μs ±  1.690 ms  ┊ GC (mean ± σ):  0.00% ± 0.00%

  █▇▅▃▂▁▁                                                      ▁
  ████████▇▇▇▆▆▅▅▆▄▄▄▄▄▅▁▁▃▄▄▃▃▃▄▁▁▄▁▁▁▁▁▃▄▁▁▃▃▃▁▁▁▁▃▃▃▃▁▁▃▁▁▃ █
  599 μs        Histogram: log(frequency) by time      2.26 ms <

 Memory estimate: 1.12 KiB, allocs estimate: 41.
julia> @benchmark kaplan(surv)
BenchmarkTools.Trial: 10000 samples with 5 evaluations.
 Range (min  max):  6.067 μs  619.550 μs  ┊ GC (min  max): 0.00%  97.93%
 Time  (median):     6.308 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   6.881 μs ±  14.405 μs  ┊ GC (mean ± σ):  6.14% ±  2.92%

      ██▁                                                      
  ▂▅▆▆███▆▃▃▃▃▄▄▄▃▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▂
  6.07 μs         Histogram: frequency by time        8.47 μs <

 Memory estimate: 8.03 KiB, allocs estimate: 151.
julia> R"
           paste(round(km$lower, 2),
           round(km$upper, 2), sep = ',')
       "
RObject{StrSxp}
 [1] "0.97,0.99" "0.9,0.94"  "0.83,0.88" "0.76,0.81" "0.69,0.75" "0.62,0.69"
 [7] "0.52,0.6"  "0.41,0.49" "0.27,0.35" "0.14,0.23"


julia> km = kaplan(surv);

julia> confint(km)
11-element Vector{Tuple{Float64, Float64}}:
 (1.0, 1.0)
 (0.9655902766870649, 0.9846564570042777)
 (0.90008189973811, 0.9343531290390091)
 (0.8304177051112486, 0.8755272501579991)
 (0.753866856584728, 0.8079562064849091)
 (0.6863770858305547, 0.7466648377965709)
 (0.6175078894755917, 0.683065386931595)
 (0.5222658655269816, 0.5941460334482926)
 (0.40985396069034197, 0.4878272995385684)
 (0.2674278465799201, 0.35115893829800077)
 (0.136515397416271, 0.22525900129765383)
  1. Competitive speeds against this package (mainly due to design decisions to slow creation of Surv object in favour of generating stats up front) - very interested in hearing opinions on this
julia> n = 1000;
julia> T = round.(rand(Uniform(1, 10), n));
julia> Δ = rand(Binomial(), n) .== 1;
julia> et = Survival.EventTime.(T, Δ);
julia> ot = Surv(T, Δ, "right");
julia> @btime EventTime.(T, Δ);
  792.120 ns (3 allocations: 15.81 KiB)
julia> @btime Surv(T, Δ, "right"); ## longer as expected due to postprocessing
  25.041 μs (49 allocations: 35.20 KiB)
julia> @btime (k = kaplan(srv); confint(k));
  6.700 μs (152 allocations: 8.27 KiB)
julia> @btime (k = fit(Survival.KaplanMeier, et); confint(k));
  36.041 μs (34 allocations: 52.33 KiB)
julia> @btime (k = kaplan(srv); confint(k));
  6.700 μs (152 allocations: 8.27 KiB)
julia> @btime (k = fit(Survival.KaplanMeier, et); confint(k));
  36.041 μs (34 allocations: 52.33 KiB)
julia> using Plots
julia> plot(kaplan(srv))

Screenshot 2022-07-15 at 19 27 18

@mwsohn
Copy link

mwsohn commented Mar 19, 2025

Your package with a CoxPh will be pretty usable. I am all for merging the two packages.

@RaphaelS1
Copy link
Author

Hey @mwsohn it's been a long time since I looked at this code! But very happy to have a chat to think about what merging would look like

@ararslan
Copy link
Member

Apologies for the (multi-year...) delayed response on this, I have very little time for open source stuff these days. Thinking back to what limited memory I have of our conversation in Slack, I may have misunderstood what you were saying at the time, @RaphaelS1: I thought you were intending to depend on Turing and/or R libraries in order to use them in your implementation, but looking very briefly at your package, it appears you aren't depending on them (currently, anyway). That means combining implementations should be more straightforward.

Contributions to this package would certainly be welcome, but I unfortunately can't guarantee a timely review, as evidenced by the existing PRs that have been languishing for years without review or response.

@RaphaelS1
Copy link
Author

Hey @ararslan strangely I was also not contributing to OS for several years and then returned to academia and OS exactly when @mwsohn messaged me, so perhaps a sign!

From what I remember, our packages have different design decisions so a PR could fundamentally change the interface.

Very tentatively I could take over maintenance of this package, would that help?

@mwsohn
Copy link

mwsohn commented Mar 28, 2025

I am not familiar with your package but I have used the coxph from the Survival package previously and it worked well. Because your package is much more developed with many more features, I would be happy if you can implement coxph in your package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants