Parallel #167

Closed
wants to merge 138 commits into from
farhadrclass
Contributor

Summary

The function bmark_solvers_parallel benchmarks a set of solvers on a set of problems in parallel. It targets multi-core CPUs and aims to reduce runtime on large problem sets or with computationally intensive solvers.

Key Functionality

Parallel Execution: Solvers run concurrently across multiple threads to speed up benchmarking.
CPU Utilization: The function targets multi-core CPUs, and users should set the number of threads based on their machine’s capabilities. If the number of threads is not specified, Julia starts with a single thread and the function runs sequentially, similar to bmark_solvers().
To specify the number of threads, set the JULIA_NUM_THREADS environment variable before starting Julia:

export JULIA_NUM_THREADS=4  # for 4 threads

Then, within the session, verify the number of threads with:


Threads.nthreads()

Questions and Considerations

The following are my concerns and questions:

Examples for Documentation:
Examples should include a comparison of single-threaded versus multi-threaded runs, showing how to use bmark_solvers_parallel with different thread configurations.
Consider an example with varying numbers of solvers and problems to illustrate performance benefits.

Combining bmark_solvers_parallel and bmark_solvers:
Combining both functions could streamline the API. Adding a threads=4 default argument (with threads=1 for sequential execution) would allow users to control threading from within the function, reducing the need for environment variable configuration and simplifying usage.
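A minimal sketch of what a merged entry point could look like. The names `run_all` and `parallel` are hypothetical and not part of SolverBenchmark. As far as I know, Julia fixes the number of threads at startup, so a keyword can only choose whether to use the threads that are available, not create new ones; that is why the sketch uses a boolean switch rather than a thread count.

```julia
# Hypothetical sketch: `run_all` and the `parallel` keyword are illustrative names only.
function run_all(work::Function, solvers::AbstractDict; parallel::Bool = Threads.nthreads() > 1)
  solver_names = collect(keys(solvers))            # indexable, fixed order
  results = Vector{Any}(undef, length(solver_names))
  if parallel
    # One iteration per solver, spread over the threads Julia was started with
    Threads.@threads for i in eachindex(solver_names)
      results[i] = work(solver_names[i], solvers[solver_names[i]])
    end
  else
    # Sequential path: same loop body, no threading
    for i in eachindex(solver_names)
      results[i] = work(solver_names[i], solvers[solver_names[i]])
    end
  end
  return Dict(solver_names[i] => results[i] for i in eachindex(solver_names))
end

# Usage with toy "solvers" (plain functions standing in for real solvers)
toy_solvers = Dict(:a => x -> x + 1, :b => x -> 2x)
seq = run_all((name, s) -> s(3), toy_solvers; parallel = false)
par = run_all((name, s) -> s(3), toy_solvers; parallel = true)
```

Both paths share the same loop body, so the sequential and threaded results should agree as long as `work` is thread-safe.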

Parallelizing Problems vs. Solvers:
For now, parallelizing over the solvers is a good first step. In the future, should we consider parallelizing over both solvers and problems to maximize performance on systems with many cores?

Multi-CPU or Single Multi-Core CPU:
This implementation is best suited to single CPUs with multiple cores (e.g., 4-8 cores typical for desktop CPUs). Supporting multi-CPU systems would require additional architecture-specific optimizations and may be better handled as a future extension if user demand and workload complexity increase.

@farhadrclass farhadrclass self-assigned this Oct 30, 2024
@farhadrclass farhadrclass requested review from dpo and tmigot October 30, 2024 16:16
@MaxenceGollier
Contributor

Thank you @farhadrclass, this will be very helpful for me.
As already mentioned by @dpo, wouldn't it be easier to just modify bmark_solvers?
I had this in mind:
Replace

for (name, solver) in solvers

with

Threads.@threads for (name, solver) in solvers

We could also replace

for (id, problem) in enumerate(problems)

with

Threads.@threads for (id, problem) in enumerate(problems) 

The second option might require a little more care with stats, because each thread pushes into it, but it may be preferable because each solver would still be printed out sequentially.
In any case, thank you again. I hope this will be merged soon!
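A minimal sketch of the first option (threading over solvers) with the shared result container protected by a lock. `bmark_sketch` and `fake_solve` are hypothetical stand-ins, not the SolverBenchmark implementation. As far as I know, `Threads.@threads` needs a collection it can index, so the sketch collects the `Dict` into a vector of pairs before looping.

```julia
# Sketch only: `fake_solve` stands in for the real per-solver work.
fake_solve(solver, problems) = [solver(p) for p in problems]

function bmark_sketch(solvers::AbstractDict, problems)
  solver_list = collect(solvers)            # Vector of name => solver pairs, indexable
  stats = Dict{Symbol, Any}()
  stats_lock = ReentrantLock()
  Threads.@threads for i in eachindex(solver_list)
    name, solver = solver_list[i]
    result = fake_solve(solver, problems)   # this part runs concurrently
    lock(stats_lock) do                     # writes to the shared Dict are serialized
      stats[name] = result
    end
  end
  return stats
end

# Usage with toy solvers and integer "problems"
toy_stats = bmark_sketch(Dict(:sq => x -> x^2, :neg => x -> -x), 1:4)
```

The lock only covers the write into `stats`, so the solver runs themselves still overlap.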

@farhadrclass
Contributor Author

@MaxenceGollier
Thanks for your feedback. Yes, the idea was to include them both in there.
I would suggest parallelizing over the solvers (rather than over the problems, or both). The issue with the problems is that there are many of them, and writing to stats from several threads would be problematic. In terms of time, it is actually the same whether we parallelize over problems or solvers. Doing both might be more challenging, since we would need to coordinate when each one finishes and how we write to stats.
I will push the code soon for review.
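For reference, one common way to sidestep the concurrent-write concern when threading over problems is to preallocate one result slot per problem, so that no thread ever pushes into a shared container. This is only a sketch under that assumption; `solve_all_sketch` and the row layout are hypothetical and do not reflect how SolverBenchmark builds its stats.

```julia
# Sketch only: each iteration writes to its own preallocated slot, so no lock is needed.
function solve_all_sketch(solver, problems)
  problem_vec = collect(problems)     # materialize so every problem has a fixed index
  rows = Vector{NamedTuple{(:id, :value), Tuple{Int, Float64}}}(undef, length(problem_vec))
  Threads.@threads for id in eachindex(problem_vec)
    value = Float64(solver(problem_vec[id]))   # placeholder for the actual solve
    rows[id] = (id = id, value = value)        # distinct slot per problem
  end
  return rows                                  # already in problem order
end

# Usage with a toy solver and integer "problems"
toy_rows = solve_all_sketch(x -> x^2, 1:5)
```

Because the slot index is the problem id, the rows come back in problem order regardless of which thread finished first.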

@farhadrclass
Contributor Author

@dpo and @MaxenceGollier the code has been updated

@farhadrclass
Contributor Author

@dpo the checks keep failing and say that the cons_nln method cannot be found. I'm not sure why.

@dpo
Member

dpo commented Nov 12, 2024

I don't see that error. Where is it?

You can drop Julia 1.6 and replace it with 1.10.

Co-authored-by: Dominique <dominique.orban@gmail.com>
@farhadrclass
Contributor Author

I don't see that error. Where is it?

You can drop Julia 1.6 and replace it with 1.10.

The issue is with the nightly build.

@farhadrclass
Contributor Author

I don't see that error. Where is it?

You can drop Julia 1.6 and replace it with 1.10.

Do you want me to do it in this PR or create a new one?

@tmigot
Member

tmigot commented Nov 12, 2024

Hi guys! I tried to explain in #172 why the examples fail.

@MaxenceGollier
Contributor

@farhadrclass
As I was expecting, this seems to fail when benchmarking with CUTEst:

using SolverCore
using CUTEst

import SolverCore.dummy_solver

function test_cutest()
  problem_names = CUTEst.select(min_con=1, max_con = 3, max_var= 3, only_equ_con=true, only_free_var=true)[1:5]
  problem_list = (CUTEstModel(name) for name in problem_names)
  solvers = Dict(:dummy_1 => dummy_solver, :dummy_2 => dummy_solver)
  stats = bmark_solvers(solvers, problem_list)
end

test_cutest()

produces the following exceptions:

[ Info: running solver dummy_2
[ Info: running solver dummy_1
At line 398 of file ../src/decode/sifdecoder_standalone.f90 (unit = 57)
Fortran runtime error: File cannot be deleted

Error termination. Backtrace:

Could not print backtrace: libbacktrace could not find executable to open
#0  0x8562adda
#1  0x855b0d51
#2  0x854d47db
#3  0x85597743
#4  0x625310ff
#5  0x4016e4
#6  0x40310f
#7  0x4012f2
#8  0x401405
#9  0xc88e259c
#10  0xc8aeaf37
#11  0xffffffff

gfortran.exe: error: ELFUN.o: No such file or directory
ERROR: LoadError: TaskFailedException

    nested task error: failed process: Process(`'C:\Users\mgoll\.julia\artifacts\fdff308295487f361ef6e8dc2d27f5abe8a6eee9\mingw64\bin\gfortran.exe' -shared -o libWAYSEA1NE_double.dll ELFUN.o -Wl,--whole-archive 'C:\Users\mgoll\.julia\artifacts\eacd3c5d669a4b2c1bd3b7aad671b55cd097cd99\lib\libcutest_double.a' -Wl,--no-whole-archive`, ProcessExited(1)) [1]

    Stacktrace:
      [1] pipeline_error
        @ .\process.jl:598 [inlined]
      [2] run(::Cmd; wait::Bool)
        @ Base .\process.jl:513
      [3] run
        @ .\process.jl:510 [inlined]
      [4] (::CUTEst.var"#9#10"{String})()
        @ CUTEst C:\Users\mgoll\.julia\packages\CUTEst\Eiaqh\src\sifdecoder.jl:218
      [5] cd(f::CUTEst.var"#9#10"{String}, dir::String)
        @ Base.Filesystem .\file.jl:101
      [6] build_libsif(name::String; precision::Symbol, libsif_folder::String)
        @ CUTEst C:\Users\mgoll\.julia\packages\CUTEst\Eiaqh\src\sifdecoder.jl:175
      [7] (::CUTEst.var"#3#4"{Float64, Bool, Bool, Tuple{}, Base.RefValue{Int32}, Base.RefValue{Int32}, String, String, Symbol})()
        @ CUTEst C:\Users\mgoll\.julia\packages\CUTEst\Eiaqh\src\model.jl:131
      [8] cd(f::CUTEst.var"#3#4"{Float64, Bool, Bool, Tuple{}, Base.RefValue{Int32}, Base.RefValue{Int32}, String, String, Symbol}, dir::String)
        @ Base.Filesystem .\file.jl:101
      [9] CUTEstModel{Float64}(::String; decode::Bool, verbose::Bool, efirst::Bool, lfirst::Bool, lvfirst::Bool)
        @ CUTEst C:\Users\mgoll\.julia\packages\CUTEst\Eiaqh\src\model.jl:124
     [10] CUTEstModel(::String; precision::Symbol, decode::Bool, verbose::Bool, efirst::Bool, lfirst::Bool, lvfirst::Bool)
        @ CUTEst C:\Users\mgoll\.julia\packages\CUTEst\Eiaqh\src\model.jl:93
     [11] CUTEstModel
        @ C:\Users\mgoll\.julia\packages\CUTEst\Eiaqh\src\model.jl:74 [inlined]
     [12] #1
        @ .\none:0 [inlined]
     [13] iterate
        @ .\generator.jl:48 [inlined]
     [14] iterate
        @ .\iterators.jl:206 [inlined]
     [15] iterate
        @ .\iterators.jl:205 [inlined]
     [16] solve_problems(solver::typeof(dummy_solver), solver_name::Symbol, problems::Base.Generator{Vector{String}, var"#1#2"}; solver_logger::Base.CoreLogging.NullLogger, reset_problem::Bool, skipif::SolverBenchmark.var"#39#47", colstats::Vector{Symbol}, info_hdr_override::Dict{Symbol, String}, prune::Bool, kwargs::@Kwargs{})
        @ SolverBenchmark C:\Users\mgoll\SolverBenchmark.jl\src\run_solver.jl:96
     [17] solve_problems(solver::Function, solver_name::Symbol, problems::Base.Generator{Vector{String}, var"#1#2"})
        @ SolverBenchmark C:\Users\mgoll\SolverBenchmark.jl\src\run_solver.jl:30
     [18] macro expansion
        @ C:\Users\mgoll\SolverBenchmark.jl\src\bmark_solvers.jl:24 [inlined]
     [19] (::SolverBenchmark.var"#110#threadsfor_fun#51"{SolverBenchmark.var"#110#threadsfor_fun#50#52"{@Kwargs{}, Tuple{Base.Generator{Vector{String}, var"#1#2"}}, Dict{Symbol, DataFrame}, Vector{Pair{Symbol, typeof(dummy_solver)}}}})(tid::Int64; onethread::Bool)
        @ SolverBenchmark .\threadingconstructs.jl:252
     [20] #110#threadsfor_fun
        @ .\threadingconstructs.jl:219 [inlined]
     [21] (::Base.Threads.var"#1#2"{SolverBenchmark.var"#110#threadsfor_fun#51"{SolverBenchmark.var"#110#threadsfor_fun#50#52"{@Kwargs{}, Tuple{Base.Generator{Vector{String}, var"#1#2"}}, Dict{Symbol, DataFrame}, Vector{Pair{Symbol, typeof(dummy_solver)}}}}, Int64})()
        @ Base.Threads .\threadingconstructs.jl:154

...and 1 more exception.

Stacktrace:
  [1] threading_run(fun::SolverBenchmark.var"#110#threadsfor_fun#51"{SolverBenchmark.var"#110#threadsfor_fun#50#52"{@Kwargs{}, Tuple{Base.Generator{Vector{String}, var"#1#2"}}, Dict{Symbol, DataFrame}, Vector{Pair{Symbol, typeof(dummy_solver)}}}}, static::Bool)
    @ Base.Threads .\threadingconstructs.jl:172
  [2] macro expansion
    @ .\threadingconstructs.jl:189 [inlined]
  [3] #bmark_solvers#49
    @ C:\Users\mgoll\SolverBenchmark.jl\src\bmark_solvers.jl:22 [inlined]
  [4] bmark_solvers
    @ C:\Users\mgoll\SolverBenchmark.jl\src\bmark_solvers.jl:20 [inlined]
  [5] test_cutest()
    @ Main C:\Users\mgoll\SolverBenchmark.jl\test\test_cutest.jl:10
  [6] top-level scope
    @ C:\Users\mgoll\SolverBenchmark.jl\test\test_cutest.jl:13
  [7] include(fname::String)
    @ Main .\sysimg.jl:38
  [8] top-level scope
    @ C:\Users\mgoll\SolverBenchmark.jl\test\runtests.jl:25
  [9] include(fname::String)
    @ Main .\sysimg.jl:38
 [10] top-level scope
    @ none:6
in expression starting at C:\Users\mgoll\SolverBenchmark.jl\test\test_cutest.jl:13
in expression starting at C:\Users\mgoll\SolverBenchmark.jl\test\runtests.jl:25
ERROR: Package SolverBenchmark errored during testing

I am trying to investigate what is going on here. Could you check whether the error reproduces on your machine?
For info:

julia> versioninfo()
Julia Version 1.11.1
Commit 8f5b7ca12a (2024-10-16 10:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 12 × Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, skylake)
Threads: 4 default, 0 interactive, 2 GC (on 12 virtual cores)

@farhadrclass
Contributor Author

@MaxenceGollier I ran the test on my machine and it passed. However, some of the solvers were skipped because of some try/catch logic, with a message along the lines of "the Hessian is not available".
Let me know if you need anything on my side. I will have some time to look at it more over the weekend.

@MaxenceGollier
Contributor

MaxenceGollier commented Nov 13, 2024

however some of the solvers were skipped due to some catch and expect logic, something along the line of the Hessian is not available.

Yes, all dummy solvers will fail on CUTEst models; this is normal.

@MaxenceGollier I ran the test on my machine, it passed

This is weird. My initial guess was that CUTEst had issues with different models being used at once (i.e., decoding multiple SIFs at once). I guess this is not the case if it passes on your machine. Could you send me your versioninfo?
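The stack trace above does show the model constructor being reached from inside the threaded loop, through the lazy generator. If the concurrent-decoding guess turned out to be right, one way to test it would be to serialize construction behind a lock while leaving the solves threaded. The sketch below only illustrates that pattern: `build_model`, `guarded_build`, and `solve_named` are hypothetical, and `build_model` is a toy stand-in for a constructor that is not thread-safe, not a CUTEst call.

```julia
# Hypothetical stand-in for an expensive constructor that must not run concurrently
build_model(name::String) = (name = name, nvar = length(name))

# Only one task at a time may run the constructor
function guarded_build(name::String, build_lock::ReentrantLock)
  lock(build_lock) do
    build_model(name)
  end
end

function solve_named(problem_names::Vector{String}; build_lock = ReentrantLock())
  out = Vector{Int}(undef, length(problem_names))
  Threads.@threads for i in eachindex(problem_names)
    model = guarded_build(problem_names[i], build_lock)  # serialized section
    out[i] = model.nvar                                  # placeholder for the threaded solve
  end
  return out
end

# Usage with made-up problem names
toy_out = solve_named(["AAA", "BBBBB", "C"])
```

If the failure disappeared with construction serialized, that would support the guess; if not, the cause is probably elsewhere.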

@farhadrclass
Contributor Author

@MaxenceGollier sorry for the delay, I didn't see the notification.
Here is my Julia version (I need to update, but I need to finalize some experiments first :D ):

julia> versioninfo()
Julia Version 1.9.4
Commit 8e5136fa29 (2023-11-14 08:46 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: 8 × Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, skylake)
  Threads: 4 on 8 virtual cores
Environment:
  JULIA_EDITOR = code
  JULIA_NUM_THREADS = 4

@dpo
Member

dpo commented Jan 21, 2025

@farhadrclass Please remove merge commits from here and rebase your branch to use Julia 1.10 on GitHub CI.
