Add some precompiles to help loading time #58436

Merged

@IanButterworth commented May 16, 2025

A tiny PR that shaves about 14% off Plots load time (1.11 s -> 0.95 s in the benchmarks below) by adding precompile statements for methods that the JLL __init__ functions hit.
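For readers unfamiliar with precompile statements, here is a minimal sketch of what one looks like. The signatures are illustrative: the first is copied from the trace output further down in this description, the second is a made-up example, and neither is taken from the PR's diff.

```julia
# A precompile statement asks Julia to compile a method for one concrete
# signature ahead of time, so the first real call does not pay for it.

# Tuple-type form, the same shape that trace-compile output prints.
ok_iterate = precompile(Tuple{typeof(Base.iterate), Vector{Vector{String}}})

# Function-plus-argument-types form; equivalent, often easier to read.
# (String, String) is an illustrative signature, not one from this PR.
ok_joinpath = precompile(joinpath, (String, String))

# `precompile` returns a Bool saying whether the request succeeded.
println("iterate precompiled:  ", ok_iterate)
println("joinpath precompiled: ", ok_joinpath)
```

When statements like these are executed while a system image or package image is being built, the compiled code is cached, so nothing needs compiling for those signatures at load time.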

I thought I'd take the opportunity to benchmark on an M2 Pro MacBook:

1.10.9

julia> @time Pkg.precompile("Plots")
Precompiling Plots finished.
  146 dependencies successfully precompiled in 45 seconds. 6 already precompiled.
 45.247241 seconds (2.59 M allocations: 196.504 MiB, 0.06% gc time, 1.81% compilation time)

julia> @time using Plots
  0.999457 seconds (1.01 M allocations: 67.811 MiB, 1.81% compilation time)

julia> @time display(plot(rand(3), rand(3)))
  0.667132 seconds (96.95 k allocations: 6.709 MiB, 20.10% compilation time: 5% of which was recompilation)

1.11.5

julia> @time Pkg.precompile("Plots")
Precompiling Plots...
  147 dependencies successfully precompiled in 53 seconds. 35 already precompiled.
 52.805236 seconds (2.44 M allocations: 167.030 MiB, 0.07% gc time, 3 lock conflicts, 0.85% compilation time: 56% of which was recompilation)

julia> @time using Plots
  1.170431 seconds (1.22 M allocations: 79.566 MiB, 3.49% gc time, 2.23% compilation time: 43% of which was recompilation)

julia> @time display(plot(rand(3), rand(3)))
  0.384659 seconds (204.01 k allocations: 10.608 MiB, 42.04% compilation time: 13% of which was recompilation)

1.12.0-beta3

julia> @time Pkg.precompile("Plots")
Precompiling Plots finished.
  147 dependencies successfully precompiled in 56 seconds. 36 already precompiled.
 55.739992 seconds (2.01 M allocations: 160.495 MiB, 0.26% gc time, 5 lock conflicts, 0.39% compilation time: 66% of which was recompilation)

julia> @time using Plots
  1.246677 seconds (1.88 M allocations: 105.135 MiB, 3.32% gc time, 4.70% compilation time)

julia> @time display(plot(rand(3), rand(3)))
  0.376974 seconds (208.82 k allocations: 10.938 MiB, 38.16% compilation time: 18% of which was recompilation)

Master 43ead47

julia> @time Pkg.precompile("Plots")
Precompiling Plots finished.
  147 dependencies successfully precompiled in 59 seconds. 35 already precompiled.
 59.410907 seconds (1.92 M allocations: 155.733 MiB, 0.24% gc time, 5 lock conflicts, 0.40% compilation time: 49% of which was recompilation)

julia> @time using Plots
  1.108766 seconds (1.54 M allocations: 87.170 MiB, 11.34% gc time, 6.27% compilation time)

julia> @time display(plot(rand(3), rand(3)))
  0.403667 seconds (202.36 k allocations: 10.240 MiB, 44.11% compilation time: 16% of which was recompilation)

PR

julia> @time Pkg.precompile("Plots")
Precompiling Plots finished.
  147 dependencies successfully precompiled in 58 seconds. 35 already precompiled.
 58.074903 seconds (1.91 M allocations: 155.245 MiB, 0.21% gc time, 3 lock conflicts, 0.33% compilation time: 61% of which was recompilation)

julia> @time using Plots
  0.951054 seconds (1.23 M allocations: 71.972 MiB, 4.11% gc time)

julia> @time display(plot(rand(3), rand(3)))
  0.404166 seconds (202.13 k allocations: 10.213 MiB, 43.13% compilation time: 16% of which was recompilation)
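To put the load-time numbers side by side, the following sketch computes how much faster the PR's `@time using Plots` is than each earlier build. The timings are copied from the transcripts above; the helper name `percent_saved` is only for illustration, and each value comes from a single run, so small differences should be read with care.

```julia
# `@time using Plots` wall times in seconds, copied from the transcripts above.
load_times = [
    "1.10.9"         => 0.999457,
    "1.11.5"         => 1.170431,
    "1.12.0-beta3"   => 1.246677,
    "master 43ead47" => 1.108766,
    "PR"             => 0.951054,
]

# Relative saving of `new` compared with `old`, as a percentage of `old`.
percent_saved(old, new) = 100 * (old - new) / old

pr_time = last(load_times).second

# Compare every non-PR build against the PR build.
for (label, seconds) in load_times[1:end-1]
    saving = round(percent_saved(seconds, pr_time); digits = 1)
    println(rpad(label, 16), " -> PR saves ", saving, " %")
end
```

Against master this works out to a saving of roughly 14%, which is the headline figure for this PR.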

And with this PR, almost all compilation during loading is gone:

julia> @trace_compile @time_imports using Plots
#=    4.6 ms =# precompile(Tuple{typeof(Base.vect), Array{String, 1}, Vararg{Array{String, 1}}})
#=    3.3 ms =# precompile(Tuple{typeof(Base.iterate), Array{Array{String, 1}, 1}})
      1.4 ms  Statistics
               ┌ 93.5 ms SuiteSparse_jll.__init__()
     94.6 ms  SuiteSparse_jll
      1.6 ms  Serialization
               ┌ 1.2 ms SparseArrays.CHOLMOD.__init__()
    123.1 ms  SparseArrays
#=    7.5 ms =# precompile(Tuple{typeof(Base.Filesystem.joinpath), NTuple{7, String}})
      1.8 ms  Statistics → SparseArraysExt
      1.5 ms  Preferences
      1.0 ms  PrecompileTools
      1.0 ms  Reexport
      1.3 ms  Scratch
      1.4 ms  RelocatableFolders
      3.8 ms  RecipesBase
     14.9 ms  FixedPointNumbers
               ┌ 0.0 ms ColorTypes.__init__()
     15.7 ms  ColorTypes
     24.3 ms  Colors
      2.2 ms  TensorCore
               ┌ 0.0 ms ColorVectorSpace.__init__()
     22.0 ms  ColorVectorSpace
      4.4 ms  ColorSchemes
      1.7 ms  StableRNGs
     39.1 ms  PlotUtils
      5.2 ms  PlotThemes
               ┌ 2.9 ms OpenLibm_jll.__init__()
      4.3 ms  OpenLibm_jll
      1.4 ms  NaNMath
     10.7 ms  RecipesPipeline
               ┌ 0.0 ms Requires.__init__()
      1.6 ms  Requires
      2.0 ms  UnicodeFun
      1.2 ms  ColorTypes → StyledStringsExt
      1.6 ms  DataAPI
      1.3 ms  Compat
      1.2 ms  Compat → CompatLinearAlgebraExt
      4.7 ms  OrderedCollections
     26.2 ms  DataStructures
      1.7 ms  SortingAlgorithms
      4.1 ms  Missings
               ┌ 0.0 ms DocStringExtensions.__init__()
      1.9 ms  DocStringExtensions
      6.5 ms  IrrationalConstants
      1.6 ms  LogExpFunctions
      1.6 ms  StatsAPI
      1.5 ms  PtrArrays
      2.2 ms  AliasTables
     10.2 ms  StatsBase
      1.5 ms  Showoff
      3.1 ms  Unzip
      1.5 ms  JLLWrappers
#=    0.0 ms =# precompile(Tuple{typeof(JLLWrappers.get_julia_libpaths)})
               ┌ 0.1 ms fzf_jll.__init__()
      1.5 ms  fzf_jll
      1.4 ms  JLFzf
      1.8 ms  Mmap
     11.9 ms  Parsers
      3.5 ms  JSON
      3.5 ms  Measures
               ┌ 2.2 ms Bzip2_jll.__init__()
      3.8 ms  Bzip2_jll
#=    0.0 ms =# precompile(Tuple{typeof(Bzip2_jll.eager_mode)})
               ┌ 2.2 ms FreeType2_jll.__init__()
      3.9 ms  FreeType2_jll
               ┌ 2.1 ms FriBidi_jll.__init__()
      3.7 ms  FriBidi_jll
               ┌ 4.1 ms Libiconv_jll.__init__()
      5.8 ms  Libiconv_jll
               ┌ 2.1 ms Libffi_jll.__init__()
      3.8 ms  Libffi_jll
#=    0.0 ms =# precompile(Tuple{typeof(Libiconv_jll.eager_mode)})
               ┌ 2.9 ms XML2_jll.__init__()
      4.7 ms  XML2_jll
#=    0.0 ms =# precompile(Tuple{typeof(XML2_jll.eager_mode)})
               ┌ 3.4 ms Gettext_jll.__init__()
      5.2 ms  Gettext_jll
               ┌ 4.4 ms PCRE2_jll.__init__()
      6.1 ms  PCRE2_jll
#=    0.0 ms =# precompile(Tuple{typeof(Libffi_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(Gettext_jll.eager_mode)})
               ┌ 12.5 ms Glib_jll.__init__()
     14.3 ms  Glib_jll
               ┌ 3.8 ms LLVMOpenMP_jll.__init__()
      5.7 ms  LLVMOpenMP_jll
#=    0.0 ms =# precompile(Tuple{typeof(LLVMOpenMP_jll.eager_mode)})
               ┌ 2.8 ms Pixman_jll.__init__()
      4.7 ms  Pixman_jll
               ┌ 2.5 ms libpng_jll.__init__()
      4.3 ms  libpng_jll
      1.8 ms  Libuuid_jll
               ┌ 2.4 ms Expat_jll.__init__()
      4.4 ms  Expat_jll
#=    0.0 ms =# precompile(Tuple{typeof(FreeType2_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(Expat_jll.eager_mode)})
               ┌ 2.4 ms Fontconfig_jll.__init__()
      4.5 ms  Fontconfig_jll
#=    0.0 ms =# precompile(Tuple{typeof(Glib_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(Pixman_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(libpng_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(Fontconfig_jll.eager_mode)})
               ┌ 7.5 ms Cairo_jll.__init__()
      9.5 ms  Cairo_jll
               ┌ 2.7 ms Graphite2_jll.__init__()
      4.7 ms  Graphite2_jll
#=    0.0 ms =# precompile(Tuple{typeof(Cairo_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(Graphite2_jll.eager_mode)})
               ┌ 7.8 ms HarfBuzz_jll.__init__()
      9.9 ms  HarfBuzz_jll
#=    0.0 ms =# precompile(Tuple{typeof(FriBidi_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(HarfBuzz_jll.eager_mode)})
               ┌ 2.9 ms libass_jll.__init__()
      5.0 ms  libass_jll
               ┌ 2.7 ms libfdk_aac_jll.__init__()
      4.8 ms  libfdk_aac_jll
               ┌ 2.9 ms LAME_jll.__init__()
      4.9 ms  LAME_jll
               ┌ 2.2 ms Ogg_jll.__init__()
      4.4 ms  Ogg_jll
#=    0.0 ms =# precompile(Tuple{typeof(Ogg_jll.eager_mode)})
               ┌ 8.0 ms libvorbis_jll.__init__()
     10.3 ms  libvorbis_jll
               ┌ 3.0 ms libaom_jll.__init__()
      5.3 ms  libaom_jll
               ┌ 2.9 ms x264_jll.__init__()
      5.1 ms  x264_jll
               ┌ 3.7 ms x265_jll.__init__()
      5.9 ms  x265_jll
               ┌ 2.9 ms Opus_jll.__init__()
      5.3 ms  Opus_jll
#=    0.0 ms =# precompile(Tuple{typeof(libass_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(libfdk_aac_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(LAME_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(libvorbis_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(libaom_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(x264_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(x265_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(Opus_jll.eager_mode)})
               ┌ 17.1 ms FFMPEG_jll.__init__()
     19.8 ms  FFMPEG_jll
      2.3 ms  FFMPEG
               ┌ 3.0 ms GLFW_jll.__init__()
      5.5 ms  GLFW_jll
               ┌ 6.2 ms JpegTurbo_jll.__init__()
      8.9 ms  JpegTurbo_jll
               ┌ 3.2 ms LERC_jll.__init__()
      5.8 ms  LERC_jll
               ┌ 2.9 ms XZ_jll.__init__()
      5.6 ms  XZ_jll
               ┌ 3.1 ms Zstd_jll.__init__()
      5.6 ms  Zstd_jll
#=    0.0 ms =# precompile(Tuple{typeof(JpegTurbo_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(LERC_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(XZ_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(Zstd_jll.eager_mode)})
               ┌ 3.2 ms Libtiff_jll.__init__()
      5.7 ms  Libtiff_jll
      2.5 ms  libinput_jll
      2.5 ms  Xorg_libXext_jll
      2.7 ms  Xorg_libxcb_jll
      2.5 ms  Xorg_xcb_util_wm_jll
      2.5 ms  Xorg_xcb_util_cursor_jll
      2.6 ms  Xorg_xcb_util_image_jll
      4.2 ms  Xorg_xcb_util_keysyms_jll
      2.7 ms  Xorg_xcb_util_renderutil_jll
      2.6 ms  Xorg_libXrender_jll
      2.7 ms  Xorg_libSM_jll
      2.6 ms  xkbcommon_jll
      2.7 ms  Libglvnd_jll
               ┌ 3.6 ms Vulkan_Loader_jll.__init__()
      6.4 ms  Vulkan_Loader_jll
#=    0.0 ms =# precompile(Tuple{typeof(Vulkan_Loader_jll.eager_mode)})
               ┌ 58.1 ms Qt6Base_jll.__init__()
     61.1 ms  Qt6Base_jll
#=    0.0 ms =# precompile(Tuple{typeof(FFMPEG_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(GLFW_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(Libtiff_jll.eager_mode)})
#=    0.0 ms =# precompile(Tuple{typeof(Qt6Base_jll.eager_mode)})
               ┌ 0.1 ms GR_jll.__init__()
      3.4 ms  GR_jll
               ┌ 0.1 ms GR.GRPreferences.__init__()
      9.3 ms  GR
               ┌ 0.5 ms Plots.__init__()
    339.5 ms  Plots
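With compilation mostly gone, the remaining load time is dominated by a few `__init__` calls. As a rough sketch of how to rank them, the snippet below parses a short excerpt of the output above with a regular expression. The excerpt contains only four entries copied from the trace, and the pattern assumes the `┌ <time> ms <Module>.__init__()` layout shown here, which is display output rather than a stable format.

```julia
# Four `__init__` entries copied from the trace above (excerpt only).
trace_excerpt = """
               ┌ 93.5 ms SuiteSparse_jll.__init__()
     94.6 ms  SuiteSparse_jll
               ┌ 58.1 ms Qt6Base_jll.__init__()
     61.1 ms  Qt6Base_jll
               ┌ 17.1 ms FFMPEG_jll.__init__()
     19.8 ms  FFMPEG_jll
               ┌ 12.5 ms Glib_jll.__init__()
     14.3 ms  Glib_jll
"""

# Capture the time in ms and the module name of each `__init__` line.
init_pattern = r"┌\s+([\d.]+) ms (\S+)\.__init__\(\)"

inits = [
    (String(m.captures[2]), parse(Float64, m.captures[1]))
    for m in eachmatch(init_pattern, trace_excerpt)
]

# Slowest initializers first.
sort!(inits; by = last, rev = true)
for (name, ms) in inits
    println(rpad(name, 18), ms, " ms")
end

total_ms = sum(last, inits)
println("total __init__ time in excerpt: ", round(total_ms; digits = 1), " ms")
```

Running the same pattern over the full output would give the complete ranking; in this excerpt SuiteSparse_jll and Qt6Base_jll account for most of the time.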

Notes

Linux is a bit faster than macOS on this M2. I think the difference is down to #58409, which #58405 works to avoid. For example, if I remove only the dlpath calls in SuiteSparse_jll, I get:

julia> @time using Plots
  0.866295 seconds (1.22 M allocations: 71.202 MiB, 4.40% gc time)
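To get a feel for what a single dlpath call costs on a given platform, a sketch like the following can be run. It is not the code in SuiteSparse_jll: it looks up an already loaded libjulia library via `Libdl.dllist()` purely as a stand-in for the libraries a JLL opens, and a one-shot `@elapsed` is only a crude measurement.

```julia
using Libdl

# Pick a library that is certainly loaded in this session. Any entry whose
# file name contains "libjulia" will do; this is an illustrative stand-in.
libjulia_path = first(filter(p -> occursin("libjulia", basename(p)), Libdl.dllist()))

handle = Libdl.dlopen(libjulia_path)

# Warm-up call so that compilation is not included in the timing.
path = Libdl.dlpath(handle)

# Time one call that resolves the handle back to a file path.
elapsed = @elapsed Libdl.dlpath(handle)

println("dlpath -> ", path)
println("one dlpath call took ", round(elapsed * 1e6; digits = 1), " µs")

Libdl.dlclose(handle)
```

Comparing the printed time on macOS and Linux gives a rough idea of whether these calls are a plausible contributor to the gap described above.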

Also, I think we should change how these zero-time trace-compile entries are printed so that they indicate more clearly that, AFAIU, the method is being inferred rather than compiled, e.g. #= 0.0 ms =# precompile(Tuple{typeof(FFMPEG_jll.eager_mode)})

@IanButterworth added the latency and backport 1.12 (change should be backported to release-1.12) labels May 16, 2025
@IanButterworth merged commit fc456bd into JuliaLang:master May 16, 2025
11 of 13 checks passed
@IanButterworth deleted the ib/loading_precompiles branch May 17, 2025 03:43
KristofferC pushed a commit that referenced this pull request May 20, 2025
@KristofferC mentioned this pull request May 20, 2025
39 tasks