
[WIP] Design of pipeline to save results from ML inversion #317


Draft · wants to merge 4 commits into main

Conversation

facusapienza21 (Member) commented:

@albangossard as I am running simulations for the inversion of the diffusivity D, I thought it would be useful to figure out how to save the relevant outputs of a simulation so they can be analyzed later. The most important of these is the set of trained parameters of the neural network, so the model can be evaluated after the fact. This PR tries to address that.

Notice that Sleipnir also has a save-results function, but it is rather different in nature: it saves the information of a forward SIA simulation rather than the information of the model. What do you think?

This PR is a work in progress, so it would be nice to think (assuming we approve the API design) about what else we would like to save. Notice that the architecture of the NN will be harder to save, so maybe we want to think about that too.

So far, this PR has:

  • Design of a simple Results object
  • Saving of that object at the end of the simulation in JLD2 format (see the sketch below)
  • A test asserting that the file stores the information correctly
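
For concreteness, a minimal sketch of what saving and reloading such an object with JLD2 could look like; ToyResult and all values below are hypothetical stand-ins, not the actual Result struct from this PR:

using JLD2
using ComponentArrays

# Hypothetical stand-in for the Result struct introduced in this PR
struct ToyResult
    θ::ComponentVector
    losses::Vector{Float64}
end

θ = ComponentVector(layer1 = randn(3), layer2 = randn(2))
res = ToyResult(θ, [10.0, 3.2, 1.1])

# Save the object at the end of the simulation...
jldsave("result.jld2"; result = res)

# ...and reload it later for post-hoc evaluation.
# Note: the struct definition must be in scope when the file is loaded.
loaded = load("result.jld2", "result")
@assert loaded.losses == res.losses

This also relates to the point above about the NN architecture: JLD2 serializes the weights fine, but reconstructing the model requires the same type and architecture definitions to be available at load time, which is why saving the architecture itself is harder.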

Any feedback welcome!

albangossard (Member) left a comment:


Thank you @facusapienza21, that's a good start for logging information about the training!
I'm wondering if we should use https://github.com/JuliaLogging/TensorBoardLogger.jl to log information like the training histories. That would make our life easier.
That's not incompatible with the Results struct that you created; it's complementary. We could use TensorBoard to log the stats and loss histories, and use Results to collect the weights and the parameters in order to run predictions after the training.
What is your opinion on this?
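
A rough sketch of what that could look like with TensorBoardLogger.jl; the run directory, tag names, and per-epoch values below are made up for illustration:

using TensorBoardLogger, Logging

# Write scalar training metrics to a TensorBoard run directory
lg = TBLogger("tb_logs/inversion", tb_overwrite)

for epoch in 1:100
    loss = 1.0 / epoch               # placeholder for the real training loss
    t_epoch = 0.5 + 0.1 * rand()     # placeholder for the measured time per epoch
    log_value(lg, "train/loss", loss; step = epoch)
    log_value(lg, "train/time_per_epoch", t_epoch; step = epoch)
end

# Inspect with: tensorboard --logdir tb_logs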


# Create path for simulation results
if isnothing(path)
    simulation_path = joinpath(dirname(Base.current_project()), "data/results/simulation")
albangossard (Member):


Maybe use something more precise than "simulation", since it could be either an inversion or a prediction? Currently, ODINN already generates a "data/results/predictions" folder.
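
For illustration only (nothing here is in the PR), the folder name could be derived from the kind of simulation:

# Name the results folder after the kind of simulation instead of a generic "simulation"
function default_results_path(kind::Symbol)
    @assert kind in (:inversion, :prediction)
    return joinpath(dirname(Base.current_project()), "data", "results", String(kind))
end

simulation_path = default_results_path(:inversion)  # -> ".../data/results/inversion"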

@kwdef struct Result{F <: AbstractFloat} <: AbstractResult
    θ::ComponentVector
    losses::Vector{F}
    params::Sleipnir.Parameters
albangossard (Member):


We could add some stats about the performance, like the time per epoch; what do you think?

I'm even wondering if we should rely on an external library that does everything for us. TensorBoard is the gold standard for logging learning histories in Python, and there is a Julia backend: https://github.com/JuliaLogging/TensorBoardLogger.jl
That would allow us to access many stats, like the time per step and the training/validation loss, seamlessly. What do you think?
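
A hedged sketch of what adding such stats to the struct could look like; the field names are invented for illustration, and the params field is dropped to keep the example self-contained:

using ComponentArrays

# Hypothetical extension of the Result struct with basic performance stats
Base.@kwdef struct ResultWithStats{F <: AbstractFloat}
    θ::ComponentVector
    losses::Vector{F}
    time_per_epoch::Vector{F}  # seconds spent on each training epoch
    niter::Int                 # total number of optimizer iterations
end

res = ResultWithStats(
    θ = ComponentVector(w = randn(4)),
    losses = [5.0, 1.2],
    time_per_epoch = [0.8, 0.7],
    niter = 2,
)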
