
[WIP] Design of pipeline to save results from ML inversion #317


Draft · wants to merge 4 commits into main

Conversation

facusapienza21 (Member) commented:

@albangossard as I am running simulations for the inversion of the diffusivity D, I thought it would be useful to figure out how to save the relevant outputs of a simulation so they can be analyzed later. The most important of these is the set of trained parameters of the neural network, so the model can be evaluated after the fact. This PR tries to address that.

Notice that Sleipnir also has a save-results function, but it is rather different in nature: it saves the information of a forward SIA simulation rather than the information of the model. What do you think?

This PR is a work in progress, so it would be nice to think (assuming we approve the API design) about what else we would like to save. Notice that the architecture of the NN will be harder to save, so maybe we want to think about that too.

So far, this PR has:

  • Design of a simple Results object
  • Saving of that object at the end of the simulation in JLD2 format (see the sketch below)
  • A test asserting that the file stores the information correctly
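
For concreteness, a minimal sketch of what saving and reloading such an object with JLD2 could look like; ToyResult and all values below are hypothetical stand-ins, not the actual Result struct from this PR:

using JLD2
using ComponentArrays

# Hypothetical stand-in for the Result struct introduced in this PR
struct ToyResult
    θ::ComponentVector
    losses::Vector{Float64}
end

θ = ComponentVector(layer1 = randn(3), layer2 = randn(2))
res = ToyResult(θ, [10.0, 3.2, 1.1])

# Save the object at the end of the simulation...
jldsave("result.jld2"; result = res)

# ...and reload it later for post-hoc evaluation.
# Note: the struct definition must be in scope when the file is loaded.
loaded = load("result.jld2", "result")
@assert loaded.losses == res.losses

This also relates to the point above about the NN architecture: JLD2 serializes the weights fine, but reconstructing the model requires the same type and architecture definitions to be available at load time, which is why saving the architecture itself is harder.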

Any feedback welcome!

albangossard (Member) left a comment:


Thank you @facusapienza21, that's a good start for logging information about the training!
I'm wondering if we should use https://github.com/JuliaLogging/TensorBoardLogger.jl to log information like the training histories. That would make our life easier.
That's not incompatible with the Results struct that you created; it's complementary. We could use TensorBoard to log the stats and loss histories, and use Results to collect the weights and the parameters in order to run predictions after the training.
What is your opinion on this?
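
A rough sketch of what that could look like with TensorBoardLogger.jl; the run directory, tag names, and per-epoch values below are made up for illustration:

using TensorBoardLogger, Logging

# Write scalar training metrics to a TensorBoard run directory
lg = TBLogger("tb_logs/inversion", tb_overwrite)

for epoch in 1:100
    loss = 1.0 / epoch               # placeholder for the real training loss
    t_epoch = 0.5 + 0.1 * rand()     # placeholder for the measured time per epoch
    log_value(lg, "train/loss", loss; step = epoch)
    log_value(lg, "train/time_per_epoch", t_epoch; step = epoch)
end

# Inspect with: tensorboard --logdir tb_logs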


# Create path for simulation results
if isnothing(path)
    simulation_path = joinpath(dirname(Base.current_project()), "data/results/simulation")
albangossard (Member):


Maybe use something more precise than "simulation", since it could be either an inversion or a prediction? Currently, ODINN already generates a "data/results/predictions" folder.
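
For illustration only (nothing here is in the PR), the folder name could be derived from the kind of simulation:

# Name the results folder after the kind of simulation instead of a generic "simulation"
function default_results_path(kind::Symbol)
    @assert kind in (:inversion, :prediction)
    return joinpath(dirname(Base.current_project()), "data", "results", String(kind))
end

simulation_path = default_results_path(:inversion)  # -> ".../data/results/inversion"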

@kwdef struct Result{F <: AbstractFloat} <: AbstractResult
    θ::ComponentVector
    losses::Vector{F}
    params::Sleipnir.Parameters
albangossard (Member):


We could add some stats about the performance, like the time per epoch; what do you think?

I'm even wondering if we should rely on an external library that does everything for us. TensorBoard is the gold standard for logging learning histories in Python, and there is a Julia backend: https://github.com/JuliaLogging/TensorBoardLogger.jl
That would allow us to access many stats, like the time per step and the training/validation loss, seamlessly. What do you think?
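
A hedged sketch of what adding such stats to the struct could look like; the field names are invented for illustration, and the params field is dropped to keep the example self-contained:

using ComponentArrays

# Hypothetical extension of the Result struct with basic performance stats
Base.@kwdef struct ResultWithStats{F <: AbstractFloat}
    θ::ComponentVector
    losses::Vector{F}
    time_per_epoch::Vector{F}  # seconds spent on each training epoch
    niter::Int                 # total number of optimizer iterations
end

res = ResultWithStats(
    θ = ComponentVector(w = randn(4)),
    losses = [5.0, 1.2],
    time_per_epoch = [0.8, 0.7],
    niter = 2,
)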
