@@ -1,6 +1,12 @@
- # Trainax - Learning Methodologies for Autoregressive Neural Emulators
+ # Trainax

- ![](https://ceyron.github.io/predictor-learning-setups/sup-3-none-true-full_gradient.svg)
+ <p align="center">
+ <b>Learning Methodologies for Autoregressive Neural Emulators</b>
+ </p>
+
+ <p align="center">
+ <img src="https://ceyron.github.io/predictor-learning-setups/sup-3-none-true-full_gradient.svg" width="400">
+ </p>

After the discretization of space and time, the simulation of a transient
partial differential equation amounts to the repeated application of a
@@ -58,11 +64,6 @@

where $l$ is a **time-level loss**. In the easiest case $l = \text{MSE}$.

- ### More
-
- Focus is clearly on the number of update steps, not on the number of epochs
-
-
### A taxonomy of learning setups

The major axes that need to be chosen are:
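To make the time-level loss mentioned above concrete (a hedged illustration; the index notation $u^{[t]}$ for the state at time level $t$ is chosen here for exposition and is not necessarily the README's), the MSE choice over $N$ spatial degrees of freedom reads

$$ l\left(u^{[t]}, u^{[t],\text{ref}}\right) = \frac{1}{N} \sum_{i=1}^{N} \left(u^{[t]}_i - u^{[t],\text{ref}}_i\right)^2. $$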
@@ -108,7 +109,7 @@ There are three levels of hierarchy:
diverted-chain, mix-chain, residuum). The most general diverted chain
implementation contains supervised and branch-one diverted chain as special
cases. See the section "Relation between Diverted Chain and Residuum
- Training" for details how residuum training fits into the picture. All
+ Training" (TODO) for details on how residuum training fits into the picture. All
configurations allow setting additional constructor arguments to, e.g., cut
the backpropagation through time (sparsely) or to supply time-level
weightings (for example to exponentially discount contributions over long
@@ -119,4 +120,4 @@ There are three levels of hierarchy:
combining the relevant configuration with the `GeneralTrainer` and a
trajectory substacker.

- ### Relation between Diverted Chain and Residuum Training
+ You can find an overview of predictor learning setups [here](https://fkoehler.site/predictor-learning-setups/).
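As a rough, library-agnostic sketch of what these hunks describe (plain JAX, deliberately *not* the Trainax API; the names `rollout_loss`, `emulator`, `cut_bptt`, and the toy data are hypothetical), unrolled supervised rollout training with per-time-level weights and an optional cut of backpropagation through time could look like this:

```python
import jax
import jax.numpy as jnp


def rollout_loss(params, emulator, u_0, ref_trajectory, level_weights, cut_bptt=False):
    """Unroll the emulator autoregressively from u_0 and accumulate a
    weighted MSE time-level loss against the reference trajectory."""
    u = u_0
    total = 0.0
    for t in range(ref_trajectory.shape[0]):
        u = emulator(params, u)
        # Weighted MSE contribution of this time level.
        total = total + level_weights[t] * jnp.mean((u - ref_trajectory[t]) ** 2)
        if cut_bptt:
            # Cut backpropagation through time: the next step's input carries
            # no gradient back into earlier rollout steps.
            u = jax.lax.stop_gradient(u)
    return total


# Toy linear "emulator" and data, purely for illustration.
def emulator(params, u):
    return params @ u


params = 0.1 * jax.random.normal(jax.random.PRNGKey(0), (16, 16))
u_0 = jax.random.normal(jax.random.PRNGKey(1), (16,))
ref = jax.random.normal(jax.random.PRNGKey(2), (5, 16))  # five reference time levels
weights = 0.9 ** jnp.arange(5)  # exponentially discounted time-level weightings

loss, grads = jax.value_and_grad(rollout_loss)(params, emulator, u_0, ref, weights)
```

In Trainax itself, such choices would instead be expressed through the respective configuration's constructor arguments and then combined with the `GeneralTrainer`, as described above.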