Getting multiple forward or adjoint directional derivatives with one call. #1902
Would it be possible to calculate and cache the values upon the first invocation of the respective API functions?
Yes, it's possible, but often not desired or practical. It would require storing every intermediate operation used in the calculation, which may require too much memory.
To add to this: Think of the case where the BLT sorting in a Modelica tool results in a series of Newton iterations. In the forward/adjoint propagation, you would need to do linear solves using the same linearized system (transposed for adjoints). Multiple forward/adjoint directions just correspond to multiple right-hand sides of the linear systems. Working with multiple right-hand sides is usually more efficient. Also, if you have to redo the Jacobian calculation and/or factorization, this cost will be the same regardless of the number of directions.
To explain the proposal, I think one should view it as an interface that computes J*V for a matrix V instead of a vector v (and similarly for the adjoint). I can see that it might offer some minor advantage (especially for adjoints), but it is not clear to me that it is needed, and I can see some alternatives.
However, the Jacobian factorization is normally the same one as used for computing the values, so either one could cache it from then, or one could consider an interface that computes both (at least for forward). If the Jacobian is recomputed and refactorized during the solution it is messier: either all factored Jacobians need to be stored (typically less than a handful) or one has to "cheat". Sending in multiple directions requires that all of the directions are known at the same time, whereas caching would handle the different directions being computed in sequence. If we are concerned that caching takes memory and time when not used, one could consider doing something special before the value call.
FMI-Design meeting:
To take a step back: Fundamentally, what is important is that we can calculate multiple directional derivatives efficiently. If that can be resolved with caching inside the FMUs, doing it in one call is not necessary. I think that there are fundamental reasons why a single call can be faster than multiple calls (such as being able to use matrix operations internally and needing less caching/checkpointing), but caching across multiple calls might be fast enough.
Comment: this could also be done in a layered standard that defines new additional functions. If it is to be included in a minor version, a working group should work on a proposal.
(From a discussion with @jaeandersson on efficiency improvements for using FMI in optimization problems)
As a very simple example, consider the scalar function f(x) = sin(x). The forward or adjoint directional derivative is df1(x, v) = cos(x)*v. If you can calculate two directional derivatives at once, you have df2(x, v1, v2) = (cos(x)*v1, cos(x)*v2). df2 is a lot cheaper than two calls to df1, since cos(x) only needs to be calculated once and trigonometric functions are an order of magnitude more expensive than addition/multiplication. The same will hold for the very large functions resulting from Modelica tools. Keep in mind that the intermediate operations and their linearizations (cos(x) above) usually have to be recalculated between calls, as otherwise memory use can blow up.
In terms of API, you could add this support in a backwards-compatible way by permitting nSeed and nSensitivity to be the same multiple of the number of knowns and unknowns, respectively, in fmi3GetDirectionalDerivative (and, conversely, of the number of unknowns and knowns in fmi3GetAdjointDerivative).