Errors in InterpContinuousModel when duplicate parameters and inputs are used #71

Closed
tju999 opened this issue Nov 12, 2024 · 4 comments
Labels
question Request information from the developers

Comments

@tju999

tju999 commented Nov 12, 2024

Hello,
I am very sorry to bother you again. With your help last time, I have made new research progress. Thank you very much. However, when dealing with InterpContinuousModel, I encountered a problem that I could not solve.

Now I intend to use InterpContinuousModel with the Interp operators from opinf.operators, but when I follow the same format as for ParametricContinuousModel, the code fails.

rom = opinf.ParametricROM(
    basis=opinf.basis.PODBasis(residual_energy=1e-6),
    ddt_estimator=opinf.ddt.UniformFiniteDifferencer(t, "ord6"),
    model=opinf.models.InterpContinuousModel(
        operators=[
            opinf.operators.InterpLinearOperator(),
            opinf.operators.InterpInputOperator(),
            opinf.operators.InterpConstantOperator(),
        ],
        solver=opinf.lstsq.L2Solver(regularizer=1e-1),
    ),
)

rom.fit(parameters=K_rom_train, states=c_scaled_train_data, inputs=Cr_rom_train)

The data has the same layout as before:

  • The shape of K_rom_train is (205,): there are 41 distinct scalar parameter values and 5 distinct boundary conditions, so 41*5 = 205. Each parameter value is repeated 5 times (once per boundary condition).
  • The shape of c_scaled_train_data is (205, N, T), where 205 is the number of trajectories (parameter values including repeats), N is the number of spatial points, and T is the number of time points.
  • The shape of Cr_rom_train is (205, T), where 205 is again the number of trajectories, the input is a 1D vector (the boundary condition on the left side), and T is the number of time points.

Here N = 64*64 = 4096 and T = 120. However, when I set up the model like this, I get an error.

Last time you said:
"In your case, you have 10 parameter values but 4 sets of inputs, so there are 40 total training trajectories. This means K_list_train should have 40 entries, H_list_train should be a (40, N, T) array (or a list of 40 (N, T) arrays), and Hr_train should be a (40, 2, T) array (or a list of 40 (2, T) arrays). It's okay that K_list_train will have repeated entries."

I followed the input format you described last time, but I get this error:

ERROR:root:(ValueError) `x` must be strictly increasing sequence. (raised after 43.568697 s)
ValueError: `x` must be strictly increasing sequence.

Does this error mean that the parameters must form a strictly increasing sequence before interpolation can be applied? I need to repeat parameter values because my training data varies not only the parameter but also the input (boundary condition). When no inputs are involved, the parameters do not need to be repeated and no error is reported. Now that I want to use the Interp operators with inputs (boundary conditions), which requires duplicate parameter values, what can I do to avoid the error?

Thank you very much for your earlier help. I really appreciate your timely replies, because I cannot solve this problem on my own from the error message alone.

Looking forward to receiving your reply!

@tju999 tju999 added the help wanted The developers are looking for altruistic helpers to solve this issue label Nov 12, 2024
@shanemcq18 shanemcq18 added question Request information from the developers and removed help wanted The developers are looking for altruistic helpers to solve this issue labels Nov 12, 2024
@shanemcq18
Member

Hi @tju999, this is a tricky case but I think we can handle it. Interpolation is not defined for data sets where there are multiple values for a single parameter, and that's what we have here, since for the same parameter value you have different input functions. This isn't a problem in #70 because there are only affine-parametric operators, not interpolatory operators.
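As a small illustration of why repeated parameter values break the fit: the error message reported above matches the one SciPy raises when 1-D spline nodes repeat. The sketch below uses scipy.interpolate.CubicSpline as a stand-in for whatever interpolator the interpolatory operators use internally (an assumption, not something confirmed in this thread), and all array names and values are made up for the demonstration.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical data: 41 distinct parameter values, each repeated 5 times.
unique_params = np.linspace(0.1, 4.1, 41)
repeated_params = np.repeat(unique_params, 5)   # shape (205,), not strictly increasing
values = np.sin(repeated_params)                # dummy quantity to interpolate

# Interpolation nodes with duplicates are rejected.
try:
    CubicSpline(repeated_params, values)
    message = None
except ValueError as exc:
    message = str(exc)
print(message)

# With one node per distinct parameter value, the spline can be built.
spline = CubicSpline(unique_params, np.sin(unique_params))
```

With one entry per distinct parameter value the interpolant is well defined, which is why the workaround groups all trajectories that share a parameter value.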

The ParametricROM class does not account for this scenario (yet!), but we can get around it by basically grouping snapshots by parameter value. However, we'll have to make sure that the time derivative estimates don't use snapshots from different trajectories, so we'll have to do that step separately instead of letting ParametricROM handle it all.

For each of the K=41 parameter values you have 5 trajectories of T=120 snapshots each, and each snapshot has N=4096 entries. You need to structure the states argument (c_scaled_train_data) so that its shape is (41, N, 5*T) = (41, 4096, 600). This way, all 5 trajectories for a fixed parameter value are used together to solve for the operators corresponding to that value. We also need time derivative estimates, and the inputs must be organized the same way. Here's some pseudocode for how you might do this.

import numpy as np
import opinf

# Placeholders to be supplied by the user: t, training_parameter_values,
# training_input_functions, initial_condition, and full_order_solve().
ddt_estimator = opinf.ddt.UniformFiniteDifferencer(t, "ord6")
all_states, all_ddts, all_inputs = [], [], []

for param in training_parameter_values:
    states, ddts, inputs = [], [], []

    for input_func in training_input_functions:
        U = input_func(t)                                               # get inputs: (M, T) array
        Q = full_order_solve(param, initial_condition, t, input_func)   # get snapshots: (N, T) array
        # Estimate derivatives one trajectory at a time so that the finite
        # differences never mix snapshots from different trajectories.
        Q, dQ, U = ddt_estimator.estimate(Q, U)                         # get time derivatives: (N, T) array
        states.append(Q)
        ddts.append(dQ)
        inputs.append(U)

    all_inputs.append(np.hstack(inputs))                                # group inputs into (M, 5T) array
    all_states.append(np.hstack(states))                                # group snapshots into (N, 5T) array
    all_ddts.append(np.hstack(ddts))                                    # group time derivatives into (N, 5T) array


# all_states and all_ddts are now lists of K (N, 5T) arrays, i.e., (K, N, 5T) overall,
# and all_inputs is a list of K (M, 5T) arrays, i.e., (K, M, 5T) overall.
# No ddt_estimator is given to the ROM because the derivatives are passed in as lhs.
rom = opinf.ParametricROM(
    basis=opinf.basis.PODBasis(residual_energy=1e-6),
    model=opinf.models.InterpContinuousModel(
        operators=[
            opinf.operators.InterpLinearOperator(),
            opinf.operators.InterpInputOperator(),
            opinf.operators.InterpConstantOperator(),
        ],
        solver=opinf.lstsq.L2Solver(regularizer=1e-1),
    ),
).fit(
    parameters=training_parameter_values,
    states=all_states,
    lhs=all_ddts,
    inputs=all_inputs,
)

If you're scaling your data, make sure that you do so before taking the derivatives. Good luck!
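If the trajectories are already stored as stacked arrays, as in the original post, the regrouping can be sketched with NumPy alone. This is a minimal sketch under stated assumptions: the helper group_by_parameter is hypothetical (it is not part of opinf), the inputs are one-dimensional as in the post, and np.gradient is used as a simple stand-in for the ddt_estimator so that the example runs without opinf.

```python
import numpy as np

def group_by_parameter(params, states, inputs, t):
    """Group trajectories that share a parameter value (hypothetical helper).

    params: (S,) array, states: (S, N, T) array, inputs: (S, T) array, t: (T,) array.
    Returns the distinct parameter values and, for each one, the horizontally
    stacked states, time derivatives, and inputs of its trajectories.
    """
    params = np.asarray(params).ravel()
    unique_params, inverse = np.unique(params, return_inverse=True)
    all_states, all_ddts, all_inputs = [], [], []
    for k in range(len(unique_params)):
        idx = np.flatnonzero(inverse == k)      # trajectories with the k-th parameter value
        # Differentiate each trajectory separately so that no finite difference
        # mixes snapshots from different trajectories (np.gradient is a stand-in).
        ddts = [np.gradient(states[i], t, axis=1) for i in idx]
        all_states.append(np.hstack([states[i] for i in idx]))
        all_ddts.append(np.hstack(ddts))
        all_inputs.append(np.hstack([inputs[i] for i in idx]))
    return unique_params, all_states, all_ddts, all_inputs

# Small made-up example: 3 parameter values, 2 inputs each, N=4 points, T=10 times.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 10)
params = np.repeat([0.1, 0.2, 0.3], 2)          # shape (6,), with repeats
states = rng.standard_normal((6, 4, 10))
inputs = rng.standard_normal((6, 10))

unique_params, all_states, all_ddts, all_inputs = group_by_parameter(params, states, inputs, t)
print(len(unique_params), all_states[0].shape, all_ddts[0].shape, all_inputs[0].shape)
```

The returned unique_params, all_states, all_ddts, and all_inputs would then play the roles of parameters, states, lhs, and inputs in the fit call above.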

@shanemcq18
Member

Closing in favor of #72.

@tju999
Author

tju999 commented Nov 15, 2024

Thank you very much for your reply.

I have solved the problem according to your answer. However, I found that both the speed and the accuracy of the Interp models seem to decrease as the dimension of the parameter increases. Is interpolation better suited to low-dimensional parameter spaces?

Thanks again for your prompt reply!

@shanemcq18
Member

@tju999 Yes, interpolation in higher dimensions is tricky due to the curse of dimensionality. It might also be helpful to read up on Runge's phenomenon. These are both interpolation issues, not operator inference issues per se.
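Both effects can be seen in a few lines of NumPy. The sketch below is illustrative only and is not tied to opinf: it interpolates the classic Runge function 1/(1 + 25x^2) with polynomials on equispaced nodes (the function, the degrees, and the choice of 10 samples per dimension are assumptions made for the demonstration), then counts how many samples a tensor-product grid needs as the parameter dimension grows.

```python
import numpy as np
from numpy.polynomial import Polynomial

def runge(x):
    """Classic test function for Runge's phenomenon."""
    return 1.0 / (1.0 + 25.0 * x**2)

x_fine = np.linspace(-1.0, 1.0, 2001)
max_errors = {}
for degree in (4, 10):
    nodes = np.linspace(-1.0, 1.0, degree + 1)              # equispaced interpolation nodes
    interpolant = Polynomial.fit(nodes, runge(nodes), degree)
    max_errors[degree] = np.max(np.abs(interpolant(x_fine) - runge(x_fine)))
    print(f"degree {degree}: max interpolation error {max_errors[degree]:.3f}")

# Curse of dimensionality: a grid with 10 samples per parameter dimension.
grid_sizes = {dim: 10**dim for dim in (1, 2, 3, 4)}
print(grid_sizes)
```

Adding equispaced nodes makes the polynomial interpolant worse near the ends of the interval, and the number of training parameter values needed for a fixed resolution grows exponentially with the parameter dimension.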
