Numerical issues with Positivstellensatz and interpolant-based polynomial optimization #798
Hi @projekter. Thanks for such a detailed post; you've gone deep into the weeds investigating possible ways to get things to work! I'll take a few days to ponder and get back.
In summary, I don't see anything really bad about the models. The dual works a bit better for me. As you note, the interpolation-basis WSOS cone will work better if you can get "good" points from your domain, and we are aware this can be tricky. For complicated domains, it's not clear what the best strategy is.
Your summary is basically right. What you did for (1) is what I'd do. The ideal case is that you would know how to get points efficiently for your domain. Regarding (2), I think a few things should be different:
Right, those Ps shouldn't be multiplied by the sqrt of g, and they should have 3 columns in your example. Actually, your gs could be negative in this approach, so taking the square root might not make sense. I'll add a third option, which is to work with the dual. For the problem:
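(Reconstruction; the original display is lost here. Hypatia's native WSOS formulation for bounding a polynomial $p$ above over interpolation points $u_1, \dots, u_U$ has roughly this shape:)

$$
\begin{aligned}
\min_{b \in \mathbb{R}} \quad & b \\
\text{s.t.} \quad & \big(b - p(u_1), \,\dots,\, b - p(u_U)\big) \in K_{\mathrm{WSOS}}
\end{aligned}
$$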
The dual is:
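(Also a reconstruction: the conic dual swaps the WSOS cone for its dual moment cone, with $\mu$ acting as pseudo-moments at the interpolation points:)

$$
\begin{aligned}
\max_{\mu \in \mathbb{R}^U} \quad & \sum_{j=1}^{U} \mu_j \, p(u_j) \\
\text{s.t.} \quad & \sum_{j=1}^{U} \mu_j = 1, \qquad \mu \in K_{\mathrm{WSOS}}^{*}
\end{aligned}
$$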
I added the dual in the same gist. Again, Hypatia fails to converge when the interpolation points for the interpolation basis are from [-1, 1]^n. But if the interpolation points are picked from [...].

I deliberately avoided merging the [...]
For completeness, a fourth approach (that requires more work) is to fix this issue. It would require users to know some points from their domain of nonnegativity in order to get an initial interior point oracle, but the basis interpolation points could then be from any region. I briefly revisited it on a branch but, similarly to the attempts above, the number of iterations is very dependent on how the interpolation points are picked.
It's normal for the objective to oscillate; this is not a property of Hypatia or the WSOS cone. May I also ask, out of curiosity, where the polynomial problem you are solving is from?
Thank you for your detailed analysis. I'll give the dual formulation a try; this looks promising. But actually, that's not really what I am interested in: what I want to do is apply polynomial optimization to quantum problems. In particular, I will have some semidefinite constraints (states and maps), linear constraints (trace), and also nonconvex quadratic constraints (a map applied to a state). The number of variables will be high but, by exploiting sparsity, I hope that relaxations will not blow up to an unmanageable size. Still, it would probably be good if I didn't have to rewrite this as an SDP. I will try applying this to "small" actual problem instances and also compare with your branch that allows specifying initial values. Thanks for the help!
Sounds cool.
I observe some numerical difficulties in constrained polynomial optimization with small problems, and before scaling them up I wanted to ask whether you'd expect such behavior (meaning the interpolant-based approach is not really suitable for what I want to do), or whether this actually indicates some problem in my formulation or even in the solver. I apologize in advance for the long text, but I hope it puts the problem in some context. If a new example arises from this or you want to include my code in some way, I'm happy to contribute.
I want to use Hypatia's WSOS capabilities to optimize over polynomials with constraints. As far as I can see, there are two options to do this (using the `PolyUtils` framework):

1. Define a `Domain` descendant which implements its own sampling and weights. This works great in principle; however, the sampling points should be chosen "well", and all the constraints evaluated at the sampling points must be nonnegative. For complicated constraint functions and/or lots of variables, this is not easy (or even possible).
2. Use the `WSOSInterpNonnegativeCone` for the SOS polynomials that act as multipliers of the constraint polynomials. I would expect that this still scales better than the equivalent PSD matrix formulation.

I tried to apply both approaches to a very simple example (this was one of Lasserre's original examples when he introduced the moment problem):
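(Reconstruction of the missing display; the triangle and optimum described below match Lasserre's classic example:)

$$
\begin{aligned}
\max_{x \in \mathbb{R}^2} \quad & (x_1 - 1)^2 + (x_1 - x_2)^2 + (x_2 - 3)^2 \\
\text{s.t.} \quad & g_1(x) = 1 - (x_1 - 1)^2 \ge 0, \\
& g_2(x) = 1 - (x_1 - x_2)^2 \ge 0, \\
& g_3(x) = 1 - (x_2 - 3)^2 \ge 0.
\end{aligned}
$$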
The region is a triangle with vertices `(1, 2)`, `(2, 2)`, `(2, 3)`; the maximum value is `2`, which is attained at all three vertices.

The first approach with a custom domain works flawlessly. With my implementation of `CustomBoxDomain`, the domain would be defined like the sketch below (I chose a relatively loose enclosing rectangle), and then it's all standard interpolation and optimization. But since in this geometry it is not really hard to take appropriate samples, this is perhaps not representative. My problem is with the second approach.
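Roughly, assuming a constructor taking the box bounds and the constraint functions (`CustomBoxDomain` is my own type, so the exact signature shown here is illustrative, not Hypatia API):

```julia
# CustomBoxDomain is a custom Domain descendant (not part of Hypatia); the
# enclosing box [0, 3] x [1, 4] loosely contains the triangle, and the g_i
# are used to filter/weight the sample points so all constraints are
# nonnegative at every sample.
g1(x) = 1 - (x[1] - 1)^2
g2(x) = 1 - (x[1] - x[2])^2
g3(x) = 1 - (x[2] - 3)^2
dom = CustomBoxDomain{Float64}([0.0, 1.0], [3.0, 4.0], [g1, g2, g3])
```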
Now, if I rewrite this problem in terms of Putinar's SOS decomposition (minimizing the bound `b` such that `b - p = SOS0 + sum(SOS[i] g[i])`, with each `SOS` term a sum of squares), then with `2d = 2` I get the upper bound `3`, and with `2d = 4` the exact maximum `2` is obtained using the PSD matrix formulation (see this gist for solving the problem with arbitrary precision).

I would now expect that if I write the "is an SOS polynomial" conditions in terms of the `WSOSInterpNonnegativeCone`, then this should give the same results. In order to get the appropriate `P` matrices on interpolation, I extended `PolyUtils.interpolate` a bit: basically, I have to cut off more columns from the full matrix `P0` in order to restrict the degree of my SOS polynomials (see the sketch below). And then there are two choices: either I can still use the WSOS approach for `SOS0...SOS3`, so that they also contain the boundaries of the rectangle, or I just use the unweighted approach.
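Schematically, the truncation looks like this (the `interpolate` call is simplified; its exact keyword arguments and return values vary between Hypatia versions):

```julia
import Hypatia
const PolyUtils = Hypatia.PolyUtils

n = 2                                          # number of variables
halfdeg = 2                                    # the 2d = 4 relaxation
dom = PolyUtils.BoxDomain{Float64}([0.0, 1.0], [3.0, 4.0])

# Schematic call: what matters below is the number of points U, the point
# matrix, and the list of basis matrices Ps, whose first entry P0 spans all
# polynomials up to degree halfdeg at the points.
U, points, Ps = PolyUtils.interpolate(dom, halfdeg)

# A degree-k SOS multiplier only needs the columns of P0 corresponding to
# monomials of degree <= k; in graded order these are the first
# binomial(n + k, n) columns.
k = 1
P_mult = Ps[1][:, 1:binomial(n + k, n)]        # 3 columns: (0,0), (1,0), (0,1)
```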
Now `constraint_Ps` is a vector that contains the appropriate `Ps` matrices for `SOS1...SOS3` (here, as they all have the same degree, it's just three times the same matrix). Basically, `constraint_Ps[1][1]` is the same as `Ps[1][:, 1:3]` (corresponding to the degrees `(0, 0)`, `(1, 0)`, and `(0, 1)`), and `constraint_Ps[1][2]` is based on `constraint_Ps[1][1]`, but additionally multiplied by the square root of the box constraint `(x[1] - 0) * (3 - x[1])`, so it just has a single column.

Now I form the model and optimize (here, I put `SOS0` in the last column; a sketch of the model follows):
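Schematically (the actual code is in the gist; `p_vals` and `g_vals` here are illustrative placeholders for the objective and constraint polynomials evaluated at the `U` interpolation points):

```julia
using JuMP
import Hypatia

model = Model(Hypatia.Optimizer)
@variable(model, b)                        # upper bound on p, to be minimized
@variable(model, sos[1:U, 1:3])            # values of SOS1..SOS3 at the points

# each multiplier must be (weighted) SOS in the interpolant basis:
for i in 1:3
    @constraint(model, sos[:, i] in
        Hypatia.WSOSInterpNonnegativeCone{Float64}(U, constraint_Ps[i]))
end

# the remainder b - p - sum_i SOSi * gi is SOS0 ("in the last column"):
@constraint(model, wsos,
    b .- p_vals .- [sum(sos[j, i] * g_vals[j, i] for i in 1:3) for j in 1:U] in
        Hypatia.WSOSInterpNonnegativeCone{Float64}(U, Ps))

@objective(model, Min, b)
optimize!(model)
```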
This converges to the correct value in 231 iterations, with NearOptimal status and a slow progress warning. The 6x6 moment matrix based on `dual(wsos)` in the monomial basis allows extracting the optimal points `{{2., 3.}, {2., 2.}, {1., 2.}}` using the Hankel decomposition method of Harmouch (the first step of that extraction is sketched below). Problem solved, but neither the large number of iterations for such a small problem nor the status and warning are very comforting.
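Schematically, the moment-matrix assembly (names as in the sketches above; the dual values act as weights of a measure supported on the interpolation points, and the Hankel decomposition itself is omitted):

```julia
mu = dual(wsos)                     # one dual value per interpolation point

# exponents of the degree-<=2 monomial basis 1, x1, x2, x1^2, x1*x2, x2^2:
expos = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]

# moment y_a = sum_j mu_j * u_j^a, approximating the integral of x^a:
moment(a) = sum(mu[j] * prod(points[j, :] .^ collect(a)) for j in 1:U)

# 6x6 moment matrix M[r, c] = y_{a_r + a_c} in the monomial basis:
M = [moment(expos[r] .+ expos[c]) for r in 1:6, c in 1:6]
```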
I tried several variations of this:

- The weighted approach for `SOS1...SOS3` may not be so good: it does not really correspond to the exact version of Putinar's theorem, so I should perhaps not expect similar results. So in the constraints I instead write `constraint_Ps[i][1:1]`, just using the "unweighted" SOS. As a result, after 56 iterations, the status is SlowProgress and the result `5.3227` (dual `0.16003`) is nowhere near what it should be (in fact, the objective meanders, getting smaller and larger). If I set `tol_slow=1e-4`, then after 365 iterations the optimal value is obtained (status Optimal) and all optimal points can be extracted.
- Not only `constraint_Ps[i][1:1]`, but also `Ps[1:1]`, i.e., the unweighted approach for `SOS0` as well. Similarly, SlowProgress (primal `4.6338`, dual `0.45718`) after 108 iterations, with no clear monotonicity between the iterations. If I set `tol_slow=1e-6`, then after 290 iterations the optimal value is obtained (status NearOptimal) and all optimal points can be extracted.
- I drop the box weights from `SOS0` (using `Ps[1:1]`), but put them into `SOS1...SOS3` (using `constraint_Ps[i]`). This indeed converges to the correct value in 273 iterations with status Optimal and no slow progress warning. All optimal points can be extracted.

Can you see a reason why Hypatia struggles so much with this small problem, despite the fact that the PSD matrix formulation is extremely simple? And then, of course, the main question: if the interpolation points cannot be constructed so easily (or, which is also a main motivation for me, if I actually want to use an SOS-PSD matrix in the quadratic module, `p = SOS0 + sum(SOS[i] g[i]) + tr(SOSPSD G)`) and therefore the "native" weights cannot be used, does it make sense to employ an interpolation-basis approach instead of resorting to the much larger PSD formulation? In the end, all the potential gain may be negated if it comes at the cost of hundreds or thousands of iterations with unstable monotonicity on top.