Upper Limit for training data in Exact GPs #1928
-
Hi! Perhaps this has already been asked somewhere, but I was curious: what's the upper limit on the amount of training data (n_samples, features) for an exact GP? I know the paper said it could handle 1 million data points, but with KeOps and more (and newer) GPUs, has that record been beaten by any of the original authors (or anyone else)? Just curious. Thanks!
Replies: 1 comment
-
With KeOps, I've actually run the full HouseElectric dataset (~1.8 million data points) on a single RTX GPU; that takes roughly 12 hours for 50 optimization steps using Adam. I haven't tried going beyond that recently (though I think it's possible if you're willing to run something for 1-2 days), nor have I tried KeOps plus kernel checkpointing.
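For reference, here's a minimal sketch of the kind of setup I mean, assuming a CUDA GPU and GPyTorch's KeOps kernels. The random tensors (standing in for the actual data), the feature count, the Matérn kernel choice, and the learning rate are just placeholders, not the exact settings from my runs:

```python
import torch
import gpytorch


class KeOpsExactGP(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        # The KeOps kernel evaluates the covariance matrix lazily on the GPU,
        # so the full n x n kernel is never materialized in memory.
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.keops.MaternKernel(nu=2.5)
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)


# Placeholder data with roughly HouseElectric-scale shapes
# (the real dataset has ~1.8M points; 9 features is an assumption here).
train_x = torch.randn(1_800_000, 9).cuda()
train_y = torch.randn(1_800_000).cuda()

likelihood = gpytorch.likelihoods.GaussianLikelihood().cuda()
model = KeOpsExactGP(train_x, train_y, likelihood).cuda()

model.train()
likelihood.train()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)

# 50 Adam steps is what took ~12 hours on a single RTX GPU in my runs.
for i in range(50):
    optimizer.zero_grad()
    output = model(train_x)
    loss = -mll(output, train_y)
    loss.backward()
    optimizer.step()
```

The point is that nothing beyond the kernel changes: you write the same ExactGP model and training loop as usual, and swapping in a `gpytorch.kernels.keops` kernel is what makes the million-point scale feasible.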
The Falkon paper also uses KeOps, but for kernel ridge regression, and seems to scale into the billions of data points using Nyström approximations (so arguably not really exact at that point).
It's possible other people have had different experiences or more detailed knowledge, though.