Recommend GPyTorch for small data/CPU environment #1664
-
I am quite excited to have found this framework, and am considering porting over a lot of my own lab's codebase to use GPyTorch. I am most interested in the ease of modeling features such as support for non-Gaussian data models, and for hyperparameter priors/hierarchical models using a library like Pyro. I operate in a setting with small data (<1000 data points), a need for a large number of forward predictions, and a CPU-only environment. In this setting, would you say that GPyTorch is a good choice? In particular, is there any significant overhead in using the PyTorch backend without the benefit of GPUs?
Replies: 1 comment
-
There's certainly no real overhead in using PyTorch on the CPU! PyTorch handles math the same way NumPy does, by dispatching to linear algebra routines implemented in highly optimized BLAS and LAPACK libraries. This is true whether you're on the CPU or on the GPU -- the only difference is the hardware the underlying BLAS/LAPACK routines run on.

In general, I think that in the small-data, CPU-only regime, the differences between the most popular GP packages really come down to your own preferences about whether you prefer PyTorch / NumPy / TensorFlow and how you feel code is written in the various packages. In the n < 1000 regime, the most fundamental operations in GP training and prediction are handled largely the same way across packages, and the differences come down to features of the packages other than performance.

In GPyTorch, a lot of our features designed for the low-data regime are motivated by our collaboration with the folks over at BoTorch, since Bayesian optimization uses all sorts of GP modeling in the low-data regime (although scalable BayesOpt matters too!). Some of their tutorials might be useful to look through to see what's possible out of the box with GPyTorch beyond the tutorials we have here.
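For concreteness, here is a minimal sketch of what an exact GP in this regime tends to look like in GPyTorch, run entirely on the CPU. The synthetic data, the RBF kernel choice, and the Gamma lengthscale prior are placeholders for illustration only, not a recommendation for any particular problem; the prior is included just to show where hyperparameter priors plug in.

```python
import torch
import gpytorch

# Minimal exact GP for the small-data (n < 1000), CPU-only setting.
class ExactGPModel(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        # A prior on the lengthscale is one place hyperparameter priors attach
        # (GammaPrior(3.0, 6.0) is just an illustrative choice).
        self.covar_module = gpytorch.kernels.ScaleKernel(
            gpytorch.kernels.RBFKernel(
                lengthscale_prior=gpytorch.priors.GammaPrior(3.0, 6.0)
            )
        )

    def forward(self, x):
        mean_x = self.mean_module(x)
        covar_x = self.covar_module(x)
        return gpytorch.distributions.MultivariateNormal(mean_x, covar_x)

# Synthetic toy data; tensors live on the CPU by default.
train_x = torch.linspace(0, 1, 100)
train_y = torch.sin(train_x * 6.28) + 0.1 * torch.randn(train_x.size(0))

likelihood = gpytorch.likelihoods.GaussianLikelihood()
model = ExactGPModel(train_x, train_y, likelihood)

# Standard training loop: maximize the exact marginal log likelihood.
model.train()
likelihood.train()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)
mll = gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)
for _ in range(100):
    optimizer.zero_grad()
    output = model(train_x)
    loss = -mll(output, train_y)
    loss.backward()
    optimizer.step()

# Repeated forward predictions, the workload described in the question.
model.eval()
likelihood.eval()
with torch.no_grad(), gpytorch.settings.fast_pred_var():
    test_x = torch.linspace(0, 1, 500)
    pred = likelihood(model(test_x))
    mean = pred.mean
    lower, upper = pred.confidence_region()
```

Swapping the GaussianLikelihood for a non-Gaussian one (and training with a variational GP and the corresponding marginal log likelihood) follows the same overall structure; the non-Gaussian likelihood tutorials cover that path.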