Adding a Kernel
A walkthrough of adding a new stationary kernel to LightGP. We’ll use a hypothetical “rational quadratic” kernel as the running example, but the pattern is the same for any kernel that depends on the pairwise distance \(r = \|x - x'\|\).
1. Pick the math
The rational-quadratic kernel:
Three log-hyperparameters: log_length_scale, log_signal_var, and
log_alpha.
2. Implement the C++ kernel
Create kernels/rational_quadratic_kernel.h and .cpp:
// kernels/rational_quadratic_kernel.h
#pragma once
#include "kernel_base.h"
namespace lightgp {
class RationalQuadraticKernel : public Kernel {
public:
explicit RationalQuadraticKernel(float length_scale = 1.0f,
float signal_var = 1.0f,
float alpha = 1.0f);
Tensor compute(const Tensor& X1, const Tensor& X2, Backend) const override;
Tensor compute_diag(const Tensor& X) const override;
int num_params() const override { return 3; }
std::vector<float> get_log_params() const override;
void set_log_params(const std::vector<float>& p) override;
std::string name() const override { return "RationalQuadratic"; }
private:
float l_, sf2_, alpha_;
};
} // namespace lightgp
The CPU implementation re-uses the BLAS distance trick from rbf_cpu.cpp:
Tensor RationalQuadraticKernel::compute(const Tensor& X1, const Tensor& X2,
Backend /*backend*/) const {
const auto r2 = pairwise_squared_distance(X1, X2); // shared helper
Tensor K(r2.rows(), r2.cols());
const float factor = 1.0f / (2.0f * alpha_ * l_ * l_);
for (std::size_t i = 0; i < r2.size(); ++i) {
K.data()[i] = sf2_ * std::pow(1.0f + factor * r2.data()[i], -alpha_);
}
return K;
}
Tensor RationalQuadraticKernel::compute_diag(const Tensor& X) const {
// k(x, x) = sf2 for any stationary kernel.
Tensor d(X.rows(), 1);
std::fill(d.data(), d.data() + X.rows(), sf2_);
return d;
}
3. Wire it into the build
Add the source file to build.sh:
CPP_SRCS=(
...
kernels/rbf_kernel.cpp
kernels/rational_quadratic_kernel.cpp # new
...
)
CMake (if you use it) picks up new files automatically when listed in
CMakeLists.txt.
4. Test against the reference
Add a C++ test in tests/test_rational_quadratic.cpp:
void run_rational_quadratic_tests() {
std::printf("[rational_quadratic] starting...\n");
Tensor X = make_grid(8, -1.0f, 1.0f);
RationalQuadraticKernel rq(/*l=*/1.0f, /*sf2=*/1.0f, /*alpha=*/0.5f);
Tensor K = rq.compute(X, X);
// Compare against hand-derived reference values at r=0 and r=1.
LIGHTGP_CHECK_NEAR(K(0, 0), 1.0f, 1e-5f); // k(x, x) = sf2
// ... more checks ...
}
Register it in tests/run_tests.cpp and ./build/run_tests.
5. Add the GPU kernel (optional)
If you want a Metal / CUDA shader for the kernel matrix construction, mirror
the RBF pattern in kernels/metal/rbf_metal.mm /
kernels/cuda/rbf_cuda.cu. For most kernels you can re-use the
distance-then-scalar-function approach: compute \(r^2\) via a tiled
shader, then apply the kernel’s scalar function in a final pass.
Until you add the GPU path, the dispatcher falls back to CPU automatically.
6. Bind to Python
In python/bindings.cpp:
py::class_<lightgp::RationalQuadraticKernel, lightgp::Kernel,
std::shared_ptr<lightgp::RationalQuadraticKernel>>(m, "RationalQuadratic")
.def(py::init<float, float, float>(),
py::arg("length_scale") = 1.0f,
py::arg("signal_var") = 1.0f,
py::arg("alpha") = 1.0f);
Re-export it from python/lightgp/__init__.py:
from lightgp._core import (
...
RationalQuadratic,
)
7. Add a Python test
In python/tests/test_kernels.py:
def test_rational_quadratic_basic():
k = gp.RationalQuadratic()
assert k.num_params() == 3
assert "RationalQuadratic" in k.name()
# k(x, x) == sf2
X = np.zeros((4, 1), dtype=np.float32)
diag = gp.GPExact(k).fit # smoke test fit
# ... etc ...
8. Document it
Add a class entry to docs/source/api/kernels.rst matching the existing
style — math, parameter description, optional example. Re-build the docs:
cd docs && sphinx-build -b html source build/html
That’s the whole pipeline. The Kernel ABC plus the dispatch layer mean a new kernel typically lands in 100–200 LOC plus tests, with no changes to the inference layer.