A Simple Way to Serve ONNX Models with KServe on macOS and Kind

Running ONNX models with KServe on macOS, especially on ARM64, can be tricky when you don't have the GPU support that runtimes like NVIDIA Triton expect. I created a simple demo to make it work: kserve-onnx-predictor-demo.

It uses a custom predictor written in Python with ONNX Runtime, which runs fine on macOS and can be deployed to a Kind cluster with KServe. No GPU required, just a clean way to serve ONNX models.
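To give a rough idea of what such a predictor involves, here is a minimal sketch. It is not the code from the repo: the function names (`parse_instances`, `load_session`, `predict`), the float32 single-input model, and the KServe V1-style `{"instances": [...]}` request body are all assumptions made for illustration.

```python
# Minimal sketch of a CPU-only ONNX predictor.
# All names here are hypothetical; they are not taken from the demo repo.
from typing import Any, Dict, List


def parse_instances(payload: Dict[str, Any]) -> List[Any]:
    """Pull the input rows out of a V1-style body: {"instances": [...]}."""
    instances = payload.get("instances") if isinstance(payload, dict) else None
    if not isinstance(instances, list) or not instances:
        raise ValueError('expected a JSON body like {"instances": [[...], ...]}')
    return instances


def load_session(model_path: str):
    """Create an ONNX Runtime session pinned to the CPU provider (no GPU needed)."""
    import onnxruntime as ort  # deferred so the parsing logic works without it

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def predict(session, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Run one batch through the model and return {"predictions": [...]}."""
    import numpy as np

    # Assumption: the model has a single float32 input tensor.
    batch = np.asarray(parse_instances(payload), dtype=np.float32)
    input_name = session.get_inputs()[0].name
    outputs = session.run(None, {input_name: batch})
    return {"predictions": outputs[0].tolist()}
```

In a KServe custom predictor, logic like this would typically live in a `kserve.Model` subclass, with `load_session` called from `load()` and `predict` from the model's `predict()` method. Check the repo for how the demo actually wires it up.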

Curious? The repo has all the details: code, manifests, and steps. Check it out at github.com/hbelmiro/kserve-onnx-predictor-demo and give it a spin!
