ONNX Runtime
An open-source, cross-platform inference engine from Microsoft for running machine learning models with hardware acceleration. ONNX (Open Neural Network Exchange) is a format for representing ML models; think of it as PDF for neural networks. ONNX Runtime is the engine that renders them. It supports models exported from PyTorch, TensorFlow, scikit-learn, and other frameworks, and runs on Windows, Linux, macOS, iOS, Android, and in the browser.
Why it matters for writers: ONNX Runtime lets you run embedding models, classification models, and other AI models locally in a .NET app without depending on a cloud API. For projects like FractalRecall (which needs to generate embeddings) or DocStratum (which could use local models for content analysis), ONNX Runtime is the path to local, offline-capable AI in C#. No API keys. No per-token billing. No sending your proprietary content to someone else's server. This matters more than the industry lets on.