Everinfer is a system that offloads inference of ONNX graphs to remote GPUs.

Core Features

Superior tech

  • See the simplest example of Everinfer in action.

  • Want to skip the boring parts and dive straight in? See how you could deploy Faster-RCNN with pre- and post-processing fused into a single graph with the model.

  • Doubt the latency and scalability claims? Take a look at GPT-2 running at 900 requests per second (RPS), still with four lines of code.

  • Stable Diffusion demo: offload the U-Net to remote GPUs while running the lightweight models locally.
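
The split described in the Stable Diffusion demo (heavy model remote, lightweight stages local) can be sketched in plain Python. This is a toy illustration only: `RemoteStub`, `toy_encoder`, `toy_unet`, and `toy_decoder` are hypothetical names invented for this sketch, a thread pool stands in for the network hop, and none of it is the Everinfer API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Vector = List[float]


class RemoteStub:
    """Hypothetical stand-in for a remote GPU worker; NOT the Everinfer API."""

    def __init__(self, model: Callable[[Vector, int], Vector]) -> None:
        self._model = model
        # A single-worker thread pool simulates the hop to a remote machine.
        self._pool = ThreadPoolExecutor(max_workers=1)

    def predict(self, latents: Vector, step: int) -> Vector:
        # Submit the heavy call and block until the "remote" result returns.
        return self._pool.submit(self._model, latents, step).result()

    def close(self) -> None:
        self._pool.shutdown()


def toy_encoder(prompt: str) -> Vector:
    # Lightweight local stage: map each character to a small number.
    return [float(ord(c) % 7) for c in prompt]


def toy_unet(latents: Vector, step: int) -> Vector:
    # Placeholder for the heavy model: halve every value on each step.
    return [x * 0.5 for x in latents]


def toy_decoder(latents: Vector) -> Vector:
    # Lightweight local stage: round the latents into a final output.
    return [round(x, 3) for x in latents]


def generate(prompt: str, remote: RemoteStub, steps: int = 3) -> Vector:
    latents = toy_encoder(prompt)                # runs locally
    for step in range(steps):
        latents = remote.predict(latents, step)  # offloaded on every step
    return toy_decoder(latents)                  # runs locally


remote = RemoteStub(toy_unet)
result = generate("ab", remote, steps=2)
remote.close()
print(result)
```

Only the iterative step crosses the "remote" boundary, so the local machine needs to hold just the small encoder and decoder. In a real deployment the round trip per step would dominate latency, which is why the split is placed around the most expensive model.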

