Everinfer
Model management



Model chaining

Everinfer lets you "chain" multiple ONNX graphs into a single pipeline. Use the pipeline creation syntax in the following way...

client.register_pipeline(
    'model_chaining_example',
    ['model_1.onnx', 'model_2.onnx', ..., 'model_N.onnx']
)

...to merge multiple models into a single graph. The outputs of each model are used as the inputs of the next model.

The output names of each model must match the input names of the next one!
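This constraint can be sketched in plain Python, with no Everinfer or ONNX dependency; the stage names (`pixels`, `hidden`, `logits`) are hypothetical, and each stage is represented as a `(input_names, output_names)` pair:

```python
def check_chain(stages):
    """Verify that each stage's output names match the next stage's input names.

    stages: list of (input_names, output_names) tuples, one per ONNX graph.
    """
    for (_, outs), (ins, _) in zip(stages, stages[1:]):
        if set(outs) != set(ins):
            raise ValueError(f"output names {outs} do not match input names {ins}")
    return True

# 'model_1' feeds 'hidden' into 'model_2' -- the names line up, so this chains.
ok_chain = [(["pixels"], ["hidden"]), (["hidden"], ["logits"])]
check_chain(ok_chain)
```

A mismatched pair, e.g. a first stage emitting `features` followed by a stage expecting `hidden`, would raise before any pipeline work is done.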

This can be used in a multitude of ways, for example:

  • Fuse pre- and post-processing into a single graph with the main model. Check out our Faster-RCNN example, which showcases that approach.

  • Do simple computations locally and offload demanding models to Everinfer. Our Stable Diffusion example does exactly that, offloading the U-Net model to remote GPUs.

  • Deploy huge models, such as Large Language Models, by splitting them into multiple graphs.
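The execution semantics behind all three use cases can be illustrated with a minimal pure-Python sketch, using plain callables as stand-ins for ONNX graphs (the stage names and functions here are hypothetical, not Everinfer API):

```python
def run_chain(models, inputs):
    """Feed a dict of name -> value through each stage; each stage's
    output dict becomes the next stage's input dict."""
    for model in models:
        inputs = model(inputs)
    return inputs

# A toy "preprocess -> predict" chain: the first stage's output name
# ('features') matches the second stage's input name.
preprocess = lambda d: {"features": [x / 255.0 for x in d["pixels"]]}
predict = lambda d: {"score": sum(d["features"])}

result = run_chain([preprocess, predict], {"pixels": [51, 102]})
```

Splitting a huge model works the same way in reverse: each sub-graph is one stage in the list, and intermediate tensors flow between stages by name.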

Got cool ideas and use cases for model chaining on Everinfer?

Please hit us up at hello@everinfer.ai; we will be glad to include them as examples and give you credit!