Limitations

  • Only one engine instance can be active per API key. Destroy every engine you create, along with its associated processes, as soon as you are done with it.

  • Hosting multiple models simultaneously requires a separate API key for each model. The Stable Diffusion demo shows how this is done.

  • Interrupting the Jupyter Notebook kernel does not guarantee that the engine is destroyed. Restart the kernel before initializing a new engine.

  • The maximum file size for each ONNX stage is 2GB. Uploading models with external weight data will be possible soon.

  • The maximum engine input and output sizes should be 3GB. The limit appears to be coded that way, but we have not yet tried anything truly gargantuan.

  • A prediction times out after 1 minute. Common causes are wrong input types and serialization issues, so check your inputs first.
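
The size limits in the list above can be checked locally before anything is uploaded or sent for prediction. The following is a minimal sketch using only the Python standard library, not part of the service's API: the helper names `check_stage_files` and `payload_fits` are hypothetical, and it assumes that "GB" means 2**30 bytes, which the list does not specify.

```python
import os

GB = 2**30  # assumption: binary gigabytes; the limits above only say "GB"
MAX_STAGE_BYTES = 2 * GB  # per-stage ONNX file limit from the list above
MAX_IO_BYTES = 3 * GB     # engine input/output limit from the list above


def check_stage_files(paths, limit=MAX_STAGE_BYTES):
    """Return (path, size_in_bytes) for every file larger than `limit`."""
    too_big = []
    for path in paths:
        size = os.path.getsize(path)
        if size > limit:
            too_big.append((path, size))
    return too_big


def payload_fits(buffers, limit=MAX_IO_BYTES):
    """Return True if the combined size of all buffers is within `limit`.

    Each buffer must support the buffer protocol (bytes, bytearray,
    NumPy arrays, and similar objects).
    """
    total = sum(memoryview(b).nbytes for b in buffers)
    return total <= limit
```

Running `check_stage_files` over the exported ONNX stages and `payload_fits` over the serialized inputs catches oversized data early, before it can surface as a failed upload or a prediction timeout.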
