Limitations

Only one engine instance can be active per API key. Please destroy any engines you create, along with their associated processes, once you are done with them.
Hosting multiple models simultaneously requires a separate API key for each model. See the Stable Diffusion demo for an example of how this is done.
Interrupting the Jupyter Notebook kernel does not guarantee that the engine is destroyed. Please restart the kernel before initializing a new engine.
The maximum file size for each ONNX stage is 2 GB. Uploading models with external weight data is not supported yet, but will be possible soon.
The maximum engine input and output sizes should be 3 GB. That is how the limit appears to be coded, but we have not tried anything truly gargantuan so far.
A prediction times out after 1 minute. Common causes of timeouts are wrong input types and serialization issues, so please check your inputs first.
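The size limits above can be checked locally before uploading a model or sending a request. The following is a minimal sketch, not part of the service's API: the helper names `stage_fits` and `payload_fits` are hypothetical, and it assumes that "GB" means 2**30 bytes (lower the constants if the service counts in decimal gigabytes).

```python
import os

# Limits taken from the list above.
# Assumption: "GB" is interpreted as 2**30 bytes.
MAX_STAGE_BYTES = 2 * 1024**3  # per ONNX stage file
MAX_IO_BYTES = 3 * 1024**3     # engine input or output


def stage_fits(path, limit=MAX_STAGE_BYTES):
    """Return True if the file at `path` is within the per-stage size limit."""
    return os.path.getsize(path) <= limit


def payload_fits(buffers, limit=MAX_IO_BYTES):
    """Return True if the combined size of bytes-like buffers is within the limit."""
    # memoryview accepts any object exposing the buffer protocol
    # (bytes, bytearray, and typically array types as well).
    total = sum(memoryview(b).nbytes for b in buffers)
    return total <= limit
```

Running a check like this before initializing an engine may save a round trip, since an oversized stage or payload would otherwise only fail after the upload or request has started.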
