Limitations
Only one engine instance can be active per API key. Please make every effort to destroy created engines and their associated processes after use.
Hosting multiple models simultaneously requires a separate API key for each model. See the Stable Diffusion demo for an example of how it is done.
Interrupting the Jupyter Notebook kernel does not guarantee engine destruction. Please restart the kernel before initializing a new engine.
The maximum file size for each ONNX stage is 2GB. Uploading models with external weight data will be possible soon.
The maximum engine input and output sizes should be 3GB (the limit appears to be coded that way, but we have not tried anything truly gargantuan so far).
Prediction will time out after 1 minute. Common timeout causes are wrong input types and serialization issues, so please check your inputs.
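Because an oversized file or payload may only surface as a failed upload or a timeout, it can help to check sizes locally first. The following is a minimal sketch using only the Python standard library. The helper names `check_onnx_stage_size` and `check_payload_size` are hypothetical and not part of any engine API. It also assumes that "2GB" and "3GB" mean binary gigabytes and that a size exactly at the limit is accepted, neither of which is confirmed above.

```python
import os

# Assumption: the documented "2GB" and "3GB" limits are binary gigabytes.
GIB = 1024 ** 3
MAX_ONNX_STAGE_BYTES = 2 * GIB  # per-stage ONNX file limit
MAX_IO_BYTES = 3 * GIB          # engine input/output limit


def check_onnx_stage_size(path, limit=MAX_ONNX_STAGE_BYTES):
    """Return the size of the file at `path` in bytes.

    Raises ValueError if the file is larger than `limit`.
    Hypothetical helper, not part of the engine API.
    """
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(
            f"{path} is {size} bytes, over the {limit}-byte stage limit"
        )
    return size


def check_payload_size(payload, limit=MAX_IO_BYTES):
    """Return the length of an already serialized `bytes` payload.

    Raises ValueError if the payload is larger than `limit`.
    Hypothetical helper, not part of the engine API.
    """
    size = len(payload)
    if size > limit:
        raise ValueError(
            f"payload is {size} bytes, over the {limit}-byte I/O limit"
        )
    return size
```

Calling `check_onnx_stage_size` on each stage file before uploading, and `check_payload_size` on serialized inputs before a prediction, turns a size problem into an immediate local error instead of a one-minute wait.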