Limitations
- Only one engine instance can be active per API key. Please make your best effort to destroy created engines and their associated processes after use.
- Hosting multiple models simultaneously requires a separate API key for each model. See the Stable Diffusion demo for an example of how this is done.
- Interrupting the Jupyter Notebook kernel does not guarantee engine destruction. Please restart the kernel before initializing a new engine.
- The maximum file size for each ONNX stage is 2GB. Uploading models with external weight data will be possible soon.
- The maximum engine input and output sizes should be 3GB (the limit appears to be coded that way, but we have not yet tried anything truly gargantuan).
- Prediction will time out after 1 minute. Common causes of timeouts are wrong input types and serialization issues. Please check your inputs.
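
The size limits above can be checked locally before anything is uploaded or sent for prediction. The following is a minimal sketch using only the Python standard library. It is not part of the service's API: the helper names `check_onnx_stage_size` and `check_payload_size` are hypothetical, and reading "2GB" and "3GB" as binary gigabytes (GiB) is an assumption, since the exact byte counts are not stated here.

```python
import os
import tempfile

# Assumption: the documented "2GB" and "3GB" limits are read as GiB.
MAX_ONNX_STAGE_BYTES = 2 * 1024**3
MAX_ENGINE_IO_BYTES = 3 * 1024**3


def check_onnx_stage_size(path, limit=MAX_ONNX_STAGE_BYTES):
    """Return the file size in bytes, or raise ValueError if it exceeds `limit`."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError(f"{path} is {size} bytes; the limit is {limit} bytes")
    return size


def check_payload_size(buffers, limit=MAX_ENGINE_IO_BYTES):
    """Return the combined size of bytes-like `buffers`, or raise ValueError
    if the total exceeds `limit`."""
    # memoryview accepts any object that supports the buffer protocol
    # (bytes, bytearray, array.array, NumPy arrays, ...).
    total = sum(memoryview(b).nbytes for b in buffers)
    if total > limit:
        raise ValueError(f"payload is {total} bytes; the limit is {limit} bytes")
    return total


if __name__ == "__main__":
    # Demonstrate the file check on a small throwaway file.
    with tempfile.NamedTemporaryFile(suffix=".onnx", delete=False) as f:
        f.write(b"\x00" * 1024)
    try:
        print(check_onnx_stage_size(f.name))
    finally:
        os.remove(f.name)

    # Demonstrate the payload check on two in-memory buffers.
    print(check_payload_size([b"abc", bytearray(5)]))
```

Running a check like this first will not catch wrong input types or serialization problems, but it rules out oversized files and payloads as the cause of a failed upload or a timed-out prediction.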