Segformer from HuggingFace
Deploy SegFormer to serverless machines with 4 lines of code
In this tutorial we demonstrate how smoothly Everinfer integrates with the amazing HuggingFace libraries by deploying the SegFormer computer vision model with zero effort.
Install Everinfer and the HuggingFace transformers library.
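Assuming both packages are published on PyPI under these names (the `everinfer` package name is an assumption here), installation looks like:

```shell
# everinfer package name assumed; transformers[onnx] pulls in the ONNX export extras
pip install everinfer 'transformers[onnx]'
```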
Convert the model to ONNX format:
!python3 -m transformers.onnx --model=nvidia/segformer-b0-finetuned-ade-512-512 onnx_segformer/
Authenticate on Everinfer using your API key, upload the model, and create inference engine:
from everinfer import Client
client = Client('my_api_key') # hit us up on [email protected] to get your key
pipeline = client.register_pipeline('segformer', ['onnx_segformer/model.onnx'])
runner = client.create_engine(pipeline['uuid'])
You are ready to go: only four lines of code to deploy your model to remote GPUs!
Since HuggingFace image preprocessors are fully compatible with Everinfer's expected input format, you can feed the feature extractor's outputs directly to the deployed model:
import requests
from transformers import SegformerFeatureExtractor
from PIL import Image
feature_extractor = SegformerFeatureExtractor.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
inputs = feature_extractor(images=image, return_tensors="np")
preds = runner.predict([inputs]) # runs on remote hardware!
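The exact return structure of `runner.predict` is Everinfer-specific, so the indexing in the commented-out line below is an assumption; but the underlying ONNX model emits SegFormer logits of shape (1, num_labels, H/4, W/4), which can be turned into a per-pixel segmentation map with an argmax over the class axis:

```python
import numpy as np

def logits_to_segmentation(logits: np.ndarray) -> np.ndarray:
    """Collapse (1, num_labels, h, w) logits into an (h, w) map of class ids."""
    return logits[0].argmax(axis=0)

# seg_map = logits_to_segmentation(preds[0]["logits"])  # output key name is an assumption
```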
Everinfer is highly efficient as it is, even while transferring tensors over the network. Let's check how fast the deployed model is:
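A simple way to measure end-to-end latency (a sketch; `runner` and `inputs` come from the snippets above) is to average wall-clock time over several calls:

```python
import time

def avg_latency(predict, batch, n_runs=10):
    """Average wall-clock seconds per predict(batch) call over n_runs runs."""
    start = time.perf_counter()
    for _ in range(n_runs):
        predict(batch)
    return (time.perf_counter() - start) / n_runs

# print(f"{avg_latency(runner.predict, [inputs]) * 1000:.1f} ms per request")
```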
Decent performance even without any additional data transfer optimization!
It is possible to speed up this deployment even further by fusing the pre-processing step into the ONNX graph and transferring only raw .jpg images over the network.