Models
Pipelogic supports deploying machine learning models through NVIDIA Triton Inference Server and TorchServe. Supported formats are frozen, exported models such as ONNX, TorchScript, and TensorFlow SavedModel.
Each ML framework provides its own tooling for exporting trained models. Below are helpful links for exporting models from common libraries:
- Hugging Face Transformers
- MMDeploy for OpenMMLab libraries such as MMPretrain, MMPose, and MMDetection
- TensorFlow Model Export
Step-by-Step Export and Model Preparation
Step 1: Download a Pretrained Model
Clone a repository or download a model from Hugging Face:
git clone https://huggingface.co/hustvl/yolos-tiny
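Alternatively, the same files can be fetched from Python; this is a minimal sketch assuming the huggingface_hub package is installed (it is typically pulled in as a dependency of the tools in the next step):

from huggingface_hub import snapshot_download

# Download every file in the model repository into the local Hugging Face cache
# and print the resulting directory.
local_dir = snapshot_download(repo_id="hustvl/yolos-tiny")
print(local_dir)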
Step 2: Install Conversion Tools
You will need the following tools to export and inspect models:
pip install optimum[exporters] polygraphy
Step 3: Export to ONNX Format
Use the optimum-cli tool to convert the model to ONNX format:
optimum-cli export onnx --model hustvl/yolos-tiny .
Rename the output file to model.onnx for consistency.
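The export can also be driven from Python. This is a sketch rather than a drop-in command, assuming Optimum's main_export entry point (the function behind optimum-cli export onnx) and a hypothetical output directory:

from optimum.exporters.onnx import main_export

# Export the checkpoint to ONNX; the task is auto-detected from the model.
# "yolos_onnx" is a hypothetical output directory.
main_export(
    model_name_or_path="hustvl/yolos-tiny",
    output="yolos_onnx",
)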
Step 4: Inspect the Model
To determine the model's input and output layers, use:
polygraphy inspect model model.onnx
This prints a summary that includes the input and output names, shapes, and data types, for example:
[I] ==== ONNX Model ====
Name: main_graph | ONNX Opset: 12
---- 1 Graph Input(s) ----
{pixel_values [dtype=float32, shape=('batch_size', 'num_channels', 'height', 'width')]}
---- 2 Graph Output(s) ----
{logits [dtype=float32, shape=('batch_size', 'num_queries', 92)],
pred_boxes [dtype=float32, shape=('batch_size', 'num_queries', 4)]}
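The same information can also be read programmatically, for example with onnxruntime (an optional extra dependency not installed in Step 2); symbolic dimensions such as batch_size appear as strings in the reported shapes:

import onnxruntime as ort

# Print each input and output tensor's name, shape, and data type.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
for tensor in session.get_inputs():
    print("input: ", tensor.name, tensor.shape, tensor.type)
for tensor in session.get_outputs():
    print("output:", tensor.name, tensor.shape, tensor.type)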
Step 5: Create the config.pbtxt File
Triton Inference Server requires a model configuration file named config.pbtxt. Refer to the Triton documentation for the complete reference.
Use the information from the previous step to define input and output layers:
input [
  {
    name: "pixel_values"
    data_type: TYPE_FP32
    dims: [ 1, 3, 512, 1333 ]
  }
]
output [
  {
    name: "logits"
    data_type: TYPE_FP32
    dims: [ -1, -1, 92 ]
  },
  {
    name: "pred_boxes"
    data_type: TYPE_FP32
    dims: [ -1, -1, 4 ]
  }
]
Save this as config.pbtxt. A complete configuration usually also sets the model name, the backend or platform (for an ONNX model, platform: "onnxruntime_onnx"), and max_batch_size; see the Triton documentation referenced above for details.
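To avoid transcribing shapes by hand, the input and output sections can also be generated from the ONNX graph itself. This sketch assumes the onnx package is installed and maps symbolic (named) dimensions to -1:

import onnx

# Map ONNX element types to the Triton type names used in config.pbtxt.
ONNX_TO_TRITON = {
    onnx.TensorProto.FLOAT: "TYPE_FP32",
    onnx.TensorProto.INT64: "TYPE_INT64",
}

def tensor_entry(value_info):
    tensor_type = value_info.type.tensor_type
    # Named (symbolic) dimensions have dim_value == 0 and become -1.
    dims = [d.dim_value if d.dim_value > 0 else -1 for d in tensor_type.shape.dim]
    return ('  {\n    name: "%s"\n    data_type: %s\n    dims: %s\n  }'
            % (value_info.name, ONNX_TO_TRITON[tensor_type.elem_type], dims))

model = onnx.load("model.onnx")
print("input [\n" + ",\n".join(tensor_entry(i) for i in model.graph.input) + "\n]")
print("output [\n" + ",\n".join(tensor_entry(o) for o in model.graph.output) + "\n]")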
Step 6: Prepare the Model Directory
The model must follow the directory structure expected by Triton Inference Server:
ONNX Example
model_name/
├── config.pbtxt
└── 1/
└── model.onnx
PyTorch Example
model_name/
├── config.pbtxt
└── 1/
└── model.pt
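In the PyTorch layout, model.pt is a TorchScript file. A minimal sketch of producing one with torch.jit.trace, using a torchvision classifier purely as a hypothetical example:

import torch
import torchvision

# Trace an example model with a fixed-size dummy input and save it as TorchScript.
model = torchvision.models.resnet18(weights=None).eval()
example_input = torch.randn(1, 3, 224, 224)
traced = torch.jit.trace(model, example_input)
traced.save("model.pt")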
TensorFlow Example
model_name/
├── config.pbtxt
└── 1/
└── model.savedmodel/
├── saved_model.pb
└── variables/
├── variables.data-00000-of-00001
└── variables.index
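Once the files exist, assembling the layout is plain file copying. A short sketch for the ONNX case, assuming model.onnx and config.pbtxt from the earlier steps sit in the current directory and using a hypothetical model name:

import shutil
from pathlib import Path

# Build model_repository/yolos_tiny/{config.pbtxt, 1/model.onnx}.
model_dir = Path("model_repository") / "yolos_tiny"
version_dir = model_dir / "1"
version_dir.mkdir(parents=True, exist_ok=True)

shutil.copy("config.pbtxt", model_dir / "config.pbtxt")
shutil.copy("model.onnx", version_dir / "model.onnx")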