Features
Types
Types are definitions used to serialize data into structured forms. They govern communication between components: two components can connect and communicate with each other only if their types match. Each component receives a data stream described by a specific type, decodes it, processes it and sends out a data stream.
Types are defined based on ANTLR g4 format.
Possible Types
- Atomic types: Int32, Int64, Bool, String, Bytes, Float, Double, UInt32, UInt64
- Lists: Denoted by [], they represent an arbitrary, finite number of elements of the same type: [Int32]
- Tuples: Represent an ordered collection of elements with static length and types. The syntax is similar to a Python tuple: (Bool, String, a), (), (UInt64,), ...
- Records (aka Structs): Like tuples, but unordered and with named fields: {name : String, age : UInt32, score : (UInt64, UInt64)}
- Named types: Any type can be given a name. The on-wire representation of a named type is the same as that of its base type, but the two are not directly compatible, though they can be converted into each other. E.g. person := {name : String, age : UInt32, score : (UInt64, UInt64)} is the named type person that holds a name, an age and a score for a competition. It is possible to introduce a named type without content, e.g. the type Nothing = ().
Named types can include other named types:
Image := {width:UInt64,height:UInt64,
format:Image.BGR|Image.BGRA|Image.GRAY|
Image.RGB|Image.RGBA,data:Bytes} // format can be BGR, BGRA, ...
VideoFrame := {image:Image,timestamp:UInt64} // VideoFrame is image with timestamp
Some types include a namespace, which indicates the workspace the type belongs to.
Type with Namespace
// this BoundingBox with an angle belongs to super_workspace
super_workspace.BoundingBox := {class:DetectedClass,rectangle:Rectangle<Double>, angle: Double}
- Unions: Unordered collections of different types that represent elements that might be of any of the given types. E.g. Int32 | Int64; Bool | String
The language also supports type variables, similar to generics in Python or templates in C++. A type variable is an arbitrary name starting with a lowercase letter (to distinguish it from types). Each variable denotes any single type and is scoped within a single component's definition. That is, the same variable refers to the same type everywhere within one component but not across two different components. E.g. a, b, myType
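The rule that a named type shares its representation with its base type, yet is not directly compatible with it, can be modeled in a few lines of Python. This is only an illustrative sketch: the classes Atomic, Record and Named and the helpers record, can_connect and unpack_named are hypothetical names invented here, not part of Pipelogic, and exact equality is assumed as the matching rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atomic:
    name: str            # e.g. "UInt32"


@dataclass(frozen=True)
class Record:
    fields: frozenset    # unordered set of (field_name, type) pairs


@dataclass(frozen=True)
class Named:
    name: str            # e.g. "person"
    base: object         # the underlying (base) type


def record(**fields):
    """Build a record type; the order of the fields does not matter."""
    return Record(frozenset(fields.items()))


def can_connect(output_type, input_type):
    """Assumed matching rule: two ports connect only if the types are equal."""
    return output_type == input_type


def unpack_named(named_type):
    """Explicit conversion from a named type to its base type."""
    return named_type.base


UInt32, String = Atomic("UInt32"), Atomic("String")
base = record(name=String, age=UInt32)
person = Named("person", base)

print(can_connect(record(age=UInt32, name=String), base))  # True: records are unordered
print(can_connect(person, base))                           # False: named type vs. base type
print(can_connect(unpack_named(person), base))             # True after explicit conversion
```

The last line mirrors the statement that the two kinds of type can be converted into each other even though they do not match directly.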
Named Type Definitions
Here are examples of how named types can be defined.
Basic Named Type
// DetectedClass is a type with the named fields 'id' and 'confidence'
DetectedClass := {id: UInt64, confidence: Double}
Template Type
Point<t0> := {x:t0,y:t0} // It is possible to insert any existing type definition as t0
// it is possible to build on top of existing template type definitions
Types built on Template Types
// Rectangle is a template type using an existing template type
Rectangle<t0> := {top_left:Point<t0>,bottom_right:Point<t0>}
// Combines DetectedClass and Rectangle
// Rectangle is initialized with Double
BoundingBox := {class:DetectedClass,rectangle:Rectangle<Double>}
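To make the substitution of t0 concrete, the following Python sketch expands Rectangle<Double> into its full record structure. The TEMPLATES table and the expand function are hypothetical helpers written for this illustration; they handle only single-parameter templates and are not how Pipelogic itself resolves types.

```python
# Hypothetical in-memory form of the definitions above:
# template name -> (type parameters, record fields).
TEMPLATES = {
    "Point": (["t0"], {"x": "t0", "y": "t0"}),
    "Rectangle": (["t0"], {"top_left": "Point<t0>", "bottom_right": "Point<t0>"}),
}


def expand(type_expr, bindings=None):
    """Recursively replace template instances by their record definitions."""
    bindings = bindings or {}
    if type_expr in bindings:                   # a bound type variable such as t0
        return bindings[type_expr]
    if "<" not in type_expr:                    # an atomic or non-template type
        return type_expr
    name, arg = type_expr[:-1].split("<", 1)    # "Point<t0>" -> ("Point", "t0")
    params, fields = TEMPLATES[name]
    inner = {params[0]: expand(arg, bindings)}  # single-parameter templates only
    return {field: expand(t, inner) for field, t in fields.items()}


print(expand("Rectangle<Double>"))
# {'top_left': {'x': 'Double', 'y': 'Double'}, 'bottom_right': {'x': 'Double', 'y': 'Double'}}
```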
Empty Types
// Creates types without content Image.BGR, Image.RGB, ...
Image.BGR = ()
Image.RGB = ()
Image.GRAY = ()
Image.RGBA = ()
Image.BGRA = ()
These types are used to define the color channel format in the type Image.
Types with Union
Image := {width:UInt64,height:UInt64,
format:Image.BGR|Image.BGRA|Image.GRAY|
Image.RGB|Image.RGBA,data:Bytes} // format can be either BGR, BGRA, ...
Components
A component is a container for compiled code, models and other dependencies that are needed to execute a particular task. A component can be shared and reused in multiple pipelines. Components may have typed inputs and outputs, which ensures they connect with other components only if the types match. Developers can use C++ or Python to program components. A component could, for instance, load a model and run inference with pre- and post-processing, or take a sequence of bounding boxes and apply a Kalman filter to track objects.
Components serve as containerized building blocks crafted for rapid prototyping, reusability, and deployment functionalities. They communicate with each other through the Pipelogic type system, ensuring a consistent and portable experience across both development and production environments.
Component Definition
In technical terms, a component is a stateless function, defined as follows:
func(Stream a, Stream b, Stream c...) returns (Stream x, Stream y, Stream z...)
Here, Stream a signifies the acceptance of a specific number of values of type a. The component's author determines the exact number of inputs and outputs, as well as the shape of inputs and outputs (i.e., the quantity of a in Stream a).
For example, consider the following simple definition:
func(Stream a, Stream b) returns (Stream c)
In this case, the component's author can specify that the function is executable when there are two values of a and three values of b, or three values of a and one value of b. The function executes as soon as at least one of these options becomes available, with prioritization specified by the author. Similarly, the author decides the output and may emit any number of values of c (possibly none) during execution. A pseudocode representation could look like this:
func(Stream a, Stream b) returns (Stream c) {
    if len(a) == 2 and len(b) == 3 {
        return {c1, c2, c3};
    } else if len(a) == 3 and len(b) == 1 {
        return {};
    }
}
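For comparison, the same firing rule can be written as runnable Python. This is a sketch under assumptions, not the Pipelogic API: try_fire is a hypothetical name, the input streams are modeled as deques, "available" is interpreted as "at least that many values are buffered", and the arithmetic that produces c1, c2 and c3 is arbitrary.

```python
from collections import deque
from typing import Optional


def try_fire(a: deque, b: deque) -> Optional[list]:
    """Run the component once if one of its input shapes is available.

    Returns the list of emitted c values, or None if it cannot fire yet.
    """
    # Option 1 (checked first, so it has priority): two a values and three b values.
    if len(a) >= 2 and len(b) >= 3:
        xs = [a.popleft() for _ in range(2)]
        ys = [b.popleft() for _ in range(3)]
        return [sum(xs) + y for y in ys]       # three outputs: c1, c2, c3
    # Option 2: three a values and one b value, producing no output.
    if len(a) >= 3 and len(b) >= 1:
        for _ in range(3):
            a.popleft()
        b.popleft()
        return []
    return None


print(try_fire(deque([1, 2]), deque([10, 20, 30])))  # [13, 23, 33]
print(try_fire(deque([1, 2, 3]), deque([10])))       # []
print(try_fire(deque([1]), deque([10])))             # None
```

The order of the two if statements is where the author's prioritization shows up: when both shapes are available, the first one wins.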
However, this approach has limitations; it lacks state and is purely functional, limiting its use in a production environment. To address this, the function can be redefined as follows:
func(VirtualInput, State, Stream a, Stream b, Stream c...) returns (NewState, Stream x, Stream y, Stream z...)
This redefinition resolves both issues while maintaining the initial properties. The author is freed from handling errors, and the state is entirely managed by the Pipelogic system. Additionally, the virtual input can be used to generate input and introduce side effects, such as generating random input or reading from a file.
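The idea of passing the state in and returning the new state can be sketched in Python as follows. The names running_total and run are hypothetical, the virtual input is left out for brevity, and run is only a stand-in for whatever the Pipelogic system does internally to manage state.

```python
def running_total(state: int, a_values: list) -> tuple:
    """Pure step function: (State, Stream a) -> (NewState, Stream x)."""
    outputs = []
    for value in a_values:
        state += value
        outputs.append(state)
    return state, outputs


def run(step, initial_state, batches):
    """Stand-in for the runtime: it, not the component, holds the state."""
    state, emitted = initial_state, []
    for batch in batches:
        state, out = step(state, batch)
        emitted.extend(out)
    return state, emitted


print(run(running_total, 0, [[1, 2], [3], [4, 5]]))  # (15, [1, 3, 6, 10, 15])
```

Because running_total never stores anything itself, it stays a pure function even though the overall behavior is stateful.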
Components are designed functionally due to their inherent functional benefits. Similar to map-reduce, solutions can scale and automatically parallelize without user input. Pipelogic's philosophy is not to create a new language that solves everything but to facilitate the integration of existing and new code. Users can write components in any language, and Pipelogic seamlessly integrates them with minimal latency.
The way a component is created varies with the language used. C++ makes it possible to create more efficient and optimized components, but it requires more knowledge and more detailed configuration. Alternatively, you can use the Python API, which handles the complexities of the component automatically.
Component Structure
A typical component structure will look like the following:
.
├── component.yml
├── .pipecomponent
├── src
│   ├── requirements.txt
│   └── main.py
└── tests
    ├── test_0
    │   ├── input_0.txt
    │   ├── input_1.txt
    │   └── output_0.txt
    └── test_1
        ├── input_0.txt
        ├── input_1.txt
        └── output_0.txt
- src/ -> contains the source code with all the input/output sources (main.py and requirements.txt for Python, main.cpp for C++)
- tests/ -> folder with a list of individual tests, each specifying the list of inputs and the list of expected outputs
- .pipecomponent -> contains the unique component_id for the Pipelogic Database
- component.yml -> component configuration file
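The layout above can be reproduced with a short Python script, which may help when starting a component by hand. This is a sketch only: scaffold_component is a hypothetical helper, and it creates empty placeholder files because the contents of .pipecomponent and of the test files are not specified here.

```python
from pathlib import Path
import tempfile


def scaffold_component(root: Path, num_tests: int = 2) -> list:
    """Create the directory layout shown above with empty placeholder files."""
    files = ["component.yml", ".pipecomponent", "src/requirements.txt", "src/main.py"]
    for i in range(num_tests):
        files += [
            f"tests/test_{i}/input_0.txt",
            f"tests/test_{i}/input_1.txt",
            f"tests/test_{i}/output_0.txt",
        ]
    for rel in files:
        path = root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    # Return what actually exists on disk, as POSIX-style relative paths.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    for name in scaffold_component(Path(tmp)):
        print(name)
```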
Configuration File
The configuration file specifies the component metadata, such as the name, the programming language in which the component is written, and the platform for which to compile it, i.e. linux/arm64, linux/amd64, or linux to specify both. It is also necessary to specify the input and output types. They can be specified by input_type and output_type respectively for a one-to-one component, or as lists given by input_types and output_types for a many-to-many component.
It is important to specify the correct input and output types, because components can connect and communicate with each other only if their types match.
Example Configuration File
name: "Double Value"
platform: linux
language: py
build_system: 1.1.3
tags: ["default", "one-to-one"]
worker:
  input_type: Int64
  output_type: Int64
The example configuration file describes the following:
- The component takes a data stream of type Int64 as input, multiplies each value by two and outputs a data stream of type Int64
- It is compiled for all linux architectures, both linux/arm64 and linux/amd64
- It is written in Python
- It has the default and one-to-one tags
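The rules described for the configuration file can be expressed as a small consistency check. The sketch below is hypothetical: check_config is not a Pipelogic function, the configuration is written as a plain Python dict instead of being parsed from component.yml, and it assumes that the type keys are nested under worker as in the example.

```python
ALLOWED_PLATFORMS = {"linux", "linux/arm64", "linux/amd64"}


def check_config(config: dict) -> list:
    """Return a list of problems found in a component configuration."""
    problems = []
    if config.get("platform") not in ALLOWED_PLATFORMS:
        problems.append(f"unknown platform: {config.get('platform')!r}")
    worker = config.get("worker", {})
    one_to_one = {"input_type", "output_type"} <= worker.keys()
    many_to_many = {"input_types", "output_types"} <= worker.keys()
    # Exactly one of the two styles must be present.
    if one_to_one == many_to_many:
        problems.append(
            "worker must define either input_type/output_type "
            "or input_types/output_types"
        )
    return problems


config = {
    "name": "Double Value",
    "platform": "linux",
    "language": "py",
    "build_system": "1.1.3",
    "tags": ["default", "one-to-one"],
    "worker": {"input_type": "Int64", "output_type": "Int64"},
}
print(check_config(config))  # []
```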
Transformations
Transformations are pre-defined generalized workers applicable for various purposes. They function as compact helper functions embedded within a component, facilitating data transformations and performing other useful tasks. The existing list of transformations includes:
- UnpackNamed: Unwraps an encapsulated object; for instance, if we have Just a, then the transformation will yield a.
  Example: BoundingBox -> {class:DetectedClass,rectangle:Rectangle<Double>}
- PackNamed: Takes a and encapsulates it into a named type, for example from a to Just a.
  Example: {class:DetectedClass,rectangle:Rectangle<Double>} -> BoundingBox
- UnpackTuple: Unwraps a stream of tuples, returning its elements as separate streams (($ts...) -> $ts...).
  Example: (Segmentation, BoundingBox) -> Segmentation and BoundingBox
- PackTuple: Encapsulates streams of tuple elements into a stream of tuples ($ts... -> ($ts...)).
  Example: Segmentation and BoundingBox -> (Segmentation, BoundingBox)
- UnpackRecord: Unpacks a record type into separate output streams. Behaves just like UnpackTuple, except that a record has field names instead of indexed types.
  Example 1: {class:DetectedClass,rectangle:Rectangle<Double>} -> DetectedClass and Rectangle<Double>
  Example 2: {class:DetectedClass, rectangle:Rectangle<Double>, counter:UInt64} -> DetectedClass, Rectangle<Double> and UInt64
- PackRecord: Packs input streams into a single stream of records.
  Example: DetectedClass, Rectangle<Double> and UInt64 -> {class:DetectedClass, rectangle:Rectangle<Double>, counter:UInt64}
- UnpackUnion: Splits a stream of unions into separate streams, one for every union option.
  Example: Image.BGR | Image.GRAY | Image.RGB -> Image.BGR, Image.GRAY, Image.RGB
- PackUnion: Packs input streams into a stream of unions.
  Example: Image.BGR, Image.GRAY, Image.RGB -> Image.BGR | Image.GRAY | Image.RGB
- Constant: Streams exactly one pre-defined value given as a parameter.
- DelayByOne: Delays the stream by one message, initially streaming a pre-defined value provided as a parameter.
  Example: If the streamed message is 1 and the pre-defined value is 0, then DelayByOne outputs 0 as the first message and 1 as the second.
- ConvertValue: Takes a value of one atomic type and converts it to another atomic type (a -> b). The conversion follows the C++ rules for the corresponding fundamental types.
  Example: From Int32 to Double, 1 -> 1.0.
- Flatten: Takes a list and streams the elements within that list ([a] -> a).
  Example: [Int32] -> Int32 turns [0, 1, 2, 3, 4] into 5 different messages: 0, 1, 2, 3 and 4
- Filter: Takes two values, a boolean and another value, and returns nothing if the boolean is false; otherwise, it returns that value (Bool, t -> t).
  Example: If the inputs are true and 1, it returns 1. If the inputs are false and 1, no message is sent.
- Repeat: Takes two inputs: the first is of type UInt64 and has value x, and the second is a stream of any type. The transformation streams each message of the second input x times.
  Example: If x is set to 5, the value 1 will be streamed 5 times.
  Input 1: 5, Input 2: ["this is a message"]
  Output: ["this is a message"], ["this is a message"], ["this is a message"], ["this is a message"], ["this is a message"]
- Length: Takes a list and returns its size ([t] -> UInt64).
  Example: Input [0, 1, 2, 3, 4], Output 5
- UniteStreams: Takes a number of streams and combines them into a single stream (t... -> t).
- SelectStreams: Takes a number of streams and an index, and streams a value from the stream selected by the index (UInt64, t... -> t).
- Shuffle: Takes a list of pairs and combines the pairs with the same key ([(K, V)] -> [(K, [V])]).
- Join: Functions as the SQL join operation ([(K, V1)], [(K, V2)] -> [(K, V1, V2)]).
- LiftUnroll: Takes a list, outputs its size and streams each of its elements as a separate message.
  Example: If the input is [0, 1, 2, 3, 4], the first output streams 5 and the second output streams 5 different messages: 0, 1, ..., 4.
- LiftReroll: Takes the streamed elements and puts them into a list; the first input is the size of the list.
  Example: If the first input stream contains 2, 3 and 5, the transformation streams the following lists:
  Message1, Message2, ..., Message10 -> [Message1, Message2], [Message3, Message4, Message5], [Message6, Message7, Message8, Message9, Message10]
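The behavior of several of the transformations listed above can be mimicked with plain Python functions over lists and generators. This is an illustrative sketch, not the Pipelogic implementation: the function names are invented here, streams are modeled as Python iterables, Join is assumed to be an inner join, and Repeat is assumed to pair one count with one message.

```python
from collections import defaultdict
from itertools import islice


def flatten(lists):                       # [a] -> a
    for items in lists:
        yield from items


def filter_stream(flags, values):         # Bool, t -> t
    for keep, value in zip(flags, values):
        if keep:
            yield value


def repeat(counts, values):               # UInt64, t -> t
    for count, value in zip(counts, values):
        for _ in range(count):
            yield value


def shuffle(pairs):                       # [(K, V)] -> [(K, [V])]
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return list(groups.items())


def join(left, right):                    # [(K, V1)], [(K, V2)] -> [(K, V1, V2)]
    return [(k, v1, v2) for k, v1 in left for k2, v2 in right if k == k2]


def lift_unroll(lists):                   # [t] -> sizes and elements
    sizes, elements = [], []
    for items in lists:
        sizes.append(len(items))
        elements.extend(items)
    return sizes, elements


def lift_reroll(sizes, elements):         # sizes, elements -> [t]
    it = iter(elements)
    return [list(islice(it, size)) for size in sizes]


print(list(flatten([[0, 1, 2, 3, 4]])))                # [0, 1, 2, 3, 4]
print(list(filter_stream([True, False], [1, 1])))      # [1]
print(list(repeat([5], [1])))                          # [1, 1, 1, 1, 1]
print(shuffle([("a", 1), ("b", 2), ("a", 3)]))         # [('a', [1, 3]), ('b', [2])]
print(join([("a", 1)], [("a", "x"), ("b", "y")]))      # [('a', 1, 'x')]
print(lift_unroll([[0, 1, 2, 3, 4]]))                  # ([5], [0, 1, 2, 3, 4])
print(lift_reroll([2, 3, 5], range(1, 11)))            # [[1, 2], [3, 4, 5], [6, 7, 8, 9, 10]]
```

LiftUnroll and LiftReroll are inverses of each other in this model: rerolling the unrolled elements with the recorded sizes gives back the original lists.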
Models
Pipelogic relies on NVIDIA Triton and TorchServe to run inference with ML models. Only exported, frozen models (for example from TensorFlow, PyTorch or ONNX) can be used. Each machine learning framework provides its own tool for exporting models.
Conversion tools of the most commonly used libraries:
Download the Model
git clone https://huggingface.co/hustvl/yolos-tiny
Install Conversion Tool
pip install optimum[exporters] polygraphy
Freeze Model
optimum-cli export onnx --model hustvl/yolos-tiny .
Rename the exported model to model.onnx.
Inspect Model
Inspect the model to find its input and output layers:
polygraphy inspect model model.onnx --mode=onnx
Output
[I] ==== ONNX Model ====
Name: main_graph | ONNX Opset: 12
---- 1 Graph Input(s) ----
{pixel_values [dtype=float32, shape=('batch_size', 'num_channels', 'height', 'width')]}
---- 2 Graph Output(s) ----
{logits [dtype=float32, shape=('batch_size', 'num_queries', 92)],
pred_boxes [dtype=float32, shape=('batch_size', 'num_queries', 4)]}
---- 214 Initializer(s) ----
---- 1306 Node(s) ----
Create Model Configuration
Triton needs a model configuration file, config.pbtxt, to run the model. Check the Triton documentation on how to prepare it.
Enter the input and output layers found in the previous step into the config.pbtxt file.
config.pbtxt
input: [
  {
    name: "pixel_values",
    data_type: TYPE_FP32,
    dims: [1, 3, 512, 1333]
  }
],
output: [
  {
    name: "logits",
    data_type: TYPE_FP32,
    dims: [-1, -1, 92]
  },
  {
    name: "pred_boxes",
    data_type: TYPE_FP32,
    dims: [-1, -1, 4]
  }
],
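When the layer names and shapes come from an inspection step, the input and output sections can be generated instead of typed by hand. The Python sketch below renders only those two sections, in the same layout as the example above; tensor_entry and render_io are hypothetical helper names, every tensor is assumed to be TYPE_FP32 unless stated otherwise, and the Triton documentation remains the reference for which other fields a configuration needs.

```python
def tensor_entry(name, dims, data_type="TYPE_FP32"):
    """Render one tensor description in the layout used above."""
    dims_text = ", ".join(str(d) for d in dims)
    return (
        "  {\n"
        f'    name: "{name}",\n'
        f"    data_type: {data_type},\n"
        f"    dims: [{dims_text}]\n"
        "  }"
    )


def render_io(inputs, outputs):
    """Render the input and output sections of a config.pbtxt as text."""
    sections = []
    for key, tensors in (("input", inputs), ("output", outputs)):
        entries = ",\n".join(tensor_entry(n, d) for n, d in tensors.items())
        sections.append(f"{key}: [\n{entries}\n],")
    return "\n".join(sections) + "\n"


config_text = render_io(
    inputs={"pixel_values": [1, 3, 512, 1333]},
    outputs={"logits": [-1, -1, 92], "pred_boxes": [-1, -1, 4]},
)
print(config_text)
```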
Prepare Model Structure
Models should follow the Triton Inference Server file structure.
Model Structures
+-- model_name
    |
    +-- config.pbtxt
    +-- 1
        |
        +-- model.onnx
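The structure above can be assembled with a few lines of Python. This is a minimal sketch: build_model_dir is a hypothetical helper, the model name yolos_tiny is chosen only for illustration, and placeholder bytes and text stand in for the real model.onnx and config.pbtxt contents.

```python
from pathlib import Path
import tempfile


def build_model_dir(root: Path, model_name: str, onnx_bytes: bytes, config_text: str) -> Path:
    """Create <root>/<model_name>/config.pbtxt and <root>/<model_name>/1/model.onnx."""
    model_dir = root / model_name
    version_dir = model_dir / "1"              # version directory, as in the layout above
    version_dir.mkdir(parents=True, exist_ok=True)
    (model_dir / "config.pbtxt").write_text(config_text)
    (version_dir / "model.onnx").write_bytes(onnx_bytes)
    return model_dir


with tempfile.TemporaryDirectory() as tmp:
    model_dir = build_model_dir(Path(tmp), "yolos_tiny", b"placeholder", "# config goes here\n")
    for path in sorted(model_dir.rglob("*")):
        print(path.relative_to(tmp).as_posix())
```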