Features

Types

Types are definitions that describe how data is serialized into structured forms. They govern communication between components: two components can connect and communicate only if their types match. Each component receives a data stream described by a specific type, decodes it, processes it, and sends out a data stream.

Types are defined based on ANTLR g4 format.

Possible Types

  • Atomic types: Int32, Int64, Bool, String, Bytes, Float, Double, UInt32, UInt64

  • Lists: Denoted by [], they represent an arbitrary, finite number of elements of the same type: [Int32]

  • Tuples: Represent an ordered collection of elements with static length and types. The syntax is similar to Python tuples: (Bool, String, a), (), (UInt64,), ...

  • Records (aka Structs): Like tuples, but unordered and with named fields: {name : String, age : UInt32, score : (UInt64, UInt64)}

  • Named types: Any type can be given a name. The on-wire representation of a named type is the same as that of its base type, but the two are not directly compatible, although each can be converted to the other. E.g. person := {name : String, age : UInt32, score : (UInt64, UInt64)} defines the named type person, which holds a name, an age and a score for a competition. It is possible to introduce a named type without content, e.g. type Nothing = ().

Named types can include other named types:

Image := {width:UInt64,height:UInt64,
      format:Image.BGR|Image.BGRA|Image.GRAY|
      Image.RGB|Image.RGBA,data:Bytes} // format can be BGR, BGRA, ...
VideoFrame := {image:Image,timestamp:UInt64} // VideoFrame is image with timestamp

Type with Namespace

// This BoundingBox with an angle belongs to super_workspace
super_workspace.BoundingBox := {class:DetectedClass,rectangle:Rectangle<Double>, angle: Double}

  • Unions: Unordered collections of different types that represent elements that might be of any of the given types. E.g. Int32 | Int64; Bool | String

The language also supports type variables, similar to generics in Python or templates in C++. A type variable is an arbitrary name starting with a lowercase letter (to distinguish it from types), e.g. a, b, myType. A type variable denotes any single type and is scoped to a single component's definition: the same variable refers to the same type everywhere within one component, but not across two different components.

Named Type Definitions

Here are examples of how named types can be defined.

Basic Named Type

// DetectedClass is a type with the named fields 'id' and 'confidence'
DetectedClass := {id: UInt64, confidence: Double}

Template Type

Point<t0> := {x:t0,y:t0} // Any existing type definition can be substituted for t0
// New types can be built on top of existing template type definitions

Types built on Template Types

// Rectangle is a template type using an existing template type
Rectangle<t0> := {top_left:Point<t0>,bottom_right:Point<t0>}
// Combines DetectedClass and Rectangle
// Rectangle is initialized with Double
BoundingBox := {class:DetectedClass,rectangle:Rectangle<Double>}

Empty Types

// Creates types without content Image.BGR, Image.RGB, ...
Image.BGR = () 
Image.RGB = ()
Image.GRAY = ()
Image.RGBA = ()
Image.BGRA = ()

Types with Union

Image := {width:UInt64,height:UInt64,
      format:Image.BGR|Image.BGRA|Image.GRAY|
      Image.RGB|Image.RGBA,data:Bytes} // format can be either BGR, BGRA, ...
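For readers more familiar with Python, the template definitions above have a rough analogue in generic dataclasses. This is only an illustrative sketch, not how Pipelogic represents types: the Python class names mirror the definitions above, `detected_class` is a renamed field (an assumption forced by `class` being a Python keyword), and UInt64 and Double are approximated by `int` and `float`.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")

@dataclass
class Point(Generic[T]):           # Point<t0> := {x:t0, y:t0}
    x: T
    y: T

@dataclass
class Rectangle(Generic[T]):       # Rectangle<t0> := {top_left:Point<t0>, bottom_right:Point<t0>}
    top_left: Point[T]
    bottom_right: Point[T]

@dataclass
class DetectedClass:               # DetectedClass := {id: UInt64, confidence: Double}
    id: int
    confidence: float

@dataclass
class BoundingBox:                 # BoundingBox := {class:DetectedClass, rectangle:Rectangle<Double>}
    detected_class: DetectedClass  # renamed: "class" is reserved in Python
    rectangle: Rectangle[float]

box = BoundingBox(
    DetectedClass(id=3, confidence=0.92),
    Rectangle(Point(10.0, 20.0), Point(110.0, 220.0)),
)
# Width of the box, computed from the two corner points.
print(box.rectangle.bottom_right.x - box.rectangle.top_left.x)  # 100.0
```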

Components

A component is a container for compiled code, models and other dependencies needed to execute a particular task. A component can be shared and reused in multiple pipelines. Components may have typed inputs and outputs, which ensures that they connect with other components only if the types match. Developers can use C++ or Python to program components. A component could, for instance, load a model and run inference with pre- and post-processing, or take a sequence of bounding boxes and apply a Kalman filter to track objects.

Components serve as containerized building blocks crafted for rapid prototyping, reusability, and deployment functionalities. They communicate with each other through the Pipelogic type system, ensuring a consistent and portable experience across both development and production environments.

Component Definition

In technical terms, a component is a stateless function, defined as follows:

func(Stream a, Stream b, Stream c...) returns (Stream x, Stream y, Stream z...)

Here, Stream a signifies the acceptance of a specific number of values of type a. The component's author determines the exact number of inputs and outputs, as well as the shape of inputs and outputs (i.e., the quantity of a in Stream a).

For example, consider the following simple definition:

func(Stream a, Stream b) returns (Stream c)

In this case, the component's author can specify that the function is executable when there are two values of a and three values of b or three values of a and one value of b. The function will execute as soon as at least one of these options becomes available, with prioritization specified by the author. Similarly, the author decides the output, determining any amount of values of c (possibly none) during execution. A pseudocode representation could look like this:

func(Stream a, Stream b) returns (Stream c) {
  if len(a) == 2 and len(b) == 3 {
    return {c1, c2, c3};
  } else if len(a) == 3 and len(b) == 1 {
    return {};
  }
}
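The firing rules in the pseudocode can be made concrete with a small runnable example. This is a hedged Python sketch rather than the Pipelogic API: the name `try_fire`, the use of deques as input buffers, the priority order of the two rules, and the arithmetic that produces the outputs are all assumptions chosen for illustration.

```python
from collections import deque

def try_fire(a, b):
    """Consume buffered inputs and return outputs once a firing rule is met.

    `a` and `b` are deques of pending values. Returns None when no rule
    can fire yet.
    """
    # Rule 1 (given priority in this sketch): two values of a, three of b.
    if len(a) >= 2 and len(b) >= 3:
        xs = [a.popleft() for _ in range(2)]
        ys = [b.popleft() for _ in range(3)]
        return [sum(xs) + y for y in ys]   # emits three values of c
    # Rule 2: three values of a and one value of b.
    if len(a) >= 3 and len(b) >= 1:
        for _ in range(3):
            a.popleft()
        b.popleft()
        return []                          # emits no values of c
    return None

a, b = deque([1, 2, 3]), deque([10])
print(try_fire(a, b))        # [] (rule 2 fires and emits nothing)
a.extend([1, 2])
b.extend([10, 20, 30])
print(try_fire(a, b))        # [13, 23, 33] (rule 1 fires)
print(try_fire(a, b))        # None (buffers are empty)
```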

However, this approach has limitations; it lacks state and is purely functional, limiting its use in a production environment. To address this, the function can be redefined as follows:

func(VirtualInput, State, Stream a, Stream b, Stream c...) returns (NewState, Stream x, Stream y, Stream z...)

This redefinition resolves both issues while maintaining the initial properties. The author is freed from handling errors, and the state is entirely managed by the Pipelogic system. Additionally, the virtual input can be used to generate input and introduce side effects, such as generating random input or reading from a file.
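The stateful signature can be sketched in Python as a pure step function that receives the previous state and returns the new one. The names `running_sum` and `run` are hypothetical, `run` is only a toy stand-in for the runtime that owns the state, and the virtual input is omitted for brevity.

```python
def running_sum(state, a_values):
    """One execution step: (State, Stream a) -> (NewState, Stream x)."""
    if state is None:            # first execution, no previous state yet
        state = 0
    outputs = []
    for value in a_values:
        state += value
        outputs.append(state)
    return state, outputs

def run(step, batches):
    """Toy runtime: keeps the state between executions and feeds it back in."""
    state, results = None, []
    for batch in batches:
        state, out = step(state, batch)
        results.extend(out)
    return results

print(run(running_sum, [[1, 2], [3], [4, 5]]))  # [1, 3, 6, 10, 15]
```

Because the step function never stores anything itself, the same function can be restarted or moved as long as the caller preserves the state value.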

Components are designed functionally due to their inherent functional benefits. Similar to map-reduce, solutions can scale and automatically parallelize without user input. Pipelogic's philosophy is not to create a new language that solves everything but to facilitate the integration of existing and new code. Users can write components in any language, and Pipelogic seamlessly integrates them with minimal latency.

Component creation varies with the language used. C++ allows more efficient and optimized components, but it requires more knowledge and detailed configuration. Alternatively, you can use the Python API, which handles the complexities of a component automatically.

Component Structure

A typical component structure will look like the following:

.
├── component.yml
├── .pipecomponent
├── src
│   ├── requirements.txt
│   └── main.py
└── tests
    ├── test_0
    │   ├── input_0.txt
    │   ├── input_1.txt
    │   └── output_0.txt
    └── test_1
        ├── input_0.txt
        ├── input_1.txt
        └── output_0.txt

  • src/ -> contains the source code with all the input/output sources (main.py and requirements.txt for Python, main.cpp for C++)

  • tests/ -> contains individual tests, each specifying a list of inputs and a list of expected outputs

  • .pipecomponent -> contains the unique component_id for the Pipelogic database

  • component.yml -> the component configuration file

Configuration File

The configuration file specifies the component metadata, such as the name, the programming language in which the component is written, and the platform for which to compile it: linux/arm64, linux/amd64, or linux to specify both. The input and output types must also be specified, either with input_type and output_type for a one-to-one component, or as lists with input_types and output_types for a many-to-many component.

It is important to specify the correct input and output types, because components can connect and communicate with each other only if their types match.

Example Configuration File

name: "Double Value"
platform: linux
language: py
build_system: 1.1.3
tags: ["default", "one-to-one"]
worker:
  input_type: Int64
  output_type: Int64

The example configuration file describes the following:

  • The component takes a data stream of type Int64 as input, multiplies each value by two, and outputs a data stream of type Int64

  • It is compiled for all Linux architectures, both linux/arm64 and linux/amd64

  • It is written in Python

  • It has the default and one-to-one tags
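To show how the one-to-one and many-to-many spellings relate, the sketch below normalises both to lists and checks whether two components could be connected. It works on plain dictionaries that mirror component.yml; the helper names, the second component "To Text", and the rule that ports match when their type strings are identical are all assumptions for illustration, not Pipelogic's actual matching logic.

```python
def as_list(worker, single_key, list_key):
    """Normalise the one-to-one and many-to-many spellings to a list."""
    if list_key in worker:
        return list(worker[list_key])
    if single_key in worker:
        return [worker[single_key]]
    return []

def worker_types(config):
    """Return (input_types, output_types) for a parsed configuration."""
    worker = config["worker"]
    return (as_list(worker, "input_type", "input_types"),
            as_list(worker, "output_type", "output_types"))

def can_connect(upstream, out_index, downstream, in_index):
    """Assumption: two ports match when their type strings are identical."""
    _, outputs = worker_types(upstream)
    inputs, _ = worker_types(downstream)
    return outputs[out_index] == inputs[in_index]

# Mirrors the example configuration file above.
double_value = {"name": "Double Value", "platform": "linux", "language": "py",
                "worker": {"input_type": "Int64", "output_type": "Int64"}}
# Hypothetical second component, not taken from the documentation.
to_text = {"name": "To Text",
           "worker": {"input_types": ["Int64"], "output_types": ["String"]}}

print(worker_types(double_value))                 # (['Int64'], ['Int64'])
print(can_connect(double_value, 0, to_text, 0))   # True:  Int64 -> Int64
print(can_connect(to_text, 0, double_value, 0))   # False: String -> Int64
```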

Transformations

Transformations are pre-defined generalized workers applicable for various purposes. They function as compact helper functions embedded within a component, facilitating data transformations and performing other useful tasks. The existing list of transformations includes:

  • UnpackNamed: Unwraps an encapsulated object; for instance, if we have Just a, then the transformation will yield a.

    Example: BoundingBox -> {class:DetectedClass,rectangle:Rectangle<Double>}

  • PackNamed: Takes a and encapsulates it into a named type, for example from a to Just a.

    Example: {class:DetectedClass,rectangle:Rectangle<Double>} -> BoundingBox

  • UnpackTuple: Unwraps a stream of tuples, returning the elements as separate streams: ($ts...) -> $ts...

    Example: (Segmentation, BoundingBox) -> Segmentation and BoundingBox

  • PackTuple: Encapsulates streams of tuple elements into a stream of tuples: $ts... -> ($ts...)

    Example: Segmentation and BoundingBox -> (Segmentation, BoundingBox)

  • UnpackRecord: Unpacks a record type into separate output streams. Behaves just like UnpackTuple, except that elements are identified by field names instead of positions.

    Example 1: {class:DetectedClass,rectangle:Rectangle<Double>} -> DetectedClass and Rectangle<Double>

    Example 2: {class:DetectedClass, rectangle:Rectangle<Double>, counter:UInt64} -> DetectedClass, Rectangle<Double> and UInt64

  • PackRecord: Packs input streams into a single stream of records.

    Example: DetectedClass, Rectangle<Double> and UInt64 -> {class:DetectedClass, rectangle:Rectangle<Double>, counter:UInt64}

  • UnpackUnion: Splits stream of unions into separate streams for every union option.

    Example: Image.BGR | Image.GRAY | Image.RGB -> Image.BGR, Image.GRAY, Image.RGB

  • PackUnion: Packs input streams into a stream of unions

    Example: Image.BGR, Image.GRAY, Image.RGB -> Image.BGR | Image.GRAY | Image.RGB

  • Constant: Streams exactly one pre-defined value given as a parameter.

  • DelayByOne: Delays the stream by one message, initially streaming a pre-defined value provided as a parameter.

    Example: If the streamed message is 1 and the predefined value is 0, then DelayByOne outputs 0 as the first message and 1 as the second.

  • ConvertValue: Takes a value of one atomic type and converts it to another atomic type (a -> b). Conversion follows the C++ rules for the corresponding fundamental types.

    Example: From Int32 to Double, 1 -> 1.0.

  • Flatten: Takes a list and streams the elements within that list [a] -> a.

    Example: [Int32] -> Int32 turns [0, 1, 2, 3, 4] into 5 separate messages: 0, 1, 2, 3 and 4.

  • Filter: Takes two values, a boolean and another value, and returns nothing if the boolean is false; otherwise, it returns that value (Bool, t -> t)

    Example: If the inputs are true and 1, it returns 1. If the inputs are false and 1, no message is sent.

  • Repeat: Takes two inputs: the first is of type UInt64 and has value x, and the second is a stream of any type. The transformation streams each message of the second input x times.

    Example: If x is set to 5, the value 1 will be streamed 5 times.

    Input 1: 5, Input 2: ["this is a message"]

    Output: ["this is a message"], ["this is a message"], ["this is a message"], ["this is a message"], ["this is a message"]

  • Length: Takes a list and returns its size ([t] -> UInt64).

    Example: Input [0, 1, 2, 3, 4], Output 5

  • UniteStreams: Takes a number of streams and combines them into a single stream (t... -> t).

  • SelectStreams: Takes a number of streams and an index and streams a value from the stream selected by the index (UInt64, t... -> t).

  • Shuffle: Takes a list of pairs and combines the pairs with the same key ([(K, V)] -> [(K, [V])]).

  • Join: Functions as the SQL join operation ([(K, V1)], [(K, V2)] -> [(K, V1, V2)]).

  • LiftUnroll: Takes a list, outputs its size, and streams each of its elements as a separate message.

    Example: If the input is [0, 1, 2, 3, 4], the first output streams 5 and the second output streams 5 separate messages: 0, 1, ..., 4.

  • LiftReroll: Takes streamed elements and collects them into lists; the first input gives the size of each list.

    Example: If the first input stream contains 2, 3 and 5, the following lists are streamed:

    Message1, Message2, ..., Message10 -> [Message1, Message2], [Message3, Message4, Message5], [Message6, Message7, Message8, Message9, Message10]
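Several of the transformations above can be approximated with plain Python functions over iterables. This is only a sketch of the semantics described in the list, not Pipelogic's implementation: the function names are hypothetical, streams are modelled as Python iterables, `lift_unroll` returns both outputs as lists for simplicity, and Join is assumed to behave like an SQL inner join.

```python
from itertools import islice

def flatten(lists):                   # [a] -> a
    for items in lists:
        yield from items

def filter_stream(flags, values):     # Bool, t -> t
    for flag, value in zip(flags, values):
        if flag:
            yield value

def repeat(counts, values):           # UInt64, t -> t
    for count, value in zip(counts, values):
        for _ in range(count):
            yield value

def lift_unroll(lists):               # [t] -> UInt64, t
    lists = list(lists)
    sizes = [len(items) for items in lists]
    elements = [x for items in lists for x in items]
    return sizes, elements

def lift_reroll(sizes, values):       # UInt64, t -> [t]
    values = iter(values)
    for size in sizes:
        yield list(islice(values, size))

def shuffle(pairs):                   # [(K, V)] -> [(K, [V])]
    groups = {}
    for key, value in pairs:
        groups.setdefault(key, []).append(value)
    return list(groups.items())

def join(left, right):                # [(K, V1)], [(K, V2)] -> [(K, V1, V2)]
    return [(k1, v1, v2) for k1, v1 in left for k2, v2 in right if k1 == k2]

print(list(flatten([[0, 1, 2, 3, 4]])))              # [0, 1, 2, 3, 4]
print(list(filter_stream([True, False], [1, 1])))    # [1]
print(list(repeat([2], ["this is a message"])))      # the message, twice
print(lift_unroll([[0, 1, 2, 3, 4]]))                # ([5], [0, 1, 2, 3, 4])
print(list(lift_reroll([2, 3, 5], range(1, 11))))    # [[1, 2], [3, 4, 5], [6, 7, 8, 9, 10]]
print(shuffle([("a", 1), ("b", 2), ("a", 3)]))       # [('a', [1, 3]), ('b', [2])]
print(join([("a", 1), ("b", 2)], [("a", "x")]))      # [('a', 1, 'x')]
```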

Models

Pipelogic relies on NVIDIA Triton and TorchServe to run inference with ML models. Only exported, frozen models (for example from TensorFlow, PyTorch or ONNX) can be used. Each machine learning framework provides its own tool for exporting models.

Conversion tools of the most commonly used libraries:

Download the Model

git clone https://huggingface.co/hustvl/yolos-tiny

Install Conversion Tool

pip install "optimum[exporters]" polygraphy

Freeze Model

optimum-cli export onnx --model hustvl/yolos-tiny .

Rename exported model to model.onnx.

Inspect Model

Inspect the model to find its input and output layers

polygraphy inspect model model.onnx --mode=onnx

Output

  [I] ==== ONNX Model ====
  Name: main_graph | ONNX Opset: 12

  ---- 1 Graph Input(s) ----
  {pixel_values [dtype=float32, shape=('batch_size', 'num_channels', 'height', 'width')]}

  ---- 2 Graph Output(s) ----
  {logits [dtype=float32, shape=('batch_size', 'num_queries', 92)],
   pred_boxes [dtype=float32, shape=('batch_size', 'num_queries', 4)]}

  ---- 214 Initializer(s) ----

  ---- 1306 Node(s) ----

Create Model Configuration

Enter the input and output layers from the previous step into the config.pbtxt file.

config.pbtxt

input: [
  {
    name: "pixel_values",
    data_type: TYPE_FP32,
    dims: [
      1,
      3,
      512,
      1333
    ]
  }
],
output: [
  {
    name: "logits",
    data_type: TYPE_FP32,
    dims: [
      -1,
      -1,
      92
    ]
  },
  {
    name: "pred_boxes",
    data_type: TYPE_FP32,
    dims: [
      -1,
      -1,
      4
    ]
  }
],
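When a model has many layers, writing these entries by hand is error-prone. The following Python sketch renders the input and output sections from layer names and dimensions. It is a convenience illustration only: `render_tensor` and `render_config` are hypothetical helpers, every layer is assumed to be TYPE_FP32 unless stated otherwise, and the output covers only the two sections shown in this example, not a complete Triton configuration.

```python
def render_tensor(name, dims, data_type="TYPE_FP32"):
    """Render one layer entry with its name, data type and dimensions."""
    dims_text = ", ".join(str(d) for d in dims)
    lines = [
        "  {",
        f'    name: "{name}",',
        f"    data_type: {data_type},",
        f"    dims: [{dims_text}]",
        "  }",
    ]
    return "\n".join(lines)

def render_config(inputs, outputs):
    """Render the input and output sections as config.pbtxt text."""
    parts = []
    for section, tensors in (("input", inputs), ("output", outputs)):
        body = ",\n".join(render_tensor(name, dims) for name, dims in tensors)
        parts.append(f"{section}: [\n{body}\n]")
    return ",\n".join(parts) + "\n"

# Layer names and shapes taken from the inspection step of this example.
config_text = render_config(
    inputs=[("pixel_values", [1, 3, 512, 1333])],
    outputs=[("logits", [-1, -1, 92]), ("pred_boxes", [-1, -1, 4])],
)
print(config_text)
```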

Prepare Model Structure

Models should follow the Triton Inference Server file structure.

Model Structure

+-- model_name
    |
    +-- config.pbtxt
    +-- 1
        |
        +-- model.onnx
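The layout can also be created with a short script. The sketch below assumes the usual Triton convention of model_name/config.pbtxt plus a numbered version directory holding model.onnx; `build_model_repository` and the model name `yolos_tiny` are hypothetical, and a placeholder file stands in for a real exported model.

```python
import shutil
import tempfile
from pathlib import Path

def build_model_repository(exported_model, repository, model_name,
                           config_text, version="1"):
    """Create <repository>/<model_name>/ with config.pbtxt and <version>/model.onnx."""
    model_dir = Path(repository) / model_name
    version_dir = model_dir / version
    version_dir.mkdir(parents=True, exist_ok=True)
    (model_dir / "config.pbtxt").write_text(config_text)
    # The exported file is copied under the expected name model.onnx.
    shutil.copyfile(exported_model, version_dir / "model.onnx")
    return model_dir

with tempfile.TemporaryDirectory() as tmp:
    exported = Path(tmp) / "exported.onnx"
    exported.write_bytes(b"placeholder")   # stands in for a real exported model
    model_dir = build_model_repository(
        exported, Path(tmp) / "models", "yolos_tiny", "input: []\n")
    layout = sorted(p.relative_to(model_dir).as_posix()
                    for p in model_dir.rglob("*"))
    print(layout)  # ['1', '1/model.onnx', 'config.pbtxt']
```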
