Components
A component is a container for compiled code, models, and other dependencies needed to execute a particular task. A component can be shared and reused in multiple pipelines. Components may have typed inputs and outputs, which ensures they connect to other components only if the types match. Developers can write components in C++ or Python. A component could, for instance, load a model and run inference with pre- and post-processing, or take a sequence of bounding boxes and apply a Kalman filter to track objects.
Components serve as containerized building blocks designed for rapid prototyping, reuse, and deployment. They communicate with each other through the Pipelogic type system, ensuring a consistent and portable experience across both development and production environments.
Component Definition
In technical terms, a component is a stateless function, defined as follows:
func(Stream a, Stream b, Stream c...) returns (Stream x, Stream y, Stream z...)
Here, Stream a means that the component accepts a specific number of values of type a. The component's author determines the exact number of inputs and outputs, as well as their shape (i.e., how many values of a are taken from Stream a).
For example, consider the following simple definition:
func(Stream a, Stream b) returns (Stream c)
In this case, the component's author can specify that the function is executable when there are two values of a and three values of b, or three values of a and one value of b. The function executes as soon as at least one of these options becomes available, with the priority between them specified by the author. Similarly, the author decides the output, producing any number of values of c (possibly none) during execution. A pseudocode representation could look like this:
func(Stream a, Stream b) returns (Stream c) {
    if len(a) == 2 and len(b) == 3 {
        return {c1, c2, c3};
    } else if len(a) == 3 and len(b) == 1 {
        return {};
    }
}
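The firing rules in the pseudocode can be sketched in plain Python. This is an illustration only, not the Pipelogic API: the function name `fire`, the use of lists as buffered streams, the `>=` availability check, and the placeholder arithmetic are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def fire(a: List[int], b: List[int]) -> Optional[Tuple[int, int, List[int]]]:
    """Try to execute once on the values buffered so far.

    Returns (consumed_a, consumed_b, outputs), or None if no rule matches yet.
    """
    # Rule 1 (checked first, so it has priority): two values of a and
    # three values of b produce three values of c.
    if len(a) >= 2 and len(b) >= 3:
        c = [a[0] + a[1] + x for x in b[:3]]  # placeholder computation
        return 2, 3, c
    # Rule 2: three values of a and one value of b produce no output.
    if len(a) >= 3 and len(b) >= 1:
        return 3, 1, []
    # Neither option is available yet; wait for more input.
    return None
```

The order of the `if` branches plays the role of the author-specified prioritization: when both options are available, the first one wins.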
However, this approach has limitations; it lacks state and is purely functional, limiting its use in a production environment. To address this, the function can be redefined as follows:
func(VirtualInput, State, Stream a, Stream b, Stream c...) returns (NewState, Stream x, Stream y, Stream z...)
This redefinition resolves both issues while maintaining the initial properties. The author is freed from handling errors, and the state is entirely managed by the Pipelogic system. Additionally, the virtual input can be used to generate input and introduce side effects, such as generating random input or reading from a file.
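A minimal Python sketch of the redefined signature, under stated assumptions: `step` and `run` are hypothetical names, the state is a running total, and the virtual input is modeled as a zero-argument callable. In Pipelogic the runtime manages the state; the `run` loop here only stands in for it.

```python
from typing import Callable, Iterable, List, Tuple

State = int  # assumption: the state is a single running total


def step(virtual_input: Callable[[], int], state: State,
         a: List[int]) -> Tuple[State, List[int]]:
    """One execution: takes the old state, returns the new state and outputs."""
    extra = virtual_input()  # side effects are confined to the virtual input
    new_state = state + sum(a) + extra
    return new_state, [new_state]


def run(batches: Iterable[List[int]], virtual_input: Callable[[], int],
        initial_state: State = 0) -> Tuple[State, List[int]]:
    """Stand-in for the runtime: threads the state through successive calls."""
    state = initial_state
    outputs: List[int] = []
    for batch in batches:
        state, out = step(virtual_input, state, batch)
        outputs.extend(out)
    return state, outputs
```

Because `step` never stores anything itself, it stays a function of its arguments: the same virtual input, state, and stream values always give the same result.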
Components are designed as functions because of the benefits this brings. As with map-reduce, solutions can scale and parallelize automatically without user input. Pipelogic's philosophy is not to create a new language that solves everything but to facilitate the integration of existing and new code. Users can write components in any language, and Pipelogic integrates them with minimal latency.
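To illustrate why a functional design parallelizes easily, the sketch below maps a pure function over a batch with Python's standard `concurrent.futures` thread pool. It is not how Pipelogic schedules components; `double` and `parallel_map` are hypothetical names used only for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def double(x: int) -> int:
    """A pure function: the result depends only on the argument."""
    return 2 * x


def parallel_map(values: List[int], workers: int = 4) -> List[int]:
    """Apply `double` to every value using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so the outcome
        # matches a sequential loop no matter how the work is split.
        return list(pool.map(double, values))
```

Since `double` shares no state between calls, the parallel result is identical to the sequential one, which is the property that lets a runtime distribute the work without involving the author.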
The way a component is created varies with the language used. C++ makes it possible to create more efficient and optimized components, but it requires more knowledge and more detailed configuration. Alternatively, you can use the Python API, which handles the complexities of the component automatically.
Component Structure
A typical component structure will look like the following:
.
├── component.yml
├── .pipecomponent
├── src
│   ├── requirements.txt
│   └── main.py
└── tests
    ├── test_0
    │   ├── input_0.txt
    │   ├── input_1.txt
    │   └── output_0.txt
    └── test_1
        ├── input_0.txt
        ├── input_1.txt
        └── output_0.txt
- src/ -> contains source code with all the input/output sources (main.py and requirements.txt for Python, main.cpp for C++)
- tests/ -> folder with a list of individual tests specifying the list of inputs and the list of expected outputs
- .pipecomponent -> contains unique component_id for Pipelogic Database
- component.yml -> Component configuration file
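The tests/ layout can be walked with a few lines of standard-library Python. This is a sketch, not Pipelogic tooling: `load_test_case` and `discover_tests` are hypothetical helpers, and the files are read as raw text because the format of their contents is not specified here.

```python
from pathlib import Path
from typing import Dict, List


def load_test_case(test_dir: Path) -> Dict[str, List[str]]:
    """Read every input_*.txt and output_*.txt file in one test folder."""
    inputs = sorted(test_dir.glob("input_*.txt"))
    outputs = sorted(test_dir.glob("output_*.txt"))
    return {
        "inputs": [p.read_text() for p in inputs],
        "outputs": [p.read_text() for p in outputs],
    }


def discover_tests(component_dir: Path) -> Dict[str, Dict[str, List[str]]]:
    """Map each folder under tests/ (test_0, test_1, ...) to its files."""
    tests_root = component_dir / "tests"
    return {
        d.name: load_test_case(d)
        for d in sorted(tests_root.iterdir())
        if d.is_dir()
    }
```

Files are ordered by name, so with ten or more inputs per test a numeric sort key would be needed to keep input_10.txt after input_9.txt.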
Configuration File
The configuration file specifies the component metadata, such as the name, the programming language in which it is written, and the platform on which to compile, i.e. linux/arm64, linux/amd64, or linux to specify both. It's also necessary to specify the input and output types. They can be specified by input_type and output_type respectively for a one-to-one component, or as a list given by input_types and output_types for a many-to-many component.
It is important to specify the correct input and output types: components can connect and communicate with each other only if their types match.
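The matching rule can be sketched as a comparison of declared type names. The sketch assumes the type fields have been loaded into a plain dict holding either input_type/output_type or input_types/output_types, and that two types match only when their names are equal; the actual Pipelogic type system may be richer. The helper names, and the `String` type used in the usage example, are hypothetical.

```python
from typing import Dict, List


def input_types(worker: Dict) -> List[str]:
    """Normalize one-to-one and many-to-many declarations to a list."""
    if "input_types" in worker:
        return list(worker["input_types"])
    return [worker["input_type"]]


def output_types(worker: Dict) -> List[str]:
    """Same normalization for the output side."""
    if "output_types" in worker:
        return list(worker["output_types"])
    return [worker["output_type"]]


def can_connect(upstream: Dict, downstream: Dict,
                out_index: int = 0, in_index: int = 0) -> bool:
    """True if the chosen upstream output has the same type name as the
    chosen downstream input (assumption: exact name equality)."""
    return output_types(upstream)[out_index] == input_types(downstream)[in_index]
```

For example, with `doubler = {"input_type": "Int64", "output_type": "Int64"}` and `merger = {"input_types": ["Int64", "String"], "output_types": ["String"]}`, the doubler's output can feed the merger's first input but not its second.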
Example Configuration File
name: "Double Value"
platform: linux
language: py
build_system: 1.1.3
tags: ["default", "one-to-one"]
worker:
  input_type: Int64
  output_type: Int64
The example configuration file describes the following:
- The component takes a data stream of type Int64 as input, multiplies each value by two, and outputs a data stream of type Int64
- It is compiled for all linux architectures, both linux/arm64 and linux/amd64
- It is written in Python
- It has the default and one-to-one tags
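The core logic of such a "Double Value" component could look like the sketch below. It deliberately leaves out the Pipelogic Python API, which is not shown in this section, so `double_value` is a plain hypothetical function; raising an error when the result leaves the signed 64-bit range is also an assumption about how an Int64 output should behave.

```python
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def double_value(x: int) -> int:
    """Multiply one Int64 input by two and return an Int64 output."""
    result = 2 * x
    # Python integers are unbounded, so the 64-bit range is checked
    # explicitly (assumed behavior: reject values that do not fit).
    if not INT64_MIN <= result <= INT64_MAX:
        raise OverflowError(f"{result} does not fit in Int64")
    return result
```

Keeping the logic in a small pure function like this also makes it straightforward to check against the input and output files of a test folder.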
Next Steps: