MediaPipe Concepts

The basics

Packet

The basic data flow unit. A packet consists of a numeric timestamp and a shared pointer to an immutable payload. The payload can be of any C++ type, and the payload’s type is also referred to as the type of the packet. Packets are value classes and can be copied cheaply: each copy shares ownership of the payload, with reference-counting semantics, but has its own timestamp.
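The ownership model described above can be illustrated with a minimal standalone sketch. ToyPacket and its methods are hypothetical names chosen for this illustration, not MediaPipe's actual Packet class; the sketch only assumes that the payload is held through a reference-counted pointer to a const object and that the timestamp is stored per copy.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical toy model of a packet; not MediaPipe's actual Packet class.
template <typename T>
class ToyPacket {
 public:
  ToyPacket(T value, int64_t timestamp)
      : payload_(std::make_shared<const T>(std::move(value))),
        timestamp_(timestamp) {}

  // Returns a copy that shares the same payload but carries a new timestamp.
  ToyPacket At(int64_t timestamp) const {
    ToyPacket copy = *this;
    copy.timestamp_ = timestamp;
    return copy;
  }

  // Read-only access: the payload is immutable once the packet is created.
  const T& Get() const { return *payload_; }
  int64_t timestamp() const { return timestamp_; }

  // Number of packets currently sharing this payload.
  long use_count() const { return payload_.use_count(); }

 private:
  std::shared_ptr<const T> payload_;
  int64_t timestamp_;
};
```

Copying a ToyPacket copies only the pointer and the timestamp, so the cost does not depend on the payload size. The payload is freed when the last copy goes away.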

Graph

MediaPipe processing takes place inside a graph, which defines packet flow paths between nodes. A graph can have any number of inputs and outputs, and data flow can branch and merge. Generally data flows forward, but backward loops are possible.

Nodes

Nodes produce and/or consume packets, and they are where the bulk of the graph’s work takes place. They are also known as “calculators”, for historical reasons. Each node’s interface defines a number of input and output ports, identified by a tag and/or an index.

Streams

A stream is a connection between two nodes that carries a sequence of packets whose timestamps must be strictly monotonically increasing.
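As a rough illustration of the timestamp rule, the sketch below models a stream that refuses packets arriving out of order. ToyStream is a hypothetical name, and the sketch assumes that "monotonically increasing" means each timestamp must be greater than the previous one, so that a repeated timestamp is rejected too.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy stream that rejects out-of-order timestamps.
// Assumption: a repeated timestamp counts as out of order.
class ToyStream {
 public:
  // Appends a timestamp. Returns false, and adds nothing, if the timestamp
  // is not greater than the last accepted one.
  bool Add(int64_t timestamp) {
    if (!timestamps_.empty() && timestamp <= timestamps_.back()) return false;
    timestamps_.push_back(timestamp);
    return true;
  }

  std::size_t size() const { return timestamps_.size(); }

 private:
  std::vector<int64_t> timestamps_;
};
```

Gaps between timestamps are allowed in this model; only the ordering is enforced.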

Side packets

A side packet connection between nodes carries a single packet (with an unspecified timestamp). It can be used to provide data that will remain constant, whereas a stream represents a flow of data that changes over time.

Packet Ports

A port has an associated type; packets transiting through the port must be of that type. An output stream port can be connected to any number of input stream ports of the same type; each consumer receives a separate copy of the output packets, and has its own queue, so it can consume them at its own pace. Similarly, a side packet output port can be connected to as many side packet input ports as desired.
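The fan-out behavior can be sketched as one queue per connected consumer. ToyOutputPort and its methods are hypothetical and fixed to int payloads for brevity; the sketch assumes that "a separate copy" means a cheap packet copy that shares the payload, consistent with the packet description earlier in this document.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <vector>

// Hypothetical toy output port carrying int payloads; not a MediaPipe type.
class ToyOutputPort {
 public:
  // Connects a new consumer and returns its index.
  std::size_t Connect() {
    queues_.emplace_back();
    return queues_.size() - 1;
  }

  // Puts a copy of the packet (sharing one payload) into every consumer queue.
  void Send(int value) {
    std::shared_ptr<const int> packet = std::make_shared<const int>(value);
    for (auto& queue : queues_) queue.push_back(packet);
  }

  // Pops the next packet for one consumer; returns nullptr if none is queued.
  std::shared_ptr<const int> Receive(std::size_t consumer) {
    auto& queue = queues_[consumer];
    if (queue.empty()) return nullptr;
    std::shared_ptr<const int> packet = queue.front();
    queue.pop_front();
    return packet;
  }

  // Number of packets a consumer has not yet received.
  std::size_t Pending(std::size_t consumer) const {
    return queues_[consumer].size();
  }

 private:
  std::vector<std::deque<std::shared_ptr<const int>>> queues_;
};
```

Because each consumer drains its own queue, a fast consumer can run ahead while a slow one still has packets pending.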

A port can be required, meaning that a connection must be made for the graph to be valid, or optional, meaning it may remain unconnected.

Note: even if a stream connection is required, the stream need not carry a packet at every timestamp.

Input and output

Data flow can originate from source nodes, which have no input streams and produce packets spontaneously (e.g. by reading from a file); or from graph input streams, which let an application feed packets into a graph.

Similarly, there are sink nodes that receive data and write it to various destinations (e.g. a file, a memory buffer, etc.), and an application can also receive output from the graph using callbacks.

Runtime behavior

Graph lifetime

Once a graph has been initialized, it can be started to begin processing data, and can process a stream of packets until each stream is closed or the graph is canceled. Then the graph can be destroyed or started again.

Node lifetime

There are three main lifetime methods the framework will call on a node:

  • Open: called once, before the other methods. When it is called, all input side packets required by the node will be available.
  • Process: called multiple times, when a new set of inputs is available, according to the node’s input policy.
  • Close: called once, at the end.

In addition, each calculator can define a constructor and a destructor, which are useful for creating and deallocating resources that are independent of the processed data.
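The call order of the lifetime methods can be sketched with a simplified interface. ToyNode, SummingNode, and RunNode are hypothetical names for this illustration. A real calculator interface is richer (for example, it receives a context object and reports errors), which this toy omits.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical node interface mirroring the three lifetime methods.
class ToyNode {
 public:
  virtual ~ToyNode() = default;
  virtual void Open() {}                // called once, before Process
  virtual void Process(int input) = 0;  // called once per input
  virtual void Close() {}               // called once, at the end
};

// Example node that sums its inputs and records which methods ran.
class SummingNode : public ToyNode {
 public:
  void Open() override { log.push_back("Open"); }
  void Process(int input) override {
    sum += input;
    log.push_back("Process");
  }
  void Close() override { log.push_back("Close"); }

  int sum = 0;
  std::vector<std::string> log;
};

// Drives a node in the order described above:
// Open once, Process for each input, Close once.
void RunNode(ToyNode& node, const std::vector<int>& inputs) {
  node.Open();
  for (int input : inputs) node.Process(input);
  node.Close();
}
```

In this sketch the driver is a plain loop. In a graph, the framework decides when Process runs, based on the node's input policy.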

Input policies

The default input policy is deterministic collation of packets by timestamp. A node receives all inputs for the same timestamp at the same time, in an invocation of its Process method; and successive input sets are received in their timestamp order. This can require delaying the processing of some packets until a packet with the same timestamp is received on all input streams, or until it can be guaranteed that a packet with that timestamp will not be arriving on the streams that have not received it.
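The collation rule can be sketched as follows. This is a simplified model under stated assumptions, not the framework's implementation: ToyInput, InputSet, and NextInputSet are hypothetical names, payloads are plain ints, and each stream tracks a "bound" (the smallest timestamp it could still deliver), which here advances only when a packet arrives.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical toy input stream: queued packets plus a lower bound on the
// timestamp of any packet that may still arrive.
struct ToyInput {
  std::deque<std::pair<int64_t, int>> queue;  // (timestamp, value)
  int64_t bound = std::numeric_limits<int64_t>::min();

  void Add(int64_t timestamp, int value) {
    queue.emplace_back(timestamp, value);
    bound = timestamp + 1;  // later packets must have larger timestamps
  }
};

// One collated input set: values[i] is meaningful only if present[i] is true.
struct InputSet {
  int64_t timestamp = 0;
  std::vector<bool> present;
  std::vector<int> values;
};

// Pops the next input set if its timestamp is settled on every stream.
// Returns false if the node has to wait for more input.
bool NextInputSet(std::vector<ToyInput>& inputs, InputSet* out) {
  const int64_t kNone = std::numeric_limits<int64_t>::max();
  // Candidate timestamp: the earliest queued packet on any stream.
  int64_t t = kNone;
  for (const ToyInput& in : inputs) {
    if (!in.queue.empty()) t = std::min(t, in.queue.front().first);
  }
  if (t == kNone) return false;  // nothing queued at all
  // An empty stream blocks t unless it can no longer deliver a packet at t.
  for (const ToyInput& in : inputs) {
    if (in.queue.empty() && in.bound <= t) return false;
  }
  out->timestamp = t;
  out->present.assign(inputs.size(), false);
  out->values.assign(inputs.size(), 0);
  for (std::size_t i = 0; i < inputs.size(); ++i) {
    ToyInput& in = inputs[i];
    if (!in.queue.empty() && in.queue.front().first == t) {
      out->present[i] = true;
      out->values[i] = in.queue.front().second;
      in.queue.pop_front();
    }
  }
  return true;
}
```

With two inputs, a packet at timestamp 10 on the first stream is held back until the second stream either delivers its own packet at 10 or moves past it. Once settled, the node receives an input set in which the second input is simply absent for that timestamp.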

Other policies are also available, implemented using a separate kind of component known as an InputStreamHandler.

See scheduling for more details.