Kenning API

Deployment API overview


Fig. 5 Kenning core classes and interactions between them. The green blocks represent the flow of the input data that is passed to the model for inference. The orange blocks represent the model deployment flow, from training to inference on the target device. The grey blocks represent the flow of inference results and metrics.

Kenning provides:

  • Dataset class - performs dataset downloading, preparation, input preprocessing, output postprocessing and model evaluation,

  • ModelWrapper class - trains the model, prepares the model, performs model-specific input preprocessing and output postprocessing, and runs inference on the host using the native framework,

  • Optimizer class - optimizes and compiles the model,

  • Runtime class - loads the model, performs inference on compiled model, runs target-specific processing of inputs and outputs, and runs performance benchmarks,

  • RuntimeProtocol class - implements the communication protocol between the host and the target,

  • DataProvider class - provides inference data from external sources, such as a camera or a TCP connection,

  • OutputCollector class - parses and utilizes inference results (e.g. displays visualizations or sends results via TCP).

Model processing

The orange blocks and arrows in the Fig. 5 represent the model life cycle:

  • the model is designed, trained, evaluated and improved - the training is implemented in the ModelWrapper. Note: this is an optional step - an already trained model can also be wrapped and used,

  • the model is passed to the Optimizer, where it is optimized for a given hardware and later compiled,

  • during inference testing, the model is sent to the target using RuntimeProtocol,

  • the model is loaded on target side and used for inference using Runtime.

Once the development of the model is complete, the optimized and compiled model can be used directly on target device using Runtime.

I/O data flow

The data flow is represented in the Fig. 5 with the green blocks. The input data flow is depicted using green arrows, and the output data flow is depicted using grey arrows.

Firstly, the input and output data is loaded from dataset files and processed. Later, since every model has its specific input preprocessing and output postprocessing routines, the data is passed to the ModelWrapper methods to apply those modifications. During inference testing, the data is sent to and from the target using RuntimeProtocol.

Finally, since Runtimes also have their own specific data representations, the appropriate I/O processing is applied.

Reporting data flow

The report rendering requires performance metrics and quality metrics. The flow for this is presented with grey lines and blocks in Fig. 5.

On the target side, the performance metrics are computed and sent back to the host using RuntimeProtocol, and later passed to the report rendering. After the output data goes through processing in the Runtime and ModelWrapper, it is compared to the ground truth in the Dataset during model evaluation. In the end, the results of model evaluation are passed to the report rendering.

The final report is generated as an RST file with figures, as can be observed in the Sample autogenerated report.

Dataset

kenning.core.dataset.Dataset-based classes are responsible for:

  • preparing the dataset, including the download routines (use the --download-dataset flag to download the dataset data),

  • preprocessing the inputs into the format expected by most of the models for a given task,

  • postprocessing the outputs for the evaluation process,

  • evaluating a given model based on its predictions,

  • subdividing the samples into training and validation datasets.
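The responsibilities above can be sketched with a minimal, self-contained stand-in. This is not the real kenning.core.dataset.Dataset class - the ToyDataset name, the fabricated samples, and the plain-dictionary metric in place of a Measurements object are all illustrative assumptions:

```python
class ToyDataset:
    """Illustrative stand-in mirroring the documented Dataset API."""

    def __init__(self, root=".", batch_size=1):
        self.batch_size = batch_size
        self.dataX, self.dataY = [], []
        self._dataindex = 0  # ID of the next data to be delivered for inference
        self.prepare()

    def prepare(self):
        # Normally scans dataset files; here we fabricate 4 sample representations
        self.dataX = ["sample0.png", "sample1.png", "sample2.png", "sample3.png"]
        self.dataY = [0, 1, 0, 1]

    def prepare_input_samples(self, samples):
        # Normally loads images from paths; identity by default
        return samples

    def prepare_output_samples(self, samples):
        return samples

    def __iter__(self):
        self._dataindex = 0
        return self

    def __next__(self):
        # Iterates over the dataset with the configured batch size
        if self._dataindex >= len(self.dataX):
            raise StopIteration
        end = self._dataindex + self.batch_size
        X = self.prepare_input_samples(self.dataX[self._dataindex:end])
        y = self.prepare_output_samples(self.dataY[self._dataindex:end])
        self._dataindex = end
        return X, y

    def evaluate(self, predictions, truth):
        # Toy accuracy metric standing in for a Measurements object
        correct = sum(p == t for p, t in zip(predictions, truth))
        return {"accuracy": correct / len(truth)}

ds = ToyDataset(batch_size=2)
batches = list(ds)
```

A real implementation would load actual files in prepare_input_samples and return a Measurements object from evaluate.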

The Dataset objects are used by:

  • ModelWrapper - for training purposes and model evaluation,

  • Optimizer - e.g. for extracting a calibration dataset for quantization purposes,

  • Runtime - for evaluating the model on target hardware.

The available implementations of datasets are included in the kenning.datasets submodule.

class kenning.core.dataset.Dataset(root: Path, batch_size: int = 1, download_dataset: bool = False, external_calibration_dataset: Optional[Path] = None)

Wraps the datasets for training, evaluation and optimization.

This class provides an API for datasets used by models, compilers (e.g. for calibration) and benchmarking scripts.

Each Dataset object should implement methods for:

  • processing inputs and outputs from dataset files,

  • downloading the dataset,

  • evaluating the model based on dataset’s inputs and outputs.

The Dataset object provides routines for iterating over dataset samples with configured batch size, splitting the dataset into subsets and extracting loaded data from dataset files for training purposes.

dataX

List of input data (or data representing input data, e.g. file paths)

Type

List[Any]

dataY

List of output data (or data representing output data)

Type

List[Any]

batch_size

The batch size for the dataset

Type

int

_dataindex

ID of the next data to be delivered for inference

Type

int

calibration_dataset_generator(percentage: float = 0.25, seed: int = 12345) Generator[List[Any], None, None]

Creates generator for the calibration data.

Parameters
  • percentage (float) – The fraction of data to use for calibration

  • seed (int) – The seed for random state
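A free-standing sketch of how such a generator could be implemented - a reproducible fraction of samples drawn with a seeded RNG and yielded in batches. The function below is a hypothetical simplification, not the actual Kenning implementation:

```python
import random

def calibration_dataset_generator(dataX, percentage=0.25, seed=12345, batch_size=1):
    """Yield a reproducible fraction of dataX in batches, for calibration."""
    rng = random.Random(seed)  # seeded so the same subset is drawn on every call
    chosen = rng.sample(dataX, int(len(dataX) * percentage))
    for i in range(0, len(chosen), batch_size):
        yield chosen[i:i + batch_size]

# 10% of 100 samples in batches of 4 -> batches of sizes 4, 4, 2
batches = list(calibration_dataset_generator(list(range(100)),
                                             percentage=0.1, batch_size=4))
```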

download_dataset_fun()

Downloads the dataset to the root directory defined in the constructor.

evaluate(predictions: list, truth: list) Measurements

Evaluates the model based on the predictions.

The method should compute various quality metrics fitting for the problem the model solves - e.g. for classification it may be accuracy, precision or G-mean; for detection it may be IoU and mAP.

The evaluation results should be returned in a form of Measurements object.

Parameters
  • predictions (List) – The list of predictions from the model

  • truth (List) – The ground truth for given batch

Returns

The dictionary containing the evaluation results

Return type

Measurements

classmethod form_argparse()

Creates argparse parser for the Dataset object.

This method is used to create a list of arguments for the object so it is possible to configure the object from the level of command line.

Returns

tuple with the argument parser object that can act as parent for program’s argument parser, and the corresponding arguments’ group pointer

Return type

(ArgumentParser, ArgumentGroup)

classmethod form_parameterschema()

Creates schema for the Dataset class.

Returns

schema for the class

Return type

Dict

classmethod from_argparse(args)

Constructor wrapper that takes the parameters from argparse args.

This method takes the arguments created in form_argparse and uses them to create the object.

Parameters

args (Dict) – arguments from ArgumentParser object

Returns

object of class Dataset

Return type

Dataset

classmethod from_json(json_dict: Dict)

Constructor wrapper that takes the parameters from json dict.

This function checks if the given dictionary is valid according to the arguments_structure defined. If it is then it invokes the constructor.

Parameters

json_dict (Dict) – Arguments for the constructor

Returns

object of class Dataset

Return type

Dataset

get_class_names() List[str]

Returns list of class names in order of their IDs.

Returns

List of class names

Return type

List[str]

get_data() Tuple[List, List]

Returns the tuple of all inputs and outputs for the dataset.

Warning

It loads all entries with prepare_input_samples and prepare_output_samples to the memory - for large datasets it may result in filling the whole memory.

Returns

the list of data samples

Return type

Tuple[List, List]

get_data_unloaded() Tuple[List, List]

Returns the input and output representations before loading.

The representations can be opened using prepare_input_samples and prepare_output_samples.

Returns

the list of data sample representations

Return type

Tuple[List, List]

get_input_mean_std() Tuple[Any, Any]

Returns mean and std values for input tensors.

The mean and std values returned here should be computed using compute_input_mean_std method.

Returns

the standardization values for a given train dataset. Tuple of two variables describing mean and std values

Return type

Tuple[Any, Any]

prepare()

Prepares dataX and dataY attributes based on the dataset contents.

This can e.g. store file paths in dataX and classes in dataY that will later be loaded using prepare_input_samples and prepare_output_samples.

prepare_external_calibration_dataset(percentage: float = 0.25, seed: int = 12345) List[Path]

Prepares the data for external calibration dataset.

This method is supposed to scan the external_calibration_dataset directory and prepare the list of entries that are suitable for the prepare_input_samples method.

This method is called by the calibration_dataset_generator method to get the data for calibration when external_calibration_dataset is provided.

By default, this method scans for all files in the directory and returns the list of those files.

Returns

List of objects that are usable by the prepare_input_samples method

Return type

List[Any]

prepare_input_samples(samples: List) List

Preprocesses input samples, e.g. loads images from files and converts them.

By default the method returns data as is - without any conversions. Since the input samples can be large, it does not make sense to load all data to the memory - this method handles loading data for a given data batch.

Parameters

samples (List) – List of input samples to be processed

Returns

preprocessed input samples

Return type

List

prepare_output_samples(samples: List) List

Preprocesses output samples.

By default the method returns data as is. It can be used e.g. to create the one-hot output vector with class association based on a given sample.

Parameters

samples (List) – List of output samples to be processed

Returns

preprocessed output samples

Return type

List

set_batch_size(batch_size)

Sets the batch size used when iterating over the dataset.

Parameters

batch_size (int) – Number of input samples per batch

train_test_split_representations(test_fraction: float = 0.25, seed: int = 12345)

Splits the data representations into train dataset and test dataset.

Parameters
  • test_fraction (float) – The fraction of data to leave for model validation

  • seed (int) – The seed for random state
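A reproducible split of the sample representations (e.g. file paths, so no sample data needs to be loaded) could look like the following sketch - an assumed implementation, not the actual Kenning code:

```python
import random

def train_test_split_representations(dataX, dataY, test_fraction=0.25, seed=12345):
    """Shuffle indices with a fixed seed, then split into train/test parts."""
    indices = list(range(len(dataX)))
    random.Random(seed).shuffle(indices)  # seeded -> deterministic split
    split = int(len(indices) * (1 - test_fraction))
    train, test = indices[:split], indices[split:]
    return ([dataX[i] for i in train], [dataX[i] for i in test],
            [dataY[i] for i in train], [dataY[i] for i in test])

trainX, testX, trainY, testY = train_test_split_representations(
    list("abcdefgh"), list(range(8)))
```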

ModelWrapper

kenning.core.model.ModelWrapper base class requires implementing methods for:

  • model preparation,

  • model saving and loading,

  • model saving to the ONNX format,

  • model-specific preprocessing of inputs and postprocessing of outputs, if necessary,

  • model inference,

  • providing metadata (framework name and version),

  • model training,

  • input format specification,

  • conversion of model inputs and outputs to bytes for the kenning.core.runtimeprotocol.RuntimeProtocol objects.

The ModelWrapper provides methods for running the inference in a loop for data from dataset and measuring both the quality and inference performance of the model.
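The per-model processing and byte conversion can be sketched with a toy stand-in. ToyModelWrapper, its thresholding "model", and the float32 packing scheme are all hypothetical - the real ModelWrapper subclasses choose their own formats:

```python
import struct

class ToyModelWrapper:
    """Illustrative stand-in for the documented ModelWrapper API."""

    def preprocess_input(self, X):
        # Model-specific preprocessing, e.g. scale raw pixel values to [0, 1]
        return [x / 255.0 for x in X]

    def run_inference(self, X):
        # Stand-in "model": binary classification by thresholding at 0.5
        return [1.0 if x > 0.5 else 0.0 for x in X]

    def postprocess_outputs(self, y):
        # Convert raw outputs to the format the Dataset expects (class IDs)
        return [int(v) for v in y]

    def convert_input_to_bytes(self, inputdata):
        # Pack preprocessed floats as little-endian float32 for RuntimeProtocol
        return struct.pack(f"<{len(inputdata)}f", *inputdata)

    def convert_output_from_bytes(self, outputdata):
        n = len(outputdata) // 4
        return list(struct.unpack(f"<{n}f", outputdata))

wrapper = ToyModelWrapper()
raw = wrapper.convert_input_to_bytes(wrapper.preprocess_input([0, 128, 255]))
decoded = wrapper.convert_output_from_bytes(raw)
```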

The kenning.modelwrappers.frameworks submodule contains framework-specific implementations of the ModelWrapper class - they implement all methods that are common to a given framework regardless of the model used.

For the Pet Dataset wrapper object, there is an example classifier implemented in TensorFlow 2.x, called TensorFlowPetDatasetMobileNetV2.

Examples of model wrappers:

class kenning.core.model.ModelWrapper(modelpath: Path, dataset: Dataset, from_file: bool = True)

Wraps the given model.

convert_input_to_bytes(inputdata: Any) bytes

Converts the input returned by the preprocess_input method to bytes.

Parameters

inputdata (Any) – The preprocessed inputs

Returns

Input data as byte stream

Return type

bytes

convert_output_from_bytes(outputdata: bytes) Any

Converts bytes array to the model output format.

The converted bytes are later passed to postprocess_outputs method.

Parameters

outputdata (bytes) – Output data in raw bytes

Returns

Output data to feed to postprocess_outputs

Return type

Any

classmethod form_argparse()

Creates argparse parser for the ModelWrapper object.

Returns

the argument parser object that can act as parent for program’s argument parser

Return type

ArgumentParser

classmethod form_parameterschema()

Creates schema for the ModelWrapper class.

Returns

schema for the class

Return type

Dict

classmethod from_argparse(dataset: Dataset, args, from_file: bool = True)

Constructor wrapper that takes the parameters from argparse args.

Parameters
  • dataset (Dataset) – The dataset object to feed to the model

  • args (Dict) – Arguments from ArgumentParser object

  • from_file (bool) – Determines if the model should be loaded from modelpath

Returns

object of class ModelWrapper

Return type

ModelWrapper

classmethod from_json(dataset: Dataset, json_dict: Dict, from_file: bool = True)

Constructor wrapper that takes the parameters from json dict.

This function checks if the given dictionary is valid according to the arguments_structure defined. If it is then it invokes the constructor.

Parameters
  • dataset (Dataset) – The dataset object to feed to the model

  • json_dict (Dict) – Arguments for the constructor

  • from_file (bool) – Determines if the model should be loaded from modelpath

Returns

object of class ModelWrapper

Return type

ModelWrapper

get_framework_and_version() Tuple[str, str]

Returns name of the framework and its version in a form of a tuple.

get_io_specification() Dict[str, List[Dict]]

Returns a saved dictionary with input and output keys that map to input and output specifications.

A single specification is a list of dictionaries with names, shapes and dtypes for each layer. The order of the dictionaries is assumed to be the one expected by the ModelWrapper.

It is later used in the optimization and compilation steps.

Returns

Dictionary that conveys input and output layers specification

Return type

Dict[str, List[Dict]]

get_io_specification_from_model() Dict[str, List[Dict]]

Returns a new instance of dictionary with input and output keys that map to input and output specifications.

A single specification is a list of dictionaries with names, shapes and dtypes for each layer. The order of the dictionaries is assumed to be the one expected by the ModelWrapper.

It is later used in the optimization and compilation steps.

It is used by the get_io_specification method to get the specification and save it for later use.

Returns

Dictionary that conveys input and output layers specification

Return type

Dict[str, List[Dict]]

get_output_formats() List[str]

Returns list of names of possible output formats.

get_path() Path

Returns path to the model in a form of a Path object.

Returns

modelpath – The path to the model

Return type

Path

load_model(modelpath: Path)

Loads the model from file.

Parameters

modelpath (Path) – Path to the model file

postprocess_outputs(y: Any) List

Processes the outputs for a given model.

By default no action is taken, and the outputs are passed unmodified.

Parameters

y (Any) – The output from the model

Returns

The postprocessed outputs from the model that need to be in format requested by the Dataset object.

Return type

List

prepare_model()

Downloads the model (if required) and loads it to the device.

preprocess_input(X: List) Any

Preprocesses the inputs for a given model before inference.

By default no action is taken, and the inputs are passed unmodified.

Parameters

X (List) – The input data from the Dataset object

Returns

The preprocessed inputs that are ready to be fed to the model

Return type

Any

run_inference(X: List) Any

Runs inference for a given preprocessed input.

Parameters

X (List) – The preprocessed inputs for the model

Returns

The results of the inference.

Return type

Any

save_io_specification(modelpath: Path)

Saves the input/output model specification to a file named modelpath + .json. It uses the get_io_specification() method to get the properties.

It is later used in optimization and compilation steps.

Parameters

modelpath (Path) – Path that is used to store the model input/output specification

save_model(modelpath: Path)

Saves the model to file.

Parameters

modelpath (Path) – Path to the model file

save_to_onnx(modelpath: Path)

Saves the model in the ONNX format.

Parameters

modelpath (Path) – Path to the ONNX file

test_inference() Measurements

Runs the inference with a given dataset.

Returns

The inference results

Return type

Measurements

train_model(batch_size: int, learning_rate: float, epochs: int, logdir: Path)

Trains the model with a given dataset.

This method should implement training routine for a given dataset and save a working model to a given path in a form of a single file.

The training should be performed with given batch size, learning rate, and number of epochs.

The model needs to be saved explicitly.

Parameters
  • batch_size (int) – The batch size for the training

  • learning_rate (float) – The learning rate for the training

  • epochs (int) – The number of epochs for training

  • logdir (Path) – Path to the logging directory

Optimizer

kenning.core.optimizer.Optimizer objects wrap the deep learning compilation process. They can also perform model optimizations, such as operation fusion or quantization.

All Optimizer objects should provide methods for compiling models in ONNX format, but they can also provide support for other formats (like Keras .h5 files, or PyTorch .th files).
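A compile step of this shape can be sketched as follows. ToyOptimizer is a hypothetical stand-in that merely copies the model bytes instead of compiling, but the constructor signature, the compile(inputmodelpath, io_spec) call and the modelpath + .json specification convention follow the API documented below:

```python
import json
import tempfile
from pathlib import Path

class ToyOptimizer:
    """Illustrative stand-in for the documented Optimizer API."""
    inputtypes = ["onnx"]
    outputtypes = ["toy-binary"]

    def __init__(self, dataset, compiled_model_path):
        self.dataset = dataset  # optionally used, e.g. for calibration
        self.compiled_model_path = Path(compiled_model_path)

    def get_input_formats(self):
        return self.inputtypes

    def get_output_formats(self):
        return self.outputtypes

    def get_spec_path(self, modelpath):
        # I/O specification lives next to the model, at modelpath + .json
        return Path(str(modelpath) + ".json")

    def compile(self, inputmodelpath, io_spec=None):
        model = Path(inputmodelpath).read_bytes()
        # A real compiler would optimize here; we just copy the bytes
        self.compiled_model_path.write_bytes(model)
        if io_spec is not None:
            self.get_spec_path(self.compiled_model_path).write_text(
                json.dumps(io_spec))

tmp = Path(tempfile.mkdtemp())
(tmp / "model.onnx").write_bytes(b"fake-onnx-model")
opt = ToyOptimizer(dataset=None, compiled_model_path=tmp / "model.bin")
opt.compile(tmp / "model.onnx",
            io_spec={"input": [{"name": "x", "shape": [1, 3],
                                "dtype": "float32"}],
                     "output": []})
```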

Example model compilers:

class kenning.core.optimizer.Optimizer(dataset: Dataset, compiled_model_path: Path)

Compiles the given model to a different format or runtime.

compile(inputmodelpath: Path, io_spec: Optional[Dict[str, List[Dict]]] = None)

Compiles the given model to a target format.

The function compiles the model and saves it to the output file.

The model can be compiled to a binary, a different framework or a different programming language.

If io_spec is passed, then the function uses it during the compilation, otherwise load_io_specification is used to fetch the specification saved in inputmodelpath + .json.

The compiled model is saved to compiled_model_path and the specification is saved to compiled_model_path + .json

Parameters
  • inputmodelpath (Path) – Path to the input model

  • io_spec (Optional[Dict[str, List[Dict]]]) – Dictionary that has input and output keys that contain list of dictionaries mapping (property name) -> (property value) for the layers

consult_model_type(previous_block: Union[ModelWrapper, Optimizer], force_onnx=False) str

Finds an output format of the previous block in the chain that matches an input format of the current block.

Parameters

previous_block (Union[ModelWrapper, Optimizer]) – Previous block in the optimization chain.

Raises

ValueError – Raised if there is no matching format.

Returns

Matching format.

Return type

str
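The format negotiation described above can be sketched as a free function over format-name lists. This is an assumed simplification of the logic, working on get_output_formats/get_input_formats results rather than on the actual block objects:

```python
def consult_model_type(current_inputs, previous_outputs, force_onnx=False):
    """Pick a format produced by the previous block that this block accepts."""
    if force_onnx:
        if "onnx" in current_inputs and "onnx" in previous_outputs:
            return "onnx"
        raise ValueError("ONNX format forced but not supported by both blocks")
    # First output format of the previous block that we can consume wins
    for fmt in previous_outputs:
        if fmt in current_inputs:
            return fmt
    raise ValueError("No matching format between the blocks")
```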

classmethod form_argparse()

Creates argparse parser for the Optimizer object.

Returns

tuple with the argument parser object that can act as parent for program’s argument parser, and the corresponding arguments’ group pointer

Return type

(ArgumentParser, ArgumentGroup)

classmethod form_parameterschema()

Creates schema for the Optimizer class.

Returns

schema for the class

Return type

Dict

classmethod from_argparse(dataset: Dataset, args)

Constructor wrapper that takes the parameters from argparse args.

Parameters
  • dataset (Dataset) – The dataset object that is optionally used for optimization

  • args (Dict) – arguments from ArgumentParser object

Returns

object of class Optimizer

Return type

Optimizer

classmethod from_json(dataset: Dataset, json_dict: Dict)

Constructor wrapper that takes the parameters from json dict.

This function checks if the given dictionary is valid according to the arguments_structure defined. If it is then it invokes the constructor.

Parameters
  • dataset (Dataset) – The dataset object that is optionally used for optimization

  • json_dict (Dict) – Arguments for the constructor

Returns

object of class Optimizer

Return type

Optimizer

get_framework_and_version() Tuple[str, str]

Returns name of the framework and its version in a form of a tuple.

get_input_formats() List[str]

Returns list of names of possible input formats.

get_output_formats() List[str]

Returns list of names of possible output formats.

get_spec_path(modelpath: Path) Path

Returns input/output specification path for the model saved in modelpath. It concatenates modelpath and .json.

Parameters

modelpath (Path) – Path where the model is saved

Returns

Path to the input/output specification of a given model.

Return type

Path

load_io_specification(modelpath: Path) Optional[Dict[str, List[Dict]]]

Returns saved input and output specification of a model saved in modelpath if there is one. Otherwise returns None

Parameters

modelpath (Path) – Path to the model which specification the function should read

Returns

Specification of a model saved in modelpath if there is one, None otherwise

Return type

Optional[Dict[str, List[Dict]]]

save_io_specification(inputmodelpath: Path, io_spec: Optional[Dict[str, List[Dict]]] = None)

Internal function that saves the input/output model specification, which is used during both inference and compilation. If io_spec is None, the function uses the specification of the input model stored in inputmodelpath + .json. If there is no specification stored in this path, the function does nothing.

The input/output specification is a list of dictionaries mapping property names to their values. Legal property names are dtype, prequantized_dtype, shape, name, scale and zero_point.

The order of the layers has to be preserved.

Parameters
  • inputmodelpath (Path) – Path to the input model

  • io_spec (Optional[Dict[str, List[Dict]]]) – Specification of the input/output layers

set_compiled_model_path(compiled_model_path: Path)

Sets path for compiled model.

set_input_type(inputtype: str)

Sets input type of the model for the compiler.

Runtime

kenning.core.runtime.Runtime class provides an interface for running compiled models locally or remotely on a target device. Runtimes are usually compiler-specific (frameworks for deep learning compilers provide runtime libraries to run compiled models on given hardware).

The client (host) side of the Runtime class utilizes the methods from Dataset, ModelWrapper and RuntimeProtocol classes to run inference on the target device. The server (target) side of the Runtime class requires implementing methods for:

  • loading model delivered by the client,

  • preparing inputs delivered by the client,

  • running inference,

  • preparing outputs to be delivered to the client,

  • (optionally) sending inference statistics.
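The server-side duties above can be sketched with a toy in-process runtime. ToyRuntime and its one-byte "model" (a multiplier) are purely hypothetical; the method names follow the API documented below:

```python
class ToyRuntime:
    """Illustrative stand-in for the server-side duties of a Runtime."""

    def __init__(self):
        self.model = None
        self._input = None
        self._output = None

    def prepare_model(self, input_data):
        # "Load" the model delivered by the client: here, one byte = multiplier
        self.model = input_data[0]
        return True

    def prepare_input(self, input_data):
        # Prepare inputs delivered by the client for inference
        self._input = list(input_data)
        return True

    def run(self):
        # Run inference on the prepared input
        if self.model is None:
            raise RuntimeError("Model is not loaded")
        self._output = [(self.model * v) % 256 for v in self._input]

    def upload_output(self, input_data=b""):
        # Prepare outputs to be delivered back to the client, in bytes
        return bytes(self._output)

runtime = ToyRuntime()
runtime.prepare_model(b"\x03")
runtime.prepare_input(b"\x01\x02\x05")
runtime.run()
```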

The examples for runtimes are:

class kenning.core.runtime.Runtime(protocol: RuntimeProtocol, collect_performance_data: bool = True)

Runtime object provides an API for testing inference on target devices.

Using a provided RuntimeProtocol, it sets up communication between the client (host) and the server (target), during which the inference metrics are analyzed.

close_server()

Indicates that the server should be closed.

classmethod form_argparse()

Creates argparse parser for the Runtime object.

Returns

the argument parser object that can act as parent for program’s argument parser

Return type

ArgumentParser

classmethod form_parameterschema()

Creates schema for the Runtime class.

Returns

schema for the class

Return type

Dict

classmethod from_argparse(protocol, args)

Constructor wrapper that takes the parameters from argparse args.

Parameters
  • protocol (RuntimeProtocol) – RuntimeProtocol object

  • args (Dict) – arguments from ArgumentParser object

Returns

object of class Runtime

Return type

Runtime

classmethod from_json(protocol: RuntimeProtocol, json_dict: Dict)

Constructor wrapper that takes the parameters from json dict.

This function checks if the given dictionary is valid according to the arguments_structure defined. If it is then it invokes the constructor.

Parameters
  • protocol (RuntimeProtocol) – RuntimeProtocol object

  • json_dict (Dict) – Arguments for the constructor

Returns

object of class Runtime

Return type

Runtime

get_io_spec_path(modelpath: Path) Path

Gets the path to the input/output specification file, which is modelpath with .json appended.

Parameters

modelpath (Path) – Path to the compiled model

Returns

Path to the specification

Return type

Path

inference_session_end()

Calling this function indicates that the inference session has ended.

This method should be called once all the inference data is sent to the server by the client.

This will stop performance tracking.

inference_session_start()

Calling this function indicates that the client is connected.

This method should be called once the client has connected to a server.

This will enable performance tracking.

postprocess_output(results: List[ndarray]) bytes

The method accepts output of the model and postprocesses it.

The output is quantized and converted to a correct dtype if needed.

Some compilers can change the order of the layers. If that’s the case, the method also reorders the output to match the original order of the model before compilation.

Parameters

results (list[np.ndarray]) – List of outputs of the model

Returns

Postprocessed output converted to bytes

Return type

bytes

Raises

AttributeError – Raised if the output specification is not loaded.

prepare_client()

Runs initialization for the client.

prepare_input(input_data: bytes)

Loads the delivered data and converts it for inference on the accelerator.

This method is called when the input is received from the client. It is supposed to prepare input before running inference.

Parameters

input_data (bytes) – Input data in bytes delivered by the client, preprocessed

Returns

True if succeeded

Return type

bool

Raises

ModelNotLoadedError – Raised if the model is not loaded.

prepare_io_specification(input_data: Optional[bytes]) bool

Receives the io_specification from the client in bytes and saves it for later use.

input_data stores the io_specification representation in bytes. If input_data is None, the io_specification is extracted from another source (i.e. from an existing file). If it cannot be found there, the io_specification is not loaded.

The function returns True even if the specification is not loaded, as some Runtimes may not need the io_specification to run inference.

Parameters

input_data (Optional[bytes]) – io_specification or None, if it should be loaded from another source.

Returns

True

Return type

bool

prepare_local() bool

Runs initialization for the local inference.

Returns

True if initialized successfully

Return type

bool

prepare_model(input_data: Optional[bytes]) bool

Receives the model to infer from the client in bytes.

The method should load bytes with the model, optionally save to file and allocate the model on target device for inference.

input_data stores the model representation in bytes. If input_data is None, the model is extracted from another source (i.e. from existing file).

Parameters

input_data (Optional[bytes]) – Model data or None, if the model should be loaded from another source.

Returns

True if succeeded

Return type

bool

prepare_server()

Runs initialization of the server.

preprocess_input(input_data: bytes) List[ndarray]

The method accepts input_data in bytes and preprocesses it so that it can be passed to the model.

It creates np.ndarray for every input layer using the metadata in self.input_spec and quantizes the data if needed.

Some compilers can change the order of the layers. If that’s the case, the method also reorders the layers to match the specification of the model.

Parameters

input_data (bytes) – Input data in bytes delivered by the client.

Returns

List of inputs for each layer, ready to be passed to the model

Return type

List[np.ndarray]

Raises
  • AttributeError – Raised if the output specification is not loaded.

  • ValueError – Raised if the size of the input does not match the input specification.

process_input(input_data)

Processes received input and measures the performance quality.

Parameters

input_data (bytes) – Not used here

read_io_specification(io_spec: Dict)

Saves input/output specification so that it can be used during the inference.

input_spec and output_spec are lists, where every element is a dictionary mapping (property name) -> (property value) for the layers.

The standard property names are: name, dtype and shape.

If the model is quantized it also has scale, zero_point and prequantized_dtype properties.

If the layers of the model are reordered, it also has the order property.

Parameters

io_spec (Dict) – Specification of the input/output layers

run()

Runs inference on prepared input.

The input should already be in the runtime’s model representation, or delivered via a variable assigned in the prepare_input method.

Raises

ModelNotLoadedError – Raised if the model is not loaded.

run_client(dataset: Dataset, modelwrapper: ModelWrapper, compiledmodelpath: Path)

Main runtime client program.

The client procedure is as follows:

  • connect with the server

  • upload the model

  • send dataset data in a loop to the server:

    • upload input

    • request processing of inputs

    • request predictions for inputs

    • evaluate the response

  • collect performance statistics

  • end connection

Parameters
  • dataset (Dataset) – Dataset to verify the inference on

  • modelwrapper (ModelWrapper) – Model that is executed on target hardware

  • compiledmodelpath (Path) – Path to the file with a compiled model

Returns

True if executed successfully

Return type

bool
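The client procedure above can be sketched end to end with a hypothetical in-process protocol stand-in. LoopbackProtocol and its method names (connect, upload_model, upload_input, request_processing, download_output) are assumptions for illustration, not the real RuntimeProtocol API:

```python
class LoopbackProtocol:
    """Hypothetical in-process protocol: the 'server' doubles every byte."""

    def connect(self):
        pass

    def disconnect(self):
        pass

    def upload_model(self, model):
        self.model = model

    def upload_input(self, data):
        self._input = data

    def request_processing(self):
        self._output = bytes((2 * b) % 256 for b in self._input)

    def download_output(self):
        return self._output

def run_client(protocol, batches, compiled_model=b""):
    """Mirrors the documented client procedure step by step."""
    predictions = []
    protocol.connect()                     # connect with the server
    protocol.upload_model(compiled_model)  # upload the model
    for X in batches:                      # send dataset data in a loop
        protocol.upload_input(bytes(X))    # upload input
        protocol.request_processing()      # request processing of inputs
        out = protocol.download_output()   # request predictions for inputs
        predictions.append(list(out))      # evaluate/collect the response
    protocol.disconnect()                  # end connection
    return predictions

results = run_client(LoopbackProtocol(), [[1, 2], [3]])
```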

run_locally(dataset: Dataset, modelwrapper: ModelWrapper, compiledmodelpath: Path)

Runs inference locally using a given runtime.

Parameters
  • dataset (Dataset) – Dataset to verify the inference on

  • modelwrapper (ModelWrapper) – Model that is executed on target hardware

  • compiledmodelpath (Path) – Path to the file with a compiled model

Returns

True if executed successfully

Return type

bool

run_server()

Main runtime server program.

It waits for requests from a single client.

Based on requests, it loads the model, runs inference and provides statistics.

upload_essentials(compiledmodelpath: Path)

Wrapper for uploading data to the server. Uploads model by default.

Parameters

compiledmodelpath (Path) – Path to the file with a compiled model

upload_output(input_data: bytes) bytes

Returns the output to the client, in bytes.

The method converts the direct output from the model to bytes and returns them.

The wrapper later sends the data to the client.

Parameters

input_data (bytes) – Not used here

Returns

data to send to the client

Return type

bytes

Raises

ModelNotLoadedError – Raised if the model is not loaded.

upload_stats(input_data: bytes) bytes

Returns statistics of inference passes to the client.

Default implementation converts collected metrics in MeasurementsCollector to JSON format and returns them for sending.

Parameters

input_data (bytes) – Not used here

Returns

statistics to be sent to the client

Return type

bytes

RuntimeProtocol

kenning.core.runtimeprotocol.RuntimeProtocol class conducts the communication between the client (host) and the server (target).

The RuntimeProtocol class requires implementing methods for:

  • initializing the server and the client (communication-wise),

  • waiting for the incoming data,

  • sending the data,

  • receiving the data,

  • uploading the model inputs to the server,

  • uploading the model to the server,

  • requesting the inference on target,

  • downloading the outputs from the server,

  • (optionally) downloading the statistics from the server (i.e. performance speed, CPU/GPU utilization, power consumption),

  • server-side notification of success or failure,

  • parsing messages.

Based on the above-mentioned methods, the kenning.core.runtime.Runtime connects the host with the target.

The examples of RuntimeProtocol:

  • NetworkProtocol - implements TCP-based communication between the host (client) and the target (server).

Runtime protocol specification

The communication protocol is message-based. There are:

  • OK messages - indicate success, and may come with additional information,

  • ERROR messages - indicate failure,

  • DATA messages - provide input data for inference,

  • MODEL messages - provide model to load for inference,

  • PROCESS messages - request processing of inputs delivered in DATA messages,

  • OUTPUT messages - request results of processing,

  • STATS messages - request statistics from the target device.

The message types and enclosed data are encoded in a format implemented in the kenning.core.runtimeprotocol.RuntimeProtocol-based class.
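A message-based protocol of this kind can be sketched with an enumeration of message types and length-prefixed payloads. The numeric values and the framing below are purely illustrative assumptions, not Kenning's actual wire format (which is defined by the RuntimeProtocol implementation in use):

```python
import struct
from enum import IntEnum

class MessageType(IntEnum):
    # illustrative numbering; the actual values are defined by the protocol implementation
    OK = 0
    ERROR = 1
    DATA = 2
    MODEL = 3
    PROCESS = 4
    OUTPUT = 5
    STATS = 6

def encode_message(mtype: MessageType, payload: bytes = b'') -> bytes:
    # 2-byte little-endian message type, 4-byte payload size, then the payload
    return struct.pack('<HI', mtype, len(payload)) + payload

def parse_message(message: bytes):
    # inverse of encode_message: recover the type and the enclosed data
    mtype, size = struct.unpack('<HI', message[:6])
    return MessageType(mtype), message[6:6 + size]
```

Length-prefixing lets the receiver know how many bytes belong to the current message, which matters on stream transports such as TCP or UART where message boundaries are not preserved.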

The communication during inference benchmark session is as follows:

  • The client (host) connects to the server (target),

  • The client sends the MODEL request along with the compiled model,

  • The server loads the model from the request, prepares everything for running it, and sends the OK response,

  • After receiving the OK response from the server, the client starts reading input samples from the dataset, preprocesses the inputs, and sends DATA request with the preprocessed input,

  • Upon receiving the DATA request, the server stores the input for inference, and sends the OK message,

  • Upon receiving confirmation, the client sends the PROCESS request,

  • Upon receiving the PROCESS request, the server sends an OK message to confirm that inference has started, and, once inference finishes, sends another OK message to confirm completion,

  • After receiving the first OK message, the client starts measuring inference time until the second OK response is received,

  • The client sends the OUTPUT request in order to receive the outputs from the server,

  • The server sends the OK message along with the output data,

  • The client parses the output and evaluates model performance,

  • The client sends STATS request to obtain additional statistics (inference time, CPU/GPU/Memory utilization) from the server,

  • If the server provides any statistics, it sends the OK message with the data,

  • The same process is repeated for the remaining input samples.

The way of determining the message type and sending data between the server and the client depends on the implementation of the kenning.core.runtimeprotocol.RuntimeProtocol class. The implementation of running inference on the given target is implemented in the kenning.core.runtime.Runtime class.
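The client-side flow of the benchmark session above can be sketched as follows. The InMemoryProtocol class is a hypothetical stand-in that fakes the server's OK responses in memory so the sketch stays self-contained; it is not Kenning's NetworkProtocol, which would exchange the MODEL/DATA/PROCESS/OUTPUT messages over a real connection:

```python
class InMemoryProtocol:
    """Hypothetical in-memory stand-in for a RuntimeProtocol implementation."""

    def upload_model(self, model: bytes) -> bool:
        self.model = model          # server loads the model and replies OK
        return True

    def upload_input(self, data: bytes) -> bool:
        self.input = data           # server stores the input and replies OK
        return True

    def request_processing(self) -> bool:
        # the server sends OK on inference start and OK on inference end;
        # identity "inference" keeps this sketch self-contained
        self.output = self.input
        return True

    def download_output(self):
        return True, self.output    # OK message with the output data


def run_session(protocol, model: bytes, samples):
    """Client-side flow of the inference benchmark session described above."""
    if not protocol.upload_model(model):
        raise RuntimeError('model upload failed')
    outputs = []
    for sample in samples:
        if not (protocol.upload_input(sample) and protocol.request_processing()):
            raise RuntimeError('inference failed')
        ok, output = protocol.download_output()
        if not ok:
            raise RuntimeError('output download failed')
        outputs.append(output)
    return outputs
```

Each boolean return corresponds to waiting for the server's OK/ERROR confirmation before the next step of the session.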

RuntimeProtocol API

kenning.core.runtimeprotocol.RuntimeProtocol-based classes implement the Runtime protocol specification over a given transport medium, e.g. a TCP connection or UART. They require implementing methods for:

  • initializing server (target hardware) and client (compiling host),

  • sending and receiving data,

  • connecting and disconnecting,

  • uploading (host) and downloading (target hardware) the model,

  • parsing and creating messages.

class kenning.core.runtimeprotocol.RuntimeProtocol

The interface for the communication protocol with the target devices.

The target device acts as a server in the communication.

The machine that runs the benchmark and collects the results is the client for the target device.

Classes inheriting from this class implement at least the client side of the communication with the target device.

disconnect()

Ends connection with the other side.

download_output() Tuple[bool, Optional[bytes]]

Downloads the outputs from the target device.

Requests and downloads the latest inference output from the target device for quality measurements.

Returns

tuple with download status (True if successful) and downloaded data

Return type

Tuple[bool, Optional[bytes]]

download_statistics() Measurements

Downloads inference statistics from the target device.

By default no statistics are gathered.

Returns

inference statistics on target device

Return type

Measurements

classmethod form_argparse()

Creates argparse parser for the RuntimeProtocol object.

Returns

tuple with the argument parser object that can act as parent for program’s argument parser, and the corresponding arguments’ group pointer

Return type

(ArgumentParser, ArgumentGroup)

classmethod form_parameterschema()

Creates schema for the RuntimeProtocol class.

Returns

schema for the class

Return type

Dict

classmethod from_argparse(args)

Constructor wrapper that takes the parameters from argparse args.

Parameters

args (Dict) – arguments from RuntimeProtocol object

Returns

object of class RuntimeProtocol

Return type

RuntimeProtocol

classmethod from_json(json_dict: Dict)

Constructor wrapper that takes the parameters from json dict.

This function checks if the given dictionary is valid according to the arguments_structure defined. If it is then it invokes the constructor.

Parameters

json_dict (Dict) – Arguments for the constructor

Returns

object of class RuntimeProtocol

Return type

RuntimeProtocol

initialize_client() bool

Initializes client side of the runtime protocol.

The client side is supposed to run on host testing the target hardware.

The parameters for the client should be provided in the constructor.

Returns

True if succeeded

Return type

bool

initialize_server() bool

Initializes server side of the runtime protocol.

The server side is supposed to run on target hardware.

The parameters for the server should be provided in the constructor.

Returns

True if succeeded

Return type

bool

parse_message(message: bytes) Tuple[MessageType, bytes]

Parses message received in the wait_for_activity method.

The message type is determined from its contents and the optional data is returned along with it.

Parameters

message (bytes) – Received message

Returns

message type and accompanying data

Return type

Tuple[MessageType, bytes]

receive_data() Tuple[ServerStatus, Any]

Gathers data from the client.

This method should be called by wait_for_activity method in order to receive data from the client.

Returns

receive status along with received data

Return type

Tuple[ServerStatus, Any]

request_failure() bool

Sends ERROR message back to the client if it failed to handle request.

Returns

True if sent successfully

Return type

bool

request_processing() bool

Requests processing of input data and waits for acknowledgement.

This method triggers inference on target device and waits until the end of inference on target device is reached.

This method measures processing time on the target device from the level of the host.

Target may send its own measurements in the statistics.

Returns

True if inference finished successfully

Return type

bool

request_success(data: bytes = b'') bool

Sends OK message back to the client once the request is finished.

Parameters

data (bytes) – Optional data upon success, if any

Returns

True if sent successfully

Return type

bool

send_data(data: bytes) bool

Sends data to the target device.

The data can be a model to use, an input to process, or additional configuration.

Parameters

data (bytes) – Data to send

Returns

True if successful

Return type

bool

upload_input(data: bytes) bool

Uploads input to the target device and waits for acknowledgement.

This method should wait until the target device confirms the data is delivered and preprocessed for inference.

Parameters

data (bytes) – Input data for inference

Returns

True if ready for inference

Return type

bool

upload_io_specification(path: Path) bool

Uploads input/output specification to the target device.

This method takes the specification in a json format from the given Path and sends it to the target device.

This method should receive the status of uploading the data to the target.

Parameters

path (Path) – Path to the JSON file

Returns

True if data upload finished successfully

Return type

bool

upload_model(path: Path) bool

Uploads the model to the target device.

This method takes the model from given Path and sends it to the target device.

This method should receive the status of uploading the model from the target.

Parameters

path (Path) – Path to the model

Returns

True if model upload finished successfully

Return type

bool

wait_for_activity() List[Tuple[ServerStatus, Any]]

Waits for incoming data from the other side of connection.

This method should wait for the input data to arrive and return the appropriate status code along with received data.

Returns

list of messages along with status codes.

Return type

List[Tuple[‘ServerStatus’, Any]]

Measurements

kenning.core.measurements module contains Measurements and MeasurementsCollector classes for collecting performance and quality metrics. Measurements is a dict-like object that provides various methods for adding the performance metrics, adding values for time series, and updating existing values.

The dictionary held by Measurements needs to contain serializable data, since most of the scripts save the performance results in JSON format for later report generation.
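To illustrate the dict-like behavior, here is a simplified sketch of how add_measurement, accumulate and update_measurements interact with the underlying dictionary. This is an illustrative stand-in, not the actual implementation from kenning.core.measurements:

```python
class MiniMeasurements:
    """Simplified, dict-like sketch of the Measurements behavior."""

    def __init__(self):
        self.data = {}

    def add_measurement(self, measurementtype, value):
        # by default every measurement type holds a list of values
        self.data.setdefault(measurementtype, []).append(value)

    def initialize_measurement(self, measurementtype, value):
        # explicit initialization for non-list containers
        self.data[measurementtype] = value

    def accumulate(self, measurementtype, valuetoadd, initvaluefunc=lambda: 0):
        # first use initializes the entry, later uses apply the += operator
        if measurementtype not in self.data:
            self.data[measurementtype] = initvaluefunc()
        self.data[measurementtype] += valuetoadd

    def update_measurements(self, other):
        # appends lists from another dict-like object to this one's lists
        for key, values in other.items():
            self.data.setdefault(key, []).extend(values)
```

Since everything stored is plain lists and numbers, the resulting dictionary serializes to JSON directly.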

Module containing decorators for benchmark data gathering.

class kenning.core.measurements.Measurements

Stores benchmark measurements for later processing.

This is a dict-like object that wraps all processing results for later report generation.

The dictionary in Measurements has the measurement type as a key, and a list of values for the given measurement type as a value.

Values other than lists can be assigned to a given measurement type, but this requires explicit initialization.

data

Dictionary storing lists of values

Type

dict

accumulate(measurementtype: str, valuetoadd: ~typing.Any, initvaluefunc: ~typing.Callable[[], ~typing.Any] = <function Measurements.<lambda>>) List

Adds given value to a measurement.

This function adds the given value (it can be an integer, a float, a NumPy array, or any type that implements the iadd operator).

If this is the first assignment to a given measurement type, the measurement is initialized with the value returned by initvaluefunc.

Parameters
  • measurementtype (str) – the name of the measurement

  • valuetoadd (Any) – New value to add to the measurement

  • initvaluefunc (Callable) – Function returning the initial value of the measurement (by default returns 0)

add_measurement(measurementtype: str, value: ~typing.Any, initialvaluefunc: ~typing.Callable = <function Measurements.<lambda>>)

Add new value to a given measurement type.

Parameters
  • measurementtype (str) – the measurement type to be updated

  • value (Any) – the value to add

  • initialvaluefunc (Callable) – the initial value for the measurement

add_measurements_list(measurementtype: str, valueslist: List)

Adds new values to a given measurement type.

Parameters
  • measurementtype (str) – the measurement type to be updated

  • valueslist (List) – the list of values to add

clear()

Clears measurement data.

get_values(measurementtype: str) List

Returns list of values for a given measurement type.

Parameters

measurementtype (str) – The name of the measurement type

Returns

list of values for a given measurement type

Return type

List

initialize_measurement(measurement_type: str, value: Any)

Sets the initial value for a given measurement type.

By default, the initial values for every measurement are empty lists. Lists are meant to collect time series data and other probed measurements for further analysis.

In case the data is collected in a different container, it should be configured explicitly.

Parameters
  • measurement_type (str) – The type (name) of the measurement

  • value (Any) – The initial value for the measurement type

update_measurements(other: Union[Dict, Measurements])

Adds measurements of types given in the other object.

It requires another Measurements object, or a dictionary that has string keys and values that are lists of values. The lists from the other object are appended to the lists in this object.

Parameters

other (Union[Dict, 'Measurements']) – A dictionary or another Measurements object that contains lists in every entry.

class kenning.core.measurements.MeasurementsCollector

It is a ‘static’ class collecting measurements from various sources.

classmethod clear()

Clears measurement data.

classmethod save_measurements(resultpath: Path)

Saves measurements to JSON file.

Parameters

resultpath (Path) – Path to the saved JSON file

class kenning.core.measurements.SystemStatsCollector(prefix: str, step: float = 0.1)

It is a separate thread used for collecting system statistics.

It collects:

  • CPU utilization,

  • RAM utilization,

  • GPU utilization,

  • GPU Memory utilization.

It can be executed in parallel to another function to check its utilization of resources.

get_measurements()

Returns measurements from the thread.

Collected measurements names are prefixed by the prefix given in the constructor.

The list of measurements:

  • <prefix>_cpus_percent: gives per-core CPU utilization (%),

  • <prefix>_mem_percent: gives overall memory usage (%),

  • <prefix>_gpu_utilization: gives overall GPU utilization (%),

  • <prefix>_gpu_mem_utilization: gives overall memory utilization (%),

  • <prefix>_timestamp: gives the timestamp of above measurements (ns).

Returns

Measurements object with the collected data

Return type

Measurements

run()

Method representing the thread’s activity.

You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.
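The collector pattern above can be sketched with a thread that probes the system at a fixed step and exposes prefixed measurements. This is an illustrative stand-in: the real SystemStatsCollector reads CPU/RAM/GPU utilization, whereas this sketch only records timestamps so that it stays self-contained:

```python
import threading
import time

class StatsCollectorSketch(threading.Thread):
    """Illustrative collector thread modeled after SystemStatsCollector."""

    def __init__(self, prefix: str, step: float = 0.01):
        super().__init__()
        self.prefix = prefix
        self.step = step
        self.timestamps = []
        self._stop_event = threading.Event()

    def run(self):
        # samples until stop() is called; a real implementation would also
        # probe CPU/GPU/memory usage here (e.g. via psutil and NVML)
        while not self._stop_event.is_set():
            self.timestamps.append(time.time_ns())
            time.sleep(self.step)

    def stop(self):
        self._stop_event.set()

    def get_measurements(self):
        # measurement names are prefixed, as in the real class
        return {f'{self.prefix}_timestamp': list(self.timestamps)}
```

Running the collector in parallel with a benchmarked function lets the probed time series be correlated with the function's execution window via the timestamps.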

kenning.core.measurements.systemstatsmeasurements(measurementname: str, step: float = 0.5)

Decorator for measuring system resource usage (CPU, memory, GPU) during function execution.

Check SystemStatsCollector.get_measurements for list of delivered measurements.

Parameters
  • measurementname (str) – The name of the measurement type.

  • step (float) – The step for the measurements, in seconds

kenning.core.measurements.tagmeasurements(tagname: str)

Decorator for adding tags for measurements and saving their timestamps.

Parameters

tagname (str) – The name of the tag.

kenning.core.measurements.timemeasurements(measurementname: str)

Decorator for measuring time of the function.

The duration is given in nanoseconds.

Parameters

measurementname (str) – The name of the measurement type.
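A timing decorator of this kind can be sketched as below. The MEASUREMENTS global and the run_inference function are hypothetical placeholders standing in for MeasurementsCollector and the decorated benchmark function; only the nanosecond-timing pattern mirrors the documented behavior:

```python
import time
from functools import wraps

# hypothetical global store, standing in for MeasurementsCollector
MEASUREMENTS = {}

def timemeasurements_sketch(measurementname):
    """Sketch of a timing decorator; durations are recorded in nanoseconds."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter_ns()
            result = func(*args, **kwargs)
            # append the duration to the list for this measurement type
            MEASUREMENTS.setdefault(measurementname, []).append(
                time.perf_counter_ns() - start)
            return result
        return wrapper
    return decorator

@timemeasurements_sketch('target_inference_step')
def run_inference(sample):
    return sample * 2   # placeholder for the model call
```

Every call to the decorated function appends one duration value, producing a time series suitable for the report generation described above.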

ONNXConversion

ONNXConversion object contains methods for converting models in various frameworks to ONNX and vice versa. It also provides methods for testing the conversion process empirically on a list of deep learning models implemented in tested frameworks.
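The intended subclassing pattern can be sketched as follows. ONNXConversionSketch and DummyConversion are hypothetical stand-ins (the real base class also implements onnx_export, onnx_import and check_conversions); the sketch only shows how prepare() registers model entries via add_entry:

```python
class ONNXConversionSketch:
    """Minimal stand-in for the ONNXConversion base class."""

    def __init__(self, framework, version):
        self.framework = framework
        self.version = version
        self.modelslist = []
        self.prepare()   # the constructor builds the list of models to test

    def add_entry(self, name, modelgenerator, **kwargs):
        # generators are stored, not called, so model creation stays lazy
        self.modelslist.append({'name': name,
                                'modelgenerator': modelgenerator,
                                'parameters': kwargs})

    def prepare(self):
        # implemented per framework in the inheriting class
        raise NotImplementedError


class DummyConversion(ONNXConversionSketch):
    def prepare(self):
        self.add_entry('Linear regression', lambda: 'dummy model object')
```

Storing callables instead of model objects means a model is only instantiated when its conversion is actually tested, keeping the support-matrix run memory-friendly.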

class kenning.core.onnxconversion.ONNXConversion(framework, version)

Creates ONNX conversion support matrix for given framework and models.

add_entry(name, modelgenerator, **kwargs)

Adds new model for verification.

Parameters
  • name (str) – Full name of the model, should match the name of the same models in other framework’s implementations

  • modelgenerator (Callable) – Function that generates the model for ONNX conversion in a given framework. The callable should accept no arguments

  • kwargs (Dict[str, Any]) – Additional arguments that are passed to ModelEntry object as parameters

check_conversions(modelsdir: Path) List[Support]

Runs ONNX conversion for every model entry in the list of models.

Parameters

modelsdir (Path) – Path to the directory where the intermediate models will be saved.

Returns

List with Support tuples describing support status.

Return type

List[Support]

onnx_export(modelentry: ModelEntry, exportpath: Path)

Virtual function for exporting the model to ONNX in a given framework.

This method needs to be implemented for a given framework in inheriting class.

Parameters
  • modelentry (ModelEntry) – ModelEntry object.

  • exportpath (Path) – Path to the output ONNX file.

Returns

the support status of exporting given model to ONNX

Return type

SupportStatus

onnx_import(modelentry: ModelEntry, importpath: Path)

Virtual function for importing ONNX model to a given framework.

This method needs to be implemented for a given framework in inheriting class.

Parameters
  • modelentry (ModelEntry) – ModelEntry object.

  • importpath (Path) – Path to the input ONNX file.

Returns

the support status of importing given model from ONNX

Return type

SupportStatus

prepare()

Virtual function for preparing the ONNX conversion test.

This method should add model entries using the add_entry method.

It is called in the constructor to prepare the list of models to test.

DataProvider

The DataProvider classes are used during deployment for providing data for inference. They can provide data from sources such as a camera, video files, a microphone, or a TCP connection.
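The prepare / fetch_input / preprocess_input / detach_from_source life cycle can be sketched with a hypothetical provider that serves samples from an in-memory list; a real provider might wrap a camera (e.g. cv2.VideoCapture) or a socket instead:

```python
class ListDataProvider:
    """Hypothetical DataProvider serving samples from an in-memory list."""

    def __init__(self, samples):
        self.samples = list(samples)
        self.device = None

    def prepare(self):
        # for a camera this would open the device and assign it to self.device
        self.device = iter(self.samples)

    def fetch_input(self):
        # returns None when the source is exhausted
        return next(self.device, None)

    def preprocess_input(self, data):
        # provider-specific preprocessing; identity in this sketch
        return data

    def detach_from_source(self):
        # release the source during shutdown
        self.device = None
```

The deployment loop calls prepare once, then alternates fetch_input and preprocess_input per sample, and finally detach_from_source on shutdown.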

class kenning.core.dataprovider.DataProvider
detach_from_source()

Detaches from the source during shutdown

fetch_input() Any

Gets a sample from the device.

Returns

Any

Return type

data to be processed by the model

classmethod form_argparse()

Creates argparse parser for the DataProvider object.

This method is used to create a list of arguments for the object so it is possible to configure the object from the command line.

Returns

tuple with the argument parser object that can act as parent for program’s argument parser, and the corresponding arguments’ group pointer

Return type

(ArgumentParser, ArgumentGroup)

classmethod from_argparse(args)

Constructor wrapper that takes the parameters from argparse args.

This method takes the arguments created in form_argparse and uses them to create the object.

Parameters

args (Dict) – arguments from ArgumentParser object

Returns

object of class DataProvider

Return type

DataProvider

prepare()

Prepares the source for data gathering depending on the source type.

For example, this may initialize the camera and set self.device to it.

preprocess_input(data: Any) Any

Performs provider-specific preprocessing of inputs

Parameters

data (Any) – the data to be preprocessed

Returns

preprocessed data

Return type

Any

OutputCollector

The OutputCollector classes are used during deployment for receiving and processing inference results. They can display the results, send them, or store them in a file.
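The process_output / should_close / detach_from_output interplay can be sketched with a hypothetical collector that stores results in memory and requests shutdown after a fixed number of outputs; a real collector might draw bounding boxes or stream results over TCP instead:

```python
class MemoryOutputCollector:
    """Hypothetical OutputCollector accumulating results in memory."""

    def __init__(self, limit: int = 10):
        self.limit = limit
        self.collected = []

    def process_output(self, input_data, output_data):
        # store the (input, inference result) pair; a real collector could
        # render, save, or transmit it here
        self.collected.append((input_data, output_data))

    def should_close(self):
        # exit condition: enough outputs were gathered
        return len(self.collected) >= self.limit

    def detach_from_output(self):
        # release resources during shutdown
        self.collected.clear()
```

The deployment loop calls process_output after each inference and polls should_close to know when to break the loop gracefully.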

class kenning.core.outputcollector.OutputCollector
detach_from_output()

Detaches from the output during shutdown

classmethod form_argparse()

Creates argparse parser for the OutputCollector object.

This method is used to create a list of arguments for the object so it is possible to configure the object from the command line.

Returns

tuple with the argument parser object that can act as parent for program’s argument parser, and the corresponding arguments’ group pointer

Return type

(ArgumentParser, ArgumentGroup)

classmethod from_argparse(args)

Constructor wrapper that takes the parameters from argparse args.

This method takes the arguments created in form_argparse and uses them to create the object.

Parameters

args (Dict) – arguments from ArgumentParser object

Returns

object of class OutputCollector

Return type

OutputCollector

process_output(input_data: Any, output_data: Any)

Returns the inferred data to a specific place/device/connection.

For example, it can save a video file with bounding boxes drawn on objects, stream it via a TCP connection, or display it on screen.

Parameters
  • input_data (Any) – Data collected from the DataProvider that was processed by the model

  • output_data (Any) – Data returned from the model

should_close() bool

Checks if a specific exit condition was reached

This allows the OutputCollector to close gracefully if an exit condition was reached, e.g. when a key was pressed.

Returns

True if exit condition was reached to break the loop

Return type

bool