Kenning API¶
Deployment API overview¶

Figure 5 Kenning core classes and interactions between them. The green blocks represent the flow of input data passed to the model for inference. The orange blocks represent the flow of model deployment, from training to inference on target device. The grey blocks represent the inference results and metrics flow.¶
Kenning provides:
a Dataset class - performs dataset download, preparation, input preprocessing, output postprocessing and model evaluation,
a ModelWrapper class - trains the model, prepares the model, performs model-specific input preprocessing and output postprocessing, runs inference on host using a native framework,
an Optimizer class - optimizes and compiles the model,
a Runtime class - loads the model, performs inference on compiled model, runs target-specific processing of inputs and outputs, and runs performance benchmarks,
a RuntimeProtocol class - implements the communication protocol between the host and the target,
a DataProvider class - implements providing data for inference from such sources as camera, TCP connection, or others,
an OutputCollector class - implements parsing and utilizing data coming from inference (such as displaying visualizations or sending results via TCP).
Model processing¶
The orange blocks and arrows in Figure 5 represent a model’s life cycle:
the model is designed, trained, evaluated and improved - the training is implemented in the ModelWrapper.
Note
This is an optional step - an already trained model can also be wrapped and used.
the model is passed to the Optimizer where it is optimized for given hardware and later compiled,
during inference testing, the model is sent to the target using RuntimeProtocol,
the model is loaded on target side and used for inference using Runtime.
Once the development of the model is complete, the optimized and compiled model can be used directly on target device using Runtime.
I/O data flow¶
The data flow is represented in Figure 5 with green blocks. The input data flow is depicted using green arrows, and the output data flow is depicted using grey arrows.
First, the input and output data is loaded from dataset files and processed. Then, since every model has its own input preprocessing and output postprocessing routines, the data is passed to the ModelWrapper methods to apply these modifications. During inference testing, the data is sent to and from the target using RuntimeProtocol.
Lastly, since Runtimes also have their specific representations of data, proper I/O processing is applied.
Data flow reporting¶
Report rendering requires performance metrics and quality metrics. The flow for this is presented with grey lines and blocks in Figure 5.
On the target side, performance metrics are computed and sent back to the host using the RuntimeProtocol, and later passed to report rendering. After the output data goes through processing in the Runtime and ModelWrapper, it is compared to the ground truth in the Dataset during model evaluation. In the end, the results of model evaluation are passed to report rendering.
The final report is generated as an RST file containing figures, as can be observed in the Sample autogenerated report.
KenningFlow¶
The kenning.core.flow.KenningFlow class allows for the creation and execution of arbitrary flows built of runners.
It is responsible for validating all runners provided in a config file and their IO compatibility.
- class kenning.core.flow.KenningFlow(runners: list[Runner])¶
Allows for creation of custom flows using Kenning core classes.
KenningFlow class creates and executes customized flows consisting of the runners implemented based on kenning.core classes, such as DatasetProvider, ModelRunner, OutputCollector. Designed flows may be formed into non-linear, graph-like structures.
The flow may be defined either directly via dictionaries or in a predefined JSON format.
The JSON format must follow a well-defined structure. Each runner should consist of the following entries:
type - Type of a Kenning class to use for this module
parameters - Inner parameters of chosen class
inputs - Optional, set of pairs (local name, global name)
outputs - Optional, set of pairs (local name, global name)
All global names (inputs and outputs) must be unique. All local names are predefined for each class. All variables used as input to a runner must be defined as an output of a runner that is placed before that runner.
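The naming rules above can be checked mechanically. The following is a minimal sketch and not Kenning's own validation code: `check_flow_names` is a hypothetical helper, the `type` strings and local names are placeholders, and it assumes that `inputs` and `outputs` are given as dictionaries mapping local names to global names. It interprets the uniqueness rule as "no two outputs may share a global name".

```python
from typing import Any


def check_flow_names(runners: list[dict[str, Any]]) -> list[str]:
    """Return a list of problems found in a flow definition (empty if none)."""
    problems = []
    defined = set()  # global names produced by the runners seen so far
    for index, runner in enumerate(runners):
        # Every input must refer to an output of a runner placed earlier.
        for local, global_name in runner.get("inputs", {}).items():
            if global_name not in defined:
                problems.append(
                    f"runner {index}: input '{local}' uses undefined '{global_name}'"
                )
        # Every output must introduce a new global name.
        for local, global_name in runner.get("outputs", {}).items():
            if global_name in defined:
                problems.append(
                    f"runner {index}: output '{local}' redefines '{global_name}'"
                )
            defined.add(global_name)
    return problems


# Placeholder type strings and local names, for illustration only.
flow = [
    {
        "type": "example.CameraProvider",
        "parameters": {},
        "outputs": {"frame": "cam_frame"},
    },
    {
        "type": "example.ModelRunner",
        "parameters": {},
        "inputs": {"input": "cam_frame"},
        "outputs": {"detections": "predictions"},
    },
    {
        "type": "example.Visualizer",
        "parameters": {},
        "inputs": {"frame": "cam_frame", "data": "predictions"},
    },
]
print(check_flow_names(flow))
```

Reordering the runners so that a consumer precedes its producer makes the helper report the undefined inputs, which mirrors the ordering requirement stated above.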
- classmethod form_parameterschema()¶
Creates schema for the KenningFlow class.
- classmethod from_json(runners_specifications: list[dict[str, Any]])¶
Constructor wrapper that takes the parameters from json dict.
This function checks if the given dictionary is valid according to the json schema defined in form_parameterschema. If it is then it parses json and invokes the constructor.
- run()¶
Main process function. Repeatedly runs constructed graph in a loop.
- run_single_step()¶
Runs flow one time.
Runner¶
kenning.core.runner.Runner-based classes are responsible for executing various operations in KenningFlow (e.g. data providing, model execution, data visualization).
The available runner implementations are:
DataProvider - base class for data providing,
ModelRuntimeRunner - for running model inference,
OutputCollector - for processing model output.
- class kenning.core.runner.Runner(inputs_sources: dict[str, tuple[int, str]], inputs_specs: dict[str, dict], outputs: dict[str, str])¶
Represents an operation block in Kenning Flow.
- cleanup()¶
Method that cleans resources after Runner is no longer needed.
- classmethod from_argparse(args: Namespace, inputs_sources: dict[str, tuple[int, str]], inputs_specs: dict[str, dict], outputs: dict[str, str])¶
Constructor wrapper that takes the parameters from argparse args.
This method takes the arguments created in form_argparse and uses them to create the object.
- classmethod from_json(json_dict: dict, inputs_sources: dict[str, tuple[int, str]], inputs_specs: dict[str, dict], outputs: dict[str, str])¶
Constructor wrapper that takes the parameters from json dict.
This function checks if the given dictionary is valid according to the json schema defined. If it is then it invokes the constructor.
Dataset¶
kenning.core.dataset.Dataset-based classes are responsible for:
dataset preparation, including download routines (use the --download-dataset flag to download the dataset data),
input preprocessing into a format expected by most models for a given task,
output postprocessing for the evaluation process,
model evaluation based on its predictions,
sample subdivision into training and validation datasets.
The Dataset objects are used by:
ModelWrapper - for training purposes and model evaluation,
Optimizer - can be used e.g. for extracting a calibration dataset for quantization purposes,
Runtime - for model evaluation on target hardware.
The available dataset implementations are included in the kenning.datasets submodule.
Example implementations:
PetDataset for classification,
OpenImagesDatasetV6 for object detection,
- class kenning.core.dataset.Dataset(root: Path, batch_size: int = 1, download_dataset: bool = True, force_download_dataset: bool = False, external_calibration_dataset: Path | None = None, split_fraction_test: float = 0.2, split_fraction_val: float | None = None, split_seed: int = 1234)¶
Wraps the datasets for training, evaluation and optimization.
This class provides an API for datasets used by models, compilers (i.e. for calibration) and benchmarking scripts.
Each Dataset object should implement methods for:
processing inputs and outputs from dataset files,
downloading the dataset,
evaluating the model based on dataset’s inputs and outputs.
The Dataset object provides routines for iterating over dataset samples with configured batch size, splitting the dataset into subsets and extracting loaded data from dataset files for training purposes.
- calibration_dataset_generator(percentage: float = 0.25, seed: int = 12345) Generator[list[Any], None, None] ¶
Creates generator for the calibration data.
- abstract download_dataset_fun()¶
Downloads the dataset to the root directory defined in the constructor.
- abstract evaluate(predictions: list, truth: list) Measurements ¶
Evaluates the model based on the predictions.
The method should compute various quality metrics fitting for the problem the model solves - e.g. for classification it may be accuracy, precision, G-mean, for detection it may be IoU and mAP.
The evaluation results should be returned in a form of Measurements object.
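As an illustration of the kind of computation an evaluate implementation might perform for classification, here is a minimal sketch that counts correct top-1 predictions. It is not Kenning code: `evaluate_top1` is a hypothetical function, it assumes predictions are per-class score lists and ground truth is one-hot encoded, and it returns a plain dictionary as a stand-in for a Measurements object.

```python
def evaluate_top1(
    predictions: list[list[float]], truth: list[list[float]]
) -> dict[str, int]:
    """Count correct top-1 predictions for one batch."""
    correct = 0
    for predicted, expected in zip(predictions, truth):
        # The index of the largest value is the predicted / expected class.
        predicted_class = max(range(len(predicted)), key=predicted.__getitem__)
        expected_class = max(range(len(expected)), key=expected.__getitem__)
        correct += int(predicted_class == expected_class)
    # A plain dict stands in for the Measurements object here.
    return {"correct": correct, "total": len(truth)}


batch_predictions = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.3, 0.3, 0.4]]
batch_truth = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]
result = evaluate_top1(batch_predictions, batch_truth)
accuracy = result["correct"] / result["total"]
print(result, accuracy)
```

Returning counts rather than a ratio lets results from consecutive batches be summed before the final accuracy is computed.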
- classmethod from_argparse(args: Namespace)¶
Constructor wrapper that takes the parameters from argparse args.
This method takes the arguments created in form_argparse and uses them to create the object.
- classmethod from_json(json_dict: dict)¶
Constructor wrapper that takes the parameters from json dict.
This function checks if the given dictionary is valid according to the arguments_structure defined. If it is then it invokes the constructor.
- get_class_names() list[str] ¶
Returns list of class names in order of their IDs.
- get_data() tuple[list, list] ¶
Returns the tuple of all inputs and outputs for the dataset.
Warning
It loads all entries with prepare_input_samples and prepare_output_samples to the memory - for large datasets it may result in filling the whole memory.
- get_data_unloaded() tuple[list, list] ¶
Returns the input and output representations before loading.
The representations can be opened using prepare_input_samples and prepare_output_samples.
- get_input_mean_std() tuple[Any, Any] ¶
Returns mean and std values for input tensors.
The mean and std values returned here should be computed using the compute_input_mean_std method.
- abstract prepare()¶
Prepares dataX and dataY attributes based on the dataset contents.
For example, this can store file paths in dataX and classes in dataY that will be later loaded using prepare_input_samples and prepare_output_samples.
- prepare_external_calibration_dataset(percentage: float = 0.25, seed: int = 12345) list[Path] ¶
Prepares the data for external calibration dataset.
This method is supposed to scan the external_calibration_dataset directory and prepare the list of entries that are suitable for the prepare_input_samples method.
This method is called by the calibration_dataset_generator method to get the data for calibration when external_calibration_dataset is provided.
By default, this method scans for all files in the directory and returns the list of those files.
- prepare_input_samples(samples: list) list ¶
Prepares input samples, e.g. loads images from files and converts them.
By default the method returns data as is - without any conversions. Since the input samples can be large, it does not make sense to load all data to the memory - this method handles loading data for a given data batch.
- prepare_output_samples(samples: list) list ¶
Prepares output samples.
By default the method returns data as is. It can be used e.g. to create the one-hot output vector with class association based on a given sample.
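The one-hot conversion mentioned above can be sketched as follows. This is an illustrative helper, not part of Kenning: `one_hot` and its `num_classes` parameter are hypothetical names, and the samples are assumed to be integer class IDs.

```python
def one_hot(samples: list[int], num_classes: int) -> list[list[float]]:
    """Convert integer class IDs into one-hot vectors."""
    vectors = []
    for class_id in samples:
        if not 0 <= class_id < num_classes:
            raise ValueError(f"class ID {class_id} is out of range")
        vector = [0.0] * num_classes
        vector[class_id] = 1.0  # mark the class this sample belongs to
        vectors.append(vector)
    return vectors


print(one_hot([2, 0], num_classes=3))
```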
- save_dataset_checksum()¶
Writes dataset checksum to file.
- set_batch_size(batch_size)¶
Sets the batch size of the data in the iterator batches.
- train_test_split_representations(test_fraction: float | None = None, val_fraction: float | None = None, seed: int | None = None, stratify: bool = True) tuple[list, ...] ¶
Splits the data representations into train dataset and test dataset.
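To make the idea of a seeded split concrete, here is a simplified sketch using only the standard library. It is not the Kenning implementation: `split_representations` is a hypothetical function, it ignores the validation fraction and stratification, and the order of the returned lists is an assumption made for this example.

```python
import random


def split_representations(data_x, data_y, test_fraction=0.2, seed=1234):
    """Shuffle sample indices with a fixed seed and split them in two."""
    indices = list(range(len(data_x)))
    # A dedicated Random instance keeps the split reproducible.
    random.Random(seed).shuffle(indices)
    test_count = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:test_count], indices[test_count:]
    # Hypothetical return order: train inputs, test inputs,
    # train outputs, test outputs.
    return (
        [data_x[i] for i in train_idx],
        [data_x[i] for i in test_idx],
        [data_y[i] for i in train_idx],
        [data_y[i] for i in test_idx],
    )


paths = [f"img{i}.png" for i in range(10)]
labels = list(range(10))
train_x, test_x, train_y, test_y = split_representations(paths, labels)
print(len(train_x), len(test_x))
```

Splitting the unloaded representations (for example file paths) instead of loaded samples keeps the operation cheap even for large datasets.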
ModelWrapper¶
The kenning.core.model.ModelWrapper base class requires implementing methods for:
model preparation,
model saving and loading,
model saving to the ONNX format,
model-specific preprocessing of inputs and postprocessing of outputs, if necessary,
model inference,
providing metadata (framework name and version),
model training,
input format specification,
conversion of model inputs and outputs to bytes for the kenning.core.runtimeprotocol.RuntimeProtocol objects.
The ModelWrapper provides methods for running inference in a loop for data from a dataset and measuring the model’s quality and inference performance.
The kenning.modelwrappers.frameworks submodule contains framework-wise implementations of the ModelWrapper class - they implement all methods common for given frameworks regardless of the model used.
For the Pet Dataset wrapper object, there is an example classifier implemented in TensorFlow 2.x called TensorFlowPetDatasetMobileNetV2 (https://github.com/antmicro/kenning/blob/main/kenning/modelwrappers/classification/tensorflow_pet_dataset.py).
Model wrapper examples:
PyTorchWrapper and TensorFlowWrapper implement common methods for all PyTorch and TensorFlow framework models,
PyTorchPetDatasetMobileNetV2 wraps the MobileNetV2 model for Pet classification implemented in PyTorch,
TensorFlowPetDatasetMobileNetV2 wraps the MobileNetV2 model for Pet classification implemented in TensorFlow,
TVMDarknetCOCOYOLOV3 wraps the YOLOv3 model for COCO object detection implemented in Darknet (without training and inference methods).
- class kenning.core.model.ModelWrapper(model_path: Path | ResourceURI, dataset: Dataset | None, from_file: bool = True)¶
Wraps the given model.
- abstract convert_input_to_bytes(inputdata: Any) bytes ¶
Converts the input returned by the preprocess_input method to bytes.
- abstract convert_output_from_bytes(outputdata: bytes) list[Any] ¶
Converts bytes array to the model output format.
The converted output should be compatible with the postprocess_outputs method.
- classmethod derive_io_spec_from_json_params(json_dict: dict) dict[str, list[dict]] ¶
Creates IO specification by deriving parameters from a parsed JSON dictionary. The resulting IO specification may differ from the results of get_io_specification: information that couldn’t be retrieved from JSON parameters is absent from the final IO spec or is filled with a general value (example: ‘-1’ for unknown dimension shape).
- classmethod from_argparse(dataset: Dataset | None, args: Namespace, from_file: bool = True)¶
Constructor wrapper that takes the parameters from argparse args.
- classmethod from_json(dataset: Dataset | None, json_dict: dict, from_file: bool = True)¶
Constructor wrapper that takes the parameters from json dict.
This function checks if the given dictionary is valid according to the arguments_structure defined. If it is then it invokes the constructor.
- abstract get_framework_and_version() tuple[str, str] ¶
Returns name of the framework and its version in a form of a tuple.
- get_io_specification() dict[str, list[dict]] ¶
Returns a saved dictionary with input and output keys that map to input and output specifications.
A single specification is a list of dictionaries with names, shapes and dtypes for each layer. The order of the dictionaries is assumed to be expected by the ModelWrapper.
It is later used in optimization and compilation steps.
- abstract get_io_specification_from_model() dict[str, list[dict]] ¶
Returns a new instance of dictionary with input and output keys that map to input and output specifications.
A single specification is a list of dictionaries with names, shapes and dtypes for each layer. The order of the dictionaries is assumed to be expected by the ModelWrapper.
It is later used in optimization and compilation steps.
It is used by get_io_specification function to get the specification and save it for later use.
- abstract get_output_formats() list[str] ¶
Returns list of names of possible output formats.
- get_path() Path | ResourceURI ¶
Returns path to the model in a form of a Path or ResourceURI object.
- load_model(model_path: Path | ResourceURI)¶
Loads the model from file.
- classmethod parse_io_specification_from_json(json_dict)¶
Return dictionary with ‘input’ and ‘output’ keys that will map to input and output specification of an object created by the argument json schema.
A single specification is a list of dictionaries with names, shapes and dtypes for each layer.
Since no object initialization is done for this method, some IO specification may be incomplete; this method fills in -1 in case the information is missing from the JSON dictionary.
- postprocess_outputs(y: list[Any]) Any ¶
Processes the outputs for a given model.
By default no action is taken, and the outputs are passed unmodified.
- abstract prepare_model()¶
Downloads the model (if required) and loads it to the device.
Should be used whenever the model is actually required.
The prepare_model method should check model_prepared field to determine if the model is not already loaded.
It should also set model_prepared field to True once the model is prepared.
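The guard described above could look roughly like this. The class below is a hypothetical stand-in that does not inherit from ModelWrapper; the placeholder model object and the `load_count` counter exist only to make the behaviour observable.

```python
class ExampleWrapper:
    """Hypothetical stand-in illustrating the model_prepared guard."""

    def __init__(self):
        self.model_prepared = False
        self.model = None
        self.load_count = 0  # illustration only: counts actual loads

    def prepare_model(self):
        # Skip the expensive work if the model is already loaded.
        if self.model_prepared:
            return
        self.model = {"weights": [0.0, 0.0]}  # placeholder for a real model
        self.load_count += 1
        # Mark the model as ready so later calls return early.
        self.model_prepared = True


wrapper = ExampleWrapper()
wrapper.prepare_model()
wrapper.prepare_model()  # second call is a no-op
print(wrapper.load_count)
```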
- preprocess_input(X: list) Any ¶
Preprocesses the inputs for a given model before inference.
By default no action is taken, and the inputs are passed unmodified.
- save_model(model_path: Path | ResourceURI)¶
Saves the model to file.
- save_to_onnx(model_path: Path | ResourceURI)¶
Saves the model in the ONNX format.
- test_inference() Measurements ¶
Runs the inference with a given dataset.
- train_model(batch_size: int, learning_rate: float, epochs: int, logdir: Path)¶
Trains the model with a given dataset.
This method should implement training routine for a given dataset and save a working model to a given path in a form of a single file.
The training should be performed with given batch size, learning rate, and number of epochs.
The model needs to be saved explicitly.
Optimizer¶
kenning.core.optimizer.Optimizer objects wrap the deep learning compilation process.
They can perform the model optimization (operation fusion, quantization) as well.
All Optimizer objects should provide methods for compiling models in ONNX format, but they can also provide support for other formats (like Keras .h5 files, or PyTorch .th files).
Example model compilers:
TFLiteCompiler - wraps TensorFlow Lite compilation,
TVMCompiler - wraps TVM compilation.
- class kenning.core.optimizer.Optimizer(dataset: Dataset | None, compiled_model_path: Path)¶
Compiles the given model to a different format or runtime.
- abstract compile(input_model_path: Path | ResourceURI, io_spec: dict[str, list[dict]] | None = None)¶
Compiles the given model to a target format.
The function compiles the model and saves it to the output file.
The model can be compiled to a binary, a different framework or a different programming language.
If io_spec is passed, then the function uses it during the compilation, otherwise load_io_specification is used to fetch the specification saved in input_model_path + .json.
The compiled model is saved to compiled_model_path and the specification is saved to compiled_model_path + .json.
- consult_model_type(previous_block: ModelWrapper | Optimizer, force_onnx: bool = False) str ¶
Finds output format of the previous block in the chain matching with an input format of the current block.
- classmethod from_argparse(dataset: Dataset, args: Namespace)¶
Constructor wrapper that takes the parameters from argparse args.
- classmethod from_json(dataset: Dataset | None, json_dict: dict)¶
Constructor wrapper that takes the parameters from json dict.
This function checks if the given dictionary is valid according to the arguments_structure defined. If it is then it invokes the constructor.
- abstract get_framework_and_version() tuple[str, str] ¶
Returns name of the framework and its version in a form of a tuple.
- get_input_formats() list[str] ¶
Returns list of names of possible input formats.
- get_output_formats() list[str] ¶
Returns list of names of possible output formats.
- get_spec_path(model_path: Path | ResourceURI) ResourceURI ¶
Returns input/output specification path for the model saved in model_path. It concatenates model_path and .json.
- load_io_specification(model_path: Path | ResourceURI) dict[str, list[dict]] | None ¶
Returns saved input and output specification of a model saved in model_path if there is one. Otherwise returns None.
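The "model path plus .json" convention described for get_spec_path and load_io_specification can be sketched with the standard library alone. The functions below are hypothetical equivalents written for illustration, operating on plain `pathlib.Path` objects rather than ResourceURI; the file name `model.tflite` is just an example.

```python
import json
import tempfile
from pathlib import Path


def spec_path_for(model_path: Path) -> Path:
    """Append ".json" to the full file name, keeping the original suffix."""
    return model_path.with_name(model_path.name + ".json")


def load_spec(model_path: Path):
    """Return the saved specification, or None if there is no such file."""
    spec_path = spec_path_for(model_path)
    if not spec_path.exists():
        return None
    with spec_path.open() as spec_file:
        return json.load(spec_file)


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.tflite"
    missing = load_spec(model_path)  # no specification saved yet
    spec_path_for(model_path).write_text(json.dumps({"input": [], "output": []}))
    loaded = load_spec(model_path)

print(missing, loaded)
```

Note that the suffix is appended rather than replaced, so `model.tflite` maps to `model.tflite.json`.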
- save_io_specification(input_model_path: Path | ResourceURI, io_spec: dict[str, list[dict]] | None = None)¶
Internal function that saves input/output model specification which is used during both inference and compilation. If io_spec is None, the function uses specification of an input model stored in input_model_path + .json. If there is no specification stored in this path the function does not do anything.
The input/output specification is a list of dictionaries mapping property names to their values. Legal property names are dtype, prequantized_dtype, shape, name, scale, zero_point.
The order of the layers has to be preserved.
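For orientation, a specification using the property names listed above might look like the dictionary below. The layer names, shapes and quantization values are invented for illustration, and `unknown_properties` is a hypothetical helper, not a Kenning function; it assumes the specification is stored under `input` and `output` keys.

```python
LEGAL_PROPERTIES = {
    "dtype", "prequantized_dtype", "shape", "name", "scale", "zero_point"
}


def unknown_properties(io_spec: dict) -> set:
    """Collect property names that are not in the legal set."""
    unknown = set()
    for side in ("input", "output"):
        for layer in io_spec.get(side, []):
            unknown |= set(layer) - LEGAL_PROPERTIES
    return unknown


# Illustrative values only; the layer order in each list is significant.
io_spec = {
    "input": [
        {
            "name": "input_1",
            "shape": [1, 224, 224, 3],
            "dtype": "int8",
            "prequantized_dtype": "float32",
            "scale": 0.0078,
            "zero_point": -1,
        }
    ],
    "output": [{"name": "out", "shape": [1, 37], "dtype": "float32"}],
}
print(unknown_properties(io_spec))
```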
- set_compiled_model_path(compiled_model_path: Path)¶
Sets path for compiled model.
- compiled_model_path : PathOrURI
Path to be set.
- set_input_type(inputtype: str)¶
Sets input type of the model for the compiler.
- inputtype : str
Input type to be set.
Runtime¶
The kenning.core.runtime.Runtime class provides interfaces for methods for running compiled models locally or remotely on a target device.
Runtimes are usually compiler-specific (frameworks for deep learning compilers provide runtime libraries to run compiled models on particular hardware).
The client (host) side of the Runtime class utilizes the methods from Dataset, ModelWrapper and RuntimeProtocol classes to run inference on a target device.
The server (target) side of the Runtime class requires method implementation for:
loading a model delivered by the client,
preparing inputs delivered by the client,
running inference,
preparing outputs for delivery to the client,
(optionally) sending inference statistics.
Runtime examples:
TFLiteRuntime for models compiled with TensorFlow Lite,
TVMRuntime for models compiled with TVM.
- class kenning.core.runtime.Runtime(protocol: RuntimeProtocol | None, disable_performance_measurements: bool = False)¶
Runtime object provides an API for testing inference on target devices.
Using a provided RuntimeProtocol it sets up a client (host) and server (target) communication, during which the inference metrics are being analyzed.
- close_server()¶
Indicates that the server should be closed.
- extract_output() list[Any] ¶
Extracts and postprocesses the output of the model.
- classmethod from_argparse(protocol: RuntimeProtocol | None, args: Namespace)¶
Constructor wrapper that takes the parameters from argparse args.
- classmethod from_json(protocol: RuntimeProtocol | None, json_dict: dict)¶
Constructor wrapper that takes the parameters from json dict.
This function checks if the given dictionary is valid according to the arguments_structure defined. If it is then it invokes the constructor.
- get_input_formats() list[str] ¶
Returns list of names of possible input formats.
- get_io_spec_path(model_path: Path | ResourceURI) Path ¶
Gets path to an input/output specification file which is model_path and .json concatenated.
- infer(X: ndarray, modelwrapper: ModelWrapper, postprocess: bool = True) Any ¶
Runs inference on single batch locally using a given runtime.
- inference_session_end()¶
Calling this function indicates that the inference session has ended.
This method should be called once all the inference data is sent to the server by the client.
This will stop performance tracking.
- inference_session_start()¶
Calling this function indicates that the client is connected.
This method should be called once the client has connected to a server.
This will enable performance tracking.
- postprocess_output(results: list[ndarray]) list[ndarray] ¶
The method accepts output of the model and postprocesses it.
The output is quantized and converted to a correct dtype if needed.
Some compilers can change the order of the layers. If that’s the case the method also reorders the output to match the original order of the model before compilation.
- prepare_client() bool ¶
Runs initialization for the client.
- prepare_input(input_data: bytes) bool ¶
Loads and converts delivered data to the accelerator for inference.
This method is called when the input is received from the client. It is supposed to prepare input before running inference.
- prepare_io_specification(input_data: bytes | None) bool ¶
Receives the io_specification from the client in bytes and saves it for later use.
input_data stores the io_specification representation in bytes. If input_data is None, the io_specification is extracted from another source (e.g. from an existing file). If it cannot be found in this path, io_specification is not loaded.
When no specification file is found, the function returns True as some Runtimes may not need io_specification to run the inference.
- prepare_local() bool ¶
Runs initialization for the local inference.
- prepare_model(input_data: bytes | None) bool ¶
Receives the model to infer from the client in bytes.
The method should load bytes with the model, optionally save to file and allocate the model on target device for inference.
input_data stores the model representation in bytes. If input_data is None, the model is extracted from another source (e.g. from an existing file).
- prepare_server() bool ¶
Runs initialization of the server.
- preprocess_input(input_data: bytes) list[ndarray] ¶
The method accepts input_data in bytes and preprocesses it so that it can be passed to the model.
It creates np.ndarray for every input layer using the metadata in self.input_spec and quantizes the data if needed.
Some compilers can change the order of the layers. If that’s the case the method also reorders the layers to match the specification of the model.
- Parameters:¶
- input_data : bytes¶
Input data in bytes delivered by the client.
- Returns:¶
List of inputs for each layer which are ready to be passed to the model.
- Return type:¶
list[np.ndarray]
- Raises:¶
AttributeError : – Raised if output specification is not loaded.
ValueError : – Raised if size of input doesn’t match the input specification.
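The core of such preprocessing, splitting one byte buffer into per-layer values according to the input specification, can be sketched with the standard library. This is a simplified illustration rather than the Kenning implementation: it returns flat Python lists instead of np.ndarray objects, skips quantization and layer reordering, assumes little-endian data, and supports only the few dtypes listed in the hypothetical `DTYPE_FORMATS` table.

```python
import math
import struct

# struct format characters for a few dtypes (illustrative subset).
DTYPE_FORMATS = {"float32": "f", "int8": "b", "uint8": "B", "int32": "i"}


def split_input_bytes(input_data: bytes, input_spec: list[dict]) -> list[list]:
    """Split a byte buffer into one flat list of values per input layer."""
    expected = sum(
        math.prod(spec["shape"])
        * struct.calcsize("<" + DTYPE_FORMATS[spec["dtype"]])
        for spec in input_spec
    )
    # Reject buffers that do not match the specification exactly.
    if len(input_data) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(input_data)}")
    layers, offset = [], 0
    for spec in input_spec:
        count = math.prod(spec["shape"])
        fmt = f"<{count}{DTYPE_FORMATS[spec['dtype']]}"
        layers.append(list(struct.unpack_from(fmt, input_data, offset)))
        offset += struct.calcsize(fmt)
    return layers


spec = [
    {"name": "a", "shape": [1, 2], "dtype": "float32"},
    {"name": "b", "shape": [3], "dtype": "uint8"},
]
payload = struct.pack("<2f", 1.5, -2.0) + bytes([1, 2, 3])
print(split_input_bytes(payload, spec))
```

Checking the total size up front corresponds to the ValueError case listed above, so a truncated transfer is rejected before any layer is decoded.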
- process_input(input_data)¶
Processes received input and measures the performance quality.
- read_io_specification(io_spec: dict)¶
Saves input/output specification so that it can be used during the inference.
input_spec and output_spec are lists, where every element is a dictionary mapping (property name) -> (property value) for the layers.
The standard property names are: name, dtype and shape.
If the model is quantized it also has scale, zero_point and prequantized_dtype properties.
If the layers of the model are reordered it also has order property.
- run()¶
Runs inference on prepared input.
The input should be introduced in runtime’s model representation, or it should be delivered using a variable that was assigned in prepare_input method.
- Raises:¶
ModelNotLoadedError : – Raised if model is not loaded.
- run_client(dataset: Dataset, modelwrapper: ModelWrapper, compiled_model_path: Path | ResourceURI) bool ¶
Main runtime client program.
The client performance procedure is as follows:
connect with the server
upload the model
send dataset data in a loop to the server:
upload input
request processing of inputs
request predictions for inputs
evaluate the response
collect performance statistics
end connection
- run_locally(dataset: Dataset, modelwrapper: ModelWrapper, compiled_model_path: Path | ResourceURI) bool ¶
Runs inference locally using a given runtime.
- run_server()¶
Main runtime server program.
It waits for requests from a single client.
Based on requests, it loads the model, runs inference and provides statistics.
- upload_essentials(compiled_model_path: Path | ResourceURI) bool ¶
Wrapper for uploading data to the server. Uploads model by default.
- upload_output(input_data: bytes) bytes ¶
Returns the output to the client, in bytes.
The method converts the direct output from the model to bytes and returns them.
The wrapper later sends the data to the client.
- upload_stats(input_data: bytes) bytes ¶
Returns statistics of inference passes to the client.
Default implementation converts collected metrics in MeasurementsCollector to JSON format and returns them for sending.
RuntimeProtocol¶
The kenning.core.runtimeprotocol.RuntimeProtocol class conducts communication between the client (host) and the server (target).
The RuntimeProtocol class requires method implementation for:
initializing the server and the client (communication-wise),
waiting for the incoming data,
data sending,
data receiving,
uploading model inputs to the server,
uploading the model to the server,
requesting inference on target,
downloading outputs from the server,
(optionally) downloading the statistics from the server (e.g. performance speed, CPU/GPU utilization, power consumption),
success or failure notifications from the server,
message parsing.
Based on the above-mentioned methods, the kenning.core.runtime.Runtime connects the host with the target.
RuntimeProtocol examples:
NetworkProtocol - implements a TCP-based communication between the host and the target.
Runtime protocol specification¶
The communication protocol is message-based. Possible messages are:
OK messages - indicate success, and may come with additional information,
ERROR messages - indicate failure,
DATA messages - provide input data for inference,
MODEL messages - provide model to load for inference,
PROCESS messages - request processing inputs delivered in DATA message,
OUTPUT messages - request processing results,
STATS messages - request statistics from the target device.
The message types and enclosed data are encoded in a format implemented in the kenning.core.runtimeprotocol.RuntimeProtocol-based class.
Communication during an inference benchmark session goes as follows:
The client (host) connects to the server (target),
The client sends a MODEL request along with the compiled model,
The server loads the model from request, prepares everything to run the model and sends an OK response,
After receiving the OK response from the server, the client starts reading input samples from the dataset, preprocesses the inputs, and sends a DATA request with the preprocessed input,
Upon receiving the DATA request, the server stores the input for inference, and sends an OK message,
Upon receiving confirmation, the client sends a PROCESS request,
Just after receiving the PROCESS request, the server should send an OK message to confirm start of inference, and just after the inference is finished, the server should send another OK message to confirm that the inference has finished,
After receiving the first OK message, the client starts measuring inference time until the second OK response is received,
The client sends an OUTPUT request in order to receive the outputs from the server,
The server sends an OK message along with the output data,
The client parses the output and evaluates model performance,
The client sends a STATS request to obtain additional statistics (inference time, CPU/GPU/Memory utilization) from the server,
If the server provides any statistics, it sends an OK message with the data,
The same process applies to the rest of input samples.
The way the message type is determined and the data between the server and the client is sent depends on the implementation of the kenning.core.runtimeprotocol.RuntimeProtocol class.
The implementation of running inference on the given target is contained within the kenning.core.runtime.Runtime class.
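The message sequence of a benchmark session can be simulated without any transport. The sketch below is purely illustrative: `MessageType`, `InMemoryServer`, `expect_ok` and `run_session` are hypothetical names, the "inference" merely reverses the input bytes, and both OK responses to PROCESS are returned together, so the timing measurement between them is not modelled.

```python
from enum import Enum, auto


class MessageType(Enum):
    OK = auto()
    ERROR = auto()
    DATA = auto()
    MODEL = auto()
    PROCESS = auto()
    OUTPUT = auto()
    STATS = auto()


class InMemoryServer:
    """Hypothetical target that handles requests without any transport."""

    def __init__(self):
        self.model = None
        self.input = None
        self.output = None

    def handle(self, message_type, payload=b""):
        """Return a list of (MessageType, payload) responses."""
        if message_type is MessageType.MODEL:
            self.model = payload
            return [(MessageType.OK, b"")]
        if message_type is MessageType.DATA:
            self.input = payload
            return [(MessageType.OK, b"")]
        if message_type is MessageType.PROCESS:
            if self.model is None or self.input is None:
                return [(MessageType.ERROR, b"")]
            # Placeholder "inference": reverse the input bytes.
            self.output = self.input[::-1]
            # First OK confirms the start, second OK confirms the end.
            return [(MessageType.OK, b""), (MessageType.OK, b"")]
        if message_type is MessageType.OUTPUT:
            if self.output is None:
                return [(MessageType.ERROR, b"")]
            return [(MessageType.OK, self.output)]
        if message_type is MessageType.STATS:
            return [(MessageType.OK, b"{}")]
        return [(MessageType.ERROR, b"")]


def expect_ok(responses):
    """Raise on any non-OK response and return the last payload."""
    for message_type, _ in responses:
        if message_type is not MessageType.OK:
            raise RuntimeError("server reported failure")
    return responses[-1][1]


def run_session(server, model, samples):
    """Client side: upload the model, then process every sample."""
    expect_ok(server.handle(MessageType.MODEL, model))
    outputs = []
    for sample in samples:
        expect_ok(server.handle(MessageType.DATA, sample))
        expect_ok(server.handle(MessageType.PROCESS))
        outputs.append(expect_ok(server.handle(MessageType.OUTPUT)))
        expect_ok(server.handle(MessageType.STATS))
    return outputs


session_outputs = run_session(InMemoryServer(), b"model", [b"abc", b"xyz"])
print(session_outputs)
```

In a real protocol implementation the two OK messages for PROCESS arrive at different times, and the client measures the interval between them.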
RuntimeProtocol API¶
kenning.core.runtimeprotocol.RuntimeProtocol-based classes implement the Runtime protocol specification over a given means of transport, e.g. TCP connection or UART.
It requires method implementation for:
server (target hardware) and client (compiling host) initialization,
sending and receiving data,
connecting and disconnecting,
model upload (host) and download (target hardware),
message parsing and creation.
- class kenning.core.runtimeprotocol.RuntimeProtocol¶
The interface for the communication protocol with the target devices.
The target device acts as a server in the communication.
The machine that runs the benchmark and collects the results is the client for the target device.
The inheriting classes for this class implement at least the client-side of the communication with the target device.
- disconnect()¶
Ends connection with the other side.
- download_output() tuple[bool, bytes | None] ¶
Downloads the outputs from the target device.
Requests and downloads the latest inference output from the target device for quality measurements.
- download_statistics() Measurements ¶
Downloads inference statistics from the target device.
By default no statistics are gathered.
- classmethod from_argparse(args: Namespace)¶
Constructor wrapper that takes the parameters from argparse args.
- classmethod from_json(json_dict: dict)¶
Constructor wrapper that takes the parameters from json dict.
This function checks if the given dictionary is valid according to the defined arguments_structure. If it is, it invokes the constructor.
- gather_data(timeout: float | None = None) tuple[ServerStatus, Any | None] ¶
Gathers data from the client.
This method should be called by receive_message in order to get data from the client.
- Parameters:¶
- timeout : Optional[float]¶
Receive timeout in seconds. If timeout > 0, this specifies the maximum wait time, in seconds. If timeout <= 0, the call won’t block, and will report the currently ready file objects. If timeout is None, the call will block until a monitored file object becomes ready.
- Returns:¶
Receive status along with received data.
- Return type:¶
Tuple[ServerStatus, Optional[Any]]
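The timeout semantics described above match those of select() in Python's standard selectors module. Whether a particular RuntimeProtocol implementation relies on that module is an assumption here; the sketch below only demonstrates the behaviour of the timeout values, using a local socket pair as a stand-in for the host and target connection.

```python
import selectors
import socket

# A connected socket pair stands in for the host <-> target link.
client_sock, server_sock = socket.socketpair()

selector = selectors.DefaultSelector()
selector.register(server_sock, selectors.EVENT_READ)

# timeout <= 0: the call does not block; nothing was sent yet, so nothing is ready.
ready_before = selector.select(timeout=0)

client_sock.sendall(b"ping")

# timeout > 0: waits at most one second for the socket to become readable.
ready_after = selector.select(timeout=1.0)
received = b"".join(key.fileobj.recv(4) for key, _ in ready_after)

# timeout=None would block until a monitored file object becomes ready.

selector.unregister(server_sock)
selector.close()
client_sock.close()
server_sock.close()
```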
- initialize_client() bool ¶
Initializes client side of the runtime protocol.
The client side is supposed to run on host testing the target hardware.
The parameters for the client should be provided in the constructor.
- initialize_server() bool ¶
Initializes server side of the runtime protocol.
The server side is supposed to run on target hardware.
The parameters for the server should be provided in the constructor.
- receive_confirmation() tuple[bool, bytes | None] ¶
Waits until the OK message is received.
Method waits for the OK message from the other side of connection.
- receive_data(connection: Any, mask: int) tuple[ServerStatus, Any | None] ¶
Receives data from the target device.
- receive_message(timeout: float | None = None) tuple[ServerStatus, Message] ¶
Waits for incoming data from the other side of connection.
This method should wait for the input data to arrive and return the appropriate status code along with received data.
- Parameters:¶
- timeout : Optional[float]¶
Receive timeout in seconds. If timeout > 0, this specifies the maximum wait time, in seconds. If timeout <= 0, the call won’t block, and will report the currently ready file objects. If timeout is None, the call will block until a monitored file object becomes ready.
- Returns:¶
Tuple containing server status and received message. The status is NOTHING if message is incomplete and DATA_READY if it is complete.
- Return type:¶
Tuple[ServerStatus, Message]
- request_failure() bool ¶
Sends an ERROR message back to the client if the request could not be handled.
- request_processing(get_time_func: Callable[[], float] = time.perf_counter) bool ¶
Requests processing of input data and waits for acknowledgement.
This method triggers inference on the target device and waits until the inference has finished.
The processing time on the target device is measured from the host's perspective.
The target may send its own measurements in the statistics.
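Host-side timing with an injectable clock, as suggested by the get_time_func parameter, can be sketched as below. This is not the library's code: timed_processing, send_process_request and wait_for_ok are hypothetical names, and the two callables stand in for whatever the transport does.

```python
from time import perf_counter
from typing import Callable, Tuple


def timed_processing(
    send_process_request: Callable[[], None],
    wait_for_ok: Callable[[], bool],
    get_time_func: Callable[[], float] = perf_counter,
) -> Tuple[bool, float]:
    """Returns (success, duration) measured between the two confirmations."""
    send_process_request()
    if not wait_for_ok():  # first OK: inference started
        return False, 0.0
    start = get_time_func()
    if not wait_for_ok():  # second OK: inference finished
        return False, 0.0
    return True, get_time_func() - start


# A fake clock makes the measured duration deterministic.
ticks = iter([10.0, 12.5])
ok, duration = timed_processing(
    send_process_request=lambda: None,
    wait_for_ok=lambda: True,
    get_time_func=lambda: next(ticks),
)
```

Passing the clock as a parameter lets the same code run with a different time source, for example a simulated one, without changing the measuring logic.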
- request_success(data: bytes | None = b'') bool ¶
Sends OK message back to the client once the request is finished.
- send_data(data: Any) bool ¶
Sends data to the target device.
Data can be a model to use, input to process, or additional configuration.
- upload_input(data: bytes) bool ¶
Uploads input to the target device and waits for acknowledgement.
This method should wait until the target device confirms the data is delivered and preprocessed for inference.
- upload_io_specification(path: Path) bool ¶
Uploads input/output specification to the target device.
This method takes the specification in a json format from the given Path and sends it to the target device.
This method should receive the status of uploading the data to the target.
Measurements¶
The kenning.core.measurements module contains Measurements and MeasurementsCollector classes for collecting performance and quality metrics.
Measurements is a dict-like object that provides various methods for adding performance metrics, adding values for time series, and updating existing values.
The dictionary held by Measurements requires serializable data, since most scripts save performance results in JSON format for later report generation.
Module containing decorators for benchmark data gathering.
- class kenning.core.measurements.Measurements¶
Stores benchmark measurements for later processing.
This is a dict-like object that wraps all processing results for later report generation.
The dictionary in Measurements has the measurement type as a key and a list of values for that measurement type as the value.
A measurement type can also hold a value other than a list, but this requires explicit initialization.
- accumulate(measurementtype: str, valuetoadd: Any, initvaluefunc: Callable[[], Any] = <function Measurements.<lambda>>) list ¶
Adds given value to a measurement.
This function adds the given value (it can be an integer, float, numpy array, or any type that implements the iadd operator).
If it is the first assignment to a given measurement type, the first list element is initialized with initvaluefunc (a function that returns the initial value).
- Parameters:¶
- measurementtype : str
The name of the measurement.
- valuetoadd : Any
New value to add to the measurement.
- initvaluefunc : Callable[[], Any]
Function returning the initial value of the measurement; by default it returns 0.
- add_measurement(measurementtype: str, value: Any, initialvaluefunc: Callable[[], Any] = <function Measurements.<lambda>>)¶
Add new value to a given measurement type.
- Parameters:¶
- measurementtype : str
The measurement type to be updated.
- value : Any
The value to add.
- initialvaluefunc : Callable
Function returning the initial value for the measurement.
- add_measurements_list(measurementtype: str, valueslist: list)¶
Adds new values to a given measurement type.
- clear()¶
Clears measurement data.
- get_values(measurementtype: str) list ¶
Returns list of values for a given measurement type.
- initialize_measurement(measurement_type: str, value: Any)¶
Sets the initial value for a given measurement type.
By default, the initial values for every measurement are empty lists. Lists are meant to collect time series data and other probed measurements for further analysis.
In case the data is collected in a different container, it should be configured explicitly.
- update_measurements(other: dict | Measurements)¶
Adds measurements of types given in the other object.
It requires another Measurements object, or a dictionary that has string keys and values that are lists of values. The lists from the other object are appended to the lists in this object.
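The behaviour described for add_measurement, accumulate and update_measurements can be illustrated with a simplified stand-in. SimpleMeasurements is a hypothetical class written for this example and is not Kenning's implementation; the measurement names are made up.

```python
import json
from typing import Any, Callable, Dict


class SimpleMeasurements:
    """Simplified stand-in for the dict-like container described above."""

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}

    def add_measurement(
        self,
        measurementtype: str,
        value: Any,
        initialvaluefunc: Callable[[], Any] = list,
    ) -> None:
        # The first use creates the container, later uses append to it.
        if measurementtype not in self.data:
            self.data[measurementtype] = initialvaluefunc()
        self.data[measurementtype].append(value)

    def accumulate(
        self,
        measurementtype: str,
        valuetoadd: Any,
        initvaluefunc: Callable[[], Any] = lambda: 0,
    ) -> list:
        # The first list element holds the running total.
        if measurementtype not in self.data:
            self.data[measurementtype] = [initvaluefunc()]
        self.data[measurementtype][0] += valuetoadd
        return self.data[measurementtype]

    def update_measurements(self, other: Dict[str, list]) -> None:
        # Lists from the other object are appended to the lists in this one.
        for measurementtype, values in other.items():
            self.data.setdefault(measurementtype, []).extend(values)


measurements = SimpleMeasurements()
measurements.add_measurement("inference_ns", 1200)
measurements.add_measurement("inference_ns", 1350)
measurements.accumulate("total_samples", 1)
measurements.accumulate("total_samples", 1)
measurements.update_measurements({"inference_ns": [1100]})

# Only serializable values are stored, so the result can be saved as JSON.
serialized = json.dumps(measurements.data, sort_keys=True)
```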
- class kenning.core.measurements.MeasurementsCollector¶
It is a ‘static’ class collecting measurements from various sources.
- classmethod clear()¶
Clears measurement data.
- classmethod save_measurements(resultpath: Path)¶
Saves measurements to JSON file.
- class kenning.core.measurements.SystemStatsCollector(prefix: str, step: float = 0.1)¶
It is a separate thread used for collecting system statistics.
It collects:
CPU utilization,
RAM utilization,
GPU utilization,
GPU Memory utilization.
It can be executed in parallel to another function to check its utilization of resources.
- get_measurements()¶
Returns measurements from the thread.
Collected measurements names are prefixed by the prefix given in the constructor.
The list of measurements:
<prefix>_cpus_percent: gives per-core CPU utilization (%),
<prefix>_mem_percent: gives overall memory usage (%),
<prefix>_gpu_utilization: gives overall GPU utilization (%),
<prefix>_gpu_mem_utilization: gives overall GPU memory utilization (%),
<prefix>_timestamp: gives the timestamp of above measurements (ns).
- run()¶
Method representing the thread’s activity.
You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.
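A sampling thread of this kind can be sketched with the standard library alone. SamplingThread and its probe argument are hypothetical: the probe stands in for a real utilization reading, and the class only mimics the prefix and step parameters described above.

```python
import threading
import time
from typing import Callable, Dict, List


class SamplingThread(threading.Thread):
    """Calls probe() every `step` seconds until stop() is called."""

    def __init__(self, prefix: str, probe: Callable[[], float], step: float = 0.1):
        super().__init__(daemon=True)
        self.prefix = prefix
        self.probe = probe
        self.step = step
        self.values: List[float] = []
        self.timestamps: List[int] = []
        self._stop_event = threading.Event()

    def run(self) -> None:
        while True:
            self.values.append(self.probe())
            self.timestamps.append(time.perf_counter_ns())
            # wait() returns True as soon as stop() sets the event
            if self._stop_event.wait(self.step):
                break

    def stop(self) -> None:
        self._stop_event.set()
        self.join()

    def get_measurements(self) -> Dict[str, list]:
        # Measurement names are prefixed with the prefix given in the constructor.
        return {
            f"{self.prefix}_value": self.values,
            f"{self.prefix}_timestamp": self.timestamps,
        }


collector = SamplingThread("demo", probe=lambda: 1.0, step=0.01)
collector.start()
time.sleep(0.05)  # the code being observed would run here
collector.stop()
collected = collector.get_measurements()
```

Waiting on an Event instead of calling time.sleep() in the loop lets stop() interrupt the pause, so shutdown does not have to wait for a full step.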
- kenning.core.measurements.systemstatsmeasurements(measurementname: str, step: float = 0.5)¶
Decorator for measuring memory usage of the function.
Check SystemStatsCollector.get_measurements for list of delivered measurements.
- kenning.core.measurements.tagmeasurements(tagname: str)¶
Decorator for adding tags for measurements and saving their timestamps.
- kenning.core.measurements.timemeasurements(measurementname: str)¶
Decorator for measuring time of the function.
The duration is given in nanoseconds.
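The general shape of a time-measuring decorator that reports nanoseconds is sketched below. The names timed and collected are hypothetical and the plain dictionary is a stand-in for wherever the real decorator stores its results.

```python
import functools
import time
from collections import defaultdict

# Hypothetical store for this sketch: measurement name -> list of durations.
collected = defaultdict(list)


def timed(measurementname: str):
    """Decorator factory recording the wall-clock duration of each call (ns)."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter_ns()
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even if the wrapped function raises.
                collected[measurementname].append(time.perf_counter_ns() - start)

        return wrapper

    return decorator


@timed("square_ns")
def square(x: int) -> int:
    return x * x


result = square(7)
```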
ONNXConversion¶
The ONNXConversion object contains methods for converting models from various frameworks to ONNX and vice versa.
It also provides methods for testing the conversion process empirically on a list of deep learning models implemented in the tested frameworks.
- class kenning.core.onnxconversion.ONNXConversion(framework, version)¶
Creates ONNX conversion support matrix for given framework and models.
- add_entry(name, modelgenerator, **kwargs)¶
Adds new model for verification.
- Parameters:¶
- name : str¶
Full name of the model, should match the name of the same models in other framework’s implementations.
- modelgenerator : Callable¶
Function that generates the model for ONNX conversion in a given framework. The callable should accept no arguments.
- **kwargs : Dict[str, Any]
Additional arguments that are passed to ModelEntry object as parameters.
- check_conversions(modelsdir: Path) list[Support] ¶
Runs ONNX conversion for every model entry in the list of models.
- onnx_export(modelentry: ModelEntry, exportpath: Path)¶
Virtual function for exporting the model to ONNX in a given framework.
This method needs to be implemented for a given framework in inheriting class.
- onnx_import(modelentry: ModelEntry, importpath: Path)¶
Virtual function for importing ONNX model to a given framework.
This method needs to be implemented for a given framework in inheriting class.
- prepare()¶
Virtual function for preparing the ONNX conversion test.
This method should add model entries using the add_entry method.
It is later called in the constructor to prepare the list of models to test.
DataProvider¶
The DataProvider classes are used during deployment to provide data for inference.
They can provide data from such sources as a camera, video files, microphone data or a TCP connection.
The available DataProvider implementations are included in the kenning.dataproviders submodule.
Example implementations:
CameraDataProvider for capturing frames from camera.
- class kenning.core.dataprovider.DataProvider(inputs_sources: dict[str, tuple[int, str]] = {}, inputs_specs: dict[str, dict] = {}, outputs: dict[str, str] = {})¶
- abstract detach_from_source()¶
Detaches from the source during shutdown.
- abstract fetch_input() Any ¶
Gets the sample from device.
- prepare()¶
Prepares the source for data gathering depending on the source type.
For example, this will initialize the camera and assign it to self.device.
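The prepare, fetch_input and detach_from_source life cycle can be illustrated without Kenning installed. DataProviderSketch is a hypothetical stand-in that mirrors only the three methods listed above, and ListDataProvider is a made-up provider that reads samples from a list instead of a camera.

```python
from abc import ABC, abstractmethod
from typing import Any, Iterable


class DataProviderSketch(ABC):
    """Stand-in mirroring the methods listed above (not Kenning's class)."""

    def prepare(self) -> None:
        """Prepares the source; optional in this sketch."""

    @abstractmethod
    def fetch_input(self) -> Any:
        """Gets one sample from the source."""

    @abstractmethod
    def detach_from_source(self) -> None:
        """Releases the source during shutdown."""


class ListDataProvider(DataProviderSketch):
    """Provides samples from an in-memory list."""

    def __init__(self, samples: Iterable[Any]):
        self.samples = list(samples)
        self.device = None

    def prepare(self) -> None:
        # The "device" is an iterator here; a camera handle would go in its place.
        self.device = iter(self.samples)

    def fetch_input(self) -> Any:
        return next(self.device)

    def detach_from_source(self) -> None:
        self.device = None


provider = ListDataProvider(["frame-0", "frame-1"])
provider.prepare()
first = provider.fetch_input()
second = provider.fetch_input()
provider.detach_from_source()
```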
OutputCollector¶
The OutputCollector classes are used during deployment for receiving and processing inference results.
They can display the results, send them, or store them in a file.
The available output collector implementations are included in the kenning.outputcollectors submodule.
Example implementations:
DetectionVisualizer for visualizing detection model outputs,
BaseRealTimeVisualizer base class for real time visualizers:
RealTimeDetectionVisualizer for visualizing detection model outputs,
RealTimeSegmentationVisualization for visualizing segmentation model outputs,
RealTimeClassificationVisualization for visualizing classification model outputs.
- class kenning.core.outputcollector.OutputCollector(inputs_sources: dict[str, tuple[int, str]] = {}, inputs_specs: dict[str, dict] = {}, outputs: dict[str, str] = {})¶
- abstract detach_from_output()¶
Detaches from the output during shutdown.
- process_output(input_data: Any, output_data: Any)¶
Returns the inferred data back to the specific place/device/connection.
E.g. it can save a video file with bounding boxes on objects, stream it via a TCP connection, or just show it on screen.
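A collector that simply stores results in memory shows the two methods in use. OutputCollectorSketch and InMemoryCollector are hypothetical names; the stand-in base class mirrors only the methods listed above, with detach_from_output as the abstract one.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Tuple


class OutputCollectorSketch(ABC):
    """Stand-in mirroring the methods listed above (not Kenning's class)."""

    def process_output(self, input_data: Any, output_data: Any) -> None:
        raise NotImplementedError

    @abstractmethod
    def detach_from_output(self) -> None:
        """Releases the output during shutdown."""


class InMemoryCollector(OutputCollectorSketch):
    """Keeps (input, output) pairs in a list instead of displaying them."""

    def __init__(self) -> None:
        self.records: List[Tuple[Any, Any]] = []
        self.closed = False

    def process_output(self, input_data: Any, output_data: Any) -> None:
        self.records.append((input_data, output_data))

    def detach_from_output(self) -> None:
        self.closed = True


collector = InMemoryCollector()
collector.process_output("frame-0", {"label": "cat"})
collector.detach_from_output()
```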
ArgumentsHandler¶
The ArgumentsHandler class is responsible for concatenating arguments_structure and creating parsers for command line and JSON config arguments.
To make a class instantiable from command line arguments or a JSON config, it must inherit from this class or one of its child classes and implement the from_argparse or from_json methods, as described in Defining arguments for core classes.
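The idea of concatenating arguments_structure across a class hierarchy and turning it into a parser can be sketched with argparse. The entry keys used here (argparse_name, type, default, description) are assumptions made for illustration, as are the helper names collect_arguments and build_parser and the two example classes.

```python
import argparse
from typing import Any, Dict, Tuple


class BaseComponent:
    # Hypothetical entry layout; the keys are assumptions of this sketch.
    arguments_structure = {
        "verbosity": {
            "argparse_name": "--verbosity",
            "type": int,
            "default": 0,
            "description": "Logging verbosity",
        },
    }


class NetworkComponent(BaseComponent):
    arguments_structure = {
        "port": {
            "argparse_name": "--port",
            "type": int,
            "default": 8080,
            "description": "Port to connect to",
        },
    }


def collect_arguments(cls: type) -> Dict[str, Dict[str, Any]]:
    """Merges arguments_structure from the class and all its parent classes."""
    merged: Dict[str, Dict[str, Any]] = {}
    for klass in reversed(cls.__mro__):
        # vars() looks only at the class itself, not at inherited attributes.
        merged.update(vars(klass).get("arguments_structure", {}))
    return merged


def build_parser(cls: type) -> Tuple[argparse.ArgumentParser, Any]:
    """Creates a parser with one argument group for the given class."""
    parser = argparse.ArgumentParser(add_help=False)
    group = parser.add_argument_group(f"{cls.__name__} arguments")
    for name, spec in collect_arguments(cls).items():
        group.add_argument(
            spec["argparse_name"],
            dest=name,
            type=spec["type"],
            default=spec.get("default"),
            help=spec.get("description"),
        )
    return parser, group


parser, group = build_parser(NetworkComponent)
args = parser.parse_args(["--port", "12345"])
```

Walking the method resolution order from the most generic class to the most specific one lets a child class override an entry defined by its parent.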
- class kenning.utils.args_manager.ArgumentsHandler¶
Class responsible for creating parsers for arguments from command line or json configs.
The child class should define its own arguments_structure and from_argparse/from_json methods so that it can be instantiated from command line arguments or a json config.
- classmethod form_argparse() tuple[ArgumentParser, _ArgumentGroup] ¶
Creates an argparse parser based on the arguments_structure of the class and all of its parent classes.