Using Kenning via command line arguments¶
Kenning provides several scripts for training, compilation and benchmarking of deep learning models on various target hardware. The executable scripts are present in the kenning.scenarios module. Sample bash scripts using the scenarios are located in the scripts directory in the repository.
Runnable scripts in scenarios require implemented classes to be provided from the kenning.core
module to perform such actions as in-framework inference, model training, model compilation and model benchmarking on target.
To run below examples it is required to install Kenning with dependencies as follows:
pip install "kenning[tensorflow,tvm] @ git+https://github.com/antmicro/kenning.git"
Command-line arguments for classes¶
Each class (Dataset, ModelWrapper, Optimizer and other) provided to the runnable scripts in scenarios can provide command-line arguments that configure the work of an object of the given class.
Each class in kenning.core
implements form_argparse
and from_argparse
methods.
The former creates an argparse
group for a given class with its parameters.
The latter takes the arguments parsed by argparse
and returns the object of a class.
Autocompletion for command line interface¶
Kenning provides autocompletion for its command line interface.
This feature requires additional configuration to work properly, which can be done using kenning completion
command.
Optionally, it can be configured as described in argcomplete documentation.
Model training¶
kenning.scenarios.model_training
performs model training using Kenning’s ModelWrapper and Dataset objects.
To get the list of training parameters, select the model and training dataset to use (i.e. TensorFlowPetDatasetMobileNetV2
model and PetDataset
dataset) and run:
kenning train \
--modelwrapper-cls kenning.modelwrappers.classification.tensorflow_pet_dataset.TensorFlowPetDatasetMobileNetV2 \
--dataset-cls kenning.datasets.pet_dataset.PetDataset \
-h
This will list the possible parameters that can be used to configure the dataset, the model, and the training parameters. For the above call, the output is as follows:
common arguments:
-h, --help show this help message and exit
--verbosity {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Verbosity level
'train' arguments:
--modelwrapper-cls MODELWRAPPER_CLS
ModelWrapper-based class with inference implementation to import
--dataset-cls DATASET_CLS
Dataset-based class with dataset to import
--batch-size BATCH_SIZE
The batch size for training
--learning-rate LEARNING_RATE
The learning rate for training
--num-epochs NUM_EPOCHS
Number of epochs to train for
--logdir LOGDIR Path to the training logs directory
ModelWrapper arguments:
--model-path MODEL_PATH
Path to the model
PetDataset arguments:
--classify-by {species,breeds}
Determines if classification should be performed by species or by breeds
--image-memory-layout {NHWC,NCHW}
Determines if images should be delivered in NHWC or NCHW format
Dataset arguments:
--dataset-root DATASET_ROOT
Path to the dataset directory
--inference-batch-size INFERENCE_BATCH_SIZE
The batch size for providing the input data
--download-dataset Downloads the dataset before taking any action. If the dataset files are already downloaded and
the checksum is correct then they are not downloaded again. Is enabled by default.
--force-download-dataset
Forces dataset download
--external-calibration-dataset EXTERNAL_CALIBRATION_DATASET
Path to the directory with the external calibration dataset
--split-fraction-test SPLIT_FRACTION_TEST
Default fraction of data to leave for model testing
--split-fraction-val SPLIT_FRACTION_VAL
Default fraction of data to leave for model valdiation
--split-seed SPLIT_SEED
Default seed used for dataset split
Note
The list of options depends on ModelWrapper and Dataset.
At the end, the training can be configured as follows:
kenning train \
--modelwrapper-cls kenning.modelwrappers.classification.tensorflow_pet_dataset.TensorFlowPetDatasetMobileNetV2 \
--dataset-cls kenning.datasets.pet_dataset.PetDataset \
--logdir build/logs \
--dataset-root build/pet-dataset \
--model-path build/trained-model.h5 \
--batch-size 32 \
--learning-rate 0.0001 \
--num-epochs 50
This will train the model with a 0.0001 learning rate and batch size 32 for 50 epochs.
The trained model will be saved as build/trained-model.h5
.
In-framework inference performance measurements¶
The kenning.scenarios.inference_performance
script runs inference on a given model in a framework it was trained on.
It requires you to provide:
a ModelWrapper-based object wrapping the model to be tested,
a Dataset-based object wrapping the dataset applicable to the model,
a path to the output JSON file with performance and quality metrics gathered during inference by the Measurements object.
The example call for the method is as follows:
kenning test \
--modelwrapper-cls kenning.modelwrappers.classification.tensorflow_pet_dataset.TensorFlowPetDatasetMobileNetV2 \
--dataset-cls kenning.datasets.pet_dataset.PetDataset \
--measurements build/tensorflow_pet_dataset_mobilenetv2.json \
--model-path kenning:///models/classification/tensorflow_pet_dataset_mobilenetv2.h5 \
--dataset-root build/pet-dataset/
The script downloads the dataset to the build/pet-dataset
directory, loads the tensorflow_pet_dataset_mobilenetv2.h5
model, runs inference on all images from the dataset and collects performance and quality metrics throughout the run.
The performance data stored in the JSON file can be later rendered using Generating performance reports.
Testing inference on target hardware¶
The kenning.scenarios.inference_tester
and kenning.scenarios.inference_server
are used for inference testing on target hardware.
The inference_tester
loads the dataset and the model, compiles the model and runs inference either locally or remotely using inference_server
.
The inference_server
receives the model and input data, and sends output data and statistics.
Both inference_tester
and inference_server
require Runtime to determine the model execution flow.
Both scripts communicate using the communication protocol implemented in the Protocol.
At the end, the inference_tester
returns the benchmark data in the form of a JSON file extracted from the Measurements object.
The kenning.scenarios.inference_tester
requires:
a ModelWrapper-based class that implements model loading, I/O processing and optionally model conversion to ONNX format,
a Runtime-based class that implements data processing and the inference method for the compiled model on the target hardware,
a Dataset-based class that implements data sample fetching and model evaluation,
a path to the output JSON file with performance and quality metrics.
An Optimizer-based class can be provided to compile the model for a given target if needed.
Optionally, it requires a Protocol-based class when running remotely to communicate with the kenning.scenarios.inference_server
.
To print the list of required arguments, run:
kenning optimize test \
--modelwrapper-cls kenning.modelwrappers.classification.tensorflow_pet_dataset.TensorFlowPetDatasetMobileNetV2 \
--runtime-cls kenning.runtimes.tvm.TVMRuntime \
--dataset-cls kenning.datasets.pet_dataset.PetDataset \
--compiler-cls kenning.optimizers.tvm.TVMCompiler \
--protocol-cls kenning.protocols.network.NetworkProtocol \
-h
With the above classes, the help can look as follows:
common arguments:
-h, --help show this help message and exit
--verbosity {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Verbosity level
--convert-to-onnx CONVERT_TO_ONNX
Before compiling the model, convert it to ONNX and use in compilation (provide a path to save here)
--measurements MEASUREMENTS
The path to the output JSON file with measurements
Inference configuration with JSON:
Configuration with pipeline defined in JSON file. This section is not compatible with 'Inference configuration with flags'. Arguments with '*' are required.
--json-cfg JSON_CFG * The path to the input JSON file with configuration of the inference
Inference configuration with flags:
Configuration with flags. This section is not compatible with 'Inference configuration with JSON'. Arguments with '*' are required.
--modelwrapper-cls MODELWRAPPER_CLS
* ModelWrapper-based class with inference implementation to import
--dataset-cls DATASET_CLS
* Dataset-based class with dataset to import
--compiler-cls COMPILER_CLS
* Optimizer-based class with compiling routines to import
--runtime-cls RUNTIME_CLS
Runtime-based class with the implementation of model runtime
--protocol-cls PROTOCOL_CLS
Protocol-based class with the implementation of communication between inference
tester and inference runner
ModelWrapper arguments:
--model-path MODEL_PATH
Path to the model
PetDataset arguments:
--classify-by {species,breeds}
Determines if classification should be performed by species or by breeds
--image-memory-layout {NHWC,NCHW}
Determines if images should be delivered in NHWC or NCHW format
Dataset arguments:
--dataset-root DATASET_ROOT
Path to the dataset directory
--inference-batch-size INFERENCE_BATCH_SIZE
The batch size for providing the input data
--download-dataset Downloads the dataset before taking any action. If the dataset files are already downloaded and the checksum is correct then they are not downloaded again. Is enabled by default.
--force-download-dataset
Forces dataset download
--external-calibration-dataset EXTERNAL_CALIBRATION_DATASET
Path to the directory with the external calibration dataset
--split-fraction-test SPLIT_FRACTION_TEST
Default fraction of data to leave for model testing
--split-fraction-val SPLIT_FRACTION_VAL
Default fraction of data to leave for model valdiation
--split-seed SPLIT_SEED
Default seed used for dataset split
TVMRuntime arguments:
--save-model-path SAVE_MODEL_PATH
Path where the model will be uploaded
--target-device-context {llvm,stackvm,cpu,c,test,hybrid,composite,cuda,nvptx,cl,opencl,sdaccel,aocl,aocl_sw_emu,vulkan,metal,vpi,rocm,ext_dev,hexagon,webgpu}
What accelerator should be used on target device
--target-device-context-id TARGET_DEVICE_CONTEXT_ID
ID of the device to run the inference on
--runtime-use-vm At runtime use the TVM Relay VirtualMachine
Runtime arguments:
--disable-performance-measurements
Disable collection and processing of performance metrics
TVMCompiler arguments:
--model-framework {keras,onnx,darknet,torch,tflite}
The input type of the model, framework-wise
--target TARGET The kind or tag of the target device
--target-host TARGET_HOST
The kind or tag of the host (CPU) target device
--opt-level OPT_LEVEL
The optimization level of the compilation
--libdarknet-path LIBDARKNET_PATH
Path to the libdarknet.so library, for darknet models
--compile-use-vm At compilation stage use the TVM Relay VirtualMachine
--output-conversion-function {default,dict_to_tuple}
The type of output conversion function used for PyTorch conversion
--conv2d-data-layout CONV2D_DATA_LAYOUT
Configures the I/O layout for the CONV2D operations
--conv2d-kernel-layout CONV2D_KERNEL_LAYOUT
Configures the kernel layout for the CONV2D operations
--use-fp16-precision Applies conversion of FP32 weights to FP16
--use-int8-precision Applies conversion of FP32 weights to INT8
--use-tensorrt For CUDA targets: delegates supported operations to TensorRT
--dataset-percentage DATASET_PERCENTAGE
Tells how much data from the calibration dataset (training or external) will be used for calibration dataset
Optimizer arguments:
--compiled-model-path COMPILED_MODEL_PATH
The path to the compiled model output
NetworkProtocol arguments:
--host HOST The address to the target device
--port PORT The port for the target device
BytesBasedProtocol arguments:
--packet-size PACKET_SIZE
The maximum size of the received packets, in bytes.
--endianness {big,little}
The endianness of data to transfer
The kenning.scenarios.inference_server
requires only:
a Protocol-based class for the implementation of the communication,
a Runtime-based class for the implementation of runtime routines on device.
Both classes may require some additional arguments that can be listed with the -h
flag.
An example script for the inference_tester
is:
kenning optimize test \
--modelwrapper-cls kenning.modelwrappers.classification.tensorflow_pet_dataset.TensorFlowPetDatasetMobileNetV2 \
--runtime-cls kenning.runtimes.tflite.TFLiteRuntime \
--dataset-cls kenning.datasets.pet_dataset.PetDataset \
--measurements ./build/google-coral-devboard-tflite-tensorflow.json \
--compiler-cls kenning.optimizers.tflite.TFLiteCompiler \
--protocol-cls kenning.protocols.network.NetworkProtocol \
--model-path kenning:///models/classification/tensorflow_pet_dataset_mobilenetv2.h5 \
--model-framework keras \
--target "edgetpu" \
--compiled-model-path build/compiled-model.tflite \
--inference-input-type int8 \
--inference-output-type int8 \
--host 192.168.188.35 \
--port 12344 \
--packet-size 32768 \
--save-model-path /home/mendel/compiled-model.tflite \
--dataset-root build/pet-dataset \
--inference-batch-size 1 \
--verbosity INFO
The above runs with the following inference_server
setup:
kenning server \
--protocol-cls kenning.protocols.network.NetworkProtocol \
--runtime-cls kenning.runtimes.tflite.TFLiteRuntime \
--host 0.0.0.0 \
--port 12344 \
--packet-size 32768 \
--save-model-path /home/mendel/compiled-model.tflite \
--delegates-list libedgetpu.so.1 \
--verbosity INFO
Note
This run was tested on a Google Coral Devboard device.
kenning.scenarios.inference_tester
can be also executed locally - in this case, the --protocol-cls
argument can be skipped.
The example call is as follows:
kenning optimize test \
--modelwrapper-cls kenning.modelwrappers.classification.tensorflow_pet_dataset.TensorFlowPetDatasetMobileNetV2 \
--runtime-cls kenning.runtimes.tvm.TVMRuntime \
--dataset-cls kenning.datasets.pet_dataset.PetDataset \
--measurements ./build/local-cpu-tvm-tensorflow-classification.json \
--compiler-cls kenning.optimizers.tvm.TVMCompiler \
--model-path kenning:///models/classification/tensorflow_pet_dataset_mobilenetv2.h5 \
--model-framework keras \
--target "llvm" \
--compiled-model-path ./build/compiled-model.tar \
--opt-level 3 \
--save-model-path ./build/compiled-model.tar \
--target-device-context cpu \
--dataset-root ./build/pet-dataset/ \
--inference-batch-size 1 \
--verbosity INFO
Note
For more examples of running inference_tester
and inference_server
, check the kenning/scripts directory.
Directories with scripts for client and server calls for various target devices, deep learning frameworks and compilation frameworks can be found in the kenning/scripts/edge-runtimes directory.
Running inference¶
kenning.scenarios.inference_runner
is used to run inference locally on a pre-compiled model.
kenning.scenarios.inference_runner
requires:
a ModelWrapper-based class that performs I/O processing specific to the model,
a Runtime-based class that runs inference on target using the compiled model,
a DataProvider-based class that implements fetching of data samples from various sources,
a list of OutputCollector-based classes that implement output processing for the specific use case.
To print the list of required arguments, run:
python3 -m kenning.scenarios.inference_runner \
kenning.modelwrappers.object_detection.darknet_coco.TVMDarknetCOCOYOLOV3 \
kenning.runtimes.tvm.TVMRuntime \
kenning.dataproviders.camera_dataprovider.CameraDataProvider \
--output-collectors kenning.outputcollectors.name_printer.NamePrinter \
-h
With the above classes, the help can look as follows:
positional arguments:
modelwrappercls ModelWrapper-based class with inference implementation to import
runtimecls Runtime-based class with the implementation of model runtime
dataprovidercls DataProvider-based class used for providing data
optional arguments:
-h, --help show this help message and exit
--output-collectors OUTPUT_COLLECTORS [OUTPUT_COLLECTORS ...]
List to the OutputCollector-based classes where the results will be passed
--verbosity {DEBUG,INFO,WARNING,ERROR,CRITICAL}
Verbosity level
Inference model arguments:
--model-path MODEL_PATH
Path to the model
--classes CLASSES File containing Open Images class IDs and class names in CSV format to use (can be generated using
kenning.scenarios.open_images_classes_extractor) or class type
Runtime arguments:
--disable-performance-measurements
Disable collection and processing of performance metrics
--save-model-path SAVE_MODEL_PATH
Path where the model will be uploaded
--target-device-context {llvm,stackvm,cpu,c,cuda,nvptx,cl,opencl,aocl,aocl_sw_emu,sdaccel,vulkan,metal,vpi,rocm,ext_dev,hexagon,webgpu}
What accelerator should be used on target device
--target-device-context-id TARGET_DEVICE_CONTEXT_ID
ID of the device to run the inference on
--input-dtype INPUT_DTYPE
Type of input tensor elements
--runtime-use-vm At runtime use the TVM Relay VirtualMachine
--use-json-at-output Encode outputs of models into a JSON file with base64-encoded arrays
DataProvider arguments:
--video-file-path VIDEO_FILE_PATH
Video file path (for cameras, use /dev/videoX where X is the device ID eg. /dev/video0)
--image-memory-layout {NHWC,NCHW}
Determines if images should be delivered in NHWC or NCHW format
--image-width IMAGE_WIDTH
Determines the width of the image for the model
--image-height IMAGE_HEIGHT
Determines the height of the image for the model
OutputCollector arguments:
--print-type {detector,classificator}
What is the type of model that will input data to the NamePrinter
An example script for inference_runner
:
python3 -m kenning.scenarios.inference_runner \
kenning.modelwrappers.object_detection.darknet_coco.TVMDarknetCOCOYOLOV3 \
kenning.runtimes.tvm.TVMRuntime \
kenning.dataproviders.camera_dataprovider.CameraDataProvider \
--output-collectors kenning.outputcollectors.detection_visualizer.DetectionVisualizer kenning.outputcollectors.name_printer.NamePrinter \
--disable-performance-measurements \
--model-path kenning:///models/object_detection/yolov3.weights \
--save-model-path ../compiled-model.tar \
--target-device-context "cuda" \
--verbosity INFO \
--video-file-path /dev/video0
Generating performance reports¶
kenning.scenarios.inference_performance
and kenning.scenarios.inference_tester
return a JSON file as the result of benchmarks.
They contain both performance metrics data, and quality metrics data.
The data from JSON files can be analyzed, processed and visualized by the kenning.scenarios.render_report
script.
This script parses the information in JSON files and returns an RST file with the report, along with visualizations.
It requires:
a JSON file with benchmark data,
a report name for use in the RST file and for creating Sphinx refs to figures,
an RST output file name,
--root-dir
specifying the root directory of the Sphinx documentation where the RST file will be embedded (it is used to compute relative paths),--img-dir
specifying the path where the figures should be saved,--report-types
, which is a list describing the types the report falls into.
An example call and the resulting RST file can be observed in Sample autogenerated report.
As for now, the available report types are:
performance
- this is the most common report type that renders information about overall inference performance metrics, such as inference time, CPU usage, RAM usage, or GPU utilization,classification
- this report is specific to the classification task, it renders classification-specific quality figures and metrics, such as confusion matrices, accuracy, precision, G-mean,detection
- this report is specific to the detection task, it renders detection-specific quality figures and metrics, such as recall-precision curves or mean average precision.
Displaying information about available classes¶
kenning.scenarios.list_classes
and kenning.scenarios.class_info
provide useful information about classes and can help in creating JSON scenarios.
kenning.scenarios.list_classes
will list all available classes by default, though the output can be limited by providing positional arguments representing groups of modules: optimizers
, runners
, dataproviders
, datasets
, modelwrappers
, onnxconversions
, outputcollectors
, runtimes
.
The amount of information displayed can be controlled using -v
and -vv
flags.
To print available arguments run python -m kenning.scenarios.list_classes -h
.
kenning.scenarios.class_info
provides information about a class given in an argument. More precisely, it will display:
module and class docstrings,
dependencies along with the information whether they are available in the current python environment,
supported input and output formats,
arguments structure used in JSON configurations.
The script uses a module-like path to the file (e.g. kenning.runtimes.tflite
), but optionally a class can be specified by adding it to the path like so: kenning.runtimes.tflite.TFLiteRuntime
.
To get more detailed information, an optional --load-class-with-args
argument can be passed. This needs all required class arguments to be provided, and that all dependencies are available.
For more detail, check python -m kenning.scenarios.class_info -h
.