Commands used
This section was generated using:
kenning optimize test \
    --json-cfg kenning-scenarios/renode-magic-wand-iree-bare-metal-inference.json \
    --measurements ./results.json \
    --verbosity

kenning report \
    --report-types performance classification renode_stats \
    --report-path springbok-magic-wand/report.md \
    --report-name v-extensions-riscv \
    --model-names magic_wand_fp32 \
    --measurements results.json \
    --verbosity \
    --to-html
General information for magic_wand_fp32
Model framework:
tensorflow ver. 2.14.1
Input JSON:
{
    "dataset": {
        "type": "kenning.datasets.magic_wand_dataset.MagicWandDataset",
        "parameters": {
            "window_size": 128,
            "window_shift": 128,
            "noise_level": 20,
            "dataset_root": "build/MagicWandDataset",
            "inference_batch_size": 1,
            "download_dataset": true,
            "force_download_dataset": false,
            "external_calibration_dataset": null,
            "split_fraction_test": 0.2,
            "split_fraction_val": null,
            "split_seed": 1234,
            "reduce_dataset": 1.0
        }
    },
    "dataconverter": {
        "type": "kenning.dataconverters.modelwrapper_dataconverter.ModelWrapperDataConverter",
        "parameters": {}
    },
    "optimizers": [
        {
            "type": "kenning.optimizers.iree.IREECompiler",
            "parameters": {
                "model_framework": "keras",
                "backend": "llvm-cpu",
                "compiler_args": [
                ],
                "compiled_model_path": "./build/tflite-magic-wand.vmfb",
                "location": "host"
            }
        }
    ],
    "platform": {
        "type": "kenning.platforms.bare_metal.BareMetalPlatform",
        "parameters": {
            "uart_port": "/tmp/uart",
            "uart_baudrate": 115200,
            "uart_log_port": "/tmp/renode_uart_vhb12i1s/uart_log",
            "uart_log_baudrate": 115200,
            "auto_flash": false,
            "openocd_path": "openocd",
            "simulated": true,
            "runtime_binary_path": "build/build-riscv/iree-runtime/iree_runtime",
            "platform_resc_path": "gh://antmicro:kenning-bare-metal-iree-runtime/sim/config/springbok.resc;branch=main",
            "resc_dependencies": [
            ],
            "post_start_commands": [
                "sysbus.vec_controlblock WriteDoubleWord 0xc 0"
            ],
            "disable_opcode_counters": false,
            "disable_profiler": false,
            "profiler_dump_path": "build/profiler.dump",
            "profiler_interval_step": 10.0,
            "runtime_init_log_msg": "Inference server started",
            "runtime_init_timeout": 30,
            "name": "rv32-springbok"
        }
    },
    "protocol": {
        "type": "kenning.protocols.uart.UARTProtocol",
        "parameters": {
            "port": "/tmp/uart",
            "baudrate": 115200,
            "packet_size": 4096,
            "endianness": "little",
            "timeout": 30
        }
    },
    "model_wrapper": {
        "type": "kenning.modelwrappers.classification.tflite_magic_wand.MagicWandModelWrapper",
        "parameters": {
            "window_size": 128,
            "model_path": "kenning:///models/classification/magic_wand.h5",
            "model_name": null
        }
    },
    "runtime": {
        "type": "kenning.runtimes.iree.IREERuntime",
        "parameters": {
            "save_model_path": "./build/tflite-magic-wand.vmfb",
            "driver": "local-sync",
            "llext_binary_path": null,
            "disable_performance_measurements": false
        }
    }
}
Inference performance metrics for magic_wand_fp32
Inference time
Figure 1 Inference time
First inference duration (usually including allocation time): 0.001866450000000075 s,
Mean: 0.001864803642857146 s,
Standard deviation: 1.710471485588794e-05 s,
Median: 0.0018686549999999968 s.
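The four figures above are plain summary statistics over the per-inference durations. The following is a minimal sketch of how such a summary could be computed, not Kenning's own implementation: the function name `summarize_durations`, the sample values, and the use of the population standard deviation are all assumptions made for illustration. The last lines convert the mean reported above into an approximate throughput.

```python
import statistics


def summarize_durations(durations):
    """Summarize a non-empty list of inference durations given in seconds."""
    return {
        # The first pass often includes one-off allocation cost.
        "first": durations[0],
        "mean": statistics.fmean(durations),
        # Population standard deviation; this choice is an assumption.
        "std": statistics.pstdev(durations),
        "median": statistics.median(durations),
    }


# Hypothetical sample durations in seconds, not taken from the report.
sample = [0.0019, 0.0018, 0.0019, 0.0018]
summary = summarize_durations(sample)

# Mean reported above, converted to inferences per second.
REPORTED_MEAN_S = 0.001864803642857146
inferences_per_second = 1.0 / REPORTED_MEAN_S
print(summary)
print(f"{inferences_per_second:.1f} inferences per second")
```

Because the platform is simulated, the throughput derived this way refers to time as measured in the simulation, not to host wall-clock time.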
Inference quality metrics for magic_wand_fp32
Figure 2 Confusion matrix
Accuracy: 0.9928571428571429
Mean precision: 0.9947916662942757
Mean sensitivity: 0.9945652170191094
G-mean: 0.9945203413480738
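All four quality metrics can be derived from the confusion matrix. The sketch below shows one common set of definitions, under two assumptions that the report does not state: rows hold ground-truth classes and columns hold predictions, and G-mean is the geometric mean of the per-class sensitivities. The matrix is hypothetical and is not the one behind Figure 2.

```python
import math


def quality_metrics(cm):
    """Compute accuracy, mean precision, mean sensitivity and G-mean.

    cm[i][j] counts samples of true class i predicted as class j
    (assumed layout). Every class is assumed to have at least one
    sample and at least one prediction, so no division by zero occurs.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    # Sensitivity (recall): correct predictions over the row sum.
    sensitivity = [cm[i][i] / sum(cm[i]) for i in range(n)]
    # Precision: correct predictions over the column sum.
    precision = [cm[j][j] / sum(cm[i][j] for i in range(n)) for j in range(n)]
    return {
        "accuracy": correct / total,
        "mean_precision": sum(precision) / n,
        "mean_sensitivity": sum(sensitivity) / n,
        "g_mean": math.prod(sensitivity) ** (1.0 / n),
    }


# Hypothetical 4-class confusion matrix with a single misclassification.
example = [
    [10, 0, 0, 0],
    [0, 9, 1, 0],
    [0, 0, 10, 0],
    [0, 0, 0, 10],
]
metrics = quality_metrics(example)
print(metrics)
```

Under these definitions the G-mean can never exceed the mean sensitivity, which matches the ordering of the two values reported above.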
Renode performance measurements for magic_wand_fp32
Count of instructions used during inference
Figure 3 Histogram of used instructions during inference
Figure 4 Histogram of V Vector Extension instructions during inference
Executed instructions counters
Figure 5 Count of executed instructions per second for cpu during benchmark
Figure 6 Cumulative count of executed instructions for cpu during benchmark
Peripheral access counters
Figure 7 Count of uart0 reads per second during benchmark
Figure 8 Cumulative count of uart0 reads during benchmark
Figure 9 Count of uart0 writes per second during benchmark
Figure 10 Cumulative count of uart0 writes during benchmark
Instructions stats
Instructions counters per inference pass: 241480
V Vector Extension instructions percentage: 13.558857687523679 %
Top 10 instructions and counters per inference pass:
addi: 44757
sw: 28597
lw: 28077
vle32.v: 13136
bne: 13000
flw: 12278
bltu: 10492
vfmadd.vf: 9552
beq: 9281
lbu: 9062
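To put the counters above in proportion, the sketch below works out what share of one inference pass the ten listed instructions cover and how many vector instructions the reported percentage corresponds to. The numbers are copied from the list above. Treating every mnemonic that starts with `v` as a V extension instruction is a heuristic that happens to hold for this list, and all variable names are illustrative.

```python
# Values copied from the instruction statistics above.
TOTAL_PER_PASS = 241480
VECTOR_PERCENTAGE = 13.558857687523679
TOP_10 = {
    "addi": 44757,
    "sw": 28597,
    "lw": 28077,
    "vle32.v": 13136,
    "bne": 13000,
    "flw": 12278,
    "bltu": 10492,
    "vfmadd.vf": 9552,
    "beq": 9281,
    "lbu": 9062,
}


def share(count, total=TOTAL_PER_PASS):
    """Return count as a percentage of the instructions in one pass."""
    return 100.0 * count / total


# Share of a pass covered by the ten most frequent instructions.
top10_share = share(sum(TOP_10.values()))

# Vector instructions within the top 10 (heuristic: mnemonic starts with "v").
vector_in_top10 = sum(c for name, c in TOP_10.items() if name.startswith("v"))
vector_in_top10_share = share(vector_in_top10)

# Approximate number of vector instructions per pass implied by the percentage.
vector_per_pass = TOTAL_PER_PASS * VECTOR_PERCENTAGE / 100.0

print(f"top 10 share: {top10_share:.2f} %")
print(f"vector instructions in top 10: {vector_in_top10} ({vector_in_top10_share:.2f} %)")
print(f"vector instructions per pass: {vector_per_pass:.0f}")
```

The two vector instructions in the top 10 (`vle32.v` and `vfmadd.vf`) account for most, but not all, of the reported V extension share; the remainder comes from instructions outside the list.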
Memory allocation stats
Host bytes peak: 1536
Host bytes allocated: 215040
Host bytes freed: 213504
Device bytes peak: 33536
Device bytes allocated: 2329536
Device bytes freed: 2312752
Compiled model size: 26944
Host memory is the memory of the CPU that controls the accelerator, while device memory is the memory of the accelerator itself.
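One quick consistency check on these counters is to subtract the freed bytes from the allocated bytes, which gives the number of bytes still allocated when the statistics were collected. The sketch below does this with the values listed above; the interpretation of the difference as "still allocated" is an assumption about how the counters are defined, and the names are illustrative.

```python
# Values copied from the memory allocation statistics above (in bytes).
MEMORY_STATS = {
    "host": {"peak": 1536, "allocated": 215040, "freed": 213504},
    "device": {"peak": 33536, "allocated": 2329536, "freed": 2312752},
}


def outstanding_bytes(stats):
    """Bytes allocated but not freed at the time the stats were collected."""
    return stats["allocated"] - stats["freed"]


outstanding = {name: outstanding_bytes(s) for name, s in MEMORY_STATS.items()}
for name, value in outstanding.items():
    print(f"{name}: {value} bytes outstanding, peak {MEMORY_STATS[name]['peak']}")
```

For the host, the difference equals the reported peak of 1536 bytes; for the device, it is 16784 bytes, about half of the 33536-byte peak.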