v-extensions-riscv

Commands used

Note

This section was generated using:

kenning optimize test \
    --json-cfg \
        kenning-scenarios/renode-magic-wand-iree-bare-metal-inference.json \
    --measurements \
        ./results.json \
    --verbosity \
        INFO

kenning report \
    --report-types \
        performance \
        classification \
        renode_stats \
    --report-path \
        springbok-magic-wand/report.md \
    --report-name \
        v-extensions-riscv \
    --model-names \
        magic_wand_fp32 \
    --measurements \
        results.json \
    --verbosity \
        INFO \
    --to-html \
        report-html

General information for magic_wand_fp32

Model framework:

  • tensorflow ver. 2.11.1

Input JSON:

{
    "dataset": {
        "type": "kenning.datasets.magic_wand_dataset.MagicWandDataset",
        "parameters": {
            "window_size": 128,
            "window_shift": 128,
            "noise_level": 20,
            "dataset_root": "build/MagicWandDataset",
            "inference_batch_size": 1,
            "download_dataset": true,
            "force_download_dataset": false,
            "external_calibration_dataset": null,
            "split_fraction_test": 0.2,
            "split_fraction_val": null,
            "split_seed": 1234
        }
    },
    "model_wrapper": {
        "type": "kenning.modelwrappers.classification.tflite_magic_wand.MagicWandModelWrapper",
        "parameters": {
            "window_size": 128,
            "model_path": "kenning:///models/classification/magic_wand.h5",
            "model_name": null
        }
    },
    "protocol": {
        "type": "kenning.protocols.uart.UARTProtocol",
        "parameters": {
            "port": "/tmp/uart",
            "baudrate": 115200,
            "packet_size": 4096,
            "endianness": "little"
        }
    },
    "runtime": {
        "type": "kenning.runtimes.renode.RenodeRuntime",
        "parameters": {
            "runtime_binary_path": "build/build-riscv/iree-runtime/iree_runtime",
            "platform_resc_path": "sim/config/springbok.resc",
            "resc_dependencies": [
                "sim/config/platforms/springbok.repl",
                "third-party/iree-rv32-springbok/sim/config/infrastructure/SpringbokRiscV32.cs"
            ],
            "zephyr_build_path": null,
            "post_start_commands": [
                "sysbus.vec_controlblock WriteDoubleWord 0xc 0"
            ],
            "runtime_log_uart": null,
            "runtime_log_init_msg": "Runtime started",
            "disable_profiler": false,
            "profiler_dump_path": "build/profiler.dump",
            "profiler_interval_step": 10.0,
            "sensor": null,
            "batches_count": 10,
            "llext_binary_path": null,
            "disable_performance_measurements": false
        }
    },
    "data_converter": {
        "type": "kenning.dataconverters.modelwrapper_dataconverter.ModelWrapperDataConverter",
        "parameters": {}
    },
    "optimizers": [
        {
            "type": "kenning.optimizers.iree.IREECompiler",
            "parameters": {
                "model_framework": "keras",
                "backend": "llvm-cpu",
                "compiler_args": [
                    "iree-llvm-debug-symbols=false",
                    "iree-vm-bytecode-module-strip-source-map=true",
                    "iree-vm-emit-polyglot-zip=false",
                    "iree-llvm-target-triple=riscv32-pc-linux-elf",
                    "iree-llvm-target-cpu=generic-rv32",
                    "iree-llvm-target-cpu-features=+m,+f,+zvl512b,+zve32x,+zve32f",
                    "iree-llvm-target-abi=ilp32"
                ],
                "compiled_model_path": "./build/tflite-magic-wand.vmfb",
                "location": "host"
            }
        }
    ]
}
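
The `compiler_args` entries above are plain `key=value` strings, so the target features requested from the compiler can be read back programmatically. The following is a minimal sketch of that, assuming the argument list is copied verbatim from the scenario; `parse_compiler_args` is a hypothetical helper written for this illustration and is not part of Kenning or IREE.

```python
def parse_compiler_args(args):
    """Split 'key=value' strings into a dict (illustrative helper, not a Kenning API)."""
    parsed = {}
    for arg in args:
        key, _, value = arg.partition("=")
        parsed[key] = value
    return parsed


# Argument list copied from the "optimizers" entry of the scenario above.
compiler_args = [
    "iree-llvm-debug-symbols=false",
    "iree-vm-bytecode-module-strip-source-map=true",
    "iree-vm-emit-polyglot-zip=false",
    "iree-llvm-target-triple=riscv32-pc-linux-elf",
    "iree-llvm-target-cpu=generic-rv32",
    "iree-llvm-target-cpu-features=+m,+f,+zvl512b,+zve32x,+zve32f",
    "iree-llvm-target-abi=ilp32",
]

options = parse_compiler_args(compiler_args)

# Feature names with the leading '+' removed.
features = [f.lstrip("+") for f in options["iree-llvm-target-cpu-features"].split(",")]

# Heuristic used only in this sketch: vector-related features start with "zv".
vector_features = [f for f in features if f.startswith("zv")]

print("Target CPU:", options["iree-llvm-target-cpu"])
print("All features:", features)
print("Vector features:", vector_features)
```

In this scenario the vector-related features are `zvl512b`, `zve32x` and `zve32f`, which is consistent with the vector instructions reported later in the instruction statistics.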

Inference performance metrics for magic_wand_fp32

Inference time

Figure 1 Inference time

  • First inference duration (usually including allocation time): 0.001839629999999648 s,

  • Mean: 0.0018446725714285485 s,

  • Standard deviation: 4.016429811897224e-05 s,

  • Median: 0.0018550200000000405 s.
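
Two derived figures can make the timing values above easier to compare across runs: the throughput implied by the mean and the relative spread of the measurements. The sketch below computes both from the reported numbers only; the function and variable names are illustrative and do not come from Kenning.

```python
# Values copied from the inference time summary above (seconds).
mean_s = 0.0018446725714285485
std_s = 4.016429811897224e-05
median_s = 0.0018550200000000405


def throughput_per_second(mean_duration_s):
    """Inferences per second implied by the mean duration of one inference."""
    return 1.0 / mean_duration_s


def coefficient_of_variation(std, mean):
    """Standard deviation expressed as a fraction of the mean."""
    return std / mean


throughput = throughput_per_second(mean_s)
cv = coefficient_of_variation(std_s, mean_s)

print(f"Mean inference time: {mean_s * 1e3:.3f} ms")
print(f"Median inference time: {median_s * 1e3:.3f} ms")
print(f"Implied throughput: {throughput:.1f} inferences/s")
print(f"Coefficient of variation: {cv * 100:.2f} %")
```

These are back-of-the-envelope figures derived from the summary statistics, not additional measurements.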

Inference quality metrics for magic_wand_fp32

Figure 2 Confusion matrix

  • Accuracy: 1.0

  • Mean precision: 0.9999999996266236

  • Mean sensitivity: 0.9999999996266236

  • G-mean: 0.9999999996266236
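
For readers unfamiliar with these metrics, the sketch below shows how they are commonly derived from a confusion matrix. It uses the textbook definitions, which is an assumption about how the report computes them, and it does not try to reproduce the exact values above. The example matrix is hypothetical and is not the one behind Figure 2.

```python
import math


def quality_metrics(cm):
    """Compute metrics from a square confusion matrix.

    Rows are true classes and columns are predicted classes (a convention
    assumed for this sketch).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))

    precisions = []
    sensitivities = []
    for i in range(n):
        predicted_as_i = sum(cm[r][i] for r in range(n))  # column sum
        actually_i = sum(cm[i])                           # row sum
        precisions.append(cm[i][i] / predicted_as_i if predicted_as_i else 0.0)
        sensitivities.append(cm[i][i] / actually_i if actually_i else 0.0)

    return {
        "accuracy": correct / total,
        "mean_precision": sum(precisions) / n,
        "mean_sensitivity": sum(sensitivities) / n,
        # Geometric mean of the per-class sensitivities.
        "g_mean": math.prod(sensitivities) ** (1.0 / n),
    }


# Hypothetical 3-class matrix with a single misclassified sample.
example_cm = [
    [5, 0, 0],
    [0, 4, 1],
    [0, 0, 5],
]
metrics = quality_metrics(example_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A matrix with no off-diagonal entries yields 1.0 for all four metrics under these definitions.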

Renode performance measurements for magic_wand_fp32

Count of instructions used during inference

Figure 3 Histogram of instructions used during inference

Figure 4 Histogram of V Vector Extension instructions during inference

Executed instructions counters

Figure 5 Count of executed instructions per second for cpu during benchmark

Figure 6 Cumulative count of executed instructions for cpu during benchmark

Memory access counters

Figure 7 Count of memory reads per second during benchmark

Figure 8 Cumulative count of memory reads during benchmark

Figure 9 Count of memory writes per second during benchmark

Figure 10 Cumulative count of memory writes during benchmark

Peripheral access counters

Figure 11 Count of uart0 reads per second during benchmark

Figure 12 Cumulative count of uart0 reads during benchmark

Figure 13 Count of uart0 writes per second during benchmark

Figure 14 Cumulative count of uart0 writes during benchmark

Instructions stats

  • Instruction count per inference pass: 238152

  • V Vector Extension instructions percentage: 13.748338473998853 %

  • Top 10 instructions and counters per inference pass:

    • addi: 44280

    • sw: 28278

    • lw: 27386

    • vle32.v: 13136

    • bne: 12684

    • flw: 12278

    • bltu: 10499

    • vfmadd.vf: 9552

    • beq: 9149

    • lbu: 8945
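
The counters above can be combined to see how much of an inference pass the listed instructions cover and how much of the vector work is visible in the top 10. The sketch below uses only the reported numbers; treating every mnemonic that starts with `v` as a vector instruction is a heuristic chosen for this illustration, and the estimated vector instruction count is derived from the reported percentage rather than measured directly.

```python
# Values copied from the instruction statistics above.
total_per_pass = 238152
vector_percentage = 13.748338473998853

top10 = {
    "addi": 44280,
    "sw": 28278,
    "lw": 27386,
    "vle32.v": 13136,
    "bne": 12684,
    "flw": 12278,
    "bltu": 10499,
    "vfmadd.vf": 9552,
    "beq": 9149,
    "lbu": 8945,
}

top10_total = sum(top10.values())
top10_share = top10_total / total_per_pass

# Heuristic for this sketch: mnemonics starting with "v" are vector instructions.
vector_in_top10 = sum(count for name, count in top10.items() if name.startswith("v"))

# Approximate number of vector instructions per pass implied by the percentage.
estimated_vector_total = total_per_pass * vector_percentage / 100.0

print(f"Top 10 instructions: {top10_total} ({top10_share:.1%} of a pass)")
print(f"Vector instructions within the top 10: {vector_in_top10}")
print(f"Estimated vector instructions per pass: {estimated_vector_total:.0f}")
print(f"Share of vector work in the top 10: {vector_in_top10 / estimated_vector_total:.1%}")
```

With these numbers, `vle32.v` and `vfmadd.vf` together account for most of the vector instructions executed per pass, while scalar address arithmetic and loads/stores (`addi`, `sw`, `lw`) dominate the overall count.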

Memory allocation stats

  • Host bytes peak: 1536

  • Host bytes allocated: 215040

  • Host bytes freed: 213504

  • Device bytes peak: 33536

  • Device bytes allocated: 2329536

  • Device bytes freed: 2312752

  • Compiled model size: 26944

Host memory is the memory of the CPU that controls the accelerator; device memory is the memory of the accelerator itself.
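
A quick consistency check on the allocation counters is to subtract the freed bytes from the allocated bytes. The sketch below does this for the values above, assuming that the "allocated" and "freed" figures are cumulative totals over the run; that interpretation is an assumption and is not stated in the report. The helper name is illustrative.

```python
# Values copied from the memory allocation stats above (bytes).
stats = {
    "host": {"peak": 1536, "allocated": 215040, "freed": 213504},
    "device": {"peak": 33536, "allocated": 2329536, "freed": 2312752},
}


def outstanding_bytes(entry):
    """Bytes allocated but not freed, assuming cumulative counters."""
    return entry["allocated"] - entry["freed"]


host_outstanding = outstanding_bytes(stats["host"])
device_outstanding = outstanding_bytes(stats["device"])

for name, entry in stats.items():
    print(
        f"{name}: peak {entry['peak']} B, "
        f"outstanding at measurement time {outstanding_bytes(entry)} B"
    )
```

Under that assumption, the host side still holds 1536 bytes, which equals its reported peak, and the device side holds 16784 bytes, roughly half of its peak of 33536 bytes.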


Last update: 2024-11-15