{"id":53495,"date":"2021-10-02T03:17:20","date_gmt":"2021-10-01T19:17:20","guid":{"rendered":"https:\/\/www.seeedstudio.com\/blog\/?p=53495"},"modified":"2021-10-02T03:21:11","modified_gmt":"2021-10-01T19:21:11","slug":"faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques","status":"publish","type":"post","link":"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/","title":{"rendered":"Fast(er) Machine Learning Inference on Raspberry Pi SBC &#8211; Four optimization techniques"},"content":{"rendered":"\n<p>A while ago I made a video on semantic segmentation with the new Raspberry Pi HQ camera module &#8211; I used SegNet-basic with a MobileNet&nbsp;v1 backend for the segmentation task, and it ran rather slowly on Raspberry Pi, which people noticed in the comments.<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe title=\"Raspberry Pi HQ Camera Module Review and Machine Learning Demo\" width=\"640\" height=\"360\" src=\"https:\/\/www.youtube.com\/embed\/L08PTNVsZOk?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<p>I have to admit that Raspberry Pi 4 is not the best SBC for Machine Learning tasks, since it doesn\u2019t have a hardware accelerator that can be used to speed up inference and has to rely on the CPU. 
Since then I have been working on applications for the Raspberry Pi Compute Module 4 inside <a href=\"https:\/\/www.seeedstudio.com\/ReTerminal-with-CM4-p-4904.html\" target=\"_blank\" aria-label=\"undefined (opens in a new tab)\" rel=\"noreferrer noopener\">reTerminal<\/a> &#8211; some of which included Machine Learning demos, such as age\/gender recognition, object detection, face anti-spoofing and so on.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img fetchpriority=\"high\" decoding=\"async\" width=\"908\" height=\"567\" src=\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/image.png\" alt=\"\" class=\"wp-image-53499\" srcset=\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/image.png 908w, https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/image-300x187.png 300w, https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/image-768x480.png 768w\" sizes=\"(max-width: 908px) 100vw, 908px\" \/><\/figure>\n\n\n\n<p>Here is a list of four techniques that can help when aiming for real-time model inference on Raspberry Pi 4. Ready? 
Let\u2019s go!<\/p>\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe title=\"Four optimization techniques for Machine Learning Inference on Raspberry Pi SBC\" width=\"640\" height=\"360\" src=\"https:\/\/www.youtube.com\/embed\/BEDEscDQFxk?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\">Designing smaller networks<\/h2>\n\n\n\n<p>If the goal is simple enough (image classification of &lt; 100 classes, object detection of &lt; 10 classes, or similar), a smaller network can achieve acceptable accuracy and run very fast. For example, a MobileNet v1 (alpha 0.25) YOLOv2 network trained to detect only one class of objects (human faces) achieves 62.5 FPS without any further optimization.<\/p>\n\n\n\n<p>Vanilla TensorFlow Lite FP32 inference:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large is-resized\"><img decoding=\"async\" src=\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/image-1.png\" alt=\"\" class=\"wp-image-53500\" width=\"300\" height=\"300\"\/><\/figure><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Quantization<\/h2>\n\n\n\n<p>Quantization is the process of reducing the precision of neural network weights, usually from FP32 to INT8. 
<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large\"><img decoding=\"async\" width=\"516\" height=\"290\" src=\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/qat-training-precision.png\" alt=\"\" class=\"wp-image-53501\" srcset=\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/qat-training-precision.png 516w, https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/qat-training-precision-300x169.png 300w\" sizes=\"(max-width: 516px) 100vw, 516px\" \/><\/figure><\/div>\n\n\n\n<p>It reduces model size by 4x and latency by ~60-80% with the default TensorFlow Lite kernels. TensorFlow offers two kinds of quantization: post-training quantization and quantization-aware training (QAT). Post-training quantization usually brings a small accuracy loss, which can be minimized with QAT, the process of fine-tuning the network with quantization nodes inserted. <\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"850\" height=\"545\" src=\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/Quantization-Aware-Training.png\" alt=\"\" class=\"wp-image-53502\" srcset=\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/Quantization-Aware-Training.png 850w, https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/Quantization-Aware-Training-300x192.png 300w, https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/Quantization-Aware-Training-768x492.png 768w\" sizes=\"(max-width: 850px) 100vw, 850px\" \/><\/figure>\n\n\n\n<p>For further reading, have a look at the <a href=\"https:\/\/www.tensorflow.org\/lite\/performance\/model_optimization\" target=\"_blank\" aria-label=\"undefined (opens in a new tab)\" rel=\"noreferrer noopener\">TensorFlow Lite documentation<\/a> and the <a 
href=\"https:\/\/github.com\/AIWintermuteAI\/aXeleRate\/blob\/master\/axelerate\/networks\/common_utils\/convert.py\" target=\"_blank\" aria-label=\"undefined (opens in a new tab)\" rel=\"noreferrer noopener\">convert.py script<\/a> of aXeleRate, my personal project that simplifies training of common vision networks for inference on embedded devices.<\/p>\n\n\n\n<p>Vanilla TensorFlow Lite INT8 inference:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/image-2.png\" alt=\"\" class=\"wp-image-53503\" width=\"300\" height=\"300\"\/><\/figure><\/div>\n\n\n\n<h2 class=\"wp-block-heading\">Using optimized kernels<\/h2>\n\n\n\n<p>Inference speed can be improved by using frameworks whose operation kernels are optimized for a specific CPU instruction set, e.g. NEON SIMD (Single Instruction Multiple Data) instructions for ARM. Examples of such frameworks include <a href=\"https:\/\/github.com\/ARM-software\/armnn\" target=\"_blank\" aria-label=\"undefined (opens in a new tab)\" rel=\"noreferrer noopener\">ARM NN<\/a> and <a href=\"https:\/\/github.com\/google\/XNNPACK\" target=\"_blank\" aria-label=\"undefined (opens in a new tab)\" rel=\"noreferrer noopener\">XNNPACK<\/a>.<\/p>\n\n\n\n<p>Arm NN SDK is a set of open-source software and tools that enables machine learning workloads on power-efficient devices.<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"468\" src=\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/Arm-NN-Frameworks-Diagram.png\" alt=\"\" class=\"wp-image-53505\" srcset=\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/Arm-NN-Frameworks-Diagram.png 600w, https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/Arm-NN-Frameworks-Diagram-300x234.png 300w\" sizes=\"(max-width: 
600px) 100vw, 600px\" \/><\/figure>\n\n\n\n<p>The description and provided benchmarks look promising, but the installation procedure on the latest Raspberry Pi OS is painful at the moment &#8211; the only proper way to install the latest version of ARM NN currently is cross-compiling from source. There are binaries available for Debian Bullseye, but Raspberry Pi OS is still at Debian Buster. The inference test results with my benchmark scripts were mixed: for a single model, ARM NN performed worse than even vanilla TensorFlow Lite, but it turned out to be faster in multi-model inference, possibly due to more efficient multi-processing utilization.<\/p>\n\n\n\n<p>ARM NN FP32 inference:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/image-4.png\" alt=\"\" class=\"wp-image-53507\" width=\"300\" height=\"300\"\/><\/figure><\/div>\n\n\n\n<p>XNNPACK is a library for accelerating neural network inference for ARM, x86, and WebAssembly architectures in Android, iOS, Windows, Linux, and macOS environments. <\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/2.bp.blogspot.com\/-Fq-o5JrAw6g\/XxnJDHEagOI\/AAAAAAAADZQ\/CUVEpTdONn4JOw8Ffp9FMK8vMZEuSX9_wCLcBGAsYHQ\/s1600\/mobilephones.png\" alt=\"Accelerating TensorFlow Lite with XNNPACK Integration \u2014 The TensorFlow Blog\"\/><\/figure>\n\n\n\n<p>It is integrated into TensorFlow Lite as a delegate, which is enabled by default in Android builds but needs to be enabled manually in other environments &#8211; so if you\u2019d like to use XNNPACK on Raspberry Pi 4, you\u2019ll need either to build the TensorFlow Lite Interpreter package from source or to download one of the third-party binaries. 
<\/p>\n\n\n\n<p>To build TensorFlow Lite Interpreter pip package with XNNPACK (both FP32 and INT8 optimized kernels) do the following:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>git clone https:\/\/github.com\/tensorflow\/tensorflow.git\ncd tensorflow \nnano tensorflow\/lite\/tools\/pip_package\/build_pip_package_with_bazel.sh<\/code><\/pre>\n\n\n\n<p>Then change Bazel build options from<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Build python interpreter_wrapper.\ncd \"${BUILD_DIR}\"\ncase \"${TENSORFLOW_TARGET}\" in\n  armhf)\n    BAZEL_FLAGS=\"--config=elinux_armhf\n      --copt=-march=armv7-a --copt=-mfpu=neon-vfpv4\n      --copt=-O3 --copt=-fno-tree-pre --copt=-fpermissive\n      --define tensorflow_mkldnn_contraction_kernel=0\n      --define=raspberry_pi_with_neon=true\"\n    ;;\n  aarch64)\n    BAZEL_FLAGS=\"--config=elinux_aarch64\n      --define tensorflow_mkldnn_contraction_kernel=0\n      --copt=-O3\"\n    ;;\n  native)\n    BAZEL_FLAGS=\"--copt=-O3 --copt=-march=native\"\n    ;;\n  *)\n    BAZEL_FLAGS=\"--copt=-O3\"\n    ;;\nesac<\/code><\/pre>\n\n\n\n<p>to<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Build python interpreter_wrapper.\ncd \"${BUILD_DIR}\"\ncase \"${TENSORFLOW_TARGET}\" in\n  armhf)\n    BAZEL_FLAGS=\"--config=elinux_armhf\n      --copt=-march=armv7-a --copt=-mfpu=neon-vfpv4\n      --copt=-O3 --copt=-fno-tree-pre --copt=-fpermissive\n      --define tensorflow_mkldnn_contraction_kernel=0\n      --define=raspberry_pi_with_neon=true\n      --define=tflite_pip_with_flex=true\n      --define=tflite_with_xnnpack=true\n      --define=xnn_enable_qs8=true\"\n    ;;\n  aarch64)\n    BAZEL_FLAGS=\"--config=elinux_aarch64\n      --define tensorflow_mkldnn_contraction_kernel=0\n      --define=tflite_pip_with_flex=true\n      --define=tflite_with_xnnpack=true\n      --define=xnn_enable_qs8=true\n      --copt=-O3\"\n    ;;\n  native)\n    BAZEL_FLAGS=\"--copt=-O3 --copt=-march=native\n      --define=tflite_pip_with_flex=true\n      
--define=tflite_with_xnnpack=true\n      --define=xnn_enable_qs8=true\"\n    ;;\n  *)\n    BAZEL_FLAGS=\"--copt=-O3\n      --define=tflite_pip_with_flex=true\n      --define=tflite_with_xnnpack=true\n      --define=xnn_enable_qs8=true\"      \n    ;;\nesac<\/code><\/pre>\n\n\n\n<p>Finally, start the build process with:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>sudo CI_DOCKER_EXTRA_PARAMS=\"-e CI_BUILD_PYTHON=python3.7 -e CROSSTOOL_PYTHON_INCLUDE_PATH=\/usr\/include\/python3.7\" \\\n  tensorflow\/tools\/ci_build\/ci_build.sh PI-PYTHON37 \\\n  tensorflow\/lite\/tools\/pip_package\/build_pip_package_with_bazel.sh aarch64<\/code><\/pre>\n\n\n\n<p>This is cross-compilation, so the build needs to be done on your Linux x86-compatible computer and NOT on the Raspberry Pi! Compilation will take a while &#8211; if you encounter an error, try switching to an earlier commit, since it is not uncommon for builds on the master branch of TensorFlow to fail.<\/p>\n\n\n\n<p>A link to the pre-built package is available on the <a href=\"https:\/\/wiki.seeedstudio.com\/reTerminal_ML_TFLite\/\" target=\"_blank\" aria-label=\"undefined (opens in a new tab)\" rel=\"noreferrer noopener\">Seeed Studio Wiki page<\/a> about using TensorFlow Lite on reTerminal.<\/p>\n\n\n\n<p>XNNPACK delegate TensorFlow Lite FP32 inference:<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/image-5.png\" alt=\"\" class=\"wp-image-53508\" width=\"300\" height=\"300\"\/><\/figure><\/div>\n\n\n\n<p>The main problem with optimized kernels is the uneven support for different architectures, NN operators and precision types across frameworks. For example, INT8 optimized kernels are a work in progress in both ARM NN and XNNPACK. 
Support for INT8 optimized kernels in XNNPACK was added very recently and seems to bring a modest performance improvement of about 30%, depending on the operators used in the model.<\/p>\n\n\n\n<p>While researching model inference optimization techniques, I stumbled upon another promising lead in a <a href=\"https:\/\/github.com\/tensorflow\/tensorflow\/pull\/48751#issuecomment-869111116\" target=\"_blank\" aria-label=\"undefined (opens in a new tab)\" rel=\"noreferrer noopener\">TensorFlow GitHub pull request<\/a> that adds optimized kernels for dynamically quantized models.<\/p>\n\n\n\n<p>The developer claims a 3-4x latency improvement, but it is currently limited to a very specific set of models. A PR to allow more convenient usage is in development.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Pruning and sparse inference<\/h2>\n\n\n\n<p>Pruning is the process of fine-tuning a trained neural network to find the weights that do not contribute to correct predictions and removing them. Pruning is very helpful for reducing the compressed size of the model, since the \u201cdead\u201d weights are simply set to zero, but using it to reduce latency is trickier, since that requires removing the connections themselves, which can lead to significant accuracy loss. <\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter\"><img decoding=\"async\" src=\"https:\/\/miro.medium.com\/max\/1370\/1*p2AMGLzaDaFI4eBLvCTKRg.png\" alt=\"Neural Network Pruning Research Review 2020 | by Prabhu Prakash Kagitha |  Heartbeat\"\/><\/figure><\/div>\n\n\n\n<p>Using the TensorFlow Model Optimization Toolkit, it was experimentally possible to achieve up to 80% sparsity with negligible impact on accuracy. 
Check out the <a aria-label=\"undefined (opens in a new tab)\" href=\"https:\/\/ai.googleblog.com\/2021\/03\/accelerating-neural-networks-on-mobile.html\" target=\"_blank\" rel=\"noreferrer noopener\">Google AI blog article<\/a> and the <a aria-label=\"undefined (opens in a new tab)\" href=\"https:\/\/www.tensorflow.org\/model_optimization\/guide\/pruning\/pruning_for_on_device_inference\" target=\"_blank\" rel=\"noreferrer noopener\">guide to pruning with the TensorFlow Model Optimization Toolkit<\/a>. <\/p>\n\n\n\n<p>To summarize, the fact that a board doesn&#8217;t have dedicated hardware to accelerate NN inference doesn&#8217;t mean it is useless for ML &#8211; there are optimization options that allow achieving acceptable inference speeds. Applying some of these techniques, such as optimized kernels, can be a bit tricky, though, and requires some tweaks to the model architecture and (possibly) compiling inference packages from source, since a lot of the optimizations are currently on the bleeding edge of development.<\/p>\n\n\n\n<p>Oh, and while you are at it, make sure you avoid thermal throttling when running optimized inference on Raspberry Pi 4 &#8211; get yourself <a href=\"https:\/\/www.seeedstudio.com\/Blink-Blink-ICE-Tower-CPU-Cooling-Fan-for-Raspberry-Pi-Support-Pi-4-p-4215.html\" target=\"_blank\" aria-label=\"undefined (opens in a new tab)\" rel=\"noreferrer noopener\">one of these nice cooling towers<\/a> and don\u2019t set your Pi on fire.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/media-cdn.seeedstudio.com\/media\/catalog\/product\/cache\/9d0ce51a71ce6a79dfa2a98d65a0f0bd\/p\/r\/preview.jpg\" alt=\"\"\/><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>A while ago I made a video on semantic segmentation with new Raspberry Pi 
HD<\/p>\n","protected":false},"author":3505,"featured_media":53514,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_lmt_disableupdate":"","_lmt_disable":"","_price":"","_stock":"","_tribe_ticket_header":"","_tribe_default_ticket_provider":"","_tribe_ticket_capacity":"0","_ticket_start_date":"","_ticket_end_date":"","_tribe_ticket_show_description":"","_tribe_ticket_show_not_going":false,"_tribe_ticket_use_global_stock":"","_tribe_ticket_global_stock_level":"","_global_stock_mode":"","_global_stock_cap":"","_tribe_rsvp_for_event":"","_tribe_ticket_going_count":"","_tribe_ticket_not_going_count":"","_tribe_tickets_list":"[]","_tribe_ticket_has_attendee_info_fields":false,"iawp_total_views":0,"footnotes":""},"categories":[1],"tags":[1355,3744,247,742,1284,1771],"class_list":["post-53495","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news","tag-artificial-intelligence","tag-edge-artificial-intelligence","tag-raspberry-pi","tag-sbc","tag-tensorflow","tag-tensorflow-lite"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v24.0 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Fast(er) Machine Learning Inference on Raspberry Pi SBC - Four optimization techniques - Latest News from Seeed Studio<\/title>\n<meta name=\"description\" content=\"List of four techniques, that can be helpful when aiming for running real-time model inference on Raspberry Pi 4.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Fast(er) Machine Learning Inference on Raspberry Pi SBC - 
Four optimization techniques - Latest News from Seeed Studio\" \/>\n<meta property=\"og:description\" content=\"List of four techniques, that can be helpful when aiming for running real-time model inference on Raspberry Pi 4.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/\" \/>\n<meta property=\"og:site_name\" content=\"Latest News from Seeed Studio\" \/>\n<meta property=\"article:published_time\" content=\"2021-10-01T19:17:20+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-10-01T19:21:11+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"867\" \/>\n\t<meta property=\"og:image:height\" content=\"720\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Dmitry Maslov\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Dmitry Maslov\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/\",\"url\":\"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/\",\"name\":\"Fast(er) Machine Learning Inference on Raspberry Pi SBC - Four optimization techniques - Latest News from Seeed Studio\",\"isPartOf\":{\"@id\":\"https:\/\/www.seeedstudio.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879.jpg\",\"datePublished\":\"2021-10-01T19:17:20+00:00\",\"dateModified\":\"2021-10-01T19:21:11+00:00\",\"author\":{\"@id\":\"https:\/\/www.seeedstudio.com\/blog\/#\/schema\/person\/be44021cef50367de429a4d5f613ed2f\"},\"description\":\"List of four techniques, that can be helpful when aiming for running real-time model inference on Raspberry Pi 
4.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/#primaryimage\",\"url\":\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879.jpg\",\"contentUrl\":\"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879.jpg\",\"width\":867,\"height\":720},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.seeedstudio.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Fast(er) Machine Learning Inference on Raspberry Pi SBC &#8211; Four optimization techniques\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.seeedstudio.com\/blog\/#website\",\"url\":\"https:\/\/www.seeedstudio.com\/blog\/\",\"name\":\"Latest News from Seeed Studio\",\"description\":\"Emerging IoT, AI and Autonomous Applications on the 
Edge\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.seeedstudio.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.seeedstudio.com\/blog\/#\/schema\/person\/be44021cef50367de429a4d5f613ed2f\",\"name\":\"Dmitry Maslov\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.seeedstudio.com\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/b60714970fdc7dfa4a5d9915477bdd24?s=96&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/b60714970fdc7dfa4a5d9915477bdd24?s=96&r=g\",\"caption\":\"Dmitry Maslov\"},\"url\":\"https:\/\/www.seeedstudio.com\/blog\/author\/dmitry-maslov\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Fast(er) Machine Learning Inference on Raspberry Pi SBC - Four optimization techniques - Latest News from Seeed Studio","description":"List of four techniques, that can be helpful when aiming for running real-time model inference on Raspberry Pi 4.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/","og_locale":"en_US","og_type":"article","og_title":"Fast(er) Machine Learning Inference on Raspberry Pi SBC - Four optimization techniques - Latest News from Seeed Studio","og_description":"List of four techniques, that can be helpful when aiming for running real-time model inference on Raspberry Pi 
4.","og_url":"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/","og_site_name":"Latest News from Seeed Studio","article_published_time":"2021-10-01T19:17:20+00:00","article_modified_time":"2021-10-01T19:21:11+00:00","og_image":[{"width":867,"height":720,"url":"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879.jpg","type":"image\/jpeg"}],"author":"Dmitry Maslov","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Dmitry Maslov","Est. reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/","url":"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/","name":"Fast(er) Machine Learning Inference on Raspberry Pi SBC - Four optimization techniques - Latest News from Seeed Studio","isPartOf":{"@id":"https:\/\/www.seeedstudio.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/#primaryimage"},"image":{"@id":"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/#primaryimage"},"thumbnailUrl":"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879.jpg","datePublished":"2021-10-01T19:17:20+00:00","dateModified":"2021-10-01T19:21:11+00:00","author":{"@id":"https:\/\/www.seeedstudio.com\/blog\/#\/schema\/person\/be44021cef50367de429a4d5f613ed2f"},"description":"List of four techniques, that can be helpful when aiming for running real-time model inference on Raspberry Pi 
4.","breadcrumb":{"@id":"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/#primaryimage","url":"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879.jpg","contentUrl":"https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879.jpg","width":867,"height":720},{"@type":"BreadcrumbList","@id":"https:\/\/www.seeedstudio.com\/blog\/2021\/10\/02\/faster-machine-learning-inference-on-raspberry-pi-sbc-four-optimization-techniques\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.seeedstudio.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Fast(er) Machine Learning Inference on Raspberry Pi SBC &#8211; Four optimization techniques"}]},{"@type":"WebSite","@id":"https:\/\/www.seeedstudio.com\/blog\/#website","url":"https:\/\/www.seeedstudio.com\/blog\/","name":"Latest News from Seeed Studio","description":"Emerging IoT, AI and Autonomous Applications on the Edge","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.seeedstudio.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.seeedstudio.com\/blog\/#\/schema\/person\/be44021cef50367de429a4d5f613ed2f","name":"Dmitry 
Maslov","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.seeedstudio.com\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/b60714970fdc7dfa4a5d9915477bdd24?s=96&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/b60714970fdc7dfa4a5d9915477bdd24?s=96&r=g","caption":"Dmitry Maslov"},"url":"https:\/\/www.seeedstudio.com\/blog\/author\/dmitry-maslov\/"}]}},"modified_by":"Dmitry Maslov","views":15371,"featured_image_urls":{"full":["https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879.jpg",867,720,false],"thumbnail":["https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879-80x80.jpg",80,80,true],"medium":["https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879-300x249.jpg",300,249,true],"medium_large":["https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879-768x638.jpg",640,532,true],"large":["https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-1030x579.jpg",640,360,true],"1536x1536":["https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879.jpg",867,720,false],"2048x2048":["https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879.jpg",867,720,false],"visody_icon":["https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879.jpg",32,27,false],"magazine-7-slider-full":["https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879.jpg",867,720,false],"magazine-7-slider-center":["https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-936x720.jpg",936,720,true],"magazine-7-featured":["https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-1024x576.jpg",1024,576,true],"magazine-7-medium":["https:\/\/www.seeedstudio.co
m\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879-720x380.jpg",720,380,true],"magazine-7-medium-square":["https:\/\/www.seeedstudio.com\/blog\/wp-content\/uploads\/2021\/10\/maxresdefault-e1633115816879-675x450.jpg",675,450,true]},"author_info":{"display_name":"Dmitry Maslov","author_link":"https:\/\/www.seeedstudio.com\/blog\/author\/dmitry-maslov\/"},"category_info":"<a href=\"https:\/\/www.seeedstudio.com\/blog\/category\/news\/\" rel=\"category tag\">News<\/a>","tag_info":"News","comment_count":"0","_links":{"self":[{"href":"https:\/\/www.seeedstudio.com\/blog\/wp-json\/wp\/v2\/posts\/53495","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.seeedstudio.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.seeedstudio.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.seeedstudio.com\/blog\/wp-json\/wp\/v2\/users\/3505"}],"replies":[{"embeddable":true,"href":"https:\/\/www.seeedstudio.com\/blog\/wp-json\/wp\/v2\/comments?post=53495"}],"version-history":[{"count":8,"href":"https:\/\/www.seeedstudio.com\/blog\/wp-json\/wp\/v2\/posts\/53495\/revisions"}],"predecessor-version":[{"id":53515,"href":"https:\/\/www.seeedstudio.com\/blog\/wp-json\/wp\/v2\/posts\/53495\/revisions\/53515"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.seeedstudio.com\/blog\/wp-json\/wp\/v2\/media\/53514"}],"wp:attachment":[{"href":"https:\/\/www.seeedstudio.com\/blog\/wp-json\/wp\/v2\/media?parent=53495"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.seeedstudio.com\/blog\/wp-json\/wp\/v2\/categories?post=53495"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.seeedstudio.com\/blog\/wp-json\/wp\/v2\/tags?post=53495"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}