How to deploy models to mobile & IoT for offline use

Tutorial for deploying Image Recognition models trained with TensorFlow to your smartphone and edge devices.
Michal Lukáč, Ximilar
27. May 2020
Photo by Ravi Kumar on Unsplash

Did you know that the number of IoT devices is crossing 38 billion in 2020? That is a big number. Roughly half of those are connected to the Internet, which puts quite a large load on internet infrastructure, even with 5G networks. Meanwhile, some countries have not yet adopted 4G, so internet connectivity can be slow in many cases.

Earlier, in a separate blog post, we mentioned that one day you would be able to download your trained models for offline use. The time is now. We spent several months working with the newest TensorFlow 2+ (KUDOS to the TF team!), rewriting our internal system from scratch so that your trained models can finally be deployed offline.

Tadaaa! That makes Ximilar one of the first machine learning platforms that allow users to train a custom image recognition model with just a few clicks and download it for offline usage!

The feature is active only in custom pricing plans. If you would like to download and use your models offline, please let us know and we will be ready to discuss potential options with you.

Let’s get started!

Let’s have a look at how to use your trained model directly on your server, mobile phone, IoT device, or any other edge device. The downloaded model can be run on iOS devices, Android phones, Coral, NVIDIA Jetson, Raspberry Pi, and many others. This makes sense especially when your device is offline; if it is connected to the internet, you can query our API to get results from your latest trained model.

Why offline usage?

Privacy, network connectivity, security, and high latency are common concerns for our customers. Online use can also become a bottleneck when adopting machine learning on a very large scale or in factories for visual quality control. Here are some scenarios in which to consider offline models:

  • You don’t want your data to leave your private network.
  • Your device cannot be connected to the Internet or the connectivity is slow.
  • You don’t want to make a request to our API from your mobile for every image you take.
  • You don’t want to be dependent on our infrastructure (but, BTW, we have almost 100% uptime).
  • You need to do a large number of queries (tens of millions) per day and want to run your models on your GPU cards.

Right now, both Recognition and Detection models are ready for offline usage!

Before continuing with this article, you should already know how to create your Recognition models.


After creating and training your Task, go to the Task page. Once you have permission to download the model, scroll down to the list of trained models and you should see the download icons. Choose the version of the model you are satisfied with and use the icon to download a ZIP file.


This ZIP archive contains several files. The actual model file is located in the tflite folder and is in the TFLITE format, which can be easily deployed on any edge device. Another essential file is labels.txt, which contains the names of your task labels. The order of the names is important because it corresponds to the order of the model outputs, so don’t mix them up. The default input resolution of the model is 224×224 (it can be changed within Professional and higher plans). There is also a saved_model folder, which is used when deploying on a server or PC with a GPU.
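As a rough illustration of why the label order matters, here is a minimal Python sketch that pairs model outputs with the names from labels.txt by position. It assumes that labels.txt holds one label per line and that the model returns one score per label; load_labels and top_predictions are hypothetical helper names, not part of the downloaded archive.

```python
from pathlib import Path
from typing import List, Sequence, Tuple


def load_labels(path: str) -> List[str]:
    """Read label names in file order (assumption: one label per line)."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Blank lines are skipped, assuming no label is intentionally empty.
    return [line.strip() for line in lines if line.strip()]


def top_predictions(
    scores: Sequence[float], labels: Sequence[str], k: int = 3
) -> List[Tuple[str, float]]:
    """Pair each score with the label at the same index and return the k best."""
    if len(scores) != len(labels):
        raise ValueError("the model must return exactly one score per label")
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]
```

If the file were reordered, every score would silently be attached to the wrong name, which is why the length check alone is not enough to catch a mix-up.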

Deploy on Android


This Android project contains an example application by the TensorFlow team which shows how to deploy the model on an Android device. We forked it and adjusted it to work with our models. Be aware that the model already normalizes the input image by itself, so you should not normalize the RGB image from the camera in any way.
Here you can download a simple Animal/Cat/Dog tagging model to test with. First, copy the model file together with labels.txt to the assets folder of the Android project. Connect your mobile to your computer via USB cable, build the project in Android Studio, and run it. Be aware that you need developer mode with USB debugging enabled on your Android device (you can enable it in Settings). The application should appear on your Android device. Select the MobileNet-Float model and you are ready for the magic to happen!

That’s it!

Remember, this is just sample code showing how to load the model and use it with your mobile camera. You can adjust and use the code in any way you need.

Deploy on iOS

With iOS, you have two options: you can use either Objective-C or Swift. See an example application for iOS, which is implemented in Swift. If you are a developer, I recommend taking inspiration from this file on GitHub, which loads and calls the model. The TensorFlow team also provides an official quick start guide for iOS.


If you want to deploy the recognition model on your server, start with the Ximilar-com/models repository. The folder scripts/recognition contains everything needed to deploy the model on your computer. You need to install TensorFlow 2.2 or newer. If your workstation has an NVIDIA GPU, you can run the model on it. The GPU needs to have at least 4 GB of memory and CUDA Compute Capability 3.7+. Running inference on a GPU speeds up prediction several times. When using a GPU, we recommend experimenting with the batch size of your samples.
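To illustrate the batch size advice, here is a minimal Python sketch that splits a list of inputs into fixed-size batches before handing them to the model. The helpers chunked and predict_in_batches are hypothetical, and predict_fn stands in for whatever function wraps your loaded model (for example, one built on the scripts in the repository); it is assumed to take a list of inputs and return one result per input.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def chunked(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def predict_in_batches(
    items: Sequence[T],
    predict_fn: Callable[[List[T]], List[R]],  # hypothetical wrapper around the model
    batch_size: int = 32,
) -> List[R]:
    """Run predict_fn batch by batch and collect the results in input order."""
    results: List[R] = []
    for batch in chunked(items, batch_size):
        results.extend(predict_fn(batch))
    return results
```

Larger batches usually improve GPU throughput until memory runs out, so it is worth timing a few values of batch_size on your own hardware.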

Deployment to Raspberry Pi is done through the Python library. See the Raspberry Pi classification project or the guide for TFLITE.
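For orientation, a minimal Python sketch of running the downloaded TFLITE file could look like the following. It relies on the interpreter calls from the TensorFlow Lite Python API (allocate_tensors, get_input_details, set_tensor, invoke, get_tensor). The helper names make_interpreter and classify are hypothetical, and the image is assumed to be a NumPy array of raw RGB values that is already resized to the model's input resolution. Because the model normalizes its input by itself, the sketch passes the pixel values through unchanged.

```python
def make_interpreter(model_path):
    """Load a TFLITE model, preferring the small tflite_runtime package if installed."""
    try:
        from tflite_runtime.interpreter import Interpreter
    except ImportError:
        import tensorflow as tf  # fall back to the full TensorFlow package

        Interpreter = tf.lite.Interpreter
    interpreter = Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    return interpreter


def classify(interpreter, image):
    """Return the model's output scores for one image.

    `image` is assumed to be a NumPy array of shape (height, width, 3)
    holding raw RGB values at the model's input resolution.
    """
    input_info = interpreter.get_input_details()[0]
    output_info = interpreter.get_output_details()[0]
    # Add the batch dimension and cast to the dtype the model expects.
    # No normalization is applied here because the model does it internally.
    batch = image[None, ...].astype(input_info["dtype"])
    interpreter.set_tensor(input_info["index"], batch)
    interpreter.invoke()
    # Assumption: the output is shaped (1, number_of_labels); drop the batch axis.
    return interpreter.get_tensor(output_info["index"])[0]
```

The returned scores follow the order of labels.txt, so the entry at index i belongs to the label on line i + 1 of that file.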


Edge and Embedded devices

There is also the option to deploy on Coral, NVIDIA Jetson, or directly to a web browser. Personally, we have had great experience with Jetson Nano on small projects. The MobileNetV2 architecture converted to TensorFlow Lite models works great. If you need to do object detection, tracking, and counting, then we recommend using YOLO architectures converted to TensorRT. YOLO can run on Jetson Nano in real time and is fantastic for factories, assembly lines, and conveyor belts with a small number of product types. You can easily buy and set up a camera on Jetson. Luckily, we are able to develop such models for you and your projects.

Update 2021/2022: We developed an object and image recognition system on NVIDIA Jetson Nano for conveyor belts and factories. Read more in our blog post on how to create a visual AI system for Jetson.


Now you have another reason to use the Ximilar platform. Of course, with offline models you cannot use Ximilar Flows, which connect your tasks to form a complex computer vision system. Otherwise, you can do whatever you want with your model.

To learn more about the TFLITE format, see the tflite guide by the TensorFlow team. Big thanks to them!

If you would like to download your model for offline usage, contact us and our sales team will discuss a suitable pricing model with you.


Michal Lukáč ML Engineer & Co-founder

Michal is a co-founder of Ximilar and a machine learning expert focusing mainly on image recognition, visual search, and computer vision. He is interested in science and loves reading books and Brazilian Jiu-Jitsu.
