Custom Image Recognition

Train your own object detection, image categorization & tagging machine learning models on our platform and integrate them into your business.


Image Categorization & Tagging


Train custom Visual AI for image recognition

With the Categorization & Tagging service, anyone can define custom categories & tags, link them to uploaded training images, and train custom image recognition models.

This way, you can automate the quality control of your products, tagging, sorting, filtering, and even recommendations of the items or images from your collection.


AI processes visual data consistently & efficiently in milliseconds

Large image databases can be managed by AI running on the Ximilar cloud, which can handle large volumes of data 24/7. You can access it via API and integrate it into your application.
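As a rough illustration of API access, the sketch below builds (but does not send) a classification request using only Python's standard library. The endpoint URL, the `Token` authorization scheme, and the `task_id`, `records`, and `_url` field names are assumptions made for this example; the API documentation is the authority on the actual contract.

```python
import json
import urllib.request

# Assumed endpoint, for illustration only; check the API documentation.
API_URL = "https://api.ximilar.com/recognition/v2/classify"


def build_classify_request(token: str, task_id: str, image_urls: list[str]) -> urllib.request.Request:
    """Build a POST request asking one task (model) to classify some images."""
    payload = {
        "task_id": task_id,                                # hypothetical field name
        "records": [{"_url": url} for url in image_urls],  # hypothetical field names
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {token}",             # assumed auth scheme
        },
        method="POST",
    )


request = build_classify_request(
    "YOUR_API_TOKEN", "YOUR_TASK_ID", ["https://example.com/dress.jpg"]
)
# Sending the request needs a valid token:
#   with urllib.request.urlopen(request) as response:
#       result = json.load(response)
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the example runnable without credentials and makes the payload easy to inspect.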

Working with the Ximilar platform doesn’t require a specific skill set. You can easily train, chain, and deploy your models yourself, or contact us and we will take care of the setup.


Enrich your data with detailed information on images


Leave routine tasks to consistent AI trained on your data


Save time and resources with AI-powered automation


Assign a category to each image

Image categorization assigns each image a category, such as a maxi dress or midi dress. The categories are visually distinctive, and each image belongs to only one category.

Categorization of Fashion by Ximilar


Tag every image with many tags

Define a set of tags for the features & objects that should be recognized in your images, and train a custom tagging model able to provide tags for each image in your collection.
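The difference between the two task types can be sketched in a few lines of Python: categorization keeps the single best-scoring category, while tagging keeps every tag whose score passes a threshold. The score dictionaries, the tag names, and the 0.5 threshold are illustrative assumptions, not the platform's actual output format.

```python
def pick_category(scores: dict[str, float]) -> str:
    """Categorization: exactly one category per image (the highest score)."""
    if not scores:
        raise ValueError("scores must not be empty")
    return max(scores, key=scores.get)


def pick_tags(scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Tagging: any number of tags per image (all scores at or above the threshold)."""
    return sorted(tag for tag, score in scores.items() if score >= threshold)


# Hypothetical model outputs for one image.
category_scores = {"maxi dress": 0.81, "midi dress": 0.19}
tag_scores = {"floral": 0.92, "sleeveless": 0.67, "striped": 0.04}

print(pick_category(category_scores))  # maxi dress
print(pick_tags(tag_scores))           # ['floral', 'sleeveless']
```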

Tagging of Fashion by Ximilar

Interested in ready-to-use solutions?

We use our platform to deploy detection, categorization & tagging services for fashion apparel, home decor, furniture, and stock photos that can be used straight away.


Custom Object Detection


Object Detection finds & marks objects from different categories

On Ximilar’s platform, you can train custom object detection models to identify any object, such as people, cars, particles in the water, imperfections of materials, or objects of the same shape, size, or color. We can also train them for you.

An object detection task can work either independently or in combination with categorization & tagging tasks to tag the detected objects.
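Combining detection with tagging can be pictured as a two-step loop: detect the objects, crop each one, and tag only the crop. The Python sketch below shows that idea with stand-in `detect` and `tag` functions and an image stored as a nested list of pixel values; all of these are hypothetical placeholders, not platform calls.

```python
def crop(image, box):
    """Cut the region (x_min, y_min, x_max, y_max) out of a row-major image."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in image[y_min:y_max]]


def tag_detected_objects(image, detect, tag):
    """Run detection first, then tag each detected object separately."""
    return [{"box": box, "tags": tag(crop(image, box))} for box in detect(image)]


# Stand-in models: one fixed detection and a brightness-based "tagger".
def detect(image):
    return [(1, 1, 3, 3)]


def tag(patch):
    return ["bright"] if sum(map(sum, patch)) > 0 else ["dark"]


image = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
print(tag_detected_objects(image, detect, tag))
# [{'box': (1, 1, 3, 3), 'tags': ['bright']}]
```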

How to train an object detection model?

Q & A

How do I prepare the training data?

Training object detection models requires larger datasets and more training time than training categorization & tagging models. It begins with data annotation: the manual marking of objects with bounding boxes. You can use the same dataset as for Categorization & Tagging model training.
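A bounding-box annotation boils down to a label plus four coordinates. The Python sketch below shows one possible in-memory representation with basic validation and a conversion to image-relative coordinates; the class and field names are hypothetical and do not reflect the platform's storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """One annotated object: a label and pixel coordinates of its box."""
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def __post_init__(self):
        # Reject boxes with zero or negative width or height.
        if self.x_min >= self.x_max or self.y_min >= self.y_max:
            raise ValueError("box must have positive width and height")

    def normalized(self, image_width: int, image_height: int) -> tuple[float, float, float, float]:
        """Return the box as fractions of the image size (resolution independent)."""
        return (
            self.x_min / image_width,
            self.y_min / image_height,
            self.x_max / image_width,
            self.y_max / image_height,
        )


box = BoundingBox("car", 100, 50, 300, 250)
print(box.normalized(400, 500))  # (0.25, 0.1, 0.75, 0.5)
```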


How do you work with my data?

During the training of custom image recognition models, your annotated images are divided into two groups. Apart from the training set, there is a smaller validation set, which is used to evaluate the accuracy of the model before deployment. You can also upload another independent test set.
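The split described above can be sketched as follows. The 20% validation fraction and the fixed random seed are assumptions chosen for the example; the platform's actual ratio and sampling strategy are not stated here.

```python
import random


def split_dataset(items, validation_fraction=0.2, seed=0):
    """Shuffle the annotated items and split them into (training, validation)."""
    if not 0 < validation_fraction < 1:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


training, validation = split_dataset(range(10))
print(len(training), len(validation))  # 8 2
```

The validation items never take part in training, so accuracy measured on them estimates how the model behaves on images it has not seen.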

One Platform – Two Interfaces


You can annotate your images in Ximilar App. Just click on the image and draw a bounding box.



Annotate was built for the annotation of large image datasets with complex hierarchical taxonomies in a team.


Flows: Chain & Combine the Models

The key to managing complex image databases is the interaction of several different models.
Chaining machine learning models with Flows


Divide a complex problem into separate recognition, detection & tagging tasks

With Flows, machine learning models can be combined and chained in a sequence that exactly matches your decision-making process while working with visual data.

A Flow is like a tree that grows bigger with the hierarchy of your models. Each image travels through the tree until it is properly processed and tagged. Based on this hierarchy, you also get suggestions when annotating the images.
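The tree analogy can be made concrete with a small Python sketch: each node runs one model, and the model's answer decides which branch the image follows next. The `FlowNode` class, the `run_flow` function, and the stand-in models are hypothetical and only illustrate the idea; they are not the Flows implementation.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FlowNode:
    """One step of a flow: a model plus the branches its answers lead to."""
    name: str
    model: Callable[[dict], str]  # takes an image, returns a label
    branches: dict[str, "FlowNode"] = field(default_factory=dict)


def run_flow(node, image):
    """Send an image down the tree and collect (task name, label) pairs."""
    results = []
    while node is not None:
        label = node.model(image)
        results.append((node.name, label))
        node = node.branches.get(label)  # no branch for this label: stop here
    return results


# Stand-in models that read precomputed answers from the image record.
dress_length = FlowNode("dress length", lambda image: image["length"])
flow = FlowNode("category", lambda image: image["category"], {"dress": dress_length})

print(run_flow(flow, {"category": "dress", "length": "maxi dress"}))
# [('category', 'dress'), ('dress length', 'maxi dress')]
print(run_flow(flow, {"category": "shoes"}))
# [('category', 'shoes')]
```

Because each node only knows its own model and branches, a single node can be retrained or replaced without touching the rest of the tree.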

Flows in Annotate


Make a network of models and change any part you need anytime

  • Re-train, add or remove any unit anytime
  • Combine custom and ready-to-use services
  • Recognize exclusively the detected objects
  • Call multiple tasks (models) in one API call
  • Call multiple recognition tasks in parallel
  • Call any number of nested flows from a primary flow
  • Use one flow in several places
flows – guide

Build rich hierarchy

Define a flow with a few clicks, then use it for both training & automation

Play with the features

Add, remove, or change components, duplicate & modify your flows

Make changes on the fly

The flow structure handles changes to both the dataset and the connected models


Conditional image processing

Imagine you are building a real estate website. The first models in your flow can filter out all images that don’t meet certain selection criteria. In this case, these would be pictures without any real estate, rooms, or furnishings.


Automatic filtering, sorting & tagging

Images can then be gradually sorted with an increasing level of precision. The first task separates apartments and houses. Then, the apartments are sorted by room type, design, and furniture decor, and the houses by features such as architecture, area, garden or swimming pool.

Image Regression


Predict the size of objects, the age of people, or a rating from images

The image regression model predicts a numerical value for each image, within a range that you define.
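One common way to keep predictions inside a user-defined range is to rescale a normalized model output and clamp it to the range limits. The Python sketch below assumes the model emits a value between 0 and 1; that assumption, and the `to_range` helper itself, are illustrative and not a description of how the platform works internally.

```python
def to_range(normalized: float, low: float, high: float) -> float:
    """Map a normalized output (nominally 0..1) onto [low, high], clamping overshoot."""
    if low >= high:
        raise ValueError("low must be smaller than high")
    value = low + normalized * (high - low)
    return min(max(value, low), high)


print(to_range(0.5, 0, 100))  # 50.0
print(to_range(1.2, 18, 90))  # 90
```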

Image regression is used to build quality control systems and to estimate values such as age, size, level of wear, or rating. It is a great tool for real estate and insurance companies, MedTech, and Industry 4.0. Contact us to discuss your application!


Be Ahead of the Competition

Unlimited number of images

There are no limits on the number of images per model/label

Use one image for many models

You can use the same images for the training of different models

Built-in data augmentation

You don’t have to prepare or multiply the training data in advance

No paying for training time

Unlike the competition, Ximilar doesn’t charge you for the training time

No paying for idle time

No fees for the idle time either

Caching deployed models

Image processing takes 300 ms, as opposed to 2-3 s on other platforms


We use state-of-the-art neural network models & machine learning techniques

Our AI is improving constantly, so you always have up-to-date technology. Each model has millions of parameters that can be processed by CPU or GPU.

Our intelligent algorithm picks and uses the best-performing models. We use the latest machine learning technologies, such as TensorFlow and OpenVINO.


Tips & Tricks

With a new custom image similarity service, we are able to build an image search engine for collectible cards trading. (October 2022)
With the AI Explainability in Ximilar App, you can see which parts of your images are the most important to your image recognition models. (December 2021)
We developed a computer vision system for object detection, counting, and tracking on Nvidia Jetson Nano. (October 2021)

API Documentation

We take care of the underlying complexity and wrap it in a few lines of code.