Improve your custom recognition & detection models from Ximilar App with advanced training options, including augmentations of the images.
The results of machine learning models depend strongly on both the quality and the quantity of your training data. Unfortunately, getting more reliable data is often quite expensive. There are, however, a couple of techniques designed to deal specifically with this problem. Using Ximilar App, anyone can train their own recognition and detection models, and Ximilar has recently introduced new options for this process.
To reduce the work on your side, we use models pre-trained on huge amounts of data, which can already recognize the basic elements of your images. Sometimes these are already quite high-level concepts, like an individual. Other times, they are lines, edges and so forth. The training just refines the model for your specific data.
Then, there are some options which are much more dependent on the given task you are trying to solve. You can artificially change your training images and generate additional data “for free”. Some operations are typically “safe” and we turn them on by default – image quality, mild colour changes, left-right flip or small crop of an image.
Others are more disruptive and can potentially destroy the important pieces of information in your image. Therefore, it is up to you to turn them on. This is something we strongly encourage you to do for as many of them as possible within your task.
Do not be afraid of the growing number of options. The main rule for deciding which ones to enable is very simple. Could the particular operation change the image in such a way that you would no longer be able to recognize it? If not, you can allow it. Sometimes, you might ask a second question: could an image modified by this operation be sent to my service, and do I want to recognize it?
In one of our earlier articles, we described how to train a custom image classifier. We will take an example from the same domain (cats vs. dogs) to walk you through the different options.
The most basic operations are rotation and flip. Below, you will find an original image on the left side and then pictures with the following operations: flip vertically, flip horizontally, rotate 90, and rotate max (20).
As you can see, the dog is recognizable in all pictures. Horizontal flip should be turned on without any hesitation. Whether to use vertical flip and rotate 90 depends on the data you expect to recognize. Will those all be professional photos? Then those options will probably not help you. Will users all around the world be using your service to upload pictures from their phones? Well, sometimes the image might be rotated the wrong way.
In addition, a small arbitrary rotation (rotate max) might be useful to introduce some deformations which will make the model even more robust.
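The flip and rotation operations described above can be sketched in a few lines. This is only an illustration, not Ximilar's implementation: to stay dependency-free, an image is modelled as a list of pixel rows, the function names are hypothetical, and reading "rotate max (20)" as a limit of 20 degrees in either direction is an assumption.

```python
import random

def flip_horizontal(image):
    """Left-right flip: reverse the pixels within every row."""
    return [row[::-1] for row in image]

def flip_vertical(image):
    """Top-bottom flip: reverse the order of the rows."""
    return [list(row) for row in image[::-1]]

def rotate_90(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]

def sample_rotation_angle(max_degrees, rng):
    """Pick a small random angle in [-max_degrees, max_degrees].

    Applying the angle to real pixels is left to an imaging library.
    """
    return rng.uniform(-max_degrees, max_degrees)

image = [[1, 2, 3],
         [4, 5, 6]]
print(flip_horizontal(image))  # [[3, 2, 1], [6, 5, 4]]
print(flip_vertical(image))    # [[4, 5, 6], [1, 2, 3]]
print(rotate_90(image))        # [[4, 1], [5, 2], [6, 3]]
print(sample_rotation_angle(20, random.Random(0)))
```

Each function returns a new image and leaves the original untouched, so a single training picture can yield several variants.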
Now, we can continue to more advanced image augmentations. Again, we have an original image on the left. And then we apply the following: colour (light, medium and aggressive), quality, crop and erase augmentations.
Changes in the colours are a very natural operation. We provide you with four options, including the light, medium and aggressive levels named above.
Other operations are more straightforward. They can be either on or off. Quality simulates various JPEG compressions, noise etc. Crop will cut off a small part of an image on each side. And finally, erase will remove small rectangular patches from the image.
In our task, all operations left the dog recognizable, with the possible exception of the last one, erase. If the object or its defining part in the image is too small, it might be removed by this operation. Therefore, use erase carefully.
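To make crop and erase more concrete, here is a minimal sketch under the same simplification of an image as a list of pixel rows. The function names, the fill value and the patch size are assumptions chosen for illustration; the actual parameters used by Ximilar App are not stated here.

```python
import random

def crop_border(image, margin):
    """Cut `margin` pixels off each of the four sides."""
    height, width = len(image), len(image[0])
    if 2 * margin >= height or 2 * margin >= width:
        raise ValueError("margin is too large for this image")
    return [row[margin:width - margin] for row in image[margin:height - margin]]

def random_erase(image, patch_height, patch_width, rng, fill=0):
    """Overwrite one randomly placed rectangle with `fill`."""
    height, width = len(image), len(image[0])
    if patch_height > height or patch_width > width:
        raise ValueError("patch is larger than the image")
    top = rng.randrange(height - patch_height + 1)
    left = rng.randrange(width - patch_width + 1)
    erased = [list(row) for row in image]  # work on a copy
    for y in range(top, top + patch_height):
        for x in range(left, left + patch_width):
            erased[y][x] = fill
    return erased

# A 4x4 image with pixel values 1..16.
image = [[4 * y + x + 1 for x in range(4)] for y in range(4)]
print(crop_border(image, 1))  # [[6, 7], [10, 11]]
print(random_erase(image, 2, 2, random.Random(0)))
```

In this toy example a 2x2 patch covers a quarter of the image, which shows why erase can wipe out a small object entirely.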
Do not be afraid to experiment. You can train multiple models with different settings and compare the results. However, always be careful about how you do the evaluation. The best way is to set aside an independent test dataset. Read more in this blog post.
For every trained model, your settings are saved, and you can inspect them at any time.
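One way to keep the comparison fair is to hold out the test images once, before any training, and to score every model on that same set. The sketch below assumes a simple random split and plain accuracy as the metric; the helper names are hypothetical and the predictions are made-up placeholders, not real results.

```python
import random

def split_train_test(items, test_fraction=0.2, seed=0):
    """Shuffle once with a fixed seed and hold out a test subset."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    test_size = max(1, round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

image_ids = list(range(10))
train_ids, test_ids = split_train_test(image_ids, test_fraction=0.2, seed=1)
print(len(train_ids), len(test_ids))  # 8 2

# Placeholder labels and predictions for the held-out images only.
test_labels = ["cat", "dog"]
model_a_predictions = ["cat", "dog"]
model_b_predictions = ["cat", "cat"]
print(accuracy(model_a_predictions, test_labels))  # 1.0
print(accuracy(model_b_predictions, test_labels))  # 0.5
```

Augmentations should be applied to the training part only, so that the test set keeps reflecting the images you expect to receive.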
If you have any questions about this or any other functionality, please do not hesitate to contact us. We would be glad to discuss your problem and help you.
Libor Vanek
Libor is a skilled machine learning developer with extensive experience in the fields of artificial intelligence and computer vision. He helped us build many innovative solutions and moved on to more specialized projects in OCR and LLM.