Best practices for the preparation of image training data
Ximilar (vize.ai) offers a powerful and easy-to-use image recognition and classification service built on deep neural networks. Working with custom data comes with the responsibility of collecting the right dataset. A good dataset is crucial for achieving the highest possible accuracy. Let’s break down some rules for those who are building datasets.
So what are the steps when preparing the dataset?
1. Plan and simplify
In the beginning we must think about how the computer sees the images. It is important to understand the environment, the type of camera and the lighting conditions. Want to use the API with a mobile camera? Aim to collect images captured by a mobile phone so they match the images you will send later. Analysing medical images? You can take images from the same point of view so the neural network learns nuanced patterns. Do you want to analyse several features (e.g. “contains glass” and “is image blurry”)? Set up a separate model for each feature. Don’t mix them all in one. 😉
If you are not sure, ask the support team. They can provide educated advice.
2. Collect
For all tasks, try to get the most varied and diverse training dataset. Here are some tips:
- get images from different angles
- change lighting conditions
- take images with good quality and in focus
- change object size and distance.
This is especially true when you want to recognise real-world objects. They always vary a lot in background, image quality, lighting etc. Take this into account and try to create as realistic a dataset as possible, realistic in terms of how you are going to use the model in the future. Training on amazing images and deploying on low-res, blurry images won’t deliver good performance.
When working with coloured objects, make sure your dataset contains different colours.
A more diverse dataset generally leads to higher accuracy.
With Ximilar (vize.ai) the training minimum is as few as 20 images and you can still achieve great results. However, for more complex and nuanced categories you should think about 50, 100 or even more images for training. You can test with 20 images to understand the accuracy and then add more.
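Before uploading, it can help to check how many images you actually have. The snippet below is a minimal sketch and not part of the Ximilar app: it assumes your images sit in one sub-folder per category, treats the 20-image minimum as a per-category threshold, and uses the hypothetical helper names `count_images` and `categories_below_minimum`.

```python
from pathlib import Path

# Assumption: these extensions cover the image files in your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}
# Training minimum mentioned above, applied per category (an assumption).
MIN_IMAGES = 20


def count_images(dataset_dir):
    """Return {category_name: image_count} for each sub-folder of dataset_dir."""
    counts = {}
    for category in sorted(Path(dataset_dir).iterdir()):
        if category.is_dir():
            counts[category.name] = sum(
                1
                for f in category.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts


def categories_below_minimum(counts, minimum=MIN_IMAGES):
    """List the categories that still need more images."""
    return [name for name, n in counts.items() if n < minimum]
```

Calling `categories_below_minimum(count_images("dataset"))` on a hypothetical `dataset` folder would list the categories where you should keep collecting.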
Sometimes it might be tempting to use stock images or images from Google Search. These will work too, but they may hurt the accuracy if they differ from the images you will later send for classification.
3. Sort and upload
You have your images ready and it’s time to sort them. When you have only a few categories, you can upload all the images into the mixed zone and label them in our app. For a big dataset it is best to separate training images into different folders and upload them directly to each category in our app. A training API is on the way, stay tuned!
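If your labels live in a spreadsheet or a script rather than in folders, sorting can be automated. This is a rough sketch under assumptions of our own: the `labels` mapping of file name to category and the helper `sort_into_folders` are hypothetical, and the originals are copied rather than moved so nothing gets lost.

```python
import shutil
from pathlib import Path


def sort_into_folders(labels, source_dir, target_dir):
    """Copy each image into target_dir/<category>/ and return how many were copied.

    labels: dict mapping an image file name to its category name (hypothetical format).
    """
    copied = 0
    for filename, category in labels.items():
        destination = Path(target_dir) / category
        # Create the category folder on first use.
        destination.mkdir(parents=True, exist_ok=True)
        shutil.copy2(Path(source_dir) / filename, destination / filename)
        copied += 1
    return copied
```

Each resulting folder can then be uploaded to the matching category in the app.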
Make the dataset as clean as possible. Skip images that might confuse you. If you are not sure about the category of a particular image, do not use it.
Think about structure once again. Often you have several things you want to recognise, but you put them all into one task and create overlapping categories. In such cases it is better to create several tasks, each trained for one feature you want to recognise.
More on processing multilayered tasks in a coming post.
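To make the idea concrete, here is a small sketch of turning one set of overlapping categories into two separate label sets, one per feature. The file names and the combined label format (`<content>_<sharpness>`) are made up for illustration.

```python
# Hypothetical combined labels: every image got one mixed category.
combined_labels = {
    "img_001.jpg": "glass_blurry",
    "img_002.jpg": "glass_sharp",
    "img_003.jpg": "no_glass_blurry",
    "img_004.jpg": "no_glass_sharp",
}


def split_into_tasks(labels):
    """Split '<content>_<sharpness>' labels into two independent label sets."""
    contains_glass = {}
    is_blurry = {}
    for image, label in labels.items():
        # Split on the last underscore so 'no_glass' stays in one piece.
        content, sharpness = label.rsplit("_", 1)
        contains_glass[image] = content == "glass"
        is_blurry[image] = sharpness == "blurry"
    return contains_glass, is_blurry


contains_glass, is_blurry = split_into_tasks(combined_labels)
```

Each of the two dictionaries can then feed its own task, so the four mixed categories become two simple yes/no questions.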
4. Train and precise
Now comes the exciting part! Training your own neural network and seeing the results. When you send the task to training, we split your dataset into training and testing images. This way we can evaluate the accuracy of your model.
If you’re happy with the accuracy, you’re just a few lines of code away from implementing it in your app.
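If you are curious what such a split looks like, the sketch below shows one common way to do it. It is not a description of what our service does internally: the 20% test fraction, the per-category shuffling and the function names `split_dataset` and `accuracy` are all assumptions made for illustration.

```python
import random


def split_dataset(images_by_category, test_fraction=0.2, seed=0):
    """Split each category separately so both parts contain every category."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = {}, {}
    for category, images in images_by_category.items():
        shuffled = list(images)
        rng.shuffle(shuffled)
        # Hold out at least one image per category for testing.
        n_test = max(1, round(len(shuffled) * test_fraction))
        test[category] = shuffled[:n_test]
        train[category] = shuffled[n_test:]
    return train, test


def accuracy(predicted, expected):
    """Fraction of test images whose predicted category matches the true one."""
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)
```

The model never sees the test images during training, so the accuracy measured on them is an estimate of how it will behave on new images.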
If you want to achieve higher accuracy, you can clone the task or create a new one and train it on an improved dataset.
To wrap up, you will achieve high accuracy by:
- having a diverse dataset
- cleaning and properly structuring the data
- using real images, similar to the ones you will then send to the API for classification.