Custom Solutions

In an ocean of visual search startups, we’ve taken a different, dare we say pioneering, path. We see the future as many narrow, deep AI challenges, and we are solving the hardest problems in computer vision.

Well-Defined Success Criteria


Initial call and exploration of the problem to be solved.



On-site whiteboarding workshop with you and our expert team.



The first deliverable is a prototyping plan with measurable acceptance criteria.



A working prototype takes 1 to 3 months, depending on complexity.


5. Full Solution

Results from the prototype are refined by our expert team with input from you.



Commercial deployment, followed by ongoing maintenance and updates.


Smart approach is what makes us different

Given a field (items with categories and features), our experience helps us choose the best approach to build a robust system with good accuracy. For example, suppose the objective is to recognize 1,000 different items in a new field. The options we would consider are:

Flat classifier into 1,000 classes

A straightforward approach, applied quite often by machine learning teams. It needs a lot of training data to crunch. It is also quite static: adding new items requires retraining. The accuracy of this approach decreases significantly as the number of items grows.

Taxonomy of categories

Categories and subcategories make it possible to build a hierarchical classification system. Each classification step is simpler and more accurate than a flat classifier, but the overall recognition is incorrect if any one of the consecutive classification steps is incorrect.
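To make this trade-off concrete, here is a small sketch of how per-step errors compound along a hierarchy. The accuracy figures are illustrative assumptions, not measurements, and the steps are assumed to make errors independently.

```python
from math import prod

def chain_accuracy(step_accuracies):
    """Probability that every step in the hierarchy is correct,
    assuming the steps make errors independently (a simplification)."""
    return prod(step_accuracies)

# Hypothetical three-level taxonomy: category -> subcategory -> item.
steps = [0.98, 0.96, 0.95]
overall = chain_accuracy(steps)
print(f"overall accuracy: {overall:.3f}")  # prints "overall accuracy: 0.894"
```

Even though every step is at least 95% accurate in this example, the end-to-end result is below 90%, which is why the depth of the taxonomy matters.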

Recognize attributes of the items

In the case of fasteners, for example, the attributes could be length, diameter, type of head, material, and type of drive. This approach is robust, and you can add new items easily. However, it is very sensitive to the correct selection of items included in the training set.
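A minimal sketch of the idea, assuming each attribute is predicted separately and the results are then matched against a product catalog. The catalog entries, field names, and the format of the predicted attributes are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fastener:
    sku: str
    length_mm: int
    diameter_mm: int
    head: str
    material: str
    drive: str

# Hypothetical catalog; adding an item is a data change, not a retraining job.
CATALOG = [
    Fastener("A-100", 20, 4, "pan", "steel", "phillips"),
    Fastener("A-101", 20, 4, "flat", "steel", "torx"),
    Fastener("B-200", 40, 6, "hex", "brass", "hex"),
]

def match(predicted, catalog=CATALOG):
    """Return catalog items that agree with every predicted attribute."""
    return [
        item for item in catalog
        if all(getattr(item, name) == value for name, value in predicted.items())
    ]

# Attributes as they might come out of per-attribute recognizers (assumed format).
predicted = {"length_mm": 20, "diameter_mm": 4, "head": "flat", "drive": "torx"}
print([item.sku for item in match(predicted)])  # prints ['A-101']
```

Because the recognizers learn attributes rather than item identities, a new catalog entry built from already known attribute values can be matched without retraining.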

Visual search

Extract hidden features from the images and search the collection for images with similar features. The results can be fairly rough, but they may narrow down the candidates in some use cases.
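As a rough illustration, the search step can be sketched as ranking stored feature vectors by cosine similarity to the query vector. The three-dimensional vectors and file names below are made up for the example. Real feature vectors would come from a neural network and have far more dimensions, and cosine similarity is only one possible choice of measure.

```python
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query, collection, top_k=2):
    """Rank collection images by feature similarity to the query, best first."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in collection.items()]
    return sorted(scored, reverse=True)[:top_k]

# Hypothetical feature vectors for three images in the collection.
collection = {
    "bolt_01.jpg": [0.9, 0.1, 0.0],
    "bolt_02.jpg": [0.8, 0.2, 0.1],
    "washer_01.jpg": [0.0, 0.1, 0.9],
}
query = [1.0, 0.1, 0.0]
results = search(query, collection)
print([name for _, name in results])  # prints ['bolt_01.jpg', 'bolt_02.jpg']
```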

Given a new area, we choose the best of these approaches or a combination of them.

Gartner study:
Automation on the Rise

According to Gartner, by 2021 early-adopter brands that redesign their websites to support visual and voice search will increase digital commerce revenue by up to 30%. The study, Brand Relevance Under Fire, Automation on the Rise, was published at the end of 2017.

Solid Training Data

We have the process, team, and tools to acquire and prepare high-quality training data for the task. Our team consists of tens of annotation specialists whose work can be directly guided by your domain knowledge. We know that more data usually helps, but proper selection and preparation of the training data is even more important than the volume. During the labeling process, we use AI tools to assist our team. These smart annotation tools let us obtain highly accurate models while speeding up the entire workflow.


1. Initially, we receive data from you. It might not be in perfect shape.

2. Ximilar provides a team of skilled editors to prepare the training data.

3. We reach the required criteria and declare success. Or we return to step 2.
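The loop between preparing data and checking the acceptance criteria can be sketched as follows. This is only an illustration: `prepare` and `train_and_evaluate` are hypothetical stand-ins, and the accuracy numbers are invented so that the example runs.

```python
def run_until_accepted(prepare, train_and_evaluate, target_accuracy, max_rounds=5):
    """Repeat data preparation and training until the acceptance criterion is met."""
    data = None
    for round_number in range(1, max_rounds + 1):
        data = prepare(data)                 # step 2: editors clean and label the data
        accuracy = train_and_evaluate(data)  # measure against the agreed criterion
        if accuracy >= target_accuracy:      # step 3: success
            return round_number, accuracy
    return None  # criterion not reached within max_rounds

# Toy stand-ins: "data" is just a count of completed preparation passes,
# and each pass is assumed to add five points of accuracy.
def prepare(passes):
    return (passes or 0) + 1

def train_and_evaluate(passes):
    return min(0.99, 0.80 + 0.05 * passes)

result = run_until_accepted(prepare, train_and_evaluate, target_accuracy=0.92)
print(result)
```

In this toy setup the criterion is met in the third round. In a real project, the criterion would be the measurable acceptance criteria agreed in the prototyping plan.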


More than a Recognition Technology

Our technology is based on the latest research results. The training and evaluation process is fully automated. We use NVIDIA GPUs to speed up the training of the most advanced deep learning models in TensorFlow and PyTorch.

Nowadays, anyone has access to the latest research results, and the architectures of the neural networks are not revolutionary. What makes our technology unique is an accurate, fast, and fully automated training process. With the support of our engineers and researchers, you will always have models available.

We use state-of-the-art neural network models and machine learning techniques, and we constantly try to improve the technology so that you always have the best quality available. Each model has millions of parameters, which can be processed on a CPU or a GPU. Our intelligent algorithm picks from several models and uses the best-performing one.
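The text does not describe how the selection works. One common way to do it, shown here purely as an assumption, is to score each candidate on held-out validation data and keep the winner. The candidate models below are toy functions, not real networks.

```python
def pick_best(candidates, validation_set):
    """Return the name and accuracy of the candidate that scores highest on held-out data."""
    def accuracy(predict):
        correct = sum(predict(x) == label for x, label in validation_set)
        return correct / len(validation_set)

    scores = {name: accuracy(predict) for name, predict in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Toy stand-ins for trained models: each maps an input to a predicted label.
candidates = {
    "model_a": lambda x: x > 5,
    "model_b": lambda x: x > 3,
}
validation_set = [(1, False), (4, True), (6, True), (2, False)]
print(pick_best(candidates, validation_set))  # prints ('model_b', 1.0)
```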

  • TensorFlow
  • Docker
  • NVIDIA
  • IBM Cloud
  • Python

Reliable Deployment by Default

Our system, deployed on a single standard server, can serve over 100 parallel recognition requests per second. Response times are below 100 ms (excluding image upload) for each request. Dozens of large customers and over a thousand small users on the free plan are the proof: the system has handled hundreds of requests per second for a few years.