Ximilar Now Combines Visual and Text-to-Image Search

E-commerce retailers using our search engine now have access to multilingual text search as well.

Zuzana Raidová June 24, 2025

Visual search is one of Ximilar’s core technologies, with numerous applications, including reverse image search, product recommendations, and image matching, across various industries such as fashion, home decor, collectables, and stock photography.

A typical use case is helping e-commerce customers find products through image-based searches. However, as competition in e-commerce grows, the most powerful search engines now seamlessly integrate image and text queries, analysing both while contextualising results across multiple languages.

That’s why we’ve upgraded our visual search solutions to support multilingual text-to-image search alongside image-based search. In this article, I’ll walk you through the upgraded services and how they enhance search engines powered by Ximilar.

Benefits of Multilingual Text Search

Integrating image and text-based search helps create a more intuitive shopping experience, making product discovery faster and more accurate. Customers can describe what they’re looking for in natural language, and the system retrieves matching images.

The benefits of multilingual text-to-image search include:

  • Understanding queries using AI-powered multilingual processing.
  • Matching text descriptions with visually similar products.
  • Filtering by metadata (e.g., brand, category, or supplier).

Solutions Integrating Multilingual Text-to-Image Search

Fashion Search

Fashion e-commerce relies heavily on visual aesthetics, making AI automation essential for managing nearly all customer-facing content, from product listings to newsletters.

Ximilar’s Fashion Search is an all-in-one solution for automating cataloguing, tagging, and searching of fashion apparel. Its basic functions are:

  • Detection of fashion items in uploaded images.
  • Identification of the largest or most prominent item.
  • Retrieval of the exact or similar products from your catalogue based on visual features.

The basic Fashion Search is ideal for shoppers who want to find products based on a reference image. Our customers also combine it with:

  • Fashion Tagging – One of our most advanced e-commerce solutions, providing hundreds of relevant keywords that can be used not only for sorting and filtering on your site, but also in automated product descriptions.
  • Multilingual Text Search – Allows customers to search with queries in natural language, retrieving matching images.

Multilingual text-to-image search is powered by natural language processing, enabling it to understand phrases and sentences in any common language. When combined with Fashion Search and Tagging, it enhances search accuracy and delivers the most relevant results to users.

Stock Photo Search

The stock photo industry is highly competitive. Successful platforms require vast databases, often with millions of high-quality images. Users expect accurate, intuitive, and fast search, and often migrate to larger providers that keep pace with technology. Therefore, a powerful search engine is critical.

With Ximilar, stock photo sites like StockPhotos.com automate:

Example: Using Natural Language to Search Stock Photos

Let’s explore the text-to-image search feature in the Ximilar Demo using natural language queries. The system processes text inputs, interprets their meaning, and retrieves the most visually relevant images from a stock photo collection, provided for demo purposes by a stock photo database.

First, I enter a simple query: “puppy photo”. As expected, the demo returns a variety of mostly real-life puppy images.

Text-to-image search on the Ximilar Stock Photo Search demo provides images from a stock photo database.

Next, I try “un petit chien” – French for “a small dog.” This time, the results include a mix of AI-generated and digital illustrations alongside real photos. Since I didn’t specify “photo,” the system broadens the results. Additionally, the images feature both adult small dogs and puppies, as I didn’t specify an age.

Testing multilingual text-to-image search on a text search query in French.

Now, let’s refine the search further. I want “puppy illustrations” exclusively. Let’s see what the system delivers…

Text-to-image search is based on natural language processing, allowing the AI search engine to understand the broader context of words and deliver relevant results.

Other Product Search Solutions

All our visual search & product similarity services now include multilingual text search, enabling customers to find products using natural language. This feature works seamlessly across various product categories.

Benefits of Ximilar’s text-to-image search:

  • Trained on product images from major marketplaces like Amazon.
  • Understands queries in multiple languages using AI-powered language processing.
  • Matches text descriptions with visually similar products from your catalog.
  • Supports metadata-based filtering (e.g., brand, category, price).

By combining image-based and text-based search, retailers can enhance product discovery, offering a seamless and efficient shopping experience for a global audience.

How to Access Ximilar’s Text-to-Image Search

I Already Use Ximilar’s Visual Search

Multilingual text-to-image search is now integrated into our existing visual search solutions. If you’re already using them, no changes are required. The feature is automatically available under the Free plan for Stock Photo Search and Product Search, and under the Business 100K plan for Fashion Search.

If any updates require switching to a different endpoint or adjustments on your side, you’ll receive a personal email notification to ensure a smooth transition.

I Want to Test It First

Ximilar’s visual search solutions for e-commerce are available for free testing through:

For security reasons, daily limits apply to public demos. However, registered Ximilar App users can access extended testing. Creating an account is free, and pricing for paid services is available on our Pricing page and Plan Setup page in the app, both of which include a cost-optimization calculator.

How Multilingual Text-to-Image Search Works

Ximilar’s text-to-image search processes user input, understands its intent, and retrieves the most visually relevant images or products from your collection. The results closely match the described content in terms of visual similarity. This approach is ideal for customers searching without an image, refining visual results using descriptive language, or filtering products through metadata such as brand, category, or price.

Making a Text-to-Image Search Request

To perform a text-based visual search, send a POST request with the text query, optional filters, and the metadata fields to return. Ximilar offers the following endpoints for different types of product searches:

  • /similarity/text/fashion/v2/text for fashion apparel
  • /similarity/text/photo/v2/text for photos, including stock images
  • /similarity/text/products/v2/text for packshots
  • /similarity/homedecor/text/v2/text for home decor and furniture

More categories are coming soon. If you don’t find your category in our API documentation, please let us know.
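
If you call more than one of these endpoints from the same application, it can help to keep the paths in one place. The following is only a minimal Python sketch: the endpoint paths are copied from the list above and the base URL from the curl examples in this article, while the dictionary keys (`fashion`, `photo`, `products`, `homedecor`) and the helper name `text_search_url` are labels I made up for illustration.

```python
# Base URL as used in the curl examples in this article.
BASE_URL = "https://api.ximilar.com"

# Endpoint paths copied from the list above. The dictionary keys are
# hypothetical labels chosen for this sketch, not official identifiers.
TEXT_SEARCH_ENDPOINTS = {
    "fashion": "/similarity/text/fashion/v2/text",
    "photo": "/similarity/text/photo/v2/text",
    "products": "/similarity/text/products/v2/text",
    "homedecor": "/similarity/homedecor/text/v2/text",
}


def text_search_url(category: str) -> str:
    """Return the full text search URL for one of the categories above."""
    try:
        return BASE_URL + TEXT_SEARCH_ENDPOINTS[category]
    except KeyError:
        known = ", ".join(sorted(TEXT_SEARCH_ENDPOINTS))
        raise ValueError(
            f"Unknown category {category!r}; expected one of: {known}"
        ) from None


if __name__ == "__main__":
    print(text_search_url("photo"))
```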

Example: Fashion Apparel Search

For fashion-related searches, use the following request. Don’t forget to replace “__APITOKEN__” with your authentication token, which you can find in your Ximilar App account, and “__YOURCOLLECTIONID__” with the ID of your collection.

curl --request POST \
  --url https://api.ximilar.com/similarity/text/fashion/v2/text \
  --header 'authorization: Token __APITOKEN__' \
  --header 'collection-id: __YOURCOLLECTIONID__' \
  --header 'content-type: application/json' \
  --data '{
    "query_record": {
        "_text_data": "a blue pullover with white stripes pattern"
    },
    "filter": {
        "supplierid": {
            "$lte": 600
        }
    },
    "fields_to_return": [
        "_id",
        "_url",
        "supplierid"
    ],
    "k": 3
}'
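
If you prefer Python to curl, the same request could be assembled roughly as follows. This is a sketch using only the standard library, not an official client: the URL, headers, and JSON body mirror the curl example above, while the function names `build_text_search_request` and `send` are my own, and the filter on `supplierid` is just the sample filter from that example.

```python
import json
import urllib.request

FASHION_TEXT_SEARCH_URL = "https://api.ximilar.com/similarity/text/fashion/v2/text"


def build_text_search_request(api_token, collection_id, text, k=3):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "query_record": {"_text_data": text},
        # Sample filter from the curl example: supplierid <= 600.
        "filter": {"supplierid": {"$lte": 600}},
        "fields_to_return": ["_id", "_url", "supplierid"],
        "k": k,
    }
    return urllib.request.Request(
        url=FASHION_TEXT_SEARCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {api_token}",
            "collection-id": collection_id,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    request = build_text_search_request(
        "__APITOKEN__",
        "__YOURCOLLECTIONID__",
        "a blue pullover with white stripes pattern",
    )
    # Inspect the request locally; call send(request) once real credentials are set.
    print(request.get_method(), request.full_url)
```

Building the request separately from sending it makes it easy to check the payload before spending any API credits.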

Understanding the Response

The API returns the most relevant products along with metadata such as product ID, image URL, and supplier ID.

Here is an example JSON response to the fashion text-to-image search request:

{
    "status": {
        "code": 200,
        "text": "OK",
        "proc_id": "d820b7a9-4b93-4a85-a4ed-d96731bc471e"
    },
    "statistics": {
        "OperationTime": 244,
        "processing time": 0.314
    },
    "answer_records": [
        {
            "_id": "65732555",
            "_url": "_URL_1_",
            "supplierid": 450
        },
        {
            "_id": "88182211",
            "_url": "_URL_2_",
            "supplierid": 451
        },
        {
            "_id": "106629458",
            "_url": "_URL_3_",
            "supplierid": 200
        }
    ],
    "answer_distances": [
        1.1622,
        1.1651,
        1.1667
    ],
    "answer_count": 3
}

Key Response Elements

  • answer_records – List of matching products with their IDs, image URLs, and supplier IDs.
  • answer_distances – Distances between the query and the returned records; lower values indicate better matches.
  • answer_count – Number of results returned.
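
To work with these elements in code, you might pair each record with its distance. The sketch below assumes that `answer_distances` lines up with `answer_records` by position, which is how the example response above reads but is not something this article states explicitly. The function name `ranked_results` is hypothetical, and the sample data is a shortened copy of the response above.

```python
def ranked_results(response: dict) -> list[tuple[dict, float]]:
    """Pair each returned record with its distance, best match first.

    Assumes answer_records and answer_distances are aligned by position.
    """
    records = response.get("answer_records", [])
    distances = response.get("answer_distances", [])
    if len(records) != len(distances):
        raise ValueError("answer_records and answer_distances differ in length")
    # Lower distance means a better match, so sort in ascending order.
    return sorted(zip(records, distances), key=lambda pair: pair[1])


# Shortened copy of the example response above.
sample_response = {
    "status": {"code": 200, "text": "OK"},
    "answer_records": [
        {"_id": "65732555", "_url": "_URL_1_", "supplierid": 450},
        {"_id": "88182211", "_url": "_URL_2_", "supplierid": 451},
        {"_id": "106629458", "_url": "_URL_3_", "supplierid": 200},
    ],
    "answer_distances": [1.1622, 1.1651, 1.1667],
    "answer_count": 3,
}

if __name__ == "__main__":
    for record, distance in ranked_results(sample_response):
        print(record["_id"], record["_url"], distance)
```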

Example: Stock Photo Search

For stock photo or general image searches, use this request:

curl --request POST \
  --url https://api.ximilar.com/similarity/text/photo/v2/text \
  --header 'authorization: Token __APITOKEN__' \
  --header 'collection-id: __YOURCOLLECTIONID__' \
  --header 'content-type: application/json' \
  --data '{
    "query_record": {
        "_text_data": "a landscape photo of mountains",
        "_dominant_colors": {
            "luv_colors": [
                [0, 0, 0],
                [100, 0, 0]
            ],
            "percentages": [0.6, 0.4]
        }
    },
    "filter": {
        "supplierid": {
            "$lte": 600
        }
    },
    "fields_to_return": [
        "_id",
        "_url",
        "supplierid"
    ],
    "k": 3
}'

In addition to the text query, this request passes dominant colors in the query record (as LUV values with their percentages), so the results can match specific aesthetic requirements.
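
The `_dominant_colors` part of the query record can be assembled with a small helper. This is a sketch under assumptions: the field names come from the request above, but the checks (three components per color, percentages adding up to 1) are my reading of that single example rather than documented rules, and the helper name `dominant_colors` is made up. In CIELUV, [0, 0, 0] is black and [100, 0, 0] is white, so the example asks for a mostly black image with some white.

```python
import math


def dominant_colors(colors_with_share):
    """Build a _dominant_colors object from (luv_color, share) pairs.

    The validation rules are assumptions inferred from the example request.
    """
    luv_colors = [list(color) for color, _ in colors_with_share]
    percentages = [share for _, share in colors_with_share]
    if any(len(color) != 3 for color in luv_colors):
        raise ValueError("each LUV color needs exactly three components")
    if not math.isclose(sum(percentages), 1.0, abs_tol=1e-6):
        raise ValueError("percentages are expected to add up to 1")
    return {"luv_colors": luv_colors, "percentages": percentages}


# Same query record as in the request above: 60% black, 40% white.
query_record = {
    "_text_data": "a landscape photo of mountains",
    "_dominant_colors": dominant_colors([((0, 0, 0), 0.6), ((100, 0, 0), 0.4)]),
}

if __name__ == "__main__":
    print(query_record)
```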

What’s the Cost?

While maintenance and features remain unchanged, the costs are now lower. The upgraded visual search system requires less computing power, so processing your images with these solutions is now more affordable. A text-to-image search request costs just 10 API Credits.

Our pricing page provides a live-updated list of solutions, operations, endpoints, and costs:

Let Us Know What You Think!

We’re excited to share this upgrade with our users, both old and new. We look forward to seeing these new capabilities enhance your search engines and help you stay ahead in the competitive world of e-commerce and online retail.

If you need assistance or have questions about our solutions, don’t hesitate to reach out!

Zuzana Raidová

Head of Marketing

Zuzana is a marketing specialist, biologist, and illustrator addicted to reading and hiking. At Ximilar, she takes care of web content and communication, strives to keep the KPIs high, and the office temperature low. She likes science, kung fu movies, and rain.
