Ximilar Now Combines Visual and Text-to-Image Search
E-commerce retailers using our search engine now have access to multilingual text search as well.


Visual search is one of Ximilar’s core technologies, with numerous applications, including reverse image search, product recommendations, and image matching, across various industries such as fashion, home decor, collectables, and stock photography.
A typical use case is helping e-commerce customers find products through image-based searches. However, as competition in e-commerce grows, the most powerful search engines now seamlessly integrate image and text queries, analysing both while contextualising results across multiple languages.
That’s why we’ve upgraded our visual search solutions to support multilingual text-to-image search alongside image-based search. In this article, I’ll walk you through the upgraded services and how they enhance search engines powered by Ximilar.
Benefits of Multilingual Text Search
Integrating image and text-based search helps create a more intuitive shopping experience, making product discovery faster and more accurate. Customers can describe what they’re looking for in natural language, and the system retrieves matching images.
The benefits of multilingual text-to-image search include:
- Understanding queries using AI-powered multilingual processing.
- Matching text descriptions with visually similar products.
- Support for filtering by metadata (e.g., brand, category, or supplier).
Solutions Integrating Multilingual Text-to-Image Search
Fashion Search
Fashion e-commerce relies heavily on visual aesthetics, making AI automation essential for managing nearly all customer-facing content, from product listings to newsletters.
Ximilar’s Fashion Search is an all-in-one solution for automating cataloguing, tagging, and searching of fashion apparel. Its basic functions are:
- Detection of fashion items in uploaded images.
- Identification of the largest or most prominent item.
- Returning the exact or similar products from your catalogue based on visual features.
The basic Fashion Search is ideal for shoppers who want to find products based on a reference image. Our customers also combine it with:
- Fashion Tagging – One of our most advanced e-commerce solutions, providing hundreds of relevant keywords that can be used not only for sorting and filtering on your site, but also in the automated product descriptions.
- Multilingual Text Search – Allows customers to search with queries in natural language, retrieving matching images.

Stock Photo Search
The stock photo industry is highly competitive. Successful platforms require vast databases, often with millions of high-quality images. Users expect accurate, intuitive, and fast search, and often migrate to larger providers that keep pace with technology. Therefore, a powerful search engine is critical.
With Ximilar, stock photo sites like StockPhotos.com automate:
- Categorisation and tagging of real-life photos, illustrations, and packshots.
- Visual and multilingual text-to-image search.
- Image recommendations based on topic similarity, colour palette, overall aesthetics, and image matching.
Example: Using Natural Language to Search Stock Photos
Let’s explore the text-to-image search feature in the Ximilar Demo using natural language queries. The system processes text inputs, interprets their meaning, and retrieves the most visually relevant images from a stock photo collection, provided for demo purposes by a stock photo database.
First, I enter a simple query: “puppy photo”. As expected, the demo returns a variety of mostly real-life puppy images.

Next, I try “un petit chien” – French for “a small dog.” This time, the results include a mix of AI-generated and digital illustrations alongside real photos. Since I didn’t specify “photo,” the system broadens the results. Additionally, the images feature both adult small dogs and puppies, as I didn’t specify an age.

Now, let’s refine the search further. I want “puppy illustrations” exclusively. Let’s see what the system delivers…

Other Product Search Solutions
All our visual search & product similarity services now include multilingual text search, enabling customers to find products using natural language. This feature works seamlessly across various product categories.
Benefits of Ximilar’s text-to-image search:
- Trained on product images from major marketplaces like Amazon.
- Understands queries in multiple languages using AI-powered language processing.
- Matches text descriptions with visually similar products from your catalog.
- Supports metadata-based filtering (e.g., brand, category, price).
By combining image-based and text-based search, retailers can enhance product discovery, offering a seamless and efficient shopping experience for a global audience.
How to Access Ximilar’s Text-to-Image Search
I Already Use Ximilar’s Visual Search
Multilingual text-to-image search is now integrated into our existing visual search solutions. If you’re already using them, no changes are required: this feature is automatically available under the Free plan for Stock Photo Search and Product Search, and under the Business 100K plan for Fashion Search.
If any updates require switching to a different endpoint or adjustments on your side, you’ll receive a personal email notification to ensure a smooth transition.
I Want to Test it First
Ximilar’s visual search solutions for e-commerce are available for free testing through:
- Public demos on service pages and a dedicated demo page.
- The Ximilar App, with a drag-and-drop testing form for each solution.
For security reasons, daily limits apply to public demos. However, registered Ximilar App users can access extended testing. Creating an account is free, and pricing for paid services is available on our Pricing page and Plan Setup page in the app, both of which include a cost-optimization calculator.
How Multilingual Text-to-Image Search Works
Ximilar’s text-to-image search processes user input, understands its intent, and retrieves the most visually relevant images or products from your collection. The results closely match the described content in terms of visual similarity. This approach is ideal for customers searching without an image, refining visual results using descriptive language, or filtering products through metadata such as brand, category, or price.
Making a Text-to-Image Search Request
To perform a text-based visual search, send a POST request with the text query, optional filters, and the metadata fields to return. Ximilar offers the following endpoints for different types of product searches:
- /similarity/text/fashion/v2/text – for fashion apparel
- /similarity/text/photo/v2/text – for photos, including stock images
- /similarity/text/products/v2/text – for packshots
- /similarity/homedecor/text/v2/text – for home decor and furniture
More categories are coming soon. If you don’t find your category in our API documentation, please let us know.
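If you switch between several of these endpoints from one client, a small lookup table can keep the paths in one place. The following is only a sketch in Python: the category keys (`fashion`, `photo`, `products`, `homedecor`) and the helper name `text_search_url` are hypothetical labels chosen for illustration, while the paths and the base URL are the ones used in this article.

```python
# Base URL as used in the curl examples in this article.
BASE_URL = "https://api.ximilar.com"

# Endpoint paths copied from the list above.
# The dictionary keys are hypothetical labels, not official identifiers.
TEXT_SEARCH_ENDPOINTS = {
    "fashion": "/similarity/text/fashion/v2/text",
    "photo": "/similarity/text/photo/v2/text",
    "products": "/similarity/text/products/v2/text",
    "homedecor": "/similarity/homedecor/text/v2/text",
}


def text_search_url(category: str) -> str:
    """Return the full text search URL for one of the listed categories."""
    try:
        return BASE_URL + TEXT_SEARCH_ENDPOINTS[category]
    except KeyError:
        raise ValueError(
            f"No text search endpoint listed for {category!r}"
        ) from None
```

Calling `text_search_url("photo")` returns the stock photo endpoint, and an unknown category raises a `ValueError` instead of silently producing a wrong URL.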
Example: Fashion Apparel Search
For fashion-related searches, use the following request. Don’t forget to replace “__APITOKEN__” with your authentication token, which you can find in your Ximilar App account, and “__YOURCOLLECTIONID__” with the ID of your collection.
curl --request POST \
  --url https://api.ximilar.com/similarity/text/fashion/v2/text \
  --header 'authorization: Token __APITOKEN__' \
  --header 'collection-id: __YOURCOLLECTIONID__' \
  --header 'content-type: application/json' \
  --data '{
    "query_record": {
      "_text_data": "a blue pullover with white stripes pattern"
    },
    "filter": {
      "supplierid": {
        "$lte": 600
      }
    },
    "fields_to_return": [
      "_id",
      "_url",
      "supplierid"
    ],
    "k": 3
  }'
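The same request can be assembled from application code. Below is a minimal sketch using only the Python standard library. The function names `build_text_search_request` and `send`, and the parameter names, are hypothetical; the URL, headers, and payload keys mirror the curl example above. Building and sending are kept separate so the request can be inspected without touching the network.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
FASHION_TEXT_URL = "https://api.ximilar.com/similarity/text/fashion/v2/text"


def build_text_search_request(api_token, collection_id, text, k=3,
                              metadata_filter=None,
                              fields=("_id", "_url", "supplierid")):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "query_record": {"_text_data": text},
        "fields_to_return": list(fields),
        "k": k,
    }
    if metadata_filter is not None:
        # Passed through unchanged, e.g. {"supplierid": {"$lte": 600}}.
        payload["filter"] = metadata_filter
    return urllib.request.Request(
        FASHION_TEXT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {api_token}",
            "collection-id": collection_id,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `send(build_text_search_request(token, collection_id, "a blue pullover with white stripes pattern", metadata_filter={"supplierid": {"$lte": 600}}))` would then correspond to the curl command, assuming valid credentials.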
Understanding the Response
The API returns the most relevant products along with metadata such as product ID, image URL, and supplier ID.
Here is an example JSON response to the fashion text-to-image search request above:
{
  "status": {
    "code": 200,
    "text": "OK",
    "proc_id": "d820b7a9-4b93-4a85-a4ed-d96731bc471e"
  },
  "statistics": {
    "OperationTime": 244,
    "processing time": 0.314
  },
  "answer_records": [
    {
      "_id": "65732555",
      "_url": "_URL_1_",
      "supplierid": 450
    },
    {
      "_id": "88182211",
      "_url": "_URL_2_",
      "supplierid": 451
    },
    {
      "_id": "106629458",
      "_url": "_URL_3_",
      "supplierid": 200
    }
  ],
  "answer_distances": [
    1.1622,
    1.1651,
    1.1667
  ],
  "answer_count": 3
}
Key Response Elements
- answer_records – List of matching products with their IDs, image URLs, and supplier IDs.
- answer_distances – Similarity scores; lower values indicate better matches.
- answer_count – Number of results returned.
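To work with these elements in code, you can pair each record with its distance. The sketch below is in Python and rests on one assumption: that `answer_records` and `answer_distances` are aligned by position, as they appear to be in the sample response. The function name `rank_results` is hypothetical.

```python
from typing import Any


def rank_results(response: dict[str, Any]) -> list[tuple[float, dict[str, Any]]]:
    """Pair each record with its distance, best (lowest distance) first.

    Assumes answer_records[i] corresponds to answer_distances[i].
    """
    status = response.get("status", {})
    if status.get("code") != 200:
        raise RuntimeError(f"Search failed: {status.get('text', 'unknown error')}")

    records = response.get("answer_records", [])
    distances = response.get("answer_distances", [])
    if len(records) != len(distances):
        raise ValueError("answer_records and answer_distances differ in length")

    # Lower distance means a better match, so sort ascending by distance.
    return sorted(zip(distances, records), key=lambda pair: pair[0])
```

For the sample response above, the first item returned would be the record with `_id` "65732555" and distance 1.1622.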
Example: Stock Photo Search
For stock photo or general image searches, use this request:
curl --request POST \
  --url https://api.ximilar.com/similarity/text/photo/v2/text \
  --header 'authorization: Token __APITOKEN__' \
  --header 'collection-id: __YOURCOLLECTIONID__' \
  --header 'content-type: application/json' \
  --data '{
    "query_record": {
      "_text_data": "a landscape photo of mountains",
      "_dominant_colors": {
        "luv_colors": [
          [0, 0, 0],
          [100, 0, 0]
        ],
        "percentages": [0.6, 0.4]
      }
    },
    "filter": {
      "supplierid": {
        "$lte": 600
      }
    },
    "fields_to_return": [
      "_id",
      "_url",
      "supplierid"
    ],
    "k": 3
  }'
This search allows filtering by dominant colors, ensuring results match specific aesthetic requirements.
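If you build the `query_record` programmatically, a few sanity checks can catch malformed colour data before it is sent. This Python sketch uses the field names from the request above; the helper name `build_photo_query` is hypothetical, and the checks (three components per colour, one percentage per colour, percentages summing to 1) are assumptions drawn from the shape of the example, not documented API rules.

```python
def build_photo_query(text, luv_colors, percentages):
    """Build a query_record with a text query and dominant colours.

    The validation rules are assumptions inferred from the example request.
    """
    if len(luv_colors) != len(percentages):
        raise ValueError("Provide exactly one percentage per colour")
    if any(len(color) != 3 for color in luv_colors):
        raise ValueError("Each LUV colour needs three components")
    # Assumption: percentages describe shares of the image and sum to 1.
    if abs(sum(percentages) - 1.0) > 1e-6:
        raise ValueError("Percentages are expected to sum to 1")

    return {
        "_text_data": text,
        "_dominant_colors": {
            "luv_colors": [list(color) for color in luv_colors],
            "percentages": list(percentages),
        },
    }
```

Calling `build_photo_query("a landscape photo of mountains", [[0, 0, 0], [100, 0, 0]], [0.6, 0.4])` reproduces the `query_record` from the curl example.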
What’s the Cost?
While the maintenance and features remain unchanged, the costs are now lower. The upgraded visual search system requires less computing power, so processing your images with these solutions is more affordable now. The text-to-image search requests cost just 10 API Credits.
Our pricing page provides a live-updated list of solutions, operations, endpoints, and costs.
Let Us Know What You Think!
We’re excited to share this upgrade with our users, both old and new. We look forward to seeing these new capabilities enhance your search engines and help you stay ahead in the competitive world of e-commerce and online retail.
If you need assistance or have questions about our solutions, don’t hesitate to reach out!