Written by: Elastic Tomás Murúa
In this article, we'll cover how to integrate Alibaba Cloud AI features with Elasticsearch to improve relevance in semantic searches.
Alibaba Cloud AI Search is a solution that integrates advanced AI features with Elasticsearch tools. It leverages the Qwen LLM and DeepSeek-R1 model families to provide advanced models for inference and classification. In this article, we'll use descriptions of novels and plays written by the same author to test the Alibaba reranking and sparse embedding endpoints.
Steps
1) Configure Alibaba Cloud AI
2) Create Elasticsearch mappings
3) Index data into Elasticsearch
4) Query data
5) Bonus: Answering questions with completion
Configure Alibaba Cloud AI
Alibaba Cloud AI reranking and embeddings
Through the Elasticsearch open inference API, Alibaba Cloud offers different services. In this example, we'll use the descriptions of popular books and plays by Agatha Christie to test the Alibaba Cloud embeddings and reranking endpoints in semantic search.
The Alibaba Cloud AI reranking endpoint is a semantic reranking functionality. This type of reranking uses a machine learning model to reorder search results based on their semantic similarity to a query. This allows you to use out-of-the-box semantic search capabilities on existing full-text search indices.
The sparse embedding endpoint generates sparse embeddings: vectors in which most values are zero, so the few non-zero weights highlight the most relevant terms.
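To make the idea of a sparse embedding more concrete, here is a toy Python sketch. It represents each text as a small token-to-weight map and scores a document by summing the products of the weights for tokens that appear in both maps. The tokens and weights are invented for illustration and are not output from the Alibaba model; the function name `sparse_dot` is also hypothetical.

```python
# Toy illustration of sparse embeddings as token -> weight maps.
# All weights below are made up; they are NOT real model output.

def sparse_dot(query_vec: dict[str, float], doc_vec: dict[str, float]) -> float:
    """Sum the weight products over tokens present in both vectors."""
    return sum(w * doc_vec[tok] for tok, w in query_vec.items() if tok in doc_vec)

# Only a handful of tokens carry a non-zero weight; every other
# vocabulary entry is implicitly zero and is simply not stored.
query = {"novel": 1.5, "agatha": 1.0, "christie": 1.0}
novel_doc = {"novel": 1.2, "agatha": 0.8, "christie": 0.8, "nile": 1.4}
play_doc = {"play": 1.3, "agatha": 0.8, "christie": 0.8, "theatre": 0.5}

novel_score = sparse_dot(query, novel_doc)
play_score = sparse_dot(query, play_doc)
print(novel_score > play_score)
```

In this toy setup the document that shares the "novel" token with the query scores higher, which is the kind of behavior we hope to see from the real endpoint.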
Get Alibaba Cloud API Key
We need a valid API Key to integrate Alibaba with Elasticsearch. To get it, follow these steps:
1) Access the Alibaba Cloud portal from the Service Plaza section.
2) Go to API Keys in the left menu.
3) Generate a new API Key.
Configure Alibaba Endpoints
We'll first configure the sparse embedding endpoint to transform the text descriptions into semantic vectors:
Embeddings endpoint:
PUT _inference/sparse_embedding/alibabacloud_ai_search_sparse
{
"service": "alibabacloud-ai-search",
"service_settings": {
"api_key": "<api_key>",
"service_id": "ops-text-sparse-embedding-001",
"host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
"workspace": "default"
}
}
We'll then configure the rerank endpoint to reorder the results.
Rerank Endpoint:
PUT _inference/rerank/alibabacloud_ai_search_rerank
{
"service": "alibabacloud-ai-search",
"service_settings": {
"api_key": "<api_key>",
"service_id": "ops-bge-reranker-larger",
"host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
"workspace": "default"
}
}
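If you prefer to create the endpoints from a script rather than from the Dev Console, the two requests above could be built along these lines. This is a minimal sketch using only the Python standard library; the Elasticsearch URL, the `ES_API_KEY` placeholder, and the helper name `build_inference_endpoint_request` are assumptions and not part of the original setup. The requests are only constructed here, not sent.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"        # assumption: adjust to your cluster
ES_API_KEY = "<elasticsearch_api_key>"  # placeholder for your Elasticsearch key
ALIBABA_HOST = "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com"

def build_inference_endpoint_request(task_type, inference_id, service_id, alibaba_api_key):
    """Build (but do not send) the PUT request that creates an inference endpoint."""
    body = {
        "service": "alibabacloud-ai-search",
        "service_settings": {
            "api_key": alibaba_api_key,
            "service_id": service_id,
            "host": ALIBABA_HOST,
            "workspace": "default",
        },
    }
    return urllib.request.Request(
        url=f"{ES_URL}/_inference/{task_type}/{inference_id}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"ApiKey {ES_API_KEY}",
        },
    )

sparse_req = build_inference_endpoint_request(
    "sparse_embedding", "alibabacloud_ai_search_sparse",
    "ops-text-sparse-embedding-001", "<api_key>",
)
rerank_req = build_inference_endpoint_request(
    "rerank", "alibabacloud_ai_search_rerank",
    "ops-bge-reranker-larger", "<api_key>",
)
# To actually create an endpoint, pass a request to urllib.request.urlopen(...).
```

Since both endpoints share the same service, host, and workspace, only the task type, inference ID, and service ID change between calls.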
Now that the endpoints are configured, we can prepare the Elasticsearch index.
Create Elasticsearch mappings
Let's configure the mappings. We need fields for both the description text and the model-generated vectors.
We'll use the following properties:
semantic_description: a "semantic_text" field that stores the embeddings generated by the model and is used for semantic search.
description: a "text" field that stores the descriptions of the novels and plays and is used for full-text search.
We'll include the copy_to parameter on description so that its value is also copied into semantic_description at index time. This way, both the text and the semantic field are available for hybrid search:
PUT arts
{
"mappings": {
"properties": {
"semantic_description": {
"type": "semantic_text",
"inference_id": "alibabacloud_ai_search_sparse"
},
"description": {
"type": "text",
"copy_to": "semantic_description"
}
}
}
}
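A mistyped copy_to target would mean the description text never reaches the semantic_text field. As a small sanity check, the sketch below rebuilds the same mapping as a Python dict and lists which copy_to targets of a field are actually of type semantic_text. The helper names `build_mapping` and `semantic_copy_targets` are hypothetical and exist only for this illustration.

```python
def build_mapping(inference_id: str) -> dict:
    """Return the index mapping used in this article as a Python dict."""
    return {
        "mappings": {
            "properties": {
                "semantic_description": {
                    "type": "semantic_text",
                    "inference_id": inference_id,
                },
                "description": {
                    "type": "text",
                    "copy_to": "semantic_description",
                },
            }
        }
    }

def semantic_copy_targets(mapping: dict, source_field: str) -> list[str]:
    """List the copy_to targets of source_field that are semantic_text fields."""
    props = mapping["mappings"]["properties"]
    targets = props[source_field].get("copy_to", [])
    if isinstance(targets, str):  # copy_to may be a single name or a list
        targets = [targets]
    return [t for t in targets if props.get(t, {}).get("type") == "semantic_text"]

mapping = build_mapping("alibabacloud_ai_search_sparse")
print(semantic_copy_targets(mapping, "description"))
```

An empty list would indicate that the text field is not feeding any semantic field.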
With the mappings ready, we can now index the data.
Index data into Elasticsearch
Here's the dataset with the descriptions that we'll use for this example. We'll index it using the Elasticsearch Bulk API.
POST arts/_bulk
{ "index": {} }
{ "description": " Black Coffee is a play by the British crime-fiction author Agatha Christie. In the play, a scientist discovers that someone in his household has stolen the formula for an explosive." }
{ "index": {} }
{ "description": "The Mousetrap is a murder mystery play by Agatha Christie. The play opened in London's West End in 1952 and ran continuously until 16 March 2020." }
{ "index": {} }
{ "description": "The Body in the Murder is a Miss Marple mystery novel published by Agatha Christie in 1942. The case involves the murder of two teenage girls who are similar in appearance." }
{ "index": {} }
{ "description": " Agatha Christie's last published novel before she passed, Curtain: Poirot's Last Case is also her indelible detective's last appearance. Poirot and Hastings return to the very same house from The Mysterious Affairs at Styles over 30 years later." }
{ "index": {} }
{ "description": " Death on the Nile is Agatha Christie's most daring travel mystery novel. The tranquillity of a cruise along the Nile is shattered by the discovery that Linnet Ridgeway has been shot through the head." }
{ "index": {} }
{ "description": " The Murder of Roger Ackroyd was Agatha Christie’s first book to be published by William Collins in the spring of 1926. William Collins became part of HarperCollins and are still Christie’s publishers today." }
Note that the first two documents, “Black Coffee” and “The Mousetrap”, are plays, while the others are novels.
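The Bulk API expects newline-delimited JSON: an action line followed by a document line for each document, with a final newline at the end of the body. Here is a minimal Python sketch that builds such a body from a list of descriptions. The helper name `build_bulk_body` is an assumption, and unlike the raw dataset above, the sketch strips stray leading and trailing whitespace from each description.

```python
import json

def build_bulk_body(descriptions: list[str]) -> str:
    """Build an NDJSON body for POST <index>/_bulk from plain descriptions."""
    lines = []
    for text in descriptions:
        lines.append(json.dumps({"index": {}}))  # action line
        lines.append(json.dumps({"description": text.strip()}, ensure_ascii=False))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"

bulk_body = build_bulk_body([
    " Black Coffee is a play by the British crime-fiction author Agatha Christie.",
    "The Mousetrap is a murder mystery play by Agatha Christie.",
])
print(bulk_body)
```

The resulting string can be sent as the request body with the Content-Type header set to application/x-ndjson.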
Query data
To compare the results we can get, we'll run three types of queries: a semantic query, a standalone reranking request, and finally a single search that combines both. We'll use the same question each time, "Which novel was written by Agatha Christie?", expecting to get the three documents that explicitly say "novel" first, followed by the one that says "book". The two plays should be the last results.
Semantic search
We'll begin querying the semantic_text field to ask: "Which novel was written by Agatha Christie?" Let's see what happens:
GET /arts/_search
{
"_source": {
"includes": [
"description"
]
},
"query": {
"semantic": {
"field": "semantic_description",
"query": "Which novel was written by Agatha Christie?"
}
}
}
Response:
{
"took": 1246,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 6,
"relation": "eq"
},
"max_score": 0.1759066,
"hits": [
{
"_index": "arts",
"_id": "rdJ4-ZMB36zj9EVTnMgJ",
"_score": 0.1759066,
"_source": {
"description": " Death on the Nile is Agatha Christie's most daring travel mystery novel. The tranquillity of a cruise along the Nile is shattered by the discovery that Linnet Ridgeway has been shot through the head."
}
},
{
"_index": "arts",
"_id": "rNJ4-ZMB36zj9EVTnMgJ",
"_score": 0.17499167,
"_source": {
"description": " Agatha Christie's last published novel before she passed, Curtain: Poirot's Last Case is also her indelible detective's last appearance. Poirot and Hastings return to the very same house from The Mysterious Affairs at Styles over 30 years later."
}
},
{
"_index": "arts",
"_id": "q9J4-ZMB36zj9EVTnMgJ",
"_score": 0.16319725,
"_source": {
"description": "The Body in the Murder is a Miss Marple mystery novel published by Agatha Christie in 1942. The case involves the murder of two teenage girls who are similar in appearance."
}
},
{
"_index": "arts",
"_id": "qtJ4-ZMB36zj9EVTnMgJ",
"_score": 0.15506727,
"_source": {
"description": "The Mousetrap is a murder mystery play by Agatha Christie. The play opened in London's West End in 1952 and ran continuously until 16 March 2020."
}
},
{
"_index": "arts",
"_id": "qdJ4-ZMB36zj9EVTnMgJ",
"_score": 0.14572844,
"_source": {
"description": " Black Coffee is a play by the British crime-fiction author Agatha Christie. In the play, a scientist discovers that someone in his household has stolen the formula for an explosive."
}
},
{
"_index": "arts",
"_id": "rtJ4-ZMB36zj9EVTnMgJ",
"_score": 0.13951442,
"_source": {
"description": " The Murder of Roger Ackroyd was Agatha Christie’s first book to be published by William Collins in the spring of 1926. William Collins became part of HarperCollins and are still Christie’s publishers today."
}
}
]
}
}
In this case, the response prioritized most of the novels, but the document that says “book” appears last. We can still further refine the results with reranking.
Refining results with Reranking
In this case, we'll use an _inference/rerank request to assess the documents we got in the first query and improve their rank in the results.
POST _inference/rerank/alibabacloud_ai_search_rerank
{
"query": "Which novel was written by Agatha Christie?",
"input": [
"Black Coffee is a play by the British crime-fiction author Agatha Christie. In the play, a scientist discovers that someone in his household has stolen the formula for an explosive.",
"The Mousetrap is a murder mystery play by Agatha Christie. The play opened in London's West End in 1952 and ran continuously until 16 March 2020.",
" The Body in the Murder is a Miss Marple mystery novel published by Agatha Christie in 1942. The case involves the murder of two teenage girls who are similar in appearance.",
" Agatha Christie's last published novel before she passed, Curtain: Poirot's Last Case is also her indelible detective's last appearance. Poirot and Hastings return to the very same house from The Mysterious Affairs at Styles over 30 years later.",
" Death on the Nile is Agatha Christie's most daring travel mystery novel. The tranquillity of a cruise along the Nile is shattered by the discovery that Linnet Ridgeway has been shot through the head.",
" The Murder of Roger Ackroyd was Agatha Christie’s first book to be published by William Collins in the spring of 1926. William Collins became part of HarperCollins and are still Christie’s publishers today."
]
}
Response:
{
"rerank": [
{
"index": 3,
"relevance_score": 0.91086304
},
{
"index": 4,
"relevance_score": 0.8409133
},
{
"index": 2,
"relevance_score": 0.76838577
},
{
"index": 5,
"relevance_score": 0.2295352
},
{
"index": 0,
"relevance_score": 0.13846178
},
{
"index": 1,
"relevance_score": 0.06620602
}
]
}
The response here shows that both plays are now at the bottom of the results.
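The rerank response only returns positions in the input array, so we need to map each index back to its text to read the ranking. A short Python sketch of that mapping follows, using abbreviated titles in place of the full descriptions and the scores from the response above. The helper name `order_by_rerank` is hypothetical.

```python
def order_by_rerank(inputs: list[str], rerank_response: dict) -> list[tuple[str, float]]:
    """Pair each reranked index with its original input text and score."""
    return [
        (inputs[item["index"]], item["relevance_score"])
        for item in rerank_response["rerank"]
    ]

# Abbreviated inputs, in the same order as the request above.
inputs = [
    "Black Coffee (play)",
    "The Mousetrap (play)",
    "The Body in the Murder (novel)",
    "Curtain: Poirot's Last Case (novel)",
    "Death on the Nile (novel)",
    "The Murder of Roger Ackroyd (book)",
]
response = {
    "rerank": [
        {"index": 3, "relevance_score": 0.91086304},
        {"index": 4, "relevance_score": 0.8409133},
        {"index": 2, "relevance_score": 0.76838577},
        {"index": 5, "relevance_score": 0.2295352},
        {"index": 0, "relevance_score": 0.13846178},
        {"index": 1, "relevance_score": 0.06620602},
    ]
}

ranked = order_by_rerank(inputs, response)
for text, score in ranked:
    print(f"{score:.4f}  {text}")
```

Read this way, the three documents that say "novel" come first, the one that says "book" is fourth, and the two plays are last.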
Semantic search and reranking endpoint combined
Using a retriever, we'll combine the semantic query and reranking in just one step:
POST /arts/_search
{
"_source": {
"includes": ["description"]
},
"retriever": {
"text_similarity_reranker": {
"retriever": {
"standard": {
"query": {
"semantic": {
"field": "semantic_description",
"query": "Which novel was written by Agatha Christie?"
}
}
}
},
"field": "description",
"rank_window_size": 10,
"inference_id": "alibabacloud_ai_search_rerank",
"inference_text": "Which novel was written by Agatha Christie?"
}
}
}
Response:
{
"took": 1568,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 6,
"relation": "eq"
},
"max_score": 0.91086304,
"hits": [
{
"_index": "arts",
"_id": "rNJ4-ZMB36zj9EVTnMgJ",
"_score": 0.91086304,
"_source": {
"description": " Agatha Christie's last published novel before she passed, Curtain: Poirot's Last Case is also her indelible detective's last appearance. Poirot and Hastings return to the very same house from The Mysterious Affairs at Styles over 30 years later."
}
},
{
"_index": "arts",
"_id": "rdJ4-ZMB36zj9EVTnMgJ",
"_score": 0.8409133,
"_source": {
"description": " Death on the Nile is Agatha Christie's most daring travel mystery novel. The tranquillity of a cruise along the Nile is shattered by the discovery that Linnet Ridgeway has been shot through the head."
}
},
{
"_index": "arts",
"_id": "q9J4-ZMB36zj9EVTnMgJ",
"_score": 0.76838577,
"_source": {
"description": "The Body in the Murder is a Miss Marple mystery novel published by Agatha Christie in 1942. The case involves the murder of two teenage girls who are similar in appearance."
}
},
{
"_index": "arts",
"_id": "rtJ4-ZMB36zj9EVTnMgJ",
"_score": 0.2295352,
"_source": {
"description": " The Murder of Roger Ackroyd was Agatha Christie’s first book to be published by William Collins in the spring of 1926. William Collins became part of HarperCollins and are still Christie’s publishers today."
}
},
{
"_index": "arts",
"_id": "qdJ4-ZMB36zj9EVTnMgJ",
"_score": 0.13846178,
"_source": {
"description": " Black Coffee is a play by the British crime-fiction author Agatha Christie. In the play, a scientist discovers that someone in his household has stolen the formula for an explosive."
}
},
{
"_index": "arts",
"_id": "qtJ4-ZMB36zj9EVTnMgJ",
"_score": 0.06620602,
"_source": {
"description": "The Mousetrap is a murder mystery play by Agatha Christie. The play opened in London's West End in 1952 and ran continuously until 16 March 2020."
}
}
]
}
}
These results differ from those of the semantic query. The document that says “book” rather than "novel" (The Murder of Roger Ackroyd) now ranks higher than in the first semantic search. Both plays are still the last results, just as with the standalone reranking request.
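In the combined request, the question appears twice: once in the inner semantic query and once as inference_text for the reranker. A small Python sketch that builds the search body from a single question string keeps the two in sync. The function name `build_rerank_search` is an assumption; the field names and inference ID are the ones used in this article.

```python
def build_rerank_search(question: str, rank_window_size: int = 10) -> dict:
    """Build the search body: semantic retrieval, then reranking of the top hits."""
    return {
        "_source": {"includes": ["description"]},
        "retriever": {
            "text_similarity_reranker": {
                # First stage: semantic query on the semantic_text field.
                "retriever": {
                    "standard": {
                        "query": {
                            "semantic": {
                                "field": "semantic_description",
                                "query": question,
                            }
                        }
                    }
                },
                # Second stage: rerank the top hits using the plain text field.
                "field": "description",
                "rank_window_size": rank_window_size,
                "inference_id": "alibabacloud_ai_search_rerank",
                "inference_text": question,
            }
        },
    }

search_body = build_rerank_search("Which novel was written by Agatha Christie?")
```

The rank_window_size value controls how many first-stage hits are passed to the reranker; with only six documents in the index, the value of 10 used here covers all of them.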
Bonus: Answering questions with completion
With embeddings and reranking we can satisfy a search query, but the user still sees a list of search results rather than the actual answer.
With the examples provided, we are one step away from a RAG implementation, where we pass the top results and the question to an LLM to generate an answer.
Fortunately, Alibaba Cloud AI Service also provides an endpoint service we can use to achieve this purpose.
Let's create the endpoint.
Completion endpoint:
1) Create a completion endpoint with Alibaba Cloud Qwen LLM
PUT _inference/completion/alibabacloud_ai_search_completion
{
"service": "alibabacloud-ai-search",
"service_settings": {
"host" : "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
"api_key": "<api_key>",
"service_id": "ops-qwen-turbo",
"workspace" : "default"
}
}
2) We can also create one with DeepSeek-R1
PUT _inference/completion/alibabacloud_ai_search_completion_deepseek_r1
{
"service": "alibabacloud-ai-search",
"service_settings": {
"host" : "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
"api_key": "<api_key>",
"service_id": "deepseek-r1",
"workspace" : "default"
}
}
And now, send the results and question from the previous query:
Query with Alibaba Cloud Qwen LLM
POST _inference/completion/alibabacloud_ai_search_completion
{
"input": """
Answer the following question using the context provided:
QUESTION: Which novel was written by Agatha Christie?
CONTEXT:
DOCUMENT1
Black Coffee is a play by the British crime-fiction author Agatha Christie. In the play, a scientist discovers that someone in his household has stolen the formula for an explosive.
DOCUMENT2
The Mousetrap is a murder mystery play by Agatha Christie. The play opened in London's West End in 1952 and ran continuously until 16 March 2020.
DOCUMENT3
The Body in the Murder is a Miss Marple mystery novel published by Agatha Christie in 1942. The case involves the murder of two teenage girls who are similar in appearance.
DOCUMENT4
Agatha Christie's last published novel before she passed, Curtain: Poirot's Last Case is also her indelible detective's last appearance. Poirot and Hastings return to the very same house from The Mysterious Affairs at Styles over 30 years later.
DOCUMENT5
Death on the Nile is Agatha Christie's most daring travel mystery novel. The tranquillity of a cruise along the Nile is shattered by the discovery that Linnet Ridgeway has been shot through the head.
DOCUMENT6
The Murder of Roger Ackroyd was Agatha Christie’s first book to be published by William Collins in the spring of 1926. William Collins became part of HarperCollins and are still Christie’s publishers today.
ANSWER:
"""
}
Response:
{
"completion": [
{
"result": "Agatha Christie wrote several novels, including \"The Body in the Murder,\" \"Curtain: Poirot's Last Case,\" \"Death on the Nile,\" and \"The Murder of Roger Ackroyd.\""
}
]
}
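Rather than pasting the documents into the prompt by hand, the prompt can be assembled from the hits of a search response. The following Python sketch reproduces the DOCUMENT1, DOCUMENT2, ... layout used above. The helper names `hits_to_documents` and `build_rag_prompt` are hypothetical, and the sample response is a trimmed stand-in for a real one.

```python
def hits_to_documents(search_response: dict) -> list[str]:
    """Extract the description text from each hit of a search response."""
    return [hit["_source"]["description"] for hit in search_response["hits"]["hits"]]

def build_rag_prompt(question: str, documents: list[str]) -> str:
    """Assemble the question and numbered context documents into one prompt."""
    parts = [
        "Answer the following question using the context provided:",
        f"QUESTION: {question}",
        "CONTEXT:",
    ]
    for number, doc in enumerate(documents, start=1):
        parts.append(f"DOCUMENT{number}")
        parts.append(doc.strip())
    parts.append("ANSWER:")
    return "\n".join(parts)

# Trimmed stand-in for a search response with two hits.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"description": "Death on the Nile is a travel mystery novel."}},
            {"_source": {"description": " Black Coffee is a play."}},
        ]
    }
}

prompt = build_rag_prompt(
    "Which novel was written by Agatha Christie?",
    hits_to_documents(sample_response),
)
print(prompt)
```

The resulting string would be sent as the "input" value of the completion request.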
Query with Alibaba Cloud DeepSeek-R1
POST _inference/completion/alibabacloud_ai_search_completion_deepseek_r1?timeout=180s
{
"input": """<|system|>
You are an AI assistant.</s>
<|user|>
CONTEXT:
Black Coffee is a play by the British crime-fiction author Agatha Christie. In the play, a scientist discovers that someone in his household has stolen the formula for an explosive;
The Mousetrap is a murder mystery play by Agatha Christie. The play opened in London's West End in 1952 and ran continuously until 16 March 2020;
The Body in the Murder is a Miss Marple mystery novel published by Agatha Christie in 1942. The case involves the murder of two teenage girls who are similar in appearance;
Agatha Christie's last published novel before she passed, Curtain: Poirot's Last Case is also her indelible detective's last appearance. Poirot and Hastings return to the very same house from The Mysterious Affairs at Styles over 30 years later;
Death on the Nile is Agatha Christie's most daring travel mystery novel. The tranquillity of a cruise along the Nile is shattered by the discovery that Linnet Ridgeway has been shot through the head;
The Murder of Roger Ackroyd was Agatha Christie’s first book to be published by William Collins in the spring of 1926. William Collins became part of HarperCollins and are still Christie’s publishers today;
QUESTION:
Which novela were written by Agatha Christie?</s>
<|assistant|>"""
}
Response:
{
"completion": [
{
"result": """<think>
Okay, let's see. The user is asking which novels were written by Agatha Christie based on the given context. First, I need to go through each item in the context and determine if it's a novel. The user mentioned "novela," which I think is Spanish for "novel," so they're asking about novels, not plays or other works.
Looking at the context entries one by one:
1. **Black Coffee** is described as a play by Christie. So that's a play, not a novel. Exclude.
2. **The Mousetrap** is a murder mystery play, opened in London's West End. Definitely a play, not a novel. Exclude.
3. **The Body in the Murder** is listed as a Miss Marple mystery novel published in 1942. Wait, the title here might be a bit off. Agatha Christie wrote a novel called "The Body in the Library," which is a Miss Marple story from 1942. Maybe the user made a typo. Assuming it's "The Body in the Library," then yes, that's a novel. But the title given is "The Body in the Murder," which I don't recall. Need to check if that's a real title or a mistake. However, since the context says it's a Miss Marple novel published in 1942, I'll proceed with that, even if the title is slightly wrong. So include as a novel.
4. **Curtain: Poirot's Last Case** is mentioned as her last published novel before she passed. So that's a novel. Include.
5. **Death on the Nile** is described as a travel mystery novel. That's a novel. Include.
6. **The Murder of Roger Ackroyd** was her first book published by William Collins. That's a novel. Include.
So the novels listed here are: The Body in the Murder (assuming typo), Curtain, Death on the Nile, and The Murder of Roger Ackroyd. However, "The Body in the Murder" might actually be "The Body in the Library," which is the correct title. But since the user provided that exact title, I should list it as given, even if there's an error. Alternatively, note the possible typo.
Also, check if there are other works mentioned. The other entries are plays. So the answer should list the four novels mentioned in the context, being careful with the title accuracy.
</think>
The novels written by Agatha Christie mentioned in the context are:
1. **The Body in the Murder** (likely a typo for *The Body in the Library*, a Miss Marple novel published in 1942).
2. **Curtain: Poirot's Last Case** (her final published novel featuring Hercule Poirot).
3. **Death on the Nile** (a travel mystery novel set on a Nile cruise).
4. **The Murder of Roger Ackroyd** (her breakthrough novel published in 1926).
*Note*:
- *Black Coffee* and *The Mousetrap* are plays, not novels.
- If "The Body in the Murder" is intended to refer to *The Body in the Library*, the latter is the correct title of Christie's 1942 Miss Marple novel."""
}
]
}
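As the response shows, DeepSeek-R1 returns its reasoning inside a <think>...</think> block before the final answer. If only the answer should be shown to the user, the reasoning can be separated out with a few lines of Python. This is a minimal sketch that assumes at most one such block per result; the function name `split_reasoning` is hypothetical.

```python
import re

# Matches the reasoning block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)

def split_reasoning(result: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a completion result string."""
    match = THINK_BLOCK.search(result)
    if match is None:
        # No reasoning block: the whole result is the answer.
        return "", result.strip()
    reasoning = match.group(0)[len("<think>"):-len("</think>")].strip()
    answer = (result[:match.start()] + result[match.end():]).strip()
    return reasoning, answer

sample = "<think>\nOkay, let's see.\n</think>\nThe novels written by Agatha Christie are: ..."
reasoning, answer = split_reasoning(sample)
print(answer)
```

Keeping the reasoning as a separate value, rather than discarding it, can still be useful for debugging prompts.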
Conclusion
Integrating Alibaba Cloud AI Search with Elasticsearch allows us to easily access completion, embedding, and reranking models to incorporate them into our search pipeline.
We can use the reranking and embedding endpoints, either separately or together, with the help of a retriever.
We can also introduce the completion endpoint to finish up a RAG end-to-end implementation.
Original text: Embeddings and reranking with Alibaba Cloud AI Service - Elasticsearch Labs