Posted June 25, 2024

[HEADING=2]Search Product Catalog Images Using Azure Search and OpenAI with LangChain[/HEADING]

In the ever-evolving landscape of retail, businesses are continually seeking innovative solutions to streamline their operations and enhance customer experiences. One such breakthrough is the use of artificial intelligence (AI) to search product catalog images efficiently. This technology not only simplifies the search process but also empowers businesses to provide personalized, seamless shopping experiences for their customers.

The Need for AI in Product Catalog Image Search:

Traditional methods of searching through product catalogs rely on manual tagging and categorization, which is time-consuming and prone to human error. As the volume of products in a catalog grows, managing and searching for specific items becomes a daunting task. AI, particularly computer vision, addresses these challenges by automating the recognition and categorization of products in images.

Key Features of AI-Powered Product Catalog Image Search:

- Object Recognition and Tagging: AI algorithms can identify and tag objects within images, providing accurate and consistent categorization of products. This reduces reliance on manual tagging and ensures products are correctly labeled in the catalog.
- Visual Similarity Search: AI enables users to find products based on visual attributes rather than text-based queries alone. This is especially valuable for customers who may struggle to describe a product in words but can easily recognize it visually.
- Enhanced Product Discovery: By understanding the visual characteristics of products, AI enables a more sophisticated recommendation system. Customers can discover related or complementary items, leading to increased cross-selling opportunities and a more engaging shopping experience.
- Improved Accuracy and Efficiency: AI-powered image recognition is highly accurate and can process large volumes of images in a fraction of the time it would take a human. This efficiency reduces operational costs and speeds up how quickly customers can find and purchase products.
- Integration with E-Commerce Platforms: AI-driven image search can integrate seamlessly with existing e-commerce platforms, letting businesses adopt the technology without major disruption and making AI-enhanced search an integral part of the overall shopping experience.

Now let's implement this with Azure OpenAI.
First, you need to import some libraries:

[CODE]
import base64
import io
import json
import math
import mimetypes
import os
import random
import re
import requests
import sys
import time
from datetime import datetime, timedelta
from io import BytesIO

import matplotlib.pyplot as plt
import numpy as np
import openai
import azure.cognitiveservices.speech as speechsdk
from azure.cognitiveservices.speech import (
    AudioDataStream,
    SpeechConfig,
    SpeechSynthesizer,
    SpeechSynthesisOutputFormat,
)
from azure.cognitiveservices.speech.audio import AudioOutputConfig
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.indexes import SearchIndexClient, SearchIndexerClient
from azure.search.documents.indexes.models import (
    SearchIndexerDataContainer,
    SearchIndexerDataSourceConnection,
)
from azure.search.documents.models import VectorizedQuery, VectorizableTextQuery
from azure.storage.blob import BlobServiceClient, generate_blob_sas, BlobSasPermissions
from dotenv import load_dotenv
from IPython.display import Audio
from PIL import Image
from tenacity import (
    Retrying,
    retry_if_exception_type,
    wait_random_exponential,
    stop_after_attempt,
)
[/CODE]

Next, initialize some environment variables for your Azure OpenAI endpoint, Azure AI Vision endpoint, and Azure AI Search endpoint:

[CODE]
load_dotenv("azure.env")

# Azure OpenAI
openai_api_type = "azure"
openai_api_base = os.getenv("AZURE_OPENAI_ENDPOINT")
openai_api_version = os.getenv("AZURE_API_VERSION")
openai_api_key = os.getenv("AZURE_OPENAI_KEY")

# Azure Cognitive Search
acs_endpoint = os.getenv("ACS_ENDPOINT")
acs_key = os.getenv("ACS_KEY")

# Azure Computer Vision 4
acv_key = os.getenv("ACV_KEY")
acv_endpoint = os.getenv("ACV_ENDPOINT")

# Azure Blob Storage
blob_connection_string = os.getenv("BLOB_CONNECTION_STRING")
container_name = os.getenv("CONTAINER_NAME")

# Azure Cognitive Search index name to create
index_name = "azure-fashion-demo"

# Azure Cognitive Search api version
api_version = "2023-02-01-preview"
[/CODE]

Now let's create a function to generate text embeddings using the Vision API:

[CODE]
def text_embedding(prompt):
    """
    Text embedding using Azure Computer Vision 4.0
    """
    version = "?api-version=" + api_version + "&modelVersion=latest"
    vec_txt_url = f"{acv_endpoint}/computervision/retrieval:vectorizeText{version}"
    headers = {"Content-type": "application/json", "Ocp-Apim-Subscription-Key": acv_key}

    payload = {"text": prompt}
    response = requests.post(vec_txt_url, json=payload, headers=headers)
    if response.status_code == 200:
        text_emb = response.json().get("vector")
        return text_emb
    else:
        print(f"Error: {response.status_code} - {response.text}")
        return None
[/CODE]

Now let's create a function to generate image embeddings using the Vision API:

[CODE]
def image_embedding(image_path):
    url = f"{acv_endpoint}/computervision/retrieval:vectorizeImage"
    # Query parameters matching the text-embedding call above
    params = {"api-version": api_version, "modelVersion": "latest"}
    mime_type, _ = mimetypes.guess_type(image_path)
    headers = {
        "Content-Type": mime_type,
        "Ocp-Apim-Subscription-Key": acv_key,
    }

    # Retry transient HTTP errors with exponential backoff
    for attempt in Retrying(
        retry=retry_if_exception_type(requests.HTTPError),
        wait=wait_random_exponential(min=15, max=60),
        stop=stop_after_attempt(15),
    ):
        with attempt:
            with open(image_path, "rb") as image_data:
                response = requests.post(url, params=params, headers=headers, data=image_data)
                if response.status_code != 200:
                    response.raise_for_status()
                vector = response.json()["vector"]
                return vector
[/CODE]
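Both functions return vectors in the same multimodal embedding space, which is what makes text-to-image retrieval work. As a quick sanity check, something like the sketch below should show matching dimensions (the image path is a hypothetical sample file; Azure AI Vision multimodal embeddings are 1024-dimensional, which is also the dimension the index later in this post declares):

[CODE]
# Quick sanity check: text and image vectors live in the same 1024-dim space
text_vec = text_embedding("red floral summer dress")
image_vec = image_embedding("images/summer_dress.jpg")  # hypothetical sample image
print(len(text_vec), len(image_vec))  # expected: 1024 1024
[/CODE]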
The next thing we need is a function that takes a text prompt as input and searches Azure AI Search for the most relevant images. Here the "Buy Now" link is a dummy link that can be replaced with an actual product URL:

[CODE]
def prompt_search(prompt, topn=5, disp=False):
    """
    Azure Cognitive Search visual search using a prompt
    """
    results_list = []

    # Initialize the Azure Cognitive Search client
    search_client = SearchClient(acs_endpoint, index_name, AzureKeyCredential(acs_key))
    blob_service_client = BlobServiceClient.from_connection_string(blob_connection_string)
    container_client = blob_service_client.get_container_client(container_name)

    # Perform vector search
    vector_query = VectorizedQuery(
        vector=text_embedding(prompt), k_nearest_neighbors=topn, fields="image_vector"
    )
    response = search_client.search(
        search_text=prompt,
        vector_queries=[vector_query],
        select=["description"],
        top=topn,
    )

    for nb, result in enumerate(response, 1):
        blob_name = result["description"] + ".jpg"
        blob_client = container_client.get_blob_client(blob_name)
        # Generate a short-lived SAS URL so the image is viewable without public access
        sas_token = generate_blob_sas(
            blob_service_client.account_name,
            container_name,
            blob_name,
            account_key=blob_client.credential.account_key,
            permission=BlobSasPermissions(read=True),
            expiry=datetime.utcnow() + timedelta(hours=1),
        )
        sas_url = blob_client.url + "?" + sas_token
        results_list.append(
            {
                "buy_now_link": sas_url,
                "price_of_the_product": result["description"],
                "product_image_url": sas_url,
            }
        )
    return results_list
[/CODE]

Let's ingest some product images into Azure Search. The idea is that we have a folder called images containing all the product images. We create a blob container client and upload every image from the folder to that container:

[CODE]
EMBEDDINGS_DIR = "embeddings"
os.makedirs(EMBEDDINGS_DIR, exist_ok=True)

image_directory = os.path.join("images")
embedding_directory = os.path.join("embeddings")
output_json_file = os.path.join(embedding_directory, "output.jsonl")

# Clients for the target blob container
blob_service_client = BlobServiceClient.from_connection_string(blob_connection_string)
container_client = blob_service_client.get_container_client(container_name)

for root, dirs, files in os.walk(image_directory):
    for file in files:
        local_file_path = os.path.join(root, file)
        blob_name = os.path.relpath(local_file_path, image_directory)
        blob_client = container_client.get_blob_client(blob_name)
        with open(local_file_path, "rb") as data:
            blob_client.upload_blob(data, overwrite=True)
[/CODE]

Next, we create the embeddings of the product images and store them locally in the embeddings directory. Note that we use only two metadata fields here, id and description. You can extend this with more metadata such as price, a buy-now link, and so on:

[CODE]
with open(output_json_file, "w") as outfile:
    for idx, image_path in enumerate(os.listdir(image_directory)):
        if image_path:
            try:
                vector = image_embedding(os.path.join(image_directory, image_path))
            except Exception as e:
                print(f"Error processing image at index {idx}: {e}")
                vector = None

            filename, _ = os.path.splitext(os.path.basename(image_path))
            result = {
                "id": f"{idx}",
                "image_vector": vector,
                "description": filename,
            }

            outfile.write(json.dumps(result))
            outfile.write("\n")
            outfile.flush()
print(f"Results are saved to {output_json_file}")
[/CODE]
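Each line of output.jsonl is one self-contained search document. A quick way to sanity-check the file before indexing might look like this (the printed description depends on your image filenames):

[CODE]
# Peek at the first embedding record (assumes the loop above has run)
with open(output_json_file) as f:
    record = json.loads(f.readline())
print(record["id"], record["description"], len(record["image_vector"]))
# e.g. "0 summer_dress 1024", where the description is the image filename
[/CODE]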
Now that we have created the local embedding file, we can upload it into Azure Search. Before that, let's create an index:

[CODE]
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SimpleField,
    SearchField,
    SearchFieldDataType,
    VectorSearch,
    HnswAlgorithmConfiguration,
    VectorSearchProfile,
    SearchIndex,
)

credential = AzureKeyCredential(acs_key)

# Create a search index
index_client = SearchIndexClient(endpoint=acs_endpoint, credential=credential)
fields = [
    SimpleField(name="id", type=SearchFieldDataType.String, key=True),
    SearchField(
        name="description",
        type=SearchFieldDataType.String,
        sortable=True,
        filterable=True,
        facetable=True,
    ),
    SearchField(
        name="image_vector",
        hidden=True,
        type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
        searchable=True,
        vector_search_dimensions=1024,
        vector_search_profile_name="myHnswProfile",
    ),
]

# Configure the vector search configuration
vector_search = VectorSearch(
    algorithms=[HnswAlgorithmConfiguration(name="myHnsw")],
    profiles=[
        VectorSearchProfile(
            name="myHnswProfile",
            algorithm_configuration_name="myHnsw",
        )
    ],
)

# Create the search index with the vector search configuration
index = SearchIndex(name=index_name, fields=fields, vector_search=vector_search)
result = index_client.create_or_update_index(index)
print(f"{result.name} created")
[/CODE]

Once you have created the index, you can upload the locally stored embedding file:

[CODE]
from azure.search.documents import SearchClient
import json

data = []
with open(output_json_file, "r") as file:
    for line in file:
        # Remove leading/trailing whitespace and parse JSON
        json_data = json.loads(line.strip())
        data.append(json_data)

search_client = SearchClient(endpoint=acs_endpoint, index_name=index_name, credential=credential)
results = search_client.upload_documents(data)
for result in results:
    print(f"Indexed {result.key} with status code {result.status_code}")
[/CODE]
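With the documents uploaded, it is worth a quick end-to-end test of the prompt_search function from earlier before wiring it into an agent. The query string below is just an example, and a freshly uploaded batch can take a few seconds to become searchable:

[CODE]
# Confirm the upload landed in the index
print(search_client.get_document_count(), "documents in index")

# Try a text-to-image search (example query)
for hit in prompt_search("blue denim jacket", topn=3):
    print(hit["product_image_url"])
[/CODE]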
Congratulations, you are finally ready to implement your agent using OpenAI. Let's create a tool called image search, which will be used by the agent:

[CODE]
from typing import Optional

from langchain_core.callbacks import CallbackManagerForToolRun
from langchain_core.tools import BaseTool

from util import prompt_search


class ImageSearchResults(BaseTool):
    """Tool that queries the Fashion Image Search API and gets back JSON."""

    name: str = "image_search_results_json"
    description: str = (
        "A wrapper around Image Search. "
        "Useful for when you need to search fashion images related to clothes, shoes, etc. "
        "Input should be a search query. "
        "Output is a JSON array of the query results"
    )
    num_results: int = 4

    def _run(
        self,
        query: str,
        run_manager: Optional[CallbackManagerForToolRun] = None,
    ) -> str:
        """Use the tool."""
        return str(prompt_search(prompt=query, topn=self.num_results))
[/CODE]

Here we will use LangChain to implement our fashion agent, called Luca:

[CODE]
from langchain_core.prompts.chat import (
    BaseMessagePromptTemplate,
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    MessagesPlaceholder,
    SystemMessagePromptTemplate,
    PromptTemplate,
)
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_core.runnables import Runnable, RunnablePassthrough
from langchain_core.utils.function_calling import convert_to_openai_function
from langchain.agents.output_parsers.openai_functions import (
    OpenAIFunctionsAgentOutputParser,
)
from langchain.agents.format_scratchpad.openai_functions import (
    format_to_openai_function_messages,
)
from langchain.agents import AgentExecutor
from langchain_openai import AzureChatOpenAI
from langchain_core.runnables import RunnableConfig

from custom_tool import ImageSearchResults
import openai
[/CODE]

Let's initialize our LLM:

[CODE]
from langchain_openai import AzureChatOpenAI

llm = AzureChatOpenAI(
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2023-12-01-preview",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    model="gpt-4-turbo",
)

# Quick smoke test
llm.invoke([HumanMessage(content="Hi")])

prefix = """You are Luca, a helpful Fashion Agent who helps people navigate and buy products online
Note: \\
Show prices always in INR \\
Always try to get the user to buy from the buy now link provided"""
suffix = ""
[/CODE]

Let's attach the tool we created. Here we are using LCEL to implement our agent:

[CODE]
tools = [ImageSearchResults(num_results=5)]
llm_with_tools = llm.bind(
    functions=[convert_to_openai_function(t) for t in tools]
)

messages = [
    SystemMessage(content=prefix),
    HumanMessagePromptTemplate.from_template("{input}"),
    AIMessage(content=suffix),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
]
input_variables = ["input", "agent_scratchpad"]
prompt = ChatPromptTemplate(input_variables=input_variables, messages=messages)

agent = (
    RunnablePassthrough.assign(
        agent_scratchpad=lambda x: format_to_openai_function_messages(
            x["intermediate_steps"]
        )
    )
    | prompt
    | llm_with_tools
    | OpenAIFunctionsAgentOutputParser()
)

# Wrap the agent runnable in an executor so tool calls are actually executed
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
[/CODE]

Congratulations!! You are ready to test your agent:

[CODE]
response = agent_executor.invoke(
    {
        "input": "I am looking for some summer dress as I am travelling to New Delhi",
        "chat_history": [
            HumanMessage(content="hi! my name is bob"),
            AIMessage(content="Hello Bob! How can I assist you today?"),
        ],
    }
)
[/CODE]
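Before moving to a real UI, a minimal console chat loop around agent_executor is a handy way to play with Luca. This is an illustrative sketch only: the prompt above declares just input and agent_scratchpad, so the accumulated chat_history is simply carried along for when you extend the prompt with a chat_history placeholder:

[CODE]
# Illustrative console loop around the agent executor
chat_history = []
while True:
    user_input = input("You: ")
    if user_input.lower() in {"exit", "quit"}:
        break
    result = agent_executor.invoke({"input": user_input, "chat_history": chat_history})
    print("Luca:", result["output"])
    # Keep a running transcript for a history-aware prompt
    chat_history.extend(
        [HumanMessage(content=user_input), AIMessage(content=result["output"])]
    )
[/CODE]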
Hurray!! You are now ready to deploy this agent to an enterprise app with a good-looking UI. Here is the reference GitHub repo with all the code artifacts: AOAI_Samples/content_product_tagging at main · monuminu/AOAI_Samples

Favor: please clap if you like this, and follow me for more such content.

References:
- Fundamentals of Knowledge Mining and Azure AI Search - Training (learn.microsoft.com): explore Azure AI Search to discover how to create an index, import data, and query the index for better search results.
- LangChain Expression Language (LCEL) (python.langchain.com): LangChain Expression Language, or LCEL, is a declarative way to easily compose chains together.
- What is Azure AI Vision? - Azure AI services (learn.microsoft.com): the Azure AI Vision service provides you with access to advanced algorithms for processing images and returning…