Posted August 31, 2023

Introduction

Are you tired of manually writing SQL queries? Do you wish there was an easier way to interact with your data? In this blog post, we will show you how to build a speech-enabled SQL query system using Azure OpenAI, Semantic Kernel, and Azure Speech Services. With this system, you can simply speak your query in natural language and receive the results as spoken text.

We will use Azure OpenAI and Semantic Kernel to translate your natural language queries into SQL statements that can be executed against a SQL Server database. This lets you interact with your data in a more intuitive and user-friendly way: no more struggling with complex SQL syntax, just speak your query and let the system do the rest. With Azure Speech Services, we will convert your speech into text and synthesize the results as speech, so you can hear the answer to your query spoken back to you, making the information easier to understand and digest.

In this post, we will guide you through setting up an Azure OpenAI resource, deploying a model, and integrating it with Semantic Kernel to generate SQL queries from natural language input. We will also show you how to use Azure Speech Services to convert speech to text and synthesize speech from text. By the end, you will know how to combine these Azure services into a speech-enabled SQL query system and interact with your data in a whole new way. So let's get started!

Prerequisites

The sections that follow explain the flow and the implementation in Python using Semantic Kernel orchestration. If you are new to these technologies, don't worry; go through these prerequisite links to get started:

- Azure OpenAI: Get started with Azure OpenAI Service - Training | Microsoft Learn. We are going to use Azure OpenAI completion models, not chat models, so please use the Azure OpenAI GPT-3.5 Turbo (0301) model.
- Azure Cognitive Services Speech to Text: Speech to text overview - Speech service - Azure Cognitive Services | Microsoft Learn
- Python: You will find many courses on the internet; you can refer to Learn Python - Free Interactive Python Tutorial
- Semantic Kernel orchestration: Orchestrate your AI with Semantic Kernel | Microsoft Learn
- Azure SQL: Azure SQL fundamentals - Training | Microsoft Learn

Architecture

Figure 1: Architecture

The architecture is explained step by step below:

1. A user speaks a natural language query into a microphone.
2. The spoken query is captured as audio.
3. The audio is sent to the Azure Speech to Text service, which converts the speech into text.
4. The text is processed by an NLP system, which converts the natural language query into an SQL query.
5. The SQL query is used to fetch data from a SQL Server database.
6. The data from the SQL server is returned as a result.
7. The result is sent to the Azure Text to Speech service, which converts the text into speech.
8. The output is played back to the user as speech.

This architecture shows how a user can interact with a SQL Server database using natural language and receive the output as speech. Azure Speech Services and an NLP system convert the user's speech into an SQL query, data is fetched from SQL Server based on that query, and the result is converted back into speech by the Azure Text to Speech service and played back to the user.
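To make the flow concrete, here is a minimal, illustrative Python sketch of one turn of that loop. The function names (recognize_speech, natural_language_to_sql, run_query, speak) are placeholders standing in for the components built later in this post, not real APIs:

# Illustrative placeholders only: these stand in for Azure Speech to Text,
# Azure OpenAI + Semantic Kernel, Azure SQL, and Azure Text to Speech,
# which are implemented step by step in the rest of this post.

def recognize_speech() -> str:
    """Capture audio from the microphone and return the recognized text (steps 1-3)."""
    ...

def natural_language_to_sql(question: str) -> str:
    """Translate the natural language question into a SQL statement (step 4)."""
    ...

def run_query(sql: str):
    """Execute the SQL statement against the database and return the result (steps 5-6)."""
    ...

def speak(text: str) -> None:
    """Synthesize the text as speech and play it back (steps 7-8)."""
    ...

def handle_one_turn() -> None:
    question = recognize_speech()
    sql = natural_language_to_sql(question)
    result = run_query(sql)
    speak("The result of your query is: {}".format(result))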
Creating the Folder Structure

Now I will suggest a folder structure for this application. Following it is completely optional. I have created four folders for this application:

- bicep_template: folder to store the Bicep templates that create the Azure resources.
- fake_data: folder to store the code that creates some fake data for this application.
- myPlugins: folder to keep all the plugins for this application.
- sql_scripts: folder to keep all the SQL scripts.

Creating the Resources in Azure

To automate the deployment we will use a few Azure CLI (az) commands and a Bicep template. First, we need to create a resource group in Azure. Execute the command below in a console to create one:

az group create --name <Resource Group Name> --location <Location>

Now create a new file named resources.bicep under the bicep_template folder and paste in the code below.

// Define input parameters
param sqlServerName string
param sqlDatabaseName string
param openAIResourceName string
param location string
param speechServiceName string

// Create SQL Server resource
resource sqlServer 'Microsoft.Sql/servers@2022-11-01-preview' = {
  name: sqlServerName
  location: location
  properties: {
    administratorLogin: '<username>' // Set the administrator login for the SQL Server
    administratorLoginPassword: '<password>' // Set the administrator login password for the SQL Server
  }
}

// Create SQL Database resource
resource sqlDatabase 'Microsoft.Sql/servers/databases@2022-11-01-preview' = {
  parent: sqlServer // Set the parent resource to the SQL Server
  name: sqlDatabaseName // Set the name of the SQL Database
  location: location // Set the location of the SQL Database
}

// Create Speech Service resource
resource speechService 'Microsoft.CognitiveServices/accounts@2022-03-01' = {
  name: speechServiceName // Set the name of the Speech Service
  location: location // Set the location of the Speech Service
  kind: 'SpeechServices' // Set the kind of the Speech Service to SpeechServices
  sku: {
    name: 'S0' // Set the SKU name to S0
    tier: 'Standard' // Set the SKU tier to Standard
  }
  properties: {
    customSubDomainName: speechServiceName // Set the custom subdomain name for the Speech Service
  }
  identity: {
    type: 'SystemAssigned' // Enable managed identity for the Speech Service
  }
}

// Create OpenAI resource
resource open_ai 'Microsoft.CognitiveServices/accounts@2022-03-01' = {
  name: openAIResourceName // Set the name of the OpenAI resource
  location: location // Set the location of the OpenAI resource
  kind: 'OpenAI' // Set the kind of the OpenAI resource to OpenAI
  sku: {
    name: 'S0' // Set the SKU name to S0
  }
  properties: {
    apiProperties: {
      endpoint: 'https://${openAIResourceName}.openai.azure.com/' // Set the endpoint for the OpenAI resource
    }
  }
}

Script explanation:

The code above is written in Bicep, a domain-specific language for deploying Azure resources. It defines a set of input parameters and creates four Azure resources: a SQL Server, a SQL Database, a Speech Service, and an OpenAI resource. The param statements at the beginning of the file define the input parameters for the deployment, including the names of the SQL Server, SQL Database, Speech Service, and OpenAI resources, as well as the location where the resources will be deployed.
The resource statements define the Azure resources that will be created. The first two resources create a SQL Server and a SQL Database, respectively. The parent property of the SQL Database resource is set to the SQL Server resource, which makes the SQL Database a child resource of the SQL Server.

The third resource creates a Speech Service, a Cognitive Services resource that provides speech-to-text and text-to-speech capabilities. The kind property of the Speech Service resource is set to 'SpeechServices', which specifies the type of Cognitive Services resource to create. The sku property specifies the pricing tier for the resource, and the properties property sets a custom subdomain name for the Speech Service. The identity property enables a managed identity for the resource.

The fourth resource creates an OpenAI resource, a Cognitive Services resource that provides natural language processing capabilities. The kind property of the OpenAI resource is set to 'OpenAI', which specifies the type of Cognitive Services resource to create. The sku property specifies the pricing tier for the resource, and the properties property sets the endpoint for the resource.

Overall, this template creates the set of Azure resources needed to support the natural language processing and speech capabilities of the application.

Command to run the Bicep deployment from a console:

az deployment group create --resource-group <resource-group> --template-file ./bicep_template/resources.bicep --parameters sqlServerName=<sql server name> sqlDatabaseName=<sql database name> speechServiceName=<speech service name> openAIResourceName=<azure open ai resource name> location=<location>

After the deployment completes, we will find four resources, along with the resource group, provisioned in the Azure subscription.

Creating some fake data in Azure SQL Server (Optional)

Now let us create some fake data for this application to use. Before that, please add your client IPv4 address to the Azure SQL Server firewall rules. Follow this link to complete the step: Adding your Client IP to the Azure SQL server firewall - Microsoft Playground (msftplayground.com).

Log in to the Azure SQL Server Query Editor, or connect through SQL Server Management Studio, and create a sample table. Example:

CREATE TABLE ExplorationProduction (
    WellID INT PRIMARY KEY,
    WellName VARCHAR(50),
    Location VARCHAR(100),
    ProductionDate DATE,
    ProductionVolume DECIMAL(10, 2),
    Operator VARCHAR(50),
    FieldName VARCHAR(50),
    Reservoir VARCHAR(50),
    Depth DECIMAL(10, 2),
    APIGravity DECIMAL(5, 2),
    WaterCut DECIMAL(5, 2),
    GasOilRatio DECIMAL(10, 2)
);

We can keep this script in the sql_scripts folder for future reference. Now create a file named config.ini within the fake_data folder:

[database]
server_name = <servername>.database.windows.net
database_name = <databasename>
username = <username>
password = <password>
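The generator script below connects with the "ODBC Driver 18 for SQL Server" driver, so before running it you may want to confirm that the driver is installed on your machine. A one-line check with pyodbc (my own addition, not part of the original post) is:

import pyodbc

# "ODBC Driver 18 for SQL Server" must appear in this list
# for the connection string used by the generator script to work.
print(pyodbc.drivers())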
Create another file named create_fake_data.py and paste in the content below.

import pyodbc
from faker import Faker
import configparser

# Read the config.ini file
config = configparser.ConfigParser()
config.read('fake_data/config.ini')

# Read the SQL Server connection details
server_name = config.get('database', 'server_name')
database_name = config.get('database', 'database_name')
username = config.get('database', 'username')
password = config.get('database', 'password')

# Connect to the SQL Server database
conn = pyodbc.connect('DRIVER={driver};SERVER={server_name};DATABASE={database_name};UID={username};PWD={password}'.format(
    driver="ODBC Driver 18 for SQL Server", server_name=server_name, database_name=database_name, username=username, password=password))

# Create a cursor object to execute SQL queries
cursor = conn.cursor()

# Create Faker object
fake = Faker()

# Generate and insert 10,000 fake records
for id in range(10000):
    well_id = id + 1
    well_name = fake.word() + ' Well'
    location = fake.city() + ', ' + fake.country()
    production_date = fake.date_between(start_date='-1y', end_date='today')
    production_volume = fake.pydecimal(left_digits=6, right_digits=2, positive=True)
    operator = fake.company()
    field_name = fake.word() + ' Field'
    reservoir = fake.word() + ' Reservoir'
    depth = fake.pydecimal(left_digits=5, right_digits=2, positive=True)
    api_gravity = fake.pydecimal(left_digits=2, right_digits=2, positive=True)
    water_cut = fake.pydecimal(left_digits=2, right_digits=2)
    gas_oil_ratio = fake.pydecimal(left_digits=4, right_digits=2)

    print(well_name + " added to the database.")

    # Insert record into the ExplorationProduction table
    cursor.execute("INSERT INTO ExplorationProduction (WellID, WellName, Location, ProductionDate, ProductionVolume, Operator, FieldName, Reservoir, Depth, APIGravity, WaterCut, GasOilRatio) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
                   well_id, well_name, location, production_date, production_volume, operator, field_name, reservoir, depth, api_gravity, water_cut, gas_oil_ratio)

# Commit the changes and close the connection
conn.commit()
conn.close()

Code Explanation:

The code above is a Python script that generates fake data and inserts it into a SQL Server database using the pyodbc module. The script first reads a configuration file named config.ini that contains the connection details for the SQL Server database.

The script then connects to the SQL Server database using the connection details from the configuration file and creates a cursor object that is used to execute SQL queries against the database.

The script uses the faker module to generate fake data for each record. It generates 10,000 records, each with a unique well ID, well name, location, production date, production volume, operator, field name, reservoir, depth, API gravity, water cut, and gas-oil ratio. The print() function prints a message to the console for each record that is added to the database.

The script then uses the cursor.execute() method to insert each record into the ExplorationProduction table in the SQL Server database. The SQL statement uses parameterized values to insert the data into the table.

Finally, the script calls conn.commit() to commit the changes to the database and conn.close() to close the database connection.

This script can be used to generate test data for a SQL Server database or to populate a database with sample data for development or demonstration purposes. The faker module can be customized to generate data that is specific to the requirements of the database schema.
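Once the script finishes, a quick sanity check (my own addition, not part of the original post) is to count the rows that were inserted, reusing the same config.ini values:

import configparser
import pyodbc

config = configparser.ConfigParser()
config.read('fake_data/config.ini')

# Reuse the connection details from config.ini.
conn = pyodbc.connect(
    'DRIVER={ODBC Driver 18 for SQL Server};SERVER=%s;DATABASE=%s;UID=%s;PWD=%s' % (
        config.get('database', 'server_name'),
        config.get('database', 'database_name'),
        config.get('database', 'username'),
        config.get('database', 'password'),
    )
)
cursor = conn.cursor()

# The generator inserts 10,000 rows, so this should print 10000.
cursor.execute("SELECT COUNT(*) FROM ExplorationProduction")
print(cursor.fetchone()[0])
conn.close()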
Deploying the Azure OpenAI Model

Now log in to Azure OpenAI Studio (Azure OpenAI Studio - Microsoft Azure) and, under Models, select the gpt-35-turbo version 0301 model. Click the Deploy option and deploy the model for use in the rest of this blog.

Creating the Environment File

Now we will create our environment file; you can name it .env. Fill in the details in the format below:

AZURE_OPENAI_DEPLOYMENT_NAME="<model name deployed>"
AZURE_OPENAI_ENDPOINT="<url>"
AZURE_OPENAI_API_KEY="<key>"
SPEECH_KEY="<speech key>"
SPEECH_REGION="<location>"
server_name = <server name>
database_name = <database name>
SQLADMIN_USER = <username>
SQL_PASSWORD = <password>

Explanation of the variables:

This is a configuration file in the .env format. It contains the environment variables used to reach the Azure OpenAI deployment, the Speech Service, and the SQL Server database.

The AZURE_OPENAI_DEPLOYMENT_NAME variable specifies the name of the Azure OpenAI model deployment. The AZURE_OPENAI_ENDPOINT variable specifies the endpoint for the OpenAI resource, which is used to access the resource's API, and the AZURE_OPENAI_API_KEY variable specifies the API key used to authenticate requests to the OpenAI resource.

The SPEECH_KEY variable specifies the API key for the Speech Service resource, which provides the speech-to-text and text-to-speech capabilities, and the SPEECH_REGION variable specifies the region where the Speech Service resource is deployed.

The server_name and database_name variables specify the names of the SQL Server and SQL Database, respectively. The SQLADMIN_USER variable specifies the username for the SQL Server administrator, and the SQL_PASSWORD variable specifies the administrator password.

Overall, this configuration file contains the environment variables needed to reach the OpenAI deployment, the Speech Service, and the SQL Server database. These values can be customized to match the specific requirements of your deployment and database connection.
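A quick way to confirm that the file is picked up (a small check of my own, assuming the python-dotenv package is installed) is to load it and read a couple of the values back:

import os
from dotenv import load_dotenv

# Loads the .env file from the current working directory into environment variables.
load_dotenv()

# These should echo the values placed in .env; note that variable names
# are case sensitive on Linux and macOS.
print(os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME"))
print(os.getenv("SPEECH_REGION"))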
Creating the Plugins

Now let us create the Semantic Kernel plugins for our application.

· STT Plugin

Create a folder named sttPlugin within the myPlugins folder. Inside it, create a file named sttPlugin.py and paste in the content below.

import azure.cognitiveservices.speech as speechsdk
from semantic_kernel.skill_definition import (
    sk_function,
    sk_function_context_parameter,
)
from semantic_kernel.orchestration.sk_context import SKContext


class STTPlugin:
    @sk_function(
        description="generate text from speech",
        name="recognize_from_microphone",
        input_description="The content from microphone to be converted to text",
    )
    @sk_function_context_parameter(
        name="speech_key",
        description="speech_key",
    )
    @sk_function_context_parameter(
        name="speech_region",
        description="speech_region",
    )
    def recognize_from_microphone(self, context: SKContext):
        speech_config = speechsdk.SpeechConfig(subscription=context["speech_key"], region=context["speech_region"])
        speech_config.speech_recognition_language = "en-US"

        audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
        speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

        speech_recognition_result = speech_recognizer.recognize_once_async().get()

        if speech_recognition_result.reason == speechsdk.ResultReason.RecognizedSpeech:
            # print("Recognized: {}".format(speech_recognition_result.text))
            context["result"] = speech_recognition_result.text
        elif speech_recognition_result.reason == speechsdk.ResultReason.NoMatch:
            print("No speech could be recognized: {}".format(speech_recognition_result.no_match_details))
        elif speech_recognition_result.reason == speechsdk.ResultReason.Canceled:
            cancellation_details = speech_recognition_result.cancellation_details
            # print("Speech Recognition canceled: {}".format(cancellation_details.reason))
            if cancellation_details.reason == speechsdk.CancellationReason.Error:
                print("Error details: {}".format(cancellation_details.error_details))
                print("Did you set the speech resource key and region values?")

        return context

Code Explanation:

The code above is a Python class named STTPlugin that provides a function for converting speech to text using the Azure Cognitive Services Speech SDK. The class is designed to be used as a plugin in the Semantic Kernel, a framework for building conversational agents.

The recognize_from_microphone function is decorated with the sk_function decorator, which registers the function as a skill in the Semantic Kernel. The function takes a single argument, context, which is an instance of the SKContext class. The function is designed to be called from a conversational agent and captures its input from the microphone.

The function first creates a SpeechConfig object using the speechsdk module, which is part of the Azure Cognitive Services Speech SDK. The subscription and region properties of the SpeechConfig object are set from the speech_key and speech_region context parameters, respectively. The speech_recognition_language property is set to "en-US", which specifies the language of the speech to be recognized.

The function then creates an AudioConfig object using the audio module of the speechsdk package, with use_default_microphone set to True so that the default microphone is used for audio input.

The function then creates a SpeechRecognizer object from the SpeechConfig and AudioConfig objects and calls its recognize_once_async method to recognize speech from the microphone. The get method is called on the result of recognize_once_async to wait for the recognition to complete and retrieve the result.

The function then checks the reason property of the SpeechRecognitionResult object to determine whether the speech was recognized. If it was, the text property of the result is assigned to the result entry of the context object; if not, an error message is printed to the console. Finally, the function returns the context object, which carries the recognized text in its result entry.

Overall, this code provides a simple way to convert speech to text using the Azure Cognitive Services Speech SDK and can be used as a plugin in the Semantic Kernel to build conversational agents that recognize speech input.
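If you want to test the plugin on its own before building the full orchestrator, the short sketch below (my own addition, not part of the original post) registers the plugin and invokes it with the same Semantic Kernel calls the orchestrator uses later; it assumes the .env file created above and that the script is run from the project root:

import os
import semantic_kernel as sk
from dotenv import load_dotenv
from myPlugins.sttPlugin.sttPlugin import STTPlugin

# Load SPEECH_KEY and SPEECH_REGION from the .env file.
load_dotenv()

kernel = sk.Kernel()
context = kernel.create_new_context()
context["speech_key"] = os.getenv("SPEECH_KEY")
context["speech_region"] = os.getenv("SPEECH_REGION")

# Register the native plugin and invoke its function by name.
stt_skill = kernel.import_skill(STTPlugin(), "sttPlugin")
result = stt_skill["recognize_from_microphone"].invoke(context=context)
print("Recognized text:", result["result"])

Speak into the default microphone when the script runs; the recognized text is printed to the console.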
· TTS Plugin

Create a folder named ttsPlugin within the myPlugins folder. Inside it, create a file named ttsPlugin.py and paste in the content below.

import azure.cognitiveservices.speech as speechsdk
from semantic_kernel.skill_definition import (
    sk_function,
    sk_function_context_parameter,
)
from semantic_kernel.orchestration.sk_context import SKContext


class TTSPlugin:
    @sk_function(
        description="generate speech from text",
        name="speak_out_response",
        input_description="The content to be converted to speech",
    )
    @sk_function_context_parameter(
        name="content",
        description="The content to be converted to speech",
    )
    @sk_function_context_parameter(
        name="speech_key",
        description="speech_key",
    )
    @sk_function_context_parameter(
        name="speech_region",
        description="speech_region",
    )
    def speak_out_response(self, context: SKContext):
        speech_config = speechsdk.SpeechConfig(subscription=context["speech_key"], region=context["speech_region"])
        content = context["content"]

        audio_config = speechsdk.audio.AudioOutputConfig(use_default_speaker=True)

        # The language of the voice that speaks.
        speech_config.speech_synthesis_voice_name = 'en-US-JennyNeural'

        speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_config)
        speech_synthesis_result = speech_synthesizer.speak_text_async(content).get()

        if speech_synthesis_result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
            print("Speech synthesized to speaker for text [{}]".format(content))
        elif speech_synthesis_result.reason == speechsdk.ResultReason.Canceled:
            cancellation_details = speech_synthesis_result.cancellation_details
            print("Speech synthesis canceled: {}".format(cancellation_details.reason))
            if cancellation_details.reason == speechsdk.CancellationReason.Error:
                if cancellation_details.error_details:
                    print("Error details: {}".format(cancellation_details.error_details))
                    print("Did you set the speech resource key and region values?")

        return context

Code Explanation:

The code above is a Python class named TTSPlugin that provides a function for converting text to speech using the Azure Cognitive Services Speech SDK. The class is designed to be used as a plugin in the Semantic Kernel, a framework for building conversational agents.

The speak_out_response function is decorated with the sk_function decorator, which registers the function as a skill in the Semantic Kernel. The function takes a single argument, context, which is an instance of the SKContext class. The function is designed to be called from a conversational agent and takes text content as input.

The function first creates a SpeechConfig object using the speechsdk module, which is part of the Azure Cognitive Services Speech SDK. The subscription and region properties of the SpeechConfig object are set from the speech_key and speech_region context parameters, respectively. The speech_synthesis_voice_name property is set to "en-US-JennyNeural", which specifies the voice to be used for speech synthesis.

The function then creates an AudioOutputConfig object using the audio module of the speechsdk package, with use_default_speaker set to True so that the default speaker is used for audio output.

The function then creates a SpeechSynthesizer object from the SpeechConfig and AudioOutputConfig objects and calls its speak_text_async method to synthesize speech from the text content. The get method is called on the result of speak_text_async to wait for the synthesis to complete and retrieve the result.

The function then checks the reason property of the SpeechSynthesisResult object to determine whether the speech was synthesized successfully. If it was, a confirmation message is printed to the console; if the synthesis was canceled, an error message is printed instead. Finally, the function returns the context object.

Overall, this code provides a simple way to convert text to speech using the Azure Cognitive Services Speech SDK and can be used as a plugin in the Semantic Kernel to build conversational agents that synthesize speech output.
· nlpToSqlPlugin

For the nlpToSql plugin, create a new folder named nlpToSqlPlugin within the myPlugins folder and create two files there:

- config.json
- skprompt.txt

In config.json add the following content:

{
    "schema": 1,
    "description": "NLP to SQL",
    "type": "completion",
    "completion": {
        "max_tokens": 200,
        "temperature": 0,
        "top_p": 0,
        "presence_penalty": 0.0,
        "frequency_penalty": 0.0
    },
    "input": {
        "parameters": [
            {
                "name": "input",
                "description": "NLP to SQL",
                "defaultValue": ""
            }
        ]
    }
}

In skprompt.txt:

### SQL SERVER SQL tables, with their properties:
#
# ExplorationProduction (WellID, WellName, Location, ProductionDate, ProductionVolume, Operator, FieldName, Reservoir, Depth, APIGravity, WaterCut, GasOilRatio)
#
### A SQL query to find {{$input}}
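This prompt gives the completion model the table schema and ends with "A SQL query to find {{$input}}", so the model's completion is the SQL statement for whatever request replaces {{$input}}. Before wiring everything together, you can try the semantic function on its own. The following minimal sketch (my own addition, not part of the original post) reuses the same Semantic Kernel calls the orchestrator uses later and assumes the .env file and folder layout created above:

import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureTextCompletion

kernel = sk.Kernel()

# Reads the AZURE_OPENAI_* values from the .env file created earlier.
deployment, api_key, endpoint = sk.azure_openai_settings_from_dot_env()
kernel.add_text_completion_service("dv", AzureTextCompletion(deployment, endpoint, api_key))

# Load the prompt-based skill from ./myPlugins/nlpToSqlPlugin.
functions = kernel.import_semantic_skill_from_directory(".", "myPlugins")
nlp_to_sql = functions["nlpToSqlPlugin"]

# The template ends with "A SQL query to find {{$input}}", so phrase the input accordingly.
result = nlp_to_sql("the number of distinct locations")
print(result.result.split(';')[0])

If everything is configured correctly, this should print a SELECT statement over the ExplorationProduction table.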
Final Orchestrator

Create a file named nlp_sql.py to orchestrate the application flow and paste in the code below.

import os
import time

import pyodbc
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureTextCompletion
from dotenv import load_dotenv

from myPlugins.ttsPlugin.ttsPlugin import TTSPlugin
from myPlugins.sttPlugin.sttPlugin import STTPlugin


# Native functions are used to call the native skills
# 1. Create speech from the text
# 2. Create text from the user's voice through the microphone
def nativeFunctions(kernel, context, plugin_class, skill_name, function_name):
    native_plugin = kernel.import_skill(plugin_class, skill_name)
    function = native_plugin[function_name]
    result = function.invoke(context=context)
    return result["result"]


# Create speech from the text
def speak_out_response(kernel, context, content):
    context["content"] = content
    context["speech_key"] = os.getenv("SPEECH_KEY")
    context["speech_region"] = os.getenv("SPEECH_REGION")
    nativeFunctions(kernel, context, TTSPlugin(), "ttsPlugin", "speak_out_response")


# Create text from the user's voice through the microphone
def recognize_from_microphone(kernel, context):
    context["speech_key"] = os.getenv("SPEECH_KEY")
    context["speech_region"] = os.getenv("SPEECH_REGION")
    return nativeFunctions(kernel, context, STTPlugin(), "sttPlugin", "recognize_from_microphone")


# Semantic functions are used to call the semantic skills
# 1. nlpToSqlPlugin: Create a SQL query from the user's query
def semanticFunctions(kernel, skills_directory, skill_name, input):
    functions = kernel.import_semantic_skill_from_directory(skills_directory, "myPlugins")
    summarizeFunction = functions[skill_name]
    return summarizeFunction(input)


# Function to get the result from the database
def get_result_from_database(sql_query):
    server_name = os.getenv("server_name")
    database_name = os.getenv("database_name")
    username = os.environ.get("SQLADMIN_USER")
    password = os.getenv("SQL_PASSWORD")
    conn = pyodbc.connect('DRIVER={driver};SERVER={server_name};DATABASE={database_name};UID={username};PWD={password}'.format(
        driver="ODBC Driver 18 for SQL Server", server_name=server_name, database_name=database_name, username=username, password=password))
    cursor = conn.cursor()
    try:
        cursor.execute(sql_query)
        result = cursor.fetchone()
    except Exception:
        return "No Result Found"
    cursor.close()
    conn.close()
    return result[0]


def main():
    # Load environment variables from the .env file
    load_dotenv()

    # Create a new kernel
    kernel = sk.Kernel()
    context = kernel.create_new_context()
    context['result'] = ""

    # Configure the AI service used by the kernel
    deployment, api_key, endpoint = sk.azure_openai_settings_from_dot_env()

    # Add the AI service to the kernel
    kernel.add_text_completion_service("dv", AzureTextCompletion(deployment, endpoint, api_key))

    # Starting the conversation
    speak_out_response(kernel, context, "....Welcome to the Kiosk Bot!! I am here to help you with your queries. I am still learning. So, please bear with me.")

    repeat = True
    while repeat:
        speak_out_response(kernel, context, "Please ask your query through the Microphone:")
        print("Listening:")

        # Taking input from the user through the microphone
        query = recognize_from_microphone(kernel, context)
        print("Processing........")
        print("The query is: {}".format(query))

        # Processing the query: generate the SQL statement
        skills_directory = "."
        sql_query = semanticFunctions(kernel, skills_directory, "nlpToSqlPlugin", query).result.split(';')[0]
        print("The SQL query is: {}".format(sql_query))

        # Use the query to call the database and get the output
        result = get_result_from_database(sql_query)

        # Speak out the result to the user
        speak_out_response(kernel, context, "The result of your query is: {}".format(result))
        speak_out_response(kernel, context, "Do you have any other query? Say Yes to Continue")

        # Taking input from the user
        print("Listening:")
        user_input = recognize_from_microphone(kernel, context)
        if user_input == 'Yes.':
            repeat = True
        else:
            repeat = False
            speak_out_response(kernel, context, "Thank you for using the Kiosk Bot. Have a nice day.")


if __name__ == "__main__":
    start = time.time()
    main()
    print("Time taken Overall(mins): ", (time.time() - start)/60)
Code Explanation:

The code above is a Python script named nlp_sql.py that uses the Semantic Kernel to build a conversational agent that can process natural language queries and generate SQL queries to retrieve data from a database. The script uses the Azure Cognitive Services Speech SDK, through the plugins created earlier, to recognize speech input from the user and synthesize speech output to the user.

The script defines several functions that call the native and semantic skills of the Semantic Kernel. The nativeFunctions function calls the native skills of the Semantic Kernel, which are implemented as the plugins above. The speak_out_response function synthesizes speech output from text content using the TTSPlugin, and the recognize_from_microphone function recognizes speech input from the user using the STTPlugin.

The semanticFunctions function calls the semantic skills of the Semantic Kernel, which are defined by the prompt files created earlier. The nlpToSqlPlugin semantic function is used to generate SQL queries from natural language queries.

The get_result_from_database function retrieves data from the SQL Server database using the pyodbc module. It takes a SQL query as input, executes the query over a database connection, and returns the result of the query.

The main function is the entry point of the script. It first loads environment variables from the .env file using the load_dotenv function of the dotenv module, then creates a new instance of the Semantic Kernel and configures it to use the Azure OpenAI text completion service.

The function then starts a conversation with the user by synthesizing speech output to welcome the user and ask for a query. It listens for speech input using recognize_from_microphone, generates a SQL query from the question using the nlpToSqlPlugin semantic function, retrieves data from the database using get_result_from_database, and synthesizes speech output to speak the result to the user.

The function then asks the user whether they have any other queries and listens for speech input. If the user says "Yes", the loop repeats; if the user says anything else, the function synthesizes speech output to thank the user and ends the conversation.

Overall, this code provides a simple way to build a conversational agent that can process natural language queries and retrieve data from a SQL Server database using the Semantic Kernel and the Azure Cognitive Services Speech SDK.
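Before running the full voice loop, it can be handy to exercise just the text path. The sketch below (my own addition, not part of the original post) imports semanticFunctions and get_result_from_database from nlp_sql.py and runs a single question end to end without the microphone or speaker:

import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import AzureTextCompletion
from dotenv import load_dotenv

# Importing from nlp_sql is safe because main() only runs under the __main__ guard.
from nlp_sql import semanticFunctions, get_result_from_database

# Load the database credentials used by get_result_from_database.
load_dotenv()

kernel = sk.Kernel()
deployment, api_key, endpoint = sk.azure_openai_settings_from_dot_env()
kernel.add_text_completion_service("dv", AzureTextCompletion(deployment, endpoint, api_key))

question = "How many locations are there?"
sql_query = semanticFunctions(kernel, ".", "nlpToSqlPlugin", question).result.split(';')[0]
print("Generated SQL:", sql_query)
print("Result:", get_result_from_database(sql_query))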
Sample Demo

Here is a sample run of the application; the spoken prompts and responses are printed to show the conversation.

Speech synthesized to speaker for text [....Welcome to the Kiosk Bot!! I am here to help you with your queries. I am still learning. So, please bear with me.]
Speech synthesized to speaker for text [Please ask your query through the Microphone:]
Listening:
Processing........
The query is: How many locations are there?
The SQL query is: SELECT COUNT(DISTINCT Location) AS 'Number of Locations' FROM ExplorationProduction
Speech synthesized to speaker for text [The result of your query is: 9985]
Speech synthesized to speaker for text [Do you have any other query? Say Yes to Continue]
Listening:
Speech synthesized to speaker for text [Please ask your query through the Microphone:]
Listening:
Processing........
The query is: How many wells are there were water cut is more than 95?
The SQL query is: SELECT COUNT(*) FROM ExplorationProduction WHERE WaterCut > 95
Speech synthesized to speaker for text [The result of your query is: 245]
Speech synthesized to speaker for text [Do you have any other query? Say Yes to Continue]
Listening:
Speech synthesized to speaker for text [Thank you for using the Kiosk Bot. Have a nice day.]
Time taken Overall(mins):  1.3298223217328389

Conclusion

In a world where data-driven insights fuel progress, the way we interact with our data can make all the difference. By harnessing Azure OpenAI, Semantic Kernel, and Azure Speech Services, we have removed the barriers that once confined data querying to complex syntax and manual effort. Instead of painstakingly crafting SQL queries, we can now express them in the most natural way possible: through speech.

This is about more than convenience; it is about empowerment. Empowerment to ask questions in our own words, unburdened by technical syntax, and to hear the answers in the form that resonates with us most, spoken language. Azure OpenAI and Semantic Kernel turn natural speech into structured SQL automatically, and Azure Speech Services delivers the results back as speech, easy to consume and intuitively absorbed.

The path we walked together covered Azure resource setup, model deployment, and integration, the stepping stones that brought us to this point. Armed with this knowledge, we are equipped to redefine the way we seek insights, unravel patterns, and make informed decisions.

As we conclude this journey, envision a future where the conversation with data is no longer limited by technical jargon, where insights are just a spoken query away and the responses arrive as familiar words. The potential is vast, and the power to reshape our relationship with data lies firmly within our grasp. So let us embrace this transformation and stride confidently into a realm where data is not just queried, but conversed with, one spoken word at a time.