Drac_Zhang
The previous blog (Integrate Azure Open AI in Teams Channel via Logic App - Microsoft Community Hub) only supports plain-text conversation.
Azure Open AI has now released the GPT-4o model, which integrates text and images in a single model, enabling it to handle multiple data types simultaneously.
This post shows how to upgrade the integration to GPT-4o so that it can also process images.
Prerequisite
Most of the prerequisites are the same as before; the only difference is that we need a GPT-4o deployment instead of GPT-3.5 in the Azure Open AI environment.
GPT-4o is not available in all regions, so refer to the document (Azure OpenAI Service models - Azure OpenAI | Microsoft Learn) to check whether your current region supports GPT-4o. If it does not, you will have to deploy a new Azure Open AI resource in a supported region.
Mechanism
According to the Open AI API, when we send an image to the API, we need to provide the image as base64-encoded content.
Code:
[
  {
    "role": "user",
    "content": [
      {
        "type": "image_url",
        "image_url": {
          "url": ".....U5ErkJggg=="
        }
      },
      {
        "type": "text",
        "text": "what is this logo"
      }
    ]
  }
]
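As an illustration of the payload above, here is a minimal Python sketch that builds the same message structure from raw image bytes. The function name `build_image_message` and the `mime_type` parameter are hypothetical, and the sketch assumes the base64 content is passed as a data URL (`data:<mime type>;base64,<payload>`), which is the form the chat completions API accepts for inline images. It is not taken from the Logic App template.

```python
import base64


def build_image_message(image_bytes: bytes, question: str,
                        mime_type: str = "image/png") -> list:
    """Build a chat `messages` list with one image part and one text part.

    Hypothetical helper for illustration only.
    """
    # Base64-encode the raw image bytes and decode to a plain ASCII string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # Assumption: the image is sent inline as a data URL.
    data_url = f"data:{mime_type};base64,{encoded}"
    return [
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": data_url}},
                {"type": "text", "text": question},
            ],
        }
    ]
```

Calling `build_image_message(image_bytes, "what is this logo")` returns a list with the same shape as the JSON shown above.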
So the only remaining step is to get the image binary data from the Teams channel, which we can do with the Microsoft Graph API: Get chatMessageHostedContent - Microsoft Graph v1.0 | Microsoft Learn
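For readers who want to see what that Graph call looks like outside of Logic Apps, the following is a rough Python sketch. It assumes you already have a valid Graph access token and know the team, channel, message and hosted content IDs; the function names are hypothetical. The `/$value` suffix asks Graph for the raw bytes of the hosted content.

```python
import urllib.request
from urllib.parse import quote

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def hosted_content_request(team_id: str, channel_id: str, message_id: str,
                           hosted_content_id: str,
                           access_token: str) -> urllib.request.Request:
    """Build a GET request for the raw bytes of a channel message's hosted content.

    Hypothetical helper; the token must be acquired separately.
    """
    # Channel IDs contain characters such as ':' and '@', so encode each segment.
    url = (
        f"{GRAPH_BASE}/teams/{quote(team_id, safe='')}"
        f"/channels/{quote(channel_id, safe='')}"
        f"/messages/{quote(message_id, safe='')}"
        f"/hostedContents/{quote(hosted_content_id, safe='')}/$value"
    )
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )


def fetch_hosted_content(request: urllib.request.Request) -> bytes:
    """Send the request and return the image bytes (performs a network call)."""
    with urllib.request.urlopen(request) as response:
        return response.read()
```

The bytes returned by `fetch_hosted_content` are what then needs to be base64-encoded before being sent to the model.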
Template and parameters
You can find the template in Drac-Zhang/LogicApp_Teams_OpenAI_Integration_WithGPT4o (github.com)
Parameters of the template:
Parameter Name | Comments |
openai_apikey | The API key of the Open AI resource; it can be found in Open AI -> Keys and Endpoints |
openai_endpoint | The Open AI API endpoint, in the format https://[Open AI resource Name].openai.azure.com/openai/deployments/[Deployment name]/chat/completions?api-version=2024-04-01-preview. Compared to the previous blog, the API version has been upgraded to 2024-04-01-preview |
teams_channel_keyword | The keyword that triggers the Logic App; not case sensitive |
teams_channel_id | See previous blog Integrate Azure Open AI in Teams Channel via Logic App - Microsoft Community Hub |
teams_group_id | See previous blog Integrate Azure Open AI in Teams Channel via Logic App - Microsoft Community Hub |
storage_account_name | The name of the storage account used for saving conversation history |
storage_account_accesskey | The access key for the storage account |
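To make the openai_endpoint format and the use of openai_apikey concrete, here is a small Python sketch that assembles the endpoint URL and a chat completions request. The helper names are hypothetical and the resource and deployment names are placeholders. The sketch assumes key-based authentication, where the key is sent in the `api-key` header; it only builds the request and does not send it.

```python
import json
import urllib.request


def build_endpoint(resource_name: str, deployment_name: str,
                   api_version: str = "2024-04-01-preview") -> str:
    """Assemble the endpoint in the format described for openai_endpoint."""
    return (
        f"https://{resource_name}.openai.azure.com/openai/deployments/"
        f"{deployment_name}/chat/completions?api-version={api_version}"
    )


def chat_request(endpoint: str, api_key: str,
                 messages: list) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the chat messages."""
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: key-based auth using the openai_apikey value.
            "api-key": api_key,
        },
    )
```

Passing the resulting request to `urllib.request.urlopen` would perform the actual call.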
Known Issues
In addition to the known issues from the previous blog, there is a new one:
The "Send a Microsoft Graph Http Request" action in the "Microsoft Teams" connector cannot correctly return hosted content as binary data, so we have to use "Invoke an HTTP request" instead. This adds an extra API connection (webcontents), which requires manual authentication after the deployment.
Sample Chat