Sometimes, on the Azure portal, you might see an error message on the Function App like "we were not able to load some functions in the list due to errors."
There are many reasons for this symptom, such as connection errors with the storage account, the runtime being down, indexing failures, and the synctriggers failure that we'll discuss here.
To confirm whether your issue is indeed due to synctriggers, you could press F12 in your browser to open the developer tools and search for the keyword "batch" under the "Network" tab. The "batch" endpoint is used by the Azure portal to call various internal services of the Function App (e.g., retrieving app settings, site information, getting the host status, etc.), including synctriggers.
Under the "Network" tab's "Payload" section, you can find these invocation activities. Look for the "WebsitesExtension.sync" activity (i.e., synctriggers) and note its GUID name.
Then, in the "Preview" section under the "Network" tab, use the GUID name to find the corresponding service invocation results. In this example, you might find that the return status code of the synctriggers invocation is not 200, meaning the invocation failed for some reason, which explains the related error messages in the Azure portal.
In other words, the reason we cannot see the triggers in the Azure portal is an internal "synctriggers" invocation failure. The causes of synctriggers failures are numerous, with the majority being network-related. Hence, we have compiled this simple SOP to help you quickly perform self-troubleshooting.
TOC
- What is it
- Architecture
- Troubleshooting Cases
- Summary
- References
What is it
synctriggers is an internal endpoint of an Azure Function App that synchronizes the triggers defined in your application with the platform's metadata.
Purpose of synctriggers Endpoint:
- Trigger Synchronization:
- To ensure that the triggers defined in the function app (e.g., HTTP triggers, timer triggers, etc.) are registered and synchronized with the underlying Azure Functions runtime and the Azure platform.
- Updating Configuration:
- When changes are made to the function app (e.g., adding, updating, or removing triggers), the synctriggers endpoint helps propagate these changes.
- Deployment and Scaling:
- During the deployment/scaling process, the synctriggers endpoint is called to update the function definitions and inform the runtime of any new or modified triggers.
- Trigger Management:
- It is used to manage and maintain the lifecycle of triggers, ensuring that they are up to date.
When is synctriggers Called:
- Updating Configuration:
- Whenever there are changes to the function app settings or triggers, this endpoint is called to resynchronize the changes.
- Scaling Operations:
- When the function app scales out or scales in, the endpoint ensures that new instances understand the triggers they need to work with.
- Deployment:
- During the deployment of the function app, the synctriggers endpoint is invoked to register the triggers with the platform.
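For reference, this synchronization can also be triggered manually, which is handy when reproducing a failure outside the portal. Below is a minimal sketch using the Azure CLI; <resource-group> and <function-app> are placeholders to substitute with real names:
# Manually ask the platform to re-sync the function triggers.
az resource invoke-action --resource-group <resource-group> --action syncfunctiontriggers --name <function-app> --resource-type Microsoft.Web/sites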
Architecture
We need to understand that, in this scenario, the caller of synctriggers is the Kudu container within the Function App, and the callee is the application itself. Under normal circumstances, this invocation passes through different network components before reaching the destination. Therefore, if any part of this path encounters an issue, the entire flow fails.
In the following sections, we will discuss the potential issues causing synctriggers failures based on different network architectures (i.e., different numbered arrow processes). Specifically, we will cover:
- Possible reasons for issues occurring without a detailed network architecture.
- Possible reasons when using VNet and NSG.
- Possible reasons when using VNet and route table (and 3.1, the combined state of 2 and 3).
- DNS issues.
Besides, when we cannot see the deployed triggers on the Function App Overview page in the Azure portal, it is usually because this invocation step failed. However, there are other possible causes of synctriggers failures, as described above, and the caller of synctriggers might not be the Kudu container, in which case the network architecture might differ.
Troubleshooting Cases
[Condition 1]
The internal endpoint "/admin/host/synctriggers" is called by the Kudu container. In the normal scenario, the Kudu container makes direct requests to the application.
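To verify that this direct path is healthy, we could probe the endpoint from the Kudu debug console, assuming curl is available there. The hostname below is a placeholder; note that the /admin routes require the host master key, so even a 401 response proves network reachability:
# From the Kudu console: confirm the app answers on TCP 443.
# A 401/403 here still means the network path itself is fine.
curl -sv -o /dev/null https://my-function-app.azurewebsites.net/admin/host/status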
Solution 1:
Sometimes, the issue arises when we initially set up the Function App with only one of the two settings "WEBSITE_CONTENTOVERVNET" and "WEBSITE_CONTENTAZUREFILECONNECTIONSTRING". These two app settings should either be retained or removed simultaneously, never kept alone.
We could simply add/remove them from here:
And restart the app after applying those changes.
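If we prefer scripting this over the portal, here is a hedged Azure CLI sketch (placeholder names) that checks whether only one of the two settings exists, removes both together, and restarts the app:
# Show whether only one of the two related settings is present.
az functionapp config appsettings list --name <function-app> --resource-group <resource-group> --query "[?name=='WEBSITE_CONTENTOVERVNET' || name=='WEBSITE_CONTENTAZUREFILECONNECTIONSTRING']" -o table
# Remove both settings together (or re-add both), then restart.
az functionapp config appsettings delete --name <function-app> --resource-group <resource-group> --setting-names WEBSITE_CONTENTOVERVNET WEBSITE_CONTENTAZUREFILECONNECTIONSTRING
az functionapp restart --name <function-app> --resource-group <resource-group>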
[Condition 2]
In more complex network configurations, the Function App is set up with VNet integration, and its subnet is configured with NSG (Network Security Group) rules that restrict inbound and outbound traffic on specific ports from that subnet.
We could simply retrieve the NSG rules applied to that subnet, if available.
Here is an example in grid view.
Since synctriggers is invoked via HTTPS, it uses port 443. We need to check whether there is any deny rule matching the combination of "tcp" + "port 443" + "source/destination IP". In this example, all remaining traffic, including traffic on port 443, is blocked. This interrupts the process in condition 2, indirectly causing this error.
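If the portal grid view is not at hand, the same rules can be dumped with the Azure CLI; a sketch with placeholder names:
# List the rules of the NSG attached to the integration subnet,
# then look for any Deny rule that matches TCP/443.
az network nsg rule list --nsg-name <nsg-name> --resource-group <resource-group> -o table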
Solution 2:
The solution is to identify and remove the problematic rule and then try again.
If possible, we could also temporarily detach the NSG from the subnet. This way, we could quickly determine if the issue originates from there.
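One possible way to do that temporary detach from the CLI is the generic property removal below (a sketch with placeholder names; remember to re-associate the NSG once the test is done):
# Temporarily disassociate the NSG from the integration subnet.
az network vnet subnet update --vnet-name <vnet-name> --name <subnet-name> --resource-group <resource-group> --remove networkSecurityGroup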
[Condition 3]
Many network engineers need to use an NVA (typically a firewall) to centrally log all traffic from different VNets/subnets. Therefore, it is common to set up a route table on the subnet with custom rules, directing any requests originating within the subnet to the NVA for forwarding before they actually reach the target.
Again, we could simply get the route table (RT) rules for that subnet from ASC, if available.
Here is an example in grid view.
There is only one rule in this route table: send all traffic to an NVA for forwarding before it leaves the subnet. The issue arises from this configuration.
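Outside the portal, the routes can be listed with the Azure CLI as well; placeholder names again:
# List all routes; watch for a 0.0.0.0/0 route whose next hop type
# is VirtualAppliance (the NVA).
az network route-table route list --route-table-name <route-table-name> --resource-group <resource-group> -o table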
Solution 3A:
As in condition 2, if possible, we could temporarily detach the RT from the subnet. This way, we could quickly determine whether the issue originates from there.
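The temporary detach can use the same kind of subnet update shown for the NSG earlier; again only a sketch with placeholder names:
# Temporarily disassociate the route table from the subnet for testing.
az network vnet subnet update --vnet-name <vnet-name> --name <subnet-name> --resource-group <resource-group> --remove routeTable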
Since the NVA is usually managed by ourselves rather than by Azure, the solution is to set up an allow rule in our internal firewall settings to permit TCP 443 requests originating from Kudu.
Solution 3B:
The synctriggers endpoint is invoked via the HTTPS protocol, so the SSL root certificates of the HTTP server within the application need to be recognized by the firewall. If the firewall does not have these root certificates installed, certificate errors will occur during the Kudu request process, leading to request failures.
We could simply check it using the following command from the Kudu site:
openssl s_client -connect my-function-app.azurewebsites.net:443
Here is the example result.
The solution is to install the required SSL root certificates on the firewall.
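To identify which root CA must be trusted, we could extend the same openssl check to print the served certificate's subject and issuer; a sketch with a placeholder hostname:
# Print the leaf certificate's subject and issuer; the issuer tells us
# which CA chain the firewall needs to trust.
openssl s_client -connect my-function-app.azurewebsites.net:443 -servername my-function-app.azurewebsites.net -showcerts </dev/null 2>/dev/null | openssl x509 -noout -subject -issuer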
[Condition 3.1]
When both NSG and RT are present in the subnet, the rules for both configurations need to be reviewed together. If we have a working Function App to use as a comparison, it would be quicker to identify the differences between the two settings.
[Condition 4: Custom DNS]
The previous solutions address issues that occur during the connection process after the request target has been identified. However, there is also a scenario where the request fails because the target cannot be identified (i.e., we could not resolve the IP using nameresolver or nslookup).
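To confirm a resolution failure, run a lookup from the Kudu console first. A sketch with a placeholder hostname; on Windows App Service the sandbox restricts nslookup, so nameresolver is the usual tool there, while nslookup works on a Linux console:
# Check whether the target hostname resolves from inside the app.
nameresolver my-function-app.azurewebsites.net
# On a Linux Kudu/SSH console:
nslookup my-function-app.azurewebsites.net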
Solution 4A:
If our subnet is using the Azure default DNS (i.e., 168.63.129.16), there might have been some anomaly in the Function App resource provider's DNS registration behavior at the time the issue occurred. Please contact an Azure support engineer, since we cannot directly access the related logs.
Solution 4B:
If we are using a custom DNS server, we could get the server's access logs and check for any anomalies in the requests during that time period.
Summary
This article focuses on exploring causes across different network scenarios and attempts to provide solutions. However, a synctriggers failure is merely a symptom of the actual problem; there are still many other potential causes beyond networking that could lead to this type of error. Therefore, understanding the mechanism, timing, and process of synctriggers is crucial for DevOps personnel. This knowledge can help us quickly identify the root cause of issues.
References
Azure Functions のトリガーの同期とは (What is trigger synchronization in Azure Functions?) - Japan PaaS Support Team Blog