J
Jose_Manuel_Jurado
Last week, while working on a support case, our customer was facing performance issues in their Python application.
After some investigation, I decided to suggest CProfiler to identify which function call was taking the most time and use that as a starting point for troubleshooting.
So profiling the Python code became essential to pinpoint the bottleneck. I suggested using CProfiler, a built-in Python module, which helps you profile your code and identify performance issues in real time.
But first, I ran some examples with my own test code to see the results.
In this code snippet:
I was able to track the exact function calls and time spent on each one, for example,
Analyzing the profiling results for the Python code execution, sorted by cumulative time, I was able to find the following results:
Enjoy!
Continue reading...
After some investigation, I decided to suggest CProfiler to identify which function call was taking the most time and use that as a starting point for troubleshooting.
So profiling the Python code became essential to pinpoint the bottleneck. I suggested using CProfiler, a built-in Python module, which helps you profile your code and identify performance issues in real time.
But first, I ran some examples with my own test code to see the results.
In this code snippet:
- RunCommandTimeout is a function that performs command timeouts against a SQL database.
- cProfile.Profile() starts profiling the code within the context.
- After execution, pstats.Stats(Profile) helps us visualize the most time-consuming calls, sorting them by cumulative time.
Code:
import cProfile
import pstats
with cProfile.Profile() as Profile:
RunCommandTimeout(initial_timeout=1, loop_count=5, retry=True, retry_count=3, retry_increment=4)
results = pstats.Stats(Profile)
results.strip_dirs().sort_stats('cumulative').print_stats(10)
I was able to track the exact function calls and time spent on each one, for example,
- Function calls and their frequency
- Time spent in each function (including sub-calls)
- Cumulative time taken by functions, which is useful for spotting bottlenecks
Code:
Ordered by: cumulative time
List reduced from 181 to 10 due to restriction <10>
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.357 0.357 84.706 84.706 GiveNotesPerformance.py:240(RunCommandTimeout)
7 54.108 7.730 54.108 7.730 {method 'execute' of 'pyodbc.Cursor' objects}
3 27.001 9.000 27.001 9.000 {built-in method time.sleep}
4 0.001 0.000 3.025 0.756 GiveNotesPerformance.py:63(ConnectToTheDB)
4 2.945 0.736 2.951 0.738 {built-in method pyodbc.connect}
1 0.209 0.209 0.209 0.209 {method 'close' of 'pyodbc.Connection' objects}
29 0.035 0.001 0.035 0.001 {method 'flush' of '_io.TextIOWrapper' objects}
12 0.000 0.000 0.032 0.003 GiveNotesPerformance.py:526(set_text_color)
4 0.001 0.000 0.030 0.007 GiveNotesPerformance.py:39(get_credentials_from_file)
4 0.027 0.007 0.027 0.007 {built-in method io.open}
Analyzing the profiling results for the Python code execution, sorted by cumulative time, I was able to find the following results:
- I found very interesting points, such as the execute method being called 7 times with a total time of 54 seconds. This indicates that the execute method is responsible for the majority of the time spent in this execution. Each call takes an average of 7.73 seconds, suggesting that the database queries are the primary bottleneck.
- Additionally, my time.sleep function shows that it was called to introduce delays, likely as part of the retry mechanism built into the RunCommandTimeout function. Each sleep call lasted for an average of 9 seconds.
Enjoy!
Continue reading...