The last step above produces an analysis file in human-readable form. Besides some general information, this file contains two tables: the flat profile and the call graph. The flat profile gives an overview of the timing of the functions, such as how much time was consumed executing a particular function and how many times it was called. The call graph, on the other hand, focuses on each function's relationships: through which functions a particular function was called, and which functions were called from within it. This way one also gets an idea of the execution time spent in the sub-routines. Let's try to understand the three steps listed above through a practical example.
The following test code will be used throughout the article. In this first step, we need to make sure that profiling is enabled when the code is compiled; with gcc this is done by passing the -pg option. You must use this option when compiling the source files you want data about, and you must also use it when linking.
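The original listing is not reproduced here, so the sketch below is only an approximation: the function names func1 and func2 match those referenced later in the article, but the file name test_gprof.c and the loop counts (kept deliberately close to each other, matching the behaviour discussed in the comments further down) are assumptions.

/* test_gprof.c -- illustrative test program for gprof */
#include <stdio.h>

void func2(void)
{
    for (volatile unsigned long i = 0; i < 190000000UL; i++)
        ;                     /* burn some CPU time */
}

void func1(void)
{
    for (volatile unsigned long i = 0; i < 200000000UL; i++)
        ;                     /* burn slightly more CPU time */
    func2();                  /* func2 is also reachable through func1 */
}

int main(void)
{
    printf("Inside main()\n");
    func1();
    func2();
    return 0;
}

Enabling profiling then means adding -pg both when compiling and when linking; with a single gcc invocation, both happen at once:

$ gcc -Wall -pg test_gprof.c -o test_gprof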
In the second step, the binary produced in step 1 is executed so that the profiling information can be generated.
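Assuming the binary was built as test_gprof as in the sketch above, a single run is enough; when the program exits, the profiling runtime writes a file named gmon.out into the current working directory:

$ ./test_gprof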
Note that if, during execution, the program changes its current working directory using chdir, then gmon.out will be created in the new current working directory. Also, your program needs sufficient permissions for gmon.out to be created there. In the third step, the gprof tool is run on the executable together with gmon.out, which produces an analysis file containing all the desired profiling information.
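A command along these lines does the job (the names test_gprof and analysis.txt follow the sketch above and are assumptions; gmon.out is the name the profiling runtime itself uses):

$ gprof test_gprof gmon.out > analysis.txt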
Note that one can explicitly redirect the output to a file, as in the example above; otherwise the information is written to stdout. On a related note, you should also understand how to debug your C program using gdb.
Let's have a look at this text file. The individual columns of the flat profile as well as of the call graph are explained very well in the output itself.
There are various flags available to customize the output of the gprof tool. Some of them are discussed below. If there are some static functions whose profiling information you do not require, this can be achieved using the -a option:
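For example, with the file names used so far, a run that leaves out static functions might look like this:

$ gprof -a test_gprof gmon.out > analysis.txt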
As you will already have seen, gprof produces output with a lot of verbose explanatory text; if this information is not required, it can be suppressed using the -b flag. Note that I have used, and will keep using, the -b option so as to avoid the extra information in the analysis output. Similarly, the flat profile can be restricted to a single function by passing that function's name along with the -p flag, as shown below; the result is a flat profile containing information related only to func1.
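The corresponding invocations, using the example names from above and GNU gprof's -p[symspec] form (the function name is appended directly to the flag), would be roughly:

$ gprof -b test_gprof gmon.out > analysis.txt
$ gprof -b -pfunc1 test_gprof gmon.out > analysis.txt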
Also, if there is a requirement to print the flat profile while excluding a particular function, this is possible with the -P flag by passing along the name of the function to exclude. Now let's see the analysis output. Similarly, if it is desired to suppress a specific function from the call graph, this can be achieved by passing the desired function name along with the -Q option to the gprof tool.
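With the same example names, the two variants might be invoked as follows; like -p, both -P and -Q take the symbol to exclude appended directly:

$ gprof -b -Pfunc1 test_gprof gmon.out > analysis.txt
$ gprof -b -Qfunc1 test_gprof gmon.out > analysis.txt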
Thank you very much, keep up the good work!

Some more reasons: here, and here.

I think it is better to make the time difference between the functions bigger: in the flat profile we expect func1 to work longer than func2, but because the difference in the loops is small, every run of the program can produce differently sorted results in the flat profile. Sorry for bad English and thank you very much for this article.
I have been using gprof to isolate a performance issue in a large-scale business application, but recent attempts to do this have stalled.

gprof is not very good for what you need. In a large application like yours, the flat profile is mainly about self time, which in a large program is usually irrelevant because the real problems are mid-stack. The only reason I came here is that I was trying to remember why gmon.out…
As usual in life, it is a matter of using the tool in the right way, in addition to using the right tool, and that involves having enough experience to interpret it. You know, sort of like you don't use a hammer to put a screw into a screw hole? The same goes the other way around too. Bottom line: suggesting not to use a profiling tool because it isn't designed for finding bottlenecks (which, incidentally, is a word only you used) doesn't hold up. The man page, by the way, suggests this: "If you simply want to know which functions burn most of the cycles, it is stated concisely here."
"The call graph shows, for each function, which functions called it, which other functions it called, and how many times. There is also an estimate of how much time was spent in the subroutines of each function. This can suggest places where you might try to eliminate function calls that use a lot of time." In other words, in the hands of an experienced programmer, it is a very valuable tool for exactly what you claim it isn't good for.
Depending on the chosen tool, the UCode is instrumented appropriately to record the data of interest. For performance profiling we are interested in the tool callgrind: a profiling tool that records the function call history as a call graph. For analyzing the collected profiling data, there is the amazing visualization tool KCachegrind.
It represents the collected data in a very nice way, which tremendously helps to get an overview of what is going on. We can execute any executable as it is with valgrind.
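For instance, a run under callgrind (the binary name myprogram is just a placeholder) looks roughly like this; it produces a file named callgrind.out.<pid> that KCachegrind can open:

$ valgrind --tool=callgrind ./myprogram
$ kcachegrind callgrind.out.*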
Of course, the executed program should contain debugging information in order to get an expressive call graph with human-readable symbol names. The profiling result itself is not influenced by the measurement. Gperftools from Google provides a set of tools aimed at analyzing and improving the performance of multi-threaded applications. They offer a CPU profiler, a fast thread-aware malloc implementation, a memory leak detector and a heap profiler. We focus on their sampling-based CPU profiler.
Creating a CPU profile of selected parts of your application with gperftools requires a few steps, which are sketched below, but you can find more about this in the docs.
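A minimal sketch of those steps, assuming the ProfilerStart/ProfilerStop API from <gperftools/profiler.h> and the pprof analysis tool (the file names profile_region.c and cpu.prof are illustrative):

/* profile_region.c -- sketch: profile only a selected region with the
 * gperftools CPU profiler. */
#include <gperftools/profiler.h>
#include <stdio.h>

static double busy_work(void)
{
    double sum = 0.0;
    for (long i = 1; i < 50000000L; i++)
        sum += 1.0 / (double)i;   /* something measurable to sample */
    return sum;
}

int main(void)
{
    ProfilerStart("cpu.prof");    /* start sampling; samples go to cpu.prof */
    double result = busy_work();  /* only this region is profiled */
    ProfilerStop();               /* stop sampling and flush the profile */
    printf("result = %f\n", result);
    return 0;
}

Compile and link against the profiler library, run the binary, and then inspect the profile, for example with:

$ gcc -g profile_region.c -o profile_region -lprofiler
$ ./profile_region
$ pprof --text ./profile_region cpu.prof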
Returning to the explanation found in the gprof output: the call-graph table describes the call tree of the program and is sorted by the total amount of time spent in each function and its children. Each entry in this table consists of several lines. The line with the index number at the left-hand margin lists the current function.
The lines above it list the functions that called this function, and the lines below it list the functions this one called. All of the explanation mentioned above is also present in the output file generated by gprof; I have reproduced it here for the readers' convenience.
Note: all of this explanation is produced in the output file every time gprof is run. So, once you have understood the details, you can suppress it with the gprof command-line option -b.