Introduction
The profiling tools let developers measure, evaluate, and target performance-related issues in their code, and compare the reports from different runs. The tools can also be run from the command line, which gives users the flexibility to run them by hand or to automate profiling tasks with scripts.
VS2010 Ultimate comes with MS profiler tools. These tools are fully integrated into the VS IDE to provide a seamless and approachable user experience.
Profiling an application is straightforward. It involves the following six steps:
- Create a performance session and configure it by specifying the collection method and the data that you intend to collect.
- Run the application in the performance session to collect the profiling data.
- Review the profiler reports and analyze the data to identify the performance issue(s).
- Modify the code to improve the application's performance.
- Collect profiling data on the changed code, and compare the profiling data of the original and changed code.
- Generate a report that documents the increase in performance.
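The last two steps amount to comparing per-function numbers from two runs. As a minimal sketch in Python, assume each report has already been reduced to a hypothetical mapping of function name to elapsed milliseconds; the profiler's own report format and comparison view are not modelled here, and the function names are made up for illustration.

```python
def compare_reports(baseline, changed):
    """Return {function: (baseline_ms, changed_ms, percent_change)}.

    Both arguments are hypothetical dicts of function name -> elapsed ms.
    A negative percent_change means the changed code is faster.
    """
    comparison = {}
    for name in sorted(set(baseline) | set(changed)):
        before = baseline.get(name, 0.0)
        after = changed.get(name, 0.0)
        # Functions absent from the baseline have no meaningful percentage.
        percent = None if before == 0 else 100.0 * (after - before) / before
        comparison[name] = (before, after, percent)
    return comparison


# Hypothetical numbers for an original and a modified build.
baseline_run = {"LoadOrders": 400.0, "RenderGrid": 100.0}
changed_run = {"LoadOrders": 100.0, "RenderGrid": 100.0}
report = compare_reports(baseline_run, changed_run)
```

Here `report["LoadOrders"]` carries both raw values and the relative change, which is the kind of figure the final report is meant to document.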
Let's first look at the different profiling methods and the terminology used in the profiler reports.
Types of Profiling
The following four types of profiling are available:
CPU Sampling: Collects application statistics that are useful for initial analysis and for analyzing CPU utilization issues.
Instrumentation Profiling: Collects detailed timing data that is useful for focused analysis and for analyzing input/output performance issues.
.NET Memory Allocation: Collects .NET memory allocation and object lifetime data, which helps to detect memory-related issues in the application.
Concurrency Profiling: Collects numeric resource contention data and process and thread execution data, which are useful in analyzing multi-threaded and multi-process applications.
Let's look at each of these profiling methods in more detail.
CPU SAMPLING
In this method, the profiler periodically interrupts the execution of the application to take a sample. The interrupts are driven by one of the CPU performance counters (the default is every 10,000,000 non-halted CPU cycles).
You can change the sampling interval, or switch to a different counter event as the trigger for your samples. With the non-halted CPU cycles event, samples are collected while your application is actively using the CPU. With the Last Level Cache Misses event, samples are generated on cache misses (so if there are no cache misses, no data is collected). The VS Profiler supports sampling on only one counter at a time, so you can configure only one counter for a sampling session.
When you look through the sampling results, you will not see separate data columns named after the counter. Using a specific counter in this scenario means that your samples were triggered by that counter's event, so interpret your samples (inclusive, exclusive, function samples, module samples, call trees, and so on) accordingly.
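To make counter-triggered sampling concrete, here is a toy simulation in Python. It assumes a hypothetical timeline of (function, cycles) segments and takes one sample each time the cumulative cycle count reaches a multiple of the interval. It only illustrates the idea and says nothing about how the VS Profiler is implemented.

```python
from collections import Counter


def simulate_sampling(timeline, interval):
    """Count samples per function for a timeline of (function, cycles) pairs.

    A sample is attributed to whichever function is running when the
    cumulative cycle count reaches a multiple of `interval`.
    """
    samples = Counter()
    elapsed = 0
    next_sample = interval
    for function, cycles in timeline:
        segment_end = elapsed + cycles
        # Take every sample point that falls inside this segment.
        while next_sample <= segment_end:
            samples[function] += 1
            next_sample += interval
        elapsed = segment_end
    return samples


# Hypothetical workload: Parse is short, Compute dominates.
timeline = [("Parse", 15_000_000), ("Compute", 80_000_000), ("Parse", 5_000_000)]
samples = simulate_sampling(timeline, interval=10_000_000)
```

A function that starts and finishes between two sample points is never sampled in this model, which is one reason sampling suits initial analysis better than precise timing.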
To select a specific counter for your sampling session in Visual Studio, open the Properties of your Performance Session (right-click, then Properties) and select the Sampling tab. In the Sample event drop-down list, switch from the default Clock cycles to Performance counter.
The following four types of sample events are available:
- Clock Cycles - for CPU-bound problems
- Page Faults - for memory-related problems
- System Calls - for I/O-related problems
- Performance Counters - for low-level performance problems
Additional sample events can be specified based on the available performance counters.
Understanding Sampling Data Values
- Inclusive samples
  - The total number of samples that are collected during the execution of the target function.
  - This includes samples that are collected during the direct execution of the function code and samples that are collected during the execution of child functions that are called by the target function.
- Exclusive samples
  - The number of samples that are collected during the direct execution of the instructions of the target function.
  - Exclusive samples do not include samples that are collected during the execution of functions that are called by the target function.
- Inclusive percent
  - The percentage of the total number of inclusive samples in the profiling run that are inclusive samples of the function or data range.
- Exclusive percent
  - The percentage of the total number of exclusive samples in the profiling run that are exclusive samples of the function or data range.
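The definitions above can be sketched in a few lines of Python. This assumes each sample is a hypothetical call stack listed outermost caller first, and that percentages are taken relative to the total number of samples in the run; the function names are invented for the example.

```python
from collections import Counter


def tally_samples(stacks):
    """Return (inclusive, exclusive) counters from sampled call stacks.

    Each stack is a list of function names, outermost caller first.
    """
    inclusive, exclusive = Counter(), Counter()
    for stack in stacks:
        if not stack:
            continue
        # The innermost frame was executing its own instructions.
        exclusive[stack[-1]] += 1
        # Every distinct function on the stack gets one inclusive sample;
        # set() avoids double counting recursive calls.
        for function in set(stack):
            inclusive[function] += 1
    return inclusive, exclusive


def percent(count, total):
    """Share of `total` samples, as a percentage."""
    return 100.0 * count / total if total else 0.0


stacks = [
    ["Main", "Load"],
    ["Main", "Load", "Parse"],
    ["Main", "Load", "Parse"],
    ["Main", "Render"],
]
inclusive, exclusive = tally_samples(stacks)
total = len(stacks)
load_inclusive_pct = percent(inclusive["Load"], total)
load_exclusive_pct = percent(exclusive["Load"], total)
```

In this made-up run, `Load` appears on three of the four stacks but is the innermost frame on only one, so its inclusive share is much larger than its exclusive share. That gap usually points at the child functions rather than at `Load` itself.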
INSTRUMENTATION PROFILING
In this method, the VS Profiler records timestamps at function entries and exits. If you add CPU performance counters to the session, their values are collected at the same time, and you get a separate column of data for each counter.
In an instrumentation profiling session you can add as many CPU counters as you wish. However, the number of counters you can use simultaneously is limited by the number of registers that your CPU implements for that purpose. Sometimes there are as few as two registers, and the MS Profiler always tries to acquire one of them for internal use.
To add performance counters to your Performance Session, open the session Properties and select the CPU Counters tab. There you can enable and select the counters you want to add to your instrumentation data collection, to take a closer look at the assemblies that contain the hot spots.
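The idea of recording timestamps at function entry and exit can be illustrated with a small Python decorator. This is only an analogy under stated assumptions: the `instrument` decorator, the `events` list, and the sample functions are hypothetical, and the VS Profiler instruments compiled binaries rather than wrapping functions like this.

```python
import functools
import time

events = []  # (kind, function_name, timestamp) tuples, in call order


def instrument(clock=time.perf_counter):
    """Decorator factory that logs a timestamp at function entry and exit."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            events.append(("enter", func.__name__, clock()))
            try:
                return func(*args, **kwargs)
            finally:
                # Recorded even if the function raises.
                events.append(("exit", func.__name__, clock()))
        return wrapper
    return decorator


@instrument()
def parse(text):
    return text.split(",")


@instrument()
def load(text):
    return [field.strip() for field in parse(text)]


fields = load("a, b, c")
kinds = [(kind, name) for kind, name, _ in events]
```

Every call produces a matched enter/exit pair, so the log grows with the number of calls. That is why instrumentation gives exact timings but costs more than sampling.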
Understanding Instrumentation Data Values
- Elapsed Inclusive values
  - The total time that was spent executing a function and its child functions.
  - Elapsed Inclusive values include the intervals that were spent directly executing the function code and the intervals that were spent executing the child functions of the target function. Intervals of the function or its child functions that include waiting for the operating system are also included in Elapsed Inclusive values.
- Elapsed Exclusive values
  - The time that was spent executing a function, excluding time that was spent in child functions.
  - Elapsed Exclusive values include the intervals that were spent directly executing the function code, regardless of whether an operating system event occurred in the interval. All intervals spent in child functions that were called by the target function are not included in Elapsed Exclusive values.
- Application Inclusive values
  - The time that was spent executing a function and its child functions, excluding time that was spent in operating system events.
  - Application Inclusive values do not include intervals that contain operating system events. Application Inclusive values include all other intervals that were spent executing a function, regardless of whether the interval was spent directly executing the function code or was spent in child functions of the target function.
- Application Exclusive values
  - The time that was spent executing a function, excluding the time that was spent in child functions and the time that was spent in operating system events.
  - Application Exclusive values do not include intervals that contain operating system events or intervals that were spent executing functions that were called by the function. Application Exclusive values include only those intervals that were spent directly executing the function code and that did not contain an operating system event.
- Elapsed Inclusive percent
  - The percentage of the total Elapsed Inclusive values of the profiling session that were Elapsed Inclusive values of the function, module, thread, or process.
  - 100 * Function Elapsed Inclusive / Session Elapsed Inclusive
- Elapsed Exclusive percent
  - The percentage of the total Elapsed Inclusive values of the profiling session that were Elapsed Exclusive values of the function, module, thread, or process.
  - 100 * Function Elapsed Exclusive / Session Elapsed Inclusive
- Application Inclusive percent
  - The percentage of the total Application Inclusive values of the profiling session that were Application Inclusive values of the function, module, thread, or process.
  - 100 * Function Application Inclusive / Session Application Inclusive
- Application Exclusive percent
  - The percentage of the total Application Inclusive values of the profiling session that were Application Exclusive intervals of the function, module, thread, or process.
  - 100 * Function Application Exclusive / Session Application Inclusive
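As a worked example of the elapsed values and the percent formulas, the following Python sketch derives them from a hypothetical log of enter/exit events. It assumes the events are properly nested, that no function is recursive, and that the session's Elapsed Inclusive value is the inclusive time of the root function. The Application values would follow the same shape with operating system intervals subtracted, which is not modelled here.

```python
from collections import defaultdict


def elapsed_values(events):
    """Compute elapsed inclusive and exclusive time per function.

    `events` is a list of (kind, function, timestamp) tuples where kind is
    "enter" or "exit", properly nested and in time order.
    """
    inclusive = defaultdict(float)
    exclusive = defaultdict(float)
    stack = []  # entries are [function, enter_time, time_spent_in_children]
    for kind, function, timestamp in events:
        if kind == "enter":
            stack.append([function, timestamp, 0.0])
            continue
        name, entered, child_time = stack.pop()
        if name != function:
            raise ValueError("enter/exit events must be nested")
        duration = timestamp - entered
        inclusive[name] += duration
        exclusive[name] += duration - child_time
        if stack:
            stack[-1][2] += duration  # charge this call to its parent
    return dict(inclusive), dict(exclusive)


def percent_of_session(value, session_inclusive):
    """100 * function value / session inclusive value."""
    return 100.0 * value / session_inclusive


# Hypothetical timestamps in milliseconds.
events = [
    ("enter", "Main", 0.0),
    ("enter", "Load", 10.0),
    ("enter", "Parse", 20.0),
    ("exit", "Parse", 60.0),
    ("exit", "Load", 70.0),
    ("exit", "Main", 100.0),
]
inclusive, exclusive = elapsed_values(events)
session = inclusive["Main"]
load_inclusive_pct = percent_of_session(inclusive["Load"], session)
load_exclusive_pct = percent_of_session(exclusive["Load"], session)
```

Both percentages divide by the session's inclusive total, matching the formulas above, so the exclusive percentages of all functions in this example add up to 100.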
.NET MEMORY ALLOCATION
Collects the .NET memory allocation and object lifetime data, which helps you to detect memory-related performance issues in your application. Data about .NET memory allocation includes the size and number of .NET Framework memory objects that were allocated.
Object lifetime data includes the size and number of .NET Framework memory objects that were reclaimed in the three garbage collection generations.
Understanding Memory Allocation and Object Lifetime Data Values
- Allocation data
  - When a .NET memory event occurs, the total counts and sizes of the allocated or destroyed memory objects are incremented.
  - When a .NET memory allocation event occurs, the profiler increments the sample counts for each function on the call stack. When the data is collected, only one function on the call stack is currently executing the code in its function body. The other functions on the stack are parents in the hierarchy of function calls that are waiting for the functions that they called to return.
  - For the allocation event, the profiler increments the exclusive sample count of the function that is currently executing its instructions. Because an exclusive sample is also part of the total (inclusive) samples of the function, the inclusive sample count of the currently active function is also incremented.
  - The profiler increments the inclusive sample count of all other functions on the call stack.
- Lifetime data
  - The garbage collector of the .NET Framework manages the allocation and release of memory for your application. To optimize the performance of the garbage collector, the managed heap is divided into three generations: 0, 1, and 2. The runtime's garbage collector stores new objects in generation 0. Objects that survive collections are promoted and stored in generations 1 and 2.
  - The garbage collector reclaims memory by de-allocating a whole generation of objects. For objects that the profiled application created, the Object Lifetime view displays the number and size of the objects and the generation in which they are reclaimed.
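The kind of summary an object lifetime view presents can be approximated with a toy model in Python. It assumes, purely for illustration, that an object moves up one generation each time it survives a collection of its own generation and that generation 2 is the oldest. The type names and sizes are hypothetical, and the real garbage collector's scheduling is far more involved.

```python
from collections import defaultdict


def lifetime_summary(objects):
    """Tally count and bytes of objects by the generation they die in.

    `objects` is a list of (type_name, size_bytes, collections_survived).
    In this toy model an object is promoted one generation each time it
    survives a collection, and generation 2 is the oldest.
    """
    summary = defaultdict(lambda: {"count": 0, "bytes": 0})
    for type_name, size_bytes, collections_survived in objects:
        generation = min(collections_survived, 2)
        entry = summary[(type_name, generation)]
        entry["count"] += 1
        entry["bytes"] += size_bytes
    return dict(summary)


# Hypothetical objects created by a profiled application.
objects = [
    ("String", 40, 0),
    ("String", 60, 0),
    ("Order", 200, 1),
    ("Cache", 4096, 5),
]
summary = lifetime_summary(objects)
```

Short-lived objects that die in generation 0 are cheap to reclaim, while types that regularly show up in generation 2 are the ones worth a closer look.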
CONCURRENCY PROFILING
Collects numeric resource contention data and process and thread execution data, which are useful in analyzing multi-threaded and multi-process applications. It lets you profile multi-threaded applications to see how the threads behave inside the application. This feature works for both native and managed applications. There are two sub-modes for concurrency profiling, each of which adds its own reports to the profiler results:
- Collect resource contention data: collects numeric data for contention events.
- Visualize the behavior of multithreaded applications: collects thread and process execution data.
Understanding Resource Contention Data Values
Resource contention profiling collects detailed call stack information each time competing threads (in an application) are forced to wait for access to a shared resource. Resource contention reports display the total number of contentions and the total time that was spent waiting for a resource for the modules, functions, source code lines, and instructions in which the waiting occurred.
- Inclusive values display the total number of resource contentions that forced a function to wait and the total time that the function waited. Contentions that were caused by child functions that were called by the function are included in inclusive values.
- Exclusive values display only the number of contentions that forced a function to wait and that were caused by code in the body of the function. Contentions caused by child functions are not included. The exclusive time for the function also includes only the wait times that were caused by statements in the function body.
Resource contention report views also include timeline graphs that show the individual contention events over time and show the call stacks that created the particular event.
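What counts as a contention event can be shown with a small Python sketch. The `ContentionCountingLock` class and `run_demo` function below are hypothetical helpers written for this illustration: a contention is recorded whenever a thread finds the lock already held, and the time spent waiting is accumulated. The profiler gathers this kind of data without any changes to your code.

```python
import threading
import time


class ContentionCountingLock:
    """A lock wrapper that counts contentions and accumulates wait time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._stats_lock = threading.Lock()
        self.contentions = 0
        self.wait_seconds = 0.0

    def __enter__(self):
        # A failed non-blocking attempt means another thread holds the lock.
        if not self._lock.acquire(blocking=False):
            with self._stats_lock:
                self.contentions += 1
            start = time.perf_counter()
            self._lock.acquire()
            waited = time.perf_counter() - start
            with self._stats_lock:
                self.wait_seconds += waited
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._lock.release()
        return False


def run_demo():
    """Force exactly one contention between the main thread and a worker."""
    lock = ContentionCountingLock()

    def worker():
        with lock:
            pass

    with lock:
        thread = threading.Thread(target=worker)
        thread.start()
        # Hold the lock until the worker has been forced to wait.
        while lock.contentions == 0:
            time.sleep(0.001)
    thread.join()
    return lock.contentions, lock.wait_seconds
```

Calling `run_demo()` returns the contention count and the total wait time for the single forced contention. Attributing those numbers to the call stack of the waiting thread would give the inclusive and exclusive values described above.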
Why Profiling?
Profiling gathers information about an executing application, allowing you to find the performance bottlenecks in it. Revisiting and redesigning those bottlenecks can bring a massive benefit to the performance and scalability of the application. The following reasons support profiling.
Why should you profile the application
- Focus on portions of code that really require attention
Profiling allows you to focus on critical sections. In a very large application it is difficult to identify the areas that require improvement. Profiling your application will pinpoint the code sections that really require improvement or tuning.
- Identify code blocks with performance issues
Performance is very hard to measure and trace. Identifying the code blocks or methods that have performance issues by hand is tedious. Profiling the code helps you to identify those lines, blocks, or methods.
- Compare alternative approaches
During development, you might come across alternative ways of achieving a task. For any given task you might have two different implementations that you wish to compare, in order to find out which is better in terms of performance, scalability, and resource usage. By comparing these implementations with a profiler, you can select the most efficient one.
- Get accurate code execution response times
By profiling your .NET code you can get accurate execution times for a line of code, a block of code, or a method.
- Visualize performance and memory usage
A visual depiction of execution times and memory usage helps you to make informed decisions very quickly. With profiler reports and graphs of execution times or memory usage, it is much easier and quicker to understand issues and fix them.
- Track the lifecycle of your .NET objects
Tracking the lifecycle of objects allows you to make optimizations in your code. For example, you might be creating a resource-intensive object too early in your application. Profiling will uncover these issues.
- Avoid unnecessary loading or initialization of your program
During development you might have had some tests that are loaded and initialized. Before deploying the application you will want to ensure that any unnecessary loading or initialization is removed.
- Optimize your looping constructs
Looping constructs are a common source of performance issues. Profiling your code allows you to understand and eliminate unnecessary loops within your looping constructs. Improving your looping constructs will in turn improve the overall performance of your application.
- Identify memory leaks in your application
Memory leaks in your application can be very difficult to identify. Profiling your code allows you to identify any unnecessary memory usage and therefore optimize the memory usage of your .NET application.
- Compare profiler reports
Comparing profiler reports helps you to find out the exact performance improvement in the application, as well as the most efficient method to use.
Glossary
- VS2010 – Visual Studio 2010
- BFS – Banking and Finance services
- VS IDE – Visual Studio Integrated Development Environment
References
- http://www.simple-talk.com/dotnet/performance/the-why-and-how-of-.net-profiling/
- http://msdn.microsoft.com/en-us/library/dd264872.aspx
BeleTPL.com