Does your organisation have a true performance management methodology? For the majority of organisations the answer is simply “no” – performance management amounts to a variety of disparate, ad hoc and predominantly reactive processes. Examples of such approaches may include:
Server utilisation monitoring – perfmon statistics – CPU/memory/disk utilisation etc.
User-feedback – “It seems slow”
Transaction response time monitoring (“stopwatch testing”)
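The first of these approaches reduces to sampling raw counters and deriving a utilisation percentage from the deltas between two samples, which is essentially what perfmon-style tools do. A minimal sketch (the sample structure and counter names are illustrative, not any specific tool's format):

```python
def cpu_utilisation(prev, curr):
    """Percentage of non-idle time between two cumulative counter samples.

    Each sample is a dict of cumulative tick counts, e.g.
    {"busy": 1200, "idle": 800} -- an illustrative simplification of
    /proc/stat-style CPU accounting.
    """
    busy = curr["busy"] - prev["busy"]
    idle = curr["idle"] - prev["idle"]
    total = busy + idle
    # Guard against two identical samples (no ticks elapsed)
    return 100.0 * busy / total if total else 0.0
```

Note that a single percentage like this says nothing about *why* utilisation changed, which is exactly the limitation discussed below.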
Common limitations of these legacy approaches include:
Reactive – bottlenecks are typically only identified after they cause performance problems
Don’t take into account shared systems – virtualisation/SAN/network
Obtaining more useful data requires significantly greater operational investment
Tools tend to focus on individual infrastructure layers, making it difficult to build processes that are useful across the entire enterprise infrastructure
Baseline performance benchmarking only useful for before/after analysis – cannot be used for accurate “what if” scenario planning
The largest challenge faced by infrastructure administrators in responding to performance problems is a lack of data. When users complain that “it’s slow” administrators lack the critical information needed to effectively respond – how did the application perform before the issue arose; what utilisation metrics correlated to the previous, acceptable, performance level; to what degree is performance now degraded; what has changed between then and now?
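Answering even the simplest of those questions, "to what degree is performance now degraded?", requires a recorded baseline to compare against. A minimal sketch of such a comparison (the median-to-median ratio is an illustrative choice, used here because it is robust to outliers, not a standard metric):

```python
from statistics import median

def degradation(baseline_samples, current_samples):
    """How many times slower current response times are versus a
    recorded baseline, comparing medians to damp outliers.

    Both arguments are lists of response times in the same unit
    (e.g. milliseconds).
    """
    return median(current_samples) / median(baseline_samples)
```

Without stored baseline samples, no calculation of this kind is possible, which is why administrators are left guessing.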
As a result, administrators tend to rely on intuitive approaches to performance troubleshooting – looking for errant performance metrics, reviewing code release schedules and recent infrastructure changes – and frequently fall back on crude techniques such as increasing available computational resources in the hope of resolving performance bottlenecks. Such approaches are inefficient, not cost effective, and do not scale to large, complex environments.
A new wave of tools is emerging that seeks to resolve these problems. These tools are generally “cross domain”, referring to their ability to collect and analyse data from multiple infrastructure layers. Typically, they include the ability to determine whether performance variations are due to increased load, code changes, infrastructure changes, or the performance of shared system components (i.e., where an application’s performance is degraded by increased load on a shared component such as a virtualisation farm).
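One simple building block behind cross-domain attribution is correlating an application's response time against metrics gathered from each infrastructure layer: a strong correlation with, say, virtualisation-farm load implicates the shared component rather than the application itself. A sketch using Pearson correlation, implemented by hand to stay self-contained (real tools use far more sophisticated statistical models):

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length
    series, e.g. app response times vs shared-host CPU load."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5
```

A coefficient near +1 or -1 suggests the two series move together; near 0 suggests the layer in question is not driving the variation.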
An additional benefit of a data-driven performance management approach is the ability to “right-size” infrastructure – particularly in virtualised environments, resources are often over-allocated to individual servers and are therefore wasted. Once administrators fully understand the true performance and resource requirements of applications, these wasted resources can be reclaimed and reallocated.
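As a sketch of the right-sizing idea, over-allocation can be estimated from observed peak usage plus a safety headroom (the 25% headroom here is an illustrative policy, not a recommendation):

```python
def reclaimable(allocated_gb, usage_samples_gb, headroom=1.25):
    """Estimate how much allocated memory could be reclaimed from a
    server, assuming peak observed usage plus 25% headroom is a safe
    floor (illustrative sizing policy).

    usage_samples_gb is a list of observed memory usage readings.
    """
    needed = max(usage_samples_gb) * headroom
    return max(0.0, allocated_gb - needed)
```

Summed across a virtualisation farm, estimates like this quantify the capacity that over-allocation is wasting.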
In addition, they are capable of complex scenario modelling – this allows administrators to forecast the impacts associated with, for example, a successful web marketing campaign, the hiring of 100 new office staff, or the opening of a new branch office. As a result, administrators can proactively identify future performance bottlenecks, and IT infrastructure spending can be targeted to where it will deliver the most benefit. Further, by understanding application utilisation trends and knowing where bottlenecks reside in their infrastructure, administrators are able to resolve performance issues before their users even notice them.
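The simplest form of such scenario modelling scales current utilisation linearly with projected load; real tools account for queueing and other non-linear effects, so treat this purely as an illustration of the principle:

```python
def forecast_cpu(current_cpu_pct, current_users, projected_users):
    """Naively project CPU utilisation for a changed user count,
    assuming utilisation scales linearly with load (a simplifying
    assumption -- real systems degrade non-linearly near saturation)."""
    return current_cpu_pct * projected_users / current_users

def will_bottleneck(current_cpu_pct, current_users, projected_users,
                    threshold_pct=80.0):
    """Flag whether the projected load would exceed a capacity
    threshold (80% is an illustrative planning limit)."""
    return forecast_cpu(current_cpu_pct, current_users,
                        projected_users) > threshold_pct
```

For example, a "hire 100 new staff" scenario becomes a concrete projected user count, and the forecast indicates whether current capacity will hold.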