Tuning Strategies for Impatient Users
In this digital world, everyone expects an immediate response from the websites and mobile apps they use. We all want instant gratification. Even those of us who manage the IT systems that are supposed to provide the best user experience have these expectations, knowing full well the challenges faced in meeting them.
The pressure to hit similar performance targets is no lower for our back-office and business applications. Patience is limited. So, what strategies do others use to meet the demands of their impatient users?
First, let’s define what performance and tuning mean in the context of this discussion.
Performance is how applications, compute services, and hardware behave under specific workloads. It is measured by response times, throughput, availability, and workload densities. The more productive the computing infrastructure is, as measured by these metrics, the higher its rated performance.
Tuning refers to the methods used to deliver various levels of computing infrastructure performance. These methods include database and application code optimization, load balancing, data storage and caching techniques, and distributed computing. The objective is, in general, to improve the efficiency of the software and hardware infrastructure, that is, its cost-to-benefit ratio. Specific performance tuning goals include:
- Achieving faster application response times (reduced latency), higher throughput and increased workload densities without increasing infrastructure costs.
- Consolidating software and hardware infrastructure to reduce costs without degrading services.
- Processing larger and/or more demanding workloads without acquiring additional infrastructure.
Benefits come in the form of:
- Greater user satisfaction because of faster response times.
- The ability to add more workloads and users to existing IT resources because they are used more efficiently.
- Increased capacity to respond to business demands because less time is spent on troubleshooting.
Based on the above benefits, tuning software and hardware infrastructure can have positive impacts on customers, prospects, and top-line revenues, as well as contribute to cost savings. Accordingly, it’s safe to say infrastructure tuning is important to both the strategic and tactical goals of the business.
Tuning is an iterative process that takes good planning and a fair amount of time to do well. Sadly, as we all know, many users have little tolerance for waiting. Therefore, as others have found, it can be helpful to buy some time in the short run when dealing with urgent performance problems. This approach, however, is meant only to satisfy your impatient users’ demands while you develop and deploy longer-term performance solutions.
Buying time in this context means throwing hardware at the problem, even though this is traditionally the very last option, used only after all other tuning methods have been implemented. For those working in the cloud, buying time with hardware can be very useful and cost-effective. In fact, public clouds, like Microsoft Azure and Amazon Web Services (AWS), are being tapped for just this purpose.
Public clouds provide flexible hardware and software infrastructure that can be provisioned as needed. They eliminate the time and expense of acquiring and deploying on-premises IT resources, which is very useful when dealing with impatient users. For Amazon Elastic Compute Cloud (EC2) users, as an example, cost-effective resources for quickly resolving performance issues can be found by searching the AWS Marketplace for “EC2 accelerators.”
However, this discussion is about both public-cloud and on-premises IT workloads. And when it comes to impatient users, the place to start is with the end in mind: improve the variables that most significantly affect application response times.
Variables that impact response times (latency) reside at all levels of the infrastructure stack, from computing services, application code, and databases to operating systems, networks, CPUs, and various storage tiers. But since each of these levels can involve its own exhaustive set of performance tuning strategies and practices, where should you focus first?
Start by finding out what your impatient users’ most pressing concerns are. Ask them for details about the performance issues they are experiencing. Besides helping them feel involved in the problem-solving process, this helps you peel back the layers of software and hardware performance symptoms to uncover the issues that need attention. Examples of some questions to ask are:
- What applications are being affected by performance issues?
- What are the symptoms of these application performance problems?
  - Slow response times or continuously spinning widgets…
  - Incorrect or inconsistent query results…
  - Tasks not completing or timing out…
- When do these problems occur, e.g. time of day or day of week?
- What other applications, if any, are being used at the same time?
- Where do these problems occur, e.g. geo-locations of users, workloads, data, or systems?
Notice that these questions concentrate on the software infrastructure layer. Why? Because studies have found that more than 75% of computing performance bottlenecks can be traced to the application, software services, and database tiers of the infrastructure stack. In fact, hardware and network infrastructure often hide the real performance issues caused by these culprits. Powerful CPUs, high-capacity fast networks, and well-balanced storage architectures can make up for poor application design and code, and for less-than-optimal database, operating system, and virtual machine configurations. Therefore, investigating these tiers often leads to the worst offenders, which, when tuned, provide the greatest performance benefits.
Measuring the following application and database performance metrics helps isolate problems:
- Latency – the amount of processing time taken to complete an end-to-end operation. In essence, it is the time required to finish a unit of work, which is usually expressed as an average or as a percentile. For example, 98% of operations are finished within 0.4 seconds.
- Throughput – the amount of work a system is doing per some unit of time. Throughput is generally recorded as an absolute number. For a database server, it is often represented by the number of queries processed per second. For a web server, throughput is measured by the number of client requests processed per second.
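As a rough illustration of the two metrics above, the following Python sketch computes a latency percentile and a throughput figure from a handful of timings. The sample values, the 2-second measurement window, and the function names `percentile` and `throughput` are all made up for this example; the nearest-rank percentile method is one common choice among several.

```python
import math
from typing import Sequence


def percentile(latencies: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at
    least pct percent of all samples are less than or equal to it."""
    if not latencies:
        raise ValueError("at least one latency sample is required")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(latencies)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def throughput(completed_operations: int, window_seconds: float) -> float:
    """Operations completed per second over a measurement window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return completed_operations / window_seconds


# Hypothetical response times, in seconds, for ten requests
# observed during a 2-second window.
samples = [0.12, 0.31, 0.08, 0.25, 0.40, 0.19, 0.22, 0.95, 0.15, 0.28]

p50 = percentile(samples, 50)
p90 = percentile(samples, 90)
mean = sum(samples) / len(samples)
rate = throughput(len(samples), 2.0)

print(f"median={p50:.2f}s  p90={p90:.2f}s  mean={mean:.3f}s  rate={rate:.1f}/s")
```

With these made-up numbers, the single 0.95-second request pulls the mean well above the median, which is one reason latency targets are often stated as percentiles rather than averages.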
It would be ideal to minimize latency while maximizing throughput, but tuning for one involves tradeoffs that affect the other. Consequently, balancing both takes time and is best suited to ongoing, long-term performance practices.
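One way to see this tradeoff is with request batching, used here purely as an illustration. The sketch below assumes a simplified cost model that is not from any particular system: every batch pays a fixed overhead (`OVERHEAD_S`) plus a per-item cost (`PER_ITEM_S`), and an item is only finished when its whole batch is finished. Both constants and both function names are hypothetical.

```python
def batch_seconds(batch_size: int, overhead_s: float, per_item_s: float) -> float:
    """Time to process one batch under the assumed cost model.
    Every item in the batch waits this long, so it is also the
    per-item latency (queueing time is ignored)."""
    return overhead_s + batch_size * per_item_s


def batch_throughput(batch_size: int, overhead_s: float, per_item_s: float) -> float:
    """Items completed per second when batches run back to back."""
    return batch_size / batch_seconds(batch_size, overhead_s, per_item_s)


# Assumed costs: 10 ms fixed overhead per batch, 1 ms per item.
OVERHEAD_S = 0.010
PER_ITEM_S = 0.001

for size in (1, 10, 100):
    latency = batch_seconds(size, OVERHEAD_S, PER_ITEM_S)
    rate = batch_throughput(size, OVERHEAD_S, PER_ITEM_S)
    print(f"batch={size:>3}  latency={latency * 1000:6.1f} ms  throughput={rate:7.1f} items/s")
```

Under these assumptions, growing the batch from 1 to 100 items raises throughput from roughly 91 to roughly 909 items per second, while latency grows from 11 ms to 110 ms. Real systems behave less neatly, but the direction of the tradeoff is the same.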
The most often used short- and long-term performance tuning strategy for dealing with impatient users is to leverage the offending application’s and database’s caching mechanisms as much as possible. There is no faster or more cost-effective way of improving response times, assuming your physical infrastructure has the CPU, memory, and storage space their workloads require.
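To make the caching idea concrete, here is a minimal application-level sketch using `functools.lru_cache` from Python's standard library. It stands in for whatever caching mechanism your application or database actually offers. The `slow_backend_lookup` function, its return shape, and the customer IDs are hypothetical, and a counter replaces a real database round trip so the effect is easy to see.

```python
from functools import lru_cache

# Counts how many times the (simulated) slow backend is actually hit.
calls_to_backend = 0


def slow_backend_lookup(customer_id: int) -> dict:
    """Hypothetical stand-in for an expensive database query."""
    global calls_to_backend
    calls_to_backend += 1
    return {"id": customer_id, "name": f"customer-{customer_id}"}


@lru_cache(maxsize=1024)
def get_customer_name(customer_id: int) -> str:
    """Repeated requests for the same ID are served from memory."""
    return slow_backend_lookup(customer_id)["name"]


# Six requests, but only three distinct customers.
for cid in [1, 2, 1, 1, 2, 3]:
    get_customer_name(cid)

info = get_customer_name.cache_info()
print(f"backend calls={calls_to_backend}  hits={info.hits}  misses={info.misses}")
```

Only the first request for each customer reaches the backend; the rest are answered from the cache. The cost is memory, plus the risk of serving stale data if the underlying record changes, so a real deployment needs an expiry or invalidation policy (for this decorator, `get_customer_name.cache_clear()` empties the cache).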
In-memory computing also provides a means to quickly deliver significantly better application performance.
Of course, regular monitoring and routine tuning are the best long-term means of ensuring your systems deliver their best performance. But every now and then, urgent application performance issues require an approach that buys some time in response to impatient users.