CMG’12 Call for Papers and Workshops – The Best Independent Performance and Capacity Conference

May 18th, 2012 No comments

The Computer Measurement Group (CMG) calls for papers and presentations for CMG’s 38th International Conference to be held in Las Vegas, Nevada, December 3rd through 7th, 2012.

The 2012 CMG conference will cover all areas of systems management, including but not limited to: capacity planning, IT service management, application performance management, performance engineering and testing, as well as the latest developments in the overall field of computer performance evaluation. See the Call for Papers and Call for Workshops for details.

CMG has been the source of unbiased and objective expert information and practical, real-life experiences across all computing platforms in the computer industry for over 30 years. Share your knowledge and experiences: write a paper and submit it for presentation at CMG’12.

Papers are categorized as Introductory, Tutorial, Advanced, or User Experience. I especially want to encourage all of you to consider writing a User Experience paper. Every year, the conference evaluations show a common theme: “More User Experience Papers, please!” You don’t need to be one of the field’s superstars to write one – in fact, they seem to work better from people who are just working in the field, in non-IT companies and government bodies. Just tell us what problem you faced, how you went about figuring out what the cause was, and how you dealt with it. Mentors are available for writing assistance, and may be requested at any point in the writing process, including before the paper is started. Just write to mentor@cmg.org and ask.

Please take the time to participate in the CMG’12 program. It will be rewarding for both authors and attendees, and as we all share our knowledge we all become more complete professionals.
Paper submission through the CMG website is now available. For more information go to paper submission and workshop submission.

The deadline for paper submissions is June 8, 2012.

Please send questions to CMG’12 Program Chair, Bill Jouris at cmgpc@cmg.org.

Load Testing: What Tool to Choose?

May 10th, 2012 3 comments

Classifying and evaluating load testing tools is not easy, as they include different sets of functionality that often cross the borders of whatever criteria are used. In most cases, any classification is either an oversimplification (which in some cases may still be useful) or a marketing trick to highlight the advantages of specific tools. There are many criteria that differentiate load testing tools, and it is probably better to evaluate tools on each criterion separately.

First, there are three main approaches to workload generation and every tool may be evaluated on which of them it supports and how exactly.

Protocol-level recording and the list of supported protocols. Does the tool support protocol-level recording and, if it does, which protocols does it support? With the quick growth of the Internet and the popularity of browser-based clients, most products support only HTTP or a few Web-related protocols. To my knowledge, only HP LoadRunner and Micro Focus SilkPerformer try to keep up with support of all popular protocols. So if you need to record a special protocol, you will probably end up looking at these two tools (unless you find a niche tool supporting your specific protocol). That somewhat explains the popularity of LoadRunner at large corporations, which probably use almost every possible protocol. The level of support for specific protocols differs significantly too. Some HTTP-based protocols are extremely difficult to correlate if there is no built-in support, so you may look for that kind of specific support. For example, Oracle Application Testing Suite may have better support of Oracle technologies.

UI-level recording. The option has been available for a long time, but it is much more viable now. For example, it was possible to use Mercury/HP WinRunner or QuickTest Professional (QTP) scripts in load tests, but you needed a separate machine for each virtual user (or at least a separate terminal session). That drastically limited the level of load you could achieve. Other known options were, for example, the Citrix and RDP (Remote Desktop Protocol) protocols in LoadRunner – which were always the last resort when nothing else worked, but were notoriously tricky to play back. New UI-level tools for browsers, such as Selenium, extended the possibilities of the UI-level approach by allowing multiple browsers to run per machine (so scalability is limited by the resources available to run browsers). Moreover, we got UI-less browsers, such as HtmlUnit, which require significantly fewer resources than real browsers. There are multiple tools supporting this approach now – such as PushToTest, which directly harnesses Selenium and HtmlUnit for load testing, or the LoadRunner TruClient protocol and SOASTA CloudTest, which use more proprietary solutions to achieve low-overhead playback. Still, questions of supported technologies, scalability, and timing accuracy remain largely undocumented, so the approach requires evaluation in every specific non-trivial case.

Programming. There are cases when you can’t use recording at all (or can, but with more difficulty). In such cases, making API calls from the script may be an option. Other variations of this approach are web services scripting and the use of unit testing scripts for load testing. And, of course, you may need to add some logic to your recorded script. You program the script in whatever way you have and use the tool to execute the scripts, coordinate their execution, and report and analyze results. To do this, the tool should be able to add code to (or invoke code from) your script. And, of course, if the tool’s language is different from the language of your API, you would need to figure out a way to plumb them together. Tools using standard languages such as C (e.g. LoadRunner) or Java (e.g. Oracle Application Testing Suite) may have an advantage here. However, you need to know all the details of the communication between client and server, which is often very challenging.
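
To make the programming approach more concrete, here is a minimal sketch in Python of what such a script does: call an API, time each call, run several virtual users concurrently, and summarize the results. It is an illustration only, not how any of the tools mentioned above work internally. The function call_api is a hypothetical placeholder (here it just sleeps), and run_load and summarize are names made up for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def call_api():
    """Hypothetical stand-in for the real client API call under test."""
    time.sleep(0.01)  # pretend the server takes about 10 ms; replace with a real call


def virtual_user(iterations):
    """Run the transaction repeatedly and return the observed response times."""
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        call_api()
        timings.append(time.perf_counter() - start)
    return timings


def run_load(users, iterations):
    """Run several virtual users concurrently and collect all their timings."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        futures = [pool.submit(virtual_user, iterations) for _ in range(users)]
        return [t for f in futures for t in f.result()]


def summarize(timings):
    """Report the count, mean, and maximum response time in seconds."""
    return {
        "count": len(timings),
        "mean": sum(timings) / len(timings),
        "max": max(timings),
    }


if __name__ == "__main__":
    print(summarize(run_load(users=5, iterations=3)))
```

A real tool adds everything around this core: pacing and think times, distributed load generators, monitoring, and result analysis. That is why the ability to invoke your own code from the tool matters more than the ability to write a loop like this one.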

Other important criteria are related to the environment:

Deployment Model. There has been a lot of discussion about different deployment models: lab vs. cloud vs. service. Each model has its advantages and disadvantages. Depending on your goals and the systems you test, you may prefer one deployment model over another. But I still believe that for comprehensive performance testing you really need both lab testing (with reproducible results for performance optimization) and realistic outside testing from around the globe (to check real-life issues that you can’t simulate in the lab). Doing both is expensive and makes sense only when you really care about performance and have a global system – but that is not rare, and if you are not there yet, you may get there eventually. If there is such a chance, it would be better to have a tool that supports different deployment models.

If it is lab or cloud, an important sub-question is what kind of software / hardware / cloud the tool requires. Many tools use low-level system functionality, so there may be unpleasant surprises when the platform of your choice or your corporate browser standard is not supported.

Scaling. When you have only a few users to simulate, scaling is usually not a problem. The more users you need to simulate, the more important it becomes. Tools differ drastically in how many resources they need per simulated user and how well they handle large volumes of information. This may differ significantly even for a specific tool, depending on the protocol used and the specifics of your script. As soon as you get to thousands of users, it may become a major problem. For a very large number of users some automation, like the automatic creation of a specified number of load generators across several clouds in SOASTA CloudTest, may be very handy.
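
A back-of-the-envelope calculation shows why per-user resources matter so much. The sketch below estimates the number of load generators from memory alone. The per-user footprints (2 MB for a protocol-level user, 200 MB for a browser-based one), the 8 GB generator size, and the 80% headroom are all made-up numbers for illustration, not measurements of any tool. In practice CPU and network may become the limit first.

```python
import math


def generators_needed(users, mb_per_user, generator_mb, headroom=0.8):
    """Estimate how many load generator machines a test needs (memory only).

    headroom is the fraction of each generator's memory we are willing to use.
    """
    if users <= 0:
        return 0
    users_per_generator = int((generator_mb * headroom) // mb_per_user)
    if users_per_generator < 1:
        raise ValueError("one virtual user does not fit on a generator")
    return math.ceil(users / users_per_generator)


if __name__ == "__main__":
    # Hypothetical protocol-level script: 2 MB per virtual user.
    print(generators_needed(5000, 2, 8192))    # -> 2
    # Hypothetical browser-based script: 200 MB per virtual user.
    print(generators_needed(5000, 200, 8192))  # -> 157
```

With these assumed numbers the same 5,000-user test needs two machines in one case and more than a hundred and fifty in the other, which is where automatic provisioning of load generators starts to pay off.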

Two other important sets of functionality are monitoring of the environment and result analysis. While it is theoretically possible to do both using other tools, that significantly degrades productivity and may require building some plumbing infrastructure. So while these two areas may look optional, integrated and powerful monitoring and result analysis are very important. And the more complex the system and tests, the more important they are.

Of course, non-technical criteria are important too:

Cost. There are commercial tools (and license costs differ drastically) and free tools. And there are some choices in between: for example, SOASTA has the CloudTest Lite edition, free up to 100 users. There are many free tools (some, such as JMeter, are mature and well known) and many inexpensive tools, but most of them are very limited in functionality.

Skills. Considering the large number of tools and the relatively small number of people working in the area, there is a labor market of sorts only for the most popular tools. Even for the second-tier tools there are few people around and few positions available. So if you don’t choose one of the market leaders, you can’t count on finding people with experience in the tool. Of course, an experienced performance engineer will learn any tool – but it may take some time until productivity gets to the expected level.

Support. Recording and load generation have a lot of sophistication in the background, and issues may happen in every area. Availability of good support may significantly improve productivity.

This is, of course, not a comprehensive list of criteria – rather a few starting points. Unfortunately, in most cases you can’t just rank tools on a better-to-worse scale. It may be that a simple tool will work quite well in your case. If your business is built around a single web site, it doesn’t use sophisticated technologies, and the load is not extremely high – almost every tool will work for you. The further you are from this state, the more challenging it will be to pick the right tool. And it may even be that you need several tools.

And while you may evaluate tools with the above-mentioned criteria, it is not guaranteed that a specific tool will work with your specific product (unless it uses a well-known and straightforward technology). That actually means that if you have a few systems to test, you need to evaluate the tools you are considering against your systems and see if the tools can handle them. If you have many, choosing a tool supporting multiple load generation options is probably a good idea (and, of course, check it with at least the most important systems).

Load Testing: Its Present and Future

April 26th, 2012 No comments

Recent trends of agile development, DevOps, Web and Social Media sites somewhat question the importance of load testing. Some (not many) openly say that they don’t need load testing; some still pay lip service to it – but just never get to it. In the more traditional corporate world we still see performance testing groups, and important systems usually get load tested before deployment.

Let’s first define load testing, since the terminology is rather vague here. I use it here for anything that requires applying multi-user synthetic load – in contrast with single-user performance (which is a subset of performance engineering and may include, for example, profiling or Web Performance Optimization as it is defined now). And I use it here as an umbrella term including all other variations of multi-user testing, such as performance, concurrency, stress, endurance, longevity, scalability, etc. – but you may replace it with any other term if you prefer.

Yes, it looks like some Web and Social Media sites managed to survive without load testing. However, it looks like many such companies match the following profile:
-Business is built around a single Web site, so everybody in the company follows what is going on in production.
-Overall architecture is still clear and relatively simple. Changes (however frequent) are rather minor and evolutionary.
-There is decent instrumentation providing performance information.
-Changes can be rolled back relatively easily.
-Site downtime or a period of slow performance (until the problem is noticed and fixed) is not extremely painful or dangerous to the business.

Load testing is a way to mitigate load- and performance-related risks. There are other approaches and techniques that also alleviate some performance risks:
-Good single-user performance engineering practices (the performance of single-user requests is constantly tracked).
-Good instrumentation/Application Performance Management providing insight into what is going on inside the system.
-[Auto] scalable architecture.
-Continuous integration allowing changes to be deployed and removed quickly.

Still, none of these completely replaces load testing; rather, they complement it. They definitely decrease performance risk compared with the situation when nothing is done about performance until the last moment before rolling the system out into production without any instrumentation, but they still leave the risks of crashing and performance degradation under multi-user load. And if the cost of such failures is high, you should do load testing (what exactly and how is another large topic – there is much more here than the stereotypical waterfall-like last-moment record-and-replay approach).

There is always a risk of crashing or performance issues under heavy load – and the only way to mitigate it is to actually test it. Even stellar performance in production and a highly scalable architecture don’t guarantee that the system won’t crash under a slightly higher load. Strictly speaking, even load testing doesn’t completely guarantee it (real-life workload may be different from what you have tested), but it drastically decreases the risk.

Another important value of load testing is making sure that changes don’t degrade multi-user performance. Unfortunately, better single-user performance doesn’t guarantee better multi-user performance. In many cases it improves multi-user performance too, but definitely not always. And the more complex the system, the more probable are exotic multi-user performance issues that no one even thought of. And a way to ensure that you don’t have such issues is load testing.

When you do performance optimization, you need a reproducible way to evaluate the impact of changes on multi-user performance. The impact on multi-user performance probably won’t be proportional to what you see with single-user performance (even if it is still somewhat correlated). Without multi-user testing the actual effect is difficult to quantify. The same goes for issues that happen only in specific cases and are difficult to troubleshoot and verify in production – using load testing can significantly simplify the process.

Summarizing, I don’t see the need for load testing going away. Even in the case of Web and Social Media sites we will probably see load testing coming back as soon as systems become more complex and performance issues start to hurt business. Maybe there will be less need for “performance testers” than there was in the heyday, due to better instrumentation, APM tools, continuous integration, etc. – but I’d expect more need for performance experts who are able to see the whole picture using all available tools and techniques (although I don’t see it yet).

Performance Dimension of Information Technology

April 16th, 2012 1 comment

There are no standards for titles and skill sets related to the performance dimension of IT. I decided to put together how I understand them (most terms are vague, so it is quite possible that other people understand them differently). Of course, it is a simplification – but the topic is probably too heavily influenced by the history and politics of every particular organization to be clear-cut anyway.

I still think that we can break the whole area into three major categories: design (and development), testing, and production (maybe somewhat matching the ITIL terms Service Design, Service Transition, and Service Operation). The term Performance Engineering may relate to the whole area (or maybe to the design category – in this case it is sometimes referred to as Software Performance Engineering, SPE).

Performance Design. Talking about the design category (I use the term ‘Performance Design’ to group all performance-related activities during design and development, although it isn’t used this way – probably reflecting that the whole area doesn’t quite exist as a separate discipline), we have specific areas of performance engineering knowledge for each specific technology, such as Java performance, .Net performance, etc. One relatively new, but large and popular area is Web Performance Optimization, covering end-user Web performance. And, of course, we have Software Performance Engineering (SPE) trying to establish generic approaches – although SPE progress hasn’t been too impressive since Dr. Connie Smith published ‘Performance Engineering of Software Systems’ in 1990.

It is definitely supposed to be an important part of the skill set of software architects (on a higher level, SPE, etc.) and software developers (maybe on a lower level: how to efficiently design a specific component using the chosen technology – but a good understanding of high-level performance engineering won’t hurt either).

And while many architects and developers have some understanding of performance, the main stress is often on functionality and deadlines, so performance is left to the very end – where it sometimes may indeed be tuned in (usually when technologies are mature and the team is quite experienced), and sometimes requires major changes (and late changes are very expensive).

It looks like the idea of having an explicit person responsible for performance from the beginning (starting from requirements), working with other architects and developers to build it in, makes sense. The title may be ‘performance architect’ or ‘performance champion’. Such people are rare, though – more often we see a proactive person from performance engineering or performance testing groups trying to ask performance questions early.

Performance Testing. This includes, of course, all other variations and names, such as load, stress, endurance, etc. testing. The matching ITIL term would probably be Service Validation and Testing. It covers all ways to apply synthetic load to the system and analyze the system’s behavior. In the narrow sense, a ‘performance tester’ is responsible for creating and applying such load (test scripting and execution). In a wider sense, it also includes workload characterization (workload modeling), performance analysis and performance troubleshooting – and such a person is often referred to as a ‘performance engineer’. In some cases they are different people: the performance tester is responsible for applying the load and the performance engineer (maybe performance analyst in this case) is responsible for system analysis and optimization.

I definitely put performance testing in a separate category due to the specific set of skills required: workload generation. And, perhaps, techniques to find and fix issues in the system by applying an appropriate workload. But definitely not because “testing should go after development before production” as it used to be in the waterfall approach – testing should start as early as possible, mostly overlapping with development, and may continue in production. Monitoring the system using synthetic workload, for example, I’d rather also put in this testing category – it is actually testing the production system in parallel with the production workload.

Performance Management, perhaps, may be a good name for the collection of performance-related activities and skills in production (and around).

It is interesting that ITIL places the Capacity Management and Service Level Management processes into Service Design. I see a point here – you definitely need to allocate capacity before deploying the system, and Service Levels should probably come directly from the performance requirements. Still, the real people working in these areas are usually part of operations. Capacity Planners are responsible for allocating resources, although fewer and fewer people have such a title and these responsibilities get spread between other groups (which, unfortunately, often don’t have the appropriate skills).

Service Level Management would probably be handled by Performance Monitoring (Analysis). The matching ITIL term would probably be Service Measurement. The title ‘Performance Analyst’ was used often in the past – but is not very popular anymore. The title ‘Performance Engineer’ is probably more popular now. And, of course, it may be specialized, like Database Monitoring, System Monitoring, Application Server Monitoring. These may be done by the respective administrators (DBA, system administrator, etc.).

Application Monitoring – relatively new stuff. It is usually referred to as Application Performance Monitoring. The idea is to measure application-specific metrics (including business-related metrics, end-user metrics, etc.) in addition to the system-level metrics that used to be measured earlier. The importance of application monitoring is definitely growing. On one side, system-level metrics become less relevant in today’s infrastructure with virtualization, multi-tenancy, cloud, etc. On another side, the system becomes so complicated that trying to figure out what is going on using low-level metrics becomes a nightmare. On the third side, full monitoring from the business point of view becomes a business requirement – and it is where IT can provide a unique business advantage.

Probably Application Performance Management (APM) would be the right category encompassing most production-related categories such as Performance Monitoring, Capacity Management, Diagnostics (troubleshooting) and Tuning (and Optimization – although this may somewhat get into the re-design category). We are probably not there yet, and Application Performance Management is rather a vague vision than reality. Gartner, for example, stresses that APM is Application Performance Monitoring, not Management. And I am not sure what the title of the person doing this would be. Management is a favorite word for an area of expertise (as in Performance Management or Capacity Management), but Manager (at least in the US) still means a person who manages other people. So the title, I guess, would be the same ubiquitous ‘performance engineer’.

Performance Troubleshooting or Diagnostics is definitely an important part of Performance Management and is an application of performance engineering to existing performance issues. While it is probably the most typical performance-related activity at many corporations, very few have anything formal around it, and usually all other performance-related groups get involved. And we need performance engineering skills to investigate and fix performance problems in production.

It looks like in the new generation of Web companies monitoring and capacity planning are often included in ‘Site Reliability’, adding, I guess, some confusion to the already existing mess of terms and notions.

P.S. By the way, the only conference covering almost all topics mentioned above is CMG. The call for papers and workshops is open now.

Performance Testing and Optimization for the Cloud

March 29th, 2012 2 comments

While many companies promote performance testing in the cloud (or from the cloud), it makes sense only for certain types of performance testing. For example, it should work fine if we want to test how many users the system supports, whether it would crash under a load of X users, how many servers we need to support Y users, etc., but are not too concerned with exact numbers or variability of results (or even want to see some real-life variability).

Even in this case it assumes that we don’t introduce any bottleneck by using the cloud (for example, by saturating network bandwidth between load generators and the system under test) and that we leave it to the cloud provider to ensure that our test doesn’t impact other cloud tenants (which may not be trivial in the case of PaaS or SaaS).

However, it doesn’t work for performance optimization, when we make a change in the system and want to see how it impacts performance. Testing in a cloud with other tenants intrinsically has some variability of results, since we don’t control other activities in the cloud and in most cases don’t know the exact hardware configuration. For example, if the system scales out by automatically creating an additional application instance, the new instance may be outside of the network segment where the other servers are. The effects may be even more sophisticated in the case of PaaS and SaaS.

So when we talk about performance optimization, we still need an isolated lab. And, if the target environment for the system is a cloud, it should be an isolated private cloud with all the hardware and software infrastructure of the target cloud. And we need monitoring access to the underlying hardware to see how the system maps to the hardware resources and whether it works as expected (for example, testing scaling out or evaluating impacts to/from other tenants – which probably should be one more kind of performance testing to do). Real-world network emulators should be used to make sure that performance testing is representative of how the system would be used in production – otherwise we are not taking into account such factors as network latency, bandwidth, jitter, etc. This means that we need a way to plug in the network emulation appliance properly.

So if we need optimization for cloud software, we still need a lab – but the lab should be more sophisticated to emulate the cloud environment and real-world network conditions. An ultimate example of such a lab is probably the lab Microsoft created for testing IE.

So, factoring the cloud into performance testing, we have two alternatives: coarse performance testing in/from the cloud with inherent variability (and perhaps some savings on hardware and configuration costs) or granular performance testing and optimization in a sophisticated isolated lab emulating the cloud (thus avoiding variability, with probably higher hardware and configuration costs).
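
The effect of variability on optimization work can be illustrated with a small sketch. The Python code below compares repeated test runs before and after a change and applies a crude rule of thumb: the change counts as visible only if the shift in mean response time exceeds twice the run-to-run standard deviation. The rule, the function names, and all the numbers are assumptions made up for this illustration. It is not a proper statistical test and the data is not from any real lab or cloud.

```python
from statistics import mean, stdev


def coefficient_of_variation(samples):
    """Run-to-run variability expressed as a fraction of the mean."""
    return stdev(samples) / mean(samples)


def change_is_visible(baseline, changed, factor=2.0):
    """Crude check: is the shift in means larger than the run-to-run noise?"""
    noise = max(stdev(baseline), stdev(changed))
    return abs(mean(changed) - mean(baseline)) > factor * noise


if __name__ == "__main__":
    # Made-up mean response times (seconds) from five repeated runs each.
    lab_before = [2.00, 2.02, 1.98, 2.01, 1.99]
    lab_after = [1.80, 1.82, 1.79, 1.81, 1.78]
    cloud_before = [2.0, 2.6, 1.7, 2.3, 1.9]
    cloud_after = [1.8, 2.4, 1.5, 2.2, 1.6]

    print(change_is_visible(lab_before, lab_after))      # -> True
    print(change_is_visible(cloud_before, cloud_after))  # -> False
```

In both made-up data sets the change improves the mean by about 0.2 seconds, but only in the low-variability set does that improvement stand out from the noise.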

The Main Performance Problem

March 22nd, 2012 3 comments

Dennis Drogseth’s post The Many Dimensions of User Experience Management (UEM) is very indicative of the main problem we have in performance: people think about many small specific performances, but we have just one PERFORMANCE. It depends on many different components and manifests itself in many different ways, but any attempt to decompose it results in silos and in losing some important parts of the whole.

From the post: When we asked “What is your primary driver?” Better application performance and triage came in fifth, with only 13% of the votes. Employee productivity topped the list at 23%, followed by business competitiveness and/or revenue at 20%. Better support for services delivered over the network came in third, and brand protection and customer satisfaction came in fourth.

Well. Ask, for example, business users about JVM performance and it probably won’t get into the first hundred issues they care about. Does it prove anything? No. They care a lot about it if they use J2EE systems, but just don’t know about it (except maybe the few most curious).

“Employee productivity” heavily depends on application performance. “Business competitiveness and/or revenue” is related to application performance. “Better support for services delivered over the network” – not sure what it means, but performance also comes to mind. “Customer satisfaction” – performance is a pretty major component. And even “brand” may quite well be impacted by bad performance. Probably business users (and not only business users) don’t care much about performance when it is good, but as soon as performance degrades, it immediately jumps to the top of everybody’s priority list.

I, of course, don’t want to say that performance is the main thing in business – if you don’t have any business, you may not be concerned with performance. But as soon as you do, application performance would impact all parts of your business. But you notice it only when it is bad (and usually it will happen soon if you don’t take care).

Then the post says: Similarly, when we wanted to understand which organizations or groups within IT and the business were behind UEM or QoE, the Help Desk/End User Support came in first, Customer Experience Management came in second, and Applications Management and Network Operations were tied at third and fourth place.

And when asked which organization is likely to DRIVE the overall QOE/UEM initiative, the first five groups were: Line of Business, Customer Experience Management, Process Management and Compliance Professional, Help Desk, and Service Management.
Applications Management came in seventh, one percentage point after Infrastructure Management!

Yeah, this exactly proves the point: there is no organization/group responsible for performance today. Not sure what “Application Management” is (I don’t recall seeing such a group – app admins?). And it is not surprising that people don’t pick such a group to drive such an initiative – I guess the perception is that such groups are groups of IT geeks doing something with computers, not caring about business, and starting to do something only when told by the CIO to fix it (which, unfortunately, is often close to the truth).

How does it relate to the concept of Application Performance Management (which is rather a concept for the moment)? It just proves that it doesn’t exist in practice (at least in its ideal form). Usually there is no organization responsible for it (as a holistic concept, in conjunction with business).

What are end-user response times (what EUM monitors)? They are external symptoms of application performance. The only part of application performance end users care about. The tip of the iceberg. If we are saying that we want to manage application performance, would end-user response times be part of it? I have no doubt they would. Otherwise the whole concept doesn’t make sense.

The post states: User Experience Management also has strong business impact, governance, service level and user productivity implications that transcend performance management. Yes, performance has “business impact, governance, service level and user productivity implications”.

So the data provided in the post, in my opinion, proves two things: business cares about performance a lot, but there is no reliable structure in place to take care of end-to-end performance.

Actually I am rather confused by the term User Experience Management. I understand what User Experience Monitoring or End User Experience is (the latter is usually used in the context of measuring response times). But how would you manage it? You may manage your applications/systems, which would improve response times. Unless you are just saying that you want to use the name User Experience Management as an umbrella name covering everything related to performance (including APM, Capacity Planning, etc., etc.) – which may be an option, but it doesn’t look like it is used this way. Or maybe User Experience Management is used as a wider term including usability, UX (User eXperience), etc., which usually relate to UI design? If yes, then it indeed includes important factors not related to performance and only partially overlaps with APM – but then I am not sure why we compare EUM with APM.

Ian Molyneaux’s post The Case for the CPO took the topic of a person responsible for performance to its extreme. Great idea, but… How far are we from there? Forget the CPO; what about just having a person (or persons) responsible for end-to-end performance and for building up the process assuring such performance? Look at job postings – has anybody seen any position saying that we need a person to drive performance in our organization (and meaning it)? I haven’t. All positions are for a specific silo team or for consulting. So it looks like it will be a while until we see a more holistic approach to performance (whatever name is used for it).

Response Times: Digesting the Latest Information

March 13th, 2012 No comments

Returning to the discussion around my post How Response Times Impact Business? and recent publications on the topic, such as For Impatient Web Users, an Eye Blink Is Just Too Long to Wait.

On the one hand, we see more and more statements that response times should be shorter and shorter. For example, both Scott Barber and Harry Shum (“a computer scientist and speed specialist at Microsoft” according to the New York Times article) state 250 milliseconds as the magic number for response times (although I am not sure where these 250 milliseconds came from).

On the other hand, the three psychological thresholds and other considerations I referred to in my post were based on multiple studies and definitely make sense.

Well, I would definitely prefer to see recent research on the subject. It is strange that there has been a lot of research since 1968 – but none recently. And this is just when really big money is getting involved. Or is there some, but it just doesn’t get released?

Meanwhile, one explanation may be that the perception of [at least] simple web navigation is changing. Web response times were defined by the second threshold: users feel they are interacting freely with the information (1-5 seconds). They notice the delay, but feel that the computer is “working” on the command. Well, maybe users no longer feel that the computer should “work” [at least] for simple web navigation. Maybe they now perceive it as described by the first threshold: instantaneous (0.1-0.2 second). Users feel that they directly manipulate objects in the user interface. So while these psychological thresholds are still correct, the perception of [at least] simple web navigation is changing and it is now defined by another threshold. Just speculation, of course – it would be interesting to see any research to prove or disprove it.

In a similar classification, Steven Seow in Designing and Engineering Time: The Psychology of Time Perception in Software defines four classes of responsiveness based on user expectancy:

  • Instantaneous: 0.1 to 0.2 seconds
  • Immediate: 0.5 to one second
  • Continuous: two to five seconds
  • Captive: seven to ten seconds

So in a way he breaks the middle threshold into two classes: immediate and continuous. If we accept this division, we perhaps may say that user expectations for [at least] simple web navigation are moving from the continuous class to the immediate class. That perhaps makes more sense to me: I am still rather skeptical that we indeed need 250 ms end-user response time (of course, if we talk about server response time, it would be another story).
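
To make the classification concrete, here is a minimal Python sketch that maps a measured response time to one of the four classes listed above. The ranges come from the list; the rule for values that fall into the gaps between ranges (assign the first class whose upper bound is not exceeded) and the name "Beyond captive" are my own assumptions for illustration, not something taken from the book.

```python
# Seow's classes as quoted above: (name, lower bound, upper bound), in seconds.
SEOW_CLASSES = [
    ("Instantaneous", 0.1, 0.2),
    ("Immediate", 0.5, 1.0),
    ("Continuous", 2.0, 5.0),
    ("Captive", 7.0, 10.0),
]


def classify(response_time_s):
    """Return the first class whose upper bound covers the response time.

    Assumption: the quoted ranges leave gaps (for example 0.2 to 0.5 s), so a
    value in a gap is assigned to the next slower class. "Beyond captive" is a
    made-up label for anything slower than the last range.
    """
    for name, _low, high in SEOW_CLASSES:
        if response_time_s <= high:
            return name
    return "Beyond captive"


# Under this gap rule, the 250 ms "magic number" lands in the immediate class.
magic_number_class = classify(0.25)
```

With this (assumed) rule, a 3-second page is "Continuous" and a 0.8-second page is "Immediate", which is the shift in expectations speculated about above.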

Increasing Transaction Rate: a Valid Performance Testing Technique

March 12th, 2012 3 comments

Just read a very good post Concurrency of Users Vs Increasing Transaction Rate by Jason Buksh about a pretty old question in load testing: whether you can use fewer virtual users by increasing the transaction rate.

Jason’s post discusses the subject in great detail. One more thing worth mentioning is that the comparison between these two, say, approaches makes sense when the system handles the load well (low response times, etc.). As soon as the system starts to slow down, the load becomes quite different (because each virtual user waits for a response before sending the next request, a small number of users can send only a limited number of requests to the system).
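
A rough way to see this is the interactive response time law (a form of Little's Law) for a closed workload: throughput X = N / (R + Z), where N is the number of virtual users, R the response time, and Z the think time. The sketch below is only an illustration; all the numbers are hypothetical and were picked to make the arithmetic obvious.

```python
def throughput(users, response_time_s, think_time_s):
    """Interactive response time law for a closed workload: X = N / (R + Z)."""
    return users / (response_time_s + think_time_s)


# Hypothetical healthy system: response time is 1 second.
realistic_ok = throughput(users=100, response_time_s=1.0, think_time_s=9.0)  # 10.0 requests/s
reduced_ok = throughput(users=10, response_time_s=1.0, think_time_s=0.0)     # 10.0 requests/s

# Hypothetical slowdown: response time grows to 5 seconds.
realistic_slow = throughput(users=100, response_time_s=5.0, think_time_s=9.0)  # about 7.1 requests/s
reduced_slow = throughput(users=10, response_time_s=5.0, think_time_s=0.0)     # 2.0 requests/s
```

While the system responds quickly, 10 users without think time generate the same request rate as 100 users with 9 seconds of think time. Once responses slow down, the small group's rate collapses much faster, and it never keeps more than 10 requests in the system at once, so the two tests stop being comparable.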

Still, I prefer not to think about them as two equal approaches. If you use increasing transaction rate alone, you are definitely cutting corners. It is much better than nothing, but you still risk missing issues related to high concurrency (which are quite probable).

I’d rather think about increasing transaction rate as an important performance testing technique to use in addition to using a realistic number of users. This view is explained well in the Rapid Bottleneck Identification – A Better Way to do Load Testing white paper. Actually it was an Empirix white paper originally and there was a discussion about it back in 2005.

So increasing transaction rate leaves high concurrency risks and it is always better to test the full concurrency in the end. But increasing transaction rate may be a good technique to speed up performance testing and catch some issues earlier.

Book Review: Solving Enterprise Applications Performance Puzzles: Queuing Models to the Rescue

March 12th, 2012 No comments

Solving Enterprise Applications Performance Puzzles: Queuing Models to the Rescue by Leonid Grinshpan is a pretty interesting book about applying queuing models to solving enterprise performance issues, and I believe the book fills a few gaps in the practical application of queuing theory. Another good name for this book could be “Building queuing models by example”.

I spent a lot of time trying to use queuing models to solve practical performance issues and can testify that it is pretty challenging. There are a few areas where it was developed a little further (for example, around capacity planning of existing systems), but if you are trying to do something else – you won’t find much help. You have a lot of books about systems performance, you have a lot of books about queuing theory with simple examples, but not much in between to solve practical tasks. And here Leonid’s book may help, especially if you are new to this area.

Chapter 1, Queuing Networks as Applications Models, is an introduction to the topic. It discusses how queuing theory may be used to model enterprise applications. A lot of analogies are used to introduce the subject.

Chapter 2, Building and Solving Application Models, is an overview of the whole process, including short discussions about the essentials of queuing theory and the use of tools to solve models.

Chapter 3, Workload Characterization and Transaction Profiling, discusses what the input data for models are and how to gather them.

Chapter 4, Servers, CPUs, and Other Building Blocks of Application Scalability, discusses scalability, bottlenecks, how to identify bottlenecks and ways to fix them (mostly on CPU and I/O examples).

Chapter 5, Operating System Overheads, discusses the main components of operating systems, where overheads come from, how to measure them, and their impact on transaction time.

Chapter 6, Software Bottlenecks, is devoted to software bottlenecks, which are rarely discussed in the context of queuing models – while in practice software bottlenecks happen all the time. Memory bottlenecks and thread optimizations and their modeling are discussed in detail. Multiple other software bottlenecks are also reviewed.

Chapter 7, Performance and Capacity of Virtual Systems, is an overview of performance issues related to virtualization, their explanation with queuing theory, and a methodology of virtual machine sizing.

Chapter 8, Model-Based Application Sizing: Say Good-Bye to guessing, explains why to use model-based sizing and discusses it step-by-step from gathering input data to model deliverables and what-if scenarios.

Chapter 9, Modeling Different Application Configurations, discusses several special cases including geographical distribution of users, cross-platform modeling, remote terminal services, load balancing, and parallelization of transactions.

The book covers a lot of topics. However, to avoid disappointment, I’d like to point out what this book is not:

- It is not a textbook about queuing theory. Section 2.2, Essentials of Queuing Networks Theory, is only 5 pages long.

- It is not a book about tools to solve queuing models. Available tools are listed and there are references, but they are just mentioned as a way to solve models (with one tool used as an illustration of the process). You don’t need to know any tool to read the book (but you will need one when you try to solve your own models).

- It is not a comprehensive book about enterprise application performance. There is plenty of important information and practical recommendations about enterprise application performance in the book, but it is shared as needed to build models and analyze their results.

So the book is exactly what the title says: a practical book about building queuing models to investigate enterprise applications performance issues.
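
For readers who have never met a queuing model, the simplest textbook case may give a feel for what such models compute. The sketch below is the classic single-server M/M/1 formula, R = S / (1 - U) with utilization U = arrival rate × S. It is not taken from the book (whose models are queuing networks solved with tools), and the service time and arrival rates are hypothetical.

```python
def mm1_response_time(arrival_rate_per_s, service_time_s):
    """Average response time of an M/M/1 queue: R = S / (1 - U), where U = lambda * S."""
    utilization = arrival_rate_per_s * service_time_s
    if utilization >= 1.0:
        # The formula only holds for a stable queue (utilization below 100%).
        raise ValueError("server is saturated: utilization must be below 1")
    return service_time_s / (1.0 - utilization)


# Hypothetical server with a 0.1 s service time at growing arrival rates.
at_50_percent = mm1_response_time(arrival_rate_per_s=5.0, service_time_s=0.1)  # about 0.2 s
at_90_percent = mm1_response_time(arrival_rate_per_s=9.0, service_time_s=0.1)  # about 1.0 s
```

Even this toy model shows the typical pattern: response time grows slowly at moderate utilization and very quickly as the server approaches saturation.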

Multiple Dimensions of Response Time

February 27th, 2012 No comments

It looks like everything related to performance has multiple dimensions. Recently reading the excellent posts A non-geeky guide to understanding performance measurement terms by Joshua Bixby and Building a High Performance Website by Phil Stanhope, I realized how many dimensions even a relatively simple term like “response time” has. And, moreover, it looks like we don’t have a reliable way to measure the response time that would matter to the end user (I guess something between “time to display” and “time to interactivity” depending on the site design, if we follow the posts’ terminology). Both authors look at this rather from the front end / Web Performance Optimization (WPO) point of view.

Spending most of my time in performance testing, I’d guess that “response time” comes from load testing / active monitoring tools that are the main source of performance information (the “waterfall” approach of the WPO community is quickly becoming popular – but I am not sure how many monitoring services use it). And in this case, “response time” is what the tool reports. What “response time” means in such a case depends heavily on the tool and its settings – and in many cases, I guess, it won’t be any of the metrics provided in the aforementioned posts (which, I guess, are standard in the WPO community – but they may not be easy to measure with load testing and enterprise monitoring tools). For protocol-based tools it would probably be the time to receive responses to all requests, without any client-side activity (with many additional details of browser emulation, like caching, threading, keep-alive, compression, etc.). For GUI-based tools it probably depends on what underlying mechanism the tool uses and how the script is designed. Quite often, if you don’t set any specific checks, it may report a success without full downloading and rendering (and when somebody says that a modern sophisticated site loads in 0.169 sec over the Internet, that would be my first guess). Although, if scripted properly, it perhaps may measure the performance metric that matters (when the page would “be almost fully interactive”) by checking that the parts that matter are downloaded and rendered (that probably can’t be done without manual scripting / analysis).
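
To illustrate how far apart these “response times” can be for the same page load, here is a small Python sketch. The `PageTimeline` structure, its field names, and the timestamps are all hypothetical; real tools define and collect these moments in their own ways.

```python
from dataclasses import dataclass


@dataclass
class PageTimeline:
    """Hypothetical timestamps, in seconds, counted from the moment of the user action."""
    first_byte: float   # first byte of the main response received
    last_byte: float    # responses to all requests fully received
    displayed: float    # main content visible ("time to display")
    interactive: float  # page usable by the end user ("time to interactivity")


def response_time_candidates(t):
    """Return the different values that could each be reported as the 'response time'."""
    return {
        "time to first byte": t.first_byte,
        "all responses received (protocol-level view)": t.last_byte,
        "time to display": t.displayed,
        "time to interactivity": t.interactive,
    }


# Made-up page load, only to show the spread between the definitions.
timeline = PageTimeline(first_byte=0.2, last_byte=1.4, displayed=2.1, interactive=3.5)
candidates = response_time_candidates(timeline)
spread_s = max(candidates.values()) - min(candidates.values())
```

For this made-up page the reported “response time” could be anything from 0.2 to 3.5 seconds depending on which moment the tool treats as the end of the transaction.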

That brings up an interesting question about Application Performance Management (APM): what does End-User Experience Monitoring (EUM), a.k.a. Real-User Monitoring (RUM), measure? EUM/RUM is considered an integral part of APM (and definitely should be), but it may measure pretty different things depending on the measurement approach. And as I mentioned above, it probably won’t be the actual end-user experience – but only its approximation by another metric (different for different tools).

The only thing that often saves us from all this complexity is, as often happens in performance, that in many cases it doesn’t matter. All of the metrics are just close enough from the practical point of view. In the good old days of plain HTML the main part was getting the response from the server; the client-side part was fixed and usually small. So not much was said about different kinds of response times in the past. The situation is changing now: the front-end time is becoming significant (see the Performance Golden Rule by Steve Souders, keeping in mind that it is based mainly on front pages) and now it looks like we can’t ignore the differences between response times anymore.