Archive for the ‘Software Performance Engineering’ Category

Two Main Challenges of Performance Modeling and System Sizing

September 10th, 2012 No comments

Reading about performance modeling / simulation and system sizing, you often see two completely opposite views of the subject. Either the authors describe in detail how you can model performance using some math, and you may feel that as soon as you comprehend that math you won’t have any problem with modeling. Or the authors say that it is black magic and you’d better stay away from it, or do it in a minimal way with simple trending (you probably won’t see that view in serious books, but it can often be seen in Internet discussions).

The truth, as usual, is in the middle. Modeling is very helpful and works well if you use it properly and understand its limitations. And there are two main challenges here that rarely get highlighted, although everybody who wants to approach the topic should understand them clearly.

The first challenge is that modeling works well for known resource limitations. You should know these limitations in advance (and how your system uses that limited resource, which is also a challenge, but a more technical one). For example, if your system is processor-bound and you know how much CPU it takes per transaction, you may build a rather simple and pretty accurate model (using queuing theory or even something simpler: for example, if you stay away from heavy CPU utilization, a linear model may work well with multi-processor systems).
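As a rough illustration of the processor-bound case, here is a minimal Python sketch. It assumes the utilization law (utilization = throughput × CPU time per transaction / number of cores) for the linear part and the textbook M/M/1 formula for a single processor. The function names and the sample numbers are hypothetical and only there to show the shape of such a model.

```python
def cpu_utilization(throughput_tps, cpu_sec_per_txn, cores):
    """Linear model (utilization law): average busy fraction of each core."""
    return throughput_tps * cpu_sec_per_txn / cores


def mm1_response_time(throughput_tps, cpu_sec_per_txn):
    """M/M/1 response time for one processor: R = S / (1 - U).

    Assumes random (Poisson) arrivals and exponential service times,
    which is a simplification of any real system.
    """
    utilization = throughput_tps * cpu_sec_per_txn
    if utilization >= 1.0:
        raise ValueError("offered load exceeds the capacity of the processor")
    return cpu_sec_per_txn / (1.0 - utilization)


# Hypothetical numbers: 0.02 CPU seconds per transaction.
print(cpu_utilization(100, 0.02, cores=4))  # average utilization of 4 cores at 100 tps
print(mm1_response_time(25, 0.02))          # response time of 1 processor at 25 tps
```

The linear part is reasonable only while utilization stays moderate. As utilization approaches 1, the queuing term dominates and response time grows without bound.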

But that model would never tell you when you run out of another resource and run into another kind of bottleneck until you build it into the model. And beyond a few common resources (processor, memory, disk, network) and explicitly introduced throttling, you usually don’t know about bottlenecks until you run into them. This is the primary reason why the results of your model (which may be perfect from the mathematical point of view) are not reliable if you model a significantly higher load than you tested / validated: there is a high probability that you will run into another bottleneck that you are not taking into consideration now. However, the model would provide the best possible case (which holds true once you fix all the other bottlenecks you didn’t take into consideration at the moment of modeling), which is important information by itself. A model would also be very useful to see whether the system behaves up to expectations or there are internal issues degrading performance and preventing scalability (which may be not so trivial to catch in complex systems).
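The "best possible case" idea can be sketched as a simple bottleneck bound: the maximum throughput is set by whichever modeled resource saturates first. The Python below is only an illustration under that assumption. The resource names, the service demands, and the connection pool are all hypothetical.

```python
def max_throughput(demand_sec_per_txn, capacity_units):
    """Upper bound on throughput (transactions per second).

    demand_sec_per_txn: seconds of each resource used by one transaction.
    capacity_units: how many units of each resource work in parallel.
    The resource that saturates first sets the limit.
    """
    return min(capacity_units[name] / demand
               for name, demand in demand_sec_per_txn.items())


# Model that knows only about the CPU: 8 cores, 0.02 CPU seconds per transaction.
cpu_only = max_throughput({"cpu": 0.02}, {"cpu": 8})

# Same system once a previously unknown limit is built into the model:
# a pool of 20 database connections, each held for 0.1 seconds per transaction.
with_pool = max_throughput(
    {"cpu": 0.02, "db_connection": 0.1},
    {"cpu": 8, "db_connection": 20},
)

print(cpu_only, with_pool)
```

The CPU-only model reports roughly twice the throughput that the system can reach with the connection pool in place. Until the pool is in the model, nothing in the math would warn you about it.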

Another challenge is a lack of performance-related hardware metrics to use in modeling. You can find detailed hardware specifications, but they won’t tell you how fast your systems would work on this hardware. As far as I understand, the only relatively objective approach (short of testing the real system on the real hardware, which is, of course, the best) is to use existing benchmark results to compare performance (keeping in mind that they represent the results of this specific benchmark, not of your systems). Most serious commercial modeling tools come with a library of hardware configurations and their performance metrics, allowing what-if performance analysis. It looks like maintaining such libraries is a pretty time-consuming task, and their quality may vary. Such a library is usually a major advantage of commercial modeling tools in comparison with free or inexpensive modeling tools (which may be quite good from the mathematical point of view, but you need to provide all the numbers yourself).
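One common way to use benchmark results in a what-if analysis is to scale a measured CPU demand by the ratio of benchmark scores. The sketch below assumes that a higher score means proportionally faster processing of your workload, which is exactly the caveat mentioned above: the scores describe the benchmark, not your system. The function name and the scores are hypothetical and do not come from any real benchmark.

```python
def scale_cpu_demand(measured_cpu_sec, reference_score, target_score):
    """Estimate CPU seconds per transaction on the target hardware.

    Assumes performance is proportional to the benchmark score
    (higher score = faster), which is only an approximation.
    """
    if reference_score <= 0 or target_score <= 0:
        raise ValueError("benchmark scores must be positive")
    return measured_cpu_sec * reference_score / target_score


# Hypothetical: 0.02 CPU seconds measured on hardware scoring 100;
# the candidate hardware scores 200 on the same benchmark.
estimated = scale_cpu_demand(0.02, reference_score=100, target_score=200)
print(estimated)
```

An estimate like this is a starting point for sizing and should be validated by testing once the target hardware is available.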

IDC made an interesting move here by introducing QPI (Qualified Performance Indicator) as a part of IDC’s Server Decision Suite Metrics (a free 30-day trial is available). It is a kind of independent performance library that may be used for proper performance modeling / sizing (and, as far as I understand, it goes well beyond performance, integrating this information with other IT-related metrics such as price, power, and size; it should be a very interesting optimization task to find the best hardware configuration based on all these metrics).

How Do We Measure Computer Resources?

December 26th, 2011 No comments

I posted How Do We Measure Computer Resources? on Application Performance Engineering Hub. It looks like an important issue for the high-tech industry to me, and it is a pity that it continues to go unnoticed.

Front End vs. Back End

June 2nd, 2011 5 comments

Following the topic of my previous post and Steve Thair’s comment, as well as his related post, I want to return to the front end vs. back end discussion.

Steve Souders said in his recent interview: “For years when developers started focusing on the performance of their websites, they would start on the back end, optimizing C++ code or database queries. Then we discovered that about 10% or 20% of the overall page load time was spent on the back end. So if you cut that in half, you only improve things 5%, maybe 10%. In many cases, you can reduce the back end time to zero and most users won’t notice. So really, improvement comes from the time spent on the front end, on the network transferring resources and in the browser pulling in those resources.”
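The arithmetic behind the 5% and 10% figures can be checked with a couple of lines of Python. This is just a sketch: the function name is hypothetical, and it assumes that the back-end share and the rest of the page load time simply add up.

```python
def page_load_improvement(backend_share, backend_time_reduction):
    """Fraction of total page load time saved.

    backend_share: part of the total time spent on the back end (0..1).
    backend_time_reduction: part of the back-end time removed (0..1).
    """
    return backend_share * backend_time_reduction


# Back end takes 10% or 20% of the page load time and is cut in half.
low = page_load_improvement(0.10, 0.5)
high = page_load_improvement(0.20, 0.5)

# Back-end time reduced to zero when it takes 20% of the total.
best = page_load_improvement(0.20, 1.0)

print(low, high, best)
```

The same formula shows why the conclusion flips for a system where the back-end share is large: with a share of 0.8, halving the back-end time removes 40% of the total.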

Well, if we see that “about 10% or 20% of the overall page load time was spent on the back end” under the maximal load, this statement is a great example of applying performance engineering to problem analysis. Finding where the time is spent is definitely the first thing to do when investigating any performance issue. And, considering the popularity of WPO, it is probably the case for most modern websites with a rich web interface and no need for transactional processing behind the scenes. But it is usually not the case for sophisticated business applications working with transactional data (and I even doubt that it is exactly the case at the moment you click the “confirm order” button when you buy something on the Internet).

But one important thing I’d like to stress is that the back end must handle multiple users, while on the client side we always have a single user (this may change somewhat in the future with multiple browser tabs and parallel JavaScript code doing something all the time, but we can probably ignore this for the moment). And for a single-user performance issue you, generally speaking, run a profiler, find where the time is spent, and then need to figure out why and how it should be changed.

For the back end you have multi-user load, and back-end performance problems observable with a single user are somewhat trivial (see above about the profiler, etc.). But many performance issues may be observed only under [heavy] load. So you get one more level of sophistication on top: you need to simulate load, and you need to find a way to debug / profile under load (most tools bring too much overhead to be used in this situation, plus issues may be related to timing, and attempts to look inside may change the behavior of the system). Plus you get system resource limitations on top of multi-user software problems (such as synchronization issues, running out of software objects, etc.), introducing non-linear effects (this is where you get to capacity management, queuing theory, etc.). All these introduce a high probability that back-end performance will degrade drastically with load (if not properly tested and configured), while the time spent on the client side for rendering and client-side processing would remain the same in most cases (although affected by server response timing). This was the primary reason why almost all attention used to be paid to the back end.

WPO: A New Wave of Performance Engineering?

May 31st, 2011 2 comments

Reading Who should be on your WPO team? reminded me of so many things I had read before, such as creating an SPE team or creating a Performance Center of Excellence.

It looks like we have got a completely new area of performance engineering, Web Performance Optimization (WPO), with its own terminology, approaches, experts, Web Performance meeting groups, the Velocity conference, and, perhaps, even new load testing tools like CloudTest (my impression is that it is more beneficial for WPO projects). WPO has actually been around for a while (it looks like the first Velocity conference was in 2008), but only recently, after attending a couple of New York Web Perf events, did I realize that it had become a separate discipline. I guess the appearance of this new movement concentrated on web performance means that we have a pretty mature industry of very scalable web sites delivering sophisticated content.

Well, the history of performance engineering looks to me like a series of waves (although my knowledge of its history is limited, especially for the period before I got involved). The Computer Measurement Group (CMG) was organized in 1975 as an organization of performance analysts and capacity planners. Dr. Connie Smith’s book “Performance Engineering of Software Systems”, published in 1990, created the Software Performance Engineering movement.

Distributed systems brought a new wave of performance engineering based around load testing, perhaps because there was not much instrumentation available and the only way to make sure that the system performed was to apply load. It looks like the first version of LoadRunner was shipped in 1989. But when I first got involved in load testing in 1997 with SQL Bench (SilkPerformer’s ancestor), it was still far from what we expect from load testing tools now. The latest wave was probably Application Performance Management, with a large array of tools promising application instrumentation (visibility into what is going on inside applications).

It is interesting that all these overlapping areas never completely merged. This is probably the reason why we have such discrepancies in performance terminology: every group often started its terminology from scratch (while others still used the old terminology).

And now we get Web Performance Optimization (it looks like the term was coined by Steve Souders). While WPO looks like a separate discipline, I’d rather place it as a part of overall performance engineering. You still have a back end in most cases, and while the back end is mentioned in WPO presentations, it sometimes looks like the authors are mentioning something trivial. Well, it is not, even for most web sites, not to mention large banks and insurance companies with many tiers of sophisticated systems in the back, and for end-user performance you need to consider everything together. Downplaying the “back end” is probably as wrong as downplaying the “front end” (of which, working mostly with business applications, I am definitely guilty; historically, load testing concentrated on server performance). The importance of each component depends on the system. In my opinion, performance principles are much more generic than the details of specific technologies. Most performance engineering experience may be applied to any technology (you, of course, still need to learn something about the new technology too).

So, while it is very promising and exciting that we get a new wave of people dedicated to performance, it is a little sad that it often seems to start from scratch, inventing new terminology and ignoring what existed before. To me it would be better if all these waves came together to enrich each other with the areas of performance engineering they specialize in. Of course, there is some interaction (you need to work together in some way to ensure the systems’ performance), but it still looks like every wave tends to stay somewhat separate, cultivating its own terminology, approaches, and events.

Recalling Hyperformix

January 26th, 2011 1 comment

I worked with Hyperformix products for a rather short period of time, but have followed the company for a long time. Hyperformix tried to implement a lot of very interesting things in performance engineering.

The original Hyperformix product, Hyperformix Optimizer, has a modeling language, and you may build very sophisticated models with it, including models of new systems (after, of course, you learn the language). Optimizer uses simulation, so while you may build models as complex as you want, it may take a while to run them. However, Optimizer was rather weak on data collection. It had interfaces to many monitoring tools, but it was a challenge to collect information from a large zoo of servers.

Many other modeling tools, like TeamQuest, came from the enterprise capacity planning point of view. You install agents on all servers, collect and report information, and build analytical models based on the collected information. It is not easy to build more sophisticated models or models for new systems with them. But if the goal is to monitor a zoo of servers, report results, and build some predictions for the existing systems, such tools are a good choice.

TeamQuest, for example, didn’t have any meaningful methodology for modeling non-existing systems until my colleague Leonid Grinshpan came up with Multi-tiered Applications Sizing Methodology Based on Load Testing and Queuing Network Models (a CMG’08 paper).

I guess that Hyperformix did the best job of integrating modeling and load testing. For example, see Moving Beyond Test and Guess – Using Modeling with Load Testing to Improve Web Application Readiness by Richard Gimarc, Amy Spellmann, and Jim Reynolds (as well as Richard Gimarc’s other papers: just do a search through the CMG proceedings; papers up to 2007 are open to the public, free registration required). For some time, Mercury even re-sold Hyperformix Optimizer as Mercury Capacity Planning.

A group of Hyperformix authors published a very good practical book on performance engineering, “Fundamentals of Performance Engineering: You can’t spell firefighter without IT”. For those without a deep math background it may be a good book to start learning modeling / performance engineering from (it has some Hyperformix-inclined material, but not much).

And at one point Hyperformix provided a performance engineering certification; nothing else has been available in this area so far. But they made it too Hyperformix-specific to interest anybody outside the rather narrow circle of Hyperformix clients. I guess such a certification should be vendor-independent to succeed (at least until consolidation starts in this area).

Later Hyperformix created another product, Capacity Manager, which looks similar to other capacity management products (like TeamQuest) in goals and methods (somewhat admitting that Optimizer, as it is, is not the best tool for enterprise-level monitoring and capacity planning). It looks like Capacity Manager became the main Hyperformix product even before the CA acquisition (so many performance engineering initiatives were forgotten) and that Capacity Manager was the main goal of the CA acquisition. Although, of course, this is just how I see it from the outside.

Hyperformix was acquired by CA

November 4th, 2010 2 comments

Hyperformix was acquired by CA. Hyperformix was one of the leading companies providing capacity planning/management solutions based on mathematical modeling. They probably had the best product for Software Performance Engineering. They probably did the best job integrating modeling/capacity planning with load testing (and LoadRunner in particular). Once upon a time Hyperformix Performance Optimizer was also marketed as Mercury Capacity Planning.

I believe that this is the first acquisition in this area since BGS, with its then-leading BEST/1 product (now, I guess, BMC Capacity Management), was acquired by BMC in 1998.

It is interesting that modeling (capacity management based on true modeling) has remained an area of small players for a long time. In addition to Hyperformix and BMC, it is possible to mention TeamQuest, Metron, OPNET, and BEZ Systems. It is especially interesting because large players (like IBM, Oracle, HP, and Microsoft) don’t have any real modeling solutions in their portfolios (some products may have trending/forecasting features, but I haven’t heard about anything more sophisticated). Even with the high-tech buying spree going on, this area has surprisingly remained untouched.

I was also surprised that it was CA who acquired Hyperformix. As far as I understand, CA is not known for investing much in the companies it acquires. It looks like they put more stress on capacity management. So I am afraid that the most interesting features of Hyperformix (in my opinion, of course), like its use for modeling new systems (Software Performance Engineering) and its integration with load testing, may be lost.