Posts Tagged ‘WPO’

Performance Dimension of Information Technology

April 16th, 2012 1 comment

There are no standards for titles and skill sets related to the performance dimension of IT. I decided to put together how I understand them (most terms are vague, so it is quite possible that other people understand them differently). Of course, it is a simplification – but the topic is probably too heavily influenced by the history and politics of every particular organization to be clear-cut anyway.

I still think that we can break the whole area into three major categories: design (and development), testing, and production (maybe somewhat matching the ITIL terms Service Design, Service Transition, and Service Operation). The term Performance Engineering may relate to the whole area (or maybe only to the design category – in which case it is sometimes referred to as Software Performance Engineering, SPE).

Performance Design. Talking about the design category (I used the ‘Performance Design’ term to group all performance-related activities during design and development, although it isn’t normally used this way – probably reflecting that the whole area does not quite exist as a separate discipline), we have specific areas of performance engineering knowledge for each specific technology, such as Java performance, .Net performance, etc. One relatively new, but large and popular area is Web Performance Optimization, covering end-user Web performance. And, of course, we have Software Performance Engineering (SPE) trying to establish generic approaches – although SPE progress hasn’t been too impressive since Dr. Connie Smith published ‘Performance Engineering of Software Systems’ in 1990.

Performance design is definitely supposed to be an important part of the skill set of software architects (on a higher level, SPE, etc.) and software developers (maybe on a lower level: how to efficiently design a specific component using the chosen technology – but a good understanding of high-level performance engineering won’t hurt either).

And while many architects and developers have some understanding of performance, often the main stress is on functionality and deadlines, so performance is left to the very end – where it sometimes may indeed be tuned in (usually when technologies are mature and the team is quite experienced), and sometimes requires major changes (and late changes are very expensive).

It looks like the idea of having an explicit person responsible for performance from the beginning (starting from requirements), working with other architects and developers to build it in, makes sense. The title may be ‘performance architect’ or ‘performance champion’. Such people are rare, though – more often we see a proactive person from a performance engineering or performance testing group trying to ask performance questions early.

Performance Testing. This includes, of course, all other variations and names, such as load, stress, endurance, etc. testing. The matching ITIL term would probably be Service Validation and Testing. It covers all ways to apply synthetic load to the system and analyze the system’s behavior. In the narrow sense, a ‘performance tester’ is responsible for creating and applying such load (test scripting and execution). In a wider sense, it also includes workload characterization (workload modeling), performance analysis, and performance troubleshooting – and often such a person is referred to as a ‘performance engineer’. In some cases they are different people: the performance tester is responsible for applying the load, and the performance engineer (maybe performance analyst in this case) is responsible for system analysis and optimization.

I definitely put performance testing in a separate category due to the specific set of skills required: workload generation. And, perhaps, techniques to find and fix issues in the system by applying an appropriate workload. But definitely not because “testing should go after development before production”, as it used to be in the waterfall approach – testing should start as early as possible, mostly overlapping with development, and may continue in production. Monitoring the system using synthetic workload, for example, I’d rather also put in this testing category – it is actually testing the production system in parallel with the production workload.

Performance Management, perhaps, may be a good name for the collection of performance-related activities and skills in production (and around).

It is interesting that ITIL places the Capacity Management and Service Level Management processes into Service Design. I see a point here – you definitely need to allocate capacity before deploying the system, and Service Levels should probably come directly from the performance requirements. Still, the real people working in these areas are usually part of operations. Capacity Planners are responsible for allocating resources, although fewer and fewer people have such a title, and these responsibilities get spread between other groups (which, unfortunately, often don’t have the appropriate skills).

Service Level Management would probably be handled by Performance Monitoring (Analysis). The matching ITIL term would probably be Service Measurement. The title ‘Performance Analyst’ was used often in the past but is not very popular anymore; the title ‘Performance Engineer’ is probably more popular now. And, of course, monitoring may be specialized, like Database Monitoring, System Monitoring, or Application Server Monitoring. These may be done by the respective administrators (DBA, system administrator, etc.).

Application Monitoring is relatively new stuff, usually referred to as Application Performance Monitoring. The idea is to measure application-specific metrics (including business-related metrics, end-user metrics, etc.) in addition to the system-level metrics that were measured before. The importance of application monitoring is definitely growing. On one side, system-level metrics become less relevant in today’s infrastructure with virtualization, multi-tenancy, cloud, etc. On another side, systems become so complicated that trying to figure out what is going on using low-level metrics becomes a nightmare. On the third side, full monitoring from the business point of view becomes a business requirement – and this is where IT can provide a unique business advantage.

Probably Application Performance Management (APM) would be the right category encompassing most production-related categories, such as Performance Monitoring, Capacity Management, Diagnostics (troubleshooting), and Tuning (and Optimization – although this may somewhat get into the re-design category). We are probably not there yet, and Application Performance Management is more a vague vision than a reality. Gartner, for example, stresses that APM is Application Performance Monitoring, not Management. And I am not sure what the title of the person doing this would be. Management is a favorite word for an area of expertise (as in Performance Management or Capacity Management), but Manager (at least in the US) still means a person who manages other people. So the title, I guess, would be the same ubiquitous ‘performance engineer’.

Performance Troubleshooting, or Diagnostics, is definitely an important part of Performance Management and is an application of performance engineering to existing performance issues. While it is probably the most typical performance-related activity at many corporations, very few have anything formal around it, and usually all other performance-related groups get involved. And we need performance engineering skills to investigate and fix performance problems in production.

It looks like in the new generation of Web companies monitoring and capacity planning are often included in ‘Site Reliability’, adding, I guess, some confusion to the already existing mess of terms and notions.

P.S. By the way, the only conference covering almost all of the topics mentioned above is CMG. The call for papers and workshops is open now.

Multiple Dimensions of Response Time

February 27th, 2012 No comments

It looks like everything related to performance has multiple dimensions. Recently reading the excellent posts A non-geeky guide to understanding performance measurement terms by Joshua Bixby and Building a High Performance Website by Phil Stanhope, I realized how many dimensions even a relatively simple term like “response time” has. And, moreover, it looks like we don’t have a reliable way to measure the response time that would matter to the end user (I guess something between “time to display” and “time to interactivity” depending on the site design, if we follow the posts’ terminology). Both authors look at this mostly from the front end / Web Performance Optimization (WPO) point of view.

Spending most of my time in performance testing, I’d guess that “response time” comes from load testing / active monitoring tools, which are the main source of performance information (the “waterfall” approach of the WPO community is quickly becoming popular – but I am not sure how many monitoring services use it). And in this case, “response time” is what the tool reports. What “response time” means in such a case depends heavily on the tool and its settings – and in many cases, I guess, it won’t be any of the metrics provided in the aforementioned posts (which, I guess, are standard in the WPO community – but they may not be easy to measure with load testing and enterprise monitoring tools). For protocol-based tools it would probably be the time to complete all requests without any client-side activities (with many additional details of browser emulation, like caching, threading, keep-alive, compression, etc.). For GUI-based tools it probably depends on what underlying mechanism the tool uses and how the script is designed. Quite often, if you don’t set any specific checks, the tool may report a success without full downloading and rendering (and when somebody says that a modern sophisticated site loads in 0.169 sec over the Internet, that would be my first guess). Although, if scripted properly, it perhaps may measure the performance metric that matters (when the page would “be almost fully interactive”) by checking that the parts that matter are downloaded and rendered (which probably can’t be done without manual scripting / analysis).
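To illustrate the point about missing checks, here is a minimal sketch in Python. It is not taken from any particular load testing tool: the function names, the sample pages, and the markers are all hypothetical. It only shows how a status-only check can report success for a page that never finished downloading, while an explicit content check catches it. It says nothing about rendering, which a check like this cannot see.

```python
from typing import Iterable


def passes_default_check(status_code: int) -> bool:
    """Status-only check: roughly what a script reports with no explicit checks set."""
    return 200 <= status_code < 400


def passes_content_check(status_code: int, body: str, required_markers: Iterable[str]) -> bool:
    """Succeed only if the status is fine AND every part that matters is in the body."""
    return passes_default_check(status_code) and all(
        marker in body for marker in required_markers
    )


# Hypothetical responses: the first one was cut off before the part the user needs.
truncated_page = "<html><body><div id='header'>Shop</div>"
full_page = truncated_page + "<div id='cart'>3 items</div></body></html>"

# Hypothetical markers for "the parts that matter" on this page.
markers = ["id='cart'", "</html>"]

for name, body in (("truncated", truncated_page), ("full", full_page)):
    print(
        f"{name}: default check={passes_default_check(200)}, "
        f"content check={passes_content_check(200, body, markers)}"
    )
```

Which markers count as “the parts that matter” still has to be decided by hand for each page, which is the manual scripting effort mentioned above.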

That brings up an interesting question about Application Performance Management (APM): what does End-User Experience Monitoring (EUM), a.k.a. Real-User Monitoring (RUM), measure? EUM/RUM is considered an integral part of APM (and definitely should be), but it may measure pretty different things depending on the measurement approach. And as I mentioned above, it probably won’t be the actual end-user experience – only its approximation by another metric (different for different tools).

The only thing that often saves us from all this complexity is, as often happens in performance, that in many cases it doesn’t matter. All of the metrics are just close enough from the practical point of view. In the good old times of plain HTML, the main part was getting the response from the server; the client-side part was fixed and usually small. So not much was said about different kinds of response times in the past. The situation is changing now: the front-end time has become significant (see the Performance Golden Rule by Steve Souders, keeping in mind that it is based mainly on front pages), and now it looks like we can’t ignore the differences between response times anymore.

Front End vs. Back End

June 2nd, 2011 5 comments

Following up on the topic of my previous post, Steve Thair’s comment, and his related post, I want to return to the front end vs. back end discussion.

Steve Souders in his recent interview said: “For years when developers started focusing on the performance of their websites, they would start on the back end, optimizing C++ code or database queries. Then we discovered that about 10% or 20% of the overall page load time was spent on the back end. So if you cut that in half, you only improve things 5%, maybe 10%. In many cases, you can reduce the back end time to zero and most users won’t notice. So really, improvement comes from the time spent on the front end, on the network transferring resources and in the browser pulling in those resources.”

Well, if we see that “about 10% or 20% of the overall page load time was spent on the back end” under the maximal load, this statement is a great example of applying performance engineering to problem analysis. It is definitely the first thing to do when investigating any performance issue – find where the time is spent. And, considering the popularity of WPO, it is probably the case for most modern websites with rich web interfaces and no need for transactional processing behind the scenes. But it is usually not the case for sophisticated business applications working with transactional data (although I doubt that it is exactly the case even for the moment you click on the “confirm order” button when you buy something on the Internet).
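The arithmetic behind the quoted 5% to 10% figure, and the reason it depends so much on where the time is actually spent, can be sketched in a few lines of Python. The function name is hypothetical, and the 80% back-end share is an assumed value standing in for a transactional business application under load, not a measurement.

```python
def overall_improvement(backend_share: float, backend_speedup: float) -> float:
    """Fraction of total response time saved when only the back end gets faster.

    backend_share: fraction of total time spent on the back end (0..1).
    backend_speedup: factor by which back-end time shrinks (2.0 means halved,
        float("inf") means the back-end time is reduced to zero).
    """
    if not 0.0 <= backend_share <= 1.0:
        raise ValueError("backend_share must be between 0 and 1")
    if backend_speedup <= 0:
        raise ValueError("backend_speedup must be positive")
    return backend_share * (1.0 - 1.0 / backend_speedup)


# 10% and 20% come from the quote; 80% is an assumed share for a
# transactional application under heavy load.
for share in (0.10, 0.20, 0.80):
    saved = overall_improvement(share, 2.0)
    print(f"back end is {share:.0%} of total, halved: total time down {saved:.0%}")
```

With a 10% or 20% share, halving the back end saves 5% or 10% of the total, as in the quote. With the assumed 80% share the same change saves 40%, which is why finding where the time is spent comes first.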

But one important thing I’d like to stress is that the back end should handle multiple users, while on the client side we always have a single user (this perhaps may somewhat change in the future with multiple browser tabs and parallel JavaScripts doing something all the time, but we can probably ignore this for the moment). And for a single-user performance issue you, generally speaking, run a profiler, find where the time is spent, and then need to figure out why and how it should be changed.

For the back end you have multi-user load, and back-end performance problems observable with a single user are somewhat trivial (see above about the profiler, etc.). But many performance issues may be observed only under [heavy] load. So you get one more level of sophistication on top: you need to simulate load, and you need to find a way to debug / profile under load (and most tools bring too much overhead to be used in this situation, plus issues may be related to timing, and attempts to look inside may change the behavior of the system). Plus you get system resource limitations on top of multi-user software problems (such as synchronization issues, running out of software objects, etc.), introducing non-linear effects (this is where you get to capacity management, queuing theory, etc.). All these introduce a high probability that back-end performance will degrade drastically with load (if not properly tested and configured), while the time spent on the client side for rendering and client-side processing would remain the same in most cases (although affected by server response timing). This was the primary reason why almost all attention used to be paid to the back end.
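The non-linear effect can be illustrated with the simplest textbook queuing model. The sketch below assumes a single-server M/M/1 queue, which is my choice for illustration and far simpler than any real back end. The service time and the fixed front-end time are hypothetical numbers.

```python
def mm1_response_time(service_time: float, arrival_rate: float) -> float:
    """Mean response time of an M/M/1 queue: R = S / (1 - U), with U = arrival_rate * S."""
    if service_time <= 0 or arrival_rate < 0:
        raise ValueError("service_time must be positive and arrival_rate non-negative")
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        raise ValueError("the server is saturated; the queue grows without bound")
    return service_time / (1.0 - utilization)


SERVICE_TIME = 0.1    # hypothetical: seconds of back-end work per request
FRONT_END_TIME = 0.5  # hypothetical: client-side seconds, assumed independent of load

for rate in (1, 5, 8, 9, 9.9):
    back_end = mm1_response_time(SERVICE_TIME, rate)
    total = back_end + FRONT_END_TIME
    print(
        f"{rate:>4} req/s: back end {back_end:6.2f} s, "
        f"back-end share of total {back_end / total:4.0%}"
    )
```

Under these assumptions the back end is a small part of the total at low load and dominates it close to saturation, while the front-end time does not move. A measurement taken with a single user would show only the first case.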

WPO: A New Wave of Performance Engineering?

May 31st, 2011 2 comments

Reading Who should be on your WPO team? reminded me of so many things I had read before, such as creating an SPE team or creating a Performance Center of Excellence.

It looks like we have got a completely new area of performance engineering – Web Performance Optimization (WPO), with its own terminology, approaches, experts, Web Performance meeting groups, the Velocity conference, and, perhaps, even new load testing tools like CloudTest (my impression is that it is more beneficial for WPO projects). WPO has actually been around for a while (it looks like the first Velocity conference was in 2008), but only recently, after attending a couple of New York Web Perf events, did I realize that it has become a separate discipline. I guess the appearance of this new movement concentrated on web performance means that we have a pretty mature industry of very scalable web sites delivering sophisticated content.

Well, the history of performance engineering looks like a series of waves (to me, at least, although my knowledge of its history is limited, especially for the period before I got involved). The Computer Measurement Group (CMG) was organized in 1975 as an organization of performance analysts and capacity planners. Dr. Connie Smith’s book “Performance Engineering of Software Systems”, published in 1990, created the Software Performance Engineering movement.

Distributed systems brought a new wave of performance engineering based around load testing, perhaps because there was not much instrumentation available and the only way to make sure that the system performed was to apply load. It looks like the first version of LoadRunner was shipped in 1989. But when I first got involved in load testing in 1997 with SQL Bench (SilkPerformer’s ancestor), it was still far from what we expect from load testing tools now. The latest wave was probably Application Performance Management, with a large array of tools promising application instrumentation (visibility into what is going on inside applications).

It is interesting that all these overlapping areas never completely merged. This is probably the reason why we have such discrepancies in performance terminology: every group often started its terminology from scratch (while others still used the old terminology).

And now we get Web Performance Optimization (it looks like the term was coined by Steve Souders). While WPO looks like a separate discipline, I’d rather place it as a part of overall performance engineering. You still have a back end in most cases – and while the back end is mentioned in WPO presentations, it sometimes sounds as if the authors are mentioning something trivial. Well, it is not, even for most web sites, not to mention large banks and insurance companies with many tiers of sophisticated systems in the back – and for end-user performance you need to consider it all together. Downplaying the “back end” is probably as wrong as downplaying the “front end” (of which, working mostly with business applications, I am definitely guilty – well, historically load testing concentrated on server performance). The importance of each component depends on the system. In my opinion, performance principles are much more generic than the details of specific technologies. Most performance engineering experience may be applied to any technology (you, of course, still need to learn something about the new technology too).

So, while it is very promising and exciting that we get a new wave of people dedicated to performance, it is a little sad that it often seems to start from scratch, inventing new terminology and ignoring what existed before. To me it would be better if all these waves came together to enrich each other with the area of performance engineering they specialize in. Of course, there is some interaction – well, you need to work together in some way to ensure systems’ performance – but it still looks like every wave tends to stay somewhat separate, cultivating its own terminology, approaches, and events.