
Archive for the ‘Performance’ Category

Performance vs. Scalability

April 5th, 2013

After attending Sergey Chernyshev’s (@sergeyche) Scalability vs. Performance presentation at the NY Web Performance Meetup and reading Scalability: it’s the question that drives us by Robert David Graham (@ErrataRob) and Scalability vs. Performance: it isn’t a battle by Theo Schlossnagle (@postwait), I would like to share my understanding. While I agree in general with everything said, I would word it differently. The topic has become loaded, so emphasis matters. Robert, for example, states that “performance” and “scalability” are orthogonal problems. Well, no, they are not. They are different but correlated notions, even leaving aside that both are somewhat vague terms.

Speaking about web systems, we can roughly separate response time (the main performance metric) into two components: backend (server-side) time and frontend (network and client-side) time. There are subtleties and grey areas, but I’d ignore them here. The frontend time, the subject of Web Performance Optimization (WPO), isn’t related to scalability insofar as it doesn’t involve server processing (again ignoring subtleties).

The proportion of frontend time to backend time can be anything. According to Steve Souders (@souders), the founder of WPO, 80-90% of the end-user response time is spent on the frontend. But even for major web sites the backend time for requests involving database processing (such as submitting orders or querying order status) may be more noticeable. And there are plenty of corporate web applications where the share of frontend time is rather small. Of course, the starting point of any performance troubleshooting is to find where time is spent. And it makes absolutely no sense to optimize parts where time is not spent.

However, there is one important “but”. While frontend time is supposed to be constant (another simplification, but again ignoring subtleties), the backend (server) time depends on load. The heavier the load, the larger the server time may be. And at some point it may skyrocket, making the system practically unusable. So when thinking about what you need to optimize, you need to check where time is spent under maximal load. Unless you don’t care about downtime and user experience, the way to do it is load testing.

And here we get to scalability. The frontend performance indeed doesn’t matter here and is independent of scalability. But the backend performance is directly related to scalability. The relationship, of course, may be non-linear and quite sophisticated – but it does exist.

To illustrate it, let’s consider one simple (but still typical) example. The backend processing takes X ms, the time is mainly spent in CPU, and we don’t have any other bottlenecks. In this case the server response time would be mainly CPU processing time – and every request would take X ms of CPU time (if we don’t have parallelism here). As soon as we take most of the available CPU time, server response time would skyrocket (that situation may be modeled using queuing theory). So there is a load at which the system becomes practically unusable – and the question is just when we get to this load. We, of course, may get to problems sooner if we have any scalability problem inside the system or run out of another kind of resource.
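
The example above can be sketched with the simplest queuing model. This is only an illustration under loud assumptions: a single server, random (Poisson) arrivals, and exponentially distributed service times (the classic M/M/1 queue), where mean response time is R = S / (1 - U) and utilization is U = arrival rate × S. The 50 ms value for X and the function name are hypothetical, picked just to make the numbers concrete.

```python
def mm1_response_time_ms(service_ms: float, requests_per_sec: float) -> float:
    """Mean response time (ms) of a single-server M/M/1 queue.

    service_ms: CPU time per request (the "X ms" from the text).
    requests_per_sec: offered load.
    """
    utilization = requests_per_sec * service_ms / 1000.0
    if utilization >= 1.0:
        # The server cannot keep up; the queue grows without bound.
        return float("inf")
    return service_ms / (1.0 - utilization)


if __name__ == "__main__":
    x_ms = 50.0  # hypothetical value for X
    for rate in (2, 10, 16, 18, 19, 19.8):
        utilization = rate * x_ms / 1000.0
        response = mm1_response_time_ms(x_ms, rate)
        print(f"{rate:>5} req/s  CPU {utilization:4.0%}  response {response:8.1f} ms")
```

In this model the response time is twice X at 50% CPU utilization and about ten times X at 90%, which is the “skyrocket” effect described above. Real systems rarely match the M/M/1 assumptions exactly, so treat the numbers as a shape, not a prediction.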

Generally speaking, you can increase scalability either by optimizing server processing (using fewer resources) or by providing more resources, provided, of course, that your architecture can use these additional resources. So scalability mainly boils down to the ability to parallelize your processing (and is often limited by what you can’t fully parallelize, such as a centralized database).
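
One common way to reason about the limit imposed by the part you can’t parallelize is Amdahl’s law. The post doesn’t name it, so take this as an illustrative sketch only: it assumes a fixed fraction of the work is perfectly parallel and ignores coordination overhead. The function name and the sample numbers are hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of the work can be parallelized."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical case: 10% of the processing goes through a centralized database.
    for workers in (1, 2, 10, 100, 1000):
        print(f"{workers:>5} servers -> speedup {amdahl_speedup(0.9, workers):.2f}")
```

With 10% serial work the speedup can never exceed 10, no matter how many servers are added, which is why the non-parallelizable part tends to dominate scalability discussions.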

We do have two parts of response time – frontend and backend – which behave differently and may need different approaches and tools to optimize. But the end-user experience is the sum of these two parts – where the backend time is a function of load. You can’t say much about your end-to-end performance and its backend part until you check it under load – and load testing is the safe way to do so.

Historically, performance engineering concentrated on the backend – where the main performance and scalability issues were – and practically ignored the frontend (which indeed was usually pretty straightforward then). Several sub-disciplines were formed, including performance analysis, capacity planning, and load testing. Later, when the sophistication of the frontend skyrocketed, a whole new discipline was established by Steve Souders and quickly grew around Velocity and Web Performance meetups. Unfortunately, it practically dismissed the performance engineering developments of the last 40 years (maybe even more – the Computer Measurement Group (CMG) was founded in 1975). While frontend WPO definitely has its own specifics, I’d still expect to see a holistic approach to performance engineering, taking into account all aspects of performance and scalability end-to-end.

APM, Analytics, and Industry Trends

February 11th, 2013

Just read Gartner Q&A: Analytics vs. APM with Will Cappelli, Gartner Research VP in Enterprise Management, part one, part two, and part three about his latest report: Will IT Operations Analytics Platforms Replace APM Suites?

The discussion is very interesting and informative and touches on areas that have interested me for a long time, but the title doesn’t make sense to me. Maybe I have just fallen behind in terminology – but here is how I understand it.

I joined Hyperion in 1999 and, with one break, did, in a way, performance management of performance management applications. I was responsible for application performance of business performance management and business intelligence applications. At this point you can probably guess some questions that came to my mind. Can we use anything from application performance management in business performance management? Or, vice versa, can we use business analytics software for application performance analysis? Well, I still haven’t figured out a meaningful way to do it.

I am also involved in CMG (a worldwide organization of IT professionals specializing in performance, capacity, and service management), where there was an idea that the skills of application performance and capacity management may be used for business performance and capacity. I waited with interest to see whether the best experts in application performance and capacity management (and CMG is the place where you find them) would be able to come up with interesting ideas for using their skills for business (beyond the trivial fact that performance impacts business). Still waiting.

While these two areas, application performance and business performance, look very similar and, of course, often use the same underlying approaches and math, the devil is in the details. They are just two different areas and not much can be reused between them (as soon as we get to the details).

Reading the discussion, I came to one explanation of what is going on. It also explains one more phenomenon that has surprised me for a while: there are many new companies on the market providing only end-user monitoring (EUM, or real-user monitoring, RUM) and apparently doing pretty well. While EUM is definitely a very important APM area, my understanding was that it is just one part of APM and, without the other parts, provides very limited APM value. My understanding was that we should rather move toward integrated APM solutions. Still, the impression is that some of these EUM companies are doing even better than companies providing integrated APM solutions.

Well, the explanation is that web analytics (even with EUM) is just a completely different area from APM (although EUM is an important part of APM too). Web analytics is a business application (and yes, end-user performance is one of the components of business analysis now) and APM is an “IT” application. If we agree on this, it explains all these strange titles about the death of APM or the replacement of APM by analytics. End-user performance with web analytics became a business intelligence application – and the business intelligence market is many times larger than the IT intelligence market (where we still have all the true APM solutions, with all their problems, because APM is, in essence, a much more technically sophisticated area than business analytics). It was always this way – recall the market size of business intelligence (Hyperion, Cognos, Business Objects, etc.) vs. the market size of companies providing monitoring and deep diagnostics solutions.

Business intelligence/analytics moved to a new level: from sales analysis we are getting to user click analysis. And there we have another level of data volume (that is, big data), and we are getting to the point where end-user performance is just one more dimension of business data. Performance impacts business – and business wants to analyze it. Well, in application performance management we have worked with “big data” for ages – and had you heard about it as a world problem before it got into the business realm?

But nothing changed in APM – if we find a way to separate out these business-related applications, I guess we would find that the APM market is slowly growing and that APM applications are slowly becoming better and better. To manage your applications, you need an APM solution, not web analytics and not EUM only (although EUM is important for APM too).

As I highlighted in my performance requirements article (here is the latest version, just presented at CMG’12), we have two completely different views: the business view and the “IT” view. Business is interested in one (and only one) performance metric – end-user response time. That’s it, nothing else at all. All other stuff – throughput, resource utilization, bandwidth, latency, etc. – is for IT internal use only (even if “IT” is the core of the company, as with many web services). Yes, business people know these words too – they hear them all the time as explanations of why we can’t get the requested performance or why we need to spend another pile of money on something IT needs – but if “IT” delivered the requested performance, they would be quite happy not even knowing these words. And that is exactly what we see here: we have solutions for business (EUM – which is much simpler than integrated APM solutions and provides everything business needs) and solutions for “IT” (with all the stuff you need to manage application performance – but much more sophisticated and difficult to implement).

Categories: APM, Performance

Performance Engineering: Historical View

January 8th, 2013

It is interesting to look at how the handling of performance has changed over time. Performance probably went beyond single-user profiling when mainframes started to support multiprogramming. It was mainly batch loads, with sophisticated ways to schedule and ration consumed resources as well as pretty powerful OS-level instrumentation that made it possible to track down performance issues. The cost of mainframe resources was high, so there were capacity planners and performance analysts to optimize mainframe usage.

Then the paradigm changed to client-server and distributed systems. Available operating systems had almost no instrumentation or workload management capabilities, so load testing became almost the only remedy, in addition to system-level monitoring, for handling multi-user performance. Deploying across multiple machines was more difficult and the cost of rollback was significant, especially for Commercial Off-The-Shelf (COTS) software, which may be deployed by thousands of customers. Load testing became probably the main way to ensure performance of distributed systems, and performance testing groups became the centers of performance-related activities in many organizations.

While cloud looks quite different from mainframes, there are many similarities between them, especially from the performance point of view: availability of computer resources to be allocated, an easy way to evaluate the cost associated with these resources and implement chargeback, isolation of systems inside a larger pool of resources, and easier ways to deploy a system and pull it back if needed without impacting other systems.

However, there are notable differences, and they make managing performance in the cloud more challenging. First of all, there is no instrumentation on the OS level and even resource monitoring becomes less reliable. So all instrumentation should be on the application level. Second, systems are not completely isolated from the performance point of view and they could impact each other. And, of course, we mostly have multi-user interactive workloads, which are difficult to predict and manage. That means that such performance risk mitigation approaches as APM, load testing, and capacity management are very important in the cloud.

It is interesting that while performance is the result of all design and implementation details, the performance engineering area remains very siloed. Those who do capacity planning are usually not involved much in performance testing or software performance engineering. The newest and fastest growing group, web performance specialists, remains mainly isolated from other performance-related groups. People and organizations trying to bring all performance-related activities together are few and far between.

I don’t see that the need for specific performance-related expertise, such as load testing or capacity planning, is going away. Even in the case of web operations, we would probably see load testing coming back as soon as systems become more complex and performance issues start to hurt business. There would perhaps be less need for “performance testers” than in their heyday, due to better instrumentation, APM tools, continuous integration, resource availability, etc. – but I’d expect more need for performance experts who would be able to see the whole picture using all available tools and techniques.

Why do I believe that everybody interested in performance should come to CMG’12?

November 7th, 2012

CMG’12 is an annual conference organized by the Computer Measurement Group – a volunteer organization of professionals specializing in performance, capacity, and IT service management. This year it is held in Las Vegas, December 2-7, 2012.

Why do I love CMG, spend a lot of my time organizing and promoting it, and come there every year (sometimes on my own)? Well, because I believe that it is the best (and actually the only) conference on performance and capacity, the main topic of my interest for the last fifteen years. There are many conferences on specific topics. For example, the Velocity conference, devoted to web performance, is significantly larger and more popular – but it is still devoted mainly to single-user web performance, leaving all other performance and capacity questions to CMG. Let me share some of my excitement – of course, from my personal point of view (there are plenty of other highlights, but I am mentioning only the ones that are close to my heart).

This year the conference covers all aspects of performance (well, almost all – performance is such a sophisticated subject that there is always much more to learn) from Web Performance Optimization (the conference opens with a keynote by Patrick Meenan, a web performance guru at Google and the creator of WebPagetest) to mainframe performance (and everything in between).

The conference starts with half-day workshops (see the description here). In addition to workshops, there are CMG-T sessions during the whole conference. Each CMG-T class spans 2 or 3 session slots, so it could easily be considered a workshop or a training class. All are led by renowned experts with tons of experience; you would hardly get anybody even remotely close in a typical vendor class (not to mention the unique vendor-neutral perspective you hardly find anywhere else). The CMG-T track runs through the whole conference and every class is a gem:

  • Capacity Planning by Ray Wicks
  • z/OS Basics by Glenn Anderson
  • Java Performance Analysis and Tuning by Peter Johnson
  • Model and Forecasting Basics by Dr. Michael Salsburg
  • Network Performance Management by Manoj Nambiar
  • Windows System Performance Management and Analysis by Jeffry Schwartz
  • Using SAS to Communicate Your Message by MP Welch

CMG’12 has 4 keynote/plenary sessions and almost a hundred regular track sessions running from mid-Monday to mid-Friday. The conference is 5 tracks wide. One track, as I already mentioned, is CMG-T 101-type classes (with 301 depth). The other four tracks are shared between five subject areas: Performance Engineering and Testing, Capacity Planning, Application Performance Management, IT Service Management, and Hot Topics. It is difficult to list all the highlights – there are too many. While I know many great presenters and am fascinated by many topics, commenting on every single one would take too much time and space. You probably just need to look at the agenda – there are three different views: the preliminary agenda (overview, a day on a page), a list of abstracts in a single pdf document, and a search/scheduler (click on the abstract number to see the abstract).

One track on Wednesday is a Michelson award track. CMG has presented the Michelson award since 1974 (if you wonder, Albert Abraham Michelson was known for his technical accomplishments in measuring the speed of light and for his role as teacher and inspirer of others – and measuring is the key to performance). This year we will see many Michelson winners presenting: Dr. Connie Smith, the founder of Software Performance Engineering; Dr. Daniel Menasce, the author of many great books about performance and capacity planning; Adam Grummit, the author of the great Capacity Management book (ITSM Library) and the CMG president; Dr. Pat Artis; Bruce McNutt; and Dr. Michael Salsburg.

I believe that the main advantage of attending CMG is networking with the world’s best experts in almost all areas of performance and capacity. Nowadays you can find all technical information on the Internet, but there is no substitute for face-to-face conferences to learn how to use it and what people’s experiences were, and, of course, to see the whole picture. Especially in performance: performance is the result of every design and implementation detail and you need to be learning all the time to keep up with coming challenges.

I am presenting there too: Load Testing: See a Bigger Picture on Thursday and Performance Requirements: the Backbone of the Performance Engineering Process on Friday. Nothing compared to other CMG’12 highlights, but I hope to trigger discussions around these two very important topics.

And, of course, it is Las Vegas – and Rio’s rate is $55 per night until November 14th. See you there!

Two Main Challenges of Performance Modeling and System Sizing

September 10th, 2012

Reading about performance modeling / simulation and system sizing, you often see two completely opposite views of the subject. Either authors describe in detail how you can model performance using some math, and you may feel that as soon as you comprehend that math, you won’t have any problem with modeling. Or authors say that it is black magic and you’d better stay away from it or do it in a minimal way with simple trending (while you probably won’t see that view in serious books, it can often be seen in Internet discussions).

The truth, as usual, is in the middle. Modeling is very helpful and works well if you use it properly and understand its limitations. And there are two main challenges here that rarely get highlighted – while everybody who wants to approach the topic should understand them clearly.

The first challenge is that modeling works well for known resource limitations. You should know these limitations in advance (and how your system uses that limited resource – which is also a challenge, but a more technical one). For example, if your system is processor-bound and you know how much CPU it takes per transaction, you may build a rather simple and pretty accurate model (using queuing theory or even something simpler – for example, if you stay away from heavy CPU utilization, a linear model may work well with multi-processor systems).
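
A minimal sketch of such a linear model for a processor-bound system follows. It assumes that CPU cost per transaction stays constant and that work spreads evenly across cores; the function names, the 20 ms cost, the 4 cores, and the 70% utilization ceiling are all hypothetical values chosen for illustration.

```python
def cpu_utilization(tx_per_sec: float, cpu_ms_per_tx: float, cores: int) -> float:
    """Estimated average CPU utilization (0..1) under a linear model."""
    return tx_per_sec * cpu_ms_per_tx / 1000.0 / cores


def max_throughput(cpu_ms_per_tx: float, cores: int,
                   target_utilization: float = 0.7) -> float:
    """Transactions per second that keep CPU at or below the target utilization."""
    return target_utilization * cores * 1000.0 / cpu_ms_per_tx


if __name__ == "__main__":
    cpu_ms, cores = 20.0, 4  # hypothetical measurements
    for load in (50, 100, 140):
        print(f"{load:>4} tx/s -> CPU {cpu_utilization(load, cpu_ms, cores):.0%}")
    print(f"capacity at 70% CPU: {max_throughput(cpu_ms, cores):.0f} tx/s")
```

The utilization ceiling is the “stay away from heavy CPU utilization” condition: below it the linear estimate is usually reasonable, while above it queuing effects start to dominate and a linear model stops being trustworthy.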

But that model would never tell you when you run out of another resource and run into another kind of bottleneck until you build it into the model. And beyond a few common resources (processor, memory, disk, network) and explicitly introduced throttling, you usually don’t know about bottlenecks until you run into them. This is the primary reason that the results of your model (which may be perfect from the mathematical point of view) are not reliable if you model a significantly higher load than you tested / validated – because there is a high probability that you will run into another bottleneck you are not taking into consideration now. However, the model would provide the best possible case (which turns out to be true when you fix all the other bottlenecks you didn’t take into consideration at the moment of modeling), which is important information by itself. A model would also be very useful to see if the system behaves up to expectations – or if there are internal issues degrading performance and preventing scalability (which may not be so trivial to catch in complex systems).

Another challenge is a lack of performance-related metrics of hardware to use in modeling. You can find detailed hardware specifications, but they won’t tell you how fast your systems would work on this hardware. As far as I understand, the only relatively objective approach (without testing the real system on the real hardware – which is, of course, the best) is to use existing benchmark results to compare performance (keeping in mind that they represent results of this specific benchmark, not your systems). Most serious commercial modeling tools come with a library of hardware configurations and their performance metrics, allowing what-if performance analysis. It looks like keeping such libraries is a pretty time-consuming task and their quality may differ. Such a library is usually a major advantage of commercial modeling tools in comparison with free or inexpensive modeling tools (which may be quite good from the mathematical point of view, but you need to provide all numbers yourself).

IDC made an interesting move here, introducing QPI (Qualified Performance Indicator) as a part of IDC’s Server Decision Suite Metrics (free 30-day trial available). It is a kind of independent performance library that may be used for proper performance modeling / sizing (and, as far as I understand, it goes well beyond performance, integrating this information with other IT-related metrics such as price, power, and size – it should be a very interesting optimization task to find the best hardware configuration based on all these metrics).

CMG’12 Call for Papers and Workshops – The Best Independent Performance and Capacity Conference

May 18th, 2012

The Computer Measurement Group (CMG) calls for papers and presentations for CMG’s 38th International Conference to be held in Las Vegas, Nevada, December 3rd through 7th, 2012.

The 2012 CMG conference will cover all areas of systems management, including but not limited to: capacity planning, IT service management, application performance management, performance engineering and testing, as well as the latest developments in the overall field of computer performance evaluation. See the Call for Papers and Call for Workshops for details.

CMG has been the source of unbiased and objective expert information and practical, real-life experiences across all computing platforms in the computer industry for over 30 years. Share your knowledge and experiences: write a paper and submit it for presentation at CMG’12.

Papers are categorized as Introductory, Tutorial, Advanced, or User Experience. I want to especially encourage all of you to consider writing a User Experience paper. Every year, the conference evaluations show a common theme: “More User Experience Papers, please!” You don’t need to be one of the field’s superstars to write one; in fact, they seem to work better from people who are just working in the field, in non-IT companies and government bodies. Just tell us what problem you faced, how you went about figuring out what the cause was, and how you dealt with it. Mentors are available for writing assistance, and may be requested at any point in the writing process, including before the paper is started. Just write mentor@cmg.org and ask.

Please take the time to participate in the CMG’12 program. It will be rewarding for both authors and attendees, and as we all share our knowledge we all become more complete professionals.
Paper submission through the CMG website is now available. For more information go to paper submission and workshop submission.

The deadline for paper submissions is June 8, 2012.

Please send questions to CMG’12 Program Chair, Bill Jouris at cmgpc@cmg.org.

Performance Dimension of Information Technology

April 16th, 2012

There are no standards for titles and skill sets related to the performance dimension of IT. I decided to put together how I understand them (most terms are vague, so it is quite possible that other people understand them differently). Of course, it is a simplification – but the topic is probably too heavily influenced by the history and politics of every particular organization to be clear-cut anyway.

I still think that we can break the whole area into three major categories: design (and development), testing, and production (maybe somewhat matching the ITIL terms Service Design, Service Transition, and Service Operation). The term Performance Engineering may refer to the whole area (or maybe only to the design category – in this case it is sometimes referred to as Software Performance Engineering, SPE).

Performance Design. Talking about the design category (I use the term ‘Performance Design’ to group all performance-related activities during design and development, although it isn’t normally used this way – probably reflecting that the whole area doesn’t quite exist as a separate discipline), we have specific areas of performance engineering knowledge for each specific technology, such as Java performance, .Net performance, etc. One relatively new, but large and popular area is Web Performance Optimization, covering end-user Web performance. And, of course, we have Software Performance Engineering (SPE) trying to establish generic approaches – although SPE progress hasn’t been too impressive since Dr. Connie Smith published ‘Performance Engineering of Software Systems’ in 1990.

It is definitely supposed to be an important part of the skill set of software architects (on a higher level, SPE, etc.) and software developers (maybe on a lower level, how to efficiently design a specific component using the chosen technology – but a good understanding of high-level performance engineering won’t hurt either).

And while many architects and developers have some understanding of performance, often the main stress is on functionality and deadlines, so performance is left to the very end – where sometimes it may indeed be tuned in (usually when technologies are mature and the team is quite experienced), and sometimes it requires major changes (and late changes are very expensive).

It looks like the idea of having an explicit person responsible for performance from the beginning (starting from requirements) and working with other architects and developers to build it in makes sense. The title may be ‘performance architect’ or ‘performance champion’. Such people are rare, though – more often we see a proactive person from a performance engineering or performance testing group trying to ask performance questions early.

Performance Testing. This includes, of course, all other variations and names, such as load, stress, and endurance testing. The matching ITIL term would probably be Service Validation and Testing. It covers all ways to apply synthetic load to the system and analyze the system’s behavior. In the narrow sense, a ‘performance tester’ is responsible for creating and applying such load (test scripting and execution). In a wider sense, it also includes workload characterization (workload modeling), performance analysis, and performance troubleshooting – and often such a person is referred to as a ‘performance engineer’. In some cases they are different people: the performance tester is responsible for applying the load and the performance engineer (maybe performance analyst in this case) is responsible for system analysis and optimization.

I definitely put performance testing in a separate category due to the specific set of skills required: workload generation. And, perhaps, techniques to find and fix issues in the system by applying an appropriate workload. But definitely not because “testing should go after development before production”, as it used to be in the waterfall approach – testing should start as early as possible, mostly overlapping with development, and may continue in production. Monitoring the system using a synthetic workload, for example, I’d rather also put in this testing category – it is actually testing the production system in parallel with the production workload.

Performance Management, perhaps, may be a good name for the collection of performance-related activities and skills in production (and around).

It is interesting that ITIL places the Capacity Management and Service Level Management processes into Service Design. I see a point here – you definitely need to allocate capacity before deploying the system, and Service Levels should probably come directly from the performance requirements. Still, the real people working in these areas are usually part of operations. Capacity Planners are responsible for allocating resources, although fewer and fewer people have such a title and these responsibilities get spread between other groups (which, unfortunately, often don’t have the appropriate skills).

Service Level Management would probably be handled by Performance Monitoring (Analysis). The matching ITIL term would probably be Service Measurement. The title ‘Performance Analyst’ was often used in the past, but is not very popular anymore. The title ‘Performance Engineer’ is probably more popular now. And, of course, it may be specialized, like Database Monitoring, System Monitoring, or Application Server Monitoring. These may be done by the respective administrators (DBA, system administrator, etc.).

Application Monitoring is relatively new stuff. It is usually referred to as Application Performance Monitoring. The idea is to measure application-specific metrics (including business-related metrics, end-user metrics, etc.) in addition to the system-level metrics that used to be measured earlier. The importance of application monitoring is definitely growing. First, system-level metrics become less relevant in today’s infrastructure with virtualization, multi-tenancy, cloud, etc. Second, systems become so complicated that trying to figure out what is going on using low-level metrics becomes a nightmare. Third, full monitoring from the business point of view becomes a business requirement – and this is where IT can provide a unique business advantage.

Application Performance Management (APM) would probably be the right category to encompass most production-related categories such as Performance Monitoring, Capacity Management, Diagnostics (troubleshooting), and Tuning (and Optimization – although this may somewhat get into the re-design category). We are probably not there yet, and Application Performance Management is a vague vision rather than reality. Gartner, for example, stresses that APM is Application Performance Monitoring, not Management. And I am not sure what the title of the person doing this would be. Management is a favorite word for an area of expertise (as in Performance Management or Capacity Management), but Manager (at least in the US) still means a person who manages other people. So the title, I guess, would be the same ubiquitous ‘performance engineer’.

Performance Troubleshooting or Diagnostics is definitely an important part of Performance Management and is an application of performance engineering to existing performance issues. While it is probably the most typical performance-related activity at many corporations, very few have anything formal around it, and usually all other performance-related groups get involved. And we need performance engineering skills to investigate and fix performance problems in production.

It looks like in the new generation of Web companies monitoring and capacity planning are often included in ‘Site Reliability’, adding, I guess, some confusion to the already existing mess of terms and notions.

P.S. By the way, the only conference covering almost all topics mentioned above is CMG. The call for papers and workshops is open now.

The Main Performance Problem

March 22nd, 2012 3 comments

Dennis Drogseth’s post The Many Dimensions of User Experience Management (UEM) is very indicative of the main problem we have in performance: people think about many small, specific performances, but we have just one PERFORMANCE. It depends on many different components and manifests itself in many different ways, but any attempt to decompose it results in silos and loses some important parts of the whole.

From the post: When we asked “What is your primary driver?” Better application performance and triage came in fifth, with only 13% of the votes. Employee productivity topped the list at 23%, followed by business competitiveness and/or revenue at 20%. Better support for services delivered over the network came in third, and brand protection and customer satisfaction came in fourth.

Well, ask business users, for example, about JVM performance, and it probably won’t make the first hundred issues they care about. Does it prove anything? No. They care a lot about it if they use J2EE systems; they just don’t know it (except maybe the few most curious).

“Employee productivity” heavily depends on application performance. “Business competitiveness and/or revenue” is related to application performance. “Better support for services delivered over the network” – not sure what it means, but performance also comes to mind. “Customer satisfaction” – performance is a pretty major component. And even “brand” may be impacted quite a bit by bad performance. Probably business users (and not only business users) don’t care much about performance when it is good, but as soon as performance degrades, it immediately jumps to the top of everybody’s priority list.

I, of course, don’t want to say that performance is the main thing in business – if you don’t have any business, you may not be concerned with performance. But as soon as you do, application performance will impact all parts of your business. You just notice it only when it is bad (and that usually happens soon if you don’t take care of it).

Then the post says: Similarly, when we wanted to understand which organizations or groups within IT and the business were behind UEM or QoE, the Help Desk/End User Support came in first, Customer Experience Management came in second, and Applications Management and Network Operations were tied at third and fourth place.

And when asked which organization is likely to DRIVE the overall QOE/UEM initiative, the first five groups were: Line of Business, Customer Experience Management, Process Management and Compliance Professional, Help Desk, and Service Management.
Applications Management came in seventh, one percentage point after Infrastructure Management!

Yeah, that exactly proves the point: there is no organization or group responsible for performance today. Not sure what “Application Management” is (I don’t recall seeing such a group – app admins?). And it is not surprising that people don’t pick such a group to drive such an initiative – I guess the perception is that such groups are IT geeks doing something with computers, not caring about business, and starting to do something only when told by the CIO to fix it (which, unfortunately, is often close to the truth).

How does it relate to the concept of Application Performance Management (which is rather a concept for the moment)? It just proves that it doesn’t exist in practice (at least in its ideal form). Usually there is no organization responsible for it (as a holistic concept, in conjunction with business).

What are end-user response times (what EUM monitors)? They are external symptoms of application performance. The only part of application performance end users care about. The tip of the iceberg. If we are saying that we want to manage application performance, would end-user response times be part of it? I have no doubt they would. Otherwise the whole concept doesn’t make sense.

The post states: User Experience Management also has strong business impact, governance, service level and user productivity implications that transcend performance management. Yes, performance has “business impact, governance, service level and user productivity implications”.

So the data provided in the post, in my opinion, proves two things: business cares about performance a lot, but there is no reliable structure in place to take care of end-to-end performance.

Actually, I am rather confused by the term User Experience Management. I understand what User Experience Monitoring or End User Experience is (the latter is usually used in the context of measuring response times). But how would you manage it? You may manage your applications and systems, which would improve response times. Unless you are just saying that you want to use the name User Experience Management as an umbrella name covering everything related to performance (including APM, Capacity Planning, etc.) – which may be an option, but it doesn’t look like it is used this way. Or maybe User Experience Management is used as a wider term including usability, UX (User eXperience), etc., which usually relate to UI design? If yes, then it indeed includes important factors not related to performance and only partially overlaps with APM – but then I am not sure why we compare EUM with APM.

Ian Molyneaux’s post The Case for the CPO brought the topic of a person responsible for performance to its extreme. Great idea, but… How far are we from there? Forget the CPO; what about just having a person (or persons) responsible for end-to-end performance and for building up the process assuring such performance? Look at job postings – has anybody seen any position saying that we need a person to drive performance in our organization (and meant it)? I haven’t. All positions are for a specific silo team or for consulting. So it looks like it will be a while until we see a more holistic approach to performance (whatever name is used for it).

Response Times: Digesting the Latest Information

March 13th, 2012 No comments

Returning to the discussion around my post How Response Times Impact Business? and recent publications about the topic, like For Impatient Web Users, an Eye Blink Is Just Too Long to Wait.

From one side, we see more and more statements that response times should be shorter and shorter. For example, both Scott Barber and Harry Shum (“a computer scientist and speed specialist at Microsoft” according to the New York Times article) state 250 milliseconds as the magic number for response times (although I am not sure where these 250 milliseconds came from).

From another side, the three psychological thresholds and other considerations I referred to in my post were based on multiple studies and definitely make sense.

Well, I would definitely prefer to see recent research on the subject. It is strange that there has been a lot of research since 1968, but none of it recent. And this is at a time when really big money is involved. Or is there some research that just doesn’t get released?

Meanwhile one explanation may be that perception about [at least] simple web navigation is changing. Web response times were defined by the second threshold: users feel they are interacting freely with the information (1-5 seconds). They notice the delay, but feel that the computer is “working” on the command. Well, maybe users don’t feel anymore that the computer should “work” [at least] for simple web navigation. Maybe they now perceive it as described by the first threshold: instantaneous (0.1-0.2 second). Users feel that they directly manipulate objects in the user interface. So while these psychological thresholds are still correct, perception of [at least] simple web navigation is changing and it gets defined by another threshold. Just a speculation, of course – it would be interesting to see any research to prove or disprove it.

In a similar classification, Steven Seow in Designing and Engineering Time: The Psychology of Time Perception in Software defines four classes of responsiveness based on user expectancy:

  • Instantaneous: 0.1 to 0.2 seconds
  • Immediate: 0.5 to one second
  • Continuous: two to five seconds
  • Captive: seven to ten seconds

So in a way he breaks the middle threshold into two classes: immediate and continuous. If we accept this division, we perhaps may say that user expectations [at least] for simple web navigation are moving from the continuous class to the immediate class. That maybe makes more sense to me: I am still rather skeptical that we indeed need a 250 ms end-user response time (of course, if we talk about server response time, it would be another story).
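
To make the classes listed above a bit more concrete, here is a minimal Python sketch that maps a measured response time to one of them. The function name, the handling of the gaps between the published ranges (for example, 0.2 to 0.5 seconds) and the label for anything above ten seconds are my assumptions for illustration, not part of Seow's classification.

```python
# Upper bounds (in seconds) taken from the four classes listed above.
# Assumption: a response time that falls into a gap between two ranges
# (e.g. 0.3 s) is assigned to the next slower class.
SEOW_CLASSES = [
    (0.2, "Instantaneous"),
    (1.0, "Immediate"),
    (5.0, "Continuous"),
    (10.0, "Captive"),
]


def responsiveness_class(seconds: float) -> str:
    """Return the responsiveness class for a response time in seconds."""
    if seconds < 0:
        raise ValueError("response time cannot be negative")
    for upper_bound, name in SEOW_CLASSES:
        if seconds <= upper_bound:
            return name
    # Hypothetical label: the classification above stops at ten seconds.
    return "Beyond captive"


if __name__ == "__main__":
    for t in (0.15, 0.25, 0.8, 3.0, 8.0, 12.0):
        print(f"{t:>5} s -> {responsiveness_class(t)}")
```

Under these assumptions a 250 ms response time lands in the immediate class rather than the instantaneous one, which is in line with the speculation above.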

Increasing Transaction Rate: a Valid Performance Testing Technique

March 12th, 2012 3 comments

I just read a very good post, Concurrency of Users Vs Increasing Transaction Rate by Jason Buksh, about a pretty old question in load testing: whether you can use fewer virtual users by increasing the transaction rate.

Jason’s post discusses the subject in great detail. One more thing worth mentioning is that the comparison between these two, say, approaches makes sense only when the system handles the load well (low response times, etc.). As soon as the system starts to slow down, the load becomes quite different (since a small number of users can have only a limited number of requests in flight at any moment).
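
A rough way to see this effect is the standard closed-loop relationship between the number of users, response time and think time (throughput = users / (response time + think time)). The Python sketch below is only an illustration: the two scenarios and all the numbers in them are hypothetical, chosen so that both generate about the same load while the system is fast.

```python
def throughput(users: int, response_time: float, think_time: float) -> float:
    """Requests per second generated by a closed-loop load test.

    Each virtual user waits for the response and then pauses for think_time,
    so one user issues one request every (response_time + think_time) seconds.
    """
    return users / (response_time + think_time)


# Hypothetical scenarios: both target about 100 requests per second
# when the system responds in 0.1 seconds.
scenarios = {
    "1000 users, 9.9 s think time": (1000, 9.9),
    "10 users, no think time": (10, 0.0),
}

if __name__ == "__main__":
    for response_time in (0.1, 1.0, 5.0):
        for label, (users, think_time) in scenarios.items():
            rate = throughput(users, response_time, think_time)
            print(f"R={response_time:>3} s  {label:<30} {rate:6.1f} req/s")
```

With these made-up numbers, once response time grows from 0.1 to 1 second the many-user scenario still generates roughly 92 requests per second, while the few-user scenario drops to 10. So the two tests are equivalent only while the system is keeping up.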

Still, I prefer not to think of them as two equal approaches. If you use an increasing transaction rate alone, you are definitely cutting corners. It is much better than nothing, but you still risk missing issues related to high concurrency (which are quite probable).

I’d rather think about increasing the transaction rate as an important performance testing technique to use in addition to testing with a realistic number of users. This view is explained well in the Rapid Bottleneck Identification – A Better Way to do Load Testing white paper. Actually, it was originally an Empirix white paper, and there was a discussion about it back in 2005.

So increasing the transaction rate leaves high-concurrency risks uncovered, and it is always better to test the full concurrency in the end. But increasing the transaction rate may be a good technique to speed up performance testing and catch some issues earlier.