This is the second in a series of entries about Performance Testing in SharePoint.  You can read Part I here.

To recap, we want to test the performance of a SharePoint intranet, so we created a set of user profiles and determined how often each profile will be executed.  Then we ran a set of tests using those profiles and made some performance improvements to the system.  However, we have yet to answer the question of whether our system is powerful enough to actually handle our user load.  Worse yet, some of our test results might have been reviewed by the business users, who saw that page load time with, say, 300 users was over 30 seconds.  Now panic has set in, because they think the system won’t handle more than 300 users and we have 10,000!

Avoiding the First Pitfall of Testing – Concurrent versus Total Users

We have been trapped by the first pitfall that performance tests usually encounter: confusing virtual user load (or vUsers) with the number of users who can access the system.  If we run a test with 100 vUsers, that does not equate to 100 users on the system.  Since the test rigs are going to be configured to make as many hits as possible on the system, it more closely resembles 100 concurrent users hitting the system.

So, if we have 100 concurrent users hitting the system, how many users does that map to?  That is a darn good question.  To answer that, we need to do one of two things.  The first is to take our current user base and determine how often they click into the site being tested in a given period of time (say an hour).  That gives us the Requests Per Second (RPS) and a target for our performance testing.

So, let’s say that we have 10,000 users that are going to hit a system.  If we look at their current usage and determine that between 8AM and 9AM (the heaviest usage period) there are about 75,000 hits to the system, we can work backwards to an RPS of about 20.8 (75,000 hits / 3,600 seconds).  Then we want to ensure that our system will handle approximately 21+ RPS at an “acceptable” page load time.
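
As a quick check of that arithmetic, here is a minimal Python sketch.  The function name rps_from_hits is hypothetical; the hit count and the one-hour window are the ones from the example above.

```python
def rps_from_hits(hits: int, window_seconds: int) -> float:
    """Average requests per second over a measured window."""
    return hits / window_seconds

# 75,000 hits observed between 8AM and 9AM (one hour = 3,600 seconds)
peak_rps = rps_from_hits(75_000, 3_600)
print(f"Target RPS: {peak_rps:.1f}")  # Target RPS: 20.8
```

Note that this is an average over the hour.  Traffic inside the hour is unlikely to be perfectly even, which is one reason to test a little above the computed number (the 21+ mentioned above).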

But what do we do when we don’t have empirical data that can be used to measure hits to a system?  The simple answer is that we estimate average user activity and active user load on the system and use that to back into an RPS number.  In this case, take your total number of users (10,000) and estimate, based on their roles, how many of them will be actively using SharePoint at any given time.  In the first client example, where SharePoint is the primary application that users rely on to get their work done, that percentage will be fairly high (say 50%), but for most implementations it will be much lower (say 10%).

Company     Total users     Active users (%)     Active user count
Company 1   10,000          50%                  10,000 * .50 = 5,000
Company 2   10,000          10%                  10,000 * .10 = 1,000

That means that at any given moment for our two companies, the first will have 5,000 active users, and the second will have 1,000 active users.

Once we have an estimate of active users, we next need to determine how often a user actually clicks on a link and expects a result.

For Company 1, we can look at usage patterns and determine that the average time spent looking at a retrieved document before the next search is about 2 minutes.  This means that each of the 5,000 active users is clicking and searching every 120 seconds.  That translates into 5,000 / 120 seconds = 41.67 requests per second.

For Company 2, our usage scenario is a bit different.  They are really browsing for information, so while fewer requests are made, they are made a bit faster.  We estimate that users will spend 60 seconds per request, so our 1,000 users / 60 seconds = 16.67 requests per second.
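
The two estimates above follow the same pattern, which can be sketched in Python as below.  The function names are hypothetical, and treating the time between clicks as a single average is a simplifying assumption; the percentages and timings are the ones used for the two companies.

```python
def active_users(total_users: int, active_fraction: float) -> int:
    """Estimated number of users actively using the system at any moment."""
    return round(total_users * active_fraction)

def estimated_rps(active: int, seconds_between_requests: float) -> float:
    """Each active user makes one request every `seconds_between_requests`."""
    return active / seconds_between_requests

# (total users, estimated active fraction, estimated seconds between requests)
companies = {
    "Company 1": (10_000, 0.50, 120),
    "Company 2": (10_000, 0.10, 60),
}

for name, (total, fraction, seconds) in companies.items():
    active = active_users(total, fraction)
    print(f"{name}: {active} active users, {estimated_rps(active, seconds):.2f} RPS")
# Company 1: 5000 active users, 41.67 RPS
# Company 2: 1000 active users, 16.67 RPS
```

Both inputs are estimates, so it can be worth rerunning the numbers with a higher active percentage or a shorter time between requests to see how sensitive the target is.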

Finally, we have a goal that is definite and measurable for our tests.  We know what our RPS needs to be, and that will enable us to perform the next set of tests that will attempt to determine just how much traffic our servers can handle.  It also lets us set a goal for the number of concurrent users, or vUsers, that we want to use for our testing.  For Company 1 we want to have about 40-50 vUsers and ensure that the system can maintain 40-50 RPS inside of our page load SLA.  For Company 2, that drops to 15-20 vUsers and 15-20 RPS inside of our page load SLA.
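
One way to sanity-check those vUser counts is Little's Law (concurrent requests = requests per second * seconds per request).  The sketch below assumes vUsers that fire requests back to back with no think time, and an average response time of about one second.  Both are assumptions for illustration, not measurements from these systems, and the function name vusers_needed is hypothetical.  Under those assumptions each vUser produces roughly 1 RPS, which lines up with the ranges above.

```python
import math

def vusers_needed(target_rps: float, avg_response_seconds: float) -> int:
    """Little's Law: concurrency = throughput * time per request, rounded up."""
    return math.ceil(target_rps * avg_response_seconds)

ASSUMED_RESPONSE_SECONDS = 1.0  # assumption: roughly one second per request

company_1 = vusers_needed(5_000 / 120, ASSUMED_RESPONSE_SECONDS)
company_2 = vusers_needed(1_000 / 60, ASSUMED_RESPONSE_SECONDS)
print(company_1, company_2)  # 42 17
```

If the measured response time turns out to be slower, say two seconds, the same RPS target needs about twice as many vUsers, so the vUser count should be revisited once real timings are available.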

Next, Part III, comparing Apples to Apples

References

Microsoft Performance and Capacity Estimation Guide – http://technet.microsoft.com/en-us/library/cc261795.aspx