I have found in discussing performance issues that terminology gets in the way. We get into "definitional arguments", or semantic debates that waste a lot of time. It helps to arrive at mutual agreement on the basic terms. They essentially boil down to three terms (and their variants) that are related by an important law (Little's Law).
If you find yourself in a definitional argument while discussing performance, the following should help. The three important terms are arrival rate, visit time (session duration), and concurrency level. Departure rate and throughput are concepts related to arrival rate, and in a stable system they are essentially the same quantity. Think time is closely related to inter-arrival time (the inverse of arrival rate), but seen from the client's point of view. The terms below define the different meanings of the three fundamental quantities related by Little's Law.
- Arrival Rate: The rate at which users arrive in a system. This essentially self-defining term causes a great deal of confusion when discussing testing because it is ambiguous who, or what, is arriving. In a physical system, arrival rate is determined simply by observing customer arrivals at the physical site with a stopwatch. You can sit outside the bank for an hour and count the people entering. For a load test, confusion may arise because some people want to count actual physical threads. In a test harness that equates a user with one thread (like Grinder), the rate of thread creation could be the arrival rate, or not. It depends on what the thread is actually doing. Most test harnesses use thread pooling. RPT (Rational Performance Tester) uses about 15 threads to model up to several thousand users, so counting the thread creation rate means nothing. Trying to be too smart in deciphering the meaning of arrival rate is counterproductive. The best interpretation of arrival rate, in the context of simulated user behavior, is the rate at which sessions are created.
- Departure rate: The rate at which users depart a system. Similar confusion arises when we go from physical systems to simulated systems. At a bank you can observe customers exiting and count them over a given time interval. In a simulated system, a good definition is the rate at which sessions are completed. In a stable system (at steady state) arrival rate must equal departure rate, since users cannot be created or destroyed (unless people in the bank are killed by tellers or a lot of babies are delivered while moms are in line cashing checks!). Arrival rate is not equal to departure rate during the ramp-up and ramp-down periods of a simulated test.
- Concurrency level: The number of users in the system at the same time. Again, for a physical system this is clear: it is the number of people in the bank. Some are queued waiting for a teller, some are sitting waiting for a manager, some are at the manager's desk, some are at the teller's window. Some are in the vault checking their box. Some are just lounging around reading bank literature. In a simulated system that implements the notion of a session, concurrency is the number of sessions in progress at the same time. In a system that does not implement the session concept (requests only), it is the number of requests in progress at the same time. Attempting more precise definitions of concurrency can lead to confusion. For example, defining concurrency as the number of users (threads or processes) using the CPU at the same time would lead to an average concurrency of less than 1.0 on a single-CPU system. Under that definition, the average concurrency of a single-CPU system is simply the CPU utilization. In most systems a CPU utilization of 0.70 means the CPU is quite busy. Another way to read this: in a single-CPU system, a concurrency of 0.70 is quite high, and in a 4-CPU system, a concurrency of 2.80 (the same 0.70 utilization per CPU) is quite good. This interprets "active" to mean "using the CPU". It is quite precise, but not a very useful interpretation of concurrency. Since most concurrency is simulated concurrency, the number of concurrent sessions in progress is a more meaningful measure. At some level it may be useful to define concurrency as the "number of active threads", but that may not map well to the user experience, since it is masked by thread pooling, as discussed previously.
- Active Users vs. Total Users: Concurrency discussions can get ambiguous when different people have different definitions of what "active" means. For a web site discussion, the least ambiguous definition of "active users" is the number of sessions in progress at the same time, i.e. the same as the concurrency definition. The notion of "active users" arises when we want to state more precisely whether users are actually using system resources while they are in the system. This notion is potentially ambiguous and can lead to confusion. From the point of view of a load testing tool like RPT or LoadRunner, active users, concurrent users, and concurrent sessions all mean the same thing: the number of sessions that have a request being processed. For RPT specifically, a distinction is made between active users and total users. Total users is the total number of sessions that we request from the tool (100, 200, 500, etc.). The number of concurrent sessions at any point in time is referred to as "active users". As a function of time, active users starts at 0, ramps up to a maximum, then falls back down to 0. The peak value may not reach the total number of simulated users. So you can simulate 500 users but observe a maximum concurrency of about 20. That just means that users depart (sessions complete) quickly relative to the rate at which they arrive, so few accumulate. In other words, peak concurrency depends on the average session duration. So a distinction is made between the number of users in the tool and the number of users in the SUT (system under test). Once that is clarified, it becomes clear, in the context of RPT, what "active vs. total users" means.
- Session duration: The average length of time a session is active. In a stable system, the concurrency (number of concurrent sessions) is equal to the arrival rate times the average session duration; this is Little's Law. So an arrival rate of 50 sessions per minute with a session duration of 10 minutes will cause a build-up of 500 concurrent sessions.
- Think Time: The time elapsed between the receipt of a reply from a server and the generation of a new request. Several things to note about this definition: 1) It is a client-side time interval. It may or may not affect the server greatly. It is not accurate to speak of think times when discussing the flow from the server's point of view. "Think" is something that the client does between requests. 2) The server sees gaps between arrivals, so "think time" is already accounted for by the definition of arrival rate (inter-arrival time). A great deal of unnecessary confusion can be avoided by understanding this simple point: think time is a client-side thing. 3) Over a slow network link, the total network turnaround time (request travel to the server and reply travel to the client) may dominate inter-arrival time and mask even long think times. Over very fast network links, think time may dominate inter-arrival time. But the main thing to keep in mind is that "think time" is meaningless on the server side.
- Throughput: The most confusing term of all (followed by Think Time). The most general interpretation is units of work completed per unit of time. This is also the reason behind the confusion: there is no standard definition of the unit of work. You will find OLTP folks talking about TPS (transactions per second) without bothering to define what a transaction is. Or, typically, a transaction is well defined within a given application, and people in the community of that application know, or have a tacit agreement on, or a sense of, what a transaction is. Transactions also come in all sorts of sizes and "weights": there are many light transactions and many heavy transactions. Some financial institutions talk about "revenue throughput" in dollars per hour. You also find people talking about "hits per second" or "page views per second". Requests per second and visits per second are not unusual. Telecom people like to talk about connections per second. TPS is the least meaningful unless you define what a transaction is. I find that the least confusing use of throughput is the one aligned with your definition of arrival rate. If you talk about arrival rate in sessions per second and departure rate in session completions per second, and you equate that with visits per second, then that is your throughput. If, on average, you visit 40 pages per session, then page views per second is 40 times visits per second. If a page contains 50 elements on average, then hits per second is 50 times page views per second. If you are consistent in using an arrival rate in sessions per unit time and a session duration in the same units of time, then you can easily apply Little's Law and stay clear about what you're talking about in terms of the three important metrics (arrival rate, session duration, and concurrency).
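
To make the arithmetic in the session duration and throughput definitions concrete, here is a minimal Python sketch. The function names and the dictionary keys are my own assumptions for illustration and do not come from RPT, LoadRunner, or any other tool. It assumes a stable system at steady state and that arrival rate and session duration use the same unit of time.

```python
# Illustrative sketch only: names are hypothetical, not any load tool's API.

def concurrent_sessions(arrival_rate, session_duration):
    """Little's Law: average concurrency = arrival rate x average session duration.

    Both arguments must use the same unit of time (for example, sessions
    per minute and minutes). Valid for a stable system at steady state.
    """
    return arrival_rate * session_duration


def throughput_breakdown(visits_per_unit_time, pages_per_visit, elements_per_page):
    """Convert session-based throughput into page views and hits per unit time."""
    page_views = visits_per_unit_time * pages_per_visit
    hits = page_views * elements_per_page
    return {"visits": visits_per_unit_time, "page_views": page_views, "hits": hits}


if __name__ == "__main__":
    # Figures from the text: 50 sessions per minute, 10-minute sessions.
    print(concurrent_sessions(50, 10))       # 500
    # 40 pages per session and 50 elements per page, as in the text.
    print(throughput_breakdown(50, 40, 50))  # visits 50, page views 2000, hits 100000
```

With 50 visits per minute, the same workload can be quoted as 2,000 page views per minute or 100,000 hits per minute, which is why a throughput number means little until the unit of work is named.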