Improving mod_perl Sites' Performance: Part 1
by Stas Bekman, May 29, 2002
In this series of articles, we are going to talk about mod_perl performance issues. We will try to look at as many aspects of a mod_perl-driven service as possible: hardware, software, Perl coding and, finally, the mod_perl-specific aspects.
The Big Picture
To make the user's Web browsing experience as painless as possible, every effort must be made to wring the last drop of performance from the server. There are many factors that affect Web site usability, but speed is one of the most important. This applies to any Web server, not just Apache, so it is important that you understand it.
How do we measure the speed of a server? Since the user (and not the computer) is the one who interacts with the Web site, one good speed measurement is the time elapsed between the moment when the user clicks on a link or presses a Submit button and the moment when the resulting page is fully rendered.
The requests and replies are broken into packets. A request may be made up of several packets; a reply may take many thousands. Each packet has to make its way from one machine to another, perhaps passing through many interconnection nodes. We must measure the time starting from when the first packet of the request leaves our user's machine to when the last packet of the reply arrives back there.
A Web server is only one of the entities the packets see along their way. If we follow them from browser to server and back again, they may travel by different routes through many different entities. Before they are processed by your server, the packets might have to go through proxy (accelerator) servers and, if the request contains more than one packet, the packets might arrive at the server by different routes and at different times. Therefore, some packets that arrive early may have to wait for others before they can be reassembled into a chunk of the request message that the server will then read. Then the whole process is repeated in reverse.
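As a rough illustration of measuring from the client's side, here is a minimal Python sketch that times a complete request/response cycle. The function name `time_request` and the injected `fetch` callable are hypothetical names chosen for this example. It only captures the time until the last byte of the reply has been read, so it does not include the time the browser needs to render the page.

```python
import time
from typing import Callable, Tuple


def time_request(fetch: Callable[[], bytes]) -> Tuple[float, int]:
    """Time one full request/response cycle as seen by the client.

    `fetch` is any callable that sends a request and returns the whole
    response body (hypothetical; supply your own). Returns the elapsed
    wall-clock seconds and the size of the body in bytes.
    """
    start = time.perf_counter()
    body = fetch()  # blocks until the last byte of the reply has arrived
    elapsed = time.perf_counter() - start
    return elapsed, len(body)
```

To try it against a real server, you could pass something like `lambda: urllib.request.urlopen(url).read()` as `fetch`, where `url` points at a page of your own. Running it from a machine on the other side of the network, rather than on the server itself, gives a picture closer to what the user sees.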
You could work hard to fine-tune your Web server's performance, but a slow Network Interface Card (NIC) or a slow network connection from your server might defeat it all. That's why it's important to think about the big picture and to be aware of possible bottlenecks between the server and the Web.
Of course, there is little that you can do if the user has a slow connection. You might tune your scripts and Web server to process incoming requests quickly, so you will need only a small number of working servers, but you might find that the server processes are all busy waiting for slow clients to accept their responses.
But there are techniques to cope with this. For example, you can deliver the response compressed. If you are delivering a pure text response, then gzip compression will sometimes reduce the size of the response by a factor of 10.
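To get a feel for how much gzip can save on text, here is a small Python sketch using the standard `gzip` module. The helper `compression_ratio` and the sample page are made up for this illustration. The sample is deliberately repetitive, so it compresses far better than typical pages will; measure your own content before relying on any particular figure.

```python
import gzip


def compression_ratio(body: bytes, level: int = 6) -> float:
    """Return the original size divided by the gzip-compressed size."""
    compressed = gzip.compress(body, compresslevel=level)
    return len(body) / len(compressed)


# Hypothetical sample: a very repetitive HTML table, standing in for a
# pure text response. Real pages are less regular than this.
sample_page = (
    "<tr><td>item</td><td>description of the item</td></tr>\n" * 500
).encode("utf-8")

if __name__ == "__main__":
    ratio = compression_ratio(sample_page)
    print(f"{len(sample_page)} bytes shrink by a factor of {ratio:.1f}")
```

Compression costs some CPU time on the server, so the saving in transfer time has to be weighed against that extra work per response.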
You should analyze all the involved components when you try to create the best service for your users, and not just the Web server or the code that the Web server executes. A Web service is like a car: if one of the parts or mechanisms is broken, then the car may not operate smoothly, and it can even stop dead if pushed too far without fixing it.
Let me stress it again: If you want to be successful in the Web service business, then you should start worrying about the client's browsing experience and not only how good your code benchmarks are.
Operating System and Hardware Analysis
Before you start to optimize server configuration and learn to write more efficient code, you need to consider the demands that will be placed on the hardware and the operating system. There is no point in investing a lot of time and money in configuration tuning and code optimizing, only to find that your server's performance is poor because you did not choose a suitable platform in the first place.
Because hardware platforms and operating systems are developing rapidly (even while you are reading this article), the following advisory discussion must be in general terms, without mentioning specific vendors' names.
Choosing the Right Operating System
I will describe the characteristics and features you should look for in an OS to support a mod_perl-enabled Apache server. Once you know what you want from your OS, you can go out and find it. Visit the Web sites of the operating systems you are interested in. You can gauge users' opinions by searching the relevant discussions in newsgroup and mailing list archives. Deja - https://deja.com and eGroups - https://egroups.com are good examples. I will leave this research to you. But probably the best approach is to ask mod_perl users, as they know best.
Stability and Robustness Requirements
Probably the most important features in an OS are stability and robustness. You are in the Internet business. You do not keep normal 9 a.m. to 5 p.m. working hours like conventional businesses. You are open 24 hours a day. You cannot afford to be off-line, because your customers will shop at another service (unless you have a monopoly ...). If the OS of your choice crashes every day, first conduct a little investigation; there might be a simple reason that you can fix. But there are OSs that won't work unless you reboot them twice a day. You don't want to use this type of OS, no matter how good the OS vendor's sales department is. Do not follow flashy advertisements; follow developers' advice instead.
Generally, people who have used the OS for some time can tell you a lot about its stability. Ask them. Try to find people who are doing similar things to what you are planning to do; they may even be using the same software. There are often compatibility issues to resolve. You may need to become familiar with patching and compiling your OS.
The Importance of Good Memory Management
You want an OS with a good memory-management implementation. Some OSs are well-known as memory hogs. The same code can use twice as much memory on one OS compared to another. If the size of the mod_perl process is 10MB and you have tens of these running, then it definitely adds up!
Say No to Memory Leaks
Some OSs and/or their libraries (e.g. C runtime libraries) suffer from memory leaks. A leak occurs when a process requests a chunk of memory for temporary storage but never releases it. That chunk of memory is then unavailable for any other purpose until the process that requested it dies. You cannot afford such leaks. A single mod_perl process sometimes serves thousands of requests before it terminates, so if a leak occurs on each request, the memory demands could become huge. Of course, your own code can cause memory leaks as well, but such leaks are easy to detect and fix. Certainly, we can reduce the number of requests served during the process's life, but that can degrade performance.
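A bit of arithmetic shows how quickly a small per-request leak adds up. The following Python sketch uses made-up numbers (1KB leaked per request, 10,000 requests per child, 50 children) and hypothetical function names; it assumes every request leaks the same amount, which real leaks rarely do.

```python
def leaked_bytes(bytes_per_request: int, requests_served: int) -> int:
    """Memory lost by one process if every request leaks the same amount."""
    return bytes_per_request * requests_served


def total_leak_mb(bytes_per_request: int, requests_per_child: int,
                  children: int) -> float:
    """Memory lost across all child processes, in megabytes (1MB = 2**20 bytes)."""
    per_child = leaked_bytes(bytes_per_request, requests_per_child)
    return per_child * children / (1024 * 1024)


if __name__ == "__main__":
    # Assumed figures for illustration only.
    print(leaked_bytes(1024, 10_000))       # bytes lost by a single child
    print(total_leak_mb(1024, 10_000, 50))  # megabytes lost by 50 children
```

With these assumed figures a single child loses about 10MB over its life and 50 children lose close to 500MB between them. Lowering `requests_per_child` (the limit that Apache's MaxRequestsPerChild directive sets) shrinks the loss in proportion, at the price of spawning new processes more often.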
Memory-Sharing Capabilities Are a Must
You want an OS with good memory-sharing capabilities. If you preload the Perl modules and scripts at server startup, they are shared between the spawned children (at least for part of a process's life, since memory pages can become "dirty" and cease to be shared). This feature can reduce memory consumption a lot!
And, of course, you don't want an OS that doesn't have memory-sharing capabilities.
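To see why sharing matters, consider a rough estimate of the real memory used by a pool of child processes. This Python sketch is only a back-of-the-envelope model with hypothetical names and numbers: it assumes each child appears to be the same size, that the shared portion is counted once for the whole pool, and that it stays shared, which is optimistic because pages become dirty over time.

```python
def total_memory_mb(process_mb: float, shared_mb: float, children: int) -> float:
    """Estimate the real memory used by `children` processes.

    Each process appears to be `process_mb` in size, of which `shared_mb`
    is shared between all of them and therefore counted only once.
    """
    if not 0 <= shared_mb <= process_mb:
        raise ValueError("shared_mb must be between 0 and process_mb")
    unshared_mb = process_mb - shared_mb
    return shared_mb + unshared_mb * children


if __name__ == "__main__":
    # Assumed figures: 30 children of 10MB each.
    print(total_memory_mb(10, 0, 30))  # nothing shared
    print(total_memory_mb(10, 4, 30))  # 4MB of each process shared
```

Under these assumptions, 30 children of 10MB each need 300MB with no sharing, but 184MB if 4MB of each process is shared. The more code you preload before the children are spawned, the larger the shared portion can be.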
The Real Cost of Support
If you are in a big business, then you probably do not mind paying another $1,000 for some fancy OS with bundled support. But if your resources are low, then you will look for cheaper and free OSs. Free does not mean bad; it can be quite the opposite. Free OSs can have the best support you can find. Some do.
It is easy to understand: most people are not rich and will try to use a cheaper or free OS first if it does the work for them. Since it really fits their needs, many people keep using it and eventually know it well enough to be able to provide support for others in trouble. Why would they do this for free? One reason is the spirit of the first days of the Internet, when there was no commercial Internet and people helped each other because someone had helped them in the first place. I was there, I was touched by that spirit and I'm keen to keep that spirit alive.
But let's get back to the real world. We are living in a material world, and our bosses pay us to keep the systems running. So if you feel that you cannot provide the support yourself and you do not trust the available free resources, then you must pay for an OS backed by a company, and blame them for any problem. Your boss wants to be able to sue someone if the project has a problem caused by an external product that is being used in the project. If you buy a product and the company selling it claims support, then you have someone to sue or at least to put the blame on.
You might object that if we go with open source and it fails, we do not have anyone to sue. Wrong: in the past several years, many companies have realized how good open-source products are and have started to provide official support for them. So your boss cannot just dismiss your suggestion of using an open-source operating system. You can get paid support just as you can from any commercial OS vendor.
Also remember that the less money you spend on OS and software, the more you will be able to spend on faster and stronger hardware. Of course, for some companies money is a nonissue, but there are many companies for which it is a big issue.
