

Making Your Server System Scale
Databases are CPU- and disk-intensive, so focus your resources on those two components. The data devices of a database tend toward random I/O, whereas the log devices are written sequentially, so organize your disk subsystem to match. RAID 10 (data striped across mirrored pairs of disks) offers the best performance and scalability, but at the highest cost. RAID 5 has become popular and provides good read performance, but its writes are slower because parity must be recalculated on every write. We don't recommend OS-based RAID, since it uses the host system's CPU to do the parity calculations and can easily consume many of the CPU cycles on your server. Hardware-based RAID systems include their own fast processor and high-speed memory, which take that work off the host.
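To make the RAID trade-off concrete, here is a minimal sketch based on the commonly cited rule of thumb that a small logical write costs two back-end disk I/Os on RAID 10 and four on RAID 5 (read data, read parity, write data, write parity). The function names, the 8-drive array and the figure of 100 I/Os per second per drive are illustrative assumptions, not measurements from our labs.

```python
# Rule-of-thumb back-end disk I/Os needed for one small logical write.
WRITE_PENALTY = {"raid10": 2, "raid5": 4}


def usable_drives(level, drives):
    """Drives' worth of capacity left for data after redundancy."""
    if level == "raid10":
        return drives // 2   # every drive is mirrored
    if level == "raid5":
        return drives - 1    # one drive's worth of parity
    raise ValueError(f"unknown RAID level: {level}")


def effective_iops(level, drives, iops_per_drive, read_fraction):
    """Estimate host-visible I/Os per second for a given read/write mix."""
    raw = drives * iops_per_drive
    write_fraction = 1.0 - read_fraction
    # Reads cost one back-end I/O each; writes cost WRITE_PENALTY[level].
    cost_per_io = read_fraction + write_fraction * WRITE_PENALTY[level]
    return raw / cost_per_io


# Hypothetical array: 8 drives, 100 I/Os per second each, half reads.
for level in ("raid10", "raid5"):
    print(level,
          usable_drives(level, 8),
          round(effective_iops(level, 8, 100, 0.5)))
```

Under these assumptions RAID 5 leaves more of the array usable for data, while RAID 10 sustains noticeably more I/O once writes make up a large share of the load. That is the cost-versus-performance trade-off described above.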
NT can address up to 4 GB of memory, 2 GB of which is reserved for NT itself; the other 2 GB is available to applications. That is plenty for 99 percent of applications, but in the world of very large databases, needing more than 2 GB is not that uncommon. Unix allows applications to address larger amounts of RAM.
How many drives should you have? The rule of thumb is the more spindles the better. For I/O-intensive tasks, you'll see increased benefit up to about 12 to 14 spindles. After that, you reach a point of diminishing returns. To get even more performance, you can use multiple SCSI host adapters. And, of course, faster drives are better, to an extent. Once again, understanding the I/O patterns of your system is critical. For random I/O, drives with fast seek times are best. For sequential I/O, drives with faster rotational speeds work well, such as the Seagate Technology Cheetah, which spins at 10,000 rpm. Finally, SCSI-3 or Ultra SCSI adapters offer increased throughput over SCSI II adapters.
We've found that Exchange Server is primarily CPU-intensive. Once the CPU resource has been exhausted, the bottleneck will be the disk subsystem or main memory.
With Exchange Server, you'll do best to run the Exchange Optimizer program, which interrogates your system and sets the various parameters of Exchange for you. One school of thought for scaling Exchange is to limit the number of users per server: first, because there's a 16-GB limit for user mailboxes, which most sites won't hit any time soon, and second, because you need to take fault tolerance into account. Finally, instead of placing the public folders and user mailboxes on the same servers, we've found that if you place all the mailboxes on one server and all the public folders on a second server, each server can be tuned for its specific task, and you'll be able to support more total users.
Web servers are typically I/O-intensive. For the most part, serving up Web pages is not a CPU-intensive activity, but if you're running CGI applications, they can tax your CPU. If you're doing online commerce or running other Web applications that require encryption and decryption, the system's CPU is at the heart of that server's performance. For large-scale industrial online systems, many companies have turned to RISC systems such as Sun UltraSPARC servers. For most Web servers, though, the connection to the Internet or intranet will most likely be the bottleneck.
Net Effect of the Network
It's easy to concentrate on the server, but often the network is the bottleneck. On many occasions we've seen a huge server supporting several hundred users through a single 10-Mbps network card. Install multiple network adapters in each server and load-balance the users across those network segments. Bus-mastering direct memory access (DMA) cards will offload some processing from the CPU. Also, get to know the characteristics of the adapter you're using. In our labs, we found that the Intel PCI 10/100 EtherExpress adapter performed better using a later revision of the drivers, and we saw better performance when we set the card to 100 Mbps instead of autodetect.
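A quick back-of-the-envelope calculation shows why a single 10-Mbps card is a problem. The sketch below assumes 300 users (our stand-in for "several hundred"), all active at once, and ignores protocol overhead; the function name and the four-adapter comparison are illustrative assumptions.

```python
def per_user_kbps(adapter_mbps, adapters, users):
    """Average share of server bandwidth per user, in kilobits per second."""
    total_kbps = adapter_mbps * adapters * 1000
    return total_kbps / users


# One 10-Mbps card versus four 100-Mbps cards, each shared by 300 users.
single_card = per_user_kbps(10, 1, 300)
four_fast_cards = per_user_kbps(100, 4, 300)
print(round(single_card, 1), round(four_fast_cards))
```

With one 10-Mbps card, each user's average share is on the order of a dial-up modem connection, no matter how large the server behind it is.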
Also, be sure to load-balance your adapters across the PCI buses. Don't place all of your high-bandwidth cards on the primary PCI bus and the low-bandwidth cards on the secondary bus. Instead, spread the high-bandwidth cards across the buses so that each bus carries a similar amount of I/O.
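One simple way to plan such a layout is a greedy assignment: take the cards from highest to lowest bandwidth and place each on whichever bus currently carries the least load. The sketch below is only a planning aid under assumed numbers; the card names and their peak bandwidth figures are hypothetical, and real placement also depends on slot availability and the server's bus design.

```python
def balance_cards(cards, bus_count=2):
    """Greedily assign cards to buses, heaviest first, least-loaded bus next.

    cards maps a card name to its estimated peak bandwidth (any one unit).
    Returns a list of {"cards": [...], "load": total} dicts, one per bus.
    """
    buses = [{"cards": [], "load": 0} for _ in range(bus_count)]
    by_bandwidth = sorted(cards.items(), key=lambda kv: kv[1], reverse=True)
    for name, bandwidth in by_bandwidth:
        target = min(buses, key=lambda bus: bus["load"])
        target["cards"].append(name)
        target["load"] += bandwidth
    return buses


# Hypothetical cards with rough peak bandwidth estimates in MB per second.
example = {"scsi_a": 40, "scsi_b": 40, "nic_a": 12, "nic_b": 12, "serial": 1}
for number, bus in enumerate(balance_cards(example)):
    print("bus", number, bus["cards"], bus["load"])
```

With these sample figures the two SCSI adapters land on different buses, as do the two network adapters, so neither bus carries all of the heavy traffic.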
Tuning With Tools
Most often, it's your users who detect a system that isn't running optimally. But a number of monitoring tools are available from both operating system vendors and third parties. NT comes with Performance Monitor, a program that is extremely powerful but not easy to use. We recommend starting with a copy of the NT 3.51 Resource Kit and a cup of coffee. The kit devotes an entire book to NT performance tuning, and it includes Performance Monitor counters specifically for the Pentium and Pentium Pro.
Another tool we use in our San Mateo labs is Net-IQ's AppManager. It's the only performance monitoring tool we know of that's designed specifically for NT and Microsoft BackOffice applications. It uses intelligent scripts, called Knowledge Scripts (KS), that run on the target server. At predefined intervals, the Knowledge Scripts retrieve information about the specific BackOffice application (Internet Information Server, SQL Server and SMS, but not SNA Server) and record it at a central location. We found the beta version easy to use, but on several occasions the scripts caused CPU utilization on our ProLiant 5000 to spike to 65 percent, which is not a good thing for a performance monitoring application. We were able to eliminate this problem by going to the next version of the software. Make sure that your monitoring tools don't affect server performance.
While these software tools let you see what's happening within your operating system and applications, and to some extent your hardware, many server vendors have also instrumented their hardware. This instrumented hardware interacts with a software agent that collects performance statistics that would otherwise be unavailable. For instance, our corporate lab partners have standardized on Compaq ProLiant servers and use the Compaq Insight Manager utility to gather statistics about the server, such as PCI and EISA bus utilization, SCSI bus errors and various drive statistics. In rare instances, performance problems with the servers were eventually traced to hardware faults (such as a disk drive going bad). Without a tool like Insight Manager, the system administrator would not have been able to "see" those hardware errors occurring. Products from Tricord Systems and NetFRAME also have this feature.
Jay Milne can be reached at jmilne@nwc.com.
A special thanks to Neal Nelson and Associates for providing benchmarks for this testing. For more information on its benchmark testing suites, the company can be reached at (312) 755-1000.