Clearing Database Network Clogs

Databases are more likely than ever to cause network congestion. We examine how to fine-tune the ways your client application queries the database so you can break up those bottlenecks.

April 28, 2003


The best way to remedy database traffic overload is to fine-tune the way your client application queries the database. There's a huge difference between configuring your client to query the database with "give me all the data about all our customers and I'll sort through them" and "give me information on customers such as X, but only their names, addresses and phone numbers."

If the application's SQL implementation limits the result set, the query will carry less data. The more users, however, the greater the impact queries have on the applications and the network. One or two users can get away with a query that selects, for instance, all 25,000 100-byte long records. But when 300 users attempt that, you have a problem. Then you need to optimize your queries or your database and revisit your network bandwidth and design.
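The difference between the two query styles is easy to demonstrate. The sketch below uses Python's built-in sqlite3 module against a hypothetical customers table (the schema, row count and filter are invented for illustration); the same principle applies to any networked database, where every extra column and row crosses the wire:

```python
import sqlite3

# Hypothetical customers table, populated only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER, name TEXT, address TEXT, phone TEXT, notes TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
    [(i, f"Name {i}", f"{i} Main St", "555-0100", "x" * 80) for i in range(25000)],
)

# Broad query: every column of every row comes back to the client.
broad = conn.execute("SELECT * FROM customers").fetchall()

# Narrow query: only the columns the client uses, only the rows it needs.
narrow = conn.execute(
    "SELECT name, address, phone FROM customers WHERE name LIKE ?", ("Name 1%",)
).fetchall()

print(len(broad), len(narrow))   # 25000 rows vs 11111 narrower rows
```

The narrow query not only returns fewer rows; each row it returns is smaller, so the saving on the wire compounds.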

If you have a single active connection to the database, meanwhile, only a minimal amount of traffic can traverse it--bad news for database performance but good news for the network. If you have multiple active connections that sit idle for a long time, however, you may be placing an unnecessary load on your network because most database technologies use messages to keep a connection open. Depending on the type of idle connection, you may want to reduce your number of active connections.

A Measure of Success

To gauge your database traffic's effect on the network, you must measure it with a network monitor. We used the open-source packet sniffer Ethereal, but any packet sniffer will do. After pointing Ethereal at the database server, we ran our application code and walked through the results to see what was happening.

First, determine the size of the packets the database returns with the data. Some experts recommend fine-tuning the packet size with which your database communicates, but this is risky. Such fine-tuning can degrade performance, and a database application rarely requests chunks of data in a single size, so don't change this setting without understanding the potential impact. You can get more information on how to fine-tune packet size from your database vendor (see "Sites To See," below).

The sniffer, meanwhile, lets you determine how many times database connections are made. In our sniffer tests, the databases we examined exchanged multiple packets when establishing a connection. This can increase traffic noticeably if connections are constantly being reopened.

And finally, measure the size of the query result sets the database returns to the client. We parsed through the result sets and determined the quality of the application vendors' code. If you find that a particular application is pushing many large data sets across your network, you can work with the application vendor to improve the software. If you're having problems sniffing traffic from a particular database, meanwhile, you can set up a port mirror on your switch. That ensures you receive traffic for that machine, and not traffic destined for others.
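You can also approximate result-set size from the client side before reaching for the sniffer. The rough sketch below (again using sqlite3 and an invented table) sums the textual size of every field; real wire protocols add per-row and per-packet framing on top of this, so treat the number as a lower bound:

```python
import sqlite3

# Invented orders table, used only to demonstrate the measurement.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, item TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, "widget", i % 10) for i in range(1000)],
)

def result_set_bytes(cursor):
    """Rough payload estimate: the text length of every field in every row.
    Wire formats add framing and type metadata on top of this figure."""
    return sum(len(str(field)) for row in cursor for field in row)

size = result_set_bytes(conn.execute("SELECT * FROM orders"))
print(size)
```

Run this against your heaviest queries and you have a quick, sniffer-free way to rank which ones deserve attention first.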

Once you've gathered all this information about the application's behavior, you can simulate a single client and predict the impact of multiple clients. If database performance degrades in other ways as you add clients--say, by producing more I/O than your disks can handle--the load on the network will appear to lessen. But don't be fooled: the minute you fix the bottleneck on your server, you'll encounter the traffic problems again.

Now that you have data on how your database traffic affects your network, you can reduce that traffic. As with any attempt to tune a database, the quality of the SQL queries hitting your database plays a big role. Cleaning up SQL usually solves database performance issues.

Then look at connection pooling. The concept may seem simple--keep connections open between the database and the application or application server so you can reuse them right away--but it has several facets.

First and foremost is security. Database connections are cached at the application-instance level or the thread level. With technologies that cache connections by application instance, a user can ride a connection that was authenticated as someone else and gain unauthorized access to data. Thread-based technologies don't pose this problem because they don't share connections across the application; each thread carries its own security context.

The technology you choose for connection pooling is also important. If you control the source code to your programs and they use ODBC or JDBC (Java Database Connectivity), you can enable connection pooling in your code (at the application-instance level). If your applications are running on an application server, that's OK--most servers support connection pooling in one form or another.

Finally, most database technologies let you set the number of pooled connections and the length of time to keep unused connections open. If you have too few pooled connections, the network performance gains are minimal. And because each open connection consumes resources on the database server, too many connections, especially with a long time-out value, hurt overall database performance.
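A minimal pool illustrating both knobs--maximum size and idle timeout--might look like the sketch below. The ConnectionPool class and its parameters are invented here for illustration; real pooling layers in ODBC/JDBC drivers and application servers add connection validation, concurrency guarantees and the security checks discussed above:

```python
import queue
import sqlite3
import time

class ConnectionPool:
    """Illustrative pool: bounded size, idle timeout. Not production code."""

    def __init__(self, factory, max_size=5, idle_timeout=30.0):
        self._factory = factory            # callable that opens a new connection
        self._idle_timeout = idle_timeout  # seconds before an idle conn is dropped
        self._pool = queue.Queue(max_size)

    def acquire(self):
        try:
            conn, returned_at = self._pool.get_nowait()
            if time.monotonic() - returned_at > self._idle_timeout:
                conn.close()               # stale: drop it and open a fresh one
                return self._factory()
            return conn                    # reuse: no new connection handshake
        except queue.Empty:
            return self._factory()         # pool empty: open a new connection

    def release(self, conn):
        try:
            self._pool.put_nowait((conn, time.monotonic()))
        except queue.Full:
            conn.close()                   # at max_size: don't hoard connections

pool = ConnectionPool(lambda: sqlite3.connect(":memory:"), max_size=2)
c1 = pool.acquire()
pool.release(c1)
c2 = pool.acquire()
print(c1 is c2)   # the released connection is reused
```

Every reuse that prints True here is a multi-packet connection handshake that never hits your network.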

Tracking connection problems can be difficult. Connection pooling reduces the bandwidth the database consumes--but a struggling database also generates less traffic, so a drop in traffic can fool you into thinking you've improved database behavior when you've merely slowed the database down. To be sure, check how many connections are open through your database-management interface.

Change Is Good

Connection pooling is the best short-term solution. By making a few simple changes to your application server's configuration files or a slight modification to your source code, you can get a huge reduction in bandwidth usage. Much of the benefit depends on your users' access patterns. If many users hit the database repeatedly, the improvement will be substantial because you won't be rebuilding connections over and over. But if you have just a few users performing long-running queries, the improvement will be minimal.

A long-term solution is to review the queries your applications use against your database. Queries that return unnecessary columns or too many rows, and queries that return data and then use the result to go back to the database for more data, are problems.
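The second problem--querying, then going back to the database with the results--is commonly called the N+1 pattern, and it's usually fixable with a join. A small sketch using sqlite3 and invented tables:

```python
import sqlite3

# Invented schema: customers and their orders, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (10, 1, 99.0), (11, 1, 15.5), (12, 2, 7.25);
""")

# Problem pattern: one query, then a follow-up query per row (N+1 round trips).
n_plus_one = []
for cust_id, name in conn.execute("SELECT id, name FROM customers"):
    for (total,) in conn.execute(
            "SELECT total FROM orders WHERE customer_id = ?", (cust_id,)):
        n_plus_one.append((name, total))

# Better: a single JOIN returns the same data in one round trip.
joined = conn.execute("""
    SELECT c.name, o.total
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
""").fetchall()

print(sorted(n_plus_one) == sorted(joined))   # same data, one query instead of N+1
```

With two customers the saving is trivial; with thousands of rows, each eliminated round trip is a query the database never parses and a response that never crosses the network.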

If you still can't reduce network traffic using these methods, go Gigabit Ethernet on the connection from the switch to your database and application server. Gigabit networking technology costs have dropped, so increasing the pipe size is no big deal. This is a good solution if your application uses an application server or if you're having problems because of congestion between the database and the application server. If your application has a fat client to the database, you need to upgrade your backbone so the switches are gigabit-connected as well.

Watch out for Web services. If they aren't configured properly, they can hit your database too much. Writing a Web service that selects and returns a single value from a database row is a no-brainer, but it poses a health hazard to your database and your network. If your organization has 15 Web services modules that each return a single field from the same table, for instance, that places 15 expensive call-response sequences on your network, and your database executes the query 15 times.
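Some back-of-envelope arithmetic shows why this pattern is expensive. The sketch below assumes roughly 250 bytes of SOAP envelope overhead per call and an invented 40-byte payload per field; both figures are rough assumptions for illustration:

```python
# Back-of-envelope comparison: 15 single-field calls vs one combined call.
ENVELOPE_OVERHEAD = 250   # assumed SOAP envelope bytes per call
calls = 15                # 15 Web service calls, each returning one field...
payload_per_call = 40     # ...assumed size of one field as XML/text

separate = calls * (ENVELOPE_OVERHEAD + payload_per_call)
combined = ENVELOPE_OVERHEAD + calls * payload_per_call   # one call, all fields

print(separate, combined)   # 4350 vs 850 bytes on the wire
```

And the bytes are only half the story: the combined version also replaces 15 query executions on the database with one.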

Even when implemented correctly, Web services create more network traffic than a traditional query because of two factors: the SOAP (Simple Object Access Protocol) envelope and the translation of result sets to XML/text. When used incorrectly, the overhead can be large--about 250 bytes per call, plus the difference in size between your database's representation of the data and a text translation of that data.

From the network side, there aren't many options for optimizing your database other than throwing bandwidth at the problem. As database and hardware technologies advance, though, more developers will tackle solutions for reducing database traffic. Until then, it's up to you and your packet sniffers.

Don MacVittie is an applied technologist at WPS Resources. Write to him at [email protected].


How to Reduce Database Traffic on Your Network

1. Use a network packet sniffer to track down problem applications. The sniffer shows you how many times a database query creates connections, and how much data is being transferred.

2. Check your SQL. Look for queries that return more rows than the program needs or more columns than the application uses. Combine multiple queries into a single query, if possible. If one SQL statement selects one or two fields from a table and a second statement uses that information to get data from another table, for example, merge the two into a single statement that returns data from both tables.

3. Make sure problem applications are using connection-sharing or pooling, if appropriate. Many applications, particularly real-time or data-acquisition apps, open a connection to the database, use it and then close it. Opt for connection sharing instead.

4. If all else fails and your network utilization is high, consider upgrading your network infrastructure. If you have an upcoming project, such as an ERP buildout that will use a lot of network bandwidth, you may want to complete an upgrade before implementing the new app. And if you don't budget annually for network upgrades, have your application project team budget for one to support its project.

Sites for Database Tuning Information:
