
Database Performance
You may be pointing the finger in the wrong direction when you accuse your data access middleware of causing performance problems.
The design of your database governs performance to a much greater degree than your choice of database access middleware. Most databases can benefit from changes in table design, choice of index columns and query design. If you focus first on these areas, you'll reap the most rewards from your efforts to make your database perform faster.
Normalizing the design of a database entails the elimination of redundant data, the avoidance of repeating data groups and the creation of separate tables to hold different categories of data. A properly normalized database generally has many tables with relatively few columns in each. Normalization can improve database performance for the following reasons:
Fewer columns in a table imply faster sorting and faster index creation
Indexes can be clustered because there are more tables
Indexes can be more compact
Fewer indexes per table suggest INSERT, UPDATE, and DELETE statements will process faster
Choosing indexes and keys that help normalize the data can help performance. Simplifying long-running queries through the use of appropriate keys and indexes can reduce response times, and you might consider running complex queries that look at entire tables during times of the day when transaction volume is low.
Normalizing the database, along with using SQL that's as simple in construction as possible, helps the SQL compiler portion of the database server process faster. The SQL compiler has to recognize and understand natural language (SQL), then turn the SQL statements into instructions that it gives to the database engine's retrieval and update processes. The SQL compiler's job is compounded by the fact that it has to operate in real time as quickly as possible.
The SQL compiler processes each SQL statement in five basic steps. The first step parses the SQL, examines the SQL for syntax errors, then converts the SQL parse tree into an internal representation IBM, which invented SQL, calls the internal format the QGM (Query Graph Model). The second step examines the reformatted SQL to ensure that executing the statement won't violate referential integrity. The second step also notes whether the database engine should process a constraint or trigger for the SQL.
Next, the SQL compiler rewrites the SQL statement, replacing view references with actual column names and transforming the SQL for processing by the optimizer. The transformation eliminates redundant joins, adds implied predicates and converts INTERSECT clauses to EXISTS subqueries. The optimizer itself, the fourth step, uses cost-based algorithms to determine the most efficient execution method for the SQL. The optimizer finds the best join order, for example, and it decides whether the execution of the SQL statement will be CPU- or I/O-bound. The optimizer chooses an execution path for the SQL statement that will result in the quickest response from the database engine. The fifth step "remembers" the essence of the SQL for later comparison with other SQL statements-the SQL compiler keeps a history of how well it optimizes statements so it can "learn" the fastest ways to access the database. Finally, the SQL compiler's fifth step delivers the compiled, optimized SQL statement to the retrieval and update processes in the database engine.
An understanding of normalization and the SQL compiler's functions is key to making your database access as fast as possible. Look first at these considerations, as well as the tuning parameters for your database, before you blame your database access middleware for slow performance.
|