John Goodson

Subscribe to John Goodson: eMailAlertsEmail Alerts
Get John Goodson: homepageHomepage mobileMobile rssRSS facebookFacebook twitterTwitter linkedinLinkedIn


Related Topics: Java Developer Magazine

Java Developer : Article

Making Your WebSphere Apps Run Faster and Jump Higher

Making Your WebSphere Apps Run Faster and Jump Higher

What's the best way to develop and fine-tune your WebSphere applications to run faster, jump higher…and make fewer trips to the database?

When creating optimized WebSphere applications that access data on your DB2 database, what kinds of challenges do you face? Creating DB2-enabled WebSphere apps involves the Java Database Connectivity (JDBC) API, which can be a challenge in itself. This article will look at various DB2-enabled applications and offer some guidelines to help your WebSphere applications run more efficiently when they connect to a DB2 database.

Designing WebSphere Applications
The tips in this section will help you optimize system performance when designing your applications.

Start by Planning Your Connections
Connection management is important to application performance, so optimize your application by connecting once and using multiple statement objects, instead of performing multiple connections. Avoid connecting to a data source after establishing an initial connection.

Although gathering driver information at connect time is a good practice, it’s often more efficient to gather it in one step rather than two. For example, some apps establish a connection, then call a method in a separate component that reattaches and gathers information about the driver. Applications that are designed as separate entities should pass the established connection object to the data collection routine instead of establishing a second connection.

Another bad practice is to connect and disconnect several times throughout your application to perform SQL statements. Connection objects can have multiple statement objects associated with them. Statement objects, which are defined to be memory storage for information about SQL statements, can manage multiple SQL statements.

You can improve performance significantly with connection pooling, especially for applications that connect over a network or through the Web. Connection pooling lets you reuse connections. Closing connections doesn’t close the physical connection to the database. When an application requests a connection, an active connection is reused, thus avoiding the network I/O needed to create a new connection.

Plan connection and statement handling before you implement the application. The time you spend thoughtfully handling connection management will lead to improved application performance and maintainability.

Be Careful with Commits
Committing transactions is extremely disk I/O intensive and slow. Always turn autocommit off by setting:

WSConnection.setAutoCommit
(false)

What does a commit actually involve? The DB2 server must flush back to disk every data page that contains updated or new data. This isn’t a sequential write, it’s a searched write to replace existing data in the table. By default, Autocommit is on when connecting to a data source, and Autocommit mode usually impairs performance because of the significant amount of disk I/O needed to commit every operation.

Although using transactions can help application performance, don’t take this too far. Leaving transactions active can reduce throughput by holding locks on rows for long times, preventing other users from accessing the rows. Commit transactions in intervals that allow maximum concurrency.

Avoid Distributed Transactions
DB2 supports distributed transactions – that is, transactions that span multiple connections. Distributed transactions are at least four times slower than normal transactions due to the logging and network I/O necessary to communicate between all the components involved. Unless distributed transactions are required, avoid using them. Instead, use local transactions whenever possible.

For the best system performance on DB2, design the application to run under a single Connection object.

Retrieving Data
To retrieve data efficiently, you should return only the data you need, using the most efficient method possible. The following guidelines will help you to optimize system performance when retrieving data with JDBC applications.

Avoid Retrieving Long Data
Unless it’s necessary, applications should avoid requesting long data because retrieving long data across a network is slow and resource-intensive. Most users don’t want to see long data. If the user does want to see these results, then the application can query the database again, specifying only the long columns in the select list. This method allows the average user to retrieve the result set without having to pay a high performance penalty for network traffic.

Although the best method is to exclude long data from the select list, some applications don’t formulate the select list before sending the query to the JDBC driver (that is, some applications select * from <table name> ...). If the select list contains long data, then some drivers must retrieve that data at fetch time even if the application doesn’t bind the long data in the result set. Whenever possible, try to implement a method that doesn’t retrieve all columns of the table.

Additionally, although the getClob and getBlob methods allow the application to control how long data is retrieved in the application, you should realize that in many cases, the JDBC driver emulates these methods. The driver must retrieve all of the long data across the network before exposing the getClob and getBlob methods.

Sometimes you must retrieve long data. In this case, remember that most users don’t want to see 100 KB, or more, of text on the screen.

Reduce the Size of Data Retrieved
To reduce network traffic and improve performance, you can reduce the size of any data being retrieved to a manageable limit by calling setMaxRows, setMaxFieldSize, and the driver-specific SetFetchSize. Another method of reducing the size of the data being retrieved is to decrease the column size. If the driver allows you to define the packet size, use the smallest packet size that will still meet your needs.

In addition, be careful to return only the rows you need. If you return five columns when you only need two, performance is decreased, especially if the unnecessary rows include long data.

Choose the Right Data Type
Retrieving and sending certain data types can be expensive. When you design a schema, select the data type that can be processed most efficiently. For example, integer data is processed faster than floating-point data. Floating-point data is defined according to internal database-specific formats, usually in a compressed format. The data must be decompressed and converted into a different format so it can be processed by the wire protocol.

Different DB2 data types take different amounts of time to process (see Table 1). Processing time is shortest for character strings, followed by integers, which usually require some conversion or byte ordering. Processing of floating-point data and timestamps is at least twice as slow as processing integers.

Updating Data in the DB2 Database
This section provides general guidelines to help you optimize system performance when updating data in databases.

Use updateXXX Methods
Although programmatic updates don’t apply to all types of applications, developers should try to use programmatic updates and deletes. Using the updateXXX methods of the ResultSet object allows you to update data without building a complex SQL statement. Instead, you simply supply the column in the result set that’s to be updated and the data that’s to be changed. Then, before moving the cursor from the row in the result set, call the updateRow method to update the database as well.

In the following code fragment, the value of the Age column of the Resultset object rs is retrieved using the method getInt, and the method updateInt is used to update the column with an int value of 25. The method updateRow is called to update the row in the database that contains the modified value:

int n = rs.getInt("Age");

// n contains value of Age column in the resultset rs

. . .

rs.updateInt("Age", 25);

rs.updateRow();

In addition to making the application more easily maintainable, programmatic updates usually improve performance. You don’t need performance-expensive operations to locate the row to be changed, because the DB2 server is already positioned on the row for the Select statement in process.

Maximizing Metadata Methods
Because database metadata methods that generate Resultset objects are slow compared to other JDBC methods, using them too often can impair system performance. The guidelines in this section will help you to optimize DB2 system performance when selecting and using database metadata.

Minimize the Use of Database Metadata Methods
Compared to other JDBC methods, database metadata methods that generate Resultset objects are relatively slow. So that you don’t need multiple executions, applications should cache information returned from result sets that generate database metadata methods.

While it’s almost impossible to write a JDBC application without using database metadata methods at all, you can improve system performance by using them as little as possible. To return all result column information mandated by the JDBC specification, a JDBC driver may have to perform complex queries or multiple queries to return the necessary result set for a single call to a database metadata method. These particular elements of the SQL language are performance-expensive.

Applications should cache information from database metadata methods. For example, call getTypeInfo once in the application and cache away the elements of the result set that your application depends on. It’s unlikely that any application uses all elements of the result set generated by a database metadata method, so it shouldn’t be difficult to maintain the cache of information.

Avoid Search Patterns
Using null arguments or search patterns in database metadata methods generates time-consuming queries. In addition, network traffic potentially increases due to unwanted results. So always supply as many non-null arguments as possible to result sets that generate database metadata methods.

Because database metadata methods are slow, applications should invoke them as efficiently as possible. Many applications pass the fewest non-null arguments necessary for the function to return success. For example:

ResultSet WSrs = WSc.getTables (null, null, "WSTable", null);

should be :

ResultSet WSrs = WSc.getTables ("null", "johng", "WSTable", "TABLE");

Sometimes little information is known about the object for which you’re requesting information. Any information that the application can send to the driver when calling database metadata methods can result in improved performance and reliability.

Determine Table Characteristics with a Dummy Query
Avoid using getColumns to determine characteristics about a table. Use a dummy query with getMetadata instead.

Let’s consider an application that allows the user to choose the columns that will be selected (see Listings 1 and 2). Should the application use getColumns to return information about the columns to the user or instead prepare a dummy query and call getMetadata?

In both cases, a query is sent to the DB2 server, but in Listing 1 the query must be evaluated and form a result set that must be sent to the client. Because the Where clause of the query always evaluates to FALSE, the query generates no result rows and should execute without accessing table data. Clearly, the Listing 2 model will perform better.

Selecting JDBC Objects and Methods
The following guidelines will help you optimize system performance when selecting and using JDBC objects and methods.

Use Parameter Markers with Stored Procedures
When calling stored procedures, instead of using literal arguments, you should always use parameter markers for the argument markers. JDBC drivers can call stored procedures on the DB2 server either by executing the procedure as any other SQL query, or by optimizing the execution by invoking a Remote Procedure Call (RPC) directly into the DB2 server. Executing the stored procedure as a SQL query results in the DB2 server parsing the statement, validating the argument types, and converting the arguments into the correct data types. Remember that SQL is always sent to the DB2 server as a character string; for example:

"{call getCustName (12345)}"

In this case, even though the application programmer might assume that the only argument to getCust-Name is an integer, the argument is actually passed inside a character string to the server. The DB2 server would parse the SQL query, isolate the single argument value 12345, then convert the string “12345” into an integer value.

By invoking an RPC inside the DB2 server, the overhead of using a SQL character string is avoided. The procedure is instead called by name only, with the argument values already encoded into their native data types.

Case 1
Stored Procedure can’t be optimized to use a server-side RPC. The DB2 server must parse the statement, validate the argument types, and convert the arguments into the correct data types:

CallableStatement cstmt = conn.prepareCall ("call getCustName (12345)");

ResultSet rs = cstmt.executeQuery ();

Case 2
Stored Procedure can be optimized to use a server-side RPC. Because the application calls the procedure by name and the argument values are already encoded, the load on the DB2 server is less:

CallableStatement cstmt – conn.prepareCall ("Call getCustName (?)");

cstmt.setLong (1,12345);

ResultSet rs = cstmt.executeQuery();

Use PreparedStatement Objects for Repeated SQL Statements
JDBC drivers are optimized based on the perceived use of the functions being executed. Choose between the PreparedStatement object and the Statement object depending on the planned use. The Statement object is optimized for a single execution of a SQL statement. In contrast, the PreparedStatement object is optimized for SQL statements that will be executed two or more times.

The overhead for the initial execution of a PreparedStatement object is high, but the benefit comes with subsequent executions of the SQL statement.

Choose the Right Cursor
Choosing the appropriate type of cursor allows maximum application flexibility A forward-only cursor provides excellent performance for sequential reads of all of the rows in a table but it can’t be used when the rows to be returned aren’t sequential.

Insensitive cursors used by JDBC drivers are ideal for applications that require high levels of concurrency on the DB2 server and require the ability to scroll forwards and backwards through result sets. The first request to an insensitive cursor fetches all of the rows and stores them on the client. Thus, the first request is very slow, especially when long data is retrieved, but subsequent requests require no network traffic and are processed quickly. Because the first request is processed slowly, insensitive cursors shouldn’t be used for a single request of one row. As a designer you should also avoid using insensitive cursors when long data is returned, because memory can be exhausted.

Conclusion
With thoughtful design and implementation, the performance of JDBC applications on DB2 can be improved. By using DatabaseMetaData methods appropriately, retrieving only required data, selecting functions that optimize performance, and managing connections and updates, your applications can run more efficiently and generate less network traffic on your WebSphere application server.

By following the tried-and-true approaches for JDBC in this article, you can develop and fine-tune your WebSphere applications to run faster, jump higher…and make fewer trips to the database.

More Stories By John Goodson

As vice-president of product operations, John Goodson leads the product strategy, direction, and development efforts at DataDirect Technologies. For more than 10 years, he has worked closely with Sun and Microsoft on the development and evolution of database connectivity standards including J2EE, JDBC, .NET, ODBC, and ADO. His active memberships in various standards committees, including the JDBC Expert Group, have helped Goodson's team develop the most
technically advanced data connectivity technologies. He holds a BS in computer science from Virginia Tech.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.