1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14.

Wednesday, February 20, 2008

Preventing Database Abuse By J2EE Application Servers

Some time back in around 2002, Mircrosoft took the Sun Petstore sample application and re-wrote in .Net. The driver behind this was to demonstrate that .Net was supperior to J2EE technology in terms of performance and this "they did". This created a real furore in the J2EE architect community at the time. Further investigation into what Microsoft had actually done revealed that the .Net layer of the rewritten application was only a very thin wrapper for calls to transact SQL stored procedures. Here is a references to this benchmark:-
The morals of this story are:-

1. You shouldn't believe all benchmarks on face value.
2. The fastest way to manipulate data in a database is to do as close to the data as possible, i.e. use transact SQL or PL/SQL stored procedures.
3. All software vendors will put spin on their benchmarks in order to make their architectures
and software appear to be the fastest and most scalable on the market.

It is also worth noting that bench marks not using your data loose some meaning when you try to apply them to your specific circumstances.
Tom Kytes once said (perhaps on his blog somewhere) that he believed all the best applications in world used stored procedures or BEGIN ENDs (anonymous PL/SQL blocks). But what if you have an application that does not do this, which tries to do as much in the application server as possible and you couldn't leverage PL/SQL because those skills do not exist in the development team. Even if you did have PL/SQL expertise at hand you would have to effectively re-write your application to leverage this, which you do not have the time and money for. There is still hope, I will cover some of the things you can do in this article.

The main objective is to eliminate latency when accessing data in the database as much as possible. The fastest way to do something is to not do it at all, when using Oracle this means:-
  1. Keep hard parsing to a minimum.
  2. Keep physical reads to a minimum.
  3. Keep latching to a minimum.
  4. Keeping logical reads to a minimum.
  5. Accessing SQL from any language even from PL/SQL carries certain overheads, minimise database access from an application server in order to reduce data access latency and network round trips, assuming that the application server and database server are two seperate machines.

This posting assumes the use of bean managed persistance, i.e. the JDBC API is used explicitly in the Java code and that an Oracle JDBC Driver compatible with the JDBC 2.0 standard is being used. Here are my tips for optimising a J2EE application which uses Oracle, assuming that it uses widget by widget logic (row by row processing) and you do not want to re-write the application, my points should be reasonably comprehensive, but please forgive me if I have missed anything out.

Good Coding Practises

  • Make sure you code is bind valariable friendly, use prepared statements for the same statements that are executed multiple times. Refer to this tutorial from Sun for more information on prepared statements:-

    http://java.sun.com/docs/books/tutorial/jdbc/basics/prepared.html
  • Avoid using the database if you can by caching standing (lookup) data in the middle tier.
  • If there is any application logic that purely uses standing data, after you have cached this perform the processing entirely in the middle tier so as to avoid round trips on the network.
  • If your application is based on processing numbers of dicrete items, lets say 'widgets'. If you have a batch process that processes 10 widgets for example, ensure that statements with execution counts that should relate to the numbers of widgets being processed are not performed more times that the number of widgets.
  • Look for SQL the application is generating that returns no rows or affects no rows, is this redundant SQL ?, can the application logic be changed so that this does not need to be executed.
  • JDBC resources, namely result set, statement and connection handles should always be released when they are no longer required, use v$open_cursor in order to track down cursor 'leaks'.

Intelligent Use Of SQL, PL/SQL and JDBC

  • Use the JDBC batching API to execute multiple statements in one batch. This effectively puts a 'BEGIN' and 'END' around multiple statements. The Oracle JDBC driver supports two flavours of this, the standard JDBC 2.0 interface and an Oracle proprietry interface, refer to:-

http://download-west.oracle.com/docs/cd/B19306_01/java.102/b14355/oraperf.htm

  • Look at using things such as in line functions, subquery factoring and merge statement to achieve in one piece of SQL what you might otherwise require multiple statements for.
  • If you frequently create the same result set and perform application logic against it multiple times, look at aggregating this into a working table.
  • From 8i onwards Oracle has the facility to support Java stored procedures, can some of the data intensive porocessing be moved closer to the database using these, i.e. leverage the existing Java skills in the team.
  • If your application is designed around widget by widget processing logic, it's probably performing a fair amount of index range scanning. Look at making these more efficient by rebuilding larger more frequently accessed indexes with a larger database block sizes, this significantly reduces latching around the buffer cache, i.e. larger blocks = less logical reads + larger leaf blocks = less branch blocks in the index B-tree. Also consider index compression.
  • Leverage the JDBC features for traversing result sets, if for example you use the queries:-

    SELECT empno FROM emp WHERE deptno = 10

    and

    SELECT MAX(empno) FROM emp WHERE deptno = 10

    Within close proximity in your application you might be able to do away with the second query by using the JDBC API to navigate around the result set returned by the first query.
  • Use set the fetch size array in JDBC in order to minimise network round trips when retrieving multiple multiple rows from a query.
  • The chances are that any batch processes in you widget by widget processing software achieves it's throughput via threading, if this is the case you may experience significant log file sync waits (waits on the contents of the log buffer being written to disk). This particular bottleneck can be eliminated by using the 10g asynchronous write to redo log files feature via the COMMIT_WRITE initialisation parameter. However, be advised that there are certain resiliency remaifications to using asynchronous writes to the redo logs.
  • Use connection pooling, all J2EE applications servers support JDBC connection pooling, reusing existing connections is an order or magnitude faster than having to create a connection every time a statement needs to be executed. Also set the minimum and maximum number of connections to the same value in order to prevent connection storms, i.e. bursts of requests from the application server to the database for the creation of connections.

Move The Data Closer To The Application Server => Reduce Data Access Latency

  • Leverage the Oracle 11g client side results cache in order to perform SELECTS without incurring the cost of jumping out of the application server into the database, however be aware that in 11g release 1, ANY changes to the data in the base tables upon which a result set is dependant will invalidate the results in the results cache, will this restriction be lifted in 11g release 2 ?.
  • A better option than using the results cache in 11g may be to build you own cache off the back of the database change notification mechansim available via the 11g JDBC driver for both 11g and 10g databases.
  • If you have more exacting caching requirements look at a caching software specifically for this purpose such as Oracle Coherance, IBM WebSphere DataGrid, Gigaspaces etc . . .

Further reading:-

JDBC 2.0 Fundamentals from Sun http://java.sun.com/developer/onlineTraining/Database/JDBC20Intro/

Oracle 10.2 JDBC Developers Guide And Reference
http://download.oracle.com/docs/cd/B19306_01/java.102/b14355/toc.htm

1 comment:

Anonymous said...

Well written article.

 
1. 2.