
Thursday, February 21, 2008

Good Rules Of Thumb For Benchmarking And Tuning

Benchmarking Goals

  • Start out with clearly defined performance goals. Woolly, ill-conceived or even non-existent targets defeat the purpose of any benchmarking exercise.

  • Performance, throughput and scalability goals should be linked to the business aims of the software.

Testing Environment and Load Simulation
  • Always use data which is representative of what the application's data will look like in production.

  • Scale your data in order to understand how performance will be affected as the database grows, if database growth is applicable.

  • Use a realistic time distribution for when SQL statements will hit the database, i.e. ask yourself at what sort of frequency the SQL statements are going to hit the database and how this frequency is distributed over the time window in which the application will be in use.

  • Use a realistic distribution of the SQL statements that are used against the database.

  • Ensure that when benchmarking and tuning your application, your application is the only thing using your server, network and I/O infrastructure. Activity outside of your application will skew your results. SANs are brilliant at skewing results through caching: some SANs queue all I/O until the write part of the cache is flushed to disk once it becomes full, read-aheads can kick in etc . . .

  • When using load simulation tools be aware of 'think' time; if this is too low you can in fact be simulating activity which is an order of magnitude greater than what you are trying to achieve (see the sketch at the end of this list).

  • Leverage Oracle features that allow your environment to be easily reset and tests to be easily reproduced, such as flashback database.
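To illustrate the 'think' time point, here is a minimal sketch in plain Java (no particular load tool is assumed, and the think and service times are made-up figures) showing how sharply the load generated by a single simulated user changes with think time:-

public class ThinkTimeDemo {

    private static final long THINK_TIME_MS = 2000;  // pause between simulated user actions
    private static final long SERVICE_TIME_MS = 100; // stand-in for the real database call

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        int requests = 0;

        // One simulated user issuing requests for roughly ten seconds.
        while (System.currentTimeMillis() - start < 10000) {
            Thread.sleep(SERVICE_TIME_MS); // the work
            requests++;
            Thread.sleep(THINK_TIME_MS);   // the think time between user actions
        }

        // With a 2s think time and 0.1s service time one user generates roughly
        // 1 / 2.1, i.e. about 0.5 requests per second; set the think time to zero
        // and the same loop produces around 10 requests per second, twenty times
        // the load you intended to simulate.
        System.out.println("Requests issued in 10 seconds by one user: " + requests);
    }
}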


Metrics Capture

  • Use meaningful metrics for capturing performance. The ultimate goal of the exercise is to complete certain processing under certain conditions in order to meet certain business goals; therefore you should be able to express some of your findings in terms of figures that are meaningful to the business users of the application.

  • Hit ratios are flawed indicators of performance; there is some PL/SQL on Connor McDonald's http://www.oracledba.co.uk/ web site which illustrates how to set the buffer cache hit ratio to whatever value you like, which supports this point. Jonathan Lewis has made a presentation on the subject of hit ratios, in which he explains the flaws of traditional hit ratios and puts forward the "Fan Hit Ratio" as a superior performance metric, refer to http://www.jlcomp.demon.co.uk/hit_ratios.html. The crux of the matter is that hit ratios lose meaning; latency is just one of the things that they do not capture.

  • Techniques for capturing metrics should not incur overheads so large that they skew your results.


Tuning

  • Tune what you know. This gets reiterated at every UKOUG conference during various performance and tuning related presentations, and justly so. Only tune after you have hard statistics showing where your time is going; do not blindly apply optimisation techniques without knowing where the time is being expended.

  • Make one change to your environment at a time when tuning.

  • When tuning, make changes which correspond to the scope of the problem, e.g. when tuning a query look for changes which are within the scope of the query (object statistics, indexes etc.) rather than initialisation parameter changes. There are obviously parameters that affect optimisation; however, when troubleshooting, your first port of call should always be the places which are local to the problem.

  • When gathering performance metrics and using Oracle 10g, ensure that database time is one of the metrics you collect (a sketch of capturing this from v$sys_time_model follows this list).

  • If the turnaround time for testing your application is considerable, consider (if possible) breaking out certain parts of the application in order to test the effect of certain changes independently before performing the full blown test using the real software.

  • Try to understand which parts of the architecture / infrastructure will throttle the performance of your software.

  • Optimise "single stream" performance as much as possible before scaling out.

  • Once performance goals have been met, document baselines of key performance indicators (document a flight envelope) to provide a tool for production performance troubleshooting.
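As an illustration of the database time point above, here is a minimal sketch of bracketing a test run with queries against v$sys_time_model, in which 'DB time' is accumulated in microseconds since instance startup; the JDBC URL, the credentials and the way the workload is kicked off are purely illustrative, and the connecting user needs SELECT access on the v$ view:-

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class DbTimeCapture {

    private static final String DB_TIME_SQL =
        "SELECT value FROM v$sys_time_model WHERE stat_name = 'DB time'";

    public static long dbTimeMicros(Connection conn) throws SQLException {
        PreparedStatement stmt = conn.prepareStatement(DB_TIME_SQL);
        try {
            ResultSet rs = stmt.executeQuery();
            rs.next();
            return rs.getLong(1);
        } finally {
            stmt.close(); // closing the statement also closes its result set
        }
    }

    public static void main(String[] args) throws Exception {
        Connection conn = DriverManager.getConnection(
            "jdbc:oracle:thin:@dbhost:1521:orcl", "perf_user", "perf_user"); // illustrative

        long before = dbTimeMicros(conn);
        // ... drive the benchmark workload here ...
        long after = dbTimeMicros(conn);

        System.out.println("DB time consumed by the run: "
            + ((after - before) / 1000000.0) + " seconds");
        conn.close();
    }
}
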

Wednesday, February 20, 2008

Preventing Database Abuse By J2EE Application Servers

Some time back in around 2002, Microsoft took the Sun Petstore sample application and rewrote it in .NET. The driver behind this was to demonstrate that .NET was superior to J2EE technology in terms of performance, and this "they did". This created a real furore in the J2EE architect community at the time. Further investigation into what Microsoft had actually done revealed that the .NET layer of the rewritten application was only a very thin wrapper for calls to Transact-SQL stored procedures. Here is a reference to this benchmark:-
The morals of this story are:-

1. You shouldn't take all benchmarks at face value.
2. The fastest way to manipulate data in a database is to do it as close to the data as possible, i.e. use Transact-SQL or PL/SQL stored procedures.
3. All software vendors will put spin on their benchmarks in order to make their architectures and software appear to be the fastest and most scalable on the market.

It is also worth noting that benchmarks not using your data lose some of their meaning when you try to apply them to your specific circumstances.
Tom Kyte once said (perhaps on his blog somewhere) that he believed all the best applications in the world used stored procedures or BEGIN ENDs (anonymous PL/SQL blocks). But what if you have an application that does not do this, that tries to do as much in the application server as possible, and you cannot leverage PL/SQL because those skills do not exist in the development team? Even if you did have PL/SQL expertise at hand, you would effectively have to re-write your application to leverage it, which you do not have the time or money for. There is still hope; I will cover some of the things you can do in this article.

The main objective is to eliminate, as much as possible, the latency incurred when accessing data in the database. The fastest way to do something is to not do it at all; when using Oracle this means:-
  1. Keep hard parsing to a minimum.
  2. Keep physical reads to a minimum.
  3. Keep latching to a minimum.
  4. Keep logical reads to a minimum.
  5. Accessing SQL from any language, even from PL/SQL, carries certain overheads; minimise database access from the application server in order to reduce data access latency and network round trips, assuming that the application server and database server are two separate machines.

This posting assumes the use of bean managed persistence, i.e. the JDBC API is used explicitly in the Java code, and that an Oracle JDBC driver compatible with the JDBC 2.0 standard is being used. Here are my tips for optimising a J2EE application which uses Oracle, assuming that it uses widget by widget logic (row by row processing) and that you do not want to re-write the application. My points should be reasonably comprehensive, but please forgive me if I have missed anything out.

Good Coding Practices

  • Make sure your code is bind variable friendly; use prepared statements for statements that are executed multiple times (see the sketch after this list). Refer to this tutorial from Sun for more information on prepared statements:-

    http://java.sun.com/docs/books/tutorial/jdbc/basics/prepared.html
  • Avoid using the database if you can by caching standing (lookup) data in the middle tier.
  • If there is any application logic that purely uses standing data, then once you have cached that data perform the processing entirely in the middle tier so as to avoid network round trips.
  • If your application is based on processing numbers of discrete items, let's say 'widgets', and you have a batch process that processes 10 widgets for example, ensure that statements whose execution counts should relate to the number of widgets being processed are not executed more times than the number of widgets.
  • Look for SQL the application generates that returns no rows or affects no rows: is this redundant SQL, and can the application logic be changed so that it does not need to be executed?
  • JDBC resources, namely result set, statement and connection handles, should always be released when they are no longer required; use v$open_cursor in order to track down cursor 'leaks' (see the sketch after this list).
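As a minimal sketch of the bind variable and resource handling points above (the widget table, its columns and the WidgetDao class are made up for the example), the method below uses a bind variable so that every execution shares one cursor, and releases its JDBC resources in a finally block:-

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class WidgetDao {

    // The bind variable keeps the SQL text constant; concatenating the id into
    // the string would force a hard parse for every distinct value.
    private static final String FIND_STATUS_SQL =
        "SELECT status FROM widget WHERE widget_id = ?";

    public String findStatus(Connection conn, long widgetId) throws SQLException {
        PreparedStatement stmt = null;
        ResultSet rs = null;
        try {
            stmt = conn.prepareStatement(FIND_STATUS_SQL);
            stmt.setLong(1, widgetId);
            rs = stmt.executeQuery();
            return rs.next() ? rs.getString(1) : null;
        } finally {
            // Release resources in the reverse order of acquisition; cursors that
            // are never closed show up in v$open_cursor, which is the place to
            // look when hunting cursor 'leaks'.
            if (rs != null) rs.close();
            if (stmt != null) stmt.close();
        }
    }
}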

Intelligent Use Of SQL, PL/SQL and JDBC

  • Use the JDBC batching API to execute multiple statements in one batch. This effectively puts a 'BEGIN' and 'END' around multiple statements. The Oracle JDBC driver supports two flavours of this, the standard JDBC 2.0 interface and an Oracle proprietary interface (the first sketch after this list uses the standard interface); refer to:-

http://download-west.oracle.com/docs/cd/B19306_01/java.102/b14355/oraperf.htm

  • Look at using things such as inline functions, subquery factoring and the MERGE statement to achieve in one piece of SQL what you might otherwise require multiple statements for (see the MERGE sketch after this list).
  • If you frequently create the same result set and perform application logic against it multiple times, look at aggregating this into a working table.
  • From 8i onwards Oracle has had the facility to support Java stored procedures; can some of the data intensive processing be moved closer to the database using these, i.e. leveraging the existing Java skills in the team?
  • If your application is designed around widget by widget processing logic, it's probably performing a fair amount of index range scanning. Look at making these scans more efficient by rebuilding larger, more frequently accessed indexes with a larger database block size; this significantly reduces latching around the buffer cache, i.e. larger blocks = fewer logical reads, and larger leaf blocks = fewer branch blocks in the index B-tree. Also consider index compression.
  • Leverage the JDBC features for traversing result sets, if for example you use the queries:-

    SELECT empno FROM emp WHERE deptno = 10

    and

    SELECT MAX(empno) FROM emp WHERE deptno = 10

    Within close proximity in your application you might be able to do away with the second query by using the JDBC API to navigate around the result set returned by the first query (the first sketch after this list computes the maximum while iterating over the first result set).
  • Set the fetch size in JDBC in order to minimise network round trips when retrieving multiple rows from a query (see setFetchSize in the first sketch after this list).
  • The chances are that any batch processes in your widget by widget processing software achieve their throughput via threading; if this is the case you may experience significant log file sync waits (waits on the contents of the log buffer being written to disk). This particular bottleneck can be eliminated by using the 10g asynchronous write to redo log files feature via the COMMIT_WRITE initialisation parameter. However, be advised that there are certain resiliency ramifications to using asynchronous writes to the redo logs.
  • Use connection pooling; all J2EE application servers support JDBC connection pooling, and reusing existing connections is an order of magnitude faster than having to create a connection every time a statement needs to be executed. Also set the minimum and maximum number of connections to the same value in order to prevent connection storms, i.e. bursts of requests from the application server to the database for the creation of connections.
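Here is a minimal sketch of two of the round trip savers above: standard JDBC 2.0 statement batching for repeated DML, and an enlarged fetch size combined with deriving the maximum from the result set already in hand so that the second MAX(empno) query is unnecessary. The widget_audit table is made up, the emp/deptno example comes from the queries above, and the batch and fetch sizes of 100 are illustrative:-

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.List;

public class RoundTripSavers {

    // Standard JDBC 2.0 batching: many executions, one round trip per batch.
    public void recordWidgets(Connection conn, List<Long> widgetIds) throws SQLException {
        PreparedStatement stmt = conn.prepareStatement(
            "INSERT INTO widget_audit (widget_id) VALUES (?)");
        try {
            int inBatch = 0;
            for (Long id : widgetIds) {
                stmt.setLong(1, id.longValue());
                stmt.addBatch();
                if (++inBatch == 100) {  // flush every 100 rows
                    stmt.executeBatch();
                    inBatch = 0;
                }
            }
            if (inBatch > 0) stmt.executeBatch();
        } finally {
            stmt.close();
        }
    }

    // One query, a bigger fetch array, and the maximum computed while iterating,
    // so there is no second "SELECT MAX(empno) ..." round trip.
    public long maxEmpnoInDept(Connection conn, int deptno) throws SQLException {
        PreparedStatement stmt = conn.prepareStatement(
            "SELECT empno FROM emp WHERE deptno = ?");
        try {
            stmt.setInt(1, deptno);
            stmt.setFetchSize(100);  // fetch 100 rows per network round trip
            ResultSet rs = stmt.executeQuery();
            long max = Long.MIN_VALUE;
            while (rs.next()) {
                long empno = rs.getLong(1);
                // ... per-employee processing would go here ...
                if (empno > max) {
                    max = empno;
                }
            }
            rs.close();
            return max;
        } finally {
            stmt.close();
        }
    }
}
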
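And a minimal sketch of collapsing a SELECT followed by an INSERT or UPDATE into a single MERGE statement issued through JDBC, so one round trip and one shared cursor do the work of three; the widget_summary table and its columns are made up for the example:-

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class WidgetUpsert {

    // One statement either updates the existing row or inserts a new one.
    private static final String MERGE_SQL =
        "MERGE INTO widget_summary s "
      + "USING (SELECT ? AS widget_id, ? AS qty FROM dual) src "
      + "   ON (s.widget_id = src.widget_id) "
      + " WHEN MATCHED THEN UPDATE SET s.qty = s.qty + src.qty "
      + " WHEN NOT MATCHED THEN INSERT (widget_id, qty) VALUES (src.widget_id, src.qty)";

    public void addQuantity(Connection conn, long widgetId, int qty) throws SQLException {
        PreparedStatement stmt = conn.prepareStatement(MERGE_SQL);
        try {
            stmt.setLong(1, widgetId);
            stmt.setInt(2, qty);
            stmt.executeUpdate();  // inserts or updates in a single round trip
        } finally {
            stmt.close();
        }
    }
}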

Move The Data Closer To The Application Server => Reduce Data Access Latency

  • Leverage the Oracle 11g client side result cache in order to perform SELECTs without incurring the cost of jumping out of the application server into the database; however, be aware that in 11g release 1 ANY change to the data in the base tables upon which a result set depends will invalidate the results in the result cache (will this restriction be lifted in 11g release 2?).
  • A better option than using the result cache in 11g may be to build your own cache off the back of the database change notification mechanism available via the 11g JDBC driver for both 11g and 10g databases (a simple hand-rolled cache is sketched below).
  • If you have more exacting caching requirements, look at caching software designed specifically for this purpose such as Oracle Coherence, IBM WebSphere DataGrid, GigaSpaces etc . . .
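As a very rough illustration of the 'build your own cache' option, here is a minimal hand-rolled sketch that keeps standing (lookup) data in the middle tier behind a simple time-to-live; the lookup_codes table and the ten minute expiry are made up, and a production version would hang invalidation off the change notification mechanism or a product such as Coherence rather than a timer:-

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StandingDataCache {

    private final Map<String, String> codes = new ConcurrentHashMap<String, String>();
    private volatile long loadedAt = 0L;
    private static final long TTL_MS = 10 * 60 * 1000; // reload every 10 minutes

    public String describe(Connection conn, String code) throws SQLException {
        if (System.currentTimeMillis() - loadedAt > TTL_MS) {
            reload(conn);           // one query refreshes the whole cache
        }
        return codes.get(code);     // every other lookup stays in the JVM
    }

    private synchronized void reload(Connection conn) throws SQLException {
        if (System.currentTimeMillis() - loadedAt <= TTL_MS) return; // another thread beat us
        PreparedStatement stmt = conn.prepareStatement(
            "SELECT code, description FROM lookup_codes"); // illustrative standing data table
        try {
            ResultSet rs = stmt.executeQuery();
            while (rs.next()) {
                codes.put(rs.getString(1), rs.getString(2));
            }
            rs.close();
            loadedAt = System.currentTimeMillis();
        } finally {
            stmt.close();
        }
    }
}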

Further reading:-

JDBC 2.0 Fundamentals from Sun http://java.sun.com/developer/onlineTraining/Database/JDBC20Intro/

Oracle 10.2 JDBC Developers Guide And Reference
http://download.oracle.com/docs/cd/B19306_01/java.102/b14355/toc.htm

Tuesday, February 19, 2008

Why Does The Optimizer Get It Wrong ?

    Here are some of the factors that I know of that can lead to the optimizer developing sub-optimal execution plans.

  • Stale Statistics
    When gathering statistics, knowing the rate at which the base data changes and when it changes is paramount. Some applications, for example, load temporary 'scratch' tables with data, use these tables in subsequent SQL statements and then blow the data away. Dynamic sampling can help here; however, the cost in terms of time added to the parsing of a statement due to dynamic sampling has to be balanced against the amount of time it takes to execute the query. For example, dynamic sampling may be more suitable for data warehouse and reporting applications, where parse times are a small fraction of the total SQL execution time, than for OLTP type applications. Indeed, when, how often and how to gather statistics has provoked a lot of debate within the DBA community, here is the take of one expert on this very subject.

    Tables with columns generated from sequences whose statistics become stale are particularly prone to throwing execution plans out. There is a graph in Jonathan Lewis's "Cost-Based Oracle Fundamentals" book illustrating how the accuracy of predicted cardinalities degrades as values used in equality predicates get increasingly out of range of the column's low and high values.

    Dave Ensor (stalwart of the UKOUG DBMS SIG presentation circuit) once made a comment, dubbed the statistics gathering paradox, which stated that the only time it is safe to gather statistics is when doing so makes no change to execution plans (assuming they are optimal in the first place).

  • Even Column Data Distribution Assumption
    By default the optimizer assumes that column data is distributed evenly and uniformly. This can be corrected by creating histograms; however, you should be aware that if method_opt=>'. . . SIZE AUTO' is used you may find cases where histograms are created when they are not required, histograms are not created when they are needed, and histograms are created with the wrong number of buckets. I found this to be the case with a function based index used in conjunction with a LIKE predicate.


  • Getting Statistics Wrong
    Using estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE in 10g can be less than 100% accurate when determining the number of distinct values for highly skewed columns. Oracle 11g uses a new algorithm to gather more accurate statistics when auto sampling is used; refer to Greg Rahn's blog entry on 11g auto sampling.


  • Maximum Number Of 254 Buckets Per Histogram
    This issue bites when you require skew information on more than 254 distinct values. Some material refers to "popular values" in a histogram; these are values which are bucket end points. In his book "Cost-Based Oracle Fundamentals", Jonathan Lewis provides an example of creating a histogram in favour of supplied popular values. A more extreme solution is to partition tables with more than 254 distinct values.


  • Oracle Bugs
    All software contains bugs; even software produced by world class software houses that adhere to exacting software engineering practices and standards, such as level 5 of the Carnegie Mellon Capability Maturity Model, ships with bugs in it. This is not a slight on Oracle.


    • Bugs in the optimizer cardinality engine


    • Bugs in DBMS_STATS.


    • Costing bugs, plans incorrectly selected.


    • Transformations not being costed; some transformations that were never costed in 9i are now costed in 10g. With every new release of Oracle, new parameters prefixed by underscores (hidden parameters) appear; these usually relate to new optimizer features which are not always costed until the next release appears. All the parameters relating to the CBO can be found by running a 10053 trace. I'm sure that as I write this, the likes of Jonathan Lewis are already on the case working out which new and hidden parameters affect the CBO in 11g and what they do.



  • Selectivity On Columns Used In Expressions
    Until you start using extended statistics with Oracle 11g, the optimizer has no way of determining the correct selectivity for expressions, e.g. TRUNC(col1) = . . . , and this can sometimes lead to sub-optimal plans. 11g allows this to be rectified using (note that CREATE_EXTENDED_STATS is a function, hence the SELECT, and that the expression is wrapped in parentheses):-

    SELECT DBMS_STATS.CREATE_EXTENDED_STATS(ownname => user,
                                            tabname => 'MYTAB',
                                            extension => '(TRUNC(col1))')
    FROM dual;



  • Incorrect Configuration Of The Cost Based Optimizer Environment
    The Oracle Real World Performance Group recommends that certain parameters affecting the cost based optimizer are left at their defaults, refer to Greg Rahn's blog:-

    "I think it’s important to understand what variables influence the Optimizer in order to focus the debugging effort. There are quite a number of variables, but frequently the cause of the problem ones are: (1) non-default optimizer parameters and (2) non-representative object/system statistics. Based on my observations I would say that the most abused Optimizer parameters are:

    OPTIMIZER_INDEX_CACHING

    OPTIMIZER_INDEX_COST_ADJ

    DB_FILE_MULTIBLOCK_READ_COUNT

    Many see setting these as a solution to get the Optimizer to choose an index plan over a table scan plan, but this is problematic in several ways:

    1. This is a global change to a local problem
    2. Although it appears to solve one problem, it is unknown how many bad execution plans resulted from this change
    3. The root cause of why the index plan was not chosen is unknown, just that tweaking parameters gave the desired result
    4. Using non-default parameters makes it almost impossible to correctly and effectively troubleshoot the root cause"


  • Data Correlation / The Predicate Dependence Assumption
    Oracle always assumes that predicates are independent when calculating selectivities and cardinalities. If you have a table with two correlated columns, e.g. salary and tax bracket, and you write a query with predicates that refer to both columns, there is a dependence between the columns which the optimizer will assume does not exist; it therefore works out their selectivities independently when patently they are not independent. At a very high level, the selectivity of two predicates ANDed together is calculated as:-

    selectivity(col1) * selectivity(col2)

    and for predicates ORed together:-

    selectivity(col1) + selectivity(col2) - selectivity(col1 AND col2)

    For example, if the salary predicate has a selectivity of 0.1 and the tax bracket predicate 0.25, the optimizer will use 0.1 * 0.25 = 0.025 for the ANDed pair, even though with perfectly correlated columns the true combined selectivity could be as high as 0.1.

    • In an example in "SQL Tuning By Statistics" Wolfgang Breitling fixes this by setting the column statistic density by hand.


    • Dynamic sampling, first introduced in Oracle 9i, can help with this.


    • Extended statistics in Oracle 11g, Greg Rahn provides an example of this:-
      SELECT DBMS_STATS.CREATE_EXTENDED_STATS(USER, 'CALENDAR', '(MONTH, ZODIAC)') FROM DUAL;

  • Violation Of The Join Uniformity Assumption
    The CBO assumes that there is a one to one correspondence between rows joined from one row source and rows in other row sources. I came across this on Wolfgang Breitling's web site; his paper "Fallacies Of The Cost Based Optimizer" refers to this.




Excellent Sources For Information On Tuning And The Optimizer

Histograms On Columns Containing Unique Data ?

I recently came across an oddity in 10g (10.2.0.3.0) whereby a problem with an execution plan was resolved by creating a histogram on a column containing unique data. This was the query:-





To troubleshoot the issue I ran the query with the gather_plan_statistics hint as per the SQL troubleshooting guide mentioned by Greg Rahn on his blog (www.structureddata.org). This is a technique for finding the root problem of an execution plan based on the predicted cardinality of an operation in the plan being different from the actual cardinality once the SQL has run. Usually the difference between the predicted and actual cardinality will be out by an order of magnitude. I'm not sure who first came across this, but the earliest reference I have found to such a technique is on Wolfgang Breitling's web site (www.centrexcc.com), where Wolfgang refers to it as "Tuning by cardinality feedback".


Once you have obtained your 'enhanced' plan, you look for the innermost operation where the predicted and actual cardinalities differ by a significant margin. In my plan this is line 12:-





The index AD_SERVICE_POINT_IX30 is a function based index and the column SYS_NC00055$ is the pseudo column associated with it. Note that the predicate used with line 12 is a LIKE and that the statistics were gathered on this table by the built-in GATHER_STATS_JOB, i.e. Oracle was left to its own devices to determine which histograms to create and the number of buckets to create them with. After working through the issue with Oracle Support it transpired that the resolution was to create a histogram "by hand" on the function based index with the maximum number of buckets:-

exec dbms_stats.gather_table_stats(null,
'AD_SERVICE_POINT',
cascade=>false,
method_opt=>'for columns SYS_NC00055$ size 254');
This is the plan now that the histogram has been manually created:-




In fact Oracle hadn't created a usable histogram at all: there was a histogram on SYS_NC00055$, but it had only two rows in the user_tab_histograms view for the pseudo column, i.e. all the values were in one bucket, which amounts to no data distribution information. The number of end points in a histogram is the number of buckets plus one; the analogy Jonathan Lewis uses to describe this in his "Cost-Based Oracle Fundamentals" book is that of the number of fence posts you have per fence panel. After a bit more digging by Oracle Support, the explanation for method_opt=>'FOR . . . COLUMNS SIZE AUTO' getting this wrong was that the LIKE predicate is irregular, and as such there was insufficient information in col_usage$ from which DBMS_STATS could determine what sort of histogram to create. col_usage$ is a dictionary table the RDBMS engine uses to record the usage of a column every time a query is parsed. I assume that "LIKE being irregular" refers to the predicate being a special case in terms of it accepting wild card characters as part of its input strings.

I have come across a couple of references in material alluding to the GATHER_STATS_JOB not being infallible. Greg Rahn refers to the defaults as " . . . a good start but not perfect in specific situations . . ." in the "When To Use The Appropriate Database Technology" presentation made at OOW 2007. Richard Foote (http://richardfoote.wordpress.com/) is also not a big fan of method_opt=>' . . . SIZE AUTO', as he claims that this can lead to histograms being created when they shouldn't be and histogram creation being omitted when histograms are genuinely required; this, Richard goes on to say, can lead to excessive latching around the shared pool, and there is also the inference that it can lead to sub-optimal plans.

Further reading:-

 