Monday, December 01, 2008

J2EE Layered Application Architectures and Performance

I have spent most of my career as a DBA focusing on performance. With most applications there is an intimate relationship between the design and architecture of the application and its performance and scalability. Generally speaking, if the design and architecture are wrong, getting an application to perform and scale can be akin to applying Elastoplasts over the biggest holes.

As an industry, IT has a patchy record for delivering projects on time and to budget. Many projects barely manage to cross the finishing line having met all of their functional requirements, let alone their performance requirements. As a result, tuning can be reactive and end up becoming a battle against the architecture and design.

There was a time when most applications consisted of a client and business logic written directly in the database using stored procedures. Indeed the database software formed the largest part of the application stack. However, more recent architectural trends have been heavily influenced by:-

  • The internet: Web 2.0 has firmly established itself and Web 3.0 is around the corner
  • Service Oriented Architectures (SOA)
  • Software as a Service (SaaS)
  • Event driven architectures
  • Extreme transaction processing
  • Grid architectures
  • Mash ups

Indeed, there was a time when the database was a focal point of integration efforts; now the industry is moving towards integration via XML based middleware. Here is one such article on this very subject from design and architecture luminary Martin Fowler. In the current climate "integration architects" are highly sought after people.

Cloud computing is on its way and, just as in the old Web 1.0 days (to quote a marketing tag line from Sun: "We were into intranets when most people thought this was a typo"), there will be internal clouds as well. Therefore, if you have an interest in performance and tuning, it is highly likely that you will have some involvement in the areas outside the database, and that you will find the database under-utilised because of what is happening in the layers above it. Cary Millsap made a presentation at Oracle Open World 2007 highlighting the importance of establishing application profiles when tuning.

One of the most prevalent software architectures today uses a combination of layering and tiering, a tier being a layer that is physically deployed on its own server, whereas an ordinary layer is a purely logical division. What is paramount about the tiering / layering design approach is that it provides a "clean separation of concerns". Layering can also be vertical, i.e. swathes of development for particular functional areas which cut through the horizontal layers. Classically, the 'horizontal' layers are:-

  • Presentation

  • Business logic (or domain)

  • Integration

Architects in the finance world prefer tiers, in that each tier can reside in its own DMZ (de-militarised zone) and hence provide maximum security. My interest in application architecture has developed from the bottom up, i.e. from the database upwards. However, the integration layer is not always an RDBMS; it can be:-

1. A web service

2. A message queue

3. A distributed object cache (which may be transactional)

4. An enterprise service bus adapter or message broker

5. A J2EE Connector Architecture (JCA) connection

6. etc . . .

In fact with XA distributed transactions, a transaction can span multiple end points below the integration layer.

Some patterns are almost de facto in certain layers and tiers, such as model view controller in the presentation tier and session facade in the business logic or domain tier, to name but two. There are also frameworks and third party libraries that take a lot of the effort out of having to crank out boilerplate code and the code behind the application infrastructure, e.g.:-


1. Apache Struts and JavaServer Faces for the presentation layer

2. Spring for the whole application or specific layers

3. Log4j for logging

4. Apache Commons, which is a common utilities library

However, whatever you do, you should never code business logic into the presentation layer, or explicitly code details of the integration layer into the business logic or domain layer.
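As a minimal sketch of that rule, the business layer below depends only on an interface, and the integration detail sits behind it. All of the names here (AccountGateway, InMemoryAccountGateway, WithdrawalService) are hypothetical and invented for illustration; the in-memory map merely stands in for a real end point such as JDBC, a message queue or a web service.

```java
import java.util.HashMap;
import java.util.Map;

// Integration-layer contract (hypothetical): the business layer sees only this.
interface AccountGateway {
    long balanceInPence(String accountId);
    void storeBalance(String accountId, long balanceInPence);
}

// One possible integration-layer implementation. The map is a stand-in for
// whatever really sits below the integration layer (RDBMS, queue, service).
class InMemoryAccountGateway implements AccountGateway {
    private final Map<String, Long> balances = new HashMap<>();

    @Override
    public long balanceInPence(String accountId) {
        return balances.getOrDefault(accountId, 0L);
    }

    @Override
    public void storeBalance(String accountId, long balanceInPence) {
        balances.put(accountId, balanceInPence);
    }
}

// Business (domain) layer: rules only, with no SQL, connection handling or
// knowledge of which end point the gateway talks to.
class WithdrawalService {
    private final AccountGateway gateway;

    WithdrawalService(AccountGateway gateway) {
        this.gateway = gateway;
    }

    // Returns the remaining balance, or rejects the request if it breaks a rule.
    long withdraw(String accountId, long amountInPence) {
        long balance = gateway.balanceInPence(accountId);
        if (amountInPence <= 0 || amountInPence > balance) {
            throw new IllegalArgumentException("Invalid withdrawal amount");
        }
        long remaining = balance - amountInPence;
        gateway.storeBalance(accountId, remaining);
        return remaining;
    }
}
```

Because WithdrawalService never names the end point, the gateway implementation could in principle be swapped without touching the business rules.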

When looking at the performance of such architectures, one needs to establish a profile of where particular activities spend their time within the various layers or tiers and the networks connecting any tiers. If a J2EE application server is involved in the application stack, something known as the "carrot model" should be understood. All J2EE application servers use the following set of components, which form a pipeline from the 'front' of the application server to the 'back':-

1. Object request broker used for J2EE clients.

2. Web container

3. Enterprise Java Bean Container

4. JDBC Pool or JCA connectors.

The WebSphere application server has an HTTP server built into the web container; however, this is only really recommended for testing purposes, so you will probably have an HTTP server outside of the J2EE application server. What the "carrot model" represents is that component utilisation is usually quite high at the 'front' of the application server and gradually dwindles as you get towards the JDBC connection pool or JCA connectors, hence the analogy of a carrot: thick at one end and thin at the other. Note that the ORB is only used for J2EE clients; web clients will use the web container HTTP server or an external HTTP server directly.
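The idea of profiling where an activity spends its time across the layers can be sketched roughly as follows. LayerProfile is a hypothetical helper written for this post, not part of any application server or monitoring product, and the layer names passed to it are free-form labels.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper: accumulates elapsed time per layer for one activity.
class LayerProfile {
    private final Map<String, Long> nanosByLayer = new LinkedHashMap<>();

    // Add a measured duration to the named layer's running total.
    void record(String layer, long nanos) {
        nanosByLayer.merge(layer, nanos, Long::sum);
    }

    // Run a piece of work and charge its wall-clock time to the named layer.
    <T> T time(String layer, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            record(layer, System.nanoTime() - start);
        }
    }

    // Total time recorded across all layers.
    long totalNanos() {
        long total = 0;
        for (long nanos : nanosByLayer.values()) {
            total += nanos;
        }
        return total;
    }

    // Fraction of the total spent in the named layer (0.0 if nothing recorded).
    double shareOf(String layer) {
        long total = totalNanos();
        long spent = nanosByLayer.getOrDefault(layer, 0L);
        return total == 0 ? 0.0 : (double) spent / total;
    }

    // The layer with the largest accumulated time, or null if nothing recorded.
    String slowestLayer() {
        String slowest = null;
        long max = -1;
        for (Map.Entry<String, Long> entry : nanosByLayer.entrySet()) {
            if (entry.getValue() > max) {
                max = entry.getValue();
                slowest = entry.getKey();
            }
        }
        return slowest;
    }
}
```

Note that nested calls to time() would count the inner work against both layers, so in this sketch each timed section should cover one layer only. It is also single-threaded; a real profile would need per-request or thread-safe accounting.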

A lot of texts refer to the effort and careful consideration that needs to go into designing the layers of the application's architecture. I would add that the same level of consideration needs to be given to where processing should be carried out. Even in the simplest of architectures, with a J2EE application server on one machine and a database on another, there is scope for the network to become the bottleneck if excessive calls are made to the database to process business logic and to look up standing data. In one example, a stateless bean used to access fixed domain data was the most utilised bean in the application until a standing data cache was implemented in the application server. In another example, a stored procedure was called to validate individual items within individual records. Moving the validation entirely into the application server, using fixed domain data cached there, made the process run faster and require fewer threads than before, with lower CPU consumption on both the server hosting the application server and the server hosting the database.
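A standing data cache of the kind mentioned above might look something like the following sketch. It is not the implementation from either example; StandingDataCache and CountingCurrencyLookup are hypothetical names, and the lookup class simply counts calls so that the saving in round trips is visible.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-through cache for fixed domain (standing) data.
class StandingDataCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;

    // The loader represents the expensive call, e.g. a trip to the database.
    StandingDataCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Loads a value on first use; later requests are served from memory.
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    // Standing data changes rarely, but there should still be a way to reload it.
    void clear() {
        entries.clear();
    }
}

// Stand-in for a database lookup; it counts how often it is really invoked.
class CountingCurrencyLookup {
    private final AtomicInteger calls = new AtomicInteger();

    String nameFor(String code) {
        calls.incrementAndGet();
        switch (code) {
            case "GBP": return "Pound sterling";
            case "USD": return "US dollar";
            default:    return null; // unknown codes are not cached
        }
    }

    int callCount() {
        return calls.get();
    }
}
```

With this in place, repeated requests for the same code reach the lookup only once, which is the effect that took the standing data bean off the top of the utilisation list in the first example.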

Poor performance and scalability can equally be present by design with vertical layering. One particular design trend is to develop specific functional areas with an API implemented using a session facade. Tools such as Apache Axis, and the J2EE application server itself, make exposing stateless session beans as web services easy and straightforward. Consider two separate beans used to implement the APIs for two different functional areas. If these areas are tightly coupled and are to be used in a service oriented architecture, a significant amount of XML to Java serialisation will take place between the two respective web services, which might be avoided if the API were provided by a single bean.

I strongly advise people to read the IBM Redbook "Designing and Coding Applications for Performance and Scalability in WebSphere Application Server". This is one of the best texts I have come across in the area of designing and writing scalable and well performing J2EE applications. If I had my way, this would be mandatory reading material for all developers, designers and architects on J2EE development projects.

In the not too distant future I will post an article on my experiences of the specific things that can stifle utilisation of the database server when a J2EE application is used.
