Fast access to data ranks alongside high availability as a key reason for enterprise architects to consider a cluster-based deployment, or even a cluster-centric architecture. The general concern is that databases are considered "slow," or at least the limiting factor in an application's throughput.
Jags Ramnarayan, chief architect of Gemstone Systems, summarized this concern in an interview with Artima:
A typical scenario is that you have events coming in at a very high rate from multiple data sources, but for the application to act on those events, it almost always needs contextual data... What you [often] see is a complex workflow that involves events coming in..., and these events have to be correlated with data that typically resides in databases before you can take an action.
A classic problem people face is that the events are coming in at a very high rate... every time an event comes in, it lands [for example] in a JMS topic, an application listener gets invoked, the application turns around, and accesses one or more databases to look at reference data corresponding to the event before [the system] can act on the event.
The limiting thing here is the point at which I can go to the database. Maybe my messaging bus can deliver something like 5,000 events a second. But if I can't go to the database at the same speed, I [am limited to] operating at the speed of the database, which is typically less than 1,000 [transactions] a second.
Jags Ramnarayan, chief architect of Gemstone Systems, talks about database scalability. (3 minutes 5 seconds)
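The bottleneck described in the interview can be illustrated with a little arithmetic. The sketch below is only a model, not anything from Gemstone's products: the function names are hypothetical, the 5,000 events per second figure is the one quoted above, and 1,000 database operations per second is taken as the upper bound of the "less than 1,000" estimate. It assumes each event needs a fixed number of database lookups before it can be acted on.

```python
def effective_throughput(bus_events_per_sec, db_ops_per_sec, lookups_per_event=1):
    """Events per second the pipeline can sustain.

    The pipeline runs at the speed of its slowest stage: either the
    messaging bus, or the database divided by the lookups each event needs.
    """
    db_bound_events_per_sec = db_ops_per_sec / lookups_per_event
    return min(bus_events_per_sec, db_bound_events_per_sec)


def backlog_after(seconds, bus_events_per_sec, sustained_events_per_sec):
    """Unprocessed events that pile up if the bus keeps delivering at full rate."""
    surplus_per_sec = max(0, bus_events_per_sec - sustained_events_per_sec)
    return surplus_per_sec * seconds


# Figures quoted in the interview (the database figure is an upper bound).
BUS_RATE = 5000
DB_RATE = 1000

sustained = effective_throughput(BUS_RATE, DB_RATE)
backlog = backlog_after(60, BUS_RATE, sustained)
print(f"sustained: {sustained:.0f} events/s, backlog after 60 s: {backlog:.0f} events")
```

Under these assumptions the application is capped at the database's rate, and any sustained burst at the full bus rate accumulates unprocessed events. If an event needs more than one lookup, the cap drops proportionally.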
It would be tempting to accept the premise that databases are relatively slow, if only because it's an often-heard statement at developer conferences and in marketing literature. But just how slow are databases, really?
A variety of standard database transaction-processing benchmarks are maintained by the vendor-independent Transaction Processing Performance Council (TPC). Perhaps the best known of these is TPC-C, which measures the performance of an order-entry transaction workload.
At the time of this writing, the TPC-C record holder is an Itanium-based HP Superdome server running Oracle's 10g R2 Enterprise database and BEA's Tuxedo transaction-processing monitor. The system was benchmarked at 4,092,799 transactions per minute, which comes out to about 68,000 transactions per second. The TPC-C benchmark also tracks price/performance, the total cost of the system divided by its throughput, which comes to $2.93 per transaction-per-minute on the HP Superdome.
Even granting a wide margin for benchmark-related fine-tuning by the vendors, this HP- and Oracle-based transaction system could still handle many thousands of event processing or Web request processing queries a second (note that the benchmark represents not read-only, but update transactions).
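The conversion and the "wide margin" argument above can be checked with a short calculation. This is a rough sketch: the function names are hypothetical, and the tenfold derating factor is an arbitrary assumption chosen only to represent a generous allowance for benchmark tuning. The 5,000 events per second figure is the messaging-bus rate quoted in the interview.

```python
TPC_C_TRANSACTIONS_PER_MINUTE = 4_092_799  # benchmark result cited above
BUS_EVENTS_PER_SEC = 5_000                 # messaging-bus rate from the interview
ASSUMED_DERATING = 10                      # hypothetical allowance for benchmark tuning


def per_minute_to_per_second(transactions_per_minute):
    """Convert a per-minute throughput figure to per-second."""
    return transactions_per_minute / 60


def derated(transactions_per_second, factor):
    """Throughput left after discounting the benchmark result by `factor`."""
    return transactions_per_second / factor


benchmark_tps = per_minute_to_per_second(TPC_C_TRANSACTIONS_PER_MINUTE)
conservative_tps = derated(benchmark_tps, ASSUMED_DERATING)

print(f"benchmark:    {benchmark_tps:,.0f} transactions/s")
print(f"derated 10x:  {conservative_tps:,.0f} transactions/s")
print(f"keeps up with the bus: {conservative_tps > BUS_EVENTS_PER_SEC}")
```

Even after discounting the benchmark result by an order of magnitude, the remaining throughput is still above the event rate the messaging bus is said to deliver. A different derating factor would of course change that conclusion, which is why it is labeled an assumption.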
Based on your experience with the databases you've worked with, to what degree would you say typical databases really scale in terms of transaction (query and update) throughput? At what level of transaction volume would you think of clustering as a way to increase the system's throughput?
Frank Sommers is Editor-in-Chief of Artima Developer. He also serves as chief editor of the IEEE Technical Committee on Scalable Computing's newsletter, and is an elected member of the Jini Community's Technical Advisory Committee. Prior to joining Artima, Frank wrote the Jiniology and Web services columns for JavaWorld.
Bill Venners is president of Artima, Inc. He is author of the book, Inside the Java Virtual Machine, a programmer-oriented survey of the Java platform's architecture and internals. His popular columns in JavaWorld magazine covered Java internals, object-oriented design, and Jini. Bill has been active in the Jini Community since its inception. He led the Jini Community's ServiceUI project, whose ServiceUI API became the de facto standard way to associate user interfaces to Jini services. Bill also serves as an elected member of the Jini Community's initial Technical Oversight Committee (TOC), and in this role helped to define the governance process for the community.