Summary
Tangosol recently released a set of products aimed at building enterprise data grids. Artima asked Tangosol founder and CEO Cameron Purdy to explain the difference between a second-level cache and a data grid, and to describe the kinds of enterprise problems data grids solve.
Tangosol, a company best known for its Java enterprise data caching products, has released a new offering aimed at building data grids. While most developers are familiar with second-level caches, the data grid is a less familiar concept.
Artima asked Tangosol founder and CEO Cameron Purdy to explain the difference between data caching and a data grid:
Database caching is the ability to access database information transparently through a cache, loading database information as it is needed, and using various technologies to keep it up to date, in memory, so that the application can truly rely on the cached information. Caching, [therefore], is ... significantly more powerful than just keeping local copies of database data. However, even that—keeping local copies of database data—can prove immensely helpful to a significant class of applications.
A data grid is a service available across any number of applications that exposes data to those applications, and that acts as a real-time, transactional, reliable, scalable and highly performant system-of-record. [While] data grids are typically backed by databases, data grids [also] coalesce information from any number of data sources, and manage that data in a manner natural to modern platforms such as Java, .NET, C++, and Ruby. [In addition], data grids host not only the data but the business logic related to that data as well.
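To make the caching half of Purdy's distinction concrete, here is a minimal read-through cache sketch in Java. It is not Coherence's API: the `ReadThroughCache` class, its `loader` function (standing in for a database query), and the `loadCount` counter are hypothetical names chosen for illustration. The sketch shows only the "loading database information as it is needed" part; the technologies Purdy mentions for keeping cached data up to date are reduced here to a manual `invalidate` call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-through cache, for illustration only (not a Coherence class).
class ReadThroughCache<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;              // assumed stand-in for a database lookup
    private final AtomicInteger loads = new AtomicInteger();

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    // Returns the cached value, calling the loader only on a cache miss.
    V get(K key) {
        return entries.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Drops an entry so that the next get() reloads it from the backing source.
    void invalidate(K key) {
        entries.remove(key);
    }

    // Number of times the backing source was actually consulted.
    int loadCount() {
        return loads.get();
    }
}
```

The application calls only `get`; whether the value came from memory or from the backing store is invisible to it, which is the transparency the quote describes.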
Purdy pointed out that, in addition to providing the benefits of a data cache (managing in-memory copies of data near where computation is performed on that data), data grids are increasingly used by enterprises as a system integration tool:
Data grids are specialized for in-memory data access and management, and serve as excellent integration and consolidation points for federating existing enterprise systems, application silos and legacy services... Data grids scale data services like computational grids scale computational services, [and] provide transparent failover, and reliability through transactions.
One reason a data grid supports system integration is that it allows data to be accessed from a diverse array of sources in the form most suitable for a client. While a regular database can be accessed from a diversity of client technologies—JDBC or ODBC, for instance—a grid client can access data from many sources without having to know where that data originates.
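The location transparency described above can be sketched as a small facade. This is an illustrative assumption, not how Coherence is implemented: `FederatedStore`, `register`, and the idea of modelling each backing source as a plain lookup function are all hypothetical. The point is only that the client asks for a key and never learns which source supplied the value.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical facade over several data sources, for illustration only.
class FederatedStore {
    // Each source is modelled as a key -> value lookup that returns null when the key is absent.
    private final Map<String, Function<String, String>> sources = new LinkedHashMap<>();

    void register(String name, Function<String, String> lookup) {
        sources.put(name, lookup);
    }

    // The caller supplies only a key; sources are tried in registration order.
    Optional<String> get(String key) {
        for (Function<String, String> lookup : sources.values()) {
            String value = lookup.apply(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }
}
```

A client might register a database-backed lookup and a legacy-service lookup, then call `store.get("customer:1")` without knowing which of the two holds that record.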
Purdy noted that he expects application servers to support data grids as transparently as they already support data caches:
The types of applications that best utilize data grids are those that benefit from the data being managed in a form natural to the application—such as an object form—and with in-memory latency... The EJB3 programming model should be identical whether working with data grids or with other enterprise data sources. Not all application servers provide that level of transparency today [for data grids], but they will get there.
Tangosol's new Coherence Data Grid Solution Set offers the following features:
Coherence Data Client: Enterprise-wide access to services provided by the data grid.
Coherence Real Time Client: Real-time access to data feeds, including near-caching of data on the client as well as continuous query caching.
Coherence Compute Client: Optimized for data-intensive compute grid nodes as well as transaction intensive application servers.
How often does your enterprise application need to access data from multiple data sources? What techniques do you use to manage such data access, and how do you scale that technique to handle a large number of requests?
Maybe this is unrelated to the topic, but something Cameron wrote in his blog a while ago left me curious. Regarding the famous "Ten Fallacies of Distributed Programming" he said "(...) the "fallacies" paper itself does not reflect a lot of the knowledge and best practices that actual developers building these applications have accumulated." (http://www.jroller.com/page/cpurdy?entry=the_seven_habits_of_highly3)
When I went back and read that link, I thought of this movie that my kids watch called "The Incredibles". One of the funny lines in it is "Everyone's special." The kid replies "That's just a way of saying that no one is special." Arbitrary blanket use of Exceptions can suffer from the same syndrome.