Summary
Corey Klaasmeyer's recent JavaWorld article provides a detailed introduction to one of the more popular open-source grid toolkits, Globus. In the process, the article defines the differences between grids and clusters, and shows what classes of problems grids are best suited to solve.
The term grid computing has been used, and abused, in many ways in recent years. Grids are often confused with clusters, and even with parallel computing techniques such as the master-worker pattern popularized by JavaSpaces. That confusion threatens to overshadow the true benefits of compute grids that have recently become available for general commercial use, such as Amazon's EC2 and Sun's SunGrid.
Grids solve the more complicated problem of providing computing resources to groups that span organizational boundaries, where resources are spread among the groups... A grid may marshal numerous clusters from different organizations into a logical set of computational resources available to a group of authorized users... Grid services must live in a more complex environment where resources must be shared and secured according to policies that may differ from organization to organization.
In emphasizing the resource-sharing aspects of grids, Klaasmeyer reiterates grid researcher Ian Foster's three-point definition of a grid:
Coordinates resources that are not subject to centralized control
Uses standard, open, general-purpose protocols and interfaces
Delivers nontrivial qualities of service
The most interesting parts of Klaasmeyer's article describe the open-source Globus Toolkit, and provide an example of using Globus's Java implementation to expose a grid service.
As the examples accompanying the article show, implementing a Globus service is more involved than writing a plain Java class, because Globus relies on XML-based Web service technologies to describe, locate, and invoke grid services. Once the Java interface representing the service is defined, the implementation class must also declare that it implements Globus's Resource interface. The implementation is then registered with the Globus grid runtime through a deployment descriptor and a WSDL file. Clients can locate and invoke the grid service using information in the WSDL.
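The shape of those steps can be sketched in plain Java. This is only an illustration under stated assumptions, not code from Klaasmeyer's article: the Resource interface below is a locally declared stand-in for the Globus one so that the sketch compiles without the toolkit, and CounterService, CounterServiceImpl, and ServiceRegistry are hypothetical names. The in-memory registry merely stands in for the registration and lookup that, in a real deployment, go through the deployment descriptor and the WSDL file.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Globus's Resource interface, declared locally so the sketch
// compiles without the toolkit (assumption: used here as a plain marker).
interface Resource {}

// Hypothetical service interface: the operations the grid service exposes.
interface CounterService {
    int add(int amount);
    int getValue();
}

// The implementation class implements the service interface and also
// declares that it is a Resource.
class CounterServiceImpl implements CounterService, Resource {
    private int value;

    @Override
    public synchronized int add(int amount) {
        value += amount;
        return value;
    }

    @Override
    public synchronized int getValue() {
        return value;
    }
}

// Hypothetical in-memory registry. In a real deployment this role is played
// by the grid runtime, configured by a deployment descriptor and a WSDL file.
class ServiceRegistry {
    private final Map<String, Resource> services = new HashMap<>();

    void register(String name, Resource service) {
        services.put(name, service);
    }

    <T> T lookup(String name, Class<T> type) {
        Resource found = services.get(name);
        if (found == null) {
            throw new IllegalArgumentException("No service registered as " + name);
        }
        // Fails with ClassCastException if the service lacks the requested interface.
        return type.cast(found);
    }
}

class GridServiceSketch {
    public static void main(String[] args) {
        ServiceRegistry registry = new ServiceRegistry();
        registry.register("CounterService", new CounterServiceImpl());

        // The client sees only the service interface, never the implementation.
        CounterService counter = registry.lookup("CounterService", CounterService.class);
        counter.add(5);
        counter.add(3);
        System.out.println(counter.getValue());
    }
}
```

What the sketch leaves out is the source of most of the extra complexity: in Globus the client and the service run in different processes, usually in different organizations, so the lookup and the method calls travel as Web service messages described by the WSDL rather than as local method calls.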
In the article, Klaasmeyer notes that most grids exist in a highly distributed environment, and that large-scale distribution requires some level of complexity.
Globus is but one of many grid computing toolkits. For example, Amazon's EC2 compute cloud provides a compute resource for rent, with its own Web service-based interface for registering and managing grid nodes.
Today, much of the focus in scalable enterprise computing centers on clusters and other forms of parallel architectures that operate within the confines of an organization's firewalls. What sorts of enterprise computing tasks do you think grids might be a better fit for?