Cloning is an advanced technique for improving the performance and availability of an IBM WebSphere Application Server (WAS). Cloning allows the workload management system to transparently balance the application server workload among the clones in the server group and automatically switches users from failed application instances to active clones with no interruption in service. The transparency of cloning to the user makes cloning an invaluable technique for maintaining a highly efficient and reliable production environment, but how can system administrators and developers tell their WebSphere clones apart from each other when it becomes necessary to identify a specific server? This article discusses techniques for uniquely identifying clones within WebSphere Application Server Advanced Edition 3.5.x and appropriate strategies for their application. For a full description of the usage and configuration of clones, consult the IBM Redbook "WebSphere Scalability: WLM and Clustering." This article explains why good programming practice dictates that cloned Web applications running in a production environment should not depend on knowing the identity of the clone they are running on. In a development environment, however, the specific identity of a clone can be useful in a variety of ways, and this article provides developers with information, strategies, and tools for distinguishing between individual WebSphere clones operating in a variety of configurations.
What are clones?
Clones are WebSphere Application Servers that appear identical to the user. When the user selects a Web application, the WebSphere workload management system determines which clones are available to run the application and dispatches the request to the most available clone. A developer-selectable algorithm determines which clone counts as the most available.
Cloning with the WebSphere Administration System is a simple process. The system administrator creates a model of the desired application server. This model contains all of the information that an application server would, but it exists only as a data set in the administration repository. It is not associated with any specific node or physical system, nor is it running as a process. Depending on the architecture that the system administrator implements, one or many machines will instantiate copies of the model. The two major architectural design strategies used in relation to cloning are vertical and horizontal.
Vertical cloning instantiates multiple copies of the application server model on the same node. What is the benefit of creating multiple clones of the same application server on the same node? WAS is based on the Sun Java specification and executes Java code. In fact, each application server runs its own Java virtual machine. Due to limitations inherent in the current version of the virtual machine specification, each Java virtual machine can make use of only a single processor. Therefore, in a multiprocessor environment, running multiple Java virtual machines increases efficiency, because a single Java virtual machine cannot make use of all the available processing capabilities. Even in a single-processor environment, a single Java virtual machine may not use all of a processor's resources. Vertical cloning solves this problem.
In a vertically cloned environment, multiple copies of the same application server (running as multiple Java virtual machines) execute on the same machine, making maximum use of available system resources, whether a single processor or multiple ones. Because the vertically cloned application servers are identical, the workload management system can pick the most available server and direct the next request to idle or under-utilized system processors.
Horizontal cloning instantiates multiple copies of an application server model on different physical machines. This is similar to the vertical cloning scenario, but the processors, instead of being in a single multiprocessor machine, are distributed across multiple machines. The WebSphere workload management system can still route the next request to the most available system. Unlike vertical cloning, this configuration provides the added benefit of application failover. If one machine shuts down, for whatever reason, the application continues to be available on the other application server clones with no apparent change in application availability to the user. It is this redundancy that makes horizontal cloning a valuable technique for achieving a high-availability configuration and providing uninterrupted service for critical business environments.
Horizontal and vertical cloning can be combined to gain both the benefit of failover and maximum utilization of system resources. An environment with multiple machines, some of which have multiprocessor boards, is a prime example. Combining these styles of cloning does not, however, change the critical need for transparency for production applications or the techniques by which the clones can be distinguished.
Transparency and distinguishing clones
Users who request a Web application cannot distinguish between clones, because operation of the clones has been deliberately made transparent to users to provide a seamless high-performance, high-availability environment. Maintaining this transparency between clones in a production environment is absolutely vital. The WebSphere workload management system assumes that clones from the same model are identical in every detail and relies on this fact when dispatching user requests to cloned servers. If there are any differences between the cloned servers, such as dependencies on the application server identity, the workload management software (blindly trusting in the benevolence of the application programmer) may send the request to the wrong server. When this happens, the assumptions necessary to maintain transparency have been violated, and WAS is likely to respond with unexpected and difficult-to-trace error messages.
Many inexperienced developers design applications that recognize which clone they are executing on. This invites conflict with the WebSphere workload management system, because applications that do this perform differently as they are dispatched to different clones. The developer has unintentionally introduced a difference in program behavior between cloned servers. This may cause no problem at all, or it may produce an intermittent, hard-to-trace fault: the application behaves normally in a test environment but fails in a busy production environment when multiple clones are active. In many instances, such faults resist simple fixes and require a restructuring of the Web application's design. Thus, a production-level application should not depend on the ability to uniquely identify the clone on which it is executing. A development environment, on the other hand, encounters a different set of issues from those of a production environment.
Certain types of performance analysis, such as testing workload distribution or application server failover, routinely require the ability to identify specific clones on which the code is executing. WAS's workload management system supports multiple options for the algorithm for selecting the most appropriate application server to process a request. For developers to test these different options and determine which suit their own application best, developers must differentiate between clones to determine how and where requests are being dispatched and processed. Testing failover capabilities to ensure user transparency is another area where the ability to distinguish between clones is a necessity.
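As a rough illustration of the kind of workload-distribution test described above, the sketch below tallies how many requests each clone handled. It assumes a test client that receives some clone identifier string with each response (how that identifier is produced is covered later in this article). The class name CloneTally and its methods are hypothetical, and the code uses modern Java syntax, so it is meant for a client-side test harness rather than for the server itself.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical test-harness helper: counts responses per clone identifier.
final class CloneTally {
    // Thread-safe so that concurrent test threads can record results.
    private final ConcurrentHashMap<String, AtomicLong> counts = new ConcurrentHashMap<>();

    // Record one response that was served by the clone with the given identifier.
    void record(String cloneId) {
        counts.computeIfAbsent(cloneId, k -> new AtomicLong()).incrementAndGet();
    }

    // Return a sorted copy of the counts, suitable for printing after a test run.
    Map<String, Long> snapshot() {
        Map<String, Long> out = new TreeMap<>();
        counts.forEach((id, n) -> out.put(id, n.get()));
        return out;
    }
}
```

After a test run, comparing the counts in snapshot() against the expected behavior of the selected workload management option gives a quick view of how requests were actually spread; a clone whose count stops growing after it is shut down, while the others keep growing, is one way to observe failover.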
Distinguishing horizontal clones
For a developer to distinguish clones programmatically, the clones must have some unique identifier that the developer's code can access. As will be shown, the more advanced techniques used to distinguish vertical clones also work in a horizontally cloned environment. Developers can accomplish this task much more easily in a horizontally cloned architecture, however, because each machine uses a unique IP address for network communication. As long as only one clone runs on each machine, the IP address uniquely identifies the application server clone that runs on that machine. A Java program, such as a servlet or JavaServer Page (JSP), can access the address through the InetAddress class in the java.net package. The following code fragment provides the essential structures:
1: InetAddress inet = InetAddress.getLocalHost();
2: String hostAddr = inet.getHostAddress();
3: String hostName = inet.getHostName();
4: int hostHash = inet.hashCode();
InetAddress does not have a public constructor; instead, static methods return a reference to an existing IP address. Line 1 uses the static method getLocalHost() to return the IP address of the local host in an InetAddress object. While a developer can store an InetAddress object and compare it to other objects using the equals(Object obj) method, lines 2 through 4 present this object in more usable forms. The getHostAddress() method on line 2 returns a String object containing the familiar four-byte dotted IP address (for example, 127.0.0.1). This address can also be returned as a four-element byte array by using the similarly named getAddress() method. Line 3 uses the getHostName() method to return a String containing the host name associated with the address. The hashCode() method in line 4 returns an int that represents the 32-bit address as a single integer value. Though any of the results (hostAddr, hostName, hostHash, or even inet) identifies the machine on which the code executes, hashCode() provides a single integer that can be stored and compared with nominally greater efficiency than the other object types. Note that the toString() method of InetAddress, which executes automatically when an InetAddress object is included as part of a String, returns the results of getHostName() and getHostAddress() joined by a slash.
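The calls described above can be wrapped in a small, self-contained class, shown here as a minimal sketch. The names HostIdentity and describeLocalHost are illustrative and are not part of WebSphere. Because getLocalHost() declares UnknownHostException, the sketch handles that case explicitly. On some systems the local host name resolves to a loopback address, in which case the result would not distinguish one machine from another, so it is worth checking the output on each node before relying on it.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative helper (hypothetical name): reports which machine the code is running on.
final class HostIdentity {
    private HostIdentity() {}

    // Returns "hostName/hostAddress" for the local machine,
    // or the fixed string "unknown-host" if the lookup fails.
    static String describeLocalHost() {
        try {
            InetAddress inet = InetAddress.getLocalHost();
            return inet.getHostName() + "/" + inet.getHostAddress();
        } catch (UnknownHostException e) {
            return "unknown-host";
        }
    }

    public static void main(String[] args) {
        System.out.println("Served by: " + describeLocalHost());
    }
}
```

In a development servlet or JSP, the returned string could be written into the response or a log entry so that a tester can see which horizontal clone handled each request.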
Distinguishing vertical clones
The developer has to work a little harder to distinguish between clone instances in a vertically cloned environment. There is no documented, programmatic method to accomplish this goal. Developers will realize, however, that there must be some way to identify clones, because WebSphere's workload management system itself must identify and distinguish between clones to effectively perform workload balancing and dispatch requests to specific clone instances. Also, WAS comes with sample programs, the BeenThere bean and showCfg, that identify the clone that the servlet happens to be executing on. Many developers find that these system utilities supply enough information for their performance analysis needs, without having to change or recode anything. However, some developers may find that they need more functionality to accomplish their goals.
To design Java-based code that differentiates between clones, first understand how the WebSphere workload management system accomplishes clone identification. The WebSphere workload management system distinguishes between any group of clones, vertical or horizontal, by using a combination of the queue name and clone index. The queue name represents the group of application servers to which a clone belongs. A stand-alone application server (not cloned) is the only entry in its server group, while a clone exists as one of many within its server group. All entries within a server group must be identical. No server group should contain clones from different models or a clone and a stand-alone server. The clone index represents the individual clone within a particular server group. Thus the combination of queue name and clone index values provides unique identification for any clone (horizontal or vertical) or stand-alone server.
However, if the developer must deal with both horizontal and vertical cloning, the clone index on the horizontal clones must be adjusted manually for the clone index to remain unique. The clone index defaults to 1 for the first instance of a clone on a given machine and increases by one for each additional clone. In a horizontally cloned environment (multiple machines) this default index may not be unique. Thus, the system administrator should stop the clones and assign each a unique index, preferably in numeric order. For purely vertical cloning, this manual adjustment is not necessary.
How can a developer retrieve the same information programmatically? The following code fragment describes the process:
1: ServletEngine eng = ServletEngine.getEngine();
2: ServletEngineInfo engInfo = eng.getInfo();
3: String xptName = engInfo.getActiveTransportName();
4: TransportInfo xptInfo = engInfo.getTransportInfo(xptName);
5: Properties properties = xptInfo.getArgs();
6: String cloneindex = properties.getProperty("cloneIndex");
7: String queuename = properties.getProperty("queueName");
8: String uniqueID = queuename+"/"+cloneindex;
The servlet engine is what executes the servlets and JSPs that make up the application. This includes the Java virtual machine that the code is running on. In the WebSphere framework, this is represented by the ServletEngine class, found in the com.ibm.servlet.engine package. Like InetAddress, the ServletEngine class should not be directly instantiated. Instead, the static method getEngine() returns the current instance of the ServletEngine class, as shown in line 1. Line 2 uses the getInfo() method of this instance to retrieve a ServletEngineInfo object. The ServletEngineInfo class, found in the com.ibm.servlet.engine.config package, contains certain information about the ServletEngine object, including a Hashtable representing all of the transports available to the server group.
A transport provides the connection between a clone and the Web server that is sending it information. In line 3, the code calls the getActiveTransportName() method of the ServletEngineInfo object to return the name (as a String) of the current active transport. Line 4 uses the transport name and the getTransportInfo(String name) method of the ServletEngineInfo object to get an instance of the TransportInfo class. The TransportInfo class represents the information available about a particular transport and can also be found in the com.ibm.servlet.engine.config package. To allow the developer access to this information, line 5 uses the getArgs() method to convert the information stored by this instance into a java.util.Properties object. The keys() method can provide an Enumeration object containing all of the keys in the Properties object, but in this case that proves unnecessary. Lines 6 and 7 use getProperty(String key) to retrieve the necessary information. In line 6, the clone index is returned as a String object by using the key "cloneIndex". In line 7, the unique identifier of the server group is returned as a String by using the key "queueName". Line 8 uses concatenation to combine these String objects into a single String that represents a unique code for the current application server clone. This will uniquely represent any server in a vertically cloned environment. As noted above, it will also uniquely represent horizontal clones, if they have been properly configured to have unique clone indices. Thus, for an application server that is vertically cloned or both horizontally and vertically cloned, this technique works best. Note that the classes in the com.ibm.* packages are undocumented and are not formally supported.
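Because the com.ibm.* classes are undocumented and available only inside the server, the final steps can be isolated in plain Java that is testable on its own. The sketch below takes a java.util.Properties object (such as the one line 5 is described as returning) and builds the identifier from the "queueName" and "cloneIndex" keys named in the text. The class CloneIds, its methods, and the idea of prefixing the host address are illustrative assumptions, not WebSphere APIs.

```java
import java.util.Properties;

// Hypothetical helper: builds clone identifiers from transport properties.
final class CloneIds {
    private CloneIds() {}

    // Builds "queueName/cloneIndex"; fails loudly if either key is absent,
    // since a missing key would otherwise yield a misleading "null" in the ID.
    static String fromTransportArgs(Properties args) {
        String queueName = args.getProperty("queueName");
        String cloneIndex = args.getProperty("cloneIndex");
        if (queueName == null || cloneIndex == null) {
            throw new IllegalStateException("transport properties lack queueName or cloneIndex");
        }
        return queueName + "/" + cloneIndex;
    }

    // Prefixes a host address so identifiers remain distinct even when
    // clone indices repeat across machines (an assumption-based workaround
    // for mixed horizontal and vertical cloning).
    static String withHost(String hostAddress, String cloneId) {
        return hostAddress + "/" + cloneId;
    }
}
```

Keeping the property handling separate from the ServletEngine lookup means that only one small piece of code touches the unsupported classes, which should limit the rework needed if those classes change between releases.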
Conclusion
Cloning is a useful technique for expanding the capacity and improving the efficiency and availability of a WebSphere Application Server. This usefulness rests on the effectiveness of the WebSphere workload management system in distributing application requests to maximize processor efficiency, its capability to immediately redirect requests away from a failed application instance, and the ability of the cloned servers to do all of this transparently to the user. To keep this transparency uncompromised, system administrators and developers should not base production application functionality on the ability to distinguish clones. Development and test environments have different goals from a production environment and sometimes have justifiable reasons to set aside the transparency rule and distinguish between clones. In a horizontally cloned environment, programmers can use the IP address or its derivatives to uniquely identify clones. In a vertically cloned environment, they can use a set of undocumented classes starting with ServletEngine to retrieve unique identification data. If the environment is both horizontally and vertically cloned, the developer can combine the two techniques or, for properly configured servers with unique clone indices, use the same technique as for vertically cloned servers.