Replication through a Meta-Object Protocol

Bert Robben(*), Wouter Joosen(+), Frank Matthijs, Johan Van Oeyen, Stijn Bijnens and Pierre Verbaeten

Dept. of Computer Science, K.U.Leuven
Celestijnenlaan 200A
3001 LEUVEN
Belgium
email: bert.robben@cs.kuleuven.ac.be

(*) Researcher of the National Fund for Scientific Research
(+) Researcher of the Flemish IWT

Introduction

Object orientation has become a mainstream approach to software development because it enables software houses to develop applications with an efficient use of manpower. Concurrent object-oriented programming is a very promising area in computer science, because of its expressive power and its natural approach to modelling the real world. It is a widely accepted view that the more closely a model matches the real world, the easier it becomes to understand, modify and maintain the software system that implements this model.

An emerging field in the area of distributed systems is mobile computing. We view a mobile system as a system in which certain parts can become unreachable for a non-negligible period of time. When that period has passed, the part is reconnected. Two additional properties complicate matters further: on the one hand, the part may have changed its state, and on the other hand, it can be reconnected at a different location than the one it was disconnected from. A traditional solution is to install a replication protocol that transparently presents a copy of the disconnected part to the rest of the system and, at reconnection time, executes a merge policy to restore the part's integrity.

This position paper describes how meta-object protocols can be used to implement such replication protocols in an environment for mobile users. After a short description of the target environment, it is shown how meta-object protocols help to achieve a desirable separation of concerns between application and replication protocol. The benefit of this approach is reusability.
The domain in which this can be applied is much broader, however. The paper further outlines how this approach can be scaled to achieve fault tolerance. Finally, we give a short comparison with other work in this area and conclude.

Mobile Objects in CORRELATE

CORRELATE is a class-based concurrent object-oriented language. An important concern in such a language is the synchronisation of objects. Depending on the state of the object, a specific operation may or may not be executed. In other words, in a certain state, an object can accept only a subset of its entire set of operations in order to maintain its internal integrity. In CORRELATE, this is expressed with synchronisation preconditions: an operation on an active object is blocked by the language runtime until its associated precondition becomes true.

In CORRELATE, an object is a unit of concurrency. Only one operation can be active on a single object at any given moment. Object interaction takes place by sending messages (invoking operations). CORRELATE supports both synchronous and asynchronous message passing. The application programmer must explicitly specify which kind of message passing (s)he prefers.

The underlying execution environment is built for use in a distributed setting. It delivers two important basic mechanisms. The first is location-independent object invocation, which allows CORRELATE objects to interact regardless of the address space they live in. The second is migration: active objects are mobile and can be migrated to other address spaces. We have built a prototype of this language framework running on DEC Alpha, Sun Solaris and SGI.

Replication Protocols in a Mobile Environment

When the distributed objects described above become mobile, additional functionality is required. Some objects will not be reachable for a certain period of time, and when they reappear, they might have changed. Several policies are possible to deal with this situation.
An example policy is to make a copy of the disconnecting object and to use this copy as a read-only object. Write requests are suspended until the original object is back online. But this is only an example. Two fundamental requirements of a replication system are versatility and reusability: the system should allow for a variety of replication protocols, and a given replication policy should be reusable for many applications.

The meta-object protocol of CORRELATE supports such a system. A default meta-object views its base-level object as an abstract process that is created, that sends and receives messages, and that is eventually destroyed. Through this protocol, a meta-object has sufficient control over the application object to allow a wide diversity of replication policies. Moreover, the strict separation of concerns between base and meta level promotes reusability. The following code fragment shows the interface of the default meta-object. A more in-depth discussion of our MOP can be found in [Joosen].

    active MetaObject {
    autonomous:
      void Activate();                            // process one message from the incoming message queue
    interface:
      void Construct(ConstructorMessage msg);     // create object
      void Destroy(DestructorMessage msg);        // destroy object
      void MessageIn(InvocationMessage msg);      // put in incoming message queue
      void AMessageOut(InvocationMessage msg);    // forward to receiver
      void SMessageOut(InvocationMessage msg);    // forward and become BLOCKED
      void End(InvocationMessage msg);            // become READY
      void ReplyMessage(InvocationMessage msg);   // accept and become RUNNING
    behaviour:
      bool IsUNINITIALIZED();
      bool IsREADY();
      bool IsRUNNING();
      bool IsBLOCKED();
      for Activate() precondition IsREADY();
    };

An example replication policy

The next example makes the previous paragraph more concrete. The idea is as follows: when the user wants an object to go off-line, a Disconnect call is made on the meta-object.
The meta-object writes the data of its application object into a bitstream (e.g. a file) and enters the DISCONNECTED state. In that state, not all incoming messages are processed: only read-only messages are eligible for selection. When the object is reconnected after a certain period of time, the meta-object replaces the old copy of its application object with the new version and resumes processing in the standard way, i.e. allowing writer operations as well. This is just a simple example of a replication policy [Mullender]. More complex policies that allow write operations while the object is off-line are possible as well. In such a case, the Reconnect operation executes a merge operation that restores consistency to the object.

    active SimpleMetaObject : public MetaObject {
    autonomous:
      void Activate() {
        // process a message from the incoming msg queue
        if (IsCONNECTED())
          TakeMessageFromQueueAndExecuteIt();
        else
          TakeReadOnlyMessageFromQueueAndExecuteIt();
      }
    interface:
      ...
      void Disconnect(Bitstream*);
      void Reconnect(Bitstream*);
    behaviour:
      bool IsCONNECTED();
      ...
    };

Notice that throughout the example no explicit reference is made to the exact type of the application object. This ensures reusability across different applications.

An important property of the meta-objects in CORRELATE is that they are active objects and as such enjoy the benefits of the language framework. One example of this is method synchronisation. In the previous example, a precondition could be attached to the Disconnect operation. That way, the meta-object can ensure that the application object is only disconnected in a stable state. Another example is when an application object needs to be reconnected at another location. Since the meta-object is an active object, it can simply be migrated to the destination location just before reconnection.
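
The policy described in this section can also be sketched outside CORRELATE. What follows is a minimal, single-threaded Python sketch under stated assumptions, not the CORRELATE implementation: the Message and Account classes, the JSON byte string that stands in for the bitstream, and the is_stable check that stands in for a synchronisation precondition on Disconnect are all hypothetical names introduced for illustration. Where CORRELATE would block the caller until the precondition holds, the sketch simply raises an error.

```python
import json


class Message:
    """Hypothetical invocation message: method name, arguments, read-only flag."""
    def __init__(self, method, args=(), read_only=False):
        self.method = method
        self.args = args
        self.read_only = read_only


class Account:
    """Hypothetical application object; it knows nothing about replication."""
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

    def get_balance(self):
        return self.balance


class SimpleMetaObject:
    """Single-threaded stand-in for the SimpleMetaObject of this section."""
    def __init__(self, base):
        self.base = base          # the application object, of any type
        self.queue = []           # incoming message queue
        self.connected = True
        self.results = []         # return values of executed messages

    def message_in(self, msg):
        self.queue.append(msg)

    def is_stable(self):
        # Assumed notion of "stable": no writer is waiting in the queue.
        return all(m.read_only for m in self.queue)

    def disconnect(self):
        # CORRELATE would block until the precondition holds; we raise instead.
        if not self.is_stable():
            raise RuntimeError("precondition for Disconnect does not hold")
        self.connected = False
        return json.dumps(vars(self.base)).encode()   # the "bitstream"

    def reconnect(self, stream):
        # Replace the old state with the version that comes back.
        self.base.__dict__ = json.loads(stream.decode())
        self.connected = True

    def activate(self):
        """Execute one eligible message; return False if none is eligible."""
        for i, msg in enumerate(self.queue):
            if self.connected or msg.read_only:
                self.queue.pop(i)
                method = getattr(self.base, msg.method)
                self.results.append(method(*msg.args))
                return True
        return False
```

As in the CORRELATE fragment, the meta-object never refers to the type of its application object: it only forwards method names and copies state. While disconnected, a queued deposit stays in the queue and a queued get_balance is served from the local copy; after reconnect, the deposit is applied to whatever state came back.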

A More General View

In a sense, disconnecting an object can be considered a special case of the more general idea of failure of an object. The major difference, and additional difficulty, with fault tolerance is that failures, unlike disconnections, are not announced, and thus a much more thorough approach is required. We are exploiting the power of our protocol in this wider area of fault tolerance as well, focusing not only on replication but also on checkpointing and transactions.

Using our MOP, we are now building a set of protocols that enable an application object to tolerate a failure. Depending on the kinds of failure that should be tolerated on the one hand, and the performance penalty of introducing this extra protection on the other hand, the application programmer chooses the policy that is most appropriate for the situation at hand. The definition of our MOP, and the strict separation of concerns it enables, is the cornerstone of this strategy.

Related work

Some existing work in this area [Stroud, Fabre] shows that this approach is feasible. Both of these references use the sequential object model of OpenC++ [Chiba] as MOP. Properties of active objects are thus not available at the meta-level and have to be created by the application programmer. Our focus, on the other hand, is on applications in a concurrent and distributed environment, and as such we inherently start from a concurrent object model and a concurrent MOP. We believe that the task of building powerful meta-objects is made easier by the higher-level language support of CORRELATE.

Summary

This position paper discusses our view on the applicability of a meta-object protocol to replication policies and, more generally, to fault-tolerance algorithms. We believe that objects in an environment for mobile users can profit from this technique as well.
The mobility and replication workshop at OOPSLA provides an ideal opportunity for us to validate the generality of our approach and to get a better view of new work in the area of mobile computing.

References

[Joosen] Joosen, Robben, Van Oeyen, Matthijs, Bijnens and Verbaeten. Developing Distributed Applications using the CORRELATE MOP. Dept. of Comp. Science, KULeuven Belgium, technical report.

[VanOeyen] Johan Van Oeyen, Stijn Bijnens, Wouter Joosen, Bert Robben, Frank Matthijs and Pierre Verbaeten. A Flexible Object Support System as Runtime for Concurrent Object-Oriented Languages. In Chris Zimmermann, editor, "Metaobject Protocols". CRC Inc., May 1996.

[Chiba] Shigeru Chiba and Takashi Masuda. Designing an Extensible Distributed Language with a Meta-Level Architecture. In "Proceedings of ECOOP'93", pages 482-501. Lecture Notes in Computer Science, Springer-Verlag, July 1993.

[Stroud] R.J. Stroud and Z. Wu. Using Metaobject Protocols to Implement Atomic Data Types. In "Proceedings of ECOOP'95", pages 168-189. Lecture Notes in Computer Science, Springer-Verlag, Aug. 1995.

[Fabre] Jean-Charles Fabre, Vincent Nicomette, Tanguy Perennou, Robert Stroud and Zhixue Wu. Implementing Fault Tolerant Applications using Reflective Object-Oriented Programming. In "Proceedings of the 25th IEEE International Symposium on Fault-Tolerant Computing", Pasadena (CA), June 27-30, 1995.

[Mullender] Sape Mullender. Distributed Systems, 2nd edition. Addison-Wesley, 1993.