Elena Volynskaya
elena2@ix.netcom.com
300 West 55 St Apt. 5C
New York, NY 10019

Overview

This paper addresses the problems of designing a replication engine that supports disconnected operation of a remote system requiring access to consolidated data, and of using e-mail as the transport for synchronizing data and objects between the main server and remote clients. Such systems are built by commercial companies and are available on the market; however, their stability, compatibility, performance and level of sophistication leave room for further research.

Problem

A number of mobile remote clients need access to data stored on the main server. The data is updated daily, and each remote client needs access to only a subset of it. A system is required that allows a remote client to connect to the source of data, download the required subset, disconnect, and then continue to view and modify the data. Modifications made on the client have to be distributed back to the source and to the other remote clients that require access to them. In addition, the application for viewing and modifying the data may itself change, and new versions need to be distributed to the remote clients. It is understood that between two consecutive connections the remote and consolidated systems may be out of sync. For the sake of this discussion, assume also that all remote users are registered on the main server.

Discussion

The problem can be divided into two parts: 1) replication rules for the data on the server and on the client; and 2) transport of the data, replication rules and application files.

The solution to the first part of the problem would be an engine that allows the definition of replication rules determining what data is accessible to which users. Users can form groups that have replication rules associated with them, so that when a new user is added to a group he automatically inherits the rules of that group.
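The group-based rules described above can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration: the names RuleRegistry, Group and subset_for are hypothetical, a rule is modelled as a simple predicate over a row, and a row is treated as visible to a user if any rule of any of the user's groups admits it. A real engine would more likely express rules as database queries.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

Row = Dict[str, object]
Rule = Callable[[Row], bool]  # assumption: a rule is a predicate over one row


@dataclass
class Group:
    name: str
    rules: List[Rule] = field(default_factory=list)
    members: Set[str] = field(default_factory=set)


class RuleRegistry:
    """Hypothetical registry mapping users to rules through their groups."""

    def __init__(self) -> None:
        self.groups: Dict[str, Group] = {}

    def add_group(self, name: str, rules: List[Rule]) -> None:
        self.groups[name] = Group(name, list(rules))

    def add_user(self, user: str, group: str) -> None:
        # Joining a group is all a new user needs to inherit its rules.
        self.groups[group].members.add(user)

    def rules_for(self, user: str) -> List[Rule]:
        return [rule
                for group in self.groups.values() if user in group.members
                for rule in group.rules]

    def subset_for(self, user: str, rows: List[Row]) -> List[Row]:
        # Assumption: a row is replicated if at least one rule admits it.
        rules = self.rules_for(user)
        return [row for row in rows if any(rule(row) for rule in rules)]


registry = RuleRegistry()
registry.add_group("east", [lambda row: row["region"] == "east"])
registry.add_user("anna", "east")

rows = [{"id": 1, "region": "east"}, {"id": 2, "region": "west"}]
visible = registry.subset_for("anna", rows)

# A user added later inherits the same rules without further setup.
registry.add_user("boris", "east")
visible_new_user = registry.subset_for("boris", rows)
```

Keeping the rules on the group rather than on the user means that registering a new user is a single membership change, which is the inheritance behaviour the text asks for.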
The engine should be able to package the changes ready for transfer at the request of the replication process, and keep track of the checkpoint at which the last package was created so that the package can be reproduced in case of a delivery failure. The engine should also be able to take delivered packages and apply the changes to the data. It should distinguish between packages containing data and packages containing files (e.g. new versions of the application). After a package has been applied it should be deleted. Each package would have among its properties a recipient name, a type (data or files) and a serial number.

The replication engine could be implemented as a base class holding the properties that are common to the engine on the server and the engine on the client. Its functionality can then be extended for each use.

The advantage of using e-mail as a means of transferring data between the server and clients is that it makes replication transparent to the user; its disadvantage is that serialized (in-order) delivery of messages is not guaranteed. Since communication systems at the present stage of technology are not one hundred percent reliable in delivering messages, there should be a mechanism for acknowledging receipt of the last replication message and for checking the sequence of those messages.

While e-mail appears to be the most natural carrier of replication data and instructions, there is a need to design a universal transport object that would work as an interface between the data replication engine and e-mail, and that would know where to look for the package to be carried, who the recipient is, how to communicate with the e-mail system, etc. Among the properties of the Transport object would be the package pickup location, the e-mail connection specifications and the package drop-off location.
Its functions would be:

1) reach for the package
2) get the recipient name of the package
3) connect to e-mail
4) put the package into the e-mail out-box
5) delete the package from the pickup location
6) get the package from the e-mail in-box
7) put the package in the drop-off location

The Transport object can be implemented as a base class, so that its functionality can be further extended according to the needs of the specific environment.

There is also the issue of creating initial replicas of the data for each user. These replicas have to be created on the basis of the replication rules. This task needs a dedicated business object that is able to communicate with the replication rules engine. The replicas created by this object would be distributed to users as part of the setup process.

Implementation

The solution discussed above was implemented using the SQL Anywhere 5.0 replication engine; cc:Mail was used as the e-mail system, and the transport object was written in Visual Basic 4.0 using VIM to communicate with cc:Mail. Unfortunately, the choice of tools was dictated by the client for whom the software was written, as Visual Basic is not an appropriate language for this kind of problem. It would be best to use C++, since it provides the proper level of object orientation as well as performance and portability. Apart from the problems that arise in defining replication rules as queries in a relational database that have to be run every time the replication process requests data, there is the problem of achieving a level of performance that would make the process transparent to the user.
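To make the transport object from the Discussion more concrete, its seven functions can be sketched as a base class with the mail-specific steps left to a subclass. This is only an illustration under stated assumptions, not the Visual Basic/VIM implementation described above: the names Transport, InMemoryTransport, send_all and receive_all, the ".pkg" file extension and the JSON package layout are all hypothetical, and the in-memory mailbox stands in for a real mail system such as cc:Mail.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (package file name, package body)


class Transport:
    """Base class: moves package files between local folders and e-mail."""

    def __init__(self, pickup: Path, dropoff: Path) -> None:
        self.pickup = pickup    # package pickup location
        self.dropoff = dropoff  # package drop-off location

    # Mail-specific operations; a subclass binds these to a real mail API.
    def connect(self) -> None:
        raise NotImplementedError

    def put_in_outbox(self, recipient: str, name: str, body: str) -> None:
        raise NotImplementedError

    def get_from_inbox(self, user: str) -> List[Message]:
        raise NotImplementedError

    def send_all(self) -> int:
        """Functions 1-5: pick up each package, mail it, then delete it."""
        self.connect()                                       # 3) connect
        sent = 0
        for path in sorted(self.pickup.glob("*.pkg")):       # 1) reach for package
            body = path.read_text()
            recipient = json.loads(body)["recipient"]        # 2) recipient name
            self.put_in_outbox(recipient, path.name, body)   # 4) into out-box
            path.unlink()                                    # 5) delete from pickup
            sent += 1
        return sent

    def receive_all(self, user: str) -> int:
        """Functions 6-7: fetch packages from the in-box into drop-off."""
        self.connect()
        messages = self.get_from_inbox(user)                 # 6) get from in-box
        for name, body in messages:
            (self.dropoff / name).write_text(body)           # 7) put in drop-off
        return len(messages)


class InMemoryTransport(Transport):
    """Hypothetical stand-in for a mail system, for illustration only."""

    mailboxes: Dict[str, List[Message]] = {}  # deliberately shared by all instances

    def connect(self) -> None:
        pass  # nothing to connect to in memory

    def put_in_outbox(self, recipient: str, name: str, body: str) -> None:
        self.mailboxes.setdefault(recipient, []).append((name, body))

    def get_from_inbox(self, user: str) -> List[Message]:
        return self.mailboxes.pop(user, [])


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for folder in ("server_out", "server_in", "client_out", "client_in"):
        (root / folder).mkdir()

    # Assumed package layout: recipient, type and serial number, plus a payload.
    package = {"recipient": "anna", "type": "data", "serial": 1, "payload": "changes"}
    (root / "server_out" / "anna_000001.pkg").write_text(json.dumps(package))

    server = InMemoryTransport(root / "server_out", root / "server_in")
    client = InMemoryTransport(root / "client_out", root / "client_in")

    sent = server.send_all()
    received = client.receive_all("anna")
    delivered = json.loads((root / "client_in" / "anna_000001.pkg").read_text())
    remaining = list((root / "server_out").glob("*.pkg"))
```

Because only connect, put_in_outbox and get_from_inbox touch the mail system, a subclass for a particular environment would need to override just those three methods. The sketch leaves out the acknowledgement and sequence checking that the Discussion calls for; the serial number carried in each package is the field such a mechanism would rely on.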