Data Transport Between Different Machines

I don't have experience with this topic yet, so this is just a survey of some means of transporting data and keeping it consistent between different machines.

Hardware

  1. external harddisks

  2. ZIP drive

Wade Hampton wrote: "You may use MS-DOS formatted ZIP and floppy discs for data transfer. You may be able to also use LS120. If you have SCSI, you could use JAZ, MO or possibly DVD-RAM (any SCSI disc that you could write to). I have the internal ZIP for my Toshiba 700CT. It works great (I use automount to mount it). I use VFAT on the ZIP disks so I can move them to Windows boxes, Linux boxes, NT, give them to coworkers, etc. One problem, I must SHUTDOWN to swap the internal CD with the ZIP."

Software

Version Management Software

Although it is certainly not their main aim, version management systems like CVS (Concurrent Versions System) are a perfect tool when you work on several machines and have trouble keeping them in sync (a situation often called "disconnected filesystems" in the computer science literature). Unlike programs like rsync, which are asymmetric (one side is the master and its files override those of the slave), CVS accepts that you make changes on several machines and tries afterwards to merge them. Asymmetric tools are good only when you can keep to a strict discipline whenever you switch from one machine to another. In contrast, tools like CVS are more forgiving.

To synchronize two or more machines (typically a desktop and a laptop), just choose a CVS repository somewhere on the network. It can be on one of the machines you want to synchronize or on a third host. Anyway, this machine should be easily reachable via the network and have good disks.

Then, cvs co the module you want to work on, edit it, and cvs commit when you have reached a sync point and are connected. If you made changes on both hosts, CVS will try to merge them (it typically succeeds automatically) or give up and ask you to resolve the conflict by hand.
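The workflow above can be sketched as a small Python helper that only builds the cvs command lines for one sync point. The repository location, module name and commit message in the usage lines are hypothetical placeholders, and the update and commit commands are assumed to be run from inside the checked-out working directory.

```python
import shlex


def cvs_sync_commands(cvsroot, module, message, first_time=False):
    """Return the cvs command lines for one sync point, as argv lists."""
    if first_time:
        # Initial checkout of the module into the current directory.
        return [["cvs", "-d", cvsroot, "checkout", module]]
    return [
        # Fetch and merge what the other machine committed;
        # -d creates new directories, -P prunes empty ones.
        ["cvs", "update", "-d", "-P"],
        # Publish local changes once any conflicts have been resolved.
        ["cvs", "commit", "-m", message],
    ]


if __name__ == "__main__":
    # Hypothetical repository and module names, for illustration only.
    for argv in cvs_sync_commands(":ext:user@desktop:/var/cvs", "notes",
                                  "sync from laptop"):
        print(shlex.join(argv))
```

Running update before commit is what gives CVS the chance to merge changes made on the other host. If the merge fails, the conflict has to be resolved by hand before the commit will go through.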

The typical limits of this solution: CVS does not deal well with binary files, so this solution is more for users of vi or emacs than for GIMP fans. CVS has trouble with some UNIX goodies like symbolic links.

For more information on CVS, see the Web page. The CVS documentation is excellent (in info format).

CODA Filesystem

The Coda File System is a descendant of the Andrew File System. Like AFS, Coda offers location-transparent access to a shared UNIX file name-space that is mapped on to a collection of dedicated file servers. But Coda represents a substantial improvement over AFS because it offers considerably higher availability in the face of server and network failures. The improvement in availability is achieved using the complementary techniques of server replication and disconnected operation. Disconnected operation has proven especially valuable in supporting portable computers. See http://www.coda.cs.cmu.edu/ .

unison

unison is a file-synchronization tool for Unix and Windows. It allows two replicas of a collection of files and directories to be stored on different hosts (or different disks on the same host), modified separately, and then brought up to date by propagating the changes in each replica to the other. The current release is available for download in source and binary form. unison shares a number of features with tools such as configuration management packages (CVS, PRCS, etc.), distributed filesystems (Coda, etc.), uni-directional mirroring utilities (rsync, etc.) and other synchronizers (Intellisync, Reconcile, etc.). However, there are a number of points where it differs:

  • unison runs on both Windows (95, 98, NT, and 2k) and Unix (Solaris, Linux, etc.) systems. Moreover, unison works across platforms, allowing you to synchronize a Windows laptop with a Unix server, for example.

  • Unlike a distributed filesystem, unison is a user-level program: there is no need to hack (or own!) the kernel, or to have superuser privileges on either host.

  • Unlike simple mirroring or backup utilities, unison can deal with updates to both replicas of a distributed directory structure. Updates that do not conflict are propagated automatically. Conflicting updates are detected and displayed.

  • unison works between any pair of machines connected to the internet, communicating over either a direct socket link or tunneling over an rsh or an encrypted ssh connection. It is careful with network bandwidth, and runs well over slow links such as PPP connections.

  • unison has a clear and precise specification.

  • unison is resilient to failure. It is careful to leave the replicas and its own private structures in a sensible state at all times, even in case of abnormal termination or communication failures.

  • unison is free; full source code is available under the GNU General Public License.
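The two-way update handling described in the list above can be illustrated with a minimal Python sketch. It is not unison's actual implementation. It assumes each replica, plus the state recorded at the last successful synchronization (called archive here, a hypothetical name), is given as a mapping from path to a content fingerprint, with a missing key meaning the file does not exist.

```python
def reconcile(archive, a, b):
    """Classify each path by comparing both replicas with the last synced state."""
    actions, conflicts = {}, []
    for path in sorted(set(archive) | set(a) | set(b)):
        base, in_a, in_b = archive.get(path), a.get(path), b.get(path)
        if in_a == in_b:
            continue                  # replicas already agree
        if in_a == base:
            actions[path] = "b->a"    # only replica B changed
        elif in_b == base:
            actions[path] = "a->b"    # only replica A changed
        else:
            conflicts.append(path)    # both changed, in different ways
    return actions, conflicts
```

Deletions fall out of the same comparison: a file that is unchanged in one replica and absent from the other is treated as a change to propagate, while a file changed on both sides is only reported, never overwritten.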

mirrordir

Mirrordir is a suite of functions in one package. It contains a remote login utility and daemon that provides a secure shell, a cp equivalent which additionally copies to and from ftp servers, a tool to mirror filesystems over ftp or locally, and another utility you can pass a C script to recursively perform operations on files.

mirrordir forces the mirror directory to be an exact replica of the control directory tree in every possible detail suitable for purposes of timed backup. Files whose modification times or sizes differ are copied. File permissions, ownerships, modification times, access times, and sticky bits are duplicated. Devices, pipes, and symbolic and hard links are duplicated. Files or directories that exist in the mirror directory that don't exist in the control directory are deleted. It naturally descends into subdirectories to all their depths.
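The copy and delete rules above can be sketched in a few lines of Python. This is only an illustration of one-way mirroring under simplifying assumptions, not mirrordir's own code: it handles regular files and directories only, and it ignores ownership, devices, pipes, links, and the case where a name changes from file to directory. The function and parameter names are made up for the example.

```python
import os
import shutil


def mirror(control, mirror_dir):
    """Make mirror_dir match control: copy changed files, delete extras."""
    os.makedirs(mirror_dir, exist_ok=True)
    # Pass 1: copy anything missing or whose size or mtime differs.
    for root, _dirs, files in os.walk(control):
        rel = os.path.relpath(root, control)
        target_root = os.path.normpath(os.path.join(mirror_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            s = os.stat(src)
            if (not os.path.exists(dst)
                    or os.stat(dst).st_size != s.st_size
                    or int(os.stat(dst).st_mtime) != int(s.st_mtime)):
                shutil.copy2(src, dst)  # copy2 also copies mtime and mode
    # Pass 2: delete whatever exists only in the mirror (bottom-up walk).
    for root, dirs, files in os.walk(mirror_dir, topdown=False):
        rel = os.path.relpath(root, mirror_dir)
        source_root = os.path.normpath(os.path.join(control, rel))
        for name in files:
            if not os.path.exists(os.path.join(source_root, name)):
                os.remove(os.path.join(root, name))
        for name in dirs:
            if not os.path.isdir(os.path.join(source_root, name)):
                shutil.rmtree(os.path.join(root, name))
```

Because the second pass deletes without asking, a tool of this kind is safe only when the control directory is always the authoritative copy.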

InterMezzo

InterMezzo is a new distributed file system with a focus on high availability. InterMezzo is an Open Source project, currently on Linux (2.2 and 2.3).

A primary target of our development is to provide support for flexible replication of directories, with disconnected operation and a persistent cache.

For example, we want to make it easy to manage copies of home directories on multiple computers, and solve the laptop/desktop synchronization problems. On a larger scale we aim to provide replication of large file repositories, for example to support high availability for servers.

InterMezzo was deeply inspired by the Coda File System, but totally re-designed and re-engineered.

WWWsync

This is a program written in Perl that will update your web pages by ftp from your local pages. It was originally written for updating Demon home-pages, but will work with other providers which provide direct FTP access to your web pages. I haven't checked this for laptop purposes yet. You may get the program at http://www.alfie.demon.co.uk/wwwsync/ .

rsync

rsync is a program that allows files to be copied to and from remote machines in much the same way as rcp. It has many more options than rcp, and uses the rsync remote-update protocol to greatly speed up file transfers when the destination file already exists. The rsync remote-update protocol allows rsync to transfer just the differences between two sets of files across the network link.
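The idea of transferring only the differences can be illustrated with a simplified Python sketch. It is not rsync's actual protocol or code. The receiver sends checksums of the blocks it already has, and the sender answers with a mix of "copy block N" instructions and literal data. Real rsync uses a rolling weak checksum so the window can slide cheaply. Here the weak checksum (Adler-32 from zlib, chosen only for convenience) is simply recomputed at every offset, and the tiny block size is an assumption made for readability.

```python
import hashlib
import zlib

BLOCK = 4  # tiny block size so the example is easy to follow


def signature(old, block=BLOCK):
    """Receiver side: checksums of each block of the file it already has."""
    sig = {}
    for i in range(0, len(old), block):
        chunk = old[i:i + block]
        sig.setdefault(zlib.adler32(chunk), []).append(
            (hashlib.md5(chunk).digest(), i // block))
    return sig


def delta(new, sig, block=BLOCK):
    """Sender side: emit ("copy", block_index) or ("data", bytes) operations."""
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + block]
        match = None
        # Cheap weak checksum first, strong checksum to confirm the match.
        for strong, index in sig.get(zlib.adler32(window), []):
            if hashlib.md5(window).digest() == strong:
                match = index
                break
        if match is None:
            literal.append(new[pos])   # no known block starts here
            pos += 1
        else:
            if literal:
                ops.append(("data", bytes(literal)))
                literal = bytearray()
            ops.append(("copy", match))
            pos += len(window)
    if literal:
        ops.append(("data", bytes(literal)))
    return ops


def patch(old, ops, block=BLOCK):
    """Receiver side: rebuild the new file from old blocks plus literal data."""
    out = bytearray()
    for kind, value in ops:
        out += old[value * block:(value + 1) * block] if kind == "copy" else value
    return bytes(out)
```

Only the signature and the operation list would cross the network in this sketch, so unchanged regions of a file cost a block index instead of the block's contents.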

Xfiles - file tree synchronization and cross-validation

Xfiles is an interactive utility for comparing and merging one file tree with another over a network. It supports freeform work on several machines (no need to keep track of what files are changed on which machine). Xfiles can also be used as a cross-validating disk <-> disk backup strategy: portions of a disk may go bad at any time, with no simple indication of which files were affected, so cross-validate against a second disk before backup to make sure you aren't backing up bad data.

A client/server program (GUI on the client) traverses a file tree and reports any files that are missing on the server machine, missing on the client machine, or different. For each such file, the file sizes and modification dates are shown, and a comparison (using UNIX diff) can be obtained. For files that are missing from one tree, similarly named files in that tree are reported. Inconsistent files can then be copied in either direction or deleted on either machine. The file trees do not need to be accessible via NFS. File checksums are computed in parallel, so largely similar trees can be compared over a slow network link. The client and server processes can also be run on the same machine. File selection and interaction with a revision control system such as RCS can be handled by scripting using jpython. Requirements: Java 1.1 or later and JFC/Swing 1.1 are needed.
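The comparison step can be sketched in Python as follows. This is a hedged illustration rather than Xfiles' own code (which is written in Java). Each side computes checksums of its own tree locally, so only the small checksum table has to cross the network. The function names and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import os


def tree_checksums(top):
    """Map each relative file path under top to a SHA-256 digest."""
    sums = {}
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            sums[os.path.relpath(path, top)] = digest
    return sums


def compare_trees(client_sums, server_sums):
    """Report files missing on either side and files whose contents differ."""
    client, server = set(client_sums), set(server_sums)
    return {
        "missing_on_server": sorted(client - server),
        "missing_on_client": sorted(server - client),
        "different": sorted(p for p in client & server
                            if client_sums[p] != server_sums[p]),
    }
```

The report deliberately decides nothing. As in the description above, choosing the direction of each copy or deletion is left to the user.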

sitecopy

Sitecopy is for copying locally stored websites to remote web servers. The program will upload files to the server which have changed locally, and delete files from the server which have been removed locally, to keep the remote site synchronized with the local site, with a single command. The aim is to remove the hassle of uploading and deleting individual files using an FTP client.

KBriefcase

The KDE tool KBriefcase tries to achieve a goal similar to that of the Windows briefcase, but in a different way. Rather than being pulled from the desktop, your files are pushed to the laptop. You drag a file from its local location to the briefcase. You are then asked for the remote path to copy it to. It will then copy the file to the remote location and make the original read-only. When you restore and remove, the file is copied back and write permissions are given back. The read-only status, of course, makes sure you don't start editing the original again before you've brought your changes back from the remote location.
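The push and restore cycle described above can be sketched in Python. This is not KBriefcase's code. The function names are hypothetical, both paths are assumed to be reachable as ordinary files (for example through a mounted network directory), and deleting the remote copy on restore is an assumption about what "restore and remove" does.

```python
import os
import shutil
import stat

WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def check_out(local, remote):
    """Push the file to the laptop path and lock the original."""
    shutil.copy2(local, remote)
    mode = stat.S_IMODE(os.stat(local).st_mode)
    os.chmod(local, mode & ~WRITE_BITS)   # original becomes read-only


def check_in(local, remote):
    """Bring the edited copy back, unlock the original, drop the remote copy."""
    mode = stat.S_IMODE(os.stat(local).st_mode)
    os.chmod(local, mode | stat.S_IWUSR)  # give write permission back
    shutil.copy2(remote, local)
    os.remove(remote)
```

The read-only bit acts as a simple lock. It does not stop a determined user, but it makes accidental edits to the stale original fail while the working copy is on the laptop.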