Is it the right tool?
Ben Escoto
[email protected]
Thu, 29 Aug 2002 11:17:40 -0700
>>>>> "PC" == pchandler <[email protected]>
>>>>> wrote the following on Thu, 29 Aug 2002 11:42:53 +0100
PC> I'm involved in moving a small organisation from windows 98 p2p
PC> to running a Linux server. Everything's fine except they are
PC> now hooked on a (discontinued) Powerquest product called
PC> Datakeeper. That monitored folders on the Win98 server and kept
PC> versions of changed files. It didn't work on timed runs but, I
PC> guess, on system calls. You set how many versions you want to
PC> keep and it will keep that many, independent of how old the
PC> versions are.
Hmm, maybe a versions + time approach would be better. With a pure
count limit, it seems that if you had the habit of saving your
documents every 5 minutes you would quickly use up the allotted
versions. (And a file that changed only once would keep its old
version around for years.)
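
One possible reading of a "versions + time" policy, as a rough sketch only: cap the number of versions, skip versions saved too soon after the last one kept, and drop anything past a maximum age. The function name and the parameters max_versions, min_gap and max_age are hypothetical, made up for illustration; they are not part of Datakeeper or rdiff-backup.

```python
from datetime import datetime, timedelta


def versions_to_keep(timestamps, now, max_versions=5,
                     min_gap=timedelta(hours=1),
                     max_age=timedelta(days=365)):
    """Return the version timestamps to retain, newest first.

    All three limits are illustrative assumptions, not settings of
    any real product.
    """
    kept = []
    for ts in sorted(timestamps, reverse=True):
        if now - ts > max_age:
            break  # this version and all older ones are past the age limit
        # Skip a version saved less than min_gap before the last one kept,
        # so a burst of 5-minute saves counts as a single version.
        if kept and kept[-1] - ts < min_gap:
            continue
        kept.append(ts)
        if len(kept) == max_versions:
            break
    return kept
```

With these defaults, three saves made five minutes apart would count as one version, while a version from two hours earlier would still be kept.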
PC> I've been trying to find a similar Linux solution and have been
PC> trying rdiff-backup. The way I've been testing it is running it
PC> every 5 minutes to try and catch recent changes but I of course
PC> end up with huge (in terms of no. of files) rdiff-backup-data
PC> directories. The other problem is that restoring is cumbersome
PC> when all you want to do is restore the 'last version'.
Yeah, rdiff-backup wasn't really designed with a 5 minute cycle in
mind. What do you mean about restoring the 'last version' though?
Can't you just use cp?
PC> I appreciate the problem is the way I'm trying to use
PC> rdiff-backup, but I'm wondering whether it is possible to
PC> achieve what they want (i.e. a set number of prior versions
PC> irrespective of age) or whether anyone knows of an alternative
PC> approach/product. I'd really like to get close as their IT guy
PC> is going out on a bit of a limb by going Linux.
You can't really set a different number of backups on a file-by-file
basis with rdiff-backup. I suppose it wouldn't be hard to write a
script that just culled the rdiff-backup-data directory based on
number of increments for each file instead of by a specific time, but
the larger problem seems to be that it is inefficient to search
through all a system's data every 5 minutes. Intercepting system
calls seems to be a must.
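
A culling script along those lines might start out like the sketch below. It assumes increment files under rdiff-backup-data are named <file>.<timestamp>.<type>, optionally followed by .gz, with a timestamp such as 2002-08-29T11:17:40-07:00 and a type of snapshot, diff, missing or dir; check that against your own rdiff-backup-data directory before relying on it. The function name increments_to_cull and the keep parameter are hypothetical. It only reports candidates and deletes nothing, and it never selects anything but the oldest increments of a file, since an older version is restored by applying the newer increments first.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed naming scheme: <base>.<timestamp>.<type>[.gz]
INCREMENT_RE = re.compile(
    r"^(?P<base>.+)\."
    r"(?P<stamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2})\."
    r"(?P<kind>snapshot|diff|missing|dir)(?:\.gz)?$"
)


def increments_to_cull(paths, keep=5):
    """Return increment paths beyond the newest `keep` for each file."""
    by_file = defaultdict(list)
    for path in paths:
        match = INCREMENT_RE.match(path)
        if match is None:
            continue  # not an increment under the assumed naming scheme
        when = datetime.fromisoformat(match.group("stamp"))
        by_file[match.group("base")].append((when, path))

    cull = []
    for entries in by_file.values():
        entries.sort(reverse=True)  # newest increment first
        cull.extend(path for _, path in entries[keep:])
    return sorted(cull)
```

Feeding it the relative paths from a walk of the increments directory gives a list to review by hand. This does nothing about the cost of scanning every 5 minutes, which remains the larger problem.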
I don't have any experience with this (perhaps someone else on
this list will pipe up) but there was a recent discussion on the
rsync list about it. Maybe this message will be helpful to you
(edited by me):
------------------------------------------------------------
Subject: Re: directory replication between two servers
From: jw schultz <[email protected]>
Date: Wed, 3 Jul 2002 17:40:04 -0700
To: [email protected]
On Wed, Jul 03, 2002 at 11:10:13AM -0700, Eric Ziegast wrote:
> If you need read-write access on one server and need to replicate data
> to a read-only server and need synchronous operation (i.e.: the
> write must be completed on the remote server before returning to the
> local server), then you need operating-system-level or storage-level
> replication products.
>
> Veritas:
> It's not available on Linux yet, but Volume Replicator performs
> block-level incremental copies to keep two OS-level filesystems
> in sync. $$
>
> File Replicator is based (interestingly enough) on rsync, and
> runs under a virtual filesystem layer. It is only as reliable
> as a network-wide NFS mount, though. (I haven't seen it used
> much on a WAN.) $$
>
> Andrew File System (AFS)
> This advanced filesystem has methods for replication
> built in, but they have a steep learning curve for making them
> work well. I don't see support for Linux, though. $
>
> Distributed File System (DFS)
> Works a lot like AFS, built for DCE clusters, commercially
> supported (for Linux too) $$$
>
> NetApp, Procom (et al.):
> Several network-attached-storage providers have replication
> methods built into their products. The remote side is kept
> up to date, but integrity of the remote data depends on the
> application's use of snapshots. $$$
>
> EMC, Compaq, Hitachi (et al.):
> Storage companies have replication methods and best practices
> built into their block-level storage products. $$$$
>
>
> If others know of other replication methods or distributed filesystem
> work, feel free to chime in.
NFS
Filesystem-level sharing over the network.
Don't pooh-pooh NFS because it is old. I
don't recommend it on an unsecured network
but it is surprisingly fast. Given a fast
network Netledger found Oracle ran faster on
NFS-mounted volumes than on small local disks.
The Linux NFS server does need some
performance improvement. Not suitable for
WAN.
Coda
A distributed filesystem based on research from AFS.
Single tree structure that lives as an alien
in the unix tree. Primary focus is
disconnected operation. Lacks locking so even
when all nodes are online can have update
conflicts. Available on Linux, is FREE.
Intermezzo
A distributed filesystem based on research
from Coda. Seems less alien than Coda
with better support for multiple
mountpoints. Provides locking mechanisms for
connected operations but still allows
resynchronization on reconnect. Developed on
Linux, is FREE.
Lustre
A cluster filesystem that can be used with
multiport disks, SAN devices and xNDB.
Filesystem is online, writable for all
nodes. Storage device is responsible for
HA. Still in development.
If you look at cluster websites you will probably find a few
more solutions.
--
________________________________________________________________
J.W. Schultz Pegasystems Technologies
email address: [email protected]
Remember Cernan and Schmitt