<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">[top posting]<br>
<br>
<i>Fabulous</i> reply from Angelo. <br>
<br>
<u>I must recommend RSYNC</u> as a first course for proving things
or for in-a-pinch replication needs. It's like option C, since
RSYNC is just an application. One feature of RSYNC is that it
avoids sending data that matches what the receiver already has. I
also like that it can be told not to step on a newer copy (if
source and target may each have newer versions of some files). <br>
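<br>
To illustrate the idea of not resending matching data, here is a
deliberately simplified Python sketch. It is not RSYNC's actual
algorithm (RSYNC uses rolling checksums so it can also match data
that has shifted position); it only compares fixed-size blocks at
the same offsets. The names block_digests and blocks_to_send and
the tiny block size are assumptions made up for this example. <br>
<pre>
```python
import hashlib

BLOCK_SIZE = 4  # hypothetical, tiny on purpose so the example is easy to follow


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[start:start + block_size]).hexdigest()
        for start in range(0, len(data), block_size)
    ]


def blocks_to_send(sender_data, receiver_data, block_size=BLOCK_SIZE):
    """Return indexes of sender blocks the receiver lacks at the same offset."""
    have = block_digests(receiver_data, block_size)
    want = block_digests(sender_data, block_size)
    # have[i:i + 1] is an empty list when the receiver has no block i at all.
    return [i for i, digest in enumerate(want) if have[i:i + 1] != [digest]]


if __name__ == "__main__":
    # Only the middle block differs, so only block 1 would need to be sent.
    print(blocks_to_send(b"aaaabbbbcccc", b"aaaaXXXXcccc"))
```
</pre>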
<br>
I <u>have used RSYNC bi-directionally</u>. Given the specific
options that I most often use, the downside is the need to manually
delete unwanted files. (RSYNC has an option to delete files, but I
never use it, "just in case", and it seems dangerous for
bi-directional use.) <br>
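<br>
A bi-directional run of this kind can be sketched in Python as
below. This is a sketch under assumptions, not a tested procedure:
the paths and host name are placeholders, and the helper names are
invented for this example. It relies only on the standard rsync
options --archive, --update (skip files that are newer on the
receiver) and --dry-run, and deliberately leaves out --delete. <br>
<pre>
```python
import shlex


def rsync_command(source, target, dry_run=True):
    """Build an rsync argv list that never overwrites a newer file on the target."""
    args = ["rsync", "--archive", "--update"]
    if dry_run:
        args.append("--dry-run")  # report what would change without copying
    # A trailing slash on the source copies the directory's contents,
    # not the directory itself.
    args += [source.rstrip("/") + "/", target.rstrip("/") + "/"]
    return args


def bidirectional_commands(site_a, site_b, dry_run=True):
    """Return two passes: A to B, then B to A. Neither pass deletes anything."""
    return [
        rsync_command(site_a, site_b, dry_run),
        rsync_command(site_b, site_a, dry_run),
    ]


if __name__ == "__main__":
    # Placeholder locations; substitute real ones before use.
    for cmd in bidirectional_commands("/srv/data", "backup.example.org:/srv/data"):
        print(shlex.join(cmd))
```
</pre>
Running it only prints the two command lines. Each list can be
handed to subprocess.run once the dry-run output looks right. <br>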
<br>
I've <u>also seen option A used</u> in rigorous off-site and
heavily exercised D/R. (We're talking greater than 300 miles.)
Options A and C both allow for any filesystems you might need or
choose. (Not that there's anything wrong with option B, FS based
solutions.) Option A often requires that you go deep with a
specific storage vendor or service. <br>
<br>
The option A work that I was involved in was for a former
employer. Most of the exercise was truly outstanding, excellent
work. But procedures historically included <u>applying updates
via tape</u> (after a snapshot of the storage across the 300-mile
link). I am just not a fan of tape anymore, for a variety of
reasons. I was annoyed that the D/R exercise consistently burned
many hours on the tape work, but I was not the decision maker.
<br>
<br>
I'm a huge fan of <u>shared disk images</u>, even within a single
data center, so I strongly advocate option A for static content.
(And most systems on your SAN fabric probably grok ISO9660.) <br>
<br>
-- R; &lt;&gt;&lt;<br>
<br>
<br>
On 01/31/2017 11:04 PM, Angelo McComis wrote:<br>
</div>
<blockquote
cite="mid:CAK1KucR7y7+xk9ii3bKbXniWCwvT7Sz2Ti8+fbwxG3A_VwefDA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033">I
see a fair amount of this in customer environments. I've also
been around BCP/Disaster Recovery, so I can speak about the
choices, what they do, what they don't do, and so on. If you
have a specific question, you're welcome to reach out. In
general, though, here is what I can share off the top of my
head...</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033">Multi-site
replication comes in a few different varieties: </div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033">A)
Storage-based (e.g. your SAN/NAS device is shipping
block-by-block changes to a matching storage system at the
other site, which is applying those changes/writes to a remote
mirror)</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033">B)
Host-based (e.g. your OS file system driver is taking IO
writes for a given filesystem and replicating them to another
host, which is applying those writes to a remote
filesystem)</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033">C)
Application-based (e.g. a built-in feature of an
application [not unlike MySQL] is pushing bits over the
network to another copy somewhere else, which is receiving
those bits)</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033">In
cases A and B above, and sometimes (but not always) C, the
remote side is in a "read only" mode and not usable, since
there's no mechanism to take writes on the far side and get
them back to the original site. In the case of
application-based replication, there are some that handle
bi-directional replication, so you're not stuck with a far
site in read-only mode.</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033">In
case of A above, you can ensure zero data loss if you have 1)
short enough distance between the two sites, and 2) enabled
synchronous replication. In this use case, a write is written
to the A side, replicated to the B side, and confirmation of
the write is sent back to A before the A side considers the
write to be complete and releases control back to the
requesting application. Because the extra round trip adds
latency, synchronous replication is limited to ~20 miles of
distance. If zero data loss is not an absolute requirement,
asynchronous replication is not distance limited, but as
distance increases, the lag between a write to the near side
and that write being committed to the far side increases as
well. </div>
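<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033">A
rough back-of-the-envelope sketch of why distance matters for
synchronous writes, in Python. It assumes light travels roughly
200 km per millisecond in optical fiber and counts only the
propagation delay of one write plus its acknowledgement. Real
links add switching, protocol round trips, and fiber routes
longer than the straight-line distance, so treat the result as a
lower bound. The function name and constants are made up for
this example.</div>
<pre>
```python
FIBER_KM_PER_MS = 200.0  # assumption: light covers roughly 200 km per ms in fiber
KM_PER_MILE = 1.609344


def round_trip_ms(distance_miles):
    """Propagation delay for one write plus its acknowledgement, in milliseconds."""
    distance_km = distance_miles * KM_PER_MILE
    return 2 * distance_km / FIBER_KM_PER_MS


if __name__ == "__main__":
    for miles in (20, 300):
        delay = round_trip_ms(miles)
        print(f"{miles} miles: about {delay:.2f} ms added per synchronous write")
```
</pre>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033">Under
these assumptions the added delay is about 0.32 ms at 20 miles
and about 4.83 ms at 300 miles, and every synchronous write pays
it before the application regains control.</div>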
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033">In
the case of XtreemFS, which you mentioned specifically, it
works like option B above, as it plugs in at the filesystem
driver layer of the OS, and appears, from my reading of their
site info, to operate asynchronously. GlusterFS
geo-replication is similar: it's asynchronous and can work
across distance to create global clusters.</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033">The
art of applying a replication strategy is to first understand
what the business or technical requirement is that must be
met. What are some of the use cases? If you're wondering which
of the above I see most often, it's A and C. Examples:
Frame-to-Frame storage replication done at the volume level,
databases running log-shipping to remote sites for a DR copy,
that sort of thing. </div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033">Angelo</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033"><br>
</div>
<div class="gmail_default"
style="font-family:verdana,sans-serif;font-size:small;color:#330033"><br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, Jan 31, 2017 at 9:18 PM,
Christopher Cavello <span dir="ltr">&lt;<a
moz-do-not-send="true" href="mailto:cavello.1@osu.edu"
target="_blank">cavello.1@osu.edu</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Is anyone
on this list willing to share his experience with file<br>
replication across data centers? (glusterfs geo-replication,
xtreemfs<br>
<a moz-do-not-send="true" href="http://www.xtreemfs.org/"
rel="noreferrer" target="_blank">http://www.xtreemfs.org/</a>,
etc.)</blockquote>
</div>
</div>
</blockquote>
<p><br>
</p>
</body>
</html>