[colug-432] Strange Burning Behavior

Rob Funk rfunk at funknet.net
Fri Sep 25 21:07:37 EDT 2009


On Friday 25 September 2009 07:49:54 pm Paul Hostetler wrote:
> I'm running Debian Lenny.  I've regularly backed up my 3.1 Gb CVS
> repository to DVD.  About two months ago the backups started failing.

Wait, you're still using CVS? Not Subversion, or Git/Mercurial/Bazaar?
:-)

> What I've discovered is really strange.  I did some testing and found
> that I can record parts of the repository to CD or DVD.  I have
> isolated a small chunk (~200 Mb) of the repository that I can record
> to CD but not to DVD.  I get an unhelpful error message and nothing
> show up in the system logs.

Reminds me a bit of the errors I've gotten in my own backup system 
lately....

What filesystem are you using on the DVDs?

CDs are normally burned with an ISO-9660 filesystem, which is a sort of 
"lowest-common-denominator" filesystem. Anything that can't be handled by 
everybody gets thrown out. (Though somehow VMS's file versions got 
included.) Then there are the Rock Ridge and Joliet extensions to add Unix 
or Windows/vfat semantics on top of it.

DVDs are often burned with an ISO-9660 filesystem, but they are also 
commonly burned with a UDF filesystem. UDF is "universal disk format", and 
it takes the opposite approach from ISO-9660: it tries to handle all the 
common file information that various systems use in their filesystems.

But one thing that can be problematic is character encoding. Back in the 
days when everyone's filenames were basic printable ASCII, that wasn't too 
much of a problem (unless you talked to those wild'n'crazy EBCDIC 
mainframers), but once we started allowing people outside the US to name 
files in their own languages things got complicated. The upshot is that 
UDF uses Unicode for its filenames, but it stores them in its own 
compressed form (8 or 16 bits per character), so the burning software has 
to decode your filenames from whatever encoding it thinks they're in, 
e.g. UTF-8 (mostly 1-byte characters with some longer ones) or UTF-16 
(mostly two-byte characters, with some longer ones).

In those variable-length character encodings there are invalid byte 
sequences. Meanwhile, many people's disks still have files named with 
8-bit international encodings such as Latin-1. Every byte in Latin-1's 
top half (all those accented characters) is invalid on its own in UTF-8; 
each one has to be converted to UTF-8's two-byte version of the character.
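
To make that concrete, here's a small Python sketch. The filename and 
the is_valid_utf8 helper are made up for illustration; the point is 
just that a Latin-1 name fails to decode as UTF-8 until it is converted.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical filename: in Latin-1, each "é" is the single byte 0xE9.
latin1_name = "résumé.txt".encode("latin-1")

# Decode using the encoding the bytes are really in, then re-encode
# as UTF-8. Each "é" becomes the two bytes 0xC3 0xA9.
utf8_name = latin1_name.decode("latin-1").encode("utf-8")

print(latin1_name, is_valid_utf8(latin1_name))
print(utf8_name, is_valid_utf8(utf8_name))
```

Note that the conversion only works if you know (or guess correctly) 
what encoding the name was in to begin with.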


So all that is a long version of saying that maybe you have filenames with 
international/accented/non-ASCII characters that are being rejected by the 
UDF filesystem's Unicode encoding.
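
If you want to check for that, something like this rough Python sketch 
would list the suspects. It assumes the names ought to be UTF-8, and the 
function name find_non_utf8_names is my own invention; it only looks at 
the names, not at what the burning software actually does with them.

```python
import os


def find_non_utf8_names(root):
    """Yield paths (as bytes) whose last component is not valid UTF-8."""
    # Walking with a bytes root makes os.walk hand back raw bytes names,
    # so nothing gets decoded (or mangled) along the way.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            try:
                name.decode("utf-8")
            except UnicodeDecodeError:
                yield os.path.join(dirpath, name)
```

Point it at the top of the tree you're trying to burn and print each 
result with repr(), so the offending bytes show up as escapes like \xe9 
rather than as garbage on the terminal.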

Or, maybe you're using ISO-9660 for your DVDs, and the problem is 
something totally different.

-- 
==============================| "A slice of life isn't the whole cake
 Rob Funk <rfunk at funknet.net> | One tooth will never make a full grin"
 http://www.funknet.net/rfunk |    -- Chris Mars, "Stuck in Rewind"

