diglib Archive
Date: Mon Oct 17 11:00:39 2005
Re: diglib: Issue from August 15 DCC meeting - Mac OS Files
This is a good technical discussion issue. A bit more background:
1/ the information in a resource fork may in some cases be non-
critical, but in other cases is totally essential to the file. For
example, if one is preserving a MacOS 9 (classic mode) program, the
resource fork contains the actual code. In many cases, though, the
only information in the resource fork is metadata, and we all know
that metadata can be dispensed with (that's a :-), of course).
2/ in the modern MacOS X world, resource forks are a bit less critically
important, since now an application (program) is actually implemented
as what appears to be a file but is actually a large number of files
in a directory.
3/ MacOS is not the only system that supports multiple forks.
Windows files on NTFS can also have multiple forks, known there as
alternate data streams (more than 2, in fact).
Luckily, multiple forks on NTFS file systems are very rarely used
except by hackers who are trying to hide information or by Windows
servers that are supporting Mac clients, so most of the time on
Windows you can pretend that a file is a named finite-length single
byte stream.
4/ there are many semi-standard ways to encode a Mac file as a single
bitstream (hence an easy candidate for storing in a file on a linux
or FAT file system). One common approach is to use MacBinary II or a
similar encoding, which basically packages the 2 forks in a single
one with some syntax to allow a parser to unpack the two. Another is
to use the StuffIt archive format, which also allows multiple MacOS files
to be packed into a single OS file. Corey describes a different
packing convention that uses two single-stream files on the hard
disk. That's of particular interest because it's the convention that
MacOS itself uses when storing files on filesystems that don't
support multiple-fork semantics internally (e.g. a Mac hard disk
formatted with a unix-style filesystem rather than Mac HFS+).
5/ Two vital pieces of information that are traditionally stored with
a Mac file (strictly, in its Finder info alongside the forks rather
than in the resource fork itself), but can be stored in other ways as
well, are the file type and the preferred program to open it (the
creator). Each of these is a 4-character code (with reasonable
authority control). It was traditionally common, for instance, to
have a Mac file name that was arbitrary, and to record the fact that
this was a JPEG file in the type code rather than as part of the file
name.
6/ resource fork metadata is one type of filesystem metadata that may
need to be preserved, but it's not the only one. File names may be
important. File modes/protections may be important. Ad nauseam.
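
To make point 4 concrete, here is a minimal sketch of packing two forks
plus the 4-character type and creator codes from point 5 into a single
byte stream. The layout (the "2FRK" magic and the field order) is
invented for illustration; it is NOT the real MacBinary II, StuffIt or
two-file format, and the names pack_forks and unpack_forks are
hypothetical.

```python
import struct

# Hypothetical container layout, invented for this sketch (not MacBinary II):
#   4-byte magic, 4-byte type code, 4-byte creator code,
#   data-fork length and resource-fork length (big-endian unsigned 32-bit),
#   followed by the data fork bytes and then the resource fork bytes.
MAGIC = b"2FRK"
HEADER = struct.Struct(">4s4s4sII")


def pack_forks(data_fork, resource_fork, file_type, creator):
    """Pack both forks and the two 4-byte codes into one byte string."""
    if len(file_type) != 4 or len(creator) != 4:
        raise ValueError("type and creator codes must be exactly 4 bytes")
    header = HEADER.pack(MAGIC, file_type, creator,
                         len(data_fork), len(resource_fork))
    return header + data_fork + resource_fork


def unpack_forks(blob):
    """Reverse pack_forks; returns (data_fork, resource_fork, type, creator)."""
    magic, file_type, creator, data_len, rsrc_len = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("not a packed two-fork stream")
    start = HEADER.size
    if len(blob) != start + data_len + rsrc_len:
        raise ValueError("length fields do not match the stream size")
    data_fork = blob[start:start + data_len]
    resource_fork = blob[start + data_len:start + data_len + rsrc_len]
    return data_fork, resource_fork, file_type, creator
```

The only point of the sketch is that once the fork lengths are
recorded up front, a parser can split the single stream back into its
two forks without any help from the host filesystem.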
I think the issue of file names and in particular leading periods is
a separable problem, but with its own swamps. File names beginning
with periods are very common on unix and linux systems as well as on
Macs. There are other problems connected with file names, e.g. the
Unix restriction that file names not contain "/" or null, and the
widespread restriction that filenames have a fairly short length and
contain only US-ASCII characters (hence no Unicode).
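
A rough sketch of how one might flag the file name problems listed
above before moving files between systems. The function name
filename_issues and the default limit of 31 characters are assumptions
chosen for illustration; the real limit depends on the target
filesystem.

```python
def filename_issues(name, max_len=31):
    """Return a list of portability problems found in a single file name.

    max_len is an illustrative assumption, not a universal limit.
    """
    issues = []
    if name.startswith("."):
        issues.append("leading period")
    if "/" in name or "\0" in name:
        issues.append("contains '/' or NUL")
    if not name.isascii():
        issues.append("non-ASCII characters")
    if len(name) > max_len:
        issues.append("too long")
    return issues
```

A name such as ".profile" would be flagged only for its leading period,
while a plain name like "photo.jpg" would come back with no issues.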
I don't have an opinion as to whether the resource forks on the art
on file files need to be preserved, but I'm absolutely sure that we
WILL have occasions in the future where the content of resource forks
is critical to the meaning of the intellectual resource and needs to
be preserved.
JQ