Defining digital sustainability.
Publication Date: 22-JUN-07
Publication Title: Library Trends
Format: Online
Author: Bradley, Kevin

ABSTRACT

This paper investigates what is meant by digital sustainability and establishes that it encompasses a range of issues and concerns that contribute to the longevity of digital information. A significant and integral part of digital sustainability is digital preservation, which has focused on one technical concern after another as issues and fashions have shifted over the last twenty years. Digital sustainability is demonstrated as providing an appropriate context for digital preservation because it requires consideration of the overall life cycle, technical, and socio-technical issues associated with the creation and management of digital items.

INTRODUCTION

If digital technologies had a sense of humor, a joke between them might run: There are 10 types of technologies in this world: those that understand binary, and those that don't. Digital storage and delivery technologies allow the encoding of meaningful representations into two states, 0 and 1; a state of being and a state of not-being, of on and off, of plus and minus, or of falling below or climbing above a defined or given threshold. If the permanent maintenance of any given state, or set of states, were the definition of digital sustainability, then we could merely select a suitable technical strategy to permanently inscribe those states and entrust the objects to an appropriate storage and preservation strategy. However, the layers of dependencies and interdependencies, standards, agreements, understandings, technologies, strategies, workflows, and business models render that simple preservation model indefensible.

Thinking about some of the protocols associated with storing and accessing digital coding may help to illustrate these intricate dependencies. A bit, the lowest level of information, is meaningful only in relation to other bits with which it is associated; eight bits form a byte, and a word length might be 16, 32, or 64 bits depending on the operating system and the type of data. The word may exist, but it is just a seamless string of digits unless the system knows where the word or byte starts and finishes. The data is allocated a place on a disc that is formatted in a particular manner. The Microsoft disc operating system (MS-DOS) uses a file allocation table (FAT), which may be FAT 12, FAT 16, FAT 32, or FAT 64, depending on the memory space and partition size. In a UNIX environment the file system structure is managed through data structures called inodes. Mac computers have used inodes since Mac OS X was released in 2001, and used their own proprietary system for OS 9 and all earlier operating systems. As well as these there are many legacy disc structures associated with operating systems no longer supported; eventually all the current systems will also become legacy. Various tables and structures define the "address" at which data may be found.
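
As a minimal sketch of the point about word boundaries, the Python below cuts the same string of bits into words of two different lengths. The function name `split_words` and the sample bit string are illustrative assumptions, not drawn from any particular system.

```python
def split_words(bits: str, word_length: int) -> list:
    """Cut a string of '0'/'1' characters into words and return each as an integer."""
    if len(bits) % word_length != 0:
        raise ValueError("bit string does not divide evenly into words")
    return [
        int(bits[i:i + word_length], 2)
        for i in range(0, len(bits), word_length)
    ]


stream = "0100100001101001"  # sixteen bits with no boundaries marked

as_bytes = split_words(stream, 8)    # two 8-bit words: [72, 105]
as_word = split_words(stream, 16)    # one 16-bit word: [18537]
```

The identical bits yield either the two values 72 and 105 or the single value 18537; nothing in the stream itself says which reading was intended.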

Some systems, such as compact discs, use a small range of hard-coded words to describe the original word, and a lookup table is needed to associate the coded word with the stored word. If the data is backed up on tape, as is customary, then there is a different range of data storage protocols, tape standards, and potentially complex compression algorithms. Assuming the data can be found, and the appropriate word substituted where necessary, the operating chip will need to know if the word is big-endian or little-endian. The byte stream is described as little-endian when the low-order byte of the number is stored in memory at the lowest address, and the high-order byte at the highest address; big-endian is the reverse. This is an issue for the operating chip; the chips used in PCs have tended to be little-endian, while those used in Macs have tended to be big-endian. As a consequence, file formats developed on one platform or another may specify byte order. For example, a bitmap (.bmp) specifies a little-endian byte order, while a JPEG expects big-endian. TIFF image files can be big- or little-endian, and encode in their metadata the form to which they conform. The byte order can be reversed, but the system must know that the reversal is necessary.
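
The effect of byte order can be sketched in a few lines of Python. The first part reads the same two bytes both ways; the second reads the byte-order marker at the start of a TIFF file, which the TIFF specification defines as `II` for little-endian or `MM` for big-endian, followed by the number 42. The function name `tiff_byte_order` is an illustrative assumption.

```python
import struct

raw = b"\x01\x02"
little = int.from_bytes(raw, "little")  # 0x0201 = 513
big = int.from_bytes(raw, "big")        # 0x0102 = 258


def tiff_byte_order(header: bytes) -> str:
    """Return 'little' or 'big' from the first four bytes of a TIFF file."""
    marker = header[:2]
    if marker == b"II":
        fmt, order = "<H", "little"
    elif marker == b"MM":
        fmt, order = ">H", "big"
    else:
        raise ValueError("not a TIFF byte-order marker")
    # The next two bytes hold the number 42, stored in the declared byte order.
    (magic,) = struct.unpack(fmt, header[2:4])
    if magic != 42:
        raise ValueError("TIFF magic number not found")
    return order
```

For instance, `tiff_byte_order(b"II\x2a\x00")` returns `"little"` and `tiff_byte_order(b"MM\x00\x2a")` returns `"big"`.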

The host computer must have access to enough coded information to allow it to recognize the binary file format and associate it with the appropriate piece of software. The version of the file is generally only known after the file is opened; then rendering software attempts to open the file if it is a version that it recognizes.
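
One common way a system recognizes a binary format is by the signature bytes at the start of the file. The following is a rough sketch only: the table covers a handful of well-known signatures, and the names `SIGNATURES`, `identify_format`, and `pdf_version` are assumptions made for illustration. It also shows that a version, where the format records one, is available only once the content is read.

```python
from typing import Optional

# Leading byte signatures for a few common formats (a small illustrative subset).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"%PDF-", "PDF"),
    (b"BM", "BMP"),
]


def identify_format(data: bytes) -> str:
    """Return a format name based on the leading bytes, or 'unknown'."""
    for signature, name in SIGNATURES:
        if data.startswith(signature):
            return name
    return "unknown"


def pdf_version(data: bytes) -> Optional[str]:
    """Return the version text after '%PDF-' (e.g. '1.4'), or None if not a PDF."""
    if not data.startswith(b"%PDF-"):
        return None
    return data[5:8].decode("ascii")
```

Here `identify_format(b"%PDF-1.4\n")` gives `"PDF"` and `pdf_version(b"%PDF-1.4\n")` gives `"1.4"`; software that only understands earlier versions could then decline to render the file.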

If the file is character-oriented, it will be necessary to decode the character set, which may be described in 7-bit or 8-bit ASCII (American Standard Code for Information Interchange), or UTF-7 or UTF-8 (Unicode Transformation Format), or a number of variants. Various lookup tables describe the relationship between the code and the text it represents. Characters associated with a particular language are an issue, and the character sets might contain Chinese, Japanese, or Korean characters (CJK) or Arabic characters, whose transliteration to Roman characters is described in the standards ISO 233 or DIN 31635. Similar standards exist for other character sets.
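
To illustrate why the character set must be known, the sketch below decodes the same five bytes under three encodings. The helper name `try_decode` is an assumption, and Latin-1 is included only as an example of an 8-bit variant; it is not named in the text above.

```python
def try_decode(data: bytes, encoding: str) -> str:
    """Decode bytes with the given encoding, or report that decoding failed."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return "<undecodable as " + encoding + ">"


raw = b"caf\xc3\xa9"  # the UTF-8 encoding of the word "café"

as_utf8 = try_decode(raw, "utf-8")      # 'café'
as_latin1 = try_decode(raw, "latin-1")  # 'cafÃ©' (two wrong characters, no error)
as_ascii = try_decode(raw, "ascii")     # '<undecodable as ascii>'
```

The Latin-1 reading is the more dangerous failure for preservation purposes, since it produces plausible-looking but wrong text without raising any error.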

The browser or rendering software, if it is needed, must not only be appropriate to the version of the file, but also to the operating system on which it will operate. If the file we are trying to preserve is an executable file, it too must have the appropriate operating system on which to run. The operating system must be at the proper service pack, have the correct patch and install levels, and have the appropriate device drivers. A specific example of the level of compliance required, as well as an area of constant problems, might be the dynamic link library (DLL), a file that stores data used by Windows programs and links to those programs at "runtime." Often the DLL used by a particular program is missing, corrupted, or altered by the hardware or by another program that shares it. This generally produces an error message and requires a reinstallation of the DLL file. The number of DLL files available is very large, and the process of identifying, tracking down, and installing the proper file is described by IT support staff as "DLL hell." Changes to the kernel, which is responsible for process and task management, and memory and disk management, can render a program inoperable, as can the inability to locate low-level libraries in UNIX systems. The way in which operating systems and programs interact is complex, subject to change, and mediated by commercial interests; faults or incompatibilities in any of these areas can make the whole system seem very fragile.

Besides the software interaction, file functionality also depends on standards, agreements, and understandings...


