: Henry Gladney
: Preserving Digital Information
: Springer-Verlag
: 9783540378877
: 1
: CHF 47.30
: Computer science
: English
: 326
: Watermark/DRM
: PC/MAC/eReader/Tablet
: PDF

Cultural history enthusiasts have asserted the urgent need to protect digital information from imminent loss. This book describes a methodology for long-term preservation of all kinds of digital documents. It justifies this methodology using 20th-century theory of knowledge communication, and outlines the requirements and architecture for the software needed. The author emphasizes attention to the perspectives and needs of end users.



Henry M. Gladney is an industry consultant for digital preservation and document management. In 2001, he founded his own company, HMG Consulting, based in Saratoga, CA, after having worked for IBM Research for decades, designing - among other systems - a digital library service that is the core of today's IBM Content Manager®. He is a regular author in the top ACM periodicals, holds eleven patents, and produces the 'Digital Document Quarterly', an online newsletter that has discussed preservation extensively.

12 Durable Representation (p. 235-236)

We want unambiguous communication with future generations with whom dialog is impossible, without restricting what today’s authors can communicate. For this, we need language that we can confidently expect our descendants to understand easily. This challenge is the kind of language problem that has been central to computer science since it emerged as a discipline in the 1960s. Its core can be restated as, "ensure that an arbitrary computer program will execute correctly on a machine whose architecture is unknown when the program is saved."

The English logician A. M. Turing showed in 1937 (and various computing machine experts have put this into practice since then in various particular ways) that it is possible to develop code instruction systems for a computing machine which cause it to behave as if it were another, specified, computing machine.…

A code, which according to Turing's schema is supposed to make one machine behave as if it were another specific machine… must do the following things. It must contain, in terms that the machine will understand (and purposively obey), instructions… that will cause the machine to examine every order it gets and determine whether this order has the structure appropriate to an order of the second machine. It must then contain, in terms of the order system of the first machine, sufficient orders to make the machine cause the actions to be taken that the second machine would have taken under the influence of the order in question.

The important result of Turing's is that in this way the first machine can be caused to imitate the behavior of any other machine.

von Neumann 1956, The Computer and the Brain, pp. 70–71

Durable encoding, described in this chapter, represents difficult content types with the aid of programs written in virtual machine code - the code of a machine we call a UVC (Universal Virtual Computer). This Turing-Machine-equivalent virtual machine is simple compared to the designs of practical hardware. Its design can be specified completely, concisely, and unambiguously for future interpretation.
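The idea of a simple, fully specified virtual machine can be illustrated with a minimal sketch. The five-order instruction set below (PUSH, READ, ADD, EMIT, HALT) and the function name run are hypothetical, chosen only for illustration; they are not the UVC design, which this excerpt does not specify. The sketch follows the quoted scheme: the host examines each order, checks that it is a valid order of the imitated machine, and then performs the corresponding action.

```python
# Toy stack machine. The instruction set is an assumption made for this
# illustration; it is NOT the UVC specification.
def run(program, inputs):
    """Interpret a list of (opcode, operand) orders and return the emitted values."""
    stack, output = [], []
    pending = list(inputs)          # values the program may READ, in order
    pc = 0                          # index of the next order to examine
    while pc < len(program):
        op, arg = program[pc]
        pc += 1
        if op == "PUSH":            # place a constant on the stack
            stack.append(arg)
        elif op == "READ":          # take the next input value
            stack.append(pending.pop(0))
        elif op == "ADD":           # replace the top two values by their sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "EMIT":          # move the top of the stack to the output
            output.append(stack.pop())
        elif op == "HALT":
            break
        else:                       # not an order of the imitated machine
            raise ValueError(f"unknown order: {op!r}")
    return output


# A saved "decoder" program: shift a stored value by a fixed offset.
program = [("READ", None), ("PUSH", 32), ("ADD", None), ("EMIT", None), ("HALT", None)]
decoded = run(program, [33])
```

Because the whole machine fits in a few lines, a future reader could reimplement it from a written description alone, which is the property the chapter asks of a durable virtual machine.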

Objects to be preserved might consist of several source files, each represented as a bit-stream in a Fig. 32 digital object collection, with labeled links between parts of the complete package. Much of each TDO (Trustworthy Digital Object) will be encoded using XML, relations, encryption algorithms, and identifiers. These are governed by relatively simple standards that are widely used - standards that we can be reasonably confident will be completely and correctly understood many years into the future. As described in §11.1, metadata can, and should, record the representation of each TDO component. The means for making each Fig. 32 content blob interpretable forever remains to be provided. What follows describes how this can be accomplished for a single content blob.
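The package structure described above can be sketched as a small data model. All class, field, and identifier names here (ContentBlob, Link, PreservedObject, "example-id-001", and so on) are hypothetical and are not taken from the book's Fig. 32; the sketch only shows bit-streams, labeled links between parts, and metadata that records each component's representation.

```python
from dataclasses import dataclass, field


@dataclass
class ContentBlob:
    name: str
    data: bytes              # the bit-stream itself
    representation: str      # metadata naming how the bit-stream is encoded


@dataclass
class Link:
    source: str
    label: str               # the meaning of the relationship
    target: str


@dataclass
class PreservedObject:
    identifier: str
    blobs: dict = field(default_factory=dict)
    links: list = field(default_factory=list)

    def add_blob(self, blob):
        self.blobs[blob.name] = blob

    def link(self, source, label, target):
        # Both ends of a labeled link must be parts of this package.
        if source not in self.blobs or target not in self.blobs:
            raise KeyError("link refers to a blob that is not in the package")
        self.links.append(Link(source, label, target))

    def missing_representation(self):
        """Names of blobs whose representation metadata is empty."""
        return [b.name for b in self.blobs.values() if not b.representation]


# Illustrative package with two parts and one labeled link.
obj = PreservedObject("example-id-001")
obj.add_blob(ContentBlob("report", b"<doc/>", "XML"))
obj.add_blob(ContentBlob("chart", b"\x00\x01", ""))
obj.link("report", "illustrated-by", "chart")
unlabeled = obj.missing_representation()
```

A check such as missing_representation is one simple way to flag components whose encoding has not yet been recorded, and which therefore could not be interpreted later.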

12.1 Representation Alternatives

We want information representation methods that can be embodied in tools whose use would be practical for information producers and consumers who do not have specialized skills or equipment.

Preface
Summary Table of Contents
Detailed Table of Contents
Figures
Tables
Part I: Why We Need Long-term Digital Preservation
1 State of the Art
1.1 What is Digital Information Preservation?
1.2 What Would a Preservation Solution Provide?
1.3 Why Do Digital Data Seem to Present Difficulties?
1.4 Characteristics of Preservation Solutions
1.5 Technical Objectives and Scope Limitations
1.6 Summary
2 Economic Trends and Social Issues
2.1 The Information Revolution
2.2 Economic and Technical Trends
2.3 Democratization of Information
2.4 Social Issues
2.5 Documents as Social Instruments
2.6 Why So Slow Toward Practical Preservation?
2.7 Selection Criteria: What is Worth Saving?
2.8 Summary
Part II: Information Object Structure
3 Introduction to Knowledge Theory
3.1 Conceptual Objects: Values and Patterns
3.2 Ostensive Definition and Names
3.3 Objective and Subjective: Not a Technological Issue
3.4 Facts and Values: How Can We Distinguish?
3.5 Representation Theory: Signs and Sentence Meanings
3.6 Documents and Libraries: Collections, Sets, and Classes
3.7 Syntax, Semantics, and Rules
3.8 Summary
4 Lessons from Scientific Philosophy
4.1 Intentional and Accidental Information
4.2 Distinctions Sought and Avoided
4.3 Information and Knowledge: Tacit and Human Aspects
4.4 Trusted and Trustworthy
4.5 Relationships and Ontologies
4.6 What Copyright Protection Teaches
4.7 Summary
5 Trust and Authenticity