| CONTENTS | 7 |
|---|---|
| PREFACE | 11 |
| ACKNOWLEDGEMENTS | 15 |
| PART I THE INDEXING AND ABSTRACTING ENVIRONMENT | 18 |
| Chapter 1 THE NEED FOR INDEXING AND ABSTRACTING TEXTS | 19 |
| 1. INTRODUCTION | 19 |
| 2. ELECTRONIC DOCUMENTS | 20 |
| 3. COMMUNICATION THROUGH NATURAL LANGUAGE TEXT | 21 |
| 4. UNDERSTANDING OF NATURAL LANGUAGE TEXT: THE COGNITIVE PROCESS | 23 |
| 5. UNDERSTANDING OF NATURAL LANGUAGE TEXT: THE AUTOMATED PROCESS | 24 |
| 6. IMPORTANT CONCEPTS IN INFORMATION RETRIEVAL AND SELECTION | 26 |
| 7. GENERAL SOLUTIONS TO THE INFORMATION RETRIEVAL PROBLEM | 33 |
| 8. THE NEED FOR BETTER AUTOMATIC INDEXING AND ABSTRACTING TECHNIQUES | 38 |
| Chapter 2 THE ATTRIBUTES OF TEXT | 43 |
| 1. INTRODUCTION | 43 |
| 2. THE STUDY OF TEXT | 43 |
| 3. AN OVERVIEW OF SOME COMMON TEXT TYPES | 45 |
| 4. TEXT DESCRIBED AT A MICRO LEVEL | 46 |
| 5. TEXT DESCRIBED AT A MACRO LEVEL | 54 |
| 6. CONCLUSIONS | 63 |
| Chapter 3 TEXT REPRESENTATIONS AND THEIR USE | 65 |
| 1. INTRODUCTION | 65 |
| 2. DEFINITIONS | 65 |
| 3. REPRESENTATIONS THAT CHARACTERIZE THE CONTENT OF TEXT | 66 |
| 3.1 Set of Natural Language Index Terms | |
| 4. INTELLECTUAL INDEXING AND ABSTRACTING | 71 |
| 4.1 General | |
| 5. USE OF THE TEXT REPRESENTATIONS | 76 |
| 6. A NOTE ABOUT THE STORAGE OF TEXT REPRESENTATIONS | 85 |
| 7. CHARACTERISTICS OF GOOD TEXT REPRESENTATIONS | 86 |
| 8. CONCLUSIONS | 89 |
| PART II METHODS OF AUTOMATIC INDEXING AND ABSTRACTING | 91 |
| Chapter 4 AUTOMATIC INDEXING: THE SELECTION OF NATURAL LANGUAGE INDEX TERMS | 93 |
| 1. INTRODUCTION | 93 |
| 2. A NOTE ABOUT EVALUATION | 94 |
| 3. LEXICAL ANALYSIS | 94 |
| 4. USE OF A STOPLIST | 96 |
| 5. STEMMING | 97 |
| 6. THE SELECTION OF PHRASES | 100 |
| 7. INDEX TERM WEIGHTING | 105 |
| 8. ALTERNATIVE PROCEDURES FOR SELECTING INDEX TERMS | 114 |
| 9. SELECTION OF NATURAL LANGUAGE INDEX TERMS: ACCOMPLISHMENTS AND PROBLEMS | 117 |
| 10. CONCLUSIONS | 118 |
| Chapter 5 AUTOMATIC INDEXING: THE ASSIGNMENT OF CONTROLLED LANGUAGE INDEX TERMS | 119 |
| 1. INTRODUCTION | 119 |
| 2. A NOTE ABOUT EVALUATION | 120 |
| 3. THESAURUS TERMS | 122 |
| 4. SUBJECT AND CLASSIFICATION CODES | 127 |
| 5. LEARNING APPROACHES TO TEXT CATEGORIZATION | 131 |
| 6. ASSIGNMENT OF CONTROLLED LANGUAGE INDEX TERMS: ACCOMPLISHMENTS AND PROBLEMS | 147 |
| 7. CONCLUSIONS | 148 |
| Chapter 6 AUTOMATIC ABSTRACTING: THE CREATION OF TEXT SUMMARIES | 149 |
| 1. INTRODUCTION | 149 |
| 2. A NOTE ABOUT EVALUATION | 150 |
| 3. THE TEXT ANALYSIS STEP | 152 |
| 4. THE TRANSFORMATION STEP | 164 |
| 4.1 Selection and Generalization of the Content | |
| 5. GENERATION OF THE ABSTRACT | 166 |
| 6. TEXT ABSTRACTING: ACCOMPLISHMENTS AND PROBLEMS | 168 |
| 7. CONCLUSIONS | 170 |
| PART III APPLICATIONS | 172 |
| Chapter 7 TEXT STRUCTURING AND CATEGORIZATION WHEN SUMMARIZING LEGAL CASES | 173 |
| 1. INTRODUCTION | 173 |
| 2. TEXT CORPUS AND OUTPUT OF THE SYSTEM | 174 |
| 3. METHODS: THE USE OF A TEXT GRAMMAR | 177 |
| 4. RESULTS AND DISCUSSION | 181 |
| 5. CONTRIBUTIONS OF THE RESEARCH | 184 |
| 6. CONCLUSIONS | 188 |
| Chapter 8 CLUSTERING OF PARAGRAPHS WHEN SUMMARIZING LEGAL CASES | 189 |
| 1. INTRODUCTION | 189 |
| 2. TEXT CORPUS AND OUTPUT OF THE SYSTEM | 190 |
| 3. METHODS: THE CLUSTERING TECHNIQUES | 191 |
| 4. RESULTS AND DISCUSSION | 197 |
| 5. CONTRIBUTIONS OF THE RESEARCH | 204 |
| 6. CONCLUSIONS | 206 |
| Chapter 9 THE CREATION OF HIGHLIGHT ABSTRACTS OF MAGAZINE ARTICLES | 207 |
| 1. INTRODUCTION | 207 |
| 2. TEXT CORPUS AND OUTPUT OF THE SYSTEM | 208 |
| 3. METHODS: THE USE OF A TEXT GRAMMAR | 210 |
| 4. RESULTS AND DISCUSSION | 217 |
| 5. CONTRIBUTIONS OF THE RESEARCH | 220 |
| 6. CONCLUSIONS | 221 |
| Chapter 10 THE ASSIGNMENT OF SUBJECT DESCRIPTORS TO MAGAZINE ARTICLES | 223 |
| 1. INTRODUCTION | 223 |
| 2. TEXT CORPUS AND OUTPUT OF THE SYSTEM | 224 |
| 3. METHODS: SUPERVISED LEARNING OF CLASSIFICATION PATTERNS | 226 |
| 4. RESULTS AND DISCUSSION | 233 |
| 5. CONTRIBUTIONS OF THE RESEARCH | 240 |
| 6. CONCLUSIONS | 241 |
| SUMMARY AND FUTURE PROSPECTS | 243 |
| 1. SUMMARY | 243 |
| 2. FUTURE PROSPECTS | 251 |
| REFERENCES | 253 |
| SUBJECT INDEX | 277 |