File Structures (2nd Edition)

معرفی کتاب «File Structures (2nd Edition)» نوشتهٔ Michael J. Folk, Bill Zoellick, Greg Riccardi در سال 1991. این کتاب در 5 صفحه، فرمت pdf، زبان انگلیسی ارائه شده است. «File Structures (2nd Edition)» در دستهٔ بدون دسته‌بندی قرار دارد.

This book provides the conceptual tools to build file structures that can be quickly and efficiently accessed. It teaches good design judgement through an approach that puts the "hands-on" work of constructing and running programs at the center of the learning process. This best-selling book has been thoroughly updated. It includes timely coverage of file structures in a UNIX environement, in addition to a new and substantial appendix on CD-ROM. All former programs in C and Pascal have been updated to ANSI C and Turbo Pascal 6.0. 1 Introduction to File Structures 1.1 The Heart of File Structure Design 1.2 A Short History of File Structure Design 1.3 A Conceptual Toolkit: File Structure Literacy Summary • Key Terms 2 Fundamental File Processing Operations 2.1 Physical Files and Logical Files 2.2 Opening Files 2.3 Closing Files 2.4 Reading and Writing 2.4.1 Read and Write Functions 2.4.2 A Program to Display the Contents of a File 2.4.3 Detecting End-of-File 2.5 Seeking 2.5.1 Seeking in C 2.5.2 Seeking in Pascal 2.6 Special Characters in Files 2.7 The UNIX Directory Structure 2.8 Physical and Logical Files in UNIX 2.8.1 Physical Devices as UNIX Files 2.8.2 The Console, the Keyboard, and Standard Error 2.8.3 1/0 Redirection and Pipes 2.9 File-related Header Files 2.10 UNIX Filesystem Commands Summary Key Terms • Exercises Further Readings 3 Secondary Storage and System Software 3.1 Disks 3.1.1 The Organization of Disks 3.1.2 Estimating Capacities and Space Needs 3.1.3 Organizing Tracks by Sector 3.1.4 Organizing Tracks by Block 3.1.5 Nondata Overhead 3.1.6 The Cost of a Disk Access 3.1.7 Effect of Block Size on Performance: A UNIX Example 3.1.8 Disk as Bottleneck 3.2 Magnetic Tape 3.2.1 Organization of Data on Tapes 3.2.2 Estimating Tape Length Requirements 3.2.3 Estimating Data Transmission Times 3.2.4 Tape Applications 3.3 Disk Versus Tape 3.4 Storage as a Hierarchy 3.5 A Journey of a Byte 3.5.1 The File Manager 3.5.2 The 1/0 Buffer 3.5.3 The Byte Leaves RAM: The 1/0 Processor and Disk Controller 3.6 Buffer Management 3.6.1 Buffer Bottlenecks 3.6.2 Buffering Strategies 3.7 1/0 in UNIX 3.7.1 The Kernel 3.7.2 Linking File Names to Files 3.7.3 Normal Files, Special Files, and Sockets 3.7.4 Block 1/0 3.7.5 Device Drivers 3.7.6 The Kernel and Filesystems 3.7.7 Magnetic Tape and UNIX Summary • Key Terms • Exercises Further Readings 4 Fundamental File Structure Concepts 4.1 Field and Record Organization 4.1.1 A Stream File 4.1.2 Field Structures 4.1.3 Reading a Stream of Fields 4.1.4 Record Structures 4.1.5 A Record Structure That Uses a Length Indicator 4.1.6 Mixing Numbers and Characters: Use of a File Dump 4.2 Record Access 4.2.1 Record Keys 4.2.2 A Sequential Search 4.2.3 UNIX Tools for Sequential Processing 1 4.2.4 Direct Access 4.3 More about Record Structures 4.3.1 Choosing a Record Structure and Record Length 4.3.2 Header Records 4.4 File Access and File Organization 4.5 Beyond Record Structures 4.5.1 Abstract Data Models 4.5.2 Headers and Self-Describing Files 4.5.3 Metadata 4.5.4 Color Raster Images 4.5.5 Mixing Object Types in One File 4.5.6 Object-oriented File Access 4.5.7 Extensibility 4.6 Portability and Standardization 4.6.1 Factors Affecting Portability 4.6.2 Achieving Portability Summary • Key Terms • Exercises Further Readings C Programs Pascal Programs 5 Organizing Files for Performance 5.1 Data Compression 5.1.1 Using a Different Notation 5.1.2 Suppressing Repeating Sequences 5.1.3 Assigning Variable-length Codes 5.1.4 Irreversible Compression Techniques 5.1.5 Compression in UNIX 5.2 Reclaiming Space in Files 5.2.1 Record Deletion and Storage Compaction 5.2.2 Deleting Fixed-length Records for Reclaiming Space Dynamically 5.2.3 Deleting Variable-length Records 5.2.4 Storage Fragmentation 5.2.5 Placement Strategies 5.3 Finding Things Quickly: An Introduction to Internal Sorting and Binary Searching 5.3.l Finding Things in Simple Field and Record Files 5.3.2 Search by Guessing: Binary Search 5.3.3 Binary Search versus Sequential Search 5.3.4 Sorting a Disk File in RAM 5.3.5 The Limitations of Binary Searching and Internal Sorting 5.4 Keysorting 5.4.1 Description of the Method 5.4.2 Limitations of the Keysort Method 5.4.3 Another Solution: Why Bother to Write the File Back? 5.4.4 Pinned Records Summary • Key Terms • Exercises Further Readings 6 Indexing 6.1 What Is an Index? 6.2 A Simple Index with an Entry-Sequenced File 6.3 Basic Operations on an Indexed, Entry-Sequenced File 6.4 Indexes That Are Too Large to Hold in Memory 6.5 Indexing to Provide Access by Multiple Keys 6.6 Retrieval Using Combinations of Secondary Keys 6.7 Improving the Secondary Index Structure: Inverted Lists 6.7.1 A First Attempt at a Solution 6.7.2 A Better Solution: Linking the List of References 6.8 Selective Indexes 6.9 Binding Summary • Key Terms • Exercises Further Readings 7 Cosequential Processing and the Sorting of Large Files 7.1 A Model for Implementing Cosequential Processes 7.1.1 Matching Names in Two Lists 7.1.2 Merging Two Lists 7.1.3 Summary of the Cosequential Processing Model 7.2 Application of the Model to a General Ledger Program 7.2.1 The Problem 7.2.2 Application of the Model to the Ledger Program 7.3 Extension of the Model to Include Multiway Merging 7.3.1 A K-way Merge Algorithm 7.3.2 A Selection Tree for Merging Large Numbers of Lists 7.4 A Second Look at Sorting in RAM 7.4.1 Overlapping Processing and 1/0: Heapsort 7.4.2 Building the Heap while Reading in the File 7.4.3 Sorting while Writing out to the File 7.5 Merging as a Way of Sorting Large Files on Disk 7.5.1 How Much Time Does a Merge Sort Take? 7.5.2 Sorting a File That Is Ten Times Larger 7.5.3 The Cost of Increasing the File Size 7.5.4 Hardware-based Improvements 7.5.5 Decreasing the Number of Seeks Using Multiple-step Merges 7.5.6 Increasing Run Lengths Using Replacement Selection 7.5.7 Replacement Selection Plus Multistep Merging 7.5.8 Using Two Disk Drives with Replacement Selection 7.5.9 More Drives? More Processors? 7.5.10 Effects of Multiprogramming 7.5.11 A Conceptual Toolkit for External Sorting 7.6 Sorting Files on Tape 7.6.1 The Balanced Merge 7.6.2 The K-way Balanced Merge 7.6.3 Multiphase Merges 7.6.4 Tapes versus Disks for External Sorting 7.7 Sort-Merge Packages 7.8 Sorting and Cosequential Processing in UNIX 7.8.1 Sorting and Merging in UNIX 7.8.2 Cosequential Processing Utilities in UNIX Summary • Key Terms • Exercises Further Readings 8 B-Trees and Other Tree-structured File Organizations 8.1 Introduction: The Invention of the B-Tree 8.2 Statement of the Problem 8.3 Binary Search Trees as a Solution 8.4 AVL Trees 8.5 Paged Binary Trees 8.6 The Problem with the Top-down Construction of Paged Trees 8.7 B-Trees: Working up from the Bottom 8.8 Splitting and Promoting 8.9 Algorithms for B-Tree Searching and Insertion 8.10 B-Tree Nomenclature 8.11 Formal Definition of B-Tree Properties 8.12 Worst-case Search Depth 8.13 Deletion, Redistribution, and Concatenation 8.13.1 Redistribution 8.14 Redistribution during Insertion: A Way to Improve Storage Utilization 8.15 B* Trees 8.16 Buffering of Pages: Virtual B-Trees 8.16.1 LRU Replacement 8.16.2 Replacement Based on Page Height 8.16.3 Importance of Virtual B-Trees 8.17 Placement of Information Associated with the Key 8.18 Variable-length Records and Keys Summary • Key Terms • Exercises Further Readings C Programs to Insert Keys into a B-Tree Pascal Programs to Insert Keys into a B-Tree 9 The B+ Tree Family and Indexed Sequential File Access 9.1 Indexed Sequential Access 9.2 Maintaining a Sequence Set 9.2.1 The Use of Blocks 9.2.2 Choice of Block Size 9.3 Adding a Simple Index to the Sequence Set 9.4 The Content of the Index: Separators Instead of Keys 9.5 The Simple Prefix B+ Tree 9.6 Simple Prefix B + Tree Maintenance 9.6.1 Changes Localized to Single Blocks in the Sequence Set 9.6.2 Changes Involving Multiple Blocks in the Sequence Set 9.7 Index Set Block Size 9.8 Internal Structure of Index Set Blocks: A Variable-order B-Tree 9.9 Loading a Simple Prefix B+ Tree 9.10 B+ Trees 9.11 B-Trees, B+ Trees, and Simple Prefix B+ Trees in Perspective Summary • Key Terms • Exercises Further Readings 10 Hashing 10.1 Introduction 10.1.1 What is Hashing? 10.1.2 Collisions 10.2 A Simple Hashing Algorithm 10.3 Hashing Functions and Record Distributions 10.3.1 Distributing Records among Addresses 10.3.2 Some Other Hashing Methods 10.3.3 Predicting the Distribution of Records 10.3.4 Predicting Collisions for a Full File 10.4 How Much Extra Memory Should Be Used? 10.4.1 Packing Density 10.4.2 Predicting Collisions for Different Packing Densities 10.5 Collision Resolution by Progressive Overflow 10.5.1 How Progressive Overflow Works 10.5.2 Search Length 10.6 Storing More Than One Record per Address: Buckets 10.6.1 Effects of Buckets on Performance 10.6.2 Implementation Issues 10.7 Making Deletions 10.7.1 Tombstones for Handling Deletions 10.7.2 Implications of Tombstones for Insertions 10.7.3 Effects of Deletions and Additions on Performance 10.8 Other Collision Resolution Techniques 10.8.1 Double Hashing 10.8.2 Chained Progressive Overflow 10.8.3 Chaining with a Separate Overflow Area 10.8.4 Scatter Tables: Indexing Revisited 10.9 Patterns of Record Access Summary • Key Terms • Exercises Further Readings 11 Extendible Hashing 11.1 Introduction 11.2 How Extendible Hashing Works 11.2.1 Tries 11.2.2 Turning the Trie into a Directory 11.2.3 Splitting to Handle Overflow 11.3 Implementation 11.3.1 Creating the Addresses 11.3.2 Implementing the Top-level Operations 11.3.3 Bucket and Directory Operations 11.3.4 Implementation Summary 11.4 Deletion 11.4.1 Overview of the Deletion Process 11.4.2 A Procedure for Finding Buddy Buckets 11.4.3 Collapsing the Directory 11.4.4 Implementing the Deletion Operations 11.4.5 Summary of the Deletion Operation 11.5 Extendible Hashing Performance 11.5.1 Space Utilization for Buckets 11.5.2 Space Utilization for the Directory 11.6 Alternative Approaches 11.6.1 Dynamic Hashing 11.6.2 Linear Hashing 11.6.3 Approaches to Controlling Splitting Summary • Key Terms • Exercises Further Readings Appendix A: File Structures on CD-ROM A.1 Using this Appendix A.2 Introduction to CD-ROM A.2.1 A Short History of CD-ROM A.2.2 CD-ROM as a File Structure Problem A.3 Physical Organization of CD-ROM A.3.1 Reading Pits and Lands A.3.2 CLV Instead of CAV A.3.3 Addressing A.3.4 Structure of a Sector A.4 CD-ROM Strengths and Weaknesses A.4.1 Seek Performance A.4.2 Data Transfer Rate A.4.3 Storage Capacity A.4.4 Read-Only Access A.4.5 Asymmetric Writing and Reading A.5 Tree Structures on CD-ROM A.5.1 Design Exercises A.5.2 Block Size A.5.3 Special Loading Procedures and Other Considerations A.5.4 Virtual Trees and Buffering Blocks A.5.5 Trees as Secondary Indexes on CD-ROM A.6 Hashed Files on CD-ROM A.6.1 Design Exercises A.6.2 Bucket Size A.6.3 How the Size of CD-ROM Helps A.6.4 Advantages of CD-ROM's Read-Only Status A.7 The CD-ROM File System A.7.1 The Problem A.7.2 Design Exercise A.7.3 A Hybrid Design Summary Appendix B: ASCII Table Appendix C: String Functions in Pascal: tools.prc Functions and Procedures Used to Operate on strng Appendix D: Comparing Disk Drives Bibliography Index This second edition has been thoroughly updated to instruct readers on the design of fast and flexible file structures. It includes coverage of file structures in a UNIX environment, in addition to a new and substantial appendix on CD-ROM. Other modern file structures such as extendible hashing methods are also explored. This book develops a framework for approaching the design of systems to store and retrieve information on magnetic disks and other mass storage devices. It provides a fundamental collection of tools that any user needs in order to design intelligent, cost-effective, and appropriate solutions to file structure problems

دانلود کتاب File Structures (2nd Edition)