Fault-Tolerant Systems
An introduction to the book "Fault-Tolerant Systems" by Israel Koren (D.Sc., Electrical Engineering, Israel Institute of Technology, Haifa) and C. Mani Krishna (Ph.D., University of Michigan), published by Morgan Kaufmann; Elsevier Science [distributor] in 2007. The book is available in PDF format, in English. "Fault-Tolerant Systems" is listed under the Uncategorized category.
There are many applications in which the reliability of the overall system must be far higher than the reliability of its individual components. In such cases, designers devise mechanisms and architectures that allow the system to either completely mask the effects of a component failure or recover from it so quickly that the application is not seriously affected. This is the work of fault-tolerant designers and their work is increasingly important and complex not only because of the increasing number of "mission critical" applications, but also because the diminishing reliability of hardware means that even systems for non-critical applications will need to be designed with fault-tolerance in mind. Reflecting the real-world challenges faced by designers of these systems, this book addresses fault tolerance design with a systems approach to both hardware and software. No other text on the market takes this approach, nor offers the comprehensive and up-to-date treatment Koren and Krishna provide. Students, designers and architects of high performance processors will value this comprehensive overview of the field. 
* The first book on fault tolerance design with a systems approach
* Comprehensive coverage of both hardware and software fault tolerance, as well as information and time redundancy
* Incorporated case studies highlight six different computer systems with fault-tolerance techniques implemented in their design
* Available to lecturers is a complete ancillary package including online solutions manual for instructors and PowerPoint slides
Title Page
Copyright Page
Table of Contents
Foreword
Preface
Acknowledgements
About the Authors
1 Preliminaries
1.1 Fault Classification
1.2 Types of Redundancy
1.3 Basic Measures of Fault Tolerance
1.3.1 Traditional Measures
1.3.2 Network Measures
1.4 Outline of This Book
1.5 Further Reading
References
2.1 The Rate of Hardware Failures
2.2 Failure Rate, Reliability, and Mean Time to Failure
2.3 Canonical and Resilient Structures
2.3.1 Series and Parallel Systems
2.3.2 Non-Series/Parallel Systems
2.3.3 M-of-N Systems
2.3.5 Variations on N-Modular Redundancy
2.3.6 Duplex Systems
2.4.1 Poisson Processes
2.4.2 Markov Models
2.5 Fault-Tolerance Processor-Level Techniques
2.5.1 Watchdog Processor
2.5.2 Simultaneous Multithreading for Fault Tolerance
2.6 Byzantine Failures
2.6.1 Byzantine Agreement with Message Authentication
2.8 Exercises
References
3 Information Redundancy
3.1 Coding
3.1.1 Parity Codes
3.1.2 Checksum
3.1.3 M-of-N Codes
3.1.4 Berger Code
3.1.5 Cyclic Codes
3.1.6 Arithmetic Codes
3.2.1 RAID Level 1
3.2.2 RAID Level 2
3.2.3 RAID Level 3
3.2.4 RAID Level 4
3.2.6 Modeling Correlated Failures
3.3 Data Replication
3.3.1 Voting: Non-Hierarchical Organization
3.3.2 Voting: Hierarchical Organization
3.3.3 Primary-Backup Approach
3.4 Algorithm-Based Fault Tolerance
3.5 Further Reading
3.6 Exercises
References
4 Fault-Tolerant Networks
4.1.1 Graph-Theoretical Measures
4.1.2 Computer Networks Measures
4.2.1 Multistage and Extra-Stage Networks
4.2.2 Crossbar Networks
4.2.3 Rectangular Mesh and Interstitial Mesh
4.2.4 Hypercube Network
4.2.5 Cube-Connected Cycles Networks
4.2.6 Loop Networks
4.2.7 Ad Hoc Point-to-Point Networks
4.3 Fault-Tolerant Routing
4.3.1 Hypercube Fault-Tolerant Routing
4.3.2 Origin-Based Routing in the Mesh
4.4 Further Reading
4.5 Exercises
References
5 Software Fault Tolerance
5.1 Acceptance Tests
5.2.1 Wrappers
5.2.2 Software Rejuvenation
5.2.3 Data Diversity
5.2.4 Software Implemented Hardware Fault Tolerance (SIHFT)
5.3 N-Version Programming
5.3.1 Consistent Comparison Problem
5.3.2 Version Independence
5.4.2 Success Probability Calculation
5.4.3 Distributed Recovery Blocks
5.6 Exception-Handling
5.6.1 Requirements from Exception-Handlers
5.6.2 Basics of Exceptions and Exception-Handling
5.6.3 Language Support
5.7.1 Jelinski-Moranda Model
5.7.2 Littlewood-Verrall Model
5.7.3 Musa-Okumoto Model
5.8.1 Primary-Backup Approach
5.8.2 The Circus Approach
5.9 Further Reading
5.10 Exercises
References
6 Checkpointing
6.1 What Is Checkpointing?
6.2 Checkpoint Level
6.3 Optimal Checkpointing - An Analytical Model
6.3.1 Time Between Checkpoints - A First-Order Approximation
6.3.2 Optimal Checkpoint Placement
6.3.3 Time Between Checkpoints - A More Accurate Model
6.3.4 Reducing Overhead
6.3.5 Reducing Latency
6.4 Cache-Aided Rollback Error Recovery (CARER)
6.5 Checkpointing in Distributed Systems
6.5.1 The Domino Effect and Livelock
6.5.2 A Coordinated Checkpointing Algorithm
6.5.3 Time-Based Synchronization
6.5.4 Diskless Checkpointing
6.5.5 Message Logging
6.6 Checkpointing in Shared-Memory Systems
6.6.1 Bus-Based Coherence Protocol
6.6.2 Directory-Based Protocol
6.7 Checkpointing in Real-Time Systems
6.9 Further Reading
6.10 Exercises
References
7.1.1 Architecture
7.1.3 Software
7.1.4 Modifications to the NonStop Architecture
7.2 Stratus Systems
7.3 Cassini Command and Data Subsystem
7.4 IBM G5
7.5 IBM Sysplex
7.6 Itanium
7.7 Further Reading
References
8.1 Manufacturing Defects and Circuit Faults
8.2 Probability of Failure and Critical Area
8.3 Basic Yield Models
8.3.1 The Poisson and Compound Poisson Yield Models
8.3.2 Variations on the Simple Yield Models
8.4 Yield Enhancement Through Redundancy
8.4.1 Yield Projection for Chips with Redundancy
8.4.2 Memory Arrays with Redundancy
8.4.3 Logic Integrated Circuits with Redundancy
8.4.4 Modifying the Floorplan
8.5 Further Reading
8.6 Exercises
References
9 Fault Detection in Cryptographic Systems
9.1.1 Symmetric Key Ciphers
9.1.2 Public Key Ciphers
9.2 Security Attacks Through Fault Injection
9.2.1 Fault Attacks on Symmetric Key Ciphers
9.2.2 Fault Attacks on Public (Asymmetric) Key Ciphers
9.3 Countermeasures
9.3.2 Error-Detecting Codes
9.3.3 Are These Countermeasures Sufficient?
9.5 Exercises
References
10.1 Writing a Simulation Program
10.2.1 Point Versus Interval Estimation
10.2.2 Method of Moments
10.2.3 Method of Maximum Likelihood
10.2.4 The Bayesian Approach to Parameter Estimation
10.2.5 Confidence Intervals
10.3.1 Antithetic Variables
10.3.2 Using Control Variables
10.3.3 Stratified Sampling
10.3.4 Importance Sampling
10.4 Random Number Generation
10.4.1 Uniformly Distributed Random Number Generators
10.4.2 Testing Uniform Random Number Generators
10.4.3 Generating Other Distributions
10.5 Fault Injection
10.5.1 Types of Fault Injection Techniques
10.6 Further Reading
10.7 Exercises
References
Index
Much as we hate to admit it, most prototyping practice lacks a sophisticated understanding of the broad concepts of prototyping and its strategic position within the development process. Often we overwhelm with a high fidelity prototype that designs us into a corner. Or, we can underwhelm with a prototype with too much ambiguity and flexibility to be of much use in the software development process.
This book will help software makers—developers, designers, and architects—build effective prototypes every time: prototypes that convey enough information about the product at the appropriate time and thus set expectations appropriately.
This practical, informative book will help anyone—whether or not one has artistic talent, access to special tools, or programming ability—to use good prototyping style, methods, and tools to build prototypes and manage for effective prototyping.
Features
* A prototyping process with guidelines, templates, and worksheets;
* Overviews and step-by-step guides for 9 common prototyping techniques;
* An introduction with step-by-step guidelines to a variety of prototyping tools that do not require advanced artistic skills;
* Templates and other resources used in the book available on the Web for reuse;
* Clearly-explained concepts and guidelines;
* Full-color illustrations, and examples from a wide variety of prototyping processes, methods, and tools.
Jonathan Arnowitz is a principal user experience designer at SAP Labs and is the co-editor-in-chief of Interactions Magazine. Most recently Jonathan was a senior user experience designer at Peoplesoft. He is a member of the SIGCHI executive committee, and was a founder of DUX, the first ever joint conference of ACM SIGCHI, ACM SIGGRAPH, AIGA Experience Design Group, and STC.
Michael Arent is the manager of user experience design at SAP Labs, and has previously held positions at Peoplesoft, Inc, Adobe Systems, Inc, Sun Microsystems, and Apple Computer, Inc. He holds several U.S. patents.
Nevin Berger is design director at Ziff Davis Media. Previously he was a senior interaction designer at Oracle Corporation and Peoplesoft, Inc., and has held creative director positions at ZDNet, World Savings, and OFOTO, Inc.
* www.mkp.com/prototyping
Effective Prototyping for Software Makers is an ideal resource for usability professionals and interaction designers; software developers, web application designers, web designers, information architects, information and industrial designers.