Distributed System Design

معرفی کتاب «Distributed System Design» نوشتهٔ Wu, Jie، منتشرشده توسط نشر CRC Press LLC در سال 1998. این کتاب در فرمت pdf، زبان انگلیسی ارائه شده است. «Distributed System Design» در دستهٔ بدون دسته‌بندی قرار دارد.

This reference outlines the main motivations for building a distributed system, including inherently distributed applications, performance / cost, resource sharing, flexibility and extendibility, availability and fault tolerance, and scalability. Presenting basic concepts, problems, and possible solutions, Distributed System Design serves graduate students in distributed system design as well as computer professionals analyzing and designing distributed / open / parallel systems. Distributed System Design......Page 1 Table of Contents......Page 2 Preface......Page 9 Acknowledgments......Page 13 1.1 Motivation......Page 15 1.2 Basic Computer Organizations......Page 18 1.3 Definition of a Distributed System......Page 20 1.4 Our Model......Page 26 1.5 Interconnection Networks......Page 28 1.6 Applications and Standards......Page 34 1.7 Scope......Page 36 1.8 Source of References......Page 39 2.1 Requirement for Distributed Programming Support......Page 45 2.2 An Overview of Parallel/Distributed Programming Language......Page 46 2.3 Expressing Parallelism......Page 48 2.4 Process Communication and Synchronization......Page 60 2.5 Remote Procedure Calls......Page 70 2.6 Robustness......Page 73 3.1 Introduction to Models......Page 83 3.1.1 A state machine model......Page 84 3.1.2 Petri nets......Page 86 3.2.1 The happened-before relation......Page 93 3.2.2 The time-space view......Page 95 3.2.3 The interleaving view......Page 96 3.3 Global States......Page 97 3.3.2 Global state: a formal definition......Page 98 3.3.3 The “snapshot” of global states......Page 102 3.3.4 A necessary and sufficient condition for a consistent global state......Page 105 3.4 Logical Clocks......Page 106 3.4.1 Scalar logical clock......Page 107 3.4.2 Extensions......Page 109 3.4.3 Efficient implementations......Page 111 3.5.1 A total ordering application: distributed mutual exclusion......Page 113 3.5.2 A logical vector clock application: ordering of messages......Page 114 3.6 Classification of Distributed Control Algorithms......Page 117 3.7 The Complexity of Distributed Algorithms......Page 120 4.1 Mutual Exclusion......Page 124 4.2.1 A simple extension to Lamport’s algorithm......Page 125 4.2.2 Ricart and Agrawala’s first algorithm......Page 128 4.2.3 Maekawa’s algorithm......Page 129 4.3.1 Ricart and Agrawala’s second algorithm......Page 131 4.3.2 A simple token-ring-based algorithm......Page 134 4.3.3 A fault-tolerant token-ring-based algorithm......Page 135 4.3.4 Token-based mutual exclusion using other logical structures......Page 136 4.4 Election......Page 137 4.4.1 Chang and Roberts’ algorithm......Page 139 4.4.2 Non-comparison-based algorithms......Page 143 4.5 Bidding......Page 146 4.6 Self-Stabilization......Page 152 5.1 The Deadlock Problem......Page 163 5.1.2 Graph-theoretic models......Page 164 5.1.3 Strategies for handling deadlocks......Page 166 5.1.6 Deadlock conditions......Page 169 5.2 Deadlock Prevention......Page 171 5.3 A Deadlock Prevention Example: Distributed Database Systems......Page 172 5.4.1 An avoidance approach using Petri nets......Page 176 5.5 A Deadlock Prevention Example: Multi-Robot Flexible Assembly Cells......Page 179 5.6 Deadlock Detection and Recovery......Page 183 5.6.1 Centralized approaches......Page 184 5.6.3 Hierarchical approaches......Page 186 5.7.1 Chandy, Misra, and Hass’s algorithm for the AND model......Page 189 5.7.2 Mitchell and Merritt’s algorithm for the AND model......Page 190 5.7.3 Chandy, Misra, and Hass’ algorithm for the OR model......Page 191 6.1 Introduction......Page 198 6.1.1 Topology......Page 199 6.1.2 Switching......Page 200 6.1.4 Routing......Page 201 6.1.5 Routing functions......Page 202 6.2.1 Dijkstra’s centralized algorithm......Page 204 6.2.2 Ford’s distributed algorithm......Page 205 6.2.3 ARPAnet’s Routing Strategy (APRS)......Page 207 6.3.1 Bidirectional rings......Page 210 6.3.2 Meshes and Tori......Page 211 6.3.3 Hypercubes......Page 215 6.4.1 Rings......Page 217 6.4.2 2-d meshes and tori......Page 220 6.4.3 Hypercubes......Page 221 6.5 Multicasting in Special-Purpose Networks......Page 226 6.5.2 Path-based approaches......Page 227 6.5.3 Tree-based approaches......Page 230 7.1 Virtual Channels and Virtual Networks......Page 237 7.2.1 Virtual channel classes......Page 242 7.2.2 Escape channels......Page 243 7.3.1 Turn model......Page 245 7.3.1.1 Planar-adaptive model......Page 246 7.3.1.2 Other partially adaptive models......Page 248 7.4 Fault-Tolerant Unicasting: General Approaches......Page 251 7.5.1 Local-information-based routing......Page 252 7.5.2 Limited-global-information-based routing......Page 255 7.5.3 Routing based on other fault models......Page 260 7.6 Fault-Tolerant Unicasting in Hypercubes......Page 261 7.6.1 Local-information-based model......Page 262 7.6.2 Limited-global-information-based model: safety level......Page 265 7.6.3 Routing based on the extended safety level model: safety vector......Page 268 7.7.1 General approaches......Page 270 7.7.2 Broadcasting using global information......Page 271 7.7.3 Broadcasting using safety levels......Page 273 7.8.2 Path-based routing......Page 275 7.8.3 Multicasting in hypercubes used safety levels......Page 279 8.1 Basic Models......Page 291 8.2.1 Stable storage......Page 294 8.2.3 Atomic actions......Page 295 8.3.1 Backward recovery......Page 297 8.3.2 Roll-forward recovery......Page 300 8.4.1 Storage of checkpoints......Page 304 8.4.2 Checkpointing methods......Page 305 8.5 Handling of Byzantine Faults......Page 310 8.5.1 Agreement protocols in synchronous systems......Page 311 8.5.2 Agreement with one sender......Page 312 8.5.3 Agreement with many senders......Page 316 8.5.4 Agreement under different models......Page 318 8.6 Handling of Communication Faults......Page 321 8.7 Handling of Software Faults......Page 327 9.1 Classification of Load Distribution......Page 336 9.2 Static Load Distribution......Page 338 9.2.1 Processor interconnections......Page 340 9.2.2 Task partition......Page 341 9.2.3 Task allocation......Page 343 9.3 An Overview of Different Scheduling Models......Page 345 9.4 Task Scheduling Based on Task Precedence Graphs......Page 346 9.5 Case Study: Two Optimal Scheduling Algorithms......Page 350 9.6 Task Scheduling Based on Task Interaction Graphs......Page 353 9.7 Case Study: Domain Partition......Page 358 9.8 Scheduling Using Other Models and Objectives......Page 362 9.8.1 Network flow techniques: task interaction graphs with different processor capacities......Page 363 9.8.2 Rate monotonic priority and deadline driven schedules: periodic tasks with realtime constraints......Page 369 9.8.3 Fault secure schedule through task duplication: tree-structured task precedence graphs......Page 374 9.9 Future Directions......Page 385 10.1 Dynamic Load Distribution......Page 392 10.1.1 Components of dynamic load distribution......Page 393 10.2.1 Static versus dynamic algorithms......Page 396 10.2.2 Various information policies......Page 397 10.2.4 Migration initiation policy......Page 399 10.2.8 Open loop vs. closed loop control......Page 400 10.2.9 Use of hardware vs. software......Page 401 10.3 Migration Policies: Sender-initiated and Receiver-initiated......Page 402 10.4 Parameters Used for Load Balancing......Page 405 10.4.1 System size......Page 406 10.4.6 Overhead costs......Page 407 10.5.1 Code and data files......Page 408 10.5.3 System architecture......Page 409 10.6.2 Nearest neighbor algorithms: diffusion......Page 410 10.6.4 Nearest neighbor algorithms: dimension exchange......Page 411 10.7 Case Study: Load Balancing on Hypercube Multicomputers......Page 415 10.7.1 Dimension-exchange-based load balancing on faulty hypercubes......Page 416 10.8 Future Directions......Page 422 11.1 Basic Concepts......Page 429 11.2 Serializability Theory......Page 430 11.3 Concurrency Control......Page 434 11.3.1 Lock-based concurrency control......Page 436 11.3.2 Timestamp-based concurrency control......Page 437 11.3.3 Optimistic concurrency control......Page 439 11.4.1 Primary site approach......Page 440 11.4.2 Active replicas......Page 441 11.4.3 Voting Protocol......Page 442 11.4.4 Optimistic approaches for network partition: version vectors......Page 445 11.5 Distributed Reliability Protocols......Page 451 12.1 Distributed Operating Systems......Page 467 12.1.2 Eight types of services......Page 470 12.1.3 Microkernel-based systems......Page 471 12.2.2 File sharing semantics......Page 473 12.2.4 Protection......Page 474 12.2.6 Encryption......Page 476 12.2.7 Caching......Page 477 12.3 Distributed Shared Memory......Page 479 12.3.1 Memory coherence problem......Page 481 12.3.2 Stumm and Zhou’s Classification......Page 482 12.3.3 Li and Hudak’s Classification......Page 485 12.4 Distributed Database Systems......Page 487 12.5 Heterogeneous Processing......Page 491 12.6 Future Directions of Distributed Systems......Page 493 Index......Page 501

future Requirements For Computing Speed, System Reliability, And Cost-effectiveness Entail The Development Of Alternative Computers To Replace The Traditional Von Neumann Organization. As Computing Networks Come Into Being, One Of The Latest Dreams Is Now Possible - Distributed Computing.
distributed Computing Brings Transparent Access To As Much Computer Power And Data As The User Needs For Accomplishing Any Given Task - Simultaneously Achieving High Performance And Reliability.
the Subject Of Distributed Computing Is Diverse, And Many Researchers Are Investigating Various Issues Concerning The Structure Of Hardware And The Design Of Distributed Software. Distributed System Design Defines A Distributed System As One That Looks To Its Users Like An Ordinary System, But Runs On A Set Of Autonomous Processing Elements (pes) Where Each Pe Has A Separate Physical Memory Space And The Message Transmission Delay Is Not Negligible. With Close Cooperation Among These Pes, The System Supports An Arbitrary Number Of Processes And Dynamic Extensions.
distributed System Design Outlines The Main Motivations For Building A Distributed System, Including:

inherently Distributed Applications

performance/cost

resource Sharing

flexibility And Extendibility

availability And Fault Tolerance

scalability Presenting Basic Concepts, Problems, And Possible Solutions, This Reference Serves Graduate Students In Distributed System Design As Well As Computer Professionals Analyzing And Designing Distributed/open/parallel Systems.
chapters Discuss:

the Scope Of Distributed Computing Systems

general Distributed Programming Languages And A Csp-like Distributed Control Description Language (dcdl)

expressing Parallelism, Interprocess Communication And Synchronization, And Fault-tolerant Design

two Approaches Describing A Distributed System: The Time-space View And The Interleaving View

mutual Exclusion And Related Issues, Including Election, Bidding, And Self-stabilization

prevention And Detection Of Deadlock

reliability, Safety, And Security As Well As Various Methods Of Handling Node, Communication, Byzantine, And Software Faults

efficient Interprocessor Communication Mechanisms As Well As These Mechanisms Without Specific Constraints, Such As Adaptiveness, Deadlock-freedom, And Fault-tolerance

virtual Channels And Virtual Networks

load Distribution Problems

synchronization Of Access To Shared Data While Supporting A High Degree Of Concurrency

booknews

intended As A Graduate Course In The Design Of Distributed Systems (i.e. Systems That Run On Cooperative Autonomous Processing Elements), This Book Concentrates On Software Elements Of Design That Emphasize Performance, Flexibility, Fault Tolerance, And Scalability. Material Covers Distributed Programming Languages, Problems Of Mutual Exclusion, The Prevention And Detection Of Deadlock, Interprocessor Communication Mechanisms, Reliability Issues, And Load Distribution Problems. Annotation C. By Book News, Inc., Portland, Or.

Presenting basic concepts, problems, and possible solutions, Distributed System Design serves graduate students in distributed system design as well as computer professionals analyzing and designing distributed/open/parallel systems. Chapters discuss-- the scope of distributed computing systems-- general distributed programming languages and a CSP-like distributed control description language (DCDL)-- expressing parallelism, interprocess communication and synchronization, and fault-tolerant design-- two approaches describing a distributed system: the time-space view and the interleaving view-- mutual exclusion and related issues, including election, bidding, and self-stabilization-- prevention and detection of deadlock-- reliability, safety, and security as well as various methods of handling node, communication, Byzantine, and software faults-- efficient interprocessor communication mechanisms as well as these mechanisms without specific constraints, such as adaptiveness, deadlock-freedom, and fault-tolerance-- virtual channels and virtual networks-- load distribution problems-- synchronization of access to shared data while supporting a high degree of concurrency. Future requirements for computing speed, system reliability, and cost-effectiveness entail the development of alternative computers to replace the traditional von Neumann organization. As computing networks come into being, one of the latest dreams is now possible - distributed computing. Distributed computing brings transparent access to as much computer power and data as the user needs for accomplishing any given task - simultaneously achieving high performance and reliability. The subject of distributed computing is diverse, and many researchers are investigating various issues concerning the structure of hardware and the design of distributed software. Distributed System Design defines a distributed system as one that looks to its users like an ordinary system, but runs on a set of autonomous processing elements (PEs) where each PE has a separate physical memory space and the message transmission delay is not negligible. With close cooperation among these PEs, the system supports an arbitrary number of processes and dynamic extensions. Distributed System Design outlines the main motivations for building a distributed system,

دانلود کتاب Distributed System Design