Course Objectives: |
The objective of the course is to provide students with a comprehensive understanding of distributed systems, including their architecture, middleware, system-level support, and design considerations for distributed algorithms. Students will gain in-depth knowledge of the design principles, challenges, and emerging paradigms in building and managing distributed computing systems. Through practical exercises and projects, students will explore the concepts, algorithms, and technologies used in distributed systems, and will learn to design and implement well-known distributed systems and programming models such as the Google File System and MapReduce. |
Course Content: |
This course covers distributed systems: systems in which data and processes are spread over a network of machines that communicate by message passing, yet appear to their users as a single computer. Themes include process distribution, data distribution, concurrency, resource sharing, synchronization, and more. Designing, implementing, and debugging large programming projects is a core part of the course. |
Week |
Subject |
Related Preparation |
1) |
Distributed programming enables developers to use multiple nodes in a data center to increase throughput and/or reduce latency of selected applications. |
|
2) |
The way that distributed system components (clients, servers, etc.) are arranged, and the interactions between them, is called architecture. In this course you will study the ways these architectures are represented, both in UML and other visual tools. We will introduce the most common architectures, their qualities, and tradeoffs. We will talk about how architectures are evaluated, what makes a good architecture, and how an architecture can be improved. We'll also talk about how the architecture touches on the process of software development. |
|
3) |
Web services - lecture notes
Clock synchronization
Precision Time Protocol
Logical clocks
Clocks – terms |
|
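As an illustration of the logical clocks topic listed for this week, the following is a minimal sketch of a Lamport clock. The class and method names are hypothetical and chosen only for this example; the two processes are modeled as plain objects in one program rather than as separate networked nodes.

```python
class LamportClock:
    """Minimal Lamport logical clock (illustrative sketch, names are hypothetical)."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # Rule 1: increment the counter before every local event.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the returned timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # Rule 2: on receipt, jump past both the local time and the sender's timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time


# Two simulated processes, a and b.
a, b = LamportClock(), LamportClock()
a.tick()            # local event on a
stamp = a.send()    # a sends a message to b
b.receive(stamp)    # b's clock now exceeds the send timestamp
```

The receive rule guarantees that the receive event is timestamped later than the send event, which is the ordering property logical clocks are meant to provide.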
3) |
In this module, students will learn how to write distributed applications in the Single Program Multiple Data (SPMD) model.
We will also learn about the message ordering and deadlock properties of point-to-point communication. Non-blocking communications are an interesting extension of point-to-point communications, since they can be used to avoid both delays due to blocking and deadlock-related errors. Finally, we will study collective communication, which can involve multiple processes in a manner that is more powerful than multicast and publish-subscribe operations. The knowledge gained in this module will be put into practice in the associated mini-project, which implements a distributed matrix multiplication program. |
|
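The deadlock-avoidance idea behind non-blocking communication can be sketched without a message-passing library. The example below is a simulation under stated assumptions: `Channel` is a hypothetical rendezvous channel whose `send` returns only once the peer has received, and the two "processes" are threads in one Python program. If each side called the blocking `send` before `recv`, both would wait forever; posting the send in the background and receiving first lets the exchange complete.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


class Channel:
    """Hypothetical one-way rendezvous channel: send() blocks until the peer receives."""

    def __init__(self):
        self._q = queue.Queue()

    def send(self, msg):
        self._q.put(msg)
        self._q.join()        # wait until recv() has marked the message as taken

    def recv(self):
        msg = self._q.get()
        self._q.task_done()   # releases the blocked sender
        return msg


def exchange(rank, outbox, inbox, pool, results):
    # Post the send without waiting for it (the non-blocking step) ...
    pending = pool.submit(outbox.send, f"hello from {rank}")
    # ... receive the peer's message ...
    results[rank] = inbox.recv()
    # ... and only then wait for our own send to complete.
    pending.result()


a_to_b, b_to_a = Channel(), Channel()
results = {}
with ThreadPoolExecutor(max_workers=2) as pool:
    t0 = threading.Thread(target=exchange, args=(0, a_to_b, b_to_a, pool, results))
    t1 = threading.Thread(target=exchange, args=(1, b_to_a, a_to_b, pool, results))
    t0.start()
    t1.start()
    t0.join()
    t1.join()
```

The same send-early, wait-late pattern is what non-blocking point-to-point operations offer in a real message-passing library; the thread pool here only stands in for that mechanism.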
4) |
We argue that objects that interact in a distributed system need to be dealt with in ways that are intrinsically different from objects that interact in a single address space. These differences are required because distributed systems require that the programmer be aware of latency, have a different model of memory access, and take into account issues of concurrency and partial failure. We look at a number of distributed systems that have attempted to paper over the distinction between local and remote objects, and show that such systems fail to support basic requirements of robustness and reliability. These failures have been masked in the past by the small size of the distributed systems that have been built. In the enterprise-wide distributed systems foreseen in the near future, however, such a masking will be impossible. We conclude by discussing what is required of both systems-level and application-level programmers and designers if one is to take distribution seriously. |
|
5) |
Chain replication is an approach to coordinating clusters of fail-stop storage servers. The approach is intended for supporting large-scale storage services that exhibit high throughput and availability without sacrificing strong consistency guarantees. Besides outlining the chain replication protocols themselves, this unit covers simulation experiments that explore the performance characteristics of a prototype implementation. Throughput, availability, and several object-placement strategies (including schemes based on distributed hash table routing) are discussed. |
|
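The core data flow of chain replication can be sketched in a few lines. This is a simplified, single-process model under stated assumptions: the `Chain` class and its methods are hypothetical names, replicas are in-memory dictionaries, and there are no in-flight updates, so the recovery steps a real protocol needs after a failure are omitted.

```python
class Chain:
    """Toy model of a replication chain: head first, tail last (illustrative only)."""

    def __init__(self, length):
        self.replicas = [dict() for _ in range(length)]

    def write(self, key, value):
        # Updates enter at the head and are forwarded down the chain in order.
        for store in self.replicas:
            store[key] = value
        # The acknowledgement is produced only after the tail has applied the update.
        return "ack"

    def read(self, key):
        # Queries are answered by the tail, which holds only fully replicated updates.
        return self.replicas[-1].get(key)

    def fail(self, index):
        # Fail-stop model: a crashed server is simply removed from the chain.
        del self.replicas[index]


chain = Chain(3)
reply = chain.write("x", 1)
chain.fail(2)                 # the tail crashes; its predecessor becomes the new tail
value_after_failure = chain.read("x")
```

Because every acknowledged write has reached all replicas, the new tail can serve the same value after the old tail fails, which is the strong-consistency property the approach aims for.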
6) |
Consensus
Paxos
Raft |
|
7) |
|
|
8) |
|
|
9) |
Hadoop & Spark
Large-Scale Data Processing MapReduce
Traditional programming is serial
Parallel programming |
|
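To make the MapReduce topic concrete, here is a minimal single-machine sketch of the map, shuffle, and reduce phases applied to word counting. The function names and sample documents are assumptions made for this example; frameworks such as Hadoop run the same phases across many nodes.

```python
from collections import defaultdict


def map_phase(document):
    # Emit a (word, 1) pair for every word in one input document.
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    # Group all emitted values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Combine all values for one key into a single result.
    return key, sum(values)


documents = ["the quick fox", "the lazy dog", "the fox"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

Each call to `map_phase` and `reduce_phase` depends only on its own input, which is what allows a real framework to run them in parallel on different machines.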
10) |
Google Cluster Architecture
HA Clusters
HPC Clusters |
|
12) |
Naming
Distributed lookup services
Amazon Dynamo
DNS - slides |
|
13) |
Cryptography
Integrity
Authentication |
|
Course Notes / Textbooks: |
1. Ajay D. Kshemkalyani and Mukesh Singhal, Distributed Computing: Principles, Algorithms and Systems, Cambridge University Press, 2011
2. George Coulouris, Jean Dollimore, Tim Kindberg and Gordon Blair, Distributed Systems: Concepts and Design, Fifth Edition, Pearson Education, 2017 |
References: |
George Coulouris, Jean Dollimore, Tim Kindberg and Gordon Blair, Distributed Systems: Concepts and Design, Fifth Edition, Pearson Education, 2017 |
|
Program Outcomes |
Level of Contribution |
1) |
To be able to develop and deepen their knowledge at the level of expertise in the same or a different field, based on undergraduate level qualifications. |
2 |
2) |
To be able to use the theoretical and applied knowledge at the level of expertise acquired in the field. |
|
3) |
To be able to interpret and create new knowledge by integrating the knowledge gained in the field with the knowledge from different disciplines. |
2 |
4) |
To be able to solve the problems encountered in the field by using research methods. |
|
5) |
To be able to systematically transfer current developments in the field and their own studies to groups in and outside the field, in written, verbal and visual forms, by supporting them with quantitative and qualitative data. |
|
6) |
To be able to communicate orally and in writing using a foreign language at least at the B2 General Level of the European Language Portfolio. |
|
7) |
To be able to critically evaluate the knowledge and skills acquired in the field of expertise and to direct their learning. |
|
8) |
To be able to use information and communication technologies at an advanced level along with computer software at the level required by the field. |
|
9) |
To be able to observe social, scientific, cultural and ethical values when collecting, interpreting, applying and announcing data related to the field, and to supervise and teach these values. |
1 |
10) |
To be able to use the knowledge, problem solving and/or application skills they have internalized in their field in interdisciplinary studies. |
|
11) |
To be able to independently carry out work that requires expertise in the field. |
|
12) |
To be able to develop new strategic approaches for the solution of complex and unpredictable problems encountered in applications related to the field and to produce solutions by taking responsibility. |
|