Data is at the center of many challenges in system design today. Difficult issues need to be figured out, such as scalability, consistency, reliability, efficiency, and maintainability. In addition, we have an overwhelming variety of tools, including relational databases, NoSQL datastores, stream or batch processors, and message brokers. What are the right choices for your application? How do you make sense of all these buzzwords?
Designing Data-Intensive Applications is a handbook for modern system design interviews. It is easy to buy, reasonable to read, and hard to memorize. The challenge is to wire the essential information into your brain. To master a piece of information, you need to retrieve it from memory several times and process it through different channels: visual, auditory, and mechanical (typing on a keyboard).
MongoDB University provides plenty of courses, based on MongoDB experience, that help you understand the core concepts of modern databases and data-intensive applications.
Part II of the book, Distributed Data (Replication, Partitioning), matches well with the M103: Basic Cluster Administration course from MongoDB University, which offers well-illustrated lectures and hands-on labs on replication topics such as:
- leader-based replication
- asynchronous replication
- replication log
and on partitioning (sharding) topics such as:
- sharding architecture (request routing)
- shard keys (sharding by key range vs hashed shard keys)
- rebalancing shards
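
To make the difference between sharding by key range and hashed shard keys concrete, here is a minimal Python sketch. It is not taken from the book or from the M103 course, and it is not how MongoDB implements routing. The split points, the shard count, and the function names `range_shard` and `hashed_shard` are all assumptions chosen for illustration.

```python
import bisect
import hashlib

# Hypothetical cluster layout, chosen only for this illustration.
NUM_SHARDS = 3
# Split points for range sharding:
#   shard 0: key < "h", shard 1: "h" <= key < "p", shard 2: key >= "p"
RANGE_BOUNDARIES = ["h", "p"]
HASH_SPACE = 2 ** 64


def range_shard(key: str) -> int:
    """Route by key range: keys that sort close together share a shard."""
    return bisect.bisect_right(RANGE_BOUNDARIES, key)


def hashed_shard(key: str) -> int:
    """Route by hash of the key: hash first, then split the hash space
    into NUM_SHARDS equal ranges. Adjacent keys usually end up apart."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest[:8], "big")  # 0 <= value < 2**64
    return hash_value * NUM_SHARDS // HASH_SPACE


if __name__ == "__main__":
    for key in ["alice", "bob", "henry", "oscar", "zoe"]:
        print(key, "range ->", range_shard(key), "hashed ->", hashed_shard(key))
```

With range sharding, a scan over neighboring keys stays on one shard, but sequential keys can turn that shard into a hot spot. Hashing spreads the load more evenly at the cost of efficient range queries, which is the trade-off behind the choice of shard key.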