What is the relation between sharding and distributed systems?

Sharding is horizontal (row wise) database partitioning as opposed to vertical (column wise) partitioning which is Normalization. It separates very large databases into smaller, faster and more easily managed parts called data shards. It is a mechanism to achieve distributed systems.

Why do we need distributed systems ?
  • Increased availablity.
  • Easier expansion.
  • Economics: It costs less to create a network of smaller computers than using a single large computer of the same power.

You can read more here – Advantages of Distributed Database

How sharding help achieve distributed system?

You can partition a search index into N partitions and load each index on a separate server. If you query one server, you will get 1/Nth of the results. So to get the complete result set, a typical distributed search system uses an aggregator that will accumulate results from each server and combines them. An aggregator also distributes each query onto each server.

A visual representation below.

Leave a Reply

Your email address will not be published. Required fields are marked *