distributed systems

Oct 2, 2020

9 min read

Data-Intensive Applications: Reliability, Scalability and Maintainability

A data-intensive application is one for which raw CPU power is rarely a limiting factor; its bigger concerns are the amount of data, the complexity of the data, and the speed at which the data changes.

May 10, 2020

8 min read

ZooKeeper: Wait-free coordination for Internet-scale systems

ZooKeeper is a service that allows distributed processes to coordinate with each other using a shared name space of data registers. It exposes a wait-free interface and an event-driven mechanism to provide a simple and high-performance kernel for building distributed applications. Originally developed at Yahoo!, it is now under the care of the Apache Software Foundation.

# distributed systems # distributed coordination # apache # yahoo

Apr 23, 2020

8 min read

Epidemic Algorithms for Replicated Database Maintenance

This post is a summary of the paper: Epidemic algorithms for replicated database maintenance, Demers et al., 1988. Published in the late 1980s, this paper lays out early ideas on gossip-based replication algorithms and focuses on eventual consistency, in contrast to the traditional ACID model. The idea of eventual consistency, or BASE, is well known these days; in the late 1980s, however, it was one of the most important new ideas.

# distributed systems # gossip protocol # epidemic protocol # replication

Feb 23, 2020

6 min read

Web Scale Responsive Visual Search at Bing

This post is a summary of the paper: Web-Scale Responsive Visual Search at Bing, Hu et al., 2018. Visual search is an interesting research area. A visual search system returns a ranked list of visually similar images when presented with a query image. As in all search systems, the latency and relevance of the returned results are key metrics for evaluating such a system.

# microsoft # search engine # distributed systems # visual search

Dec 15, 2019

5 min read

Peloton: Resource Scheduling at Uber

Peloton is Uber’s cluster scheduler, capable of co-scheduling mixed types of workloads such as batch, stateless, and stateful jobs in a single cluster for better resource utilization. Designed to scale to millions of containers and tens of thousands of nodes, it features advanced resource management capabilities such as elastic resource sharing, hierarchical max-min fairness, and workload preemption, to name a few.

# uber # distributed systems # cluster management

Feb 6, 2019

7 min read

The Anatomy of a Large-Scale Hypertextual Web Search Engine

This is a well-known paper. It was published in 1998 and it describes Google as a prototype of a large-scale search engine. According to the authors, this paper was the first in-depth public description of a large-scale web search engine. The authors describe how to build a practical large-scale search system that exploits information present in hypertext, thus producing better search results than existing search systems.

# google # search engine # distributed systems