
Paper Index

Contents

For the latest list, see the GitHub repository: https://github.com/yygcode/papers

1 Distributed System

1.1 Google

1.1.1 The Google File System (SOSP03)

EN, CN, PDF, Reading Notes.

GFS is one of the best-known papers in distributed storage. HDFS and Alibaba Cloud's Pangu storage system were both implemented with this paper as a reference. Its core ideas (a client-side sketch follows the list) are:

  • A Paxos/Raft consensus group provides a highly available Master that manages the file-system metadata, with all metadata kept resident in memory;
  • ChunkServers provide the single-machine storage engine; writes are append-only, and I/O data does not pass through the Master;
  • An SDK exposes a POSIX-like file-system interface and encapsulates the Master/ChunkServer interaction.
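
To make the division of labor concrete, here is a minimal Go sketch of the client read path. The interfaces (Master, ChunkServer, ChunkHandle, etc.) are invented for illustration and are not the paper's actual RPC schema: the client contacts the Master only for chunk locations, then reads the bytes directly from a replica, so bulk data never flows through the Master.

package gfsclient

import "errors"

// Illustrative types only; the real GFS RPC schema is different.
type ChunkHandle uint64

type ChunkLocation struct {
    Handle  ChunkHandle
    Servers []string // addresses of chunkservers holding replicas
}

// Master owns only metadata (all resident in memory); it never serves file data.
type Master interface {
    LookupChunk(path string, chunkIndex int) (ChunkLocation, error)
}

// ChunkServer serves the actual chunk bytes.
type ChunkServer interface {
    ReadChunk(handle ChunkHandle, offset, length int64) ([]byte, error)
}

// Client is the SDK: it hides the Master/ChunkServer interaction behind a
// file-like Read call.
type Client struct {
    master    Master
    dial      func(addr string) (ChunkServer, error)
    chunkSize int64 // 64 MiB in the paper
}

// Read returns `length` bytes at `offset`, assumed here to lie within one chunk.
func (c *Client) Read(path string, offset, length int64) ([]byte, error) {
    // 1. Metadata-only round trip to the Master: which chunk, which replicas?
    loc, err := c.master.LookupChunk(path, int(offset/c.chunkSize))
    if err != nil {
        return nil, err
    }
    // 2. Data round trip goes straight to a replica; the Master is not involved.
    for _, addr := range loc.Servers {
        cs, err := c.dial(addr)
        if err != nil {
            continue
        }
        if data, err := cs.ReadChunk(loc.Handle, offset%c.chunkSize, length); err == nil {
            return data, nil
        }
    }
    return nil, errors.New("all replicas failed")
}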

GFS has a Master hotspot: a cluster sustains only tens of thousands of metadata QPS, so it is not suitable for storing massive numbers of small files. Such workloads should be served by a file system built on top of GFS, with GFS used for bootstrapping.

To support very large clusters, GFS can run multiple Master groups within a single cluster, an approach known as federation. HDFS documents this design: https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/Federation.html

Inside Google, GFS evolved to distributed metadata to support even larger clusters, under the internal codename Colossus, but no dedicated paper has been published. This link can be found online: https://www.systutorials.com/colossus-successor-to-google-file-system-gfs/ Excerpts:

  • "We also ended up doing what we call a "multi-cell" approach, which basically

made it possible to put multiple GFS masters on top of a pool of chunkservers."

  • "We also have something we called Name Spaces, which are just a very static

way of partitioning a namespace that people can use to hide all of this from the actual application." … "a namespace file describes"

  • "The distributed master certainly allows you to grow file counts, in line

with the number of machines you’re willing to throw at it." … "Our distribute master system that will provide for 1-MB files is essentially a whole new design. That way, we can aim for something on the order of 100 million files per master. You can also have hundreds of masters."

  • BigTable "as one of the major adaptations made along the way to help keep

GFS viable in the face of rapid and widespread change."

1.1.2 MapReduce: Simplified Data Processing on Large Clusters

PDF.

1.1.3 Bigtable: A Distributed Storage System for Structured Data

PDF, Reading Notes. Leader election built on Chubby, the three-level tablet location hierarchy, the client cache update mechanism, and the LSM storage engine all offer valuable lessons for implementing storage systems.
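
A small Go sketch of the three-level tablet location lookup, under the assumption of invented types (locator, TabletAddr, client are not Bigtable's real client API): the client resolves Chubby file → root tablet → METADATA tablet → user tablet, and caches the result so the common case needs no lookup at all.

package bigtable

// Illustrative only; the real Bigtable client reads the METADATA table rather
// than using these interfaces.

type TabletAddr string

type locator interface {
    // RootTabletAddr reads the Chubby file that names the root tablet's server.
    RootTabletAddr() (TabletAddr, error)
    // LookupIn asks the tablet at addr which tablet serves (table, rowKey).
    LookupIn(addr TabletAddr, table, rowKey string) (TabletAddr, error)
}

type client struct {
    l     locator
    cache map[string]TabletAddr // cached locations, refreshed on miss
}

// Locate walks Chubby -> root tablet -> METADATA tablet -> user tablet.
func (c *client) Locate(table, rowKey string) (TabletAddr, error) {
    key := table + "/" + rowKey
    if addr, ok := c.cache[key]; ok {
        return addr, nil // common case: served from cache, no network round trips
    }
    root, err := c.l.RootTabletAddr() // level 0: Chubby file
    if err != nil {
        return "", err
    }
    meta, err := c.l.LookupIn(root, "METADATA", key) // level 1: root tablet
    if err != nil {
        return "", err
    }
    user, err := c.l.LookupIn(meta, table, rowKey) // level 2: METADATA tablet
    if err != nil {
        return "", err
    }
    c.cache[key] = user // stale entries are detected on request failure and re-resolved
    return user, nil
}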

1.1.4 The Chubby lock service for loosely-coupled distributed systems

PDF.

1.1.5 Dapper, a Large-Scale Distributed Systems Tracing Infrastructure

PDF.

1.1.6 Percolator: Large-scale Incremental Processing Using Distributed Transactions and Notifications

PDF.

1.1.7 Megastore: Providing Scalable, Highly Available Storage for Interactive Services

PDF.

1.1.8 Spanner: Google's Globally-Distributed Database

PDF.

1.1.9 F1: A Distributed SQL Database That Scales

PDF.

1.1.10 Goods: Organizing Google's Datasets

PDF.

1.1.11 Colossus: Next generation of GFS

No paper has been published; the material below is collected from various sources.

Google File System II: Dawn of the Multiplying Master Node
https://www.theregister.com/2009/08/12/google_file_system_part_deux/

1.2 Microsoft

1.2.1 Windows Azure Storage

Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency. PDF.

The Windows Azure Storage paper describes Microsoft's cloud storage system, which provides blob (file), table, and queue storage products. Blob is analogous to Alibaba Cloud OSS, Table to Alibaba Cloud OTS/TableStore, and Queue to Alibaba Cloud MessageQueue.

WAS exposes a URL-based access interface:

http(s)://AccountName.<service>.core.windows.net/PartitionName/ObjectName
<service> is blob, table or queue.

The top-level architecture of WAS:

[Figure: WAS top-level architecture (ds-microsoft-was.png)]

  • Stream Layer: the distributed file system; the SM (StreamManager) maintains metadata and achieves high availability via Paxos; ENs (ExtentNodes) serve append-only data; comparable to GFS/HDFS/PANGU;
  • Partition Layer: LSM-tree structure; a scalable object namespace; provides transaction ordering and strong consistency for objects; persists data to the Stream Layer; caches hot data;
  • Front-End (FE) Layer: authenticates requests (auth) and routes them to PartitionServers; the FE caches the PartitionMap, sends large-object data directly to the Stream Layer, and caches hot data.

The Stream Layer performs synchronous (intra-stamp) replication, while the Partition Layer performs asynchronous (inter-stamp) replication.
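
To tie the three layers together, here is a rough Go sketch of a write path under invented interfaces (StreamLayer, PartitionLayer, frontEnd are illustrative, not the WAS API): the FE authenticates and routes by PartitionName, and the partition server appends the mutation to a commit-log stream, which the Stream Layer replicates synchronously, before applying it to its LSM structure.

package was

import "errors"

// StreamLayer is the append-only distributed file system. An append is
// acknowledged only after every replica (extent node) has written it, which is
// the synchronous intra-stamp replication mentioned above.
type StreamLayer interface {
    Append(stream string, data []byte) (offset int64, err error)
}

// PartitionLayer stores objects in an LSM structure backed by the stream layer.
type PartitionLayer interface {
    Put(partitionName, objectName string, value []byte) error
}

type partitionServer struct {
    streams StreamLayer
    logName string // commit-log stream for this partition range
}

func (p *partitionServer) Put(partitionName, objectName string, value []byte) error {
    // Durability first: append the mutation to the commit log; the stream layer
    // replicates it synchronously before returning.
    record := append([]byte(partitionName+"/"+objectName+"="), value...)
    if _, err := p.streams.Append(p.logName, record); err != nil {
        return err
    }
    // Then apply to the in-memory table / LSM structure (omitted here).
    return nil
}

// frontEnd authenticates a request and routes it to the partition server
// owning PartitionName, using a cached partition map.
type frontEnd struct {
    partitionMap map[string]PartitionLayer
    authorized   func(account string) bool
}

func (f *frontEnd) HandlePut(account, partitionName, objectName string, value []byte) error {
    if !f.authorized(account) {
        return errors.New("authentication failed")
    }
    ps, ok := f.partitionMap[partitionName]
    if !ok {
        return errors.New("unknown partition") // a real FE would refresh its partition map
    }
    return ps.Put(partitionName, objectName, value)
}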

1.2.2 Pelican: A building block for exascale cold data storage

Pelican: A building block for exascale cold data storage.

1.3 Tencent

1.3.1 PaxosStore: High-availability Storage Made Practical in WeChat

1.4 Ceph

1.4.1 Ceph: A Scalable, High-Performance Distributed File System

PDF.

1.4.2 CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data

PDF.

1.4.3 File Systems Unfit as Distributed Storage Backends: Lessons from 10 Years of Ceph Evolution

PDF.

1.5 HDFS

1.5.1 The Hadoop Distributed File System

PDF.

1.5.2 HDFS: Balancing Portability and Performance

PDF.

1.6 Consensus Algorithms

1.6.1 Lamport The Part-Time Parliament

PDF.

1.6.2 Lamport The Byzantine Generals Problem

PDF.

1.6.3 Lampson How to Build a Highly Available System Using Consensus

PDF.

1.6.4 Revisiting the Paxos Algorithm

PDF.

1.6.5 Paxos Made Simple

PDF.

1.6.6 Cheap Paxos

PDF.

1.6.7 Fast Paxos

PDF.

1.6.8 Paxos Made Live - An Engineering Perspective

PDF.

1.6.9 Raft - In Search of an Understandable Consensus Algorithm

PDF.

1.6.10 Consensus: Bridging theory and practice

PDF.

1.6.11 Viewstamped Replication

PDF.

1.7 Transactions

1.7.1 Two-Phase Commit

1.7.2 Nonblocking Commit Protocols

PDF.

1.7.3 Consensus on Transaction Commit

PDF.

1.7.4 Revisiting the relationship between non-blocking atomic commitment and consensus

1.8 Distributed Systems Fundamentals

1.8.1 Dijkstra Solution of a Problem in Concurrent Programming Control

PDF.

1.8.2 Dijkstra Self-stabilizing Systems in Spite of Distributed Control

PDF.

1.8.3 Jim Gray Why Do Computers Stop and What Can Be Done About It?

PDF.

1.8.4 A New Solution of Dijkstra's Concurrent Programming Problem

PDF.

1.8.5 Lamport Time, Clocks, and the Ordering of Events in a Distributed System

PDF.

1.8.6 Distributed Snapshots - Determining Global States of a Distributed System

PDF.

1.8.7 Virtual Time and Global States of Distributed Systems

PDF.

1.8.8 Impossibility of Distributed Consensus with One Faulty Process

PDF.

Lamport is the winner of the 2013 Turing Award; the link is here.

2 Unsorted

3 Reference Websites