Papers
Time, Agreement and Causality
- Time, clocks, and the ordering of events in a distributed system,
Leslie Lamport, Communications of the ACM, 21(7):558-565, July 1978.
- Exploiting virtual synchrony in distributed systems, Kenneth P. Birman and Thomas A. Joseph,
In Proceedings of the 11th ACM Symposium on Operating Systems
Principles, pages 123-138, November 1987.
- Virtual Time and Global States in Distributed Systems, F. Mattern, Proc.
Workshop on Parallel and Distributed Algorithms, pp. 214-216, 1989.
Replication
- Providing Availability Using Lazy Replication, R. Ladin, B. Liskov, L. Shrira and S. Ghemawat, ACM Transactions on Computer Systems, Vol. 10, No. 4, pp. 360-391.
- The Bayou Architecture: Support for Data Sharing among Mobile Users, D. Terry, M. Theimer, K. Petersen, A. Demers, M. Spreitzer and C. Hauser, Proceedings of the 15th ACM Symposium on Operating Systems Principles, pp. 172-183.
- Managing Update Conflicts in Bayou, a Weakly Connected Replicated Storage System, D. Terry, M. Theimer, K. Petersen, A. Demers, M. Spreitzer and C. Hauser, Proceedings of the 15th ACM Symposium on Operating Systems Principles, pp. 172-183.
Distributed Hash Tables
- OpenDHT: A Public DHT Service and Its Uses,
Sean Rhea et al, SIGCOMM 2005.
- Chord: A Scalable Peer-to-peer Lookup Service for
Internet Applications, Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari
Balakrishnan, ACM SIGCOMM 2001, San Diego, CA, August 2001.
Storage and Consistency
- Google File System, Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, SOSP'03.
- Bigtable: A Distributed Storage System for Structured Data,
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A.
Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E.
Gruber, Proceedings of OSDI 2006, Seattle, WA, 2006.
Data Center and Cloud Infrastructures
- Lessons from Giant-Scale Services, Eric Brewer, IEEE Internet Computing '01.
- The Data Center as a Computer: An Introduction to the Design of Warehouse-Scale Machines, L. Barroso and U. Hölzle, 2009.
- Google Cluster Architecture, L. Barroso, J. Dean and U. Hölzle, IEEE Micro, March-April 2003, pp. 22-29.
- Above the Clouds: A Berkeley View of Cloud Computing, M. Armbrust, A. Fox, R. Griffith, A. Joseph, R. Katz, A. Konwinski,
G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia. Technical Report No. UCB/EECS-2009-28.
Google
- MapReduce: Simplified Data Processing on Large Clusters,
Jeffrey Dean et al, OSDI '04.
- Improving MapReduce Performance in Heterogeneous Environments,
Zaharia et al, OSDI 2008
- Google File System, Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, SOSP'03.
- The Chubby Lock Service for Loosely-Coupled Distributed Systems, M. Burrows, OSDI'06
- Bigtable: A Distributed Storage System for Structured Data,
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A.
Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E.
Gruber, Proceedings of OSDI 2006, Seattle, WA, 2006.
- Building a terabyte-scale data cycle at LinkedIn with Hadoop and Project Voldemort, Project Voldemort Blog
Amazon
Yahoo
- Data challenges at Yahoo!, R. Baeza-Yates and R. Ramakrishnan, EDBT 08
- Pig Latin: A Not-So-Foreign Language
for Data Processing, C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins, ACM SIGMOD 2008.
- PNUTS: Yahoo!'s Hosted Data Serving Platform, Brian F. Cooper et al, VLDB 08 (Yahoo)
- Chukwa: A large-scale monitoring system, J. Boulon et al, CCA 2008
Microsoft
- Dryad: Distributed Data-Parallel
Programs from Sequential Building Blocks, M. Isard, M. Budiu, Y. Yu, A. Birrell, and D. Fetterly,
ACM EuroSys 2007.
- DryadLINQ: A System for
General-Purpose Distributed Data-Parallel Computing using a
High-Level Language, Y. Yu, M. Isard, D. Fetterly, M. Budiu, U. Erlingsson, P. K. Gunda,
and J. Currey, USENIX OSDI 2008.
Virtual Machines and Data Centers
- Virtual Machine Monitors: Current Technology and Future Trends,
M. Rosenblum, T. Garfinkel, IEEE Computer, 38(5), May 2005.
- The Architecture of Virtual Machines, James Smith and Ravi Nair, IEEE Computer, 38(5), May 2005.
- Xen and the Art of Virtualization, P. Barham et al, SOSP'03.
- VirtualPower: Coordinated Power Management in Virtualized Enterprise Systems. Ripal Nathuji and Karsten Schwan, SOSP 2007 (Paper Review).
- SLA Decomposition: Translating Service Level Objectives to System Level Thresholds. Yuan Chen, Subu Iyer, Xue Liu, Dejan Milojicic and Akhil Sahai, ICAC 2007.
- On the Use of Fuzzy Modeling in Virtualized Data Center Management. Jing Xu, Ming Zhao, José Fortes, Robert Carpentier and Mazin Yousif, ICAC 2007.
Green Computing
- Server Workload Analysis for Power Minimization using Consolidation, Akshat Verma, Gargi Dasgupta, Tapan Kumar Nayak, Pradipta De and Ravi Kothari, USENIX Annual Technical Conference 2009
- Somniloquy: Augmenting Network Interfaces to Reduce PC Energy Usage, Yuvraj Agarwal, Steve Hodges, Ranveer Chandra, James Scott, Paramvir Bahl and Rajesh Gupta, NSDI 2009
- Reducing Network Energy Consumption via Sleeping and Rate-Adaptation, Sergiu Nedevschi, Lucian Popa,
Gianluca Iannaccone, Sylvia Ratnasamy, David Wetherall, NSDI’08
- Energy-Aware Server Provisioning and Load Dispatching for Connection-Intensive Internet Services,
Gong Chen, Wenbo He, Jie Liu, Suman Nath, Leonidas Rigas, Lin Xiao and Feng Zhao, NSDI’08.
- Statistical Profiling-based Techniques for Effective Power Provisioning in Data Centers, S. Govindan, J. Choi, B. Urgaonkar, A. Sivasubramaniam and A. Baldini, EuroSys 2009.
- GreenFS: Making Enterprise Computers Greener by Protecting Them Better, N. Joukov and J. Sipek, EuroSys 2008.
- Power Provisioning for a Warehouse-sized Computer, Xiaobo Fan et al, ISCA’07
- No "Power" Struggles:
Coordinated Multi-level Power Management for the Data Center, Ramya Raghavendra, Parthasarathy Ranganathan, Vanish Talwar, Zhikui
Wang, Xiaoyun Zhu, ASPLOS’08
- PICSEL: Measuring User-Perceived Performance to Control Dynamic Frequency Scaling,
Arindam Mallik, Jack Cosgrove, Gokhan Memik, Robert P. Dick, Peter Dinda, ASPLOS’08
- VPM Tokens: Virtual Machine-Aware Power Budgeting in Datacenters, Ripal Nathuji, Karsten Schwan, HPDC 08.
Fault Management
- FUSE: Lightweight Guaranteed Distributed Failure Notification, John Dunagan, Nicholas J. A. Harvey, Michael B. Jones, Dejan Kostic, Marvin Theimer and Alec Wolman, OSDI 2004.
- A Fault Detection Service for Wide Area Distributed Computations, P. Stelling, C. Lee, I. Foster, G. von Laszewski and C. Kesselman, HPDC 1998.
- Exploiting Availability Prediction in Distributed Systems. James W. Mickens and Brian D. Noble, NSDI 2006.
- A comparative study of pairwise regression techniques for problem determination. Mohammad Ahmad Munawar and Paul A. S. Ward, CASCON 2007.
Trust
- Result Verification and Trust-based Scheduling in Open Peer-to-Peer Cycle Sharing Systems,
Shanyu Zhao and Virginia Lo, IEEE Fifth International Conference on Peer-to-Peer Systems, 2005.
- Reputation-Based Scheduling on Unreliable Distributed
Infrastructures, Jason Sonnek, Mukesh Nathan, Abhishek Chandra,
and Jon Weissman, Proceedings of the 26th IEEE International
Conference on Distributed Computing Systems, 2006.
- The Eigentrust algorithm for reputation management in P2P networks,
Sepandar D. Kamvar, Mario T. Schlosser, and Hector Garcia-Molina,
Proceedings of the 12th international conference on World Wide Web, 2003.
- Protection and Communication Abstractions
for Web Browsers in MashupOS, Helen J. Wang, et al,
21st ACM Symposium on Operating Systems Principles, Stevenson, WA,
October 2007.