CSI 5311 -- READING LIST

Peer-to-peer data management:

Alon Y. Halevy, Zachary G. Ives, Dan Suciu, Igor Tatarinov: Schema mediation for large-scale semantic data sharing. VLDB J. 14(1): 68-83 (2005)
Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, Aoying Zhou: PeerDB: A P2P-based System for Distributed Data Sharing. ICDE 2003: 633-644
Reza Akbarinia, Vidal Martins: Data Management in the APPA System. J. Grid Comput. 5(3): 303-317 (2007)
Reza Akbarinia, Esther Pacitti, Patrick Valduriez: Reducing network traffic in unstructured P2P systems using Top-k queries. Distributed and Parallel Databases 19(2-3): 67-86 (2006)
Mehedi Masud, Iluju Kiringa, Anastasios Kementsietsidis: Don't Mind Your Vocabulary: Data Sharing Across Heterogeneous Peers. OTM Conferences (1) 2005: 292-309
Patricia Rodríguez-Gianolli, Maddalena Garzetti, Lei Jiang, Anastasios Kementsietsidis, Iluju Kiringa, Mehedi Masud, Renée J. Miller, John Mylopoulos: Data Sharing in the Hyperion Peer Database System. VLDB 2005: 1291-1294

Data stream management:

Sudipto Guha, Andrew McGregor: Approximate quantiles and the order of the stream. PODS 2006: 273-279
Peter A. Tucker, David Maier, Tim Sheard, Leonidas Fegaras: Exploiting Punctuation Semantics in Continuous Data Streams. IEEE Trans. Knowl. Data Eng. 15(3): 555-568 (2003)
Themistoklis Palpanas, Michail Vlachos, Eamonn J. Keogh, Dimitrios Gunopulos, Wagner Truppel: Online Amnesic Approximation of Streaming Time Series. ICDE 2004: 339-349
Tamraparni Dasu, Shankar Krishnan, Suresh Venkatasubramanian, Ke Yi: An information-theoretic approach to detecting changes in multi-dimensional data streams. Proc. Symp. on the Interface of Statistics, Computing Science, and Applications (2006)
Arvind Arasu, Shivnath Babu, Jennifer Widom: The CQL continuous query language: semantic foundations and query execution. VLDB J. 15(2): 121-142 (2006)

Cloud computing:

Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung: The Google file system. SOSP 2003: 29-43
Jeffrey Dean, Sanjay Ghemawat: MapReduce: Simplified Data Processing on Large Clusters. OSDI 2004: 137-150
Jeffrey Dean, Sanjay Ghemawat: MapReduce: simplified data processing on large clusters. Commun. ACM 51(1): 107-113 (2008)
Jeffrey Dean, Sanjay Ghemawat: MapReduce: a flexible data processing tool. Commun. ACM 53(1): 72-77 (2010)
Michael Stonebraker, Daniel J. Abadi, David J. DeWitt, Samuel Madden, Erik Paulson, Andrew Pavlo, Alexander Rasin: MapReduce and parallel DBMSs: friends or foes? Commun. ACM 53(1): 64-71 (2010)
Ashish Thusoo, Joydeep Sen Sarma, Namit Jain, Zheng Shao, Prasad Chakka, Ning Zhang, Suresh Anthony, Hao Liu, Raghotham Murthy: Hive - a petabyte scale data warehouse using Hadoop. ICDE 2010: 996-1005
Ashish Thusoo, Zheng Shao, Suresh Anthony, Dhruba Borthakur, Namit Jain, Joydeep Sen Sarma, Raghotham Murthy, Hao Liu: Data warehousing and analytics infrastructure at Facebook. SIGMOD Conference 2010: 1013-1020