Tuesday, June 29, 2010

Hadoop Summit 2010 - Presentation Slides & Videos

==========================================
AGENDA
==========================================
(1)Big Data and the Power of Hadoop
Blake Irving, Executive Vice President
and Chief Products Officer, Yahoo!
- Article: Yahoo announces SOX compliance coming for Hadoop
- VIDEO

(2)Hadoop and The Future of Internet Scale Cloud Computing
Shelton Shugar, Senior Vice President, Cloud Computing, Yahoo!
- VIDEO

(3)Scaling Hadoop
Eric Baldeschwieler, Vice President,
Hadoop Software Development, Yahoo!
- VIDEO

(4)Making Hadoop Enterprise Ready with Amazon Elastic MapReduce
Peter Sirota, General Manager, Elastic MapReduce
- VIDEO

(5)Hadoop Grows Up
Doug Cutting, Cloudera
- VIDEO

(6)Inside Large-Scale Analytics at Facebook
Mike Schroepfer, VP of Engineering, Facebook
- VIDEO

==========================================
DEVELOPERS TRACK
==========================================
(1)Hadoop Security in Detail
Owen O'Malley, Yahoo!
- PRESENTATION SLIDE
- Hadoop Security Preview
- VIDEO

(2)Hive Integration: HBase and RCFile
John Sichi and Yongqiang He, Facebook
- PRESENTATION SLIDE
- List of presentations mainly focused on Hive
- HBase Presentations
- PoweredBy Hive (some)
- PoweredBy HBase (some)
- Blog: Integrating Hive and HBase
- VIDEO

(3)Hadoop and Pig at Twitter
Kevin Weil, Twitter
- PRESENTATION SLIDE
- VIDEO

(4)Developer's Most Frequent Hadoop Headaches & How to Address Them
Shevek Mankin, Karmasphere
- PRESENTATION SLIDE
- VIDEO

(5)Workflow on Hadoop Using Oozie
Alejandro Abdelnur, Yahoo!
- PRESENTATION SLIDE
- VIDEO

(6)Cascalog: an Interactive Query Language for Hadoop
Nathan Marz, BackType
- PRESENTATION SLIDE
- VIDEO

(7)Honu - A Large Scale Streaming Data Collection
and Processing Pipeline
Jerome Boulon, Netflix
- PRESENTATION SLIDE
- VIDEO

(8)Hadoop Frameworks Panel: Pig, Hive, Cascading,
Cloudera Desktop, LinkedIn Voldemort, Twitter ElephantBird
Moderator: Sanjay Radia, Yahoo!
- Twitter ElephantBird - updated slide
- PRESENTATION SLIDE
- VIDEO

==========================================
APPLICATIONS TRACK
==========================================
(1)Disruptive Applications with Hadoop
Rod Smith, VP, IBM Emerging Internet Technologies
- PRESENTATION SLIDE
- VIDEO

(2)ZettaVox: Content Mining and Analysis Across
Heterogeneous Compute Clouds
Mark Davis, Kitenga
- PRESENTATION SLIDE
- VIDEO

(3)Biometric Databases and Hadoop
Jason Trost, Abel Sussman and Lalit Kapoor, Booz Allen Hamilton
- PRESENTATION SLIDE
- VIDEO

(4)Hadoop - Integration Patterns and Practices
Eric Sammer, Cloudera
- PRESENTATION SLIDE
- VIDEO

(5)Winning the Big Data SPAM Challenge
Stefan Groschupf, Datameer; Florian Leibert, Erich Nachbar
- PRESENTATION SLIDE
- VIDEO

(6)Data Applications and Infrastructure at LinkedIn
Jay Kreps, LinkedIn
- PRESENTATION SLIDE
- VIDEO

(7)Online Content Optimization with Hadoop
Amit Phadke, Yahoo!
- PRESENTATION SLIDE
- VIDEO

(8)Hadoop Customer Panel: Amazon Elastic MapReduce
Moderator: Deepak Singh, Amazon Web Services
- VIDEO

==========================================
RESEARCH TRACK
==========================================
(1)Design Patterns for Efficient Graph Algorithms in MapReduce
Jimmy Lin, Michael Schatz, University of Maryland
- PRESENTATION SLIDE
- RESEARCH PAPER
- BOOK
- VIDEO

(2)Mining Billion-node Graphs: Patterns, Generators and Tools
Christos Faloutsos, Carnegie Mellon University
- PRESENTATION SLIDE
- RESEARCH PAPER

(3)XXL Graph Algorithms
Sergei Vassilvitskii, Yahoo! Labs
- PRESENTATION SLIDE

(4)Efficient Parallel Set-Similarity Joins Using Hadoop
Chen Li, University of California, Irvine
- PRESENTATION SLIDE
- RESEARCH PAPER
- VIDEO

(5)Exact Inference in Bayesian Networks using MapReduce
Alex Kozlov, Cloudera
- PRESENTATION SLIDE
- VIDEO

(6)Hadoop for Scientific Workloads
Lavanya Ramakrishnan, Lawrence Berkeley National Lab
- PRESENTATION SLIDE
- VIDEO

(7)Hadoop for Genomics
Jeremy Bruestle, Spiral Genetics
- PRESENTATION SLIDE
- VIDEO

(8)Parallel Distributed Image Stacking and Mosaicing with Hadoop
Keith Wiley, University of Washington
- PRESENTATION SLIDE

Related:
- Massive Data
- List of presentations about Hadoop
- Past: 2008 Hadoop Summit slides and videos
- Apache Hadoop Wiki
- Cloudera training videos on Hadoop
- Yahoo! Hadoop Tutorial
- PoweredBy Hadoop
- Google Code University: Distributed Systems
- University of Washington: Scalable Systems Course
- MapReduce & Hadoop Algorithms in Academic Papers
- Machine Learning on Hadoop
- Reference: Graph Theory and Complex Networks, Maarten van Steen
(via @Werner)
- Mathematics of Batch Processing
- Pig at LinkedIn, Open Source and Understanding Systems

- Yahoo's Commitment to Hadoop and Open Source
- Hadoop Trends, Opportunities, and Challenges
- Multiple Sequence Alignment Using Hadoop
- Key Challenges in Cloud Computing and Yahoo!'s Approach
- Hadoop @ Yahoo! - Internet Scale Data Processing
- Hadoop, Pig, HBase at Twitter
- CDH3 Installation and Configuration Guide

- Testing Hadoop
- Atlassian Clover - code coverage
- Challenges and Uniqueness of QE and RE Processes in Hadoop
- Data Management On Grid
- Benchmarking and Optimizing Hadoop
- Data Management on Hadoop @ Yahoo!
- Tuning Hadoop To Deliver Performance To Your Application