From @hackingdata:
- author a multistage processing pipeline in Python
- design a hypothesis test
- perform a regression analysis over data samples with R
- design and implement an algorithm for some data-intensive product
or service in Hadoop
- communicate the results of analyses to other members of the org
in a clear and concise fashion