Pydoop -- a Python MapReduce and HDFS API for Hadoop.
Copyright 2009-2014 CRS4.
Documentation, including installation instructions, is provided in HTML format under docs/html. For a general overview, you can also read the following paper:
Simone Leo and Gianluigi Zanetti. "Pydoop: a Python MapReduce and HDFS API for Hadoop". In Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing (HPDC 2010), pages 819-825, 2010.
Please use the above citation if you wish to reference Pydoop in a scientific paper.
Pydoop has been tested on the following installations:
- Hadoop 0.20.2
- Hadoop 1.0.4
- Hadoop 1.1.2
- Hadoop 1.2.1
- Hadoop 2.2.0
- CDH 3u4
- CDH 3u5
- CDH 4.2.0
- CDH 4.3.0