Bioinformatics with Python and Hadoop Streaming

A while back I wrote a short paper about using the Python programming language together with Apache Hadoop. The goal was to spread the processing of bioinformatics problems across several, or many, heavy-duty machines. At the time I did not come across many resources on the topic, so I've decided to post some of my work on the web. The intended audience is students in an introductory bioinformatics course, or anyone with the will to learn. It's really simple stuff!
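To give a flavor of what a Hadoop Streaming job in Python can look like, here is a minimal sketch that counts k-mers in sequence data. It is not taken from the paper; the file name kmer_count.py, the function names, and the choice of k-mer counting as the example are all assumptions made for illustration. It also treats every input line as an independent sequence, so k-mers that span a line break in a FASTA file are not counted.

```python
import sys
from itertools import groupby


def map_kmers(lines, k=3):
    """Mapper: yield (k-mer, 1) for every k-mer found on each sequence line."""
    for line in lines:
        seq = line.strip().upper()
        # Skip blank lines and FASTA header lines.
        if not seq or seq.startswith(">"):
            continue
        for i in range(len(seq) - k + 1):
            yield seq[i:i + k], 1


def parse_pairs(lines):
    """Turn tab-separated 'key<TAB>count' lines back into (key, int) pairs."""
    for line in lines:
        key, _, value = line.rstrip("\n").partition("\t")
        if key:
            yield key, int(value)


def reduce_counts(pairs):
    """Reducer: sum the counts per key. Expects pairs already sorted by key,
    which is how Hadoop Streaming delivers them to a reducer."""
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield key, sum(count for _, count in group)


if __name__ == "__main__":
    # Hypothetical command-line convention: "map" or "reduce" as first argument.
    role = sys.argv[1] if len(sys.argv) > 1 else ""
    if role == "map":
        for kmer, count in map_kmers(sys.stdin):
            print(f"{kmer}\t{count}")
    elif role == "reduce":
        for kmer, total in reduce_counts(parse_pairs(sys.stdin)):
            print(f"{kmer}\t{total}")
```

Because Streaming only needs programs that read standard input and write standard output, the script can be tried locally with a shell pipeline such as `cat reads.fa | python3 kmer_count.py map | sort | python3 kmer_count.py reduce` (reads.fa is a placeholder file name). On a cluster, the same two commands would be passed as the -mapper and -reducer options of the Hadoop Streaming jar.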

The Paper/Tutorial:

Relevant Slides:
I also gave a presentation on functional programming, Python, and Hadoop. The slides draw heavily on the paper above.

I haven't edited the material yet, so there may be grammatical errors. Eventually I plan to move the contents onto the blog.