At my day job we use Disco, a Python + Erlang based Map-Reduce framework, to crunch our web server and application logs to generate useful data. Each web server's daily log file is a couple of GB, which adds up to a lot of log data that needs to be processed on a daily basis. [...]