I just started looking at distributed data processing platforms. There is, of course, the Apache Hadoop project, but I have also come across HPCC Systems, a system used by LexisNexis Risk Solutions, and Disco, a system developed by Nokia Research Center.