This document is the dissertation of Gianmarco De Francisci Morales, submitted for the PhD program in Computer Science and Engineering at IMT Institute for Advanced Studies in Lucca, Italy. It addresses challenges in managing and analyzing large datasets, or "big data". The dissertation was approved by the program coordinator and supervisor, and reviewed by two external reviewers. Its six chapters include introductions to big data and to related work, and present three contributed algorithms, for document filtering, graph computation, and real-time news recommendation, that scale to large datasets through parallel and distributed techniques.