This document discusses analyzing transportation data with open-source big data analytics tools. It gives an overview of two popular tools, H2O and SparkR, and then applies both to a transportation dataset by fitting a generalized linear model (GLM). The walkthrough covers importing the data, splitting it into training and test sets, building the GLM in H2O and in SparkR, generating predictions on the test data, and comparing predicted with actual values. Code and output are shown for each step of the analysis.
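The split, fit, predict, and compare steps can be illustrated with a minimal sketch. It uses neither H2O nor SparkR: it is plain Python with a closed-form least-squares fit, which corresponds to a Gaussian GLM with an identity link and a single predictor. The data are synthetic (trip distance against travel time) and are an assumption made purely for illustration, as are the helper names `train_test_split`, `fit_gaussian_glm`, `predict`, and `rmse`; none of them come from the document's dataset or from either tool's API.

```python
import random


def train_test_split(rows, train_fraction=0.75, seed=42):
    """Shuffle a copy of the rows and split it into training and test lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_gaussian_glm(rows):
    """Fit y = intercept + slope * x by least squares.

    For a Gaussian family with an identity link, the GLM estimate
    coincides with ordinary least squares, so a closed form suffices.
    """
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, xs):
    """Apply the fitted coefficients to new predictor values."""
    intercept, slope = model
    return [intercept + slope * x for x in xs]


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual values."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return (squared / len(actual)) ** 0.5


# Synthetic data (hypothetical, not the document's dataset):
# trip distance in km and travel time in minutes with Gaussian noise.
rng = random.Random(0)
data = []
for _ in range(200):
    distance = rng.uniform(1.0, 30.0)
    minutes = 5.0 + 2.0 * distance + rng.gauss(0.0, 1.0)
    data.append((distance, minutes))

train, test = train_test_split(data)             # import and split
model = fit_gaussian_glm(train)                  # build the model
predicted = predict(model, [x for x, _ in test]) # predict on test data
actual = [y for _, y in test]
error = rmse(predicted, actual)                  # compare predicted vs. actual

print("intercept, slope:", model)
print("test RMSE:", error)
```

In H2O or SparkR the same four stages would be carried out by each library's own data-import, frame-splitting, GLM, and prediction functions. The sketch only mirrors the order of the steps, not their APIs.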