This document discusses enhancements to the Spark SQL optimizer through improved statistics collection and cost-based optimization rules. It describes collecting table-level and column-level statistics from the Hive metastore and building one-dimensional (1D) and two-dimensional (2D) histograms. New rules estimate the cost of each operator from the number of rows it outputs and the size of that output. The document also covers join ordering, filter statistics, and the handling of unique columns. Future work includes faster histogram collection, expression statistics, and continuous feedback optimization.
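
As a rough illustration of how row-count estimates of this kind can be derived from statistics, here is a minimal Python sketch. It is not the Spark SQL implementation. It assumes an equi-width 1D histogram with values spread uniformly inside each bucket, and it uses the textbook equi-join estimate (product of the input row counts divided by the larger distinct count of the join keys). The names `Histogram`, `estimate_less_than_selectivity`, `estimate_join_rows`, and `estimate_size_bytes` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Histogram:
    """Equi-width 1D histogram over [lo, hi) (hypothetical container)."""
    lo: float
    hi: float
    counts: List[int]  # number of rows falling into each bucket


def estimate_less_than_selectivity(hist: Histogram, value: float) -> float:
    """Estimate the fraction of rows satisfying `column < value`.

    Assumption: values are uniformly distributed inside each bucket.
    """
    total = sum(hist.counts)
    if total == 0 or not hist.counts:
        return 0.0
    width = (hist.hi - hist.lo) / len(hist.counts)
    matched = 0.0
    for i, count in enumerate(hist.counts):
        bucket_lo = hist.lo + i * width
        bucket_hi = bucket_lo + width
        if value >= bucket_hi:
            # The whole bucket lies below the predicate value.
            matched += count
        elif value > bucket_lo:
            # The bucket is cut by the predicate: take a proportional share.
            matched += count * (value - bucket_lo) / width
    return matched / total


def estimate_join_rows(left_rows: int, right_rows: int,
                       left_key_distinct: int, right_key_distinct: int) -> int:
    """Textbook equi-join estimate: |L| * |R| / max(ndv(L.key), ndv(R.key))."""
    ndv = max(left_key_distinct, right_key_distinct)
    if ndv == 0:
        return 0
    return left_rows * right_rows // ndv


def estimate_size_bytes(rows: int, avg_row_width_bytes: int) -> int:
    """Output size as row count times an assumed average row width."""
    return rows * avg_row_width_bytes
```

With a unique join key on one side (distinct count equal to that table's row count), the join estimate reduces to the row count of the other side, which is one reason unique columns are worth tracking separately.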