Luca Canali discusses performance troubleshooting of Apache Spark at CERN, focusing on scaling challenges, monitoring tools, and actionable metrics for analytics on high-energy physics data. The talk emphasizes that understanding Spark workloads is necessary to identify bottlenecks, and that benchmarking and monitoring tools can improve capacity planning. Key takeaways include the importance of context when interpreting benchmarks, the complexity of measuring performance in distributed systems, and strategies for effective troubleshooting.