Database Seminar - Andy Grove

— 5:30pm

Location:
Virtual Presentation - ET - Remote Access - Zoom

Speaker:
ANDY GROVE, Apache Arrow and Apache DataFusion PMC Member
https://www.linkedin.com/in/andygrove/


Accelerating Apache Spark workloads with Apache DataFusion Comet

Apache Spark is one of the most widely used distributed data analysis frameworks. However, its JVM-based, row-oriented query execution engine limits Spark’s performance and scalability.

In this talk, we will introduce DataFusion Comet, an accelerator for Apache Spark designed to improve the efficiency of Spark queries by translating them into native queries that leverage Apache Arrow and Apache DataFusion. We will explore the core architecture of Comet, explain how Spark plans are translated into native plans, and discuss some of the challenges of providing Spark compatibility.

— 

Andy Grove is an Apache Arrow & Apache DataFusion PMC Member and the original creator of Apache DataFusion. 

This talk is part of the Database Building Blocks Seminar

Zoom Participation. See announcement.

Event Website:
https://db.cs.cmu.edu/events/building-blocks-apache-datafusion-comet-andy-grove