Lyublena Antova's talk at the DB seminar

Orca: Greenplum's optimizer for big data

 

Abstract

 
Over the past months we have taken up the challenging task of rewriting the query optimizer of Greenplum's MPP database, not only to make queries run faster, but also to allow for future extensions, rapid feature development and verification.
 
The result is a query optimizer framework that is multi-core enabled, highly extensible, modular and verifiable. Orca, the code name for this optimizer, outperforms the current planner by orders of magnitude on some queries. Orca has an extensible metadata provider framework, which allows extending the product to support other target systems, both SQL and NoSQL. In this talk I'll present the challenges we encountered while building Orca and the design principles we followed, since finding the right abstraction is what matters most. I'll also describe AMPERe, a tool built as part of Orca for automatic capture of minimal, portable and executable bug repros, which speeds up resolution of customer issues with the query processing engine and is a stepping stone toward building a flexible unit-test framework for query optimizers.
 

Bio

 
Lyublena got her PhD from Cornell University and is now working at Greenplum.