Columnar Table Performance Enhancements of Greenplum Database with Block Metadata and Sort Keys
Databases and storage systems
Xiaojian Fan (male, born 1991.2.3) works at Alibaba Cloud as a developer on HybridDB and PostgreSQL, covering feature development, bug fixing, and customer support. A big fan of Greenplum and PostgreSQL, he is also interested in cloud computing, distributed database development, and database performance optimization.
Alibaba built a data warehouse service named HybridDB in its public cloud, based on the open-source Greenplum Database, and it continues to enhance HybridDB's performance. This presentation describes how Alibaba improves HybridDB's performance for columnar tables using data block metadata (the MIN/MAX values of each block's data) and sort keys (predefined keys by which data is sorted and stored). Test results show that block metadata can be generated on the fly without much overhead, yet can deliver better performance than even an index scan. With sort keys, a constant response time can be achieved for GROUP BY and ORDER BY queries.
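
To make the block metadata idea concrete, here is a minimal Python sketch of MIN/MAX block skipping. It is an illustration only, not HybridDB's implementation: the names `Block`, `build_blocks`, and `range_scan`, the block size, and the sample data are all assumptions made for this example. It records the minimum and maximum while each block is written, then uses them to skip blocks that cannot satisfy a range predicate.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """One column block plus its MIN/MAX metadata (hypothetical layout)."""
    values: List[int]
    min_value: int
    max_value: int


def build_blocks(values: List[int], block_size: int) -> List[Block]:
    """Split a column into fixed-size blocks, computing MIN/MAX as each block is written."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def range_scan(blocks: List[Block], low: int, high: int) -> Tuple[List[int], int]:
    """Return the values in [low, high] and the number of blocks actually read."""
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        # The metadata proves that no value in this block can match: skip it.
        if block.max_value < low or block.min_value > high:
            continue
        blocks_read += 1
        matches.extend(v for v in block.values if low <= v <= high)
    return matches, blocks_read


# Assumed sample data: a deterministic permutation of 0..999.
data = [(i * 37) % 1000 for i in range(1000)]

# Without a sort key, the values are stored in arrival order.
unsorted_blocks = build_blocks(data, block_size=100)
# With a sort key, the values are sorted before being stored.
sorted_blocks = build_blocks(sorted(data), block_size=100)

hits_unsorted, read_unsorted = range_scan(unsorted_blocks, 200, 249)
hits_sorted, read_sorted = range_scan(sorted_blocks, 200, 249)

print("blocks read without sort key:", read_unsorted)
print("blocks read with sort key:", read_sorted)
```

Both scans return the same rows, but in this toy setup the sorted layout gives every block a narrow, non-overlapping MIN/MAX range, so the predicate touches a single block. This is one plausible reading of why sort keys and block metadata complement each other; the actual storage format and pruning logic in HybridDB may differ.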