With Intel as a platinum sponsor, Intel Labs is contributing to several technical sessions at ACM SIGMOD/PODS 2020, a virtual database systems conference running June 14-19. This annual conference is a leading international forum for database researchers, practitioners, developers, and users to explore cutting-edge ideas and results, and to exchange techniques, tools, and experiences in all aspects of data management.
Funded by Intel, Google, and Microsoft, the Data Systems and Artificial Intelligence Lab (DSAIL) from the Massachusetts Institute of Technology (MIT) is scheduled to present 13 papers at the SIGMOD conference and its affiliated workshops. With the support of Intel, DSAIL researchers are investigating how novel machine learning and hardware technologies can be used to improve the performance of data processing systems.
As an example, during the research session on Thursday, June 18, Intel Labs and DSAIL researchers will present the research paper The Case for a Learned Sorting Algorithm by Ani Kristo (Brown University), Kapil Vaidya (MIT), Ugur Çetintemel (Brown University), Sanchit Misra (Intel Labs), and Tim Kraska (MIT). Sorting is one of the most fundamental algorithms in computer science and a common operation in databases, underpinning tasks such as indexing and query processing. Study results show that the DSAIL algorithm yields an average 3.38x performance improvement over C++ STL sort, a 1.49x improvement over sequential Radix Sort, and a 5.54x improvement over a C++ implementation of Timsort.
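The core idea behind learned sorting can be illustrated with a minimal sketch. This is not the paper's actual algorithm (which trains a proper CDF model and uses a more careful scattering scheme); it simply shows the two-phase pattern: a rough model of the data distribution predicts where each key belongs, and a cheap correction pass repairs the model's errors.

```python
import random

def learned_sort(keys, sample_size=100):
    """Sketch of model-based sorting: predict each key's rank from an
    approximate CDF, scatter keys near their predicted slots, then
    repair any remaining local disorder with insertion sort.
    (Illustrative only; the published Learned Sort is more refined.)"""
    n = len(keys)
    if n < 2:
        return list(keys)
    # 1. "Train" a crude CDF model on a sample: here just a linear
    #    fit between the sampled min and max.
    sample = random.sample(keys, min(sample_size, n))
    lo, hi = min(sample), max(sample)
    span = (hi - lo) or 1
    # 2. Scatter each key into the bucket at its predicted position,
    #    clamping predictions that fall outside the sampled range.
    buckets = [[] for _ in range(n)]
    for k in keys:
        pos = int((k - lo) / span * (n - 1))
        buckets[max(0, min(n - 1, pos))].append(k)
    nearly_sorted = [k for b in buckets for k in b]
    # 3. Correction pass: insertion sort is fast on nearly-sorted data
    #    and guarantees a fully sorted result regardless of model error.
    for i in range(1, n):
        k, j = nearly_sorted[i], i - 1
        while j >= 0 and nearly_sorted[j] > k:
            nearly_sorted[j + 1] = nearly_sorted[j]
            j -= 1
        nearly_sorted[j + 1] = k
    return nearly_sorted
```

When the model approximates the distribution well, most keys land at or very near their final positions, so the correction pass does little work; that is the source of the speedups reported in the paper.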
Co-sponsored by Intel, the Data Management on New Hardware (DaMoN) Workshop on June 15 will feature five presentations by Intel scientists. DSAIL and Intel Labs researchers will present Large-Scale In-Memory Analytics on Intel® Optane™ DC Persistent Memory by Anil Shanbhag (MIT), Nesime Tatbul (Intel Labs and MIT), David Cohen (Intel), and Samuel Madden (MIT). The research team presents one of the first experimental studies characterizing the performance behavior of the Intel® Optane™ DC Persistent Memory Module (PMM) in the context of analytical database workloads. The study reveals interesting performance tradeoffs that can help guide the design of next-generation OLAP systems in the presence of persistent memory in the storage hierarchy. Additional DaMoN contributions by Intel scientists include a keynote by Jose Roberto Alvarez from Intel’s Programmable Solutions Group and three presentations on the use of Intel FPGAs in high-performance database systems.
DSAIL researchers are also featured at the AI for Data Management (aiDM) Workshop on June 19, co-organized by Nesime Tatbul, a senior research scientist at Intel Labs and at MIT’s Computer Science and Artificial Intelligence Lab. DSAIL and Intel Labs researchers will present RadixSpline: A Single-Pass Learned Index by Andreas Kipf (MIT), Ryan Marcus (Intel Labs and MIT), Alexander van Renen (TUM), Mihail Stoian (TUM), Alfons Kemper (TUM), Tim Kraska (MIT), and Thomas Neumann (TUM). RadixSpline is a learned index that can be built in a single pass over the data and is competitive with state-of-the-art learned index models, like RMI, in size and lookup performance.
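The single-pass learned-index idea can be sketched as follows. This simplification picks spline knots at a fixed stride and omits the radix table, whereas the real RadixSpline selects knots adaptively under an error bound; it only illustrates the lookup pattern of predicting a position by interpolation and then searching a small bounded window.

```python
import bisect

def build_spline_index(sorted_keys, stride=16):
    """One pass over the sorted keys: keep every `stride`-th
    (key, position) pair as a spline knot. (A simplification of
    RadixSpline's error-bounded knot selection.)"""
    knots = [(sorted_keys[i], i) for i in range(0, len(sorted_keys), stride)]
    last = len(sorted_keys) - 1
    if len(knots) == 1 or knots[-1][1] != last:
        knots.append((sorted_keys[last], last))
    return knots

def spline_lookup(sorted_keys, knots, key, stride=16):
    """Interpolate between the two knots bracketing `key` to predict
    its position, then binary-search a small window around the guess.
    Returns the key's index, or -1 if it is absent."""
    knot_keys = [k for k, _ in knots]
    j = max(1, min(bisect.bisect_right(knot_keys, key), len(knots) - 1))
    (k0, p0), (k1, p1) = knots[j - 1], knots[j]
    # Linear interpolation within the segment predicts the position...
    frac = 0.0 if k1 == k0 else (key - k0) / (k1 - k0)
    pred = max(p0, min(p1, int(p0 + frac * (p1 - p0))))
    # ...and since the prediction can be off by at most one segment,
    # a window of +/- stride positions must contain the key if present.
    i = bisect.bisect_left(sorted_keys, key,
                           max(p0, pred - stride), min(p1, pred + stride) + 1)
    return i if i < len(sorted_keys) and sorted_keys[i] == key else -1
```

Because every lookup touches only the knot array and a constant-size window of the data, the index stays small while keeping lookups close to a single bounded search, which is the tradeoff the paper measures against models like RMI.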
At ACM SIGMOD, Intel Labs and DSAIL researchers will also present a demonstration of CDFShop: Exploring and Optimizing Learned Index Structures by Ryan Marcus (Intel Labs and MIT), Emily Zhang (MIT), and Tim Kraska (MIT). This demonstration showcases CDFShop, a tool to explore and optimize recursive model indexes (RMIs), a type of learned index structure.
“We are truly proud to share exciting progress of our multi-year collaboration with leading researchers and thought leaders at DSAIL/MIT,” said Pradeep K. Dubey, Intel Senior Fellow and director of the Parallel Computing Lab at Intel Labs. “Large scale data systems and artificial intelligence continue to shape each other and will likely enable some novel applications together.”