Wail Alkowaileet

About Me

I am a Senior Software Engineer at Couchbase Inc.

My work targets the storage engines in Big Data Management Systems. In particular, I work on reducing the storage size to accelerate scan-based analytical workloads for document store systems.

Education

PhD, Computer Science

2017 - 2022

University of California, Irvine

Thesis: Towards Analytics-optimized Document Stores
Supervisor: Michael J. Carey

MSc, Computer Science

2011 - 2013

University of California, Irvine

Thesis: NUMA-aware Multicore Matrix Multiplication
Supervisor: Isaac D. Scherson

BSc, Computer Science

2004 - 2008

King Saud University

Experience

Senior Software Engineer

August 2022 - Present

Couchbase Inc.

www.couchbase.com

Role: Working on several optimizations for storage and query processing

Software Engineer Intern

June 2021 - September 2021

Couchbase Inc.

www.couchbase.com

Task: Added support for querying Parquet files from various sources (e.g., S3) in Couchbase Analytics, including pushdown of field accesses.

Committer

2016–Present

Apache AsterixDB

asterixdb.apache.org

Components: Storage and Query Optimization

Research Associate
Research Affiliate

2014–2017

King Abdulaziz City for Science and Technology (KACST)

Massachusetts Institute of Technology (MIT)

cces.kacst.edu.sa

Center for Complex Engineering Systems (CCES) – Institute for Data, Systems and Society (IDSS)
Role: Developing tools capable of harnessing and analyzing large-scale data
Projects: AsterixDB-Spark Connector, CityDynamics, Connected Intelligence Platform, Integrated Transportation Systems, Innovation Space

Associate Software Engineer

2008–2009

Advanced Electronics Company (AEC)

Research and Development Department (R&D)
Role: Developing components that connect electric and water smart-meters to the data collection units
Projects: ADDAD4, Water Smart Meter

Projects

LSM-based Tuple Compaction Framework

We proposed a new mechanism to leverage LSM-lifecycle events to infer the schema and semantically compact self-describing semi-structured records automatically. We also introduced a novel semi-structured record physical format for efficient construction and compaction. Using Apache AsterixDB, we were able (combined with our implementation of page-level compression) to reduce the storage size by 9.8x and improve the query performance by the same factor.
Paper: Extended version in arXiv

Columnar Formats for Schemaless LSM-based Document Stores

In this project, we propose several techniques based on piggy-backing on Log-Structured Merge (LSM) tree events and tailored to document stores to store document data in a columnar layout. We first extend the Dremel format, a popular on-disk columnar format for semi-structured data, to comply with document stores’ flexible data model. We then introduce two columnar layouts for organizing and storing data in LSM-based storage.
Paper: Extended version in arXiv

A Code Generation Technique for Schema-on-read Databases using Truffle

In this project, we shed light on the possibility of using query compilation techniques for document stores, where value types are not known until runtime. We use Oracle Truffle to implement an internal language for processing data stored in a Java-based document store. Even though we only translate part of a query plan, our evaluations show a substantial improvement over AsterixDB’s vectorized (batch-at-a-time) model.

Awards

PhD Scholarship

2017–2022

Awarded a full graduate scholarship from King Abdulaziz City for Science and Technology.

MSc Scholarship

2010-2013

Awarded a full graduate scholarship from the King Abdullah Scholarship Program.

Second Class Honor

2004-2008

Awarded Second Class Honor by King Saud University for a high GPA.

Publications

[1]

W. Alkowaileet & M. J. Carey.
Columnar Formats for Schemaless LSM-based Document Stores
PVLDB, 15(10), 2085-2097, 2022, Extended Version

[2]

W. Alkowaileet, S. Alsubaiee & M. J. Carey.
An LSM-based Tuple Compaction Framework for Apache AsterixDB
PVLDB, 13(9), 1388-1400, 2020, Extended Version

[3]

W. Alkowaileet, S. Alsubaiee, M. J. Carey, C. Li, P. Sinthong & W. Wang.
End-to-End Machine Learning with Apache AsterixDB
In Proceedings of the Second Workshop on Data Management for End-To-End Machine Learning, 2018

[4]

W. Alkowaileet, S. Alsubaiee, M. J. Carey, T. Westmann & Y. Bu.
Large-scale Complex Analytics on Semi-structured Datasets using AsterixDB and Spark
PVLDB, 9(13), 1585-1588, 2016 (Demo)

[5]

W. Alkowaileet, D. Carrillo-Cisneros, D. Lim & I. D. Scherson.
NUMA-aware Multicore Matrix Multiplication
Parallel Processing Letters, 24(04), 1450006, 2014