Monte Carlo Tree Search-based Optimizing Agent for SPARQL Join Order Selection

Series

Research Master Defense
Speaker

Ruben Eschauzier

Location

Room 1.02
Amsterdam
Date and time

December 20, 2022
16:00 - 17:00

E-commerce businesses increasingly move to use RDF databases for customer experience. This includes customer service chatbots, search engines, and related product identification. These databases are queried using SPARQL. Queries should be fast for optimal customer engagement and reduced server costs. Query optimizers primarily focus on optimal join order selection, this problem is hard due to the large size of the search space and the complex relation between queried data and query performance. Recently, reinforcement learning has successfully been applied to SQL join plan selection, beating state-of-the-art join plan optimizers. However, this approach has not been translated to SPARQL join plan selection.
In this thesis, we will introduce a first-of-its-kind Monte Carlo Tree Search-based Join Order Selection agent for SPARQL and attempt to demonstrate its effectiveness on e-commerce applications. It will use graph embedding techniques to learn database-specific features that are fed into a graph convolutional network that guides the search. The agent is trained to predict query execution times.
Our experiments show that while our agent cannot improve total query execution time compared to existing optimizers, it can produce better join plans for less complex queries. We identify key challenges and problems in our method that hinder our agent's performance and should be improved in future research.