Monte Carlo Tree Search-based Optimizing Agent for SPARQL Join Order Selection
SeriesResearch Master Defense
Date and time
December 20, 2022
16:00 - 17:00
E-commerce businesses increasingly move to use RDF databases
for customer experience. This includes customer service chatbots, search
engines, and related product identification. These databases are queried using
SPARQL. Queries should be fast for optimal customer engagement and reduced
server costs. Query optimizers primarily focus on optimal join order selection,
this problem is hard due to the large size of the search space and the complex
relation between queried data and query performance. Recently, reinforcement
learning has successfully been applied to SQL join plan selection, beating
state-of-the-art join plan optimizers. However, this approach has not been
translated to SPARQL join plan selection.
In this thesis, we will introduce a first-of-its-kind Monte Carlo Tree Search-based Join Order Selection agent for SPARQL and attempt to demonstrate its effectiveness on e-commerce applications. It will use graph embedding techniques to learn database-specific features that are fed into a graph convolutional network that guides the search. The agent is trained to predict query execution times.
Our experiments show that while our agent cannot improve total query execution time compared to existing optimizers, it can produce better join plans for less complex queries. We identify key challenges and problems in our method that hinder our agent's performance and should be improved in future research.