From hours to seconds: SQL Query Optimization using Volcano

Gaurav Sehgal
9 min read · Feb 15, 2021

SQL, aka the fundamental language of data. It makes data processing so simple that not just engineers but even business users can quickly learn and harness its power. But this simplicity comes at the cost of additional complexity in the underlying data systems.

Why? Because SQL is declarative by nature, there are many ways the system can execute the same query and get exactly the same results. But are all of them efficient?

For example, if you’re running a simple join query, the system has to decide which algorithm to use. It can use either a hash join or a merge join, and if the inputs are already sorted, a merge join would be the more efficient choice. Decisions like these matter enormously when you’re working with billions of data points.
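
To put rough numbers on that intuition: for inputs of m and n rows, a hash join typically costs on the order of O(m + n) (build a hash table on one side, probe with the other), while a merge join on already-sorted inputs also streams through both sides in O(m + n) but avoids building a hash table entirely; if the inputs are not sorted, the merge join first pays an extra O(m·log m + n·log n) for sorting.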

In this blog post, I’ll discuss how a database optimises a SQL query before executing it on the machines. This topic is very captivating to me because I’ve worked quite a lot on databases. Recently, I developed a new project at Atlan which uses these approaches (through Apache Calcite) to run federated queries on top of different databases and apply data policies like data masking, row-level restrictions, etc.

So, before moving forward, let’s go through some fundamentals.

SQL as relational algebra

Equivalent by Join commutative property

The main advantage of SQL is that you can easily represent it in relational algebra. So all the mathematical concepts that apply to relational algebra also hold for SQL queries. For example, you can represent the same relational algebra expression in different ways using the equivalence property, which will come in very handy when we do query optimisation.

Here is an example SQL query represented in relational algebra and its equivalent execution plan tree.

SQL: Select * from Cities where name = 'India'
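
In relational algebra this is simply a selection over the Cities relation, and the equivalent execution plan tree is read bottom-up, from the table scan to the selection. A rough text sketch:

    σ name = 'India' (Cities)

    Selection: name = 'India'
         |
    TableScan: Cities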

Now that we know about relational algebra, let’s go through some optimization examples.

SQL: Select a.name from Artist as a cross join Appears as b where a.id = b.artist_id

The above optimization is called the replace Cartesian product (cross join) rule. Generally, a cross product is very expensive to compute: even in the best case its complexity is Θ(m*n), since there is no condition restricting the join. But in this case there is a filter on both tables right after the cross join, so by the rule of equivalence we can replace the cross join with an inner join, which is much faster to compute. With an index on the join key, the inner join can run in roughly Θ(m·log n), and with a hash join in about Θ(m + n).
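
Written in relational algebra (π for projection, σ for selection, ⨯ for cross product, ⋈ for inner join), the rewrite looks roughly like this:

    Before: π a.name ( σ a.id = b.artist_id ( Artist ⨯ Appears ) )
    After:  π a.name ( Artist ⋈ a.id = b.artist_id Appears )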

SQL: Select a.name from Artist as a cross join Appears as b where a.id = b.artist_id

We can further optimize the above query by applying the projection pushdown rule. In the whole query, we only deal with the artist’s name and its id. Thus, we can add a projection right after each table scan. That does two things for us. First, less data is read into the intermediate buffers. Second, if the data files are column-oriented (e.g., Parquet), the reads will be very fast.
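
After the projection push-down, only the columns we actually need survive each table scan, so the plan looks roughly like:

    π a.name ( π id, name (Artist)  ⋈ a.id = b.artist_id  π artist_id (Appears) )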

Some history before understanding Volcano

In the 1990s, query optimisation was a hot topic because, at that time, machines lacked the computational power to process data in huge quantities. Many frameworks were designed to solve this problem, like Exodus and Starburst. But Volcano was the most successful one because of its modular design, which made it easy for database maintainers to use it directly, and because it never compromised on performance.

Volcano Optimiser

I’ve divided this section into three parts. First, the lifecycle of database query optimisation. Second, the inputs that a database maintainer has to provide to run the Volcano framework. And last, the search engine, which is the core of Volcano and does the actual work.

Optimisation life cycle

The Volcano optimiser, like any optimiser, takes the abstract syntax tree (AST) generated by the query parser as input and produces an optimised execution plan.

Logical vs Physical Plan

Internally, it converts the AST into a logical plan, which describes the query in relational algebra, and then optimises it into a final physical plan, which is also a relational algebra representation but with the actual algorithms attached, so it can be executed directly on a physical machine. The optimiser also has access to the database catalog, which contains all the tables’ metadata and stats like the number of rows, distinct values, etc. This information is essential for generating a fast execution plan. For example, if you know that the stored tables are sorted, then a merge join would be a better choice than the other types of join.
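
As a rough illustration (the operator names here are generic, not tied to any particular database), the same logical plan for our Artist/Appears query can map to more than one candidate physical plan, and the catalog stats decide which one wins:

    Logical:   Join(a.id = b.artist_id) over Scan(Artist), Scan(Appears)
    Physical:  HashJoin  over SeqScan(Artist),       SeqScan(Appears)
          or:  MergeJoin over SortedScan(Artist.id), SortedScan(Appears.artist_id)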

Inputs needed by volcano framework

These are all the inputs that a database maintainer has to provide to run the Volcano optimizer. In the next section, we will see how the search engine uses these inputs.

  1. Logical operators, or the logical algebra in which queries are written. For example, in relational databases we have Select, Join, Filter, etc. from relational algebra. These are required when transforming the query AST into the logical plan.
  2. Logical transformation rules, which preserve equivalence and transform the logical plan. For example, join is commutative, so one rule could be (a Join b) ↔ (b Join a). The optimiser uses these rules to generate different plans and then finds the best one based on different parameters (see the sketch after this list).
  3. Implementation rules, which transform the logical plan into an actual physical plan that is executable on the machine. For example, you can transform (a join b) → (a hashJoin b), or (a order by b) → (a mergeSort b).
  4. Physical algorithms with cost functions, which represent the actual algorithms along with their time complexity. For a join, these algorithms could be a merge join or a hash join.
  5. Enforcers, which are also algorithms, but ones that are not part of relational algebra. For example, order by (sort) is not a relational operator because it takes a relation as input and produces something that is not a relation but rather a sequence of tuples. These kinds of operators matter in the physical world but cannot be part of the logical operators. Later on, we will see the importance of enforcers in query optimisation and in implementing operations like ORDER BY.
  6. Table properties and stats, which the search engine needs to estimate the cost of a given execution plan. For example, stats like the number of rows and column cardinality help calculate the cost of certain algorithms.
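
To make these inputs concrete, here is a tiny, purely illustrative Python sketch (not the actual Volcano or Calcite API) of the kinds of things a maintainer registers: a logical operator, a physical algorithm, a transformation rule, an implementation rule, and a cost function driven by catalog stats. The table names and row counts are hypothetical.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Join:                    # logical operator
        left: str
        right: str
        condition: str

    @dataclass(frozen=True)
    class HashJoin:                # physical algorithm
        left: str
        right: str
        condition: str

    def join_commute(node):
        """Logical transformation rule: (a Join b) <-> (b Join a)."""
        return Join(node.right, node.left, node.condition)

    def join_to_hash_join(node):
        """Implementation rule: Join -> HashJoin."""
        return HashJoin(node.left, node.right, node.condition)

    def hash_join_cost(node, stats):
        """Cost function: build a hash table on one side, probe it with the other."""
        return stats[node.left] + stats[node.right]

    # Hypothetical catalog stats (row counts) for two tables.
    stats = {"Artist": 2_500_000, "Appears": 25_000_000}
    plan = join_to_hash_join(Join("Artist", "Appears", "a.id = b.artist_id"))
    print(hash_join_cost(plan, stats))   # 27500000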

Search Engine

Since the core of query optimization is to create alternative plans and then choose the most performant one, the search engine plays a very critical role in the Volcano framework. Internally, it uses all the inputs mentioned above and tries to optimize the query within the threshold cost limit provided by the database maintainer. One of its core design principles is the use of dynamic programming in a goal-oriented way with a minimum-cost limit: once the engine finds a plan below the provided cost limit, it stops further processing and returns that execution plan. This matters because the problem is, in general, NP-complete, so a poorly optimized search engine could take more time than the actual query execution.

Another design principle it follows is the use of backward chaining, a top-down approach, rather than forward chaining. The idea behind backward chaining is that you start from the goal, the root of an un-optimised execution plan, and work your way down while exploring alternative options and calculating their costs. This lets you apply branch-and-bound pruning: while traversing down the execution plan, if you reach a branch whose running cost already exceeds a cost you have seen before, you can simply skip that branch and shrink the search space.

Here’s the actual algorithm

Source: Volcano Paper

Seems tough, right? To put it in simple terms: the core of this algorithm is to apply the logical transformations and implementation rules to the logical plan in a top-down manner while calculating costs using the table stats and physical properties. Along the way, you store the state of each subtree in a lookup table so that dynamic programming can reuse it. And once you find an expression whose cost is below the provided limit, you simply return that plan and its cost.
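
To give a feel for that, here is a heavily simplified Python sketch of the idea (not the paper’s actual pseudocode, and ignoring expression groups, physical properties and enforcers): optimize top-down, memoize results in a lookup table, and prune any candidate whose cost already exceeds the limit.

    # Lookup table (memo) shared across calls: expression -> (plan, cost).
    memo = {}

    def find_best_plan(expr, limit, alternatives, cost_of):
        """Top-down, goal-driven search with memoization and pruning.

        `alternatives(expr)` yields candidate plans produced by the logical
        transformation and implementation rules; `cost_of(plan)` estimates a
        plan's cost from table stats and physical properties.
        """
        if expr in memo:                    # dynamic programming: reuse earlier results
            return memo[expr]
        for plan in alternatives(expr):
            cost = cost_of(plan)
            if cost >= limit:               # branch-and-bound: prune plans over the limit
                continue
            memo[expr] = (plan, cost)       # goal reached: a plan within the cost limit
            return plan, cost
        return None, float("inf")           # nothing satisfied the limit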

Still not getting it? Let’s go through an example to understand it better. Assume you have two tables on which you’re running a join along with a filter, a very common use case.

Step one is to represent the query as a logical plan and then apply the implementation rules to figure out the cost. In the diagram below, after attaching actual algorithms to the different operators in the logical plan, we arrive at the first physical plan. If you look closely, the majority of the time is spent in the nested-loop join, which has a time complexity of O(m*n). Here that is O(2.5M * 25M), which is very high.

Un-optimized Plan: O(2.5M * 25M)

Step two is to figure out whether we can transform the logical plan to decrease the cost. Based on the predicate (filter) push-down rule, we can see that the filter operates on a column belonging to the orders table. So if we push that filter down, between the table scan and the join, we still get an equivalent plan.

Optimized Plan: O(100K * 2.5M)

Now, with this simple optimization, we’ve reduced the join’s time complexity from O(25M*2.5M) to O(100K*2.5M), which can make a huge difference in query execution. From hours to seconds.
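
To put actual numbers on it: 2.5M × 25M is about 6.25 × 10¹³ comparisons, while 100K × 2.5M is about 2.5 × 10¹¹, roughly 250 times fewer, and that is before we even consider swapping the nested-loop join for a hash or merge join.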

Conclusion

In the end, I want to say that query optimizers play a very crucial role in databases. They are among the most complicated pieces to implement, even today, after all this research. I remember that while working with Presto, I ran into situations where queries were failing with OOM errors or taking more than 24 hours to run, all because of how the optimizer was set up and not doing its job right. So you can imagine the kind of impact it has at scale.

Now coming back to Volcano: even though it was one of the best solutions when it was first introduced, it has some problems. One of them is that its search engine is inefficient for complex queries. At each level of the logical tree, it does an exhaustive search and generates all possible plans before optimizing, which can be very expensive for complex queries. So its successor, Cascades, which is the state of the art today, was developed to solve these problems. I might discuss it in a future blog post. But if you’re interested, do check out the Cascades thesis.

Still, there are many systems that use Volcano, or rather a modified version of it. One of the most famous is Apache Calcite, along with projects that depend on it, like Apache Druid, Hive, etc.

References

  1. Volcano Paper: https://www.cs.ubc.ca/~rap/teaching/504/2006/readings/volcano.pdf
  2. Apache Calcite Volcano code: https://github.com/apache/calcite/tree/master/core/src/main/java/org/apache/calcite/plan/volcano
  3. Apache druid SQL implementation: https://github.com/apache/druid/tree/master/sql/src/main/java/org/apache/druid/sql
