Thursday, September 15, 2016

Rule Based Optimization Vs Cost Based Optimization

Rule Based Optimization: This is an old technique. Basically, the RBO used a set of rules to determine how to execute a query. E.g. If an index is available on a table, the RBO rules can be to always use that index (a RBO for our travel analogy can be avoid all routes with speed brakers). As it turns out that this is simpler to implement but not the best strategy always and can backfire. RBO was supported in earlier versions of Oracle. (SQL Server supports table hints which in a way can be compared to RBO, as they force optimizer to follow certain path).

Basically, the RBO used a set of rules to determine how to execute a query. If an index was available on a table, the RBO rules said to always use the index. There are some cases where the use of an index slowed down a query. For example, assume someone put an index on the GENDER column, which holds one of two values, MALE and FEMALE.

SELECT * FROM emp WHERE gender='FEMALE';

If the above query returned approximately 50% of the rows, then using an index would actually slow things down. It would be faster to read the entire table and throw away all rows that have MALE values. Experts in Oracle query optimization have come to a rule of thumb that says if the number of rows returned is more than 5-10% of the total table volume, using an index would slow things down. The RBO would always use an index if present because its rules said to.

The biggest problem with the RBO was that it did not take the data distribution into account.

Cost Based Optimization: Motivation behind CBO is to come up with the cheapest execution plan available for each SQL statement. The cheapest plan is the one that will use the least amount of resources (CPU, Memory, I/O, etc.) to get the desired output (in relation to our travel analogy this can be Petrol, time, etc.). This can be a daunting task for DB engine as complex queries can thousands of possible execution paths, and selecting the best one can be quite expensive. CBO is supported by most of databases including MySQL, Oracle, SQL Server, etc.

The CBO uses statistics about the table, its indexes and the data distribution to make better informed decisions. Using our previous example, assume that the company has employees that are 95% female and 5% male. If you query for females, then you do not want to use the index. If you query for males, then you would like to use the index. The CBO has information at hand to help make these kind of determinations that were not available in the old RBO.


References:
https://nirajrules.wordpress.com/2009/06/10/cost-based-optimization-cbo-vs-rule-based-optimization-rbo/
http://searchoracle.techtarget.com/answer/Rule-based-vs-cost-based-optimization

No comments:

Post a Comment