An atypical approach to SQL query optimization targeting an Oracle database

The first in a series under the banner "Feedback on database administration", this article, already published in French on 25 December 2024, aims to raise awareness of the importance of the SQL query optimization step in the process of implementing a relational database.

It is naturally aimed at database administrators and developers, but it is also instructive for project managers and IT decision-makers, as the actors responsible for ensuring that processes are rigorously followed.

Finally, it should help anyone with some background in computer systems sharpen their understanding of the kinds of problems that affect software.

When implementing a relational database, the SQL query optimization step, which should be part of the development phase, is too often skipped for a variety of reasons.

Among these:

  • The pressure of delivery deadlines, which encourages sacrificing certain steps of the design process,
  • The use of code generators whose output is often never revisited by the development teams,
  • The absence of figures that would give developers a projection of how the volume of the main tables will grow over time,
  • The clearly mistaken belief that SQL query optimization can be carried out entirely, and at lower cost, after the database has been put into production.

This observation goes a long way towards explaining the disastrous performance of certain applications that rely on data from a relational database.

During a recent Oracle database optimization campaign, an AWR report highlighted SQL queries that were visibly "toxic" in a transactional environment, as shown below:

[Image: AWR report excerpt listing the most resource-consuming SQL statements]

It should be noted that this report, which covers a one-hour monitoring period, reflects exclusively transactional activity.
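For readers who wish to reproduce this kind of diagnosis, such a report is produced with Oracle's standard AWR tooling; the snapshot identifiers below are placeholders to be taken from DBA_HIST_SNAPSHOT.

-- Interactive generation from SQL*Plus (prompts for the snapshot range):
@?/rdbms/admin/awrrpt.sql

-- Or programmatically (arguments: dbid, instance number, begin snapshot, end snapshot):
SELECT output
FROM   TABLE(DBMS_WORKLOAD_REPOSITORY.AWR_REPORT_TEXT(&dbid, 1, &begin_snap, &end_snap));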

In this first presentation, we focus on the first query, whose main characteristics are as follows:

  • SQL_ID: baqsdujdaps03,
  • Total elapsed time in the DBMS: 4,327.55 seconds,
  • Elapsed time per execution: 84.85 seconds,
  • Number of executions over the monitoring period: 51.
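These figures can be read back with a simple dictionary query while the cursor is still in the shared pool; note that elapsed_time in V$SQL is expressed in microseconds.

SELECT sql_id,
       executions,
       ROUND(elapsed_time / 1e6, 2)                         AS total_elapsed_s,
       ROUND(elapsed_time / 1e6 / NULLIF(executions, 0), 2) AS elapsed_per_exec_s
FROM   v$sql
WHERE  sql_id = 'baqsdujdaps03';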

In a transactional environment, these characteristics alone justify any attempt to improve the processing of this query, whose original text is as follows:

[Image: original text of the SQL query]

A sketch of the logical sub-model containing all the tables referenced in this query is shown below:

[Image: sketch of the logical sub-model]

The urgency of optimization becomes evident when we look at the execution plan proposed by Oracle's optimizer:

[Image: execution plan produced by the Oracle optimizer for the original query]
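For reference, a plan of this kind can be retrieved from the shared pool with Oracle's DBMS_XPLAN package; only the SQL_ID below comes from the report, and the format option is a common default.

SELECT plan_table_output
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR('baqsdujdaps03', NULL, 'TYPICAL'));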

The cost of 1,080,711 reported by the RDBMS reflects a very high level of consumption of the resources required to process the query.

This cost results from the combination of two full scans of the T_IMPOT table, whose cardinality here is 30,609,112 rows, with a sort induced by the UNION set operator and performed on 14,645,501 rows.

If we consider that these figures will continue to grow over time, the hypothesis of a gradual deterioration in the performance of a system already on the verge of asphyxiation becomes a certainty.
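To make the discussion concrete, here is a purely hypothetical sketch of that query shape; the projected columns, join keys (ID_IMPOT, ID_TYPE_IMPOT) and label column are assumptions, since the actual text is only shown as an image above.

SELECT tit.ID_TYPE_IMPOT, tit.LB_TYPE_IMPOT
FROM   T_IMPOT_TYPE      tit
JOIN   T_IMPOT           imp ON imp.ID_TYPE_IMPOT = tit.ID_TYPE_IMPOT
JOIN   T_IMPOT_DOC_DEPOT idd ON idd.ID_IMPOT      = imp.ID_IMPOT
UNION   -- the UNION forces a sort to eliminate duplicates
SELECT tit.ID_TYPE_IMPOT, tit.LB_TYPE_IMPOT
FROM   T_IMPOT_TYPE tit
JOIN   T_IMPOT      imp ON imp.ID_TYPE_IMPOT = tit.ID_TYPE_IMPOT
JOIN   T_DOC        doc ON doc.ID_IMPOT      = imp.ID_IMPOT;

Each branch reads T_IMPOT in full, matching the two full scans reported in the plan.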

Our challenge: how can we reduce and stabilize this clearly prohibitive processing cost, so as to significantly lighten the load on the database server's resources?

  • The first line of investigation was the use of the SQL Tuning Advisor tool. Our hope of obtaining recommendations to improve the performance of the target query unfortunately remained unfulfilled.
  • The second line of investigation was to replace, in the two full table scans identified above (see execution plan), the T_IMPOT table with the idx_T_IMPOT_optim_pk1 index. This index was newly created by running the following command:

[Image: CREATE INDEX statement for idx_T_IMPOT_optim_pk1]
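The exact column list of the index is not reproduced here; a minimal sketch, assuming the query only reads the hypothetical ID_IMPOT and ID_TYPE_IMPOT columns of T_IMPOT so that an index fast full scan can replace each table scan, would be:

CREATE INDEX idx_T_IMPOT_optim_pk1
    ON T_IMPOT (ID_IMPOT, ID_TYPE_IMPOT);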

The small disk footprint of the newly created index did produce a slightly less costly execution plan, but the gain fell well short of our expectations.

  • The last line of investigation was an attempt to rewrite the target query. To do this, we reformulated the original SQL query in pseudo-natural language as follows:


[Image: reformulation of the query in pseudo-natural language]

This reformulation inspired an optimization strategy resolutely focused on cutting the Gordian knot of the two full scans of the voluminous T_IMPOT table.

The idea is to pre-qualify tax types in real time. More explicitly, whenever a new tax identifier enters one of the two tables T_IMPOT_DOC_DEPOT or T_DOC through an insertion, the corresponding tax type is immediately flagged in one of the two new columns provided for this purpose in the T_IMPOT_TYPE table.

Having added the two required pseudo-calculated columns to the T_IMPOT_TYPE table, namely IN_T_IMPOT_DOC_DEPOT and IN_T_DOC, we rewrote the SQL query as follows:

[Image: rewritten SQL query]
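As an illustration only, a query of this kind could take the following shape, assuming the flags hold 'Y'/'N' values and the same hypothetical output columns as in the earlier sketch:

SELECT tit.ID_TYPE_IMPOT, tit.LB_TYPE_IMPOT
FROM   T_IMPOT_TYPE tit
WHERE  tit.IN_T_IMPOT_DOC_DEPOT = 'Y'
   OR  tit.IN_T_DOC = 'Y';

The voluminous T_IMPOT table no longer appears at all: the work has been shifted to insertion time, where it is amortized row by row.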

And hold on tight!

As a result of this rewrite, the cost of processing the query dropped from 1,080,711 to 5, as illustrated by the following new execution plan:

[Image: new execution plan, with a cost of 5]

In our test environment, running each of the two versions of the query brought back the same dataset, consisting of 178 rows.

The spectacular cost difference was reflected in execution times of 17.134 seconds for the original query versus 17 milliseconds for the rewritten query, a ratio of roughly 1,000 to 1.

It should be noted, however, that the impact will be significantly magnified in a production environment, where several queries run in parallel.

The new version of the query should now replace the original version in the application code.

  • Developers or DB administrators interested in the implementation details can explore the ins and outs of the rewrite, which involves the following steps:

 

1. Add two columns to the T_IMPOT_TYPE table, namely IN_T_IMPOT_DOC_DEPOT and IN_T_DOC,

[Image: ALTER TABLE command adding the two columns]
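A minimal sketch of this step, assuming CHAR(1) flags defaulting to 'N' (the actual datatype is not shown in the article):

ALTER TABLE T_IMPOT_TYPE
  ADD (IN_T_IMPOT_DOC_DEPOT CHAR(1) INVISIBLE DEFAULT 'N',
       IN_T_DOC             CHAR(1) INVISIBLE DEFAULT 'N');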

The INVISIBLE column option, introduced in Oracle 12c, keeps the application running smoothly despite the structural change to the T_IMPOT_TYPE table: existing SELECT * statements and INSERT statements without a column list simply ignore the new columns.

2. Manually populate the two new columns using the following update statements:

[Image: UPDATE statements populating the two columns]
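A sketch of this one-off back-fill, under the same assumed column names (ID_TYPE_IMPOT on T_IMPOT and T_IMPOT_TYPE, ID_IMPOT on the document tables):

UPDATE T_IMPOT_TYPE tit
SET    tit.IN_T_IMPOT_DOC_DEPOT = 'Y'
WHERE  EXISTS (SELECT 1
               FROM   T_IMPOT imp
               JOIN   T_IMPOT_DOC_DEPOT idd ON idd.ID_IMPOT = imp.ID_IMPOT
               WHERE  imp.ID_TYPE_IMPOT = tit.ID_TYPE_IMPOT);

UPDATE T_IMPOT_TYPE tit
SET    tit.IN_T_DOC = 'Y'
WHERE  EXISTS (SELECT 1
               FROM   T_IMPOT imp
               JOIN   T_DOC doc ON doc.ID_IMPOT = imp.ID_IMPOT
               WHERE  imp.ID_TYPE_IMPOT = tit.ID_TYPE_IMPOT);

COMMIT;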

3. Create an insert/update trigger attached to the T_IMPOT_DOC_DEPOT table, so that the IN_T_IMPOT_DOC_DEPOT column of T_IMPOT_TYPE is fed automatically from then on,

[Image: insert/update trigger on T_IMPOT_DOC_DEPOT]
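A minimal sketch of such a trigger, with an assumed trigger name and the same hypothetical column names; each new or re-pointed document row flags the tax type of the tax it references:

CREATE OR REPLACE TRIGGER trg_impot_doc_depot_aiu
AFTER INSERT OR UPDATE OF ID_IMPOT ON T_IMPOT_DOC_DEPOT
FOR EACH ROW
BEGIN
  -- Flag the tax type of the tax referenced by the new document row.
  UPDATE T_IMPOT_TYPE tit
  SET    tit.IN_T_IMPOT_DOC_DEPOT = 'Y'
  WHERE  tit.ID_TYPE_IMPOT = (SELECT imp.ID_TYPE_IMPOT
                              FROM   T_IMPOT imp
                              WHERE  imp.ID_IMPOT = :NEW.ID_IMPOT)
  AND    tit.IN_T_IMPOT_DOC_DEPOT = 'N';  -- skip rows already flagged
END;
/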

4. Create a delete/update trigger attached to the T_IMPOT_DOC_DEPOT table, so that the IN_T_IMPOT_DOC_DEPOT column of T_IMPOT_TYPE is updated automatically,

[Image: delete/update trigger on T_IMPOT_DOC_DEPOT]
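A possible sketch, written here as a statement-level trigger so that re-checking the remaining references does not hit the mutating-table restriction; the trigger name and the clearing logic are assumptions:

CREATE OR REPLACE TRIGGER trg_impot_doc_depot_adu
AFTER DELETE OR UPDATE OF ID_IMPOT ON T_IMPOT_DOC_DEPOT
BEGIN
  -- Clear the flag for tax types that no longer have any tax
  -- referenced by T_IMPOT_DOC_DEPOT.
  UPDATE T_IMPOT_TYPE tit
  SET    tit.IN_T_IMPOT_DOC_DEPOT = 'N'
  WHERE  tit.IN_T_IMPOT_DOC_DEPOT = 'Y'
  AND    NOT EXISTS (SELECT 1
                     FROM   T_IMPOT imp
                     JOIN   T_IMPOT_DOC_DEPOT idd ON idd.ID_IMPOT = imp.ID_IMPOT
                     WHERE  imp.ID_TYPE_IMPOT = tit.ID_TYPE_IMPOT);
END;
/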

5. Create an insert/update trigger attached to the T_DOC table, so that the IN_T_DOC column of T_IMPOT_TYPE is fed automatically from then on,

[Image: insert/update trigger on T_DOC]

6. Create a delete/update trigger attached to the T_DOC table, so that the IN_T_DOC column of T_IMPOT_TYPE is updated automatically,


[Image: delete/update trigger on T_DOC]

7. Replace the original query in the application code with the rewritten one.

8. Schedule a daily readjustment task to compensate for any failures of the triggers caused by maintenance or other specific operations or constraints likely to bypass the trigger mechanism.

The task could be in the form of a shell script, as shown below:


[Image: shell script implementing the daily readjustment task]
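A sketch of such a task, assuming a cron-scheduled script whose connection string is a placeholder; it simply recomputes both flags with the same logic as the manual back-fill of step 2:

#!/bin/sh
# Daily readjustment of the two flag columns of T_IMPOT_TYPE (step 8).
# DB_CREDENTIALS is a placeholder, e.g. "app_owner/<password>@PRODDB".
sqlplus -s "$DB_CREDENTIALS" <<'EOF'
WHENEVER SQLERROR EXIT FAILURE
UPDATE T_IMPOT_TYPE tit
SET    tit.IN_T_IMPOT_DOC_DEPOT =
         CASE WHEN EXISTS (SELECT 1
                           FROM   T_IMPOT imp
                           JOIN   T_IMPOT_DOC_DEPOT idd ON idd.ID_IMPOT = imp.ID_IMPOT
                           WHERE  imp.ID_TYPE_IMPOT = tit.ID_TYPE_IMPOT)
              THEN 'Y' ELSE 'N' END,
       tit.IN_T_DOC =
         CASE WHEN EXISTS (SELECT 1
                           FROM   T_IMPOT imp
                           JOIN   T_DOC doc ON doc.ID_IMPOT = imp.ID_IMPOT
                           WHERE  imp.ID_TYPE_IMPOT = tit.ID_TYPE_IMPOT)
              THEN 'Y' ELSE 'N' END;
COMMIT;
EXIT
EOF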
Note that all steps are ideally handled by a DBA profile, except for step 7, which requires the collaboration of the development team or, in the worst case, the software publisher.
The proposed solution is obviously spectacular, but its implementation remains highly intrusive: it requires a modification of the application code that the software publisher is generally unwilling to make. It is certain that a rigorous development phase, with an uncompromising query optimization stage, would have led to the rejection of a non-scalable execution plan likely to deal a fatal blow to the system's ability to sustain an increase in load. It is therefore very clear that developers should not underestimate the SQL query optimization step in their work, at the risk of prematurely compromising the quality of the service provided by the software.

 
