Effective Dynamic SQL


Martin Büchi

Avaloq Evolution AG, Allmendstrasse 140, 8027 Zürich, Switzerland, http://www.avaloq.com

March 1, 2005

Abstract. Static SQL is more effective than dynamic SQL in most cases. Thus, we show how to turn dynamic SQL into static SQL where possible. Where dynamic SQL is truly appropriate, effectiveness is achieved by the same means as for static SQL: bulk operations, minimization of parsing, and proper use of bind variables. We achieve these goals by using bind variables with positional and by-name binding, system contexts, our own PL/SQL variable package, dynamic PL/SQL, and other creative means. We also show when dynamic SQL without bind variables or with differentiating comments should be preferred to force the creation of a new execution plan or use star transformation, which is not supported with bind variables. A hard parse is necessary to use information from actual parameters and dynamic sampling. If the number of execution plans as a function of variable values and dynamic sampling is small and a classification can cheaply be performed, multiple static SQL statements or different security policy predicates may provide the best of both worlds. We demonstrate our findings using PL/SQL. Many principles apply to other languages, e.g., Java using Statement and PreparedStatement.

1 Introduction PL/SQL provides two means for invoking SQL, static and dynamic SQL. Static SQL is directly embedded inside PL/SQL. Figure 1 shows a PL/SQL function with static SQL in italics. The necessary setup to create the table T, which we use throughout the paper, is in the appendix. The PL/SQL compiler parses the static SQL and checks its validity when the function is created. Furthermore, the dependency of the function cnt_static on the table T is stored in Oracle’s data dictionary. The function is automatically invalidated if the table is altered or dropped. Thus, most syntactic, type, and access problems of stored SQL are detected at compile time.

Dynamic SQL is stored inside a PL/SQL string, which is passed to the SQL engine. The function cnt_dynamic of Figure 2 provides the same functionality as cnt_static, using dynamic SQL. The SQL string is stored in l_stmt and passed to the SQL engine using execute immediate. The function cnt_dynamic allows us to pass an additional condition, such as ‘ and y = 1.’ Adding such an arbitrary condition at runtime is only possible using dynamic SQL. Whenever the structure of the SQL statement is not defined at compile time, dynamic SQL is required. Dynamic SQL is not checked at compile time and no dependency on table T is registered, as can be seen by issuing the following query:


SQL> column name format a30
SQL> column referenced_name format a30
SQL> select name, referenced_name
  2  from user_dependencies
  3  where name in ('CNT_STATIC', 'CNT_DYNAMIC')
  4  order by name;

NAME                           REFERENCED_NAME
------------------------------ ------------------------------
CNT_DYNAMIC                    SYS_STUB_FOR_PURITY_ANALYSIS
CNT_DYNAMIC                    STANDARD
CNT_STATIC                     SYS_STUB_FOR_PURITY_ANALYSIS
CNT_STATIC                     T
CNT_STATIC                     STANDARD

Thus, an error in the SQL statement in the hard coded part, e.g., selet instead of select, or in the condition argument, e.g., a Java programmer writing == instead of = in the above condition, would not be detected until the statement is executed.

1.1 Exception handling The function cnt_static is unlikely to fail. An ill-formed security policy predicate and an ORA-1555 snapshot too old are some of the unlikely errors. Hence, there is no absolute need for an exception handler, unless you are still on 8i or 9i, where there is no dbms_utility.format_error_backtrace to get the full stack trace at the time of the error in an outer exception handler. Because of the concatenation of i_condition, the function cnt_dynamic, like most code with dynamic SQL, is much more likely to fail and definitely needs an exception handler.

1.2 Reformatting of static SQL We add an origin comment to each and every SQL statement to make it easier to find the corresponding source during tuning. In the case of dynamic SQL, this can be a regular comment.

create or replace function cnt_static(
  i_x number
) return pls_integer
as
  l_cnt pls_integer;
begin
  select count(*) /*+cmt:cnt_static*/
  into l_cnt
  from t
  where x = i_x;
  return l_cnt;
end cnt_static;

Figure 1: Function cnt_static with static SQL


Static SQL is preprocessed by the PL/SQL compiler. This includes normalization of white space, capitalization of key words, splitting of records (Section 6.1), and, starting with Oracle 9.2.0.5, the removal of comments:

SQL> variable x number
SQL> exec :x := cnt_static(1);
SQL> exec :x := cnt_dynamic(1);
SQL> select sql_text
  2  from sys.v_$sqlarea s
  3  where sql_text not like '%sys.v_$sql_plan%'
  4  and s.sql_text like '%/*%cnt_%ic*/%';

SQL_TEXT
-----------------------------------------------------------------------
SELECT COUNT(*) /*+cmt:cnt_static*/ FROM T WHERE X = :B1
/*cnt_dynamic*/ select count(*) from t where x = :i_x

Thus, one has to resort to pseudo-hints as used in the example, name an alias after the enclosing PL/SQL unit, or add a dummy condition like ‘and 'cnt_static' is not null.’ Note that Oracle only takes the first hint block into consideration, thus the pseudo-hint needs to come second or be added to the first block.

Some tools, like the Hotsos Profiler [1], that aggregate statements differing in literal values only do not aggregate statements with different comments, (pseudo) hints, alias names, or dummy conditions. We will return to the topic of artificial distinction marks in the context of bind variable peeking.

create or replace function cnt_dynamic(
  i_x number
 ,i_condition varchar2 := null
) return pls_integer
as
  l_cnt pls_integer;
  l_stmt varchar2(4000);
begin
  l_stmt := '/*cnt_dynamic*/ select count(*) from t where x = :i_x'
            || i_condition;
  -- log (debug) l_stmt and i_x
  execute immediate l_stmt into l_cnt using i_x;
  return l_cnt;
exception
  when others then
    -- log (error) l_stmt and i_x, stack trace
    raise;
end cnt_dynamic;

Figure 2: Function cnt_dynamic with dynamic SQL


1.3 Parsing Parsing, which is a major factor in scalability (locks, size, and ageing out of cache in shared pool), is one of the main differences between static and dynamic SQL. We first examine the execution of static SQL and then point out the differences for dynamic SQL. The first time a (normalized) static SQL statement is executed, the PL/SQL engine passes it to the SQL engine. The SQL engine checks whether exactly the same SQL statement (character for character) has already been executed by this session and is still cached (cache size defined by the parameter session_cached_cursors). If this is the case, the expensive optimization step can be skipped. This is sometimes called a softer soft parse [3], which can be performed without obtaining many locks that inhibit scalability.

If the statement is not in the local cache, the shared pool is checked. If the statement can be found there we need to also check for a semantic (e.g., does T also reference the same table or a local table in another schema?) and environment match (e.g., same value for sort_area_size). If we find a match, the SQL engine can skip the expensive optimization step. This is called a soft parse and requires locks on the shared pool.

If the statement is not found in either cache, it is fully analyzed and an execution plan is computed. This is a very expensive operation, which requires many locks and is a major scalability inhibitor. A parse operation that includes the optimization step is called a hard parse.

In all cases, the statement is executed and a pointer to the statement is returned to PL/SQL. PL/SQL keeps the pointer in a least-recently-used cache and only explicitly closes the referenced cursor if the session would otherwise exceed the limit of open cursors (parameter open_cursors). On subsequent invocations of the same static SQL, PL/SQL passes the pointer directly to the SQL engine. Thus, even the (softer) soft parse can be avoided. This optimization is not applicable to ref cursors [6]. Please refer to [8] or to Chapter 5 of [3] for a more detailed description of statement processing.

Every time a native dynamic SQL statement is executed in 9i, its text is passed to the SQL engine. The SQL engine then processes it the same way, i.e., it first checks the local cache, then the shared pool, and then performs a hard parse if the statement is not found in either cache. Thus, it is impossible to avoid the (softer) soft parse with native dynamic SQL:

SQL> alter system flush shared_pool;
SQL> declare
  2    l_x number;
  3  begin
  4    for i in 1..10 loop
  5      l_x := cnt_static(999);
  6      l_x := cnt_dynamic(999);
  7    end loop;
  8  end;
  9  /
SQL> select substr(sql_text,1,40) txt, version_count vc, parse_calls pc
  2  from sys.v_$sqlarea s
  3  where sql_text not like '%sys.v_$sql_plan%'
  4  and s.sql_text like '%/*%cnt_%ic*/%';

TXT                                              VC         PC
---------------------------------------- ---------- ----------
SELECT COUNT(*) /*+cmt:cnt_static*/ FROM          1          1
/*cnt_dynamic*/ select count(*)                   1         10

In Oracle 10g on the other hand, PL/SQL introduces an optimization that avoids the soft parse for native dynamic SQL [6]:

TXT                                              VC         PC
---------------------------------------- ---------- ----------
SELECT COUNT(*) /*+cnt_static*/ FROM ALL          1          1
/*cnt_dynamic*/ select count(*)                   1          1

Our experiments indicate that this cache attaches the pointer to the source line (like static SQL). That is, the exact same (dynamic) SQL executed in a different place still incurs a soft parse.

l_stmt := '/*cnt_dynamic*/ select count(*) from t where x = '
          || i_x || i_condition;
execute immediate l_stmt into l_cnt;

Figure 3: Variation of function cnt_dynamic with string concatenation instead of bind variable

For Oracle 9i, the package dbms_sql provides for the execution of dynamic SQL without repeated soft parses by storing a pointer to the cursor. Unfortunately, this package has other drawbacks as described in Section 4.

1.4 Bind variables As illustrated in Section 1.2, PL/SQL automatically turns PL/SQL variables into bind variables. In the example, ‘x = i_x’ is transformed into ‘X = :B1’. In dynamic SQL, we have to explicitly introduce bind variables, such as :i_x. If we call cnt_dynamic twice with different values for i_x, we only incur a soft parse for the second dynamic SQL statement in 9i and no parse in 10g because we have used a bind variable. If, on the other hand, we change the function cnt_dynamic to use string concatenation rather than a bind variable for i_x (Figure 3), we pay for a hard parse for each different value of i_x.

Using literals instead of bind variables or different comments in dynamic SQL causes expensive hard parses. For example, if we call the function cnt_dynamic twice with identical values for i_x and with values for i_condition that differ only in a comment we get two hard parses:

SQL> declare
  2    l pls_integer;
  3  begin
  4    l := cnt_dynamic(1, ' and y = 1 /*cmt1*/');
  5    l := cnt_dynamic(1, ' and y = 1 /*cmt2*/');
  6  end;
  7  /
SQL> select sql_text
  2  from sys.v_$sqlarea s
  3  where s.sql_text not like '%sys.v_$sql_plan%'
  4  and s.sql_text like '%select%/*cmt%';

SQL_TEXT
-----------------------------------------------------------------------
/*cnt_dynamic*/ select count(*) from t where x = :i_x and y= 1 /*cmt1*/
/*cnt_dynamic*/ select count(*) from t where x = :i_x and y= 1 /*cmt2*/

Instead of querying the V$ view, we could also use SQL tracing and see both statements with the comment ‘Misses in library cache during parse: 1.’

In practice, bind variables are also a prerequisite for stored outlines together with cursor_sharing = exact.

1.5 SQL injection The function cnt_dynamic introduces a possible security hole. Assume that the function is stored in schema K and that execute on cnt_dynamic is granted to L. Furthermore assume that K contains a function side_effect that performs an operation (inside an autonomous transaction) that L has no right to execute. By invoking cnt_dynamic with i_condition set to ‘ and side_effect = 1’ L uses an exploit called SQL injection to invoke the forbidden function. See [5] for more information on SQL injection.
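One way to close this hole can be sketched as follows, assuming that callers only ever need an optional filter on the column y. The function then accepts a value rather than SQL text. The name cnt_dynamic_y and the parameter i_y are hypothetical and not part of the original scripts. Both branches reference the same two bind variables, so a single execute immediate with a fixed using clause serves both.

```sql
create or replace function cnt_dynamic_y(
  i_x number
 ,i_y number := null   -- hypothetical optional filter value
) return pls_integer
as
  l_cnt  pls_integer;
  l_stmt varchar2(4000);
begin
  l_stmt := '/*cnt_dynamic_y*/ select count(*) from t where x = :i_x';
  if i_y is not null then
    -- the filter is fixed text; the value is bound, never concatenated
    l_stmt := l_stmt || ' and y = :i_y';
  else
    -- always-true placeholder that keeps the number of binds at two
    l_stmt := l_stmt || ' and (1 = 1 or :i_y is null)';
  end if;
  -- log (debug) l_stmt, i_x, and i_y
  execute immediate l_stmt into l_cnt using i_x, i_y;
  return l_cnt;
exception
  when others then
    -- log (error) l_stmt, i_x, and i_y, stack trace
    raise;
end cnt_dynamic_y;
/
```

Because no caller-supplied text reaches the statement, a call such as cnt_dynamic_y(1, 1) cannot smuggle in a function call. The price is flexibility: only the conditions foreseen by the author can be added.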

1.6 Overview String concatenation as used in Figure 3 is inappropriate. Yet, it may lead to better performance because of bind variable peeking as explained in Section 2. We will show how to achieve it all: almost minimal parsing, security, and good performance.

Most of this paper also applies to Java and other programming languages. A Statement in Java corresponds to dynamic SQL without bind variables. A PreparedStatement and a CallableStatement provide the same functionality as dynamic SQL using dbms_sql, namely bind variables and pointers into the shared pool to avoid redundant soft parses, but no Java compile time checking of the SQL string and no dependency tracking as offered by PL/SQL. This is true for Java outside the database as well as Java stored procedures.

This paper applies to Oracle 9i and 10g unless explicitly stated. We assume a ‘standard’ parameterization and ‘normal’ system statistics in our examples. For brevity we have removed empty lines and lines with status messages from examples. All scripts are available for download at http://www.abo.fi/~mbuechi/.

2 Bind variable peeking The rule-based optimizer (RBO) does not consider the values of literals. Thus, it generates the same execution plan whether literals or bind variables are used. For the cost-based optimizer (CBO), on the other hand, the situation is different. In 8i, the CBO may generate different plans for literals and bind variables. Oracle 9i introduces bind variable peeking, which again produces identical plans.


2.1 Oracle 8i In Oracle 8i, the CBO treats the values of bind variables as unknowns during the optimization and execution plan generation of a hard parse. Thus, it has less information when bind variables are used and may not generate the same ideal execution plans as if literals were used. The following example shows that the CBO cannot profit from histograms when bind variables are used:

variable a number
exec :a := 999
alter session set sql_trace=true;
select sum(y) from t where x = :a /*999*/;
select sum(y) from t where x = 999;
exec :a := 1
select sum(y) from t where x = :a /*1*/;
select sum(y) from t where x = 1;
disconnect

The trace file shows that the two statements with literals are processed differently. Namely, an index range scan is used for x = 999 and a full table scan for x = 1. The bind variable queries, on the other hand, both incur a full table scan. This indicates that the CBO did not know that the actual bind variable value is 999, for which an index range scan is cheaper. Note the comments in the bind variable examples to force hard parses.

On the positive side, Oracle 8i can already perform partition elimination based on the values of bind variables.

2.2 Oracle 9i and 10g Starting with Oracle 9i, the CBO peeks at the value of bind variables. That is, it uses the values of the first invocation, which causes the hard parse, to optimize the query. Thus, we get the same execution plan as with identical literals. This can be easily shown by rerunning the histogram example above.

The problem, as documented, e.g., in [3], is that the execution plan remains fixed for all subsequent invocations. That is, if we run the above example with the comments removed, we will use the suboptimal index range scan for a = 1 as well. If we flush the shared pool and rerun the example with a = 1 first and then with a = 999, we will use a full table scan to access the single row with x equal to 999.

The Oracle Performance Tuning Guide and Reference [10] describes this as follows:

"The query optimizer peeks at the values of user-defined bind variables on the first invocation of a cursor. This feature lets the optimizer determine the selectivity of any WHERE clause condition, as well as if literals have been used instead of bind variables. On subsequent invocations of the cursor, no peeking takes place, and the cursor is shared, based on the standard cursor-sharing criteria, even if subsequent invocations use different bind values.

"When bind variables are used in a statement, it is assumed that cursor sharing is intended and that different invocations are supposed to use the same execution plan. If different invocations of the cursor would significantly benefit from different execution plans, then bind variables may have been used inappropriately in the SQL statement. Bind peeking works for a specific set of clients, not all clients."


There are several enhancement requests that would allow the user to force bind variable peeking every time, e.g., with a hint or a parameter, and the generation of a new execution plan if the assumptions made for the existing execution plans are not satisfied. This way, a statement would have an associated decision tree that could dynamically grow as new value constellations are used. Unfortunately, no such feature is available yet.

Bind variable peeking can be turned off in 9i and 10g by setting _optim_peek_user_binds to false or setting optimizer_features_enable to 8.1.7 or lower, thereby reverting to the behavior described in Section 2.1. We do not recommend these settings.

The execution plan can also be fixed to a form that is independent of the first set of actual bind values. This is done either with stored outlines or hints, or by passing the values through application contexts or PL/SQL functions instead of bind variables, which hides them from the optimizer.

The parameter cursor_sharing, which is often mentioned in the context of bind variable peeking, actually has just the opposite effect: it turns some literals into bind variables. One reason for the confusion might be that with cursor_sharing = similar, additional dynamic sampling may occur.

Judging from comments on OTN, Metalink, and other sites, many people consider bind variable peeking to be evil in its current form and some even suggest reverting to dynamic SQL with literals. We believe that (optional) every-time peeking would be an improvement. But already as it is, bind variable peeking can provide a performance improvement at the cost of some case distinction in PL/SQL. We will revisit the topic of bind variable peeking in the following sections.
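The case distinction mentioned above could look roughly like the following sketch. It assumes, in line with the histogram example, that x = 1 is the one frequent value in T and that all other values are rare. The function name sum_y and the two pseudo-hint labels are hypothetical. Because the two statements differ in their text, each gets its own cursor and its own hard parse, and each peeks only at values from its own class.

```sql
create or replace function sum_y(
  i_x number
) return number
as
  l_sum number;
begin
  if i_x = 1 then   -- assumption: 1 is the known frequent value of t.x
    select sum(y) /*+cmt:sum_y_frequent*/
      into l_sum
      from t
     where x = i_x;
  else              -- assumption: every other value is rare
    select sum(y) /*+cmt:sum_y_rare*/
      into l_sum
      from t
     where x = i_x;
  end if;
  return l_sum;
end sum_y;
/
```

The classification must be cheap and stable for this to pay off. If the set of frequent values changes with the data, a hard-coded test like the one above will drift out of date.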

2.2.1 Explain plan Since explain plan and autotrace, which uses explain plan, do not perform bind variable peeking, they should be used with care. If the actual statement uses literals or bind variables, explain plan must be used with representative literals. If the actual statement uses bind variables and there is no single set of representative bind values, bind variables have been used inappropriately and one of the solutions described below should be adopted. If the actual statement uses sys contexts or PL/SQL variable packages to pass values, explain plan may be used with the actual functions or bind variables.

SQL traces (but not with the explain option) and sys.v_$sql_plan always correctly display the plan that was actually used.

2.2.2 Bugs in peeking In Oracle 9i, bind variable peeking does not always result in the same execution plan as literals due to a number of bugs (e.g., 3045275, 3132098, 3349903, and 3668224). In Oracle 10g, there are fewer bugs related to peeking.

While writing this paper, we discovered a few additional bugs. In dbms_sql, there is no peeking (9.2.0.5, 9.2.0.6, 10.1.0.3 on AIX and Solaris; bug 4179405). There is no peeking for Java stored procedures on some platforms (e.g., 9.2.0.5, 9.2.0.6, 10.1.0.3 on AIX and Solaris). With older 9i JDBC drivers peeking does not work either, except against 10g using the OCI driver.


The execution plans with peeking sometimes look slightly different, e.g., an additional filter, under 9i. This does not seem to affect performance, even though sys.v_$sql_plan shows that the filter is applied after a full table scan.

3 Static SQL In an OLTP system 99% of all SQL should be static SQL inside PL/SQL. Most applications that use less static SQL have a potential for scalability and performance improvement right there. Let’s summarize the benefits of static SQL and elaborate on some of the details:

1. Most errors are found at compile time, i.e., before the program is run. The PL/SQL compiler checks static SQL when the embedding PL/SQL unit is created or altered. This check does not preclude all run-time errors, especially in invoker-rights procedures where object names, such as T, are resolved first against the invoker’s schema. Likewise security policy predicates, which are appended at run time, may render a static SQL statement invalid.

2. Dependencies from objects referenced in static SQL are tracked in sys.dependency$ (all_dependencies). If a referenced object is dropped or altered, the referencing unit is invalidated. Again, the error is detected at compile time, i.e., before the program is run.

3. Static SQL avoids the overhead and scalability problems of soft parses by caching cursors. Of course, this is only an advantage if the same SQL is executed multiple times in the same session.

4. Static SQL automatically uses bind variables, thereby avoiding hard parses for different values and the problem of SQL injection. All PL/SQL variables used inside static SQL are automatically turned into bind variables (see footnote 2).

5. Static SQL provides additional bulk functionality and—starting with 10g—automatic bulk prefetching for implicit cursors. Oracle 10g automatically bulk fetches 100 rows at a time from implicit cursors [6]. Our testing showed that this only happens if plsql_debug is false and plsql_optimize_level is 2.

Static SQL, like dynamic SQL with bind variables, makes it easier to tune an application because the SQL area (sys.v_$sqlarea) and trace files are not cluttered with entries that differ only by literal values.
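PL/SQL variables become bind variables, but PL/SQL function calls written inside the statement stay there and are evaluated by the SQL engine. A minimal sketch of precomputing such a call is given below. The helper key_for and the function cnt_by_code are hypothetical names that stand for any function depending only on constants and PL/SQL variables.

```sql
-- hypothetical helper: derives a numeric key from a code such as 'K999'
create or replace function key_for(i_code varchar2) return number
as
begin
  return to_number(substr(i_code, 2));
end key_for;
/
create or replace function cnt_by_code(i_code varchar2) return pls_integer
as
  l_x   number;
  l_cnt pls_integer;
begin
  -- evaluate the PL/SQL function once, outside the SQL statement;
  -- writing "where x = key_for(i_code)" instead would leave the call
  -- inside the statement for the SQL engine to evaluate
  l_x := key_for(i_code);
  -- l_x is turned into a bind variable, so no callback is needed
  select count(*) /*+cmt:cnt_by_code*/
    into l_cnt
    from t
   where x = l_x;
  return l_cnt;
end cnt_by_code;
/
```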

Footnote 2: Note that PL/SQL does not precompute PL/SQL functions that depend only on constants and PL/SQL variables and turn their results into bind variables. Instead, the SQL engine needs to make callbacks with expensive context switches. It pays to precompute the results of such functions, assign them to a PL/SQL variable, and use the latter inside static SQL. This is often overlooked.

3.1 Limitations of static SQL The following are the limitations of static SQL:

1. The structure of the query must be known at compile time. We cannot use static SQL if we don’t know the table we query, the columns we want to return, or the structure of the where clause. It is not possible to add an arbitrary condition to the where clause of static SQL as done in Figure 2 by using dynamic SQL.

2. There are a few implementation restrictions and bugs concerning static SQL. For example, the Oracle 8i PL/SQL parser does not know analytic functions and all versions of the Oracle wrap tool up to 10.1.0.3 require the option edebug=wrap_new_sql to correctly wrap PL/SQL with certain SQL statements (bug 4091699).

3. Star transformations cannot be applied with bind variables [9]. Of course, this limitation does not apply to static SQL without bind variables, e.g., static SQL inside generated PL/SQL (Section 7). Since star transformations are mostly used in DSS applications, where bind variables are counterproductive, this is not a very serious limitation.

4. Bind variables make applications scalable and guard against SQL injection, but also force the different invocations to share a single execution path, which might not be ideal for all of them. This problem can be solved without resorting to literals in most cases as described below.

5. All referenced objects must exist before the enclosing PL/SQL unit can be compiled. For invoker rights procedures, this means that the definer schema needs to contain the referenced objects even if the procedure is never executed against them. Because of this restriction, static SQL and PL/SQL cannot be used for code that needs to run on multiple versions of Oracle and, based on a version check, uses features that are not available in all versions. As an alternative to dynamic SQL, a package with different bodies for each supported Oracle version should also be considered. Conditional compilation with if blocks based on constants [6] did not help to overcome this restriction in our tests.

6. The PL/SQL unit is invalidated if a referenced object is altered. This may not always be desired, e.g., when dropping a partition with old data.

7. It is not possible to perform DDL. E.g., the creation of a table from inside PL/SQL requires dynamic SQL.

4 Dynamic SQL Dynamic SQL is SQL that is stored inside a PL/SQL string and submitted to the SQL engine using native dynamic SQL or the package dbms_sql. The reasons for using dynamic SQL are documented under limitations of static SQL (Section 3.1). In other words, we only use dynamic SQL if static SQL is not possible or desirable.

There are two forms of native dynamic SQL. Execute immediate is showcased in Figure 2. Figure 4 provides an example of the second form, opening an untyped ref cursor. This is the closest approximation to the implicit bulk fetch from implicit cursors in 10g. Since there is no automatic bulk fetching for untyped ref cursors, we have to perform it manually. The bulk alternative is to bulk collect all rows at once using execute immediate. The advantage of the ref cursor and limit clause approach is that the memory consumption is independent of the result set size. The disadvantage is that we keep a cursor open and are susceptible to a snapshot too old error if the loop operation takes very long. Connor McDonald et al. provide an example of how to determine the optimal bulk size [7].
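For comparison with Figure 4, the all-at-once alternative can be sketched as the following anonymous block. The comment label bulk_all and the sample value 999 are chosen for illustration only, and logging and exception handling are omitted for brevity.

```sql
declare
  type t_tab is table of t%rowtype index by pls_integer;
  l_list t_tab;
  l_stmt varchar2(4000);
  l_x    number := 999;   -- sample value for illustration
begin
  l_stmt := '/*bulk_all*/ select * from t where x = :i_x';
  -- a single call loads all matching rows into l_list, so memory
  -- consumption grows with the size of the result set
  execute immediate l_stmt bulk collect into l_list using l_x;
  for i in 1..l_list.count loop
    null;   -- do something with l_list(i)
  end loop;
end;
/
```

No cursor stays open while the rows are processed, which removes the exposure to long-running fetch loops. This only makes sense when the result set is known to be small enough to fit comfortably in session memory.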

Figure 5 provides an example of using the package dbms_sql, which was introduced in Oracle 7.1. With dbms_sql, the open, parse, bind, execute, fetch (not used in example), and close steps have to be performed explicitly similar to the usage of a Java PreparedStatement. Native dynamic SQL, on the other hand, merges them as far as possible.

The following are the main differences between native dynamic SQL and dbms_sql:


1. With dbms_sql, the same statement can be executed multiple times without incurring additional soft parses. With native dynamic SQL, this is only possible in 10g. Repeated execution of the same statement, however, raises the question of whether the same result couldn’t be achieved more efficiently using a single bulk statement.

2. For single executions, native dynamic SQL uses slightly fewer latches and is faster than dbms_sql. In 10g, native dynamic SQL was always faster in our tests. In 9i, it was almost always the case.

3. Native dynamic SQL uses positional binding, dbms_sql uses by-name binding. Therefore, it is easier with dbms_sql to write generic code that can handle an arbitrary number of bind variables. We show below how the same can be achieved using native dynamic SQL.

create or replace procedure dynamic_cursor(
  i_x varchar2
 ,i_condition varchar2 := null
)
as
  type t_tab is table of t%rowtype index by pls_integer;
  l_list t_tab;
  l_cur sys_refcursor;
  l_stmt varchar2(4000);
begin
  l_stmt := '/*dynamic_cursor*/ select * from t where x = :i_x'
            || i_condition;
  -- log (debug) l_stmt and i_x
  open l_cur for l_stmt using i_x;        -- open cursor
  loop
    fetch l_cur                           -- bulk fetch from cursor
      bulk collect into l_list limit 100;
    for i in 1..l_list.count loop
      null;                               -- do something
      -- e.g., execute immediate l_list(i).c using l_list(i).y
    end loop;
    exit when l_cur%notfound;             -- check whether more elements
  end loop;
  close l_cur;                            -- close cursor
exception
  when others then
    -- log (error) l_stmt and i_x, stack trace
    if l_cur is not null and l_cur%isopen then
      close l_cur;        -- close cursor, in case implicit closing fails
    end if;
    raise;
end dynamic_cursor;

Figure 4: Dynamic ref cursor with explicit bulk fetching


4. Bind variable peeking is used with native dynamic SQL, but not with dbms_sql. This difference is not documented, but shown with the following experiment. Note that for native dynamic SQL, peeking takes place and an index range scan is chosen for x = 999:

SQL> declare
  2    l_sum pls_integer;
  3    l_x pls_integer;
  4    procedure db(
  5      i_val pls_integer
  6    )
  7    as
  8      l_cur pls_integer;
  9      l_num_rows_processed pls_integer;
 10    begin
 11      l_cur := dbms_sql.open_cursor;
 12      dbms_sql.parse(
 13        c => l_cur
 14       ,statement => '/*peek db' || i_val || '*/'
 15                     || 'select sum(y) from t where x = :x'
 16       ,language_flag => dbms_sql.native
 17      );
 18      dbms_sql.bind_variable(
 19        c => l_cur
 20       ,name => 'x'
 21       ,value => i_val
 22      );
 23      l_num_rows_processed := dbms_sql.execute(l_cur);
 24      dbms_sql.close_cursor(l_cur);
 25    end db;
 26  begin
 27    l_x := 1;
 28    execute immediate '/*peek d1*/ select sum(y) from t
 29      where x = :x' into l_sum using l_x;
 30    l_x := 999;
 31    execute immediate '/*peek d999*/ select sum(y) from t
 32      where x = :x' into l_sum using l_x;
 33    db(1);
 34    db(999);
 35  end;
 36  /
PL/SQL procedure successfully completed.
SQL> set pagesize 100
SQL> break on stmt skip 1
SQL> column operation format a20
SQL> column options format a20
SQL> select substr(s.sql_text, 1, 20) stmt
  2  ,p.operation
  3  ,p.options
  4  from sys.v_$sql_plan p
  5  ,sys.v_$sqlarea s
  6  where p.address = s.address
  7  and s.sql_text not like '%sys.v_$sql_plan%'
  8  and s.sql_text like '%/*peek%'
  9  order by stmt, id;

STMT                 OPERATION            OPTIONS
-------------------- -------------------- --------------------
/*peek d1*/ select s SELECT STATEMENT
                     SORT                 AGGREGATE
                     TABLE ACCESS         FULL

/*peek d999*/ select SELECT STATEMENT
                     SORT                 AGGREGATE
                     TABLE ACCESS         BY INDEX ROWID
                     INDEX                RANGE SCAN

/*peek db1*/select s SELECT STATEMENT
                     SORT                 AGGREGATE
                     TABLE ACCESS         FULL

/*peek db999*/select SELECT STATEMENT
                     SORT                 AGGREGATE
                     TABLE ACCESS         FULL

5. If the number and types of outputs are unknown at compile time, the decision which approach to use depends upon who fetches the data. If a client application, such as a Java program, fetches it, native dynamic SQL must be used. If the data is fetched from PL/SQL, dbms_sql is the easier solution. Again, the approach of the embedding dynamic PL/SQL block works well in most cases.

6. It takes more code to use dbms_sql.

4.1 Best Practices The above examples already show many best practices for dynamic SQL:

1. Double check whether you should be doing the work at all. Don’t return 1M rows if the user will only look at the first screenful.

2. Use bind variables to avoid hard parses and SQL injection, unless you gain an explicit benefit from not using them: star transformations and bugs in peeking. Use distinguishing comments rather than literals to create a specific execution plan through a hard parse. With dynamic PL/SQL, assign all bind variables to PL/SQL variables in the initial declare block to improve readability.

3. If you don’t use bind variables, guard yourself against SQL injection. This is especially true if a concatenated text element may be entered by a user as opposed to being loaded from a table that has been populated during parameterization. If you use literals for enabling star transformations, this check is simple: number literals are only allowed to contain digits and separators, character literals are not allowed to contain closing quotation marks, etc. If the arguments contain structure, no generic check can be provided.


4. Use bulk operations. Starting with 9i, most bulk operations are also possible in native dynamic SQL.

create or replace package types as
  type t_int_tab is table of pls_integer index by pls_integer;
end types;
/
create or replace procedure dbms_sql_exp(
  i_stmt varchar2
 ,i_bind_name varchar2
 ,i_bind_list types.t_int_tab
)
as
  l_cur pls_integer;
  l_stmt varchar2(32767);
  l_num_rows_processed pls_integer;
begin
  l_cur := dbms_sql.open_cursor;
  l_stmt := '/*dbms_sql_exp*/ ' || i_stmt;
  -- log (debug), l_stmt
  dbms_sql.parse(
    c => l_cur
   ,statement => l_stmt
   ,language_flag => dbms_sql.native
  );
  for i in 1..i_bind_list.count loop
    begin
      -- log (debug) i, i_bind_list(i)
      dbms_sql.bind_variable(
        c => l_cur
       ,name => i_bind_name
       ,value => i_bind_list(i)
      );
      l_num_rows_processed := dbms_sql.execute(l_cur);
    exception             -- nested block because i not visible outside
      when others then    -- unless declared in outer scope (ugly)
        -- log (error) l_stmt, i, i_bind_list(i)
        raise;
    end;
  end loop;
  dbms_sql.close_cursor(l_cur);
exception
  when others then
    -- log (error) l_stmt
    if l_cur is not null then
      dbms_sql.close_cursor(l_cur);
    end if;
    raise;
end dbms_sql_exp;

Figure 5: Procedure dbms_sql_exp

5. Assign the dynamic SQL to a PL/SQL variable first and do not concatenate it directly inside the dynamic SQL statement (execute immediate, open ref cursor, dbms_sql.parse). This makes it easier to step through the code with a debugger, such as JDeveloper, and to log it first.

6. Log the dynamic SQL before executing it. It is best to use a package with multiple logging levels, such as log4plsql, and log the statement as well as all bind variables before execution. Invalid dynamic SQL statements of an application that does not perform proper logging can be found in the raw trace file, provided tracing was on at the critical moment or the problem can be reproduced easily.

7. Embed an origin comment inside the statement. Because the dynamic SQL statement is usually computed at runtime, it may be hard to find the PL/SQL text that executes it.

8. Add an exception handler after the dynamic SQL. Dynamic SQL is much more likely to fail than static SQL. Log all potentially useful information inside this handler.

9. Double check whether you shouldn’t be using static SQL instead. If the structure is not known at compile time, maybe it is fixed during parameterization—before run time—and PL/SQL with static SQL can be generated (Section 7.2).

10. Last but not least, effective dynamic SQL can only make your application fly if paired with good database and algorithm design.

Except for bulk operations, all best practices not only make the code faster but also more readable. Bulk operations can lead to dramatic speed improvements, but they often make the code slightly more obscure. That’s why we only use bulk operations if there is a clear benefit, i.e., lots of data, frequent execution, or business critical functionality.
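The following is a minimal sketch of how practices 3 and 5 to 8 might look together in the rare case where a literal is concatenated on purpose. The function name cnt_by_literal and the error number -20001 are made up for illustration, dbms_output stands in for a real logging package such as log4plsql, and the table T from the appendix is assumed to exist at run time.

```plsql
create or replace function cnt_by_literal(
  i_x_literal varchar2
) return pls_integer
as
  l_stmt varchar2(32767);
  l_cnt  pls_integer;
begin
  -- Practice 3: accept only unsigned integer literals. ltrim removes all
  -- digits; anything left over (sign, blank, quote, keyword) is rejected.
  if i_x_literal is null
     or ltrim(i_x_literal, '0123456789') is not null
  then
    raise_application_error(-20001, 'not an unsigned integer literal');
  end if;

  -- Practices 5 and 7: build the statement in a variable and embed
  -- an origin comment.
  l_stmt := '/*cnt_by_literal*/ select count(*) from t where x = '
            || i_x_literal;

  -- Practice 6: log before executing (dbms_output is only a stand-in).
  dbms_output.put_line('executing: ' || l_stmt);

  execute immediate l_stmt into l_cnt;
  return l_cnt;
exception
  -- Practice 8: log everything potentially useful, then re-raise.
  when others then
    dbms_output.put_line('failed: ' || l_stmt || ' / ' || sqlerrm);
    raise;
end cnt_by_literal;
/
```

A call such as cnt_by_literal('999') goes through, whereas cnt_by_literal('1 or 1=1') is rejected before any SQL is assembled.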

5 Dynamic PL/SQL Using execute immediate and dbms_sql, we can also execute dynamic PL/SQL. The following trivial example, where nop is an empty procedure, omits exception handling and other best practices for brevity:

SQL> begin
  2    execute immediate '
  3      declare
  4        l_cnt pls_integer;
  5      begin
  6        for i in 1..2 loop
  7          select count(*)
  8          into l_cnt
  9          from t
 10          where x = 999;
 11          dbms_output.put_line(''Count: '' || l_cnt);
 12          dbms_lock.sleep(5);
 13        end loop;
 14        insert into t(x) values(997);
 15        commit;
 16        insert into t(x) values(998);
 17        insert into t(x) values(-1);
 18      end;';
 19  exception
 20    when others then null;
 21  end;

The dynamic PL/SQL is executed as follows:

1. The anonymous PL/SQL block and the bind variable values are passed to the SQL engine.

2. The SQL engine sets an implicit savepoint.

3. Oracle checks the shared pool for a match (soft parse). If one is found, the next step is skipped.

4. If no match is found, the PL/SQL compiler compiles the block (hard parse). Note that there is no such concept as bind variable peeking because the PL/SQL compiler generates the same code independent of the actual bind variable values. If the PL/SQL statement contains static SQL as in our example, the PL/SQL compiler checks this also. The static SQL will be analyzed a second time by the SQL engine upon execution.

5. The PL/SQL block is executed.

6. If the block exits abnormally, i.e., with an exception, a rollback to the savepoint set in Step 2 is performed, unless the PL/SQL block performed a rollback to an earlier point. If the PL/SQL block performed a commit, a rollback to the implicit savepoint set immediately after the last commit is performed in case the block ends abnormally. The reason for this semantic difference compared to static PL/SQL is that the call goes through the SQL engine. The latter executes statements atomically, i.e., either the whole statement completes successfully or no changes are enacted. As described, the atomicity principle collides with the durability principle in case the PL/SQL block contains a commit and the latter wins. On the other hand, the SQL engine does not guarantee read consistency for the complete PL/SQL block as it does for a single SQL statement. Both facts are shown by the following example, where the count is increased (no read consistency) and 997 is inserted and committed but 998 not (rollback to implicit savepoint). The behavior without a commit can be shown by removing the commit and inserting 997 before the anonymous block.

-- Session 1
SQL> set serveroutput on
-- example from above
 22  /
Count: 1
Count: 2
SQL> select x from t where x between 997 and 998;
         X
----------
       997
SQL> rollback;
Rollback complete.
SQL> select x from t where x between 997 and 998;
         X
----------
       997

-- Session 2: start immediately after session 1
SQL2> insert into t(x) values(999);
SQL2> commit;

Dynamic PL/SQL incurs an expensive double context switch. We use Tom Kyte’s runstats [4] to show the additional cost of dynamic PL/SQL over static PL/SQL:

SQL> exec runstats_pkg.rs_start
SQL> begin
  2    for i in 1..10000 loop
  3      execute immediate 'begin nop(:x); end;' using i;
  4    end loop;
  5  end;
  6  /
SQL> exec runstats_pkg.rs_middle
SQL> begin
  2    for i in 1..10000 loop
  3      nop(i);
  4    end loop;
  5  end;
  6  /
SQL> exec runstats_pkg.rs_stop
Run1 ran in 91 hsecs
Run2 ran in 13 hsecs
run 1 ran in 700% of the time

Name                                  Run1      Run2      Diff
LATCH.cache buffers chains              91        59       -32
LATCH.checkpoint queue latch           384         0      -384
LATCH.library cache pin alloca      20,015        17   -19,998
LATCH.library cache pin             80,053        55   -79,998
LATCH.library cache                 90,085        95   -89,990
LATCH.shared pool                   90,087        89   -89,998
STAT...parse count (total)          10,004         4   -10,000
STAT...opened cursors cumulati      10,004         4   -10,000
STAT...recursive calls              10,001         1   -10,000

Run1 latches total versus runs -- difference and pct
      Run1      Run2      Diff       Pct
   280,763       354  -280,409  #######%

As for dynamic SQL, bind variables should be used in dynamic PL/SQL to avoid non-scalable hard parses and to avoid PL/SQL injection. In our experience, anonymous PL/SQL blocks become most readable if all bind variables are assigned to constants in the initial declare block. There are two somewhat esoteric cases where bind variables should not be used:

1. If the PL/SQL block contains static SQL that could profit from a star transformation and the bind variables would be used in this SQL statement.

2. If the PL/SQL block contains static SQL that would reference the bind variables and bind variable peeking does not work properly for the specific SQL due to a bug.

Instead of using literals to produce an optimal execution plan for an embedded static SQL statement, a unique comment/hint/alias name should be used in the statement. With this approach, we are protected from SQL injection.

As for dynamic SQL, bind variables can only be used for values but not for names of procedures or lengths of types used in the PL/SQL block.
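As an illustration of the readability advice in this section, the sketch below assigns every bind variable to a constant in the initial declare block of the dynamic PL/SQL, so that the body only refers to named constants. The function repeat_sketch and the place holder names are hypothetical and chosen only to keep the example self-contained.

```plsql
create or replace function repeat_sketch(
  i_text  varchar2
 ,i_times pls_integer
) return varchar2
as
  l_stmt varchar2(32767);
  l_res  varchar2(4000);
begin
  l_stmt := '
    declare /*repeat_sketch*/
      -- all input binds are captured once, up front
      c_text  constant varchar2(100) := :i_text;
      c_times constant pls_integer   := :i_times;
      l_out   varchar2(4000);
    begin
      for i in 1..c_times loop
        l_out := l_out || c_text;
      end loop;
      :o_res := l_out;
    end;';
  -- Binding is positional in order of first appearance:
  -- :i_text, :i_times, :o_res.
  execute immediate l_stmt using in i_text, in i_times, out l_res;
  return l_res;
end repeat_sketch;
/
```

Only values travel through the place holders here; a procedure name or a type length would still have to be concatenated into the block text.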

6 More on bind variables There is still a lot more to say about bind variables in the context of dynamic SQL and PL/SQL.

6.1 Types Since dynamic SQL and dynamic PL/SQL go through the SQL engine, only bind variables of SQL types can be used. PL/SQL types, including boolean and user defined PL/SQL types, are excluded. This can present a major inconvenience for dynamic PL/SQL, but can be overcome with our variable package (Section 6.3.3). In some cases it is also a restriction of dynamic SQL compared to static SQL:

  declare
    l_t t%rowtype;
  begin
    l_t := ...;
    insert into t values l_t; -- OK
    -- execute immediate 'insert into t values :x' using l_t; -- Not OK
  end;

This works for static SQL because the PL/SQL compiler transforms it into a bind of the individual record components, which is not possible in the case of dynamic SQL because the PL/SQL compiler does not parse the SQL text:

SQL> select s.sql_text
  2  from sys.v_$sqlarea s
  3  where s.sql_text not like '%sys.v_$sql_plan%'
  4  and s.sql_text like '%INSERT INTO T%';
SQL_TEXT
-------------------------------------------------
INSERT INTO T VALUES (:B1,:B2,:B3)

Our tests indicate that varchar2 values longer than 4,000 bytes can be passed to dynamic PL/SQL blocks. However, this is not guaranteed by the documentation.

PL/SQL collections of types with SQL equivalents can be used in dynamic bulk operations:

  declare
    type t_int_tab is table of pls_integer index by pls_integer;
    l_int_list t_int_tab;
  begin
    -- fill l_int_list
    forall i in 1..l_int_list.count
      execute immediate 'insert into t(x) values(:x)' using l_int_list(i);
  end;

PL/SQL records may be used only in the into clause:

  declare
    l_t t%rowtype;
  begin
    execute immediate 'select * from t where rownum=1' into l_t;
  end;

If we need to pass collections and records, we must use SQL collections and objects as best approximations.
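A minimal sketch of the last point: a schema-level (SQL) nested table type can be bound as a whole to native dynamic SQL and queried with the table operator. The type t_num_list_sketch and the function count_in_list_sketch are hypothetical names introduced only for this illustration. The query merely counts the elements so that the example does not depend on T; in practice the collection would typically be joined to a table.

```plsql
create or replace type t_num_list_sketch as table of number;
/
create or replace function count_in_list_sketch(
  i_list t_num_list_sketch
) return pls_integer
as
  l_stmt varchar2(32767);
  l_cnt  pls_integer;
begin
  -- The cast tells the SQL engine the collection type of the bind.
  l_stmt := '/*count_in_list_sketch*/
    select count(*)
    from table(cast(:list as t_num_list_sketch))';
  execute immediate l_stmt into l_cnt using i_list;
  return l_cnt;
end count_in_list_sketch;
/
```

For example, count_in_list_sketch(t_num_list_sketch(997, 998, 999)) binds three values with a single place holder.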

6.2 Duplicate place holders Native dynamic SQL treats duplicate place holders with the same name inconsistently. For dynamic SQL, they are treated like two different place holders:

  begin
    execute immediate 'insert into t(x, y) values(:1, :1)' using 4, 5;
  end;

On the other hand, dynamic PL/SQL binds repeated place holders with the same name to the same value. The following analyzes the table K in schema K:

  begin
    execute immediate 'begin dbms_stats.gather_table_stats(:1, :1); end;' using 'K';
  end;

We strongly suggest that you do not use duplicate place holders in native dynamic SQL or PL/SQL to avoid any confusion due to this inconsistency.

Since dbms_sql uses by-name binding, all place holders with the same name are bound to the same value. That is, the behavior of dynamic SQL and PL/SQL in dbms_sql is like that of native dynamic PL/SQL.

6.3 Variable number of bind variables For some dynamic SQL statements, the number and types of bind variables varies from execution to execution. For example, we have a generic lookup application where the user can search several views using equality and like constraints on up to 30 fields. The generated SQL only contains conditions on the fields that the user did not leave empty. This and similar tasks can easily be handled with dbms_sql. The place holders can be given meaningful names and the bind values can be stored inside a PL/SQL collection and bound by-name inside a loop over this collection.
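The dbms_sql approach described above might be sketched as follows. The function lookup_cnt_sketch, its two optional search parameters, and the use of dual as the searched "view" are assumptions made to keep the example self-contained; a real lookup would cover up to 30 fields and several bind types.

```plsql
create or replace function lookup_cnt_sketch(
  i_eq   varchar2 := null  -- equality constraint; null = field left empty
 ,i_like varchar2 := null  -- like constraint; null = field left empty
) return number
as
  type t_bind_rec is record (name varchar2(30), val varchar2(4000));
  type t_bind_tab is table of t_bind_rec index by pls_integer;
  l_binds t_bind_tab;
  l_stmt  varchar2(32767) :=
    '/*lookup_cnt_sketch*/ select count(*) from dual where 0 = 0';
  l_cur   pls_integer;
  l_cnt   number;
  l_rows  pls_integer;

  -- Appends one condition and remembers its named bind value.
  procedure add_cond(i_cond varchar2, i_name varchar2, i_val varchar2) is
    l_rec t_bind_rec;
  begin
    l_stmt := l_stmt || ' and ' || i_cond;
    l_rec.name := i_name;
    l_rec.val  := i_val;
    l_binds(l_binds.count + 1) := l_rec;
  end add_cond;
begin
  if i_eq is not null then
    add_cond('dummy = :eq_val', 'eq_val', i_eq);
  end if;
  if i_like is not null then
    add_cond('dummy like :like_val', 'like_val', i_like);
  end if;

  l_cur := dbms_sql.open_cursor;
  dbms_sql.parse(l_cur, l_stmt, dbms_sql.native);
  -- Bind by name inside a loop over the collection.
  for i in 1..l_binds.count loop
    dbms_sql.bind_variable(l_cur, l_binds(i).name, l_binds(i).val);
  end loop;
  dbms_sql.define_column(l_cur, 1, l_cnt);
  l_rows := dbms_sql.execute_and_fetch(l_cur);
  dbms_sql.column_value(l_cur, 1, l_cnt);
  dbms_sql.close_cursor(l_cur);
  return l_cnt;
exception
  when others then
    if l_cur is not null then
      dbms_sql.close_cursor(l_cur);
    end if;
    raise;
end lookup_cnt_sketch;
/
```

The generated statement contains only the conditions for the fields that were filled in, and the number of bind calls follows the collection rather than a fixed using clause.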

Since native dynamic SQL has other advantages over dbms_sql as described in Section 4, we show here how the same can be accomplished with native dynamic SQL in various ways.

6.3.1 Dummy place holders The first option is to write the invoking statement with the maximum number of bind variables and include dummy place holders into the generated SQL. For example, assume that we need to handle up to 3 place holders. The statement then becomes:

  execute immediate l_stmt using l_1, l_2, l_3;

Assume that the user only entered a constraint on x, but not on y and c. We still generate a statement with three place holders:

  l_stmt := 'insert into t2 select * from t where x=:b1 and (0=0 or :b2 is null or :b3 is null)';

This statement can be executed using the above generic statement with three bind variables. If the maximum number of bind variables is very high, we may create multiple versions so that we only have to add dummies up to the next limit, e.g., 5, 10, 20:

  if l_bind_list.count <= 5 then
    execute immediate l_stmt using l_bind_list(1),..., l_bind_list(5);
  elsif l_bind_list.count <= 10 then
    execute immediate l_stmt using l_bind_list(1),..., l_bind_list(10);
  elsif l_bind_list.count <= 20 then
    execute immediate l_stmt using l_bind_list(1),..., l_bind_list(20);
  else
    -- use dbms_sql to bind inside a loop
    -- must name binds in order of occurrence, e.g., :b1, :b2, etc.
  end if;

A shortcoming of this approach is that all bind variables must be of the same type, e.g., varchar2. It is suggested that explicit conversions be used inside the dynamic SQL statement. Implicit conversions can lead to various problems, e.g., an index not being used because an implicit conversion is applied to a table column rather than the bind variable.

Of course, it would theoretically be possible to have bind variables of different types, e.g., varchar2, number, and date. But already with three types we would need 3^10 (59,049) static versions to support all combinations of ten bind variables.

With dbms_sql, on the other hand, this can be easily done. If the SQL statement is defined before we need to bind the variables, the package already provides all that is required. In a more complex setup, where parts of the dynamic SQL statement and bind variable values are computed by different subroutines, we use a utility package to store the bind value list until its elements are bound. The list is of the following (variant) record type. The field name contains the bind variable name, the field btype defines the type of the bind variable and, herewith, which of the remaining fields is used.

  type t_bind_var_rec is record (
    name      varchar2(30)
   ,btype     pls_integer
   ,bnumber   number
   ,bdate     date
   ,bvarchar2 varchar2(32767)
   ,bblob     blob
   ,bclob     clob
   ,bbfile    bfile
   ,burowid   urowid
  );

6.3.2 Application contexts Instead of using bind variables, values can be passed through application contexts [2]. Using application contexts, we can avoid hard parses as with bind variables. We can emulate by-name binding in native dynamic SQL and we do not get bind variable peeking. The following provides an example:

  create or replace procedure set_ctx_val(
    i_name varchar2
   ,i_val  varchar2
  ) as
  begin
    dbms_session.set_context('bind_val', i_name, i_val);
  end set_ctx_val;
  /
  create or replace context bind_val using set_ctx_val;

  declare
    l_cnt pls_integer;
  begin
    set_ctx_val('c', 'foo');
    set_ctx_val('x', '999');
    execute immediate '
      select count(*)
      from t
      where x = to_number(sys_context(''bind_val'', ''x''))
      and c = sys_context(''bind_val'', ''c'')'
      into l_cnt;
  end;
  /

Our standard query shows that the CBO does not peek into the application context:

SQL> break on stmt skip 1
SQL> column operation format a20
SQL> column options format a20
SQL> select substr(s.sql_text, 1, 20) stmt
  2        ,p.operation
  3        ,p.options
  4  from sys.v_$sql_plan p
  5      ,sys.v_$sqlarea s
  6  where p.address = s.address
  7  and s.sql_text not like '%sys.v_$sql_plan%'
  8  and s.sql_text like '%sys_context(''bind_val%'
  9  order by stmt, id;
STMT                 OPERATION            OPTIONS
-------------------- -------------------- ------------
select count(*)      SELECT STATEMENT
                     SORT                 AGGREGATE
                     FILTER
                     TABLE ACCESS         FULL

Unfortunately, we can only pass string values through application contexts. The explicit conversions shown in the example are not absolutely needed, but generally recommended. For dates, we suggest the Julian day format, e.g.:

  set_ctx_val('d', to_char(l_date, 'j.sssss'));

and

  to_date(sys_context(''bind_val'', ''d''), ''j.sssss'')

Furthermore, we need to spell out the name of the application context every time and cannot use a constant defined in a package. This is, of course, a general limitation of dynamic SQL. In static SQL, constants are passed as (implicit) bind variables. We could use a constant-valued function that returned the name of the context. This way, we would receive an error message instead of a null value in case of a typo. If we are willing to pay the cost of this context switch, we should use our own variable package instead.

6.3.3 Variable Package Our own variable package overcomes the restriction of application contexts to varchar2. We store bind variable values inside a package in an index-by table for each bind variable type:

  create or replace package body lib_bind_var as
    type t_number_list is table of number index by varchar2(200);
    b_number_tab t_number_list;
    -- other types

    procedure set_number(
      i_name varchar2
     ,i_val  number
    ) as
    begin
      b_number_tab(i_name) := i_val;
    end set_number;

    function get_number(
      i_name varchar2
    ) return number as
    begin
      return b_number_tab(i_name);
    end get_number;
  end lib_bind_var;
  /

  declare
    l_cnt pls_integer;
  begin
    lib_bind_var.set_varchar2('c', 'bar');
    lib_bind_var.set_number('x', 999);
    execute immediate '
      select count(*)
      from t
      where x = lib_bind_var.get_number(''x'')
      and c = lib_bind_var.get_varchar2(''c'')'
      into l_cnt;
  end;
  /

As for application contexts, we do not get the equivalent to bind-variable peeking with our variable package:

SQL> break on stmt skip 1
SQL> column operation format a20
SQL> column options format a20
SQL> select substr(s.sql_text, 1, 20) stmt
  2        ,p.operation
  3        ,p.options
  4  from sys.v_$sql_plan p
  5      ,sys.v_$sqlarea s
  6  where p.address = s.address
  7  and s.sql_text not like '%sys.v_$sql_plan%'
  8  and s.sql_text like '%lib_bind_var.get%'
  9  order by stmt, id;
STMT                 OPERATION            OPTIONS
-------------------- -------------------- ---------
select count(*)      SELECT STATEMENT
                     SORT                 AGGREGATE
                     FILTER
                     TABLE ACCESS         FULL

Of course, we get bind variable peeking in static SQL inside dynamic PL/SQL if we assign the bind values to local PL/SQL variables and use the latter inside the nested SQL.

Using our variable package, we can even pass collections (table functions). We recommend that a cardinality hint be used in this case because the optimizer does not know the expected size of the rowset.
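To make the collection case concrete, here is a rough sketch of a variable package that stores SQL collections and hands them back through a function usable in the table operator. All names (t_id_list_sketch, lib_bind_list_sketch, the alias b) are invented for this illustration and are not the package described in the paper; the cardinality value 3 simply matches the size of the list in the example.

```plsql
create or replace type t_id_list_sketch as table of number;
/
create or replace package lib_bind_list_sketch as
  procedure set_list(i_name varchar2, i_list t_id_list_sketch);
  function get_list(i_name varchar2) return t_id_list_sketch;
end lib_bind_list_sketch;
/
create or replace package body lib_bind_list_sketch as
  -- one index-by table keyed by bind variable name
  type t_list_tab is table of t_id_list_sketch index by varchar2(200);
  b_list_tab t_list_tab;

  procedure set_list(i_name varchar2, i_list t_id_list_sketch) as
  begin
    b_list_tab(i_name) := i_list;
  end set_list;

  function get_list(i_name varchar2) return t_id_list_sketch as
  begin
    return b_list_tab(i_name);
  end get_list;
end lib_bind_list_sketch;
/
declare
  l_stmt varchar2(32767);
  l_cnt  pls_integer;
begin
  lib_bind_list_sketch.set_list('ids', t_id_list_sketch(997, 998, 999));
  -- The hint tells the optimizer how many rows to expect from the
  -- table function, which it cannot know by itself.
  l_stmt := '
    select /*+ cardinality(b 3) */ count(*)
    from table(lib_bind_list_sketch.get_list(''ids'')) b';
  execute immediate l_stmt into l_cnt;
  dbms_output.put_line('Count: ' || l_cnt);
end;
/
```

In a real statement the table function would usually be joined to a table such as T, which is where the cardinality estimate starts to influence the join order and access path.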

For dynamic PL/SQL, the variable package is always to be preferred over application contexts (Section 6.3.2) to pass bind variables because the latter incurs a context switch to SQL.

That is, the access of sys_context(x, y) inside PL/SQL results in a select sys_context(x, y) from sys.dual as can be seen in the trace file. We ran the following experiment on AIX under 9.2.0.5:

SQL> set serveroutput on
SQL> exec runstats_pkg.rs_start
SQL> begin
  2    for i in 1..10000 loop
  3      set_ctx_val('c', 'foo');
  4      execute immediate '
  5        declare
  6          l_c varchar2(30) := sys_context(''bind_val'', ''c'');
  7        begin
  8          null;
  9        end;';
 10    end loop;
 11  end;
 12  /
SQL> exec runstats_pkg.rs_middle
SQL> begin
  2    for i in 1..10000 loop
  3      lib_bind_var.set_varchar2('c', 'foo');
  4      execute immediate '
  5        declare
  6          l_c varchar2(30) := lib_bind_var.get_varchar2(''c'');
  7        begin
  8          null;
  9        end;';
 10    end loop;
 11  end;
 12  /
SQL> exec runstats_pkg.rs_stop
Run1 ran in 211 hsecs
Run2 ran in 149 hsecs
run 1 ran in 141.61% of the time

Run1 latches total versus runs -- difference and pct
      Run1      Run2      Diff       Pct
   520,533   211,018  -309,515   246.68%

For passing data to dynamic PL/SQL, we can also use package variables directly. However, we prefer the abstraction of the bind variable package in most cases. Its overhead is small.

6.4 Generic API We can provide a generic API to set bind variable values and to create retrieval functions. This way, we can postpone the decision whether we want to use literals, application contexts, or the bind variable package. This is especially useful if generic functions produce SQL fragments that can be combined into different statements by various consumers:

 1  declare
 2    l_cnt  pls_integer;
 3    l_sql  lib_bind_var.t_sql;      -- pls_integer
 4    l_x    lib_bind_var.t_bind_var; -- pls_integer
 5    l_c    lib_bind_var.t_bind_var; -- pls_integer
 6    l_stmt varchar2(4000);
 7  begin
 8    l_sql := lib_var.new(lib_bind_var.c_ctx);
 9    l_c := lib_var.set_varchar2(l_sql, 'bar');
10    l_x := lib_var.set_number(l_sql, 999, lib_bind_var.c_literal);
11    l_stmt := '
12      select count(*)
13      from t
14      where x = ' || lib_bind_var.get_number(l_x) || '
15      and c = ' || lib_bind_var.get_varchar2(l_c);
16    execute immediate l_stmt into l_cnt;
17    lib_var.remv(l_sql);
18  end;

On line 8, we create a new query object. We define that by default, bind variables are passed via application contexts. The return subtype of pls_integer points into an index-by table that stores the relevant information.

On line 9, we introduce the first bind variable, which is stored in an application context. The corresponding retrieval function on line 15 returns the string ‘sys_context('bind_val', 'b1$1').’ Note that the bind variable name is generated by the system.

On line 10, we introduce the second variable, for which we specify that we want it to be included as literal into the statement. Thus, the function on line 14 returns the string ‘999.’ Consequently, l_stmt is assigned the following value on line 11:

  select count(*) from t where x = 999 and c = sys_context('bind_val', 'b1$1')

The package lib_var can be turned into an even more powerful lib_sql if we use it to assemble the SQL statement. For this purpose, we pass the individual elements (e.g., select, from, and where clauses or even individual elements thereof) corresponding to the nodes of the parse tree. This way we use the package to assemble complex queries from elements and to add comments and hints as described below.

6.5 Profiting from bind variable peeking and dynamic sampling Ideally, Oracle would reuse the same cursor exactly for those bind variable values that if peeked would produce the same execution plan. Likewise, it would be ideal if Oracle would perform dynamic sampling exactly in those cases where the results would lead to different plans.

Often, we as programmers have additional knowledge that we can use to achieve this goal. In our running example, we know that for x between 0 and 9 we would like one execution plan and another one for all other values. We can easily achieve this by adding the line with the case expression to Figure 2:

  l_stmt := '/*cnt_dynamic*/'
    || case when i_x between 0 and 9 then '/*f*/' else '/*i*/' end
    || 'select count(*)
        from t
        where x = :i_x' || i_condition;

This approach guarantees that two executions with identical values for i_condition share the same execution plan if both have values in the range 0 to 9 or if both have values outside this range. By inserting a simple comment, we force a hard parse with bind variable peeking. We do not, however, constrain the CBO as we would if we inserted a full-table scan hint for values in the range 0 to 9. This would be a bad solution because for some values of i_condition the CBO may rightfully choose another execution plan.

Sometimes we don’t have the information at hand, but its computation or lookup is much cheaper than the penalty of inappropriate reuse or pessimistic overparsing.

6.6 Forcing hard parses The cardinality hint is one of our favorite hints because it provides the CBO with additional information rather than forcing a plan on it. For example, we have an application where users can submit tasks for background processing. The tasks as well as a handle to the parameters are passed through a queue. The actual parameters are stored in a table per parameter type, e.g., one for numbers, strings, and dates each. The same task may be executed with 1 or with 10,000 values for a given parameter. Different execution plans are required for these cases, yet we want to minimize the number of hard parses. The solution is to include a cardinality hint with the closest power of 10 rather than the exact cardinality. That is, we tell Oracle that the parameter table contains 1, 10, 100, 1,000, or 10,000 values. This way queries with similar cardinalities can share the same execution plan.
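The rounding to a power of 10 could be sketched as below. The function names card_bucket_sketch and cardinality_hint_sketch are hypothetical, and "closest" is interpreted here on a logarithmic scale, which is an assumption; other rounding rules would serve the same purpose.

```plsql
create or replace function card_bucket_sketch(
  i_cnt number
) return number
as
begin
  -- Treat missing or non-positive counts as the smallest bucket.
  if i_cnt is null or i_cnt < 1 then
    return 1;
  end if;
  -- Round the decimal logarithm, then go back to a power of 10.
  return power(10, round(log(10, i_cnt)));
end card_bucket_sketch;
/
create or replace function cardinality_hint_sketch(
  i_alias varchar2  -- table alias; must come from trusted code, not user input
 ,i_cnt   number    -- actual number of parameter values
) return varchar2
as
begin
  return '/*+ cardinality(' || i_alias || ' '
         || to_char(card_bucket_sketch(i_cnt)) || ') */';
end cardinality_hint_sketch;
/
```

A task with 8 parameter values and one with 12 would both receive the hint text for 10 rows and could therefore share one cursor, while a task with 9,000 values would get its own plan.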

Different security policy predicates for different users (non-static policy functions) can also be used to enable sharing of execution plans among users that invoke SQL statements with the same quantity structure only. Context-sensitive and dynamic policy functions, which are new in 10g, might even be used to get different execution plans for the same query and user. This may be an option if an application context or our variable package is used for bind variable passing (Section 6.3.2).

Our origin comments also lead to additional hard parses if the otherwise identical SQL appears in multiple PL/SQL sources. We consider this to be a good thing in most cases. First of all, a well structured application should rarely have the same SQL in multiple locations. If it does, it’s likely that we cannot guarantee that all occurrences are called with bind variable values that should share the same execution plan.

7 Turning dynamic SQL into dynamic PL/SQL When the same statement needs to be executed multiple times, it is especially important to use an efficient solution. Bulk operations should be the first choice whenever possible. If bulk operations are not possible, dbms_sql and turning dynamic SQL into (dynamic) PL/SQL should be considered. We have already discussed the parse once, execute many functionality of dbms_sql. Here we look at dynamic PL/SQL.

7.1 Anonymous blocks The first option is to dynamically execute a single anonymous PL/SQL block instead of executing the same dynamic SQL multiple times. For example, instead of

  for i in 1..10000 loop
    execute immediate 'insert into ' || i_tab_name || ' values(:1)' using i;
    nop(i);
  end loop;

we run

  execute immediate '
    begin
      for i in 1..10000 loop
        insert into ' || i_tab_name || ' values(i);
        nop(i);
      end loop;
    end;';

The dynamic PL/SQL block is not only faster, it also uses significantly fewer latches (9.2.0.5):

  Run1 ran in 276 hsecs
  Run2 ran in 216 hsecs
  run 1 ran in 127.78% of the time

  Run1 latches total versus runs -- difference and pct
        Run1      Run2      Diff       Pct
     292,954   114,495  -178,459   255.87%

The difference is not as pronounced on 10.1.0.3 because the softer soft parse is avoided (Section 1.3):

  Run1 ran in 227 hsecs
  Run2 ran in 218 hsecs
  run 1 ran in 104.13% of the time

  Run1 latches total versus runs -- difference and pct
        Run1      Run2      Diff       Pct
     155,867   107,165   -48,702   145.45%

The disadvantage of the second approach is that the dependency on the procedure nop is not tracked by Oracle.

7.2 Packages and object types If the same statement is executed multiple times nonconsecutively, it may be advantageous to dynamically create a PL/SQL unit. This solution is most appropriate if generation happens infrequently, e.g., upon customization of the application or once a day automatically. This approach shines if the generated package or object contains multiple procedures or functions that would otherwise require multiple dynamic SQL blocks.

PL/SQL units can also be generated and used on the fly. If we want to statically reference the dynamically created or replaced object, the latter cannot be a function or a procedure because we would receive an ‘ORA-04068: existing state of packages has been discarded.’ A package brings us further. We can create the package specification at compile time and the body at run time. This works until we want to redefine it in a database call in which we have already used it. At this point we are blocked indefinitely waiting for the library cache pin. Furthermore, the approach does not work if multiple sessions try to do the same.

Using object types, we can accomplish our mission. We can create a new subtype while hanging on to a reference of another subtype in the same session. However, if we try to do the same in two different sessions, the session trying to create the subtype gets blocked. The solution is to create enough subtypes at compile time and at run time only create the bodies.

At compile time, we define the parent object type:

  create or replace type t_base_obj is object (
    id integer -- must have at least one attribute
   ,not instantiable member procedure rm(
      i_param pls_integer
    )
  ) not final not instantiable
  /

We also define ‘enough’ subtypes and a sequence for generating unique type names:

  declare
    c_num_types constant pls_integer := 1000;
  begin
    for i in 1..c_num_types loop
      execute immediate '
        create type t_' || i || '_obj under t_base_obj(
          overriding member procedure rm(
            i_param pls_integer
          )
        )';
    end loop;
    execute immediate 'create sequence type_id_seq minvalue 1 maxvalue '
      || c_num_types || ' cycle nocache';
  end;
  /

Furthermore, we need an autonomous function to create the type body at run time without committing the main transaction.

  create or replace function create_obj(
    i_obj_def varchar2
  ) return t_base_obj as
    pragma autonomous_transaction;
    l_name varchar2(30);
    l_obj  t_base_obj;
  begin
    select 't_' || type_id_seq.nextval || '_obj' into l_name from dual;
    execute immediate '
      create or replace type body ' || l_name || ' as
        overriding member procedure rm(
          i_param pls_integer
        ) as ' || i_obj_def || '
      end;';
    execute immediate 'begin :b1 := new ' || l_name || '(null); end;'
      using out l_obj;
    return l_obj;
  end create_obj;
  /

The actual procedure first creates the custom-made type body and then uses it inside the loop:

  create or replace procedure n_dyn_obj_proc(
    i_tab_name varchar2 := 'test'
  ) as
    l_obj t_base_obj;
  begin
    l_obj := create_obj('
      begin
        insert into ' || i_tab_name || ' values(i_param);
      end rm;
    ');
    for i in 1..10000 loop
      l_obj.rm(i);
      nop(i);
    end loop;
  end n_dyn_obj_proc;
  /

The advantage of this solution over the dynamic PL/SQL block is that the dependency on nop is tracked by Oracle.

For a single run, the anonymous block solution presented in Section 7.1 is faster and requires fewer latches than the generated object type approach. Generation of PL/SQL units is more suitable if they are used multiple times.

8 Static SQL Revisited In Section 7 we have come close to a fully static solution by generating on-the-fly PL/SQL containing static SQL. Here, we consider the completely static approach. If in most cases we use one of few options, we can code those statically and use dynamic code only for the exceptional cases:

  create or replace procedure n_static80(
    i_tab_name varchar2 := 'test'
  ) as
  begin
    if i_tab_name = 'test' then
      for i in 1..10000 loop
        insert into test values(i);
        nop(i);
      end loop;
    elsif i_tab_name = ... then
      ...
    else
      ... -- dynamic case for rare cases, use approach from above
    end if;
  end n_static80;
  /

Connor McDonald et al. [7] have another good example of this.

8.1 Bind variable peeking revisited As for dynamic SQL (Section 6.5), we can use multiple versions of essentially the same static SQL with artificial distinctions or cardinality hints to force the creation of multiple execution plans. It is important to keep in mind that starting with Oracle 9.2.0.5, the PL/SQL compiler removes comments from static SQL.

create or replace procedure tt_nds_bulk(
  i_loop pls_integer
 ,i_cnt  pls_integer
) as
  type t_tt_y_list is table of tt.y%type index by pls_integer;
  l_tt_y_tab t_tt_y_list;
begin
  for x in 1..i_loop loop
    l_tt_y_tab.delete;
    for y in 1..i_cnt loop
      l_tt_y_tab(y) := y;
    end loop;
    forall i in 1..l_tt_y_tab.count
      execute immediate
        'insert into tt(x, y, c) values(:x, :y, ''abcdefghijklmnopqrstuvwxyz'')'
        using x, l_tt_y_tab(i);
  end loop;
end tt_nds_bulk;

Figure 6: Insert using native dynamic SQL with bulk operations

8.2 Other approaches Another typical pattern for turning dynamic SQL into static SQL is to convert one-table-per-day approaches into one-partition-per-day solutions. Many databases contain a table per day, e.g., to store activities of the day or current account balances. Populating and accessing these tables requires dynamic SQL or PL/SQL. We can turn this into static SQL by using a list partitioned table with a partition per day instead. This approach has two drawbacks. On the technical side, we cannot drop a partition without invalidating all PL/SQL units that statically reference the table. On the business side, Oracle charges extra for the partitioning option.

9 Performance comparisons Performance comparisons of the various approaches depend heavily on the statement being executed. Most prominently, the results depend on the ratio of parse vs execution time for a single execution and the ratio of the execution vs context switch time. We have performed a benchmark using the insertion of 10 times 5,000 rows and parallelized this on 1 to 16 processes. Figure 6 shows the implementation using native dynamic SQL bulk operations. The remaining implementations are available in the accompanying scripts. We ran the experiment under Oracle 9.2.0.6 and 10.1.0.3 on a 12-CPU IBM p5-570 with AIX 5.3 attached to a Clariion CX-400. The results are in Figure 7 and Figure 8, where NDS stands for native dynamic SQL.

The implementations can clearly be categorized into three groups. The bulk implementations come closest to the ideal of low constant response time represented by a low horizontal line.

[Line chart: response time [s] (axis from 0 to 45) versus number of processes (1, 2, 4, 8, 16). Series: dbms_sql, dbms_sql bulk, NDS PL/SQL, NDS bind, NDS bulk, NDS application context, NDS literal, NDS object, NDS variable package, Static, Static bulk.]

Figure 7: Performance comparisons on 10.1.0.3

The implementations that use binding lie in the middle. The implementation with literals, which causes hard parses, comes in dead last.

The largest differences between 9i and 10g are in dbms_sql bulk operations, the application context approach, and literals. Whereas the former two are significantly improved under 10g, we pay an even greater penalty for literals under 10g. Under 9.2.0.5, the object-based approach is also significantly slower than under 9.2.0.6 and 10g.

In our example, there is no significant scalability difference between static SQL and native dynamic SQL with bind variables because of cursor caching and the elimination of soft parses in native dynamic SQL under 10g (Section 1.3).

Because bind variable peeking is irrelevant for the trivial execution plan of our example, there is no difference between those approaches that permit bind variable peeking and those that don’t.

Non-bulk dbms_sql does not perform well in our example.

Figure 8: Performance comparisons on 9.2.0.6 (x-axis: Processes, 1 to 16; y-axis: Response time [s], 0 to 40; series: dbms_sql, dbms_sql bulk, NDS PL/SQL, NDS bind, NDS bulk, NDS application context, NDS literal, NDS object, NDS variable package, Static, Static bulk)

10 Summary and Conclusions

Effective dynamic SQL is not much different from effective static SQL. Good performance starts with reduction of work (e.g., return only one screen full of data at a time), good database design, bulk operations, and bind variables. Remembering Hoare's dictum that premature optimization is the root of all evil in programming, tricky optimizations should only be performed if we have a performance problem (and a cost effective solution) in spite of adhering to the above principles.

Before using dynamic SQL, double-check whether static SQL, with its benefits of compile-time checking, dependency tracking, and full use of bind variables, is really not an option. Can we generate PL/SQL modules during configuration? Is partitioning a solution? Can we at least handle 80% of all cases with static SQL?

If dynamic SQL is really needed, we should remember the above principles: Use bulk operations when possible. Remember that even 10g does not implicitly bulk fetch from a ref cursor, but that we can do so using the limit clause. Unless we require star queries or run into a peeking bug, we use bind variables. Optimal execution plans for different bind values can be forced with differentiating comments. Queries with similar value distributions should receive the same comment to reduce the number of hard parses. Please refer to Section 4.1 for a full list of best practices.
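Two of these principles, the explicit bulk fetch from a ref cursor and differentiating comments, can be combined as in the following sketch. It assumes the table T and the procedure nop from the appendix. The classification of the value 999 as rare and the comment texts are hypothetical choices made only for this illustration.

```sql
declare
  type t_num_tab is table of t.y%type index by pls_integer;
  l_cur     sys_refcursor;
  l_ys      t_num_tab;
  l_x       t.x%type := 999;
  l_comment varchar2(30);
begin
  -- Hypothetical classification: values with a similar distribution
  -- share one comment and therefore one cursor and execution plan.
  if l_x = 999 then
    l_comment := '/* rare */';
  else
    l_comment := '/* frequent */';
  end if;

  -- The comment changes the statement text; the value is still bound.
  open l_cur for
    'select ' || l_comment || ' y from t where x = :x'
    using l_x;

  loop
    -- Explicit bulk fetch of at most 100 rows per call.
    fetch l_cur bulk collect into l_ys limit 100;
    for i in 1..l_ys.count loop
      nop(l_ys(i));
    end loop;
    -- With a limit clause, %notfound becomes true on the last, possibly
    -- partial, batch, so the test comes after the rows are processed.
    exit when l_cur%notfound;
  end loop;
  close l_cur;
end;
/
```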

Last but not least, don’t just optimize for single-session response time. Reduce the number of latches to improve scalability as well.

Acknowledgments

We would like to thank our colleagues at Avaloq for many inspiring discussions.

References

1. Hotsos Enterprises Ltd. Hotsos Profiler, v 4.20, http://www.hotsos.com/, 2004.

2. Thomas Kyte. Oracle Expert One-on-One. Apress, 2001.

3. Thomas Kyte. Effective Oracle by Design. Oracle Press, 2003.

4. Thomas Kyte. Runstat. http://asktom.oracle.com/~tkyte/runstats.html.

5. Thomas Kyte. SQL Injection. http://asktom.oracle.com/~tkyte/sqlinj.html, 2004.

6. Bryn Llewellyn. PL/SQL Performance — Debunking the Myths, Oracle Open World 2004, http://www.oracle.com/technology/tech/pl_sql/htdocs/new_in_10gr1.htm#myths, 2004.

7. Connor McDonald et al. Mastering Oracle PL/SQL: Practical Solutions, Apress, 2004.

8. Oracle Corporation. Concepts, Oracle Release 10g, 2004.

9. Oracle Corporation. Data Warehousing Guide, Oracle Release 10g, 2004.

10. Oracle Corporation. Performance Tuning Guide, Oracle Release 10g, 2004.

11. Oracle Corporation. PL/SQL User's Guide and Reference, Oracle Release 10g, 2004.

About Avaloq Evolution AG

Avaloq provides the financial industry with an innovative and sustainable banking system that is compatible with future requirements and increases the efficiency of the banking business. It enables differentiation via new, individual business models and market services. Avaloq's core competencies cover analyzing the requirements of the financial industry and monitoring changes in it, as well as the design and development of high-end standard software.


Avaloq Banking System

The Avaloq Banking System is a tried and tested product that covers every aspect of banking. It delivers all the tools required for the individual realisation of current and future market services in private banking, asset management, retail banking and commercial banking. It works with standards, uses state-of-the-art technologies and offers open interfaces in respect of networked banking.

As a concept, product and tool, the Avaloq Banking System is the anticipated, practical answer to the strategic questions of the future.


Appendix: Database setup

We assume valid system statistics, optimizer_mode choose or all_rows, and a sensible CBO parameterization.

drop table t
/
create table t(
  x number check (x>=0)
 ,y number
 ,c varchar(2000)
)
/
create index t#x on t(x);
create index t#y on t(y);

declare
  -- version for 9i/10g
  l_t                  t%rowtype;
  type t_tab is table of t%rowtype index by pls_integer;
  l_list               t_tab;
  c_max       constant pls_integer := 100000;
  c_different constant pls_integer := 10;
begin
  for i in 1..100 loop
    l_t.c := l_t.c || '0123456789';
  end loop;
  for i in 0..c_max-1 loop
    l_t.x := mod(i, c_different);
    l_t.y := trunc(i / (c_max / c_different));
    l_list(i + 1) := l_t;
  end loop;
  forall i in 1..l_list.count
    insert into t values l_list(i);
  insert into t values(999, 10, l_t.c);
  commit;
  dbms_stats.gather_table_stats(
    ownname    => sys_context('userenv', 'current_schema')
   ,tabname    => 'T'
   ,cascade    => true
   ,method_opt => 'FOR ALL COLUMNS'
  );
end;
/
create or replace procedure nop(
  i_x pls_integer
) as
begin
  null;
end nop;
/
