SQL

From MidrangeWiki
Revision as of 20:11, 10 August 2009 by Starbuck5250 (talk | contribs) (Query Options File: +internal debug options)

SQL, or Structured Query Language, is a platform-independent way of accessing databases.

However, the version of SQL which runs on the AS/400, known as SQL400 or SQLDB2, does have syntax differences from other SQL dialects such as Microsoft SQL Server or Oracle, as well as DB2 on other platforms.

Detractors of DB2 for System i tend to point to these differences as deficiencies, but every vendor has differences that make its dialect somewhat incompatible with the others.

Terminology

SQL Term           iSeries Term
TABLE              PHYSICAL FILE
ROW                RECORD
COLUMN             FIELD
INDEX              KEYED LOGICAL FILE, ACCESS PATH
VIEW               NON-KEYED LOGICAL FILE
SCHEMA             LIBRARY, COLLECTION
LOG                JOURNAL
ISOLATION LEVEL    COMMITMENT CONTROL LEVEL

[1]

Tips

SQL7008 error

&FILE in &LIBRARY not valid for operation.
-- Code 3 -- &FILE is not journaled, or you do not have authority to the journal. Files with an RI constraint action of CASCADE, SET NULL, or SET DEFAULT must be journaled to the same journal.
  • Resolution: Use RUNSQLSTM with parameter COMMIT(*NONE)
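A minimal sketch of that resolution, written as a QShell fragment. It builds a RUNSQLSTM command with COMMIT(*NONE) and, by default, only prints it. The library MYLIB, source file QSQLSRC, member MYSCRIPT and the DRY_RUN switch are assumptions made for illustration; on IBM i the QShell system utility is what would run the CL command.

```shell
# Sketch: run an SQL source member without commitment control.
# MYLIB, QSQLSRC, MYSCRIPT and DRY_RUN are hypothetical names.
runsqlstm_cmd() {
  # $1 = library, $2 = source file, $3 = source member
  echo "RUNSQLSTM SRCFILE($1/$2) SRCMBR($3) COMMIT(*NONE)"
}

CMD=$(runsqlstm_cmd MYLIB QSQLSRC MYSCRIPT)
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "$CMD"      # default: only show the CL command
else
  system "$CMD"    # on IBM i: run it through the QShell system utility
fi
```

Setting DRY_RUN=0 before running the fragment in QShell would execute the command instead of printing it.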

SELECT *

For the sake of example, use two files, master and trans. We want to select all the columns in trans but only the name column from table master.
If using *SYS naming you can execute SELECT trans.*, name FROM trans JOIN master ON...
If using *SQL naming, we need to use a correlation name: SELECT t.*, name FROM trans t JOIN master ON...

RUNSQL command for ad-hoc SQL statements

"Partner TechTip: Blend SQL and RPGLE to Make Better Tools" by Kevin Forsythe

SQL via QM Query

Another method is based on Query Management Query. Create the QMQRY so that it consists entirely of substitution variables, and populate them from CL. See also the Midrange FAQ entry for the unfortunately named RUNSQLSTM.

SQL via QSH

original post [[1]]

Use the db2 command in QShell as a really easy way to implement this sort of thing.

Write a simple QShell script like this:


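#!/usr/bin/qsh
# Usage: myscript.sh LIBRARY TABLE COLUMN
LIB=$1
TABLE=$2
COL=$3
# Count the rows for each distinct value of the column and save the report
db2 "select $COL, count(*) from $LIB.$TABLE \
group by $COL order by $COL" \
> /tmp/report.txt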


Run that QShell script (or submit it to batch, if you like):

SBMJOB CMD(QSH CMD('myscript.sh MYLIB MYTABLE MYCOL'))
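Because the script pastes its arguments straight into the SQL text, it may be worth checking them first. The sketch below is one way to do that. The helper names is_sql_name and build_count_sql are hypothetical, and the check is deliberately stricter than the real IBM i naming rules: it accepts only letters, digits and underscores.

```shell
# Sketch: validate the three names before building the statement.
# is_sql_name and build_count_sql are hypothetical helper names.
is_sql_name() {
  case "$1" in
    ''|[0-9]*|*[!A-Za-z0-9_]*) return 1 ;;  # empty, leading digit, or odd character
    *) return 0 ;;
  esac
}

build_count_sql() {
  # $1 = library, $2 = table, $3 = column
  for name in "$1" "$2" "$3"; do
    is_sql_name "$name" || { echo "invalid SQL name: $name" >&2; return 1; }
  done
  echo "select $3, count(*) from $1.$2 group by $3 order by $3"
}

# Show the statement that would be handed to db2
build_count_sql MYLIB MYTABLE MYCOL
```

In the script itself the result could be passed on with something like db2 "$(build_count_sql "$1" "$2" "$3")" > /tmp/report.txt, skipping the db2 call when the function returns a non-zero status.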

Query Options File

IBM i allows one to alter the behaviour of the query optimiser by use of a file called QAQQINI. IBM supply a template in QSYS which you can copy into QUSRSYS and alter to suit your needs; the system uses the copy in QUSRSYS as a system-wide default. If you want a custom options file, copy the template into your own library and use the command CHGQRYA to tell the system where to find it.
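The copy-and-activate steps might look like the sketch below, which prints the two CL commands for a given library. MYLIB and the function name qaqqini_setup_cmds are assumptions. The commands are printed rather than executed because CHGQRYA changes the attributes of the job that runs it, so it would normally be issued from the job that runs the queries.

```shell
# Sketch: print the CL commands that give a job its own query options file.
# MYLIB is a hypothetical target library.
qaqqini_setup_cmds() {
  # $1 = library that will hold the private copy of QAQQINI
  echo "CRTDUPOBJ OBJ(QAQQINI) FROMLIB(QSYS) OBJTYPE(*FILE) TOLIB($1) DATA(*YES)"
  echo "CHGQRYA QRYOPTLIB($1)"
}

qaqqini_setup_cmds MYLIB
```

CRTDUPOBJ with DATA(*YES) is used here so that the copy keeps the rows of the template.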

V5R4 query options

Parameter Value Description Notes
ALLOW_TEMPORARY_INDEXES *DEFAULT The default value is set to *YES.
*YES Allow temporary indexes to be considered.
*ONLY_REQUIRED Do not allow any temporary indexes to be considered for this access plan. Choose any other implementation regardless of cost to avoid the creation of a temporary index. Only if no viable plan can be found will a temporary index be allowed.
APPLY_REMOTE *DEFAULT The default value is set to *YES.
*NO The CHGQRYA attributes for the job are not applied to the remote jobs. The remote jobs will use the attributes associated to them on their servers.
*YES The query attributes for the job are applied to the remote jobs used in processing database queries involving distributed tables. For attributes where *SYSVAL is specified, the system value on the remote server is used for the remote job. This option requires that, if CHGQRYA was used for this job, the remote jobs must have authority to use the CHGQRYA command.
ASYNC_JOB_USAGE *DEFAULT The default value is set to *LOCAL.
*LOCAL Asynchronous jobs may be used for database queries that involve only tables local to the server where the database queries are being run. In addition, for queries involving distributed tables, this option allows the communications required to be asynchronous. This allows each server involved in the query of the distributed tables to run its portion of the query at the same time (in parallel) as the other servers.
*DIST Asynchronous jobs may be used for database queries that involve distributed tables.
*ANY Asynchronous jobs may be used for any database query.
*NONE No asynchronous jobs are allowed to be used for database query processing. In addition, all processing for queries involving distributed tables occurs synchronously. Therefore, no inter-system parallel processing will occur.
CACHE_RESULTS *DEFAULT The default value is the same as *SYSTEM.
*SYSTEM The database manager may cache a query result set. A subsequent run of the query by that job or, if the ODP for the query has been deleted, by any job, will consider reusing the cached result set.
*JOB The database manager may cache a query result set from one run to the next for a job, as long as the query uses a reusable ODP. When the reusable ODP is deleted, the cached result set is destroyed. This value mimics V5R2 processing.
*NONE The database does not cache any query results.
COMMITMENT_CONTROL_LOCK_LIMIT *DEFAULT *DEFAULT is equivalent to 500,000,000.
Integer Value The maximum number of records that can be locked to a commit transaction initiated after setting the new value. The valid integer value is 1–500,000,000.
FORCE_JOIN_ORDER *DEFAULT The default is set to *NO.
*NO Allow the optimizer to reorder join tables.
*SQL Only force the join order for those queries that use the SQL JOIN syntax. This mimics the behavior for the optimizer before V4R4M0.
*PRIMARY nnn Only force the join position for the file listed by the numeric value nnn (nnn is optional and will default to 1) into the primary position (or dial) for the join. The optimizer will then determine the join order for all of the remaining files based upon cost.
*YES Do not allow the query optimizer to reorder join tables as part of its optimization process. The join will occur in the order in which the tables were specified in the query.
IGNORE_DERIVED_INDEX *DEFAULT The default value is the same as *NO.
*YES Allow the SQE optimizer to ignore the derived index and process the query. The resulting query plan will be created without any regard to the existence of the derived index(s). The index types that are ignored include:
  • Keyed logical files defined with select or omit criteria and with the DYNSLT keyword omitted
  • Keyed logical files built over multiple physical file members (V5R2 restriction, not a restriction for V5R3)
  • Keyed logical files where one or more keys reference an intermediate derivation in the DDS. Exceptions to this are: 1. when the intermediate definition is defining the field in the DDS so that it shows up in the logical's format and 2. RENAME of a field (these two exceptions do not make the key derived)
  • Keyed logical files with K *NONE specified.
  • Keyed logical files with Alternate Collating Sequence (ACS) specified
  • SQL indexes created when the sort sequence active at the time of creation requires a weighting (translation) of the key to occur. This is true when any of several non-US language IDs are specified. It also occurs if language ID shared weight is specified, even for language US.
*NO Do not ignore the derived index. If a derived index exists, have CQE process the query.
IGNORE_LIKE_REDUNDANT_SHIFTS *DEFAULT The default value is set to *OPTIMIZE.
*ALWAYS When processing the SQL LIKE predicate or OPNQRYF command %WLDCRD built-in function, redundant shift characters are ignored for DBCS-Open operands. Note that this option restricts the query optimizer from using an index to perform key row positioning for SQL LIKE or OPNQRYF %WLDCRD predicates involving DBCS-Open, DBCS-Either, or DBCS-Only operands.
*OPTIMIZE When processing the SQL LIKE predicate or the OPNQRYF command %WLDCRD built-in function, redundant shift characters may or may not be ignored for DBCS-Open operands depending on whether an index is used to perform key row positioning for these predicates. Note that this option will enable the query optimizer to consider key row positioning for SQL LIKE or OPNQRYF %WLDCRD predicates involving DBCS-Open, DBCS-Either, or DBCS-Only operands.
LIMIT_PREDICATE_OPTIMIZATION *DEFAULT Do not eliminate the predicates that are not simple isolatable predicates (OIF) when doing index optimization. Same as *NO.
*NO Do not eliminate the predicates that are not simple isolatable predicates (OIF) when doing index optimization.
*YES Eliminate the predicates that are not simple isolatable predicates (OIF) when doing index optimization.
LOB_LOCATOR_THRESHOLD *DEFAULT The default value is set to 0. This indicates that the database will take no action to free locators.
Integer Value If the value is 0, then the database will take no action to free locators. For values 1 through 250,000, on a FETCH request, the database will compare the active LOB locator count for the job against the threshold value. If the locator count is greater than or equal to the threshold, the database will free host server created locators that have been retrieved. This option applies to all host server jobs (QZDASOINIT) and has no impact to other jobs.
MATERIALIZED_QUERY_TABLE_REFRESH_AGE *DEFAULT The default value is set to 0.
0 No materialized query tables may be used.
*ANY Any tables indicated by the MATERIALIZED_QUERY_TABLE_USAGE INI parameter may be used.
timestamp_duration Only tables indicated by the MATERIALIZED_QUERY_TABLE_USAGE INI option which have had a REFRESH TABLE performed within the specified timestamp duration may be used.
MATERIALIZED_QUERY_TABLE_USAGE *DEFAULT The default value is set to *NONE.
*NONE Materialized query tables may not be used in query optimization and implementation.
*ALL User-maintained materialized query tables may be used.
*USER User-maintained materialized query tables may be used.
MESSAGES_DEBUG *DEFAULT The default is set to *NO.
*NO No debug messages are to be displayed.
*YES Issue all debug messages that are generated for STRDBG.
NORMALIZE_DATA *DEFAULT The default is set to *NO.
*NO Unicode constants, host variables, parameter markers, and expressions that combine strings will not be normalized.
*YES Unicode constants, host variables, parameter markers, and expressions that combine strings will be normalized
OPEN_CURSOR_CLOSE_COUNT *DEFAULT *DEFAULT is equivalent to 0. See Integer Value for details.
Integer Value OPEN_CURSOR_CLOSE_COUNT is used in conjunction with OPEN_CURSOR_THRESHOLD to manage the number of open cursors within a job. If the number of open cursors, which includes open cursors and pseudo-closed cursors, reaches the value specified by OPEN_CURSOR_THRESHOLD, pseudo-closed cursors are hard (fully) closed with the least recently used cursors being closed first. This value determines the number of cursors to be closed. The valid values for this parameter are 1 to 65536. The value for this parameter should be less than or equal to the number in the OPEN_CURSOR_THRESHOLD parameter. This value is ignored if OPEN_CURSOR_THRESHOLD is *DEFAULT. If OPEN_CURSOR_THRESHOLD is specified and this value is *DEFAULT, the number of cursors closed is equal to OPEN_CURSOR_THRESHOLD multiplied by 10 percent and rounded up to the next integer value.
OPEN_CURSOR_THRESHOLD *DEFAULT *DEFAULT is equivalent to 0. See Integer Value for details.
Integer Value OPEN_CURSOR_THRESHOLD is used in conjunction with OPEN_CURSOR_CLOSE_COUNT to manage the number of open cursors within a job. If the number of open cursors, which includes open cursors and pseudo-closed cursors, reaches this threshold value, pseudo-closed cursors are hard (fully) closed with the least recently used cursors being closed first. The number of cursors to be closed is determined by OPEN_CURSOR_CLOSE_COUNT. The valid user-entered values for this parameter are 1 - 65536. Having a value of 0 (default value) indicates that there is no threshold and hard closes will not be forced on the basis of the number of open cursors within a job.
OPTIMIZATION_GOAL *DEFAULT Optimization goal is determined by the interface (ODBC, SQL precompiler options, OPTIMIZE FOR nnn ROWS clause).
*FIRSTIO All queries will be optimized with the goal of returning the first page of output as fast as possible. This goal works well when the control of the output is controlled by a user who is most likely to cancel the query after viewing the first page of output data. Queries coded with an OPTIMIZE FOR nnn ROWS clause will honor the goal specified by the clause.
*ALLIO All queries will be optimized with the goal of running the entire query to completion in the shortest amount of elapsed time. This is a good option for when the output of a query is being written to a file or report, or the interface is queuing the output data. Queries coded with an OPTIMIZE FOR nnn ROWS clause will honor the goal specified by the clause.
OPTIMIZE_STATISTIC_LIMITATION *DEFAULT The amount of time spent in gathering index statistics is determined by the query optimizer.
*NO No index statistics will be gathered by the query optimizer. Default statistics will be used for optimization. (Use this option sparingly.)
*PERCENTAGE integer value Specifies the maximum percentage of the index that will be searched while gathering statistics. Valid values are 1 to 99.
*MAX_NUMBER_OF_RECORDS_ALLOWED integer value Specifies the largest table size, in number of rows, for which gathering statistics is allowed. For tables with more rows than the specified value, the optimizer will not gather statistics and will use default values.
PARALLEL_DEGREE *DEFAULT The default value is set to *SYSVAL.
*SYSVAL The processing option used is set to the current value of the system value, QQRYDEGREE.
*IO Any number of tasks can be used when the database query optimizer chooses to use I/O parallel processing for queries. SMP parallel processing is not allowed.
*OPTIMIZE The query optimizer can choose to use any number of tasks for either I/O or SMP parallel processing to process the query or database file keyed access path build, rebuild, or maintenance. SMP parallel processing is used only if the system feature, DB2® Symmetric Multiprocessing for i5/OS®, is installed. Use of parallel processing and the number of tasks used is determined with respect to the number of processors available in the server, this job's share of the amount of active memory available in the pool in which the job is run, and whether the expected elapsed time for the query or database file keyed access path build or rebuild is limited by CPU processing or I/O resources. The query optimizer chooses an implementation that minimizes elapsed time based on the job's share of the memory in the pool.
*OPTIMIZE xxx This option is very similar to *OPTIMIZE. The value xxx is an integer percentage from 1 to 200. The query optimizer determines the parallel degree for the query using the same processing as is done for *OPTIMIZE. Once determined, the optimizer adjusts the actual parallel degree used for the query by the percentage given. This gives the user some ability to override the parallel degree without having to specify a particular parallel degree under *NUMBER_OF_TASKS.
*MAX The query optimizer chooses to use either I/O or SMP parallel processing to process the query. SMP parallel processing will only be used if the system feature, DB2 Symmetric Multiprocessing for i5/OS, is installed. The choices made by the query optimizer are similar to those made for parameter value *OPTIMIZE except the optimizer assumes that all active memory in the pool can be used to process the query or database file keyed access path build, rebuild, or maintenance.
*NONE No parallel processing is allowed for database query processing or database table index build, rebuild, or maintenance.
*NUMBER_OF_TASKS nn Indicates the maximum number of tasks that can be used for a single query. The number of tasks will be capped off at either this value or the number of disk arms associated with the table.
PARAMETER_MARKER_CONVERSION *DEFAULT The default value is set to *YES.
*NO Constants cannot be implemented as parameter markers.
*YES Constants can be implemented as parameter markers.
QUERY_TIME_LIMIT *DEFAULT The default value is set to *SYSVAL.
*SYSVAL The query time limit for this job will be obtained from the system value, QQRYTIMLMT.
*NOMAX There is no maximum number of estimated elapsed seconds.
integer value Specifies the maximum value that is checked against the estimated number of elapsed seconds required to run a query. If the estimated elapsed seconds is greater than this value, the query is not started. Valid values range from 0 to 2,147,352,578.
REOPTIMIZE_ACCESS_PLAN *DEFAULT The default value is set to *NO.
*NO Do not force the existing query to be reoptimized. However, if the optimizer determines that optimization is necessary, the query will be reoptimized.
*YES Force the existing query to be reoptimized.
*FORCE Force the existing query to be reoptimized.
*ONLY_REQUIRED Do not allow the plan to be reoptimized for any subjective reasons. For these cases, continue to use the existing plan since it is still a valid workable plan. This may mean that you do not get all of the performance benefits that a reoptimized plan might deliver. Subjective reasons include file size changes, new indexes, and so on. Non-subjective reasons include deletion of an index used by the existing access plan, the query file being deleted and recreated, and so on.
SQLSTANDARDS_MIXED_CONSTANT *DEFAULT The default value is set to *YES.
*YES SQL IGC constants will be treated as IGC-OPEN constants.
*NO If the data in the IGC constant only contains shift-out DBCS-data shift-in, then the constant will be treated as IGC-ONLY, otherwise it will be treated as IGC-OPEN.
SQL_FAST_DELETE_ROW_COUNT *DEFAULT The default value is set to 0. Having a value of 0 indicates that the database manager will choose how many rows to consider when determining whether fast delete should be used instead of a traditional delete. When using the default value, the database manager will most likely use 1000 as a row count. This means that using the INI option with a value of 1000 results in no operational difference from using 0 for the option.
*NONE This value will force the database manager to never attempt to fast delete on the rows.
*OPTIMIZE This value is same as using *DEFAULT.
integer value Specifying a value for this option allows the user to tune the behavior of DELETE. The target table for the DELETE statement must match or exceed the number of rows specified on the option, for fast delete to be attempted. A fast delete will not write individual rows into a journal. The valid values range from 1 to 999,999,999,999,999.
SQL_STMT_COMPRESS_MAX *DEFAULT The default value is set to 2, which indicates that the access plan associated with any statement will be removed after a statement has been compressed twice without being executed.
Integer Value The integer value represents the number of times that a statement is compressed before the access plan is removed to create more space in the package. Note that executing the SQL statement resets the count for that statement to 0. The valid Integer values are 1 to 255.
SQL_SUPPRESS_WARNINGS *DEFAULT The default value is set to *NO.
*YES Examine the SQLCODE in the SQLCA after execution of a statement. If the SQLCODE is + 30, then alter the SQLCA so that no warning is returned to the caller. Set the SQLCODE to 0, the SQLSTATE to '00000' and SQLWARN to ' '.
*NO Specifies that SQL warnings will be returned to the caller.
SQL_TRANSLATE_ASCII_TO_JOB *DEFAULT The default value is set to *NO.
*YES Translate ASCII SQL statement text to the CCSID of the iSeries® job.
*NO Translate ASCII SQL statement text to the EBCDIC CCSID associated with the ASCII CCSID.
STAR_JOIN (see note) *DEFAULT The default value is set to *NO
*NO The EVI Star Join optimization support is not enabled.
*COST Allow query optimization to consider (cost) the usage of EVI Star Join support. The determination of whether the Distinct List selection is used will be determined by the optimizer based on how much benefit can be derived from using that selection.
STORAGE_LIMIT *DEFAULT The default value is set to *NOMAX.
*NOMAX Never stop a query from running because of storage concerns.
Integer Value The maximum amount of temporary storage in megabytes that may be used by a query. This value is checked against the estimated amount of temporary storage required to run the query as calculated by the query optimizer. If the estimated amount of temporary storage is greater than this value, the query is not started. Valid values range from 0 through 2147352578.
SYSTEM_SQL_STATEMENT_CACHE *DEFAULT The default value is set to *YES.
*YES Examine the SQL system-wide statement cache when an SQL prepare request is processed. If a matching statement already exists in the cache, use the results of that prepare. This allows the application to potentially have better performing prepares.
*NO Specifies that the SQL system-wide statement cache should not be examined when processing an SQL prepare request.
UDF_TIME_OUT *DEFAULT The amount of time to wait is determined by the database. The default is 30 seconds. [2]
*MAX The maximum amount of time that the database will wait for the UDF to finish.
integer value Specify the number of seconds that the database should wait for a UDF to finish. If the value given exceeds the database maximum wait time, the maximum wait time will be used by the database. Minimum value is 1 and maximum value is system defined.
VARIABLE_LENGTH_OPTIMIZATION *DEFAULT The default value is set to *YES.
*YES Allow aggressive optimization of variable length columns. Allows index only access for the column(s). It also allows constant value substitution when an equal predicate is present against the column(s). As a consequence, the length of the data returned for the variable length column may not include any trailing blanks that existed in the original data.
*NO Do not allow aggressive optimization of variable length columns.

Notes:

  1. Business Intelligence Indexing and statistics strategies for DB2 UDB for iSeries by Michael W. Cain iSeries Teraplex Integration Center, December 2003, Version 3.0, Template:PDF
  2. Only modifies the environment for the Classic Query Engine.

There is an IBM Support Technical Document that describes in-depth debugging using additional internal QAQQINI options: document number 462591814.
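The default described in the table for OPEN_CURSOR_CLOSE_COUNT (10 percent of OPEN_CURSOR_THRESHOLD, rounded up to the next integer) can be worked through with a small sketch. The function name default_close_count and the sample thresholds are illustrative only.

```shell
# Sketch: cursors hard closed when OPEN_CURSOR_CLOSE_COUNT is *DEFAULT,
# that is, 10 percent of OPEN_CURSOR_THRESHOLD rounded up.
default_close_count() {
  # $1 = OPEN_CURSOR_THRESHOLD (1 to 65536)
  echo $(( ($1 + 9) / 10 ))   # integer ceiling of $1 / 10
}

for threshold in 100 125 1; do
  echo "threshold $threshold -> close $(default_close_count "$threshold")"
done
```

With these sample values, a threshold of 100 closes 10 cursors, 125 closes 13, and 1 closes 1.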

SQL Settings

On V5R4, using OLAP can mean taking a good look at the value of IGNORE_DERIVED_INDEX. One may want to set it to *YES in order to use a heritage database with OLAP.
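Changing an option such as IGNORE_DERIVED_INDEX comes down to updating one row of the options file. The sketch below builds that UPDATE statement for a private copy of QAQQINI. MYLIB and the function name qaqqini_update_sql are assumptions; QQPARM and QQVAL are the columns of QAQQINI that hold the option name and its value.

```shell
# Sketch: build the UPDATE that changes one option in a private QAQQINI copy.
# MYLIB is a hypothetical library holding the copy.
qaqqini_update_sql() {
  # $1 = library, $2 = parameter name, $3 = new value
  echo "UPDATE $1.QAQQINI SET QQVAL = '$3' WHERE QQPARM = '$2'"
}

qaqqini_update_sql MYLIB IGNORE_DERIVED_INDEX '*YES'
```

In QShell the statement could then be run with the db2 utility, for example db2 "$(qaqqini_update_sql MYLIB IGNORE_DERIVED_INDEX '*YES')".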

References

See also

See also: SQL Primer and SOUNDEX
See also on Wikipedia: SQL
"TechTip: Calling SQL from REXX" Written by Joe Pluta, Thursday, 24 February 2005
