|
Specific SQL Tuning Techniques & Server Facility Management for
CA-INGRES II Applications Developers.
This page is a summarisation of a one day instructional course which is
available from Enterprise Computing Australia.
The course has been specially designed to cater to the needs of the
professional, practicing CA-INGRES application developer. Unlike several other
generic SQL tuning courses which have been available from time to time, this
course contains information which is specifically applicable to the CA-INGRES
and CA-OpenINGRES databases.
There are, of course, many sections of the course which can be shown to be
beneficial to any Relational Database system.
If you are interested in any particular part of the course as shown in the
contents image above then simply click on the section label. Some sections
are incorprated into this page 'as is', while other serve as nothing more
than place markers (if I put the whole course on the Internet there wouldn't
be much point in having a course ;-)).
For more information on this course, for schedule details or for any of the
other services which Enterprise Computing has to offer please use this
Contact Page

The CA-INGRES Database Server.
There are 10 primary components which collectively form the CA-INGRES DBMS
Server.
The server is a multi-point re-entrant process which has a number of threads
(facilities).
The server from the UNIX level looks to be a large, resource hungry process
which can dominate the CPU and memory use of the machine.
Query Processing in the Server is controlled by specialist facilities which
act as traffic cops in the server.
The performance of queries can be affected by a number of things both within
the server and the operating system. The processing of queries is THE
single most frequently executed path in the DBMS server.
Each of the facilities that make up the CA-INGRES/OpenINGRES server are:
- The GCA (General Communication Area)
- Is part of the General Communication Facility (GCF).
- All front-ends send queries to the GCA present on the front-end.
- In conjunction with the General Communication Server performs datatype
conversion in network environments & sends server based requests.
- Is part of the CA-INGRES/Net product.
- The SCF (System Control Facility)
- Coordinates all server threads (facilities) to perform query execution.
- Is the manager of access to global resources such as memory, CPU, I/O.
- Consists of dispatcher/scheduler and sequencer.
- Is the principle time slice manager for user connections.
- The PSF (The Parser Facility)
- Converts SQL descriptors into parser trees for internal uses.
- Consists of two differing parsers, SQL and QUEL.
- Interacts with the QueryMod tasker.
- The QueryMod Tasker
- Modifies and augments the parser tree information based on the executing
user id.
- Incorporates and validates the permits, integrities, view and database
rules present on objects mentioned in the parser tree.
- Explodes and expands view information into required table information.
- The RDF (Relation Description Facility)
- Is a server cache which stores table and object descriptors including
- tables
- views
- integrities
- permits
- rules
- statistics
- Table information is timestamped in the cache
- The OPF (Optimizer Facility)
- Reads query tree which was stored in the Query Storage Facility pool.
- Evaluates the optimal execution profile for the tree and produces a QEP.
- In order to derive the optimal plan the OPF uses
- join strategies
- join ordering
- key and index strategies
- access methods
- The QSF (Query Storage Facility)
- A semaphore controlled memory area for storing server transient objects.
- Controls the internal storage of queries and database procedures.
- Enables sharing of objects including
- data structures
- trees
- text
- QEPs
- Trace point QS501 will monitor QSF Pool usage.
- The QEF (Query Execution Facility)
- Executes the queries (single queries, repeated queries and database
procedures.)
- Rules for each query are attached to each action of the QP.
- QEF interprets the QP and produces DMF calls.
- Executes QEP using ADF and DMF calls.
- The DMF (Data Manipulation Facility)
- Has five basic reasons for being:
- storage and retrieval of data
- locking and concurrency control
- logging and recovery control
- caching from data stores
- sorting tuples (rows)
- It is essentially a mini non-relational database which gets data from operating system files.
- The ADF (Abstract Datatype Facility)
- The Abstract Data Facility is responsible for undertaking domain type conversion and function evaluation during the execution of a query.
- ADF calls are made repeatedly for each tuple in the query which appears to satisfy the conditions.
- ADF calls can result in a noticeable drop in performance of queries if the server (DMF) thread is becoming saturated with requests.
- The sequencer thread in the server could become a bottleneck if ADF calls are to heavily relied upon.
- ADF calls and functions negate the use of indexes on the columns affected.
- Non-enumerated query passes can be substantially slower for queries that use ADF based calls.
- Tree and QEP evaluation times are increased (querymod affect). Therefore the optimal QEP may have been 'just missed'.

The Progress of a Query.
The CA-INGRES Optimizer is a Rule and Cost (statistical) based
approach to determining the optimal execution method for data retrieval. During
the compilation of a query execution plan (QEP), an optimization timeout can
occur. This will happen when the time taken looking for a better plan equals
the time projected for execution based on the best plan found so far.
There are some basic rules governing the derivation of a QEP for individual
queries.
These are {briefly}:
- view flattening
- subquery flattening
- aggregate handling
- Default Optimizer Conditions
- These defaults are used if optimization statistics are not available for the tables involved.
- for an exact match lookup on table, 1% of data expected to satisfy
- for partial or range search comparison, 10% of data expected to satisfy
- overflow immediately increases expectation
- joining two tables, expect that the result table will be the size of the largest single table involved.
- only a single row will be retrieved by a unique column key lookup

There are two distinct types of query optimization within the CA-INGRES
database server. These are generally refered to as the non-enumeration &
enumeration paths. Performance and resource utilization when either of these
separate and distinct paths are used are markedly different.
Non-enumeration Queries involve a fast path for single table queries
using the primary key (no secondary indexes are allowed). The query optimizer
does not use table and column statistics, and will always use
the primary key if possible.
The non-enumerated query will use page level locks automatically during the
execution of the query.
The other more commonly executed query path is that of the enumeration query.
This type of query path is executed when
- the query is on more than one table OR
- Secondary indexes are involved OR
- A QEP listing is requested OR
- The knowledge management query resource limit is in effect
The enumerated query type will always uses optimizedb
statistics.

Topics covered in this section include:
- QEP - The Good, Bad and Ugly
- Reading the QEP - The Black Art
- Node Types in the QEP
- The Leaf Node
- Projection Restriction Nodes
- Sort Nodes
- Join Nodes
- K join
- FSM join
- PSM join
- T join
- SE join
- Cart-Prod
- QEP EXAMPLES
- Primary versus Secondary Lookup
- Secondary Index Lookup QEP
- Range Search Lookup QEP
- Primary/Secondary Index QEP
- Cartesian Product QEP
- Function Queries

Topics covered in this section include:
- Establishing Query Execution Performamce Costs.
- Isolating the server facility usage and therefore what is causing the
execution time profile problems.
- Examining in detail the PSF, OPF, QEF, GCF components.
- Examining table and index structure models on query performance profile.
- Sorting out Sorting Needs.
- Determining the use of views.
- Stored QEPs and repeated queries.
- Isolating internal server facilities from query execution times.
- Primary and secondary key construction.
- Joins
- Complex Queries
- Disjunctive SQL
- Writting good queries.
- OLTP verses EIS (MIS)
- Database optimization - how, why and when.
- Lock conversion and escalation problems.

The information shown above is based on a course which can be presented to
development staff of your organisation 'as is' or tailored to suite your
specific needs. Please feel free to contact me via email at
rallas@ozemail.com.au for more information.
Original content of this page is © Enterprise Computing Pty Ltd 1997-2000.
No portion of this page may be reproduced in any format electronic, digital,
analogue, photocopied, printed or facsimile without the prior written consent
of Enterprise Computing Australia Pty Ltd.
|