Query Options
Click the Query Options
button on the StreamingDB spreadsheet to display the Query Options
dialog box. Use these options to determine exactly how the exchange of
data is handled between the client machine and the database on a server
machine.
General
information. Not all database environments (the database itself and
the OLE DB provider or ODBC driver) will support all combinations of options
on the Query Options dialog. In general, these situations are handled
"silently and gracefully" by creating a cursor with possibly different options and
settings than the ones specified. For example, if you specify a client
side dynamic cursor, you will get a static cursor instead, since all client
side cursors (which use the ADO cursor library) are by definition static.
If you specify an updateable server-side static cursor, you may receive
a keyset cursor if the OLE DB provider or ODBC driver can support one,
or a read-only cursor if not. To determine exactly what kind of cursor
you have received after executing a query, display the Query Options
dialog again. The settings here will be updated based on the current cursor
type and options.
Writing
information back to the database. If you intend to write information
back to the input database via the Rapid Deployment of Models
options for computing predicted values and classifications (and other
statistics), review the options described below carefully to select the
appropriate Cursor Type and Lock Type consistent with that
operation. Specifically, choose a Lock Type other than Read
Only and either a Server Side Dynamic or Keyset
cursor, or a Client Side Static cursor. Generally, a Lock Type
of Batch Optimistic will be most efficient (updates are sent to
the remote database in batches rather than one record at a time.) However,
not all database environments will support this option. The Client
Side cursor will always support the batch optimistic lock type. Some
experimentation with various Query Options may be necessary to
determine the optimal settings for your particular database environment.
Note that writing back to the database requires additional processing
resources, and unless this functionality is required (e.g., for deployment of predictive data mining models (via the Rapid
Deployment of Models module), it is strongly recommended not to configure
the connection to the database for write-access.
Cursor Location. The
following options are in the Cursor Location group box.
Server
Side. If this option button is selected, the cursor is maintained
on the server.
Client
Side. If this option button is selected, the cursor is maintained
on the client (all data are copied onto client machine)
Cache
Size. Specify the number of records that are cached on the client
machine (applies when using a server side cursor)
Cursor Type. The following
options are in the Cursor Type group box.
Forward
Only. If this option button is selected, the cursor can be traversed
only forward, one record at a time. If an analysis requests data that
comes before the current position, the implementation must re-query and
advance to the requested record. Good for analyses that require only a
single pass through the data.
Static.
A static cursor supports scrolling forward and backward through the data,
thus allowing random access to the data. This cursor type provides a "snapshot"
of the result of your query - records modified, added to, or removed from
the database after the cursor is populated will not be visible. A server
side static cursor can place considerable strain on the database system.
A client-side cursor is always static. If an analysis or other usage of
the Streaming Database Connector requires random
access or multiple passes through the data, this type of cursor may be
needed.
Writing
back to the database. Generally, a server side static cursor is by
definition read-only. Note however that client side static cursors can
be updateable when an appropriate Lock Type is chosen, and with
certain limitations. When using a client side cursor, the cursor is populated
and maintained on the local client machine by Microsoft’s ADO Cursor
Engine, and after all data is copied to the local machine, the data in
the cursor is inherently disconnected from the database. Therefore, when
you attempt to update the database through such a cursor, the cursor engine
constructs and executes an action query, for example
UPDATE
Customers SET Balance = 100 WHERE CustomerID = 7
Limitations
of Client Side Updatable Cursors. In order for this scheme to work,
the cursor engine needs to be able to get certain meta data about the
tables in your query from the database. If the OLE DB provider or ODBC
driver returns inaccurate data or ADO cannot gather the base table and
field names, updates may fail.
ADO must be able to locate
the records in the database that you want to modify. If the table has
a primary key, ADO will use the primary key fields in the action queries
it builds to locate the correct record, if those primary key fields were
all included in your query. Retrieving all the primary key data for a
given table can be expensive and there is a notable exception; the Microsoft
OLE DB provider for Oracle requests the row ID and uses that row ID instead
of the primary key fields in constructing the action queries, resulting
in better performance.
If you use a database table
that has no primary key, or does not include the primary key fields in
your query, then ADO has no reliable way to locate the record you want
to modify. In this case, it will use all the values for all fields in
the cursor in the WHERE clause of the action query.
Keyset.
A Keyset cursor is a server side cursor that enables you to update
data (when using an appropriate Lock Type) and to see changes made
by other users. It allows random access to records and multiple passes
through the data. In contrast to a static cursor, which the query processor
will completely populate when you submit your query, the Keyset
cursor is initially populated with only the data required to locate the
records in the tables that satisfy your query. Generally this data corresponds
to the primary key in your tables and is referred to as the keyset.
As you (or a Statistica
analysis) pass through the data in the cursor, the number of records specified
in the Cache Size is copied to the local machine. When a record
not in this cache is requested, the stored key data in the cursor is used
to get the next batch of records from the database.
With the Keyset
cursor, any record changed by another user will be visible to you the
next time that record is loaded into the local cache, e.g., on the next
pass through the data. Records added to the database that satisfy your
query will NOT be visible; the set of key values stored in the cursor
is itself static. Records deleted from the database by other users will
be removed from the Keyset cursor the next time the cache is refreshed.
Dynamic.
The Dynamic cursor "behaves" somewhat like the Keyset
cursor. Initially, the query processor retrieves the key field information
for the records that satisfy your query and returns the number of records
requested in the Cache Size property to the local machine. With
the dynamic cursor, however, the query processor will rebuild the keyset
information each time the cache is refreshed. Thus records added by other
users will become available in this cursor as you navigate through the
records.
Note that the Keyset
and Dynamic cursor types are significantly more complicated than
forward only and static cursors, and therefore they are typically much
slower and place a greater strain on the remote database system. In addition,
unexpected or misleading results can be obtained if the number or content
of records in the Streaming DB Connector changes while an analysis is
in progress.
Lock Type. Locking is
the process by which a DBMS restricts access to a row in a multi-user
environment. When a row or column is exclusively locked, other users are
not permitted to access the locked data until the lock is released. This
ensures that two users cannot simultaneously update the same data. The
Lock Type options specify how records are to be locked when
updating data in the database. Most uses of the streaming DB connector
do not require updating the database, and so for best performance in those
circumstances, the Read Only option should be chosen. However,
certain Statistica analyses and operations, such as the Rapid Deployment of Models
module, may write to the data source (spreadsheet or StreamingDB spreadsheet).
Therefore when performing such an operation with streaming DB connector,
you must ensure that your recordset is updatable or the operation will
fail.
Providers may not support
all locking options. If an unsupported type is requested, a supported
type will be substituted.
Read
Only. If this option button is selected, the Streaming DB Connector
is read-only. You cannot write back to the database. This setting will
normally result in faster data reading, so unless you need to update the
database, this is the recommended option.
Pessimistic.
Pessimistic locking is the most aggressive locking option. When
using this option, other users are unable to access records that you are
modifying, and vice versa. When using streaming DB connector programmatically
via the Statistica Object Model, pessimistic locking
implies that the record becomes locked when you first change any field
(variable value) in a record (case) and remains locked until you call
the Update (or CancelUpdate) method of the streaming DB
connector DBTable object.
Optimistic.
With Optimistic locking, two users can edit the same record simultaneously.
The first one to commit their changes successfully updates the database,
and the second user’s attempt to update that data fails. With this
option, other users are able to edit records that you are updating (possibly
creating conflicts). When using streaming DB connector programmatically
via the Statistica Object Model, optimistic locking
implies that a record is not locked as you modify fields (variable values)
for a particular record (case); they are only locked when the Update
method of the streaming DB connector DBTable object is called, and remain
locked only until that call returns.
Batch
Optimistic. This option is primarily for client-side recordsets, but
can be used with server side recordsets if the OLE DB provider and/or
database supports having multiple records with pending changes; for example,
SQL Server cursors support this functionality, while Microsoft Access
cursors do not. With this option, updates to many records can be batched
and submitted all at once, thus reducing network traffic and the number
of database transactions.
When using a client side
recordset, or if your database and OLE DB provider/ODBC driver support
batch updates in this manner, and when using a Statistica analysis that
writes back to the database, this is the recommended setting.
When using streaming DB
connector programmatically via the Statistica Object Model, you must call
the streaming DB connector DBTable method UpdateBatch to submit
the batch update, or CancelBatch to cancel the pending changes.
Asynchronous Query. When
the Asynchronous Query check box is selected, the application does
not wait for the query to complete. Data will be previewed as it becomes
available. An analysis can begin immediately but will block when it requests
data until the data becomes available.
Asynchronous Fetch. Asynchronous
Fetch applies when using a client side cursor. The application does
not wait for all records to be transferred to the client machine. Records
will be previewed as they become available. When an analysis requests
data that has not been fetched yet, it will block until the data is available.
Maximum number of records to return
from query. If supported by the database platform, this option
limits the number of rows returned from a query. An entry of zero means
no limit.
Use optimized Oracle driver for
updating the database. Select this option to use Oracle's native
API (OO4O) for write back operations to Oracle databases. This option
is recommended for maximum performance when writing back to Oracle databases.
Note that you must select the primary key from every table in your query
to take advantage of this option. If the table that you are querying and
writing back to does not have a primary key or you are not including that
primary key in your query then you must clear this option.
Note:
Statistica may alter some of the options that you set in this dialog box
after the initial query if it determines that different settings may be
more optimal.
Note:
Performance considerations. For best performance (and for read-access
only, i.e., when not writing statistics back to the database via the Rapid Deployment of Models module), follow
these guidelines:
If you
are creating an analysis that requires only one forward pass through the
data, use a server side forward only cursor. This is the fastest type
of cursor.
If you
require random access to or will be making several passes through the
data, use a client side cursor with asynchronous query and fetch.
Select
the option Use optimized Oracle driver
for updating the database when
working with Oracle databases.
See the Streaming Database Connector and Cursor definitions. See also, Streaming Database Connector Technology (Technical
Overview) and Streaming
Database Connector FAQs.