Thursday, May 22, 2008

ORACLE DATA PUMP Part #3

EVEN MORE ADVANCED FEATURES OF ORACLE DATA PUMP

Beyond the command-line and performance features of Oracle Data Pump are new capabilities that DBAs will find invaluable. A couple of prominent features are described here.

Interactive Command-Line Mode

You have much more control in monitoring and controlling Data Pump jobs with interactive command-line mode. Because Data Pump jobs run entirely on the server, you can start an export or import job, detach from it, and later reconnect to the job to monitor its progress. Here are some of the things you can do while in this mode:
  1. See the status of the job. All of the information needed to monitor the job’s execution is available.
  2. Add more dump files if there is insufficient disk space for an export file. Change the default size of the dump files.
  3. Stop the job (perhaps it is consuming too many resources) and later restart it (when more resources become available).
  4. Restart the job. If a job was stopped for any reason (system failure, power outage), you can attach to the job and then restart it.
  5. Increase or decrease the number of active worker processes for the job. (Enterprise Edition only.)
  6. Attach to a job from a remote site (such as from home) to monitor status.
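As a quick illustration of this mode (the job name hr_export below is only a placeholder for whatever JOB_NAME your export was started with), you can attach to a running job and then issue commands at the Export> prompt:

> expdp username/password ATTACH=hr_export

Export> STATUS
Export> ADD_FILE=extra01.dmp
Export> PARALLEL=8
Export> STOP_JOB=IMMEDIATE
Export> EXIT_CLIENT

STATUS shows the job's progress, ADD_FILE adds another dump file, PARALLEL changes the number of worker processes, STOP_JOB stops the job (a later ATTACH followed by START_JOB restarts it), and EXIT_CLIENT detaches while leaving the job running on the server.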

Network Mode

Data Pump gives you the ability to pass data between two databases over a network (via a database link), without creating a dump file on disk. This is very useful if you’re moving data between databases, like data marts to data warehouses, and disk space is not readily available. Note that if you are moving large volumes of data, Network mode is probably going to be slower than file mode. Network export creates the dump file set on the instance where the Data Pump job is running and extracts the metadata and data from the remote instance. Network export gives you the ability to export read-only databases. (Data Pump Export cannot run locally on a read-only instance because the job requires write operations on the instance.) This is useful when there is a need to export data from a standby database.
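For example, assuming a database link named source_db already exists and points to the remote instance (the link name is only illustrative), a network-mode import needs no dump file at all, while a network-mode export writes the dump file on the local instance while reading data from the remote one:

> impdp username/password DIRECTORY=dpump_dir1 NETWORK_LINK=source_db TABLES=hr.employees

> expdp username/password DIRECTORY=dpump_dir1 NETWORK_LINK=source_db DUMPFILE=emp_remote.dmp TABLES=hr.employees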

Generating SQLFILES

In original Import, the INDEXFILE parameter generated a text file which contained the SQL commands necessary to recreate tables and indexes that you could then edit to get a workable DDL script. With Data Pump, it’s a lot easier to get a workable DDL script. When you run Data Pump Import and specify the SQLFILE parameter, a text file is generated that has the necessary DDL (Data Definition Language) in it to recreate all object types, not just tables and indexes. Although this output file is ready for execution, the DDL statements are not actually executed, so the target system will not be changed. SQLFILEs can be particularly useful when pre-creating tables and objects
in a new database. Note that the INCLUDE and EXCLUDE parameters can be used for tailoring sqlfile output. For example, if you want to create a database that contains all the tables and indexes of the source database, but that does not include the same constraints, grants, and other metadata, you would issue a command as follows:

>impdp username/password DIRECTORY=dpumpdir1 DUMPFILE=expfull.dmp SQLFILE=dpump_dir2:expfull.sql INCLUDE=TABLE,INDEX

The SQL file named expfull.sql is written to dpump_dir2 and would include SQL DDL that could be executed in another database to create the tables and indexes as desired.

Interested in learning more about Data Pump? Then go through these two topics:

1. Data Pump Export
2. Data Pump Import

ORACLE DATA PUMP Part #2

Export of an entire database to a dump file with all GRANTS, INDEXES, and data

> exp username/password FULL=y FILE=dba.dmp GRANTS=y INDEXES=y ROWS=y

> expdp username/password FULL=y INCLUDE=GRANT INCLUDE=INDEX DIRECTORY=dpump_dir1 DUMPFILE=dba.dmp CONTENT=ALL

Data Pump offers much greater metadata filtering than original Export and Import. The INCLUDE parameter allows you to specify which object (and its dependent objects) you want to keep in the export job. The EXCLUDE parameter allows you to specify which object (and its dependent objects) you want to keep out of the export job. You cannot mix the two parameters in one job. Both parameters work with Data Pump Import as well, and you can use different INCLUDE and EXCLUDE options for different operations on the same dump file.

Tuning Parameters

Unlike original Export and Import, which used the BUFFER, COMMIT, COMPRESS, CONSISTENT, DIRECT, and RECORDLENGTH parameters, Data Pump needs no tuning to achieve maximum performance. Data Pump chooses the best method to ensure that data and metadata are exported and imported in the most efficient manner. The default initialization parameters set at installation are usually sufficient.

Moving data between versions

The Data Pump method for moving data between different database versions is different from the method used by original Export and Import. With original Export, you had to run an older version of Export to produce a dump file that was compatible with an older database version. With Data Pump, you use the current Export version and simply use the VERSION parameter to specify the target database version. You cannot specify versions earlier than Oracle Database 10g (since Data Pump did not exist before 10g).

> expdp username/password TABLES=hr.employees VERSION=10.1 DIRECTORY =dpump_dir1 DUMPFILE=emp.dmp

Data Pump Import can always read dump file sets created by older versions of Data Pump Export. Note that Data Pump Import cannot read dump files produced by original Export.


Maximizing the Power of Oracle Data Pump

Data Pump works great with default parameters, but once you are comfortable with Data Pump, there are new capabilities that you will want to explore.

Parallelism

Data Pump Export and Import operations are processed in the database as a Data Pump job, which is much more efficient than the client-side execution of original Export and Import. Data Pump operations can take advantage of the server’s parallel processes to read or write multiple data streams simultaneously (PARALLEL is available only in the Enterprise Edition of Oracle Database). The number of parallel processes can be changed on the fly using Data Pump’s interactive command-line mode. You may have a certain number of processes running during the day and decide to change that number if more system resources become available at night (or vice versa).

For best performance, you should do the following:
  1. Make sure your system is well balanced across CPU, memory, and I/O.
  2. Have at least one dump file for each degree of parallelism. If there aren’t enough dump files, performance will not be optimal because multiple threads of execution will be trying to access the same dump file.
  3. Put files that are members of a dump file set on separate disks so that they will be written and read in parallel.
  4. For export operations, use the %U variable in the DUMPFILE parameter so multiple dump files can be automatically generated.
> expdp username/password DIRECTORY=dpump_dir1 JOB_NAME=hr DUMPFILE=par_exp%u.dmp PARALLEL=4

REMAP

REMAP_TABLESPACE – This allows you to easily import a table into a tablespace different from the one it was originally exported from. Both databases must be version 10.1 or later.

> impdp username/password REMAP_TABLESPACE=tbs_1:tbs_6 DIRECTORY =dpumpdir1 DUMPFILE=employees.dmp

REMAP_DATAFILE – This is a very useful feature when you move databases between platforms that have different file naming conventions. This parameter changes the source datafile name to the target datafile name in all SQL statements where the source datafile is referenced. Because the REMAP_DATAFILE value uses quotation marks, it’s best to specify the parameter within a parameter file.

The parameter file, payroll.par, has the following content:

DIRECTORY=dpump_dir1
FULL=Y
DUMPFILE=db_full.dmp
REMAP_DATAFILE="'C:\DB1\HRDATA\PAYROLL\tbs6.dbf':'/db1/hrdata/payroll/tbs6.dbf'"

You can then issue the following command:

> impdp username/password PARFILE=payroll.par

Continued in ORACLE DATA PUMP Part #3

ORACLE DATA PUMP Part #1

Oracle Data Pump is the replacement for the original Export and Import utilities. Available starting in Oracle Database 10g, Oracle Data Pump enables very high-speed movement of data and metadata from one database to another. The Data Pump Export and Import utilities have a similar look and feel to the original utilities, but they are much more efficient and give you greater control and management of import and export jobs.

There are two new concepts in Oracle Data Pump that are different from original Export and Import.

1. Directory Objects: Data Pump differs from original Export and Import in that all jobs run primarily on the server using server processes. These server processes access files for the Data Pump jobs using directory objects that identify the location of the files. The directory objects enforce a security model that can be used by DBAs to control access to these files.

2. Interactive Command-Line Mode: Besides regular operating system command-line mode, there is now a very powerful interactive command-line mode which allows the user to monitor and control Data Pump Export and Import operations.

Creating Directory Objects

In order to use Data Pump, the database administrator must create a directory object and grant privileges to the user on that directory object. If a directory object is not specified, a default directory object called data_pump_dir is provided. The default data_pump_dir is available only to privileged users unless access is granted by the DBA.

The following SQL statement creates a directory object named dump_dir that is mapped to a directory located at /u01/dumps.

> sqlplus / as sysdba
SQL> CREATE DIRECTORY dump_dir AS '/u01/dumps/';

After a directory is created, you need to grant READ and WRITE permission on the directory to other users. For example, to allow the Oracle database to read and to write to files on behalf of user scott in the directory named by dump_dir, you must execute the following command:

SQL> GRANT READ,WRITE ON DIRECTORY dump_dir TO scott;

Note that READ or WRITE permission to a directory object means only that the Oracle database will read or write that file on your behalf. You are not given direct access to those files outside of the Oracle database unless you have the appropriate operating system privileges. Similarly, the Oracle database requires permission from the operating system to read and write files in the directories. Once the directory access is granted, the user scott can export his database objects with command arguments:

> expdp scott/password DIRECTORY=dump_dir DUMPFILE=scott.dmp

Comparison of command-line parameters from Original Export and Import to Data Pump

Data Pump commands have a similar look and feel to the original Export and Import commands, but are different. Below are a few examples that demonstrate some of these differences.

Import of tables from scott’s account to jim’s account

Original Import:
> imp username/password FILE=scott.dmp FROMUSER=scott TOUSER=jim TABLES=(*)
Data Pump Import:
> impdp username/password DIRECTORY=dpump_dir1 DUMPFILE=scott.dmp TABLES=scott.emp REMAP_SCHEMA=scott:jim

Continued in ORACLE DATA PUMP Part #2

Monday, May 19, 2008

Managing Indexes Part #3

After Managing Indexes Part #2

Creating an Index Associated with a Constraint

Oracle Database enforces a UNIQUE key or PRIMARY KEY integrity constraint on a table by creating a unique index on the unique key or primary key. This index is automatically created by the database when the constraint is enabled. No action is required by you when you issue the CREATE TABLE or ALTER TABLE statement to create the index, but you can optionally specify a USING INDEX clause to exercise control over its creation. This includes both when a constraint is defined and enabled, and when a defined but disabled constraint is enabled.

To enable a UNIQUE or PRIMARY KEY constraint, thus creating an associated index, the owner of the table must have a quota for the tablespace intended to contain the index, or the UNLIMITED TABLESPACE system privilege. The index associated with a constraint always takes the name of the constraint, unless you optionally specify otherwise.


Specifying Storage Options for an Index Associated with a Constraint


You can set the storage options for the indexes associated with UNIQUE and PRIMARY KEY constraints using the USING INDEX clause. The following CREATE TABLE statement enables a PRIMARY KEY constraint and specifies the storage options of the associated index:

CREATE TABLE emp ( empno NUMBER(5) PRIMARY KEY, age INTEGER)
ENABLE PRIMARY KEY USING INDEX TABLESPACE users PCTFREE 0;

Specifying the Index Associated with a Constraint

If you require more explicit control over the indexes associated with UNIQUE and PRIMARY KEY constraints, the database lets you:
  1. Specify an existing index that the database is to use to enforce the constraint
  2. Specify a CREATE INDEX statement that the database is to use to create the index and enforce the constraint
These options are specified using the USING INDEX clause. The following statements present some examples.

Example 1:

CREATE TABLE a (a1 INT PRIMARY KEY USING INDEX (create index ai on a (a1)));


Example 2:

CREATE TABLE b( b1 INT, b2 INT,CONSTRAINT bu1 UNIQUE (b1, b2)
USING INDEX (create unique index bi on b(b1, b2)), CONSTRAINT bu2 UNIQUE (b2, b1) USING INDEX bi);

Example 3:

CREATE TABLE c(c1 INT, c2 INT);
CREATE INDEX ci ON c (c1, c2);
ALTER TABLE c ADD CONSTRAINT cpk PRIMARY KEY (c1) USING INDEX ci;


If a single statement creates an index with one constraint and also uses that index for another constraint, the system will attempt to rearrange the clauses to create the index before reusing it.


Collecting Incidental Statistics when Creating an Index

Oracle Database provides you with the opportunity to collect statistics at very little resource cost during the creation or rebuilding of an index. These statistics are stored in the data dictionary for ongoing use by the optimizer in choosing a plan for the execution of SQL statements. The following statement computes index, table, and column statistics while building index emp_ename on column ename of table emp:

CREATE INDEX emp_ename ON emp(ename) COMPUTE STATISTICS;

Creating an Index Online

You can create and rebuild indexes online. This enables you to update base tables at the same time you are building or rebuilding indexes on that table. You can perform DML operations while the index build is taking place, but DDL operations are not allowed. Parallel execution is not supported when creating or rebuilding an index online.

The following statements illustrate online index build operations:

CREATE INDEX emp_name ON emp (mgr, emp1, emp2, emp3) ONLINE;


Creating a Function-Based Index

Function-based indexes facilitate queries that qualify a value returned by a function or expression. The value of the function or expression is precomputed and stored in the index.

To create a function-based index, you must have the COMPATIBLE parameter set to 8.1.0.0.0 or higher. In addition to the prerequisites for creating a conventional index, if the index is based on user-defined functions, then those functions must be marked DETERMINISTIC. Also, you must have the EXECUTE object privilege on any user-defined functions used in the function-based index if those functions are owned by another user.

Additionally, to use a function-based index:
  1. The table must be analyzed after the index is created.
  2. The query must be guaranteed not to need any NULL values from the indexed expression, since NULL values are not stored in indexes.

To illustrate a function-based index, consider the following statement that defines a function-based index (area_index) defined on the function area(geo):

CREATE INDEX area_index ON rivers (area(geo));


In the following SQL statement, when area(geo) is referenced in the WHERE clause, the optimizer considers using the index area_index.

SELECT id, geo, area(geo), desc FROM rivers WHERE Area(geo) >5000;

Table owners should have EXECUTE privileges on the functions used in function-based indexes.

Because a function-based index depends upon any function it is using, it can be invalidated when a function changes. If the function is valid, you can use an ALTER INDEX ... ENABLE statement to enable a function-based index that has been disabled. The ALTER INDEX ... DISABLE statement lets you disable the use of a function-based index. Consider doing this if you are working on the body of the function.
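For example, the area_index created above could be disabled while the area function is being changed, and re-enabled afterwards:

ALTER INDEX area_index DISABLE;
-- modify and recompile the area function here
ALTER INDEX area_index ENABLE;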

Altering Indexes

To alter an index, your schema must contain the index or you must have the ALTER ANY INDEX system privilege. Among the actions allowed by the ALTER INDEX statement are:
  1. Rebuild or coalesce an existing index
  2. Deallocate unused space or allocate a new extent
  3. Specify parallel execution (or not) and alter the degree of parallelism
  4. Alter storage parameters or physical attributes
  5. Specify LOGGING or NOLOGGING
  6. Enable or disable key compression
  7. Mark the index unusable
  8. Start or stop the monitoring of index usage

You cannot alter index column structure.
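A few of the operations listed above, shown against the emp_ename index used later in this post (a rough sketch; the tablespace name is only illustrative):

ALTER INDEX emp_ename REBUILD TABLESPACE users;
ALTER INDEX emp_ename COALESCE;
ALTER INDEX emp_ename MONITORING USAGE;
ALTER INDEX emp_ename UNUSABLE;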

END

Managing Indexes Part #2

Continued after Managing Indexes Part #1

Order Index Columns for Performance


The order of columns in the CREATE INDEX statement can affect query performance. In general, specify the most frequently used columns first. If you create a single index across columns to speed up queries that access, for example, col1, col2, and col3, then queries that access just col1, or just col1 and col2, are also sped up. But a query that accesses just col2, just col3, or just col2 and col3 does not use the index.

Limit the Number of Indexes for Each Table

A table can have any number of indexes. However, the more indexes there are, the more overhead is incurred as the table is modified. Specifically, when rows are inserted or deleted, all indexes on the table must be updated as well. Also, when a column is updated, all indexes that contain the column must be updated.

Thus, there is a trade-off between the speed of retrieving data from a table and the speed of updating the table. For example, if a table is primarily read-only, having more indexes can be useful; but if a table is heavily updated, having fewer indexes could be preferable.

Drop Indexes That Are No Longer Required

Consider dropping an index if:
  1. It does not speed up queries. The table could be very small, or there could be many rows in the table but very few index entries.
  2. The queries in your applications do not use the index.
  3. The index must be dropped before being rebuilt.

Consider Parallelizing Index Creation


You can parallelize index creation, much the same as you can parallelize table creation. Because multiple processes work together to create the index, the database can create the index more quickly than if a single server process created the index sequentially.

When creating an index in parallel, storage parameters are used separately by each query server process. Therefore, an index created with an INITIAL value of 5M and a parallel degree of 12 consumes at least 60M of storage during index creation.
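A minimal sketch of a parallel index build (the degree of 4 and the INITIAL value are arbitrary; remember that each query server process uses the storage parameters separately):

CREATE INDEX emp_ename ON emp(ename)
TABLESPACE users
STORAGE (INITIAL 5M)
PARALLEL 4;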

Consider Creating Indexes with NOLOGGING

You can create an index and generate minimal redo log records by specifying NOLOGGING in the CREATE INDEX statement. Creating an index with NOLOGGING has the following benefits:
  1. Space is saved in the redo log files.
  2. The time it takes to create the index is decreased.
  3. Performance improves for parallel creation of large indexes.

In general, the relative performance improvement is greater for larger indexes created with NOLOGGING than for smaller ones. Creating small indexes with NOLOGGING has little effect on the time it takes to create an index. However, for larger indexes the performance improvement can be significant, especially when you are also parallelizing the index creation.
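For example, a large index could be built with NOLOGGING and parallelism combined (the index and column names are only illustrative):

CREATE INDEX emp_mgr_idx ON emp(mgr)
NOLOGGING
PARALLEL 4;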

Creating an Index Explicitly

You can create indexes explicitly (outside of integrity constraints) using the SQL statement CREATE INDEX. The following statement creates an index named emp_ename for the ename column of the emp table:

CREATE INDEX emp_ename ON emp(ename) TABLESPACE users
STORAGE (INITIAL 20K NEXT 20k PCTINCREASE 75) PCTFREE 0;

Notice that several storage settings and a tablespace are explicitly specified for the index. If you do not specify storage options (such as INITIAL and NEXT) for an index, the default storage options of the default or specified tablespace are automatically used.

Creating a Unique Index Explicitly

Indexes can be unique or nonunique. Unique indexes guarantee that no two rows of a table have duplicate values in the key column (or columns). Nonunique indexes do not impose this restriction on the column values. Use the CREATE UNIQUE INDEX statement to create a unique index. The following example creates a unique index:

CREATE UNIQUE INDEX dept_unique_index ON dept (dname)
TABLESPACE indx;

Alternatively, you can define UNIQUE integrity constraints on the desired columns. The database enforces UNIQUE integrity constraints by automatically defining a unique index on the unique key. This is discussed in the following section. However, it is advisable that any index that exists for query performance, including unique indexes, be created explicitly.

Discussion continued Managing Indexes Part #3

Managing Indexes Part #1

Indexes are optional structures associated with tables and clusters that allow SQL statements to execute more quickly against a table. An Oracle Database index provides a faster access path to table data. You can use indexes without rewriting any queries. Your results are the same, but the execution time decreases.

Oracle Database provides several indexing schemes that provide complementary performance functionality. These are:
  1. B-tree indexes: the default and the most common
  2. B-tree cluster indexes: defined specifically for a cluster
  3. Hash cluster indexes: defined specifically for a hash cluster
  4. Global and local indexes: relate to partitioned tables and indexes
  5. Reverse key indexes: most useful for Oracle Real Application Clusters applications
  6. Bitmap indexes: compact; work best for columns with a small set of values
  7. Function-based indexes: contain the precomputed value of a function/expression
  8. Domain indexes: specific to an application or cartridge.

Indexes are logically and physically independent of the data in the associated table. Being independent structures, they require storage space. You can create or drop an index without affecting the base tables, database, applications, or other indexes. The database automatically maintains indexes when you insert, update, and delete rows of the associated table. If you drop an index, all applications continue to work. However, access to previously indexed data might be slower.

Guidelines for Managing Indexes


Create Indexes After Inserting Table Data

Data is often inserted or loaded into a table using either SQL*Loader or an import utility. It is more efficient to create an index for a table after inserting or loading the data. If you create one or more indexes before loading data, the database must then update every index as each row is inserted.

Creating an index on a table that already has data requires sort space. Some sort space comes from memory allocated for the index creator; the amount for each user is determined by the initialization parameter SORT_AREA_SIZE. The database also swaps sort information to and from temporary segments that are allocated only during the index creation in the user's temporary tablespace.


Index the Correct Tables and Columns

Use the following guidelines for determining when to create an index:
  1. Create an index if you frequently want to retrieve less than 15% of the rows in a large table. The percentage varies greatly according to the relative speed of a table scan and how clustered the row data is in relation to the index key. The faster the table scan, the lower the percentage; the more clustered the row data, the higher the percentage.
  2. To improve performance on joins of multiple tables, index columns used for joins.
  3. Primary and unique keys automatically have indexes, but you might want to create an index on a foreign key.
  4. Small tables do not require indexes. If a query is taking too long, then the table might have grown from small to large.

Some columns are strong candidates for indexing. Columns with one or more of the following characteristics are candidates for indexing:
  1. Values are relatively unique in the column.
  2. There is a wide range of values (good for regular indexes).
  3. There is a small range of values (good for bitmap indexes).
  4. The column contains many nulls, but queries often select all rows having a value. In this case, the phrase WHERE COL_X > -9.99 * power(10,125) is preferable to WHERE COL_X IS NOT NULL, because the first form can use an index on COL_X (assuming that COL_X is a numeric column).

Columns with the following characteristics are less suitable for indexing:
  1. There are many nulls in the column and you do not search on the not null values.
  2. LONG and LONG RAW columns cannot be indexed.

The size of a single index entry cannot exceed roughly one-half (minus some overhead) of the available space in the data block.

Discussion Continued in Managing Indexes Part #2

Oracle Tips And Tricks Part#1

Tips 4: Join small tables first.

Suppose we have three tables A, B, and C that can be joined to each other in any combination (i.e. A join B join C, C join A join B, etc.).

A has 10 columns and 100 rows, that is 10*100 = 1,000 cells;
B has 5 columns and 2,000 rows, that is 5*2,000 = 10,000 cells;
and C has 20 columns and 30 rows, that is 20*30 = 600 cells.

Now, when you apply a LEFT JOIN or RIGHT JOIN on those tables, join the smallest table first for optimum performance (where the case allows it; in our example A, B, and C can be joined in any combination). That means:

SELECT ..... FROM C LEFT JOIN A ON (...) LEFT JOIN B ON (...);


Tips 3: Do not use the LONG datatype.


The LONG datatype is used to store variable-length character strings.
Oracle has advised since 8i that the LONG datatype no longer be used; it remains in the datatype set only for backward compatibility. The limitations of the LONG datatype are:
  1. There is a limit of one column of type LONG per table.
  2. Tables containing a LONG column cannot be partitioned.
  3. LONG datatypes cannot be used in subqueries.
  4. Few functions work with LONG data.
Instead of LONG, use CLOB. The CLOB datatype is the preferred datatype for character data larger than VARCHAR2.
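A minimal sketch of the switch (the table and column names are only illustrative):

-- instead of: CREATE TABLE docs (id NUMBER, body LONG);
CREATE TABLE docs (
id NUMBER,
body CLOB
);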

Tips 2: Place a column (or columns) last if it may contain NULL.

Suppose we have a table defined as student (id, name NOT NULL, address1, address2, address3, age NOT NULL). Here there is a possibility of NULLs in three columns (address1, address2, address3). For a row with NULL in all three of those columns, 3 bytes are consumed to store the NULLs, because each NULL column that is followed by a non-NULL column costs one length byte. If you rearrange the columns as student (id, name NOT NULL, age NOT NULL, address1, address2, address3), a row with NULL in those three columns consumes no space for them, because trailing NULL columns at the end of a row require no storage. If, however, address3 does contain data, then 2 bytes are again needed to store the NULL information for address1 and address2.

So place the columns in such a way that the columns likely to be NULL sit at the end of the row to get this space benefit, as sketched below. One byte may seem too small to matter, but on a large table with 30 columns (10 of them nullable) and 400,000 (4 lakh) rows, this trick can be quite handy.
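A sketch of the two column orders discussed above (the datatypes are only illustrative):

-- nullable columns in the middle: NULLs stored before age each cost a length byte
CREATE TABLE student_bad (
id NUMBER,
name VARCHAR2(100) NOT NULL,
address1 VARCHAR2(200),
address2 VARCHAR2(200),
address3 VARCHAR2(200),
age NUMBER NOT NULL
);

-- nullable columns last: trailing NULLs require no storage
CREATE TABLE student_good (
id NUMBER,
name VARCHAR2(100) NOT NULL,
age NUMBER NOT NULL,
address1 VARCHAR2(200),
address2 VARCHAR2(200),
address3 VARCHAR2(200)
);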


Tips 1 : Create index on FOREIGN KEY columns


Suppose we have two Tables:
1. branch(id,name,Address,total_employee).
2. employee(id,name,branch_id,sal).

Here employee.branch_id is a foreign key (referencing branch.id). Now suppose each of the 50 branches has 20 employees and another 10 employees are added to each branch (already inserted into the employee table), so we need to update branch.total_employee from 20 to 30. If there is no index on the employee.branch_id column, the update on branch will lock all 50*30 = 1,500 rows of the employee table, and all DML operations on the employee table must wait until the DML operation on the branch table completes. But if there is an index on the employee.branch_id column, the update on branch locks only 1*30 = 30 rows of the employee table at a time, as shown below.

For a small database the wait may not bother you, but in a large database it will hurt significantly.
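Based on the branch/employee example above, the fix is a single index on the foreign key column (the index name is only illustrative):

CREATE INDEX employee_branch_id_idx ON employee(branch_id);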






Thursday, May 15, 2008

High-water mark (HWM)

The high-water mark is an indicator or pointer up to which a table or index segment has ever contained data. Suppose we create an empty table; the high-water mark is at the beginning of the table segment.



[Figure: unused blocks or extents occupied by the table segment on initial creation.]


After inserting data into the table segment, the HWM moves to point to the position up to which data exists in the segment:

[Figure: DATA blocks up to the HWM, followed by unused blocks.]

By inserting more data into the table segment, the HWM moves further, again pointing to the position up to which data exists in the segment:

[Figure: more DATA blocks up to a higher HWM, followed by unused blocks.]

Now let us delete the data and see where the HWM points:

[Figure: after the delete, the HWM stays in place; the blocks below it are now empty, with unused blocks above.]


Full Table Scan

As you can see above, deleting the data does not move the HWM. The main disadvantage of this is that Oracle always reads the blocks up to the high-water mark during a full table scan. You may have noticed that a count(*) on an empty table can take time to return 0 rows; the reason for the delay is an HWM set at a high position.


Now the question arises: how do we set the high-water mark at a lower position?

One way to reset the HWM is to truncate the table (but all rows will be deleted!). Let us see how truncate resets the HWM.

[Figure: no data in the segment.]


The HWM is reset now, after truncating the data.

Another way is SEGMENT SHRINKING (which retains all the data). To see how it works, please go through the article on Oracle tuning on this blog.

A table where lots of deletions or insertions take place probably has a high HWM. If the HWM is high, it is better to rebuild the table segment for performance's sake.


Calculate the HWM percentage as follows:

HWM % = ((HWM blocks - actual data blocks) / HWM blocks) * 100
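A rough way to obtain the two block counts for a table, assuming statistics have been gathered (DBMS_ROWID is a standard supplied package):

-- blocks below the HWM, as recorded in the data dictionary
SELECT blocks FROM user_tables WHERE table_name = 'EMP';

-- blocks that actually contain rows
SELECT COUNT(DISTINCT DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)) AS data_blocks FROM emp;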



Sunday, May 11, 2008

Place multiple rows in a single column

Sometimes we need to place data that is spread across multiple rows into a single column. Suppose we have a table EMP which contains the following data:

SQL> select JOB,ENAME from emp order by job;

JOB ENAME
--------- ----------
ANALYST SCOTT
ANALYST FORD
CLERK MILLER
CLERK JAMES
CLERK SMITH
CLERK ADAMS
MANAGER BLAKE
MANAGER JONES
MANAGER CLARK
PRESIDENT KING
SALESMAN TURNER
SALESMAN MARTIN
SALESMAN WARD
SALESMAN ALLEN

14 row selected

But we need the output of the SQL query in the following format:

JOB ENAME
--------- ----------
ANALYST SCOTT, FORD
CLERK MILLER, JAMES, SMITH, ADAMS
MANAGER BLAKE, JONES, CLARK
PRESIDENT KING
SALESMAN TURNER, MARTIN, WARD, ALLEN

To obtain such formatted output, we can rewrite the SQL query as follows:

SQL> select JOB, REPLACE( REPLACE( XMLAGG
(XMLELEMENT("#",ENAME)).GETSTRINGVAL(),'<#>',' '),'</#>',', ' ) "ENAME" from emp group by job order by job;

JOB ENAME
--- ----------------------------------
ANALYST SCOTT, FORD,
CLERK SMITH, JAMES, ADAMS, MILLER,
MANAGER JONES, CLARK, BLAKE,
PRESIDENT KING,
SALESMAN ALLEN, WARD, TURNER, MARTIN,

Thursday, May 8, 2008

External Tables

External tables allow Oracle to query data that is stored outside the database in flat files. The ORACLE_LOADER driver can be used to access any data stored in any format that can be loaded by SQL*Loader. No DML can be performed on external tables but they can be used for query, join and sort operations. Views and synonyms can be created against external tables. They are useful in the ETL process of data warehouses since the data doesn't need to be staged and can be queried in parallel. They should not be used for frequently queried tables.

First create a sample text file named test.txt which contains

CHINA,CHI,
CANADA,CA,
BANGLADESH,BD,
INDIA,IND,
AUSTRALIA,AUS,

Next create a directory object which points to the location of the files:

> conn / as sysdba;

SQL>CREATE OR REPLACE DIRECTORY DUMP_DIR AS 'C:\temp\';
SQL> GRANT READ,WRITE ON DIRECTORY DUMP_DIR TO scott;

Next create the metadata for the external table using the CREATE TABLE ... ORGANIZATION EXTERNAL syntax:


SQL> CREATE TABLE countries(
country_name VARCHAR2(50),
country_code VARCHAR2(5)
)
ORGANIZATION EXTERNAL (
TYPE ORACLE_LOADER
DEFAULT DIRECTORY dump_dir
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
FIELDS TERMINATED BY ','
MISSING FIELD VALUES ARE NULL
(
country_name CHAR(50),
country_code CHAR(5)
)
)
LOCATION ('test.txt')
)
PARALLEL 5
REJECT LIMIT UNLIMITED;

Table created.


SQL> select * from countries;

COUNTRY_NAME COUNT
-------------------------------------------------- -----
CHINA CHI
CANADA CA
BANGLADESH BD
INDIA IND
AUSTRALIA AUS

SQL>

To drop the table:

SQL> drop table countries;

Wednesday, May 7, 2008

Online Segment Shrink


Based on the recommendations from the segment advisor you can recover space from specific objects using one of the variations of the ALTER TABLE ... SHRINK SPACE command:

-- Enable row movement.
ALTER TABLE scott.emp ENABLE ROW MOVEMENT;

-- Recover space and amend the high water mark (HWM).
ALTER TABLE scott.emp SHRINK SPACE;

-- Recover space, but don't amend the high water mark (HWM).
ALTER TABLE scott.emp SHRINK SPACE COMPACT;

-- Recover space for the object and all dependant objects.
ALTER TABLE scott.emp SHRINK SPACE CASCADE;

The shrink is accomplished by moving rows between blocks, hence the requirement for row movement to be enabled before the shrink can take place. This can cause problems with ROWID-based triggers. The shrink process is only available for objects in tablespaces with automatic segment space management enabled.

The COMPACT option allows the shrink operation to be broken into two stages. First the rows are moved using the COMPACT option, but the HWM is not adjusted, so no parsed SQL statements are invalidated. The HWM can be adjusted at a later date by reissuing the statement without the COMPACT option; at that point any dependent SQL statements will need to be reparsed.

Monday, May 5, 2008

Bind variables


If you've been developing applications on Oracle for a while, you've no doubt come across the concept of «Bind Variables». Bind variables are one of those Oracle concepts that experts frequently cite as being key to application performance, but it's often not all that easy to pin down exactly what they are and how you need to alter your programming style to use them.

To understand bind variables, consider an application that generates thousands of SELECT statements against a table; for example:

SELECT fname, lname, pcode FROM cust WHERE id = 674;
SELECT fname, lname, pcode FROM cust WHERE id = 234;
SELECT fname, lname, pcode FROM cust WHERE id = 332;

Each time the query is submitted, Oracle first checks in the shared pool to see whether this statement has been submitted before. If it has, the execution plan that this statement previously used is retrieved, and the SQL is executed. If the statement cannot be found in the shared pool, Oracle has to go through the process of parsing the statement, working out the various execution paths and coming up with an optimal access plan before it can be executed. This process is known as a «hard parse», and for OLTP applications it can actually take longer to carry out than the DML instruction itself.

When looking for a matching statement in the shared pool, only statements that exactly match the text of the statements are considered; so, if every SQL statement you submit is unique (in that the predicate changes each time, from id = 674 to id = 234 and so on) then you'll never get a match, and every statement you submit will need to be hard parsed. Hard parsing is very CPU intensive, and involves obtaining latches on key shared memory areas, which, whilst it might not affect a single program running against a small set of data, can bring a multi-user system to its knees if hundreds of copies of the program are trying to hard parse statements at the same time. The extra bonus with this problem is that contention caused by hard parsing is pretty much immune to measures such as increasing available memory, numbers of processors and so on, as hard parsing statements is one thing Oracle can't do concurrently with many other operations, and it's a problem that often only comes to light when trying to scale up a development system from a single user working on a subset of records to many hundreds of users working on a full data set.

The way to get Oracle to reuse the execution plans for these statements is to use bind variables. Bind variables are «substitution» variables that are used in place of literals (such as 674, 234, 332) and that have the effect of sending exactly the same SQL to Oracle every time the query is executed. For example, in our application, we would just submit

SELECT fname, lname, pcode FROM cust WHERE id = :cust_no;

and this time we would be able to reuse the execution plan every time, reducing the latch activity in the SGA, and therefore the total CPU activity, which has the effect of allowing our application to scale up to many users on a large dataset.

Bind Variables in SQL*Plus

In SQL*Plus you can use bind variables as follows:

SQL> variable deptno number
SQL> exec :deptno := 10
SQL> select * from emp where deptno = :deptno;

What we've done to the SELECT statement now is take the literal value out of it and replace it with a placeholder (our bind variable), with SQL*Plus passing the value of the bind variable to Oracle when the statement is processed. This bit is fairly straightforward (you declare a bind variable in SQL*Plus, then reference the bind variable in the SELECT statement).

Bind Variables in PL/SQL

Taking PL/SQL first of all, the good news is that PL/SQL itself takes care of most of the issues to do with bind variables, to the point where most code that you write already uses bind variables without you knowing. Take, for example, the following bit of PL/SQL:

create or replace procedure dsal(p_empno in number)
as
begin
update emp
set sal=sal*2
where empno = p_empno;
commit;
end;
/

Now you might be thinking that you've got to replace the p_empno with a bind variable. However, the good news is that every reference to a PL/SQL variable is in fact a bind variable.

Dynamic SQL

In fact, the only time you need to consciously decide to use bind variables when working with PL/SQL is when using Dynamic SQL.

Dynamic SQL allows you to execute a string containing SQL using the EXECUTE IMMEDIATE command. For example, the next example would always require a hard parse when it is submitted:

create or replace procedure dsal(p_empno in number)
as
begin
execute immediate
'update emp set sal = sal*2 where empno = '||p_empno;
commit;
end;
/

The way to use bind variables instead is to change the EXECUTE IMMEDIATE command as follows:

create or replace procedure dsal(p_empno in number)
as
begin
execute immediate
'update emp set
sal = sal*2 where empno = :x' using p_empno;
commit;
end;
/

And that's all there is to it. One thing to bear in mind, though, is that you can't substitute actual object names (tables, views, columns etc.) with bind variables - you can only substitute literals. If the object name is generated at runtime, you'll still need to string-concatenate those parts, and the SQL will only match statements already in the shared pool when the same object name comes up. However, whenever you're using dynamic SQL to build up the predicate part of a statement, use bind variables instead and you'll dramatically reduce the amount of latch contention going on.

The Performance Killer

Just to give you a tiny idea of how huge of a difference this can make performance wise, you only need to run a very small test:

Here is the Performance Killer ....

SQL> alter system flush shared_pool;
SQL> set serveroutput on;

declare
type rc is ref cursor;
l_rc rc;
l_dummy all_objects.object_name%type;
l_start number default dbms_utility.get_time;
begin
for i in 1 .. 1000
loop
open l_rc for
'select object_name
from all_objects
where object_id = ' || i;
fetch l_rc into l_dummy;
close l_rc;
-- dbms_output.put_line(l_dummy);
end loop;
dbms_output.put_line
(round((dbms_utility.get_time-l_start)/100, 2) ||
' Seconds...' );
end;
/
101.71 Seconds...

... and here is the Performance Winner:

declare
type rc is ref cursor;
l_rc rc;
l_dummy all_objects.object_name%type;
l_start number default dbms_utility.get_time;
begin
for i in 1 .. 1000
loop
open l_rc for
'select object_name
from all_objects
where object_id = :x'
using i;
fetch l_rc into l_dummy;
close l_rc;
-- dbms_output.put_line(l_dummy);
end loop;
dbms_output.put_line
(round((dbms_utility.get_time-l_start)/100, 2) ||
' Seconds...' );
end;
/
1.9 Seconds...

That is pretty dramatic. The fact is that not only does this execute much faster (we spent more time PARSING our queries than actually EXECUTING them!), it will let more users use your system simultaneously.

Bind Variables in VB, Java and other applications

The next question is though, what about VB, Java and other applications that fire SQL queries against an Oracle database. How do these use bind variables? Do you have to in fact split your SQL into two statements, one to set the bind variable, and one for the statement itself?

In fact, the answer to this is actually quite simple. When you put together an SQL statement using Java, or VB, or whatever, you usually use an API for accessing the database; ADO in the case of VB, JDBC in the case of Java. All of these APIs have built-in support for bind variables, and it's just a case of using this support rather than just concatenating a string yourself and submitting it to the database.

For example, Java has PreparedStatement, which allows the use of bind variables, and Statement, which uses the string concatenation approach. If you use the method that supports bind variables, the API itself passes the bind variable value to Oracle at runtime, and you just submit your SQL statement as normal. There's no need to separately pass the bind variable value to Oracle, and actually no additional work on your part. Support for bind variables isn't just limited to Oracle - it's common to other RDBMS platforms such as Microsoft SQL Server, so there's no excuse for not using them just because they might be an Oracle-only feature.

Related Topics:

  1. Value of a bind variable

Sunday, May 4, 2008

Oracle Session Resource Profiler

Introduction

In short, OraSRP is a profiler. It parses extended SQL trace files and creates a report that allows you to learn where your session spent its time and why. OraSRP may generate reports either in HTML or in text format.

Features

  1. Builds complete and detailed session profile (which includes such pseudo-events like 'PARSE/EXEC/FETCH calls' and 'unaccounted-for time').
  2. Display results either in html or text format.
  3. Display graph of statements calls.
  4. Display events histograms.
  5. Display various session statistics like 'blocks read', 'datafiles read', 'parallel execution' etc.
  6. Group similar statements' statistics into one.
  7. Display values of bind variables.
  8. Able to parse trace files from Oracle versions 7.2 up to 11g.

Installation

OraSRP comes in several forms:
  1. Installer for Windows (orasrp-setup.exe). Just run it and follow usual instructions.
  2. Binary for Windows (orasrp-windows.zip). Just unzip anywhere you want.
  3. Binary for Linux (orasrp-linux.zip). Just unzip anywhere you want.

Usage

OraSRP is a command-line utility. Pass the trace file name as the first parameter and the output report name as the second, like this:

$ orasrp trace.trc report.html

That's it.

Command-line options

Brief explanation:

$ orasrp -h
usage: orasrp [options] file [outfile]

options:
-h, --help show this help message and exit
--version show program's version number and exit
-t, --text output in text format
--aggregate=YES/no aggregate similar statements
-bn, --binds=n how many unique bind-sets to display (1 by defaut)
--maxbinds=n how many unique bind-sets to analyze (all by default)
--recognize-idle-events=YES/no recognize or not idle wait events
--sys=YES/no print sys statements
--sort how to sort statements in output (values like in tkprof)
--display-sections what sections of report to display (all by default)
--skip-sections what sections of report to skip
--sessionid analyze data only for the specified session id
--threshold omit statements which spend less than threshold % from total time (not used by defaut)
--google-charts display charts using Google Chart API

Downloads

Current version is 3.2.0 beta released 20 Dec 2007.


Samples

  • Sample One (excessive parsing; demonstration of grouping similar statements' statistics)
  • Sample Two (slow delete because of row lock contention)
  • Sample Three (text output; display values of bind variables)
To find the location of the trace file:

-bash-3.00$ sqlplus / as sysdba

SQL> show parameter dump_dest

NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
background_dump_dest string /oracle/app/oracle/product/10.
2.0/db_1/admin/bdump
core_dump_dest string /backup2/Core_Dump
user_dump_dest string /backup2/User_dump

So the trace files are stored in /backup2/User_dump. There you will find a lot of trace files; to identify which trace file was created for the current session:

select d.value || '/ora_' || p.spid || '.trc' trace_file_name
from
( select p.spid from sys.v_$mystat m,
sys.v_$session s, sys.v_$process p
where m.statistic# = 1 and
s.sid = m.sid and p.addr = s.paddr
) p,
( select value from sys.v_$parameter
where name = 'user_dump_dest'
) d;

To Enable SQL TRACE:

alter session set sql_trace=true;
alter system set sql_trace=true scope=spfile;

Thursday, May 1, 2008

Eliminate FULL TABLE ACCESS

In SQL query tuning, one of the major goals is to eliminate FULL TABLE ACCESS. Using an index we can achieve this goal. Let's see how it can be achieved.

Suppose we have the following table

SQL> desc emp;

Name Type
---------- ----------------------------
EMPNO NUMBER(4)
ENAME VARCHAR2(10)
JOB VARCHAR2(9)
MGR NUMBER(4)
HIREDATE DATE
SAL NUMBER(7,2)
COMM NUMBER(7,2)
DEPTNO NUMBER(2)

Now enable the autotrace option to see the SQL query execution plan.

SQL> set autotrace on;
SQL>
SQL> select * from emp;

EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
7369 SMITH CLERK 7902 17-DEC-80 800 20
7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30
7521 WARD SALESMAN 7698 22-FEB-81 1250 500 30
7566 JONES MANAGER 7839 02-APR-81 2975 20
7654 MARTIN SALESMAN 7698 28-SEP-81 1250 1400 30
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7782 CLARK MANAGER 7839 09-JUN-81 2450 10
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7839 KING PRESIDENT 17-NOV-81 5000 10
7844 TURNER SALESMAN 7698 08-SEP-81 1500 0 30
7876 ADAMS CLERK 7788 23-MAY-87 1100 20
7900 JAMES CLERK 7698 03-DEC-81 950 30
7902 FORD ANALYST 7566 03-DEC-81 3000 20
7934 MILLER CLERK 7782 23-JAN-82 1300 10

14 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3956160932
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 1218 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| EMP | 14 | 1218 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------

Update some rows for experiment purposes.

SQL> update emp set sal=500 where sal<2000;
8 row updated.
SQL> commit;
Commit complete.

SQL> create unique index empno_emp on emp(empno);
Index created.
SQL> select empno from emp where empno>0;

EMPNO
----------
7369
7499
7521
7566
7654
7698
7782
7788
7839
7844
7876
7900
7902
7934

14 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3956160932

--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 14 | 182 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| EMP | 14 | 182 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------

The EMPNO_EMP index is not used because we access all 14 rows.


SQL> select empno from emp where empno<7700;

EMPNO
----------
7369
7499
7521
7566
7654
7698

6 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2992108249
------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6 | 78 | 1 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| EMPNO_EMP | 6 | 78 | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------

Here you can see the index is used because we access only a portion of the 14 rows.

SQL> select empno from emp where sal=500;

EMPNO
----------
7369
7499
7521
7654
7844
7876
7900
7934

8 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3956160932
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8 | 208 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| EMP | 8 | 208 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------

Again you can see the index is not used because the indexed column empno is not present in the WHERE clause.


To eliminate the full table scan of the previous SQL query, create an index on the sal column, which is in the WHERE clause.

SQL> create index sal_emp on emp(sal);

Index created.

SQL> select empno from emp where sal=500;

EMPNO
----------
7369
7499
7521
7654
7844
7876
7900
7934

8 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 216044341
---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8 | 208 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMP | 8 | 208 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | SAL_EMP | 8 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------

See! We are able to eliminate the full table scan.


Learning: if you have an index and the indexed column is not present in the WHERE clause, the index will not be used.

Let's go to our second experiment. For this, drop the previously built indexes.

SQL> drop index empno_emp;

Index dropped.

SQL> drop index sal_emp;

Index dropped.

Create a composite index

SQL> create index EMPNO_SAL_EMP on emp(empno,sal);
Index created.


SQL> select empno from emp where empno<7900;

EMPNO
----------
7369
7499
7521
7566
7654
7698
7782
7788
7839
7844
7876

11 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 615958546
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 11 | 143 | 1 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| EMPNO_SAL_EMP | 11 | 143 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------

SQL> select empno from emp where sal=500;

EMPNO
----------
7369
7499
7521
7654
7844
7876
7900
7934

8 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1761287244
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 8 | 208 | 1 (0)| 00:00:01 |
|* 1 | INDEX FULL SCAN | EMPNO_SAL_EMP | 8 | 208 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------

See, the EMPNO_SAL_EMP index experienced a FULL SCAN because the query has sal (the trailing part of the index) in the WHERE clause.


SQL> select ename,empno,deptno from emp where deptno=30 and sal=500;

ENAME EMPNO DEPTNO
---------- ---------- ----------
ALLEN 7499 30
WARD 7521 30
MARTIN 7654 30
TURNER 7844 30
JAMES 7900 30


Execution Plan
----------------------------------------------------------
Plan hash value: 3956160932
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 5 | 230 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| EMP | 5 | 230 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------

Again an interesting thing: now the table experienced a FULL SCAN because the SELECT clause includes a non-indexed column and only sal (the trailing part of the index) is in the WHERE clause.


SQL> select ename,deptno from emp where empno<7700;

ENAME DEPTNO
---------- ----------
SMITH 20
ALLEN 30
WARD 30
JONES 20
MARTIN 30
BLAKE 30

6 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1330938245
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6 | 198 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMP | 6 | 198 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | EMPNO_SAL_EMP | 6 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

This time empno (the leading part of the index) is present in the WHERE clause.

Learning: if you have a composite index, make sure all or most of its columns are present in the WHERE clause in the same order (suppose you have a 3-column index and only columns 2 and 3 are in the WHERE clause; the index is not used. You should have columns 1,2,3 or 1,2 or 1,3 in the WHERE clause).