KUTEN: 23 September 2007

This guide is part of a series of documents that was written to help to understand Oracle Data Pump and implement automated scripts using Oracle “expdp” utility on easy steps (then impdp), but not to duplicate official documentation, however I’ll use some “quotes and transcripts” of online Oracle books or Web pages. We’ll find those sources over the links section. Any comments, advice or errata could be send to: snevigat@yahoo.com

For a quick list of issued commands over the document, see at the end, the "Command list" section.

Description:

Oracle Data Pump is a feature of Oracle Database 10g/11g that enables very fast bulk data and metadata movement between Oracle databases. Oracle Data Pump provides new high-speed, parallel Export and Import utilities (expdp and impdp) as well as a Web-based Oracle Enterprise Manager interface. A new public interface package, DBMS_DATAPUMP, provides a server-side infrastructure for fast data and metadata movement.

Data Pump Export and Import utilities are typically much faster than the original Export and Import Utilities. A single thread of Data Pump Export is about twice as fast as original Export, while Data Pump Import is 15-45 times fast than original Import (we’ll see if true on “impdp” series).

Data Pump jobs can be restarted without loss of data, whether or not the stoppage was voluntary or involuntary.

Data Pump jobs support fine-grained object selection. Virtually any type of object can be included or excluded in a Data Pump job.

Data Pump supports the ability to load one instance directly from another (network import) and unload a remote instance (network export).

Versions: Oracle Database 10g, 11g

Data Pump is available on the Oracle Database 10g and11g Standard Edition, Enterprise Edition, and Personal Edition. However, the parallel capability is only available on Oracle10g, 11g Enterprise Edition. Data Pump is included on all the same platforms supported by Oracle 10g/11g, including Unix, Linux, Windows NT, Windows 2000, and Windows XP.

For more details please see: New Features in Oracle Database 11g Release 1 at the end of the document.

Key Features:

Migrating between platforms, moving data between development, test, and production databases, logical database backup, and for application deployment throughout a corporation.
Data Pump is installed automatically during database creation or database upgrade
Using the Direct Path method of unloading, a single stream of data unload is about 2 times faster than original Export because the Direct Path API has been modified to be even more efficient. Depending on the level of parallelism, the level of improvement can be much more. (more than that on a 115GB DB, 5HS conventional export to 1 ½ HS with expdp). We’ll take more accurate figures on other series.
We can dynamically throttle the number of threads of execution throughout the lifetime of the job. There is an interactive command mode where you can adjust the level of parallelism. So, for example, you can start up a job during the day with a PARALLEL=2, and then increase it at night to a higher level.
All the Oracle database data types are supported via Data Pump’s two data movement mechanisms, Direct Path and External Tables.
With Data Pump, there is much more flexibility in selecting objects for unload and load operations. You can now unload any subset of database objects (such as functions, packages, and procedures) and reload them on the target platform. Almost all database object types can be excluded or included in an operation using the new Exclude and Include parameters.
You can either use the Command line interface or the Oracle Enterprise Manager web-based GUI interface, interactive mode or hand made scripts.
Data Pump handles all the necessary compatibility issues between hardware platforms and operating systems.
Jobs can be monitored from any location. Clients may also detach from an executing job without affecting it. Data Pump job creates a Master Table in which the entire record of the job is maintained. The Master Table is the directory to the job, so if a job is stopped for any reason, it can be restarted at a later point in time, without losing any data.

Note:
It is not possible to start or restart Data Pump jobs on one instance in a Real Application Clusters (RAC) environment if there are Data Pump jobs currently running on other instances in the RAC environment. [This note added per mail from Steve Dipirro on 8/30/04.]

Data Pump Export and Import both support a network mode in which the job’s source is a remote Oracle instance. This is an overlap of unloading the data, using Export, and loading the data, using Import, so those processes don’t have to be serialized. A database link is used for the network. You don’t have to worry about allocating file space because there are no intermediate dump files.
Data Pump supports the Flashback infrastructure, so you can perform an export and get a dump file set that is consistent with a specified point in time or SCN.
When transporting a tablespace, Data Pump Export and Import are still used to handle the extraction and recreation of the metadata for that tablespace.

Data Pump Export Modes:

Full Export Mode

Restrictions

The following system schemas are not exported as part of a Full export because the metadata they contain is exported as part of other objects in the dump file set: SYS, ORDSYS, EXFSYS, MDSYS, DMSYS, CTXSYS, ORDPLUGINS, LBACSYS, XDB, SI_INFORMTN_SCHEMA, DIP, DBSNMP, and WMSYS.

Grants on objects owned by the SYS schema are never exported.

Schema Mode

Table Mode

Tablespace Mode

Transportable Tablespace Mode

The scripting process:

In this section I’ll explain the main command for a full dump or Full Export Mode (regardless the way we can script it) as in a basis:

*To see expdp help (like Oracle exp command) we have to pass help=yes (help=Y) on the command line.

Main line.

$ORACLE_HOME/bin/expdp user/password DIRECTORY=DATA_PUMP DUMPFILE=expdp_${ORACLE_SID}_`date '+%d%m%Y%HH'`_%U.dmp COMPRESSION=METADATA_ONLY FILESIZE=2147483648 FULL=Y LOGFILE=expFULL_EXPDP`date '+%d%m%Y%HH'`_$ORACLE_SID.log PARALLEL=2

Where $ORACLE_HOME and $ORACLE_SID must be defined previously.

“%U” on DUMPFILE name, makes a correlative series of dump’s file names. (Dump file template). Enough dump files are created to allow all processes specified by the current setting of the PARALLEL parameter to be active. If one of the dump files becomes full because its size has reached the maximum size specified by the FILESIZE parameter, it is closed, and a new dump file (with a new generated name) is created to take its place.

In order to create the proper environment to run the command a DBA, have to prepare the database.

Create the “directory” of DIRECTORY parameter, or “directory objects”.

Because Data Pump is server-based, rather than client-based, dump files, log files, and SQL files are accessed relative to server-based directory paths. Data Pump requires you to specify directory paths as directory objects. A directory object maps a name to a directory path on the file system.

For example, the following SQL statement creates a directory object named data_pump that is mapped to a directory located at /backup/INSTANCE/expdp.

SQL> CREATE OR REPLACE DIRECTORY data_pump as '/backup/INSTANCE/expdp';

The reason that a directory object is required is to ensure data security and integrity.
For example:

If you were allowed to specify a directory path location for an input file, you might be able to read data that the server has access to, but to which you should not.

If you were allowed to specify a directory path location for an output file, the server might overwrite a file that you might not normally have privileges to delete.

On Unix and Windows NT systems, a default directory object, DATA_PUMP_DIR, is created at database creation or whenever the database dictionary is upgraded. By default, it is available only to privileged users.

If you are not a privileged user, before you can run Data Pump Export or Data Pump Import, a directory object must be created by a database administrator (DBA) or by any user with the CREATE ANY DIRECTORY privilege.

After a directory is created, the user creating the directory object needs to grant READ or WRITE permission on the directory to other users. For example, to allow the Oracle database to read and write files on behalf of user hr in the directory named by data_pump, the DBA must execute the following command:

SQL> GRANT read,write ON DIRECTORY data_pump to USER;

Note that READ or WRITE permission to a directory object only means that the Oracle database will read or write that file on your behalf. You are not given direct access to those files outside of the Oracle database unless you have the appropriate operating system privileges. Similarly, the Oracle database requires permission from the operating system to read and write files in the directories.

Create the “directory” of Data Pump LOGS

SQL> CREATE OR REPLACE DIRECTORY data_pump_log as '/backup/INSTANCE/expdp';

Concede read, write permisions to the created directories.

SQL> GRANT read,write ON DIRECTORY data_pump_log to USER;

NOTE: The USER must have the EXP_FULL_DATABASE role enabled in order to make a FULL Data Pump export and quota on user's default_tablespace.

ie. (grant EXP_FULL_DATABASE to USER;) and (alter user "USER" quota unlimited on users;)

Data Pump job creates a Master Table in which the entire record of the job is maintained, that's why it's necesary to have write permissions on the user's default tablespace.

We can query directories and privileges:

SELECT privilege, directory_name
FROM user_tab_privs t, all_directories d
WHERE t.table_name(+)=d.directory_name

ORDER BY 2,1;

SELECT grantee, t.owner ,privilege, directory_name, directory_path
FROM user_tab_privs t, all_directories d
WHERE t.table_name(+)=d.directory_name
ORDER BY 2,1;

Data Pump Export and Import use the following order of precedence to determine a file's location:

1. If a directory object is specified as part of the file specification, then the location specified by that directory object is used. (The directory object must be separated from the filename by a colon.)

2. If a directory object is not specified for a file, then the directory object named by the DIRECTORY parameter is used.

3. If a directory object is not specified, and if no directory object was named by the DIRECTORY parameter, then the value of the environment variable, DATA_PUMP_DIR, is used. This environment variable is defined using operating system commands on the client system where the Data Pump Export and Import utilities are run. The value assigned to this client-based environment variable must be the name of a server-based directory object, which must first be created on the server system by a DBA.

For example, the following SQL statement creates a directory object on the server system. The name of the directory object is DUMP_FILES1, and it is located at '/usr/apps/dumpfiles1'.

SQL> CREATE DIRECTORY DUMP_FILES1 AS '/usr/apps/dumpfiles1';

Then, a user on a UNIX-based client system using csh can assign the value DUMP_FILES1 to the environment variable DATA_PUMP_DIR. The DIRECTORY parameter can then be omitted from the command line. The dump file employees.dmp, as well as the log file export.log, will be written to '/usr/apps/dumpfiles1'.

%setenv DATA_PUMP_DIR DUMP_FILES1

%expdp hr/hr TABLES=employees DUMPFILE=employees.dmp

4. If none of the previous three conditions yields a directory object and you are a privileged user, then Data Pump attempts to use the value of the default server-based directory object, DATA_PUMP_DIR. This directory object is automatically created at database creation or when the database dictionary is upgraded. You can use the following SQL query to see the path definition for DATA_PUMP_DIR:

SQL> SELECT directory_name, directory_path FROM dba_directories WHERE directory_name='DATA_PUMP_DIR';

If you are not a privileged user, access to the DATA_PUMP_DIR directory object must have previously been granted to you by a DBA.

Do not confuse the default DATA_PUMP_DIR directory object with the client-based environment variable of the same name.

Using Directory Objects When Automatic Storage Management Is Enabled

If you use Data Pump Export or Import with Automatic Storage Management (ASM) enabled, you must define the directory object used for the dump file so that the ASM disk-group name is used (instead of an operating system directory path). A separate directory object, which points to an operating system directory path, should be used for the log file. For example, you would create a directory object for the ASM dump file as follows:

SQL> CREATE or REPLACE DIRECTORY dpump_dir as '+DATAFILES/';

Then you would create a separate directory object for the log file:

SQL> CREATE or REPLACE DIRECTORY dpump_log as '/homedir/user1/';

To enable user hr to have access to these directory objects, you would assign the necessary privileges, for example:

SQL> GRANT READ, WRITE ON DIRECTORY dpump_dir TO hr;
SQL> GRANT READ, WRITE ON DIRECTORY dpump_log TO hr;

You would then use the following Data Pump Export command:

expdp hr/hr DIRECTORY=dpump_dir DUMPFILE=hr.dmp LOGFILE=dpump_log:hr.log

New Features in Oracle Database 10g Release 2

The following features have been added for Oracle Database 10g Release 2 (10.2):

The ability to perform database subsetting. This is done by using the SAMPLE parameter on an export operation or by using the TRANSFORM=PCTSPACE parameter on an import operation

The ability to compress metadata before it is written to a dump file set.

The ability to encrypt column data on an export operation and then to access that data on an import operation

The ability to downgrade a database through use of the VERSION parameter.

New Features in Oracle Database 11g Release 1

Compress both, data and metadata, only data, only metadata, or no data during an export

Syntax and Description

COMPRESSION={ALL | DATA_ONLY | METADATA_ONLY | NONE}

· ALL enables compression for the entire export operation.

· DATA_ONLY results in all data being written to the dump file in compressed format.

· METADATA_ONLY results in all metadata being written to the dump file in compressed format. (This is the default).

· NONE disables compression for the entire export operation.

Restrictions:

· To make full use of all these compression options, the COMPATIBLE initialization parameter must be set to at least 11.0.0.

· The METADATA_ONLY option can be used even if the COMPATIBLE initialization parameter is set to 10.2.

· Specify additional encryption options in the following areas:

You can choose to encrypt both, data and metadata, only data, only metadata, no data, or only encrypted columns during an export.

You can specify a specific encryption algorithm to use during an export.

You can specify the type of security to use for performing encryption and decryption during an export. For example, perhaps the dump file set will be imported into a different or remote database and it must remain secure in transit. Or perhaps the dump file set will be imported onsite using the Oracle Encryption Wallet but it may also need to be imported offsite where the Oracle Encryption Wallet is not available.

· Overwrite existing dump files during an export operation..

· Specify that XMLType columns are to be exported in uncompressed CLOB format regardless of the XMLType storage format that was defined for them.

· During an export, specify a remap function that takes as a source the original value of the designated column and returns a remapped value that will replace the original value in the dump file.

· Automatic restart of workers on the same instance.

Additionally, Data Pump will now perform a one-time automatic restart of workers (on the same instance) that have stopped due to certain errors. For example, if someone manually stops a process, the worker is automatically restarted one time, on the same instance. If the process stops a second time, it must be manually restarted.

Automatic Diagnostic Repository Command Interpreter (ADRCI)

The Automatic Diagnostic Repository Command Interpreter (ADRCI) provides a way for you to work with the diagnostic data contained in the Automatic Diagnostic Repository (ADR). The ADR is a file-based repository for database diagnostic data, such as traces, dumps, the alert log, health monitor reports, and more. It has a unified directory structure across multiple instances and multiple products.

Command list

CREATE OR REPLACE DIRECTORY data_pump as '/backup/INSTANCE/expdp';

GRANT read,write ON DIRECTORY data_pump to USER;

CREATE OR REPLACE DIRECTORY data_pump_log as '/backup/INSTANCE/expdp';

GRANT read,write ON DIRECTORY data_pump_log to USER;

grant EXP_FULL_DATABASE to USER;

alter user "USER" quota unlimited on users;

$ORACLE_HOME/bin/expdp user/password DIRECTORY=DATA_PUMP DUMPFILE=expdp_${ORACLE_SID}_`date '+%d%m%Y%HH'`_%U.dmp COMPRESSION=METADATA_ONLY FILESIZE=2147483648 FULL=Y LOGFILE=expFULL_EXPDP`date '+%d%m%Y%HH'`_$ORACLE_SID.log PARALLEL=2

SELECT privilege, directory_name FROM user_tab_privs t, all_directories d WHERE t.table_name(+)=d.directory_name ORDER BY 2,1;

SELECT grantee, t.owner ,privilege, directory_name, directory_path FROM user_tab_privs t, all_directories d WHERE t.table_name(+)=d.directory_name ORDER BY 2,1;

Links, References:
database/utilities/index.html

database/utilities/htdocs/data_pump_overview.html

database/utilities/htdocs/datapump_faq.html

Oracle Database Documention Library 10g Release 2 (10.2)
Oracle Database Documention Library 11g Release 1 (11.1)

KUTEN

Tuesday, September 25, 2007

Oracle Data Pump, easy steps – Attach to an existing job

Monday, September 24, 2007

Oracle Data Pump, easy steps – (expdp) – Part one

Search Kuten

Labels

Blog Archive

Blogs.oracle.com Recent Posts

Alejandro Vargas' Blog

About Me