SQL*Loader
Environment
SQL*Loader
takes as input a control file, which controls the behavior of SQL*Loader, and
one or more datafiles. Output of the SQL*Loader is an Oracle database (where
the data is loaded), a log file, a bad file, and potentially a discard file.
Execution
Steps:
- Executes from the command prompt or parameter file.
- Reads the control file.
- Reads the data from the control file or from one or more datafiles.
- Loads the data in the database and logs the information in the log file.
- Places the rejected records in the bad file.
- Places the discarded records in the discard file.
1) The parameter file:
The
command line information can be saved in a parameter file. This parameter file
is executed from the command prompt. A parameter file has a .par extension.
Following is a sample parameter file.
Steps
to create a parameter file:
- Open a text editor.
- Type in each parameter with its corresponding value as shown in the parameter file.
- Save the file with the .par extension.
How
do you execute the parameter file?
At
the command prompt type sqlldr PARFILE = <parfile name>.
2) The Control File:
- The control file is a text file written in a language that SQL*Loader understands.
- The control file describes the task that the SQL*Loader is to carry out. The control file tells SQL*Loader where to find the data, how to parse and interpret the data, where to insert the data, and more.
- It also contains the names and locations of the bad file and the discard file.
- Some of above information (such as name and location of the input file) can also be passed to SQL*Loader as command-line parameters.
- It’s also possible for the control file to contain the actual data to be loaded. This is sometimes done when small amounts of data need to be distributed to many sites, because it reduces (to just one file) the number of files that need to be passed around.
A
sample control file is given below:
In
general, the control file has three main sections, in the following order:
a)
Session-wide information:
The
session-wide information contains the names of the input/output files for the
data load session. Apart from this, other SQL*Loader parameters can also be
listed in this section.
- The LOAD DATA statement is required at the beginning of the control file.
- INFILE * specifies that the data is found in the control file and not in an external data file.
- BADFILE ‘example1.bad’ indicates that all erroneous records must be stored in the file example1.bad.
- DISCARDFILE ‘example1.dsc’ indicates that all discarded records must be stored in the file example1.dsc.
b)
Table and Field_List Information:
The
INTO TABLE statement specifies the table into which data should be loaded. In
this case it is the dept table. By default, SQL*Loader requires the table to be
empty before it inserts any records.
FIELDS
TERMINATED BY specifies that the data is terminated by commas, but can also be
enclosed by quotation marks. Data types for all fields default to CHAR.
The
names of columns to load are enclosed in parentheses. Because no data type or
length is specified, the default is type CHAR with a maximum length of 255.
c)
Input Data:
BEGINDATA
specifies the beginning of the data. The data to be loaded is present below the
BEGINDATA command
3) Input Datafiles:
- The data to be loaded is contained in one or more datafiles if it is not contained in the control file.
- The data in the datafile can be in the fixed length format, variable length format, or in the stream record format.
a)
Fixed Length Format:
A
file is in the fixed record format when all the records in the datafile have
the same byte length. This format is not flexible but offers very good
performance.then the syntax for the INFILE command is – INFILE student.dat “fix
15″
The
syntax for letting SQL*Loader know that the data is in the fixed length format
is:
INFILE
datafile_name “fix n”
Here
INFILE datafile_name refers to the file that contains the data to be loaded.
“fix n” implies that each record in the datafile has a fixed byte length of n.
For
example if the name of the following datafile is student.dat and the byte
length of a record is 15 bytes
0001,
—–Rina, 0002, —-Harry, 0003,—–Sudha
b)
Variable Length Format:
A
file is in the variable record format when the length of each record varies.
The length of each record is included at the beginning of the record in the
datafile. This format provides some added flexibility over the fixed record
format and a performance advantage over the stream record format.
For
example, you can specify a datafile that is to be interpreted as being in
variable record format as follows:
INFILE
“datafile_name” “var n”
Here
n specifies the number of bytes in the record length field. If n is not
specified, SQL*Loader assumes a length of 5 bytes. If n is specified larger
than 40 it results in an error. The following datafile is random.dat and the
value for n is 3.
009hello,cd,010world,im,
012my,name
is,
SQL*Loader
reads the first 3 bytes to gather the length of the record. Here the first record
is 9 bytes long. After SQL*Loader has read 9 bytes, it reads the next 3 bytes
to find the size of this record which is 10 bytes long. It reads the next 10
bytes of the record and then finds the third record is 12 bytes long and so on.
c)
Stream Record Format:
A
file is in the stream record format when the records are not specified by size;
instead SQL*Loader forms records by scanning for the record terminator. Stream
record format is the most flexible format, but there can be a negative effect
on performance.
The
syntax for specifying the stream record format is as follows:
INFILE
datafile_name ["str terminator_string"]
The
terminator_string can be a ‘char_string’ which is a string of characters
enclosed in single or double quotation marks or a ‘hex_string’ which is a byte
string in hexadecimal format.
4) The Log File:
The
log file is a record of SQL*Loader’s activities during a load session. It
contains information such as the following:
- The names of the control file, log file, bad file, discard file, and data file
- The values of several command-line parameters
- A detailed breakdown of the fields and datatypes in the data file that was loaded
- Error messages for records that cause errors
- Messages indicating when records have been discarded
- A summary of the load that includes the number of logical records read from the data file, the number of rows rejected because of errors, the number of rows discarded because of selection criteria, and the elapsed time of the load
Always
review the log file after a load to be sure that no errors occurred, or at
least that no unexpected errors occurred. This type of information is written
to the log file, but is not displayed on the terminal screen.
5) The Bad File:
Whenever
you insert data into a database, you run the risk of that insert failing
because of some types of error. Integrity constraint violations undoubtedly
represent the most common type of error. However, other problems, such as the
lack of free space in a tablespace, can also cause insert operations to fail.
Whenever SQL*Loader encounters a database error while trying to load a record,
it writes that record to a file known as the bad file.
- If one or more records are rejected, the bad file is created and the rejected records are logged.
- If no records are rejected, then the bad file is not created.
6) The Discard File:
While
SQL*Loader is being executed it creates a discard file for records that do not
meet any of the loading criteria. The records contained in this file are called
discarded records. Discarded records do not satisfy any of the WHEN clauses
specified in the control file. These records differ from rejected records.
Discarded records do not necessarily have any bad data. A discarded record is
never inserted into the Oracle table.
A
discard file is created according to the following rules:
- You have specified a discard filename and one or more records fail to satisfy all of the WHEN clauses specified in the control file. (If the discard file is created, it overwrites any existing file with the same name, so be sure that you do not overwrite any files that you want to retain.)
- If no records are discarded, then a discard file is not created.
No comments:
Post a Comment