2011年5月30日星期一

Humble thoughts on data structure

 

Technorati Tags:

Data structure actually contains two words. The first one is data, which is the abstract representation of reality in the computer programs.

For the other word, structure, which means the way to organize one or more than more things. The main purpose to organize something is to make the access to them available and easier.

Hereafter, something are referred as a set of data. The set here is not the same concept of set in pure mathematics, nor the class set  in java collection framework. Here element in the set can be duplicated or not, depends on the specific scenario.

Sometimes, the set of data are organized as an array because maybe random access and indexed access to the elements in the set are necessary and important. Sometimes, the set of data are organized as a linked list because maybe it’s very difficult to allocate a large space at the beginning (a necessity when using array ) or the above features are not important. 

Generally, array and linked list are two main low-level data structures that widely used in the compute program. Know how to manipulate elements and design algorithms on these two structures are very important to programmers.

High level data structures like List, Map, Set , Tree, graph are all represented by the two elementary structures.

In order to master the manipulation of the two, some practices are necessary. Let’s take some examples of practices to learn arrays and linked lists in programs.

Equilibrium array

Equilibrium index of a sequence is an index such that the sum of elements at lower indexes is equal to the sum of elements at higher indexes. For example, in a sequence A:
A[0]=-7 A[1]=1 A[2]=5 A[3]=2 A[4]=-4 A[5]=3 A[6]=0

3 is an equilibrium index, because:
A[0]+A[1]+A[2]=A[4]+A[5]+A[6]

6 is also an equilibrium index, because:
A[0]+A[1]+A[2]+A[3]+A[4]+A[5]=0

(sum of zero elements is zero) 7 is not an equilibrium index, because it is not a valid index of sequence A.
Assume the sum of zero elements is equal zero. Write a function

int equi(int[] A);

that given a sequence, returns its equilibrium index (any) or -1 if no equilibrium indexes exist. Assume that the sequence may be very long.

 

Solution:

1. First get the sum of all the numbers in the array first. sum

2. adding the array from the left one by one. tempSum

3. if the sum – tempSum – A[i] == tempSum, then i is an equibibrium index.

int equip(int[] A){
  int sum = sum(A);
  int tempSum = 0;
  for(int i=0;i<A.length;i++){
   if(sum - tempSum - A[i] == tempSum){
    return i;
   }
   tempSum += A[i];
   }
   return -1;
}

long sum(int [] A){
int sum = 0;
for(int i=0;i<A.length;i++){
   sum += A[i];
  }
}

The best solution, which are very often available should be the O(N) complex one. Always try to find the solution with O(N) time complexity or less.

Given a positive integer number, how to find if it’s anagram.

Technorati Tags: ,

public class Symmetric {

    // Solution A

    int[] digits = new int[] {};

    boolean symmetrical(int i) {

        int n = i;

        int index = 0;

        while (n != 0) {

            digits[index++] = n % 10;

            n = n / 10;

        }

        return symmetrical(digits, 0, digits.length - 1);

    }

    //recursive way to just if the array is symmetrical

    boolean symmetrical(int[] digits, int begin, int end) {

        if (begin >= end)

            return true;

        return (digits[begin] == digits[end])

                && symmetrical(digits, begin + 1, end - 1);

    }

    //traditional way to judge if the array is symmetrical

    boolean symmetrical(int[] digits, int begin,int end){

      int length = (end - begin)/2;

      for(int i=0;i< length;i++){

       if(digits[begin+i] != digits[end-i]){ return false;}

      }

      return true;

    }

    //end of Solution A

    // Solution B

    //just compute the new number from the end to first.  This method is recommended since there is no need to assign an new int[] array.

    boolean symmetrical_B(int i) {

        int n = i;

        int m = 0;

        int t = 0;

        while (n != 0) {

            t = n % 10;

            m = m * 10 + t;

            n = n / 10;

        }

        return m == i;

    }

}

When using the first algorithm, the algorithm tries to convert an integer into an array. That’s really a space consuming method. For every single problem, please consider just using the existing space and data structure, don’t try to convert the existing data structure into another structure.

2011年5月24日星期二

Rownum in Oracle for filtering

 It's a traditional question in Oracle:

  How to select the 11th record to the 20 th record in a database table.
A. The simplest form:
  How to select the 1 to 10 records.
  Just the following sql:
  SELECT * FROM ta WHERE rownum < 11
  will select the first 10 records if there are 10 or more records in the database.
  SELECT 1 FROM ta WHERE rownum < 11
  will select 10 ones if there are 10 or more records in the database, if there is no record in the database, there will be no records returned by the SQL.

B. How to select the 11th to 20 th records.
  Please don't use :
  SELECT * FROM ta WHERE rownum BETWEEN 11 AND 21
    between means 11<=x<=21

  The statement will return no record. it's because the rownum of the statement begins with 0 and is a virtual column, it will add 1 unit a new row is caught. So, from beginning, the rownum will never meet the condition.
  Let's use another way:
 SELECT * FROM ( 
( SELECT T.*, ROWNUM ROWCOUNTER FROM ta T WHERE rownum <21 )
  ) WHERE ROWCOUNTER > 10

  Once, I thought the following statement may works:
   ( SELECT t.ctr_cont_no FROM ta WHERE rownum < 21 )
   MINUS
   (SELECT t.ctr_cont_no FROM ta WHERE rownum < 11)

  but it turns out the MINUS operator is an set operator, so the result may have less than 10 results. So use the following instead:
  (SELECT t.rowid, t.ctr_cont_no FROM ta WHERE ROWNUM <21 )
   MINUS
  (SELECT t.rowid,t.ctr_cont_no FROM ta WHERE ROWNUM < 11)

 it works but with an additional column.
  For
  (SELECT T.* FROM ta WHERE rownum < 21)
  MINUS
  (SELECT T.* FROM ta WHWER rownum < 10)

 I don't know the exact, then have the following test.
  create a single column table:
   create table sin_column_test (COLUMN1 NUMBER NOT NULL);
   insert into sin_column_test select 10 from all_objects where rownum <=100;


  issue the command:
    (select * from sin_column_test where rownum< 21 )
   minus
  (select * from sin_column_test where rownum < 11)


 There is no record returned. So T.* only means the columns defined in the DML SQL, not any virtual columns are included.

2011年5月16日星期一

Automating Oracle 10g Database Startup and shutdown on Linux

  Steps to configure the automatically startup and shutdown on Linux

   Once the instance is created, edit the "/etc/oratab" file setting the restart flag for each instance to 'Y'.
   iflex:/u01/app/oracle/product/10.1.0/db_1:Y

   Next, edit the dbstart batch
   vi $ORACLE_HOME/bin/dbstart
   edit the line
   ORACLE_HOME_LISTENER=$ORACLE_HOME
  
   Create a file called /etc/init.d/dbora as the root user, containing the following:
   #!/bin/sh
   # chkconfig: 345 99 10
   # description: Oracle auto start-stop script
  ORA_HOME=/u01/app/oracle/product/10.1.0/db_1
  ORA_OWNER=oracle

  if [ ! -f $ORA_HOME/bin/dbstart ]
  then
    echo "Oracle Startup: cannot start"
    exit
  fi

  case "$1" in
    'start')
    #start the oracle database
    #The following command assumes that the oracle
    #login will not prompt the user for any value
    su - $ORA_OWNER -c "$ORA_HOME/bin/dbstart $ORA_HOME"
    touch /var/lock/subsys/dbora
    ;;
    'stop')
    #stop the oracle database
    #The following command assumes that the oracle
    #login will not prompt the user for any value
    su - $ORA_OWNER -c "$ORA_HOME/bin/dbshut $ORA_HOME"
    rm -f /var/lock/subsys/dbora
    ;;
 esac

 
  Please pay attention two the second and the third comment line, it should begin with chkconfig and description to make it follow the chkconfig standard. Please refer the 'man chkconfig' to see the full version of this standard.
 
  Login as root, issue the following command:
  chmod 750 /etc/init.d/dbora
  Associate the dbora service with the appropriate run levels and set it to auto-start using the following command:
  chkconfig --add dbora
  The relevant instances should now startup/shutdown automatically at the system startup and shutdown.


2011年5月6日星期五

Tablespaces, Datafiles, and Control files

Tablespace---The primary logical database structures of any oracle database. datafiles correspond to each tablespace.

Oracle stores data logically in tablespace and physically in datafiles associated with the corresponding tablespace.  Objects store in tablespaces may span several datafiles but a datafile is a physical structure associated with only one tablespace.

    Database, tablespaces and datafiles:

         An Oracle database consists of one or more logical storage units called tablespaces, which collectively store all of the database’s data;

         Each tablespace in an Oracle database consists of one or more files called datafiles, which are physical structures that conform to the operating system in which Oracle is running.

         A database’s data is collectively stored in the datafiles that constitute each tablespace of the database.

Oracle-Managed Files

      Oracle-managed files eliminate the need for the DBA to directly manage the operating system files comprising an Oracle database.  The DBA can then specify operations in terms of database objects rather than filenames.  Oracle internally uses standard file system interfaces to create and delete files as needed for the following database structures:

          A. Tablespace

          B. Redo log files

          C. Control files

      The size of a tablespace is the size of the datafiles that consitute the tablespace. The size of the a database is the collective size of the tablespaces that consitute the database. The DBA can add a datafile to a tablespace; add a new tablespace; increase the size of a datafile. Tablespaces are divided into logical units of storage called segments, which are further divided into …

Bigfile tablespace

      Bigfile tablespace allows Oracle database to contain tablespaces made up of single large files rather than numerous smaller ones.

SYSTEM tablespace

     Every Oracle database contains a tablespace named SYSTEM, which Oracle creates automatically when the database is created. The SYSTEM tablespace is always online when the database is open.

     Create a locally managed SYSTEM tablespace or migrate an existing dictionary managed SYSTEM tablespace to a locally managed format.  With a locally managed SYSTEM tablespace, dictionary managed tablespace can not be created. If a tablespace is locally managed, it cannot be reverted back to being dictionary managed.

      The SYSTEM tablespace always contains the data dictionary tables for the entire database.  The data dictionary tables are stored in datafile 1.

      All data stored on behalf of stored PL/SQL program units (procedures, functions, packages and triggers) resides in the SYSTEM tablespace.

SYSAUX tablespace

     The SYSAUX tablespace is an auxiliary tablespace to the SYSTEM tablespace. Many database components use the SYSAUX tablespace as their default location to store data. The SYSAUX tablespace is always created during database creation or database upgrade. The SYSAUX tablespace provides a centralized location for database metadata that does not reside in the SYSTEM tablespace.  During normal database operation, the Oracle database server does not allow the SYSAUX tablespace to be dropped or renamed. Transportable tablespaces for SYSAUX is not supported.

Undo tablespace

      Undo tablespaces are special tablespaces used solely for storing undo information. Any other segment types (e.g tables or indexes) are not allowed to be created in the undo tablespace. 

      Each undo tablespace is composed of a set of undo files and is locally managed. Undo blocks are also grouped in extents and the status of each extent is represented in the bitmap.

Default Temporary Tablespace

      When the SYSTEM tablespace is locally managed, the DBA must define at least one default temporary tablespace when creating a database. A locally managed SYSTEM tablespace cannot be used for default temporary storage.  If SYSTEM is dictionary managed and if The DBA do not define default temporary tablespace when creating the database, the SYSTEM is still used for default temporary storage.

 

      Tablespace allocates space in extents. Tablespace can use two different methods to keep track of their free and used space:

          A. Locally managed tablespace: extent management by the tablespace

          B. Dictionary managed tablespace: extent management by the data dictionary.

       Locally Managed Tablespace: a tablespace that manages its own extents maintains a bitmap in each datafile to keep track of the free or used status of blocks in that datafile. Each bit in the bitmap corresponds to a block or a group of blocks. Oracle changes the bitmap values to show the new status of the blocks when an extent is allocated for freed for reuse.  These changes do not generate rollback information because they do not update tables in the data dictionary.

      Locally managed tablespaces’ advantages:

           1. automatically tracks adjacent free space, eliminating the need to coalesce free extents.

           2. avoids recursive space management operations. Such recursive operations can occur in dictionary managed tablespace if consuming or releasing space in an extent results in another operation that consumes or release space in a data dictionary table or rollback segment.

     The sizes of extents that are managed locally can be determined automatically by the system. Alternatively, all extents can have the same size in a locally managed tablespace and override object storage options.

     Segment Space Management in Locally Management Tablespace can be both AUTO and MANUAL.

 

Dictionary Managed Tablespace

     For a tablespace that uses the data dictionary to manage its extents, Oracle updates the appropriate tables in the data dictionary whenever an extent is allocated or freed for reuse. Oracle Also stores rollback information about each update of the dictionary tables.

 

Oracle supports multiple block sizes in a database. The standard block size is used for the SYSTEM tablespace. The DBA specifies the standard block size by setting the initialization parameter DB_BLOCK_SIZE, values from 2K to 32K.

A DBA can bring any tablespace other than the SYSTEM tablespace online(accessible) or offline(not accessible) whenever the database is open. The SYSTEM tablespace is always online when the database is open because the data dictionary must always be available to Oracle.

The primary purpose of read-only tablespace is to eliminate the need to perform backup and recovery of large, static portions of a database.  Oracle never updates the fiels of a read-only tablespace, and therefore the files can be reside on read-only media such as CD-ROMs or WORM drives.

Temporary tablespace for sort operations

    The DBA can manage space for sort operations more efficiently by designating one or more temporary tablespace exclusively for sorts.  Doing so effectively eliminates serialization of space management operations involved in the allocation and deallocation of sort space.  A single SQL operation can use more than one temporary tablespace for sorting. All operations that use sorts, including joins, index builds, ordering, computing aggregates and collecting optimizer statistics benefit from temporary tablespace.

    One or more temporary tablespace can be used only for sort segments. Sort segments are used when a segment is shared by multiple sort operations. One sort segment exists for every instance that performs a sort operation in a given tablespace.  Temporary tablespace provides performance improvements when you have multiple sorts that are too large to fit into memory.

Transport of Tablespace between database

     A transportable tablespace lets you move a subset of an Oracle database from one Oracle database to another, even cross different platforms.

     A tablespace repository is a collection of tablespace sets.  Tablespace repositories are built on file group repositories, but tablespace repositories only contain the files required to move or copy tablespaces between databases.

    A tablespace in an Oracle database consists of one or more physical datafiles.  a datafile can be associated with only one tablespace and only one database.

 

Datafiles

     When a datafile is first created, the allocated disk space is formatted but does not contain any user data. Oracle reserves the space to hold the data for future segments of the associated tablespace—it’s used exclusively by Oracle.

     Data associated with schema objects in a tablespace is physically stored in one or more of the datafiles that constitute the tabelspace. Schema object does not correspond to a specific datafile; a datafile is a repository for the data of any schema object within a specific tablespace.  The DBA can alter the size of a datafile after its creation or can specify that a datafile should dynamically grow as schema objects in the tablespace grow.

2011年5月4日星期三

Logical storage structures in Oracle database

Oracle allocates logical database space for all data in a database. The units of database space allocation are data blocks, extents, and segments.

At the finest level of granularity, Oracle stores data in data blocks( also called logical blocks, Oracle blocks or pages). One data block corresponds to a specific number of bytes of physical database space on disk.

The next level of logical database space is an extent. An extent is a specific number of contiguous data blocks allocated for storing a specific type of information.

A segment is a set of extents, each of which has been allocated for a specific data structure and all of which are stored in the same table-space.  Each table’s data is stored in its own data segment, which each index’s data is stored in its own index segment.  If the table or index is partitioned, each partition is stored in its own segment. When the existing extents of a segment are full, Oracle allocates another extent for that segment.  Because extents are allocated as needed, the extents of a segment may or may not be contiguous on disk.  A segment and all its extents are stored in one table-space. Within a tablespace, a segment can include extends from more than one file; The segment can span datafiles. Each extent can contain data from only one datafile.

If you allocate an extent to a specific instance(Manually storage management), the blocks are immediately allocated to the free list. However, if the extent is not allowed to a specific instance(Automatically storage management), then the blocks themselves are allocated only when the high water mark moves.  The high water mark is the boundary between used and unused space in a segment. 

Data blocks

Storage space in the datafiles of a database  is managed in units called data blocks.  At the OS level, all data is stored in bytes. Oracle block size is different from system blocks.

The standard block size is specified by the DB_BLOCK_SIZE initialization parameter.

Data block format is similar regardless of whether the data block contains table, index, or clustered data.

   A. common and variable Header: the header contains general block information, such as the block address and the type of segment ( for example, data or index).

   B. Table directory: This portion of the data block contains information about the table having rows in this block.

   C. Row directory: contains information about the actual rows in the block(including addresses for each row piece in the row data area). The space for a row is not reclaimed when the row is deleted. Oracle reuses this space only when new rows are inserted in the block.

    E. Overhead: The data block header, table directory, and row directory are referred to collectively as overhead.

    F. Row Data: contains table or index data, rows can span blocks.

    G. Free space: for insertion of new rows and for updates to rows that requir additional space. In data blocks allocated for the data segment of a table or cluster, or for the index segment of an index, free space can also hold transaction entries. A transaction entry is required in a block for each INSERT, UPDATE, DELETE and SELECT … FOR UPDATE statement accessing one or more rows in the block. The transaction entry tells that the block is now updating by this transaction, so others have to wait?????

Free Space Management

    Free space can be managed automatically or manually.

    Free space can be managed automatically inside database segments. The in-segment free/used space is tracked using bitmaps, as opposed to free lists. You specify automatic segment-space management when you create a locally managed tablespace. The specification then applies to all segments subsequently created in this tablespace.

Availability and  Optimization of Free Space in a Data Block

Two types of statement can increase the free space of one or more data blocks: DELETE statement, and UPDATE statements that update existing values to smaller values.  The released space from these types of statements is available for subsequent INSERT statements under the following conditions.

     1. If the INSERT statement is in the same transaction and subsequent to the statement that frees space, then the INSERT statement can use the space made available

     2. If the INSERT statement is in a separate transaction from the statement that frees space, then the INSERT statement can use the space made available only after the other transaction commits and only if the space is needed.

   Based in pre-conditions, Oracle will compress free spaces in a block.

Row Chaining and Migrating

   In two circumstances, the data for a row in a table maybe too large to fit into a single data block.

     1. The rows is too large to fit into one data block when it is first inserted. Then Oracle stores the data for the row in a chain of data blocks reserved for that segment. Row chaining most often occurs with large rows, such as rows that contain a column of data types LONG or LONG RAW. Row chaining in these cases is unavoidable.

      2. A row that originally fit into one data block is updatable so that the overall row length increases, and the block’s free space is already completely filled. Oracle migrates the data for the entire row to a new data block, assuming the entire row can fit in a new block. Oracle preserves the original row piece of a migrated row to point to the new block containing the migrated row. The rowid of a migrated row does not change.

When a row is chained or migrated, I/O performance associated with this row decreases because Oracle must scan more than one data block to retrieve the information for the row.

 

PCTFREE, PCTUSED and Row Chaining

For manually managed tablespaces, two space management parameters: PCTFREE and PCTUSED enable you to control the use of the free space of inserts and updatesd to the rows in all the data blocks of a particular segment.

   PCTFREE: sets the minimum percentage of a data block to be reserved as free space for possible updates to rows that already exist in that block. If set PCTFREE=20, this states 20% of each data block in this table’s data segment be kept free and available for possible updates to the existing rows already within each block.  New rows can be added to the row data area, and corresponding information can be added to the variable portions of the overhead area until the row data and overhead total 80% of the total block size.

   PCTUSED: set the minimum percentage of a block that can be used for row data plus overhead before new rows are added to the block.  After a data block is filled to the limit determined by PCTFREE, oracle considers the block unavailable for the insertion of new rows until the percentage of that block falls beneath the parameter PCTUSED. Until this value is achieved, Oracle uses the free space of the data block only for updates to rows already contained in the data block.

   Which means with PCTFREE and PCTUSED parameters, the free space is sometimes only available to update commands.

   For each data and index segment, oracle maintains one or more free lists---lists of data blocks that have been allocated for that segment’s extents and have free space greater than PCTFREE. This blocks are available for inserts.  If the free space in that block is not large enough and the block is at least PCTUSED, then Oracle takes the block off the free list. Multiple free lists for each segment can reduce contention for free lists when concurrent inserts take place.

  After issue a DELETE or UPDATE statement, Oracle processes the statement and checkes to see if the space being used in the block is now less than PCTUSED. If it is, then the block goes to the beginning of the transaction free list, and it is the first of the available blocks to be used in that transaction.

 

Overview of Extents

An Extent is made up a number of contiguous data blocks. When the existing space in a segment is completely used, Oracle allocates a new extent for the segment.  

When create a table, Oracle allocates to the table’s data segment an initial extent of a specified number of data blocks. When the initial extent become full and more space is required, Oracle automatically allocates an incremental extent for that segment. An incremental extent is a subsequent extent of the same or greater size than the previously allocated extent in that segment.

The header block of each segment contains a directory of the extents in that segment.

Storage parameters expressed in terms of extents define every segment.

Data dictionary managed table-space and locally managed table-space by bitmap.

The extents of a segment do not return to the table-space until you drop the schema object whose data is stored in the segment with some exceptions:

When extents are freed, Oracle modifies the bitmap in the data-file(for locally managed table-space) or updates the data dictionary ( for dictionary managed table-space) to reflect the regained extents as available space. Any data in the blocks of freed extents becomes inaccessible.

Extents in Non-clustered tables

As long as a non-clustered table exists or until you truncate the table, any data block allocated to its data segment remains allocated for the table.  Even if you delete all rows of a table, oracle does not reclaim the data blocks for use by other objects in the table-space. After the dropping of a non-clustered table, this space can be reclaimed when other extents require free space. Oracle reclaims all the extents and makes them available for other schema objects in the same tablespace.

In dictionary managed table-space,when a segment requires an extent larger than the available extents, Oracle identifies and combines contiguous reclaimed extents to form a larger one. this is called coalescing extents.

For clustered tables, if you drop one table in a cluster, the data segment remains for the other tables in the cluster, and no extents are de-allocated. You can also truncate clusters(except for hash clusters) to free extents. 

All extents allocated to an index segment remain allocated as long as the index exists.

Extents in Temporary Segments

When Oracle completes the execution of a statement requiring a temporary segment, Oracle automatically drops the temporary segment and returns the extents allocated for that segment to the associated table-space.  Multiple sorts, can use sort segments in temporary tablespaces designated exclusively for sorts. The sort segments are allocated only once for the instance and they are not returned after the sort, but remain available for other multiple sorts.  A temporary segment in a temporary table contains data for multiple statements of a single transaction or session. Oracle drops the temporary segment at the end of the transaction or session, returning the extents allocated for that segment to the associated tables.  So, there are two categories of temporary table-space, the one for temporary table and the one for multiple sorts.

Extents in Rollback Segments

Oracle periodically checks the rollback segments of the database to see if they have grown larger than their optimal size. If a rollback segment is larger than is optimal (that is , it has too many extents), then Oracle automatically de-allocates one or more extents from the roll-back segment.

 

Overview of Segments

A segment is a set of extents that contains all the data for a specific logical storage structure within a table-space.

Data Segments

A single data segment in an Oracle database holds all of the data for one of the following:

    1. A table that is not partitioned or clustered

    2. A partition of a partitioned table

    3. A cluster of tables

The specified storage parameters when creating the table/cluster affect the efficiency of data retrieval and storage for the data segment associated with the object.

Index Segment

Every non-partitioned index in an Oracle database has a single index segment to hold all of its data. For a partitioned index, every partition has a single index segment to hold its data.  Also the storage parameters directly affects the efficiency of data retrieval and storage.

Temporary Segments

      A. When processing queries, Oracle often requires temporary workspace for intermediate stages of SQL statement parsing and execution. Oracle automatically allocates this disk space called a temporary segment.  Typically, Oracle requires a temporary segments as a database area for sorting. Oracle does not create a segment if the sorting operation can be done in memory or if Oracle finds some other way to perform the operation using indexes. Operations that requires temporary segments include: CREATE INDEX/SELECT …ORDER/ SELECT DISTINCT/SELECT ..GROUP/SELECT …UNION/ SELECT …INTERSECT/SELECT …MINUS

      B. Segments in Temporary tables and their indexes. Oracle can also allocate temporary segments for temporary tables and indexes created on temporary tables. Temporary tables hold data that exists only for the duration of a transaction or session.

      Oracle allocates temporary segments differently for queries and temporary tables.

      A. Allocation of Temporary segments for queries: If no temporary table-space is defined for the user, then the default temporary table-space is the SYSTEM table-space.  Oracle drops temporary segments when the statement completes.

      B. Oracle allocates segments for a temporary table when the first INSERT into that table is issued. The first INSERT into a temporary table allocates the segments for the table and its indexes, creates the root page for the indexes, and allocates any LOB segments.  Segments for a temporary table are allocated in a temporary table-space for the user who created the temporary table. Oracle drops segments for a transaction/session-specific temporary table at the end of the transaction/session-specific temporary table at the end of the transaction/session.  If other transactions/sessions share the use of that temporary table, the segments containing their data remain in the table.  Means a temporary segment for every transaction/session.

Automatic Undo Management

    Oracle maintains information to nullify changes made to the database. This information consists of records of the actions of transactions, collectively known as undo. Oracle use undo to do the following:

    A. Rollback an active transaction

    B. Recover a terminated transaction

    C. Provide read consistency

    D. Recovery from logical corruptions

     Oracle recommends you use undo tablespaces to manage undo rather rollback segments.

     Undo mode provides a more flexible way to migrate from manual undo ma

Data Integrity and Triggers

An integrity constraint is a declarative way to define a business rule for a column of a table.  If an integrity constraint is created for a table and some existing table data does not satisfy the constraint, then the constraint can not be enforced.  After a constraint is defined, if any of the results of a DML statement violate the integrity constraint, then the statement is rolled back, and an error is returned.  Integrity constraint are defined with a table and are stored as part of the table’s definition in the data dictionary.

Supported Integrity constraint:

1. NOT NULL

2. UNIQUE KEY

3. CHECK.

4. PRIMARY KEY

5. FOREIGN KEY

Key is used in the definitions of several types of integrity constraints. Different types of keys include:

   1. Primary Key

   2. Unique Key

   3. Foreign Key

    4. Referenced Key

Triggers

Triggers are procedures written in PL/SQL, java or C that run (fire) implicitly whenever a table or view is modified or when some user actions or database system actions occur.   

 

SQL Statement categories:

    Data definition Language (DDL) Statements

    Data Manipulation Language (DML) Statements

    Transaction Control Statements

    Session Control Statements

     System control Statements

      Embedded SQL statements

2011年5月3日星期二

Structures used for recovery

Oracle uses several structures to provide complete recovery from an instance or disk failure: the redo log, undo records, a control file, and database backups.

The Redo Log: is a set of files that protect altered database data in memory that has not been written to the datafiles. The redo log can consist of the online redo log and the archived redo log.  The online redo log is a set of two or more online redo log files that record all changes made to the database, including uncommitted and committed changes.  The background process LGWR writes the redo entries sequentially to an online redo log file. LGWR writes redo entries continually, and it also writes a commit record every time a user process commits a transaction.

Filled online redo files can be manually or automatically archived before being reused, creating archived redo logs. ARCHIVELOG and NOARCHIVELOG options.

Undo Records: undo records are stored in undo tablespaces.  Oracle uses the undo data for a variety of purposes, including accessing before-images of blocks changed in uncommitted transactions. During database recovery, Oracle applies all changes recorded in the redo log and then uses undo information to roll back any uncommitted transactions.

Control files: The control files include information about the file structure of the database and the current log sequence number being written by LGWR. During normal recovery procedures, the information in a control file guides the automatic progression of the recovery operation.

Concurrency

Concurrency is the simultaneous access of the same data by many users. Without adequate concurrency controls, data could be updated or changed improperly, compromising data integrity. Destructive interactions between concurrent transactions must be prevented.

Oracle resolves such issues by using various types of locks and a multiversion consistency model.  This features are based on the concept of a transaction.

Read consistency:

read consistency supports the following features:

   1.  Guarantees that the set of data seen by a statement is consistent with respect to a single point in time and does not change during statement execution(statement-level read consistency)

   2. Ensures that readers of database data do not wait for writers or other readers of the same data.

   3.  Ensures that writers of database data do not wait for readers of the same data

   4. Ensures that writers only wait for other writers if they attempt to update identical rows in concurrent transactions.

   The simplest way to think of read consistency is to imagine each user operating a private copy of the database, hence the multi-version consistency model.

Read consistency, undo records and transactions: When an update occurs, the original data values changed by the update are recorded in the database undo records. As long as this update remains part of an uncommitted transaction, any user that later queries the modified data views the original data values.   Oracle uses current information in the system global area and information in the undo records to construct a read-consistent view of a table’s data for a query. 

The transaction is key to Oracle’s strategy for providing read consistency. This unit of committed or uncommitted SQL statements:

    1. Dictates the start point for read-consistent views generated on behalf of readers.

     2. Controls when modified data can be seen by other transactions of the database for reading or updating.

Read-only transactions: By default, oracle guarantees statement-level read consistency. The set of data returned by a single query is consistent with respect to a single point in time. In some situations, you might also require transaction-level read consistency. This is the ability to run multiple queries within a single transaction, all of which are read-consistent with respect to the same point in time, so that queries in this transaction do not see the effects of intervening committed transactions. If you want to run a number of queries against multiple tables and if you are not doing any updating ,you prefer a read-only transaction.

Locking Mechanisms

Oracle also uses locks to control concurrent access to data. When updating information, the data server holds that information with a lock until the update is submitted or committed.  Until that happens, no one else can make changes to the locked information.

Oracle provides unique non-escalating row-level locking. Oracle always locks only the row of information being updated. Oracle can lock an unlimited number or rows so users can work concurrently without unnecessary delays.

Automatic Locking: Oracle locking is performed automatically and requires no user action.

The two general types of locks are exclusive locks and share locks. Only one exclusive lock can be placed on a resource(such as a row or a table), Many share locks can be placed on a single resource.  Both exclusive and share locks always allow queries on the locked resource but prohibit other activity on the resource(such as updates and deletes).

Manual locking: …..

Quiesce Database:

How Oracle works

Most basic level of operations that Oracle performs. User and associated server process are on separate computers.

    1. An instance has started on the computer running Oracle (a Host or database server)

    2. A computer running an application ( a local computer or client workstation) runs the application in a user process. The client application attempts to establish a connection to the server using the proper Oracle Net Services driver.

    3. The server is running the proper Oracle Net Services driver. The server detects the connection request from the application and creates a dedicated server process on behalf of the user process.

    4. The user runs a SQL statement and commits the transaction.

    5. The server process receives the statement and checks the shared pool for any shared SQL area that contains a similar SQL statement. If a shared SQL area is found, then the server process checks the user’s access privileges to the requested data, and the previously existing shared SQL area is used to process the statement. If not, then a new shared SQL area is allocated for the statement, so it can be parsed and processed.

    6. The server process retrieves any necessary data values from the actual datafile (table) or those stored in the SGA.

    7. The server process modifies data in the system global area. The DBWn process writes modified blocks permanently to the disk when doing so is efficient. Because the transaction is committed, the LGWR process immediately records the transaction in the redo log file.

    8. If the transaction is successful, then the server process sends a message across the network to the application. If it is not successful, then an error message is transmitted.

    9. Throughout this entire procedure, the other background processes run, watching the conditions that require intervention. The database server also manages other users’ transactions and prevents contention between transactions that request the same data.

Overview of Accessing the database

Network connections

Oracle Net Services is Oracle’s mechanism for interfacing with the communication protocols used by the networks that facilitate distributed processing and distributed database.  Oracle Net Service supports communications on all major network protocols, including TCP/IP, HTTP, FTP and WebDAV.

Oracle Net, a component of Oracle Net Services, enables a network session from a client application to an Oracle database server. Once a network session is established, Oracle Net acts as the data courier for both the client application and the database server. It establishes and maintains the connection between the client application and database server, as well as exchanges messages between them. Oracle Net can perform these jobs because it is located on each computer in the network.

Starting up the database

Three steps to starting an Oracle database:

     1. Start an instance

     2. Mount the database

     3. Open the database

Oracle instance

Technorati Tags:

An oracle database server consists of an Oracle database and an Oracle instance. Every time a database is started, a system global area(SGA) is allocated and Oracle background processes are started. The combination of the background processes and memory buffers is called an Oracle instance.

Real Application Clusters: Multiple Instance Systems

Some hardware architectures like shared disk systems, enables multiple computers to share access to data, software, or peripheral devices. Real Application clusters (RAC) takes advantage of such architecture by running multiple instances that share a single physical database. RAC enables access to a single database by users on multiple computers with increased performance.

Instance Memory Structures: The basic memory structures are associated with Oracle: the system global area and the program global area.

System global area: SGA

SGA is a shared memory region that contains data and control information for one Oracle instance. Oracle allocates the SGA when an instance starts and de-allocate it when the instance shuts down. Each instance has its own SGA.

User currently connected to an Oracle database share the data in the SGA. Information stored in the SGA is divided into several types of memory structures, including the database buffers, redo log buffer and the shared pool.

   Database Buffer Cache: store the most recently used blocks of data. The buffer cache contains modified as well as unmodified blocks.

    Redo Log Buffer: the redo log buffer stores redo entries – a log of changes made to the database. The redo entries stored in the redo log buffers are written to an online redo log, which is used if database recovery is necessary.  The size of the redo log is static.

   Shared Pool of the SGA: contains shared memory constructs, such as shared SQL areas. A shared SQL area is required to process every unique SQL statement submitted to a database. A shared SQL area contains information such as the parse tree and execution plan for the corresponding statement. A single shared SQL area is used by multiple applications that issue the same statement, leaving more shared memory for other uses.

    Statement Handles or Cursors: A cursor is a handler or name for a private SQL area in which a parsed statement and other information for processing the statement are kept.

Program Global Area(PGA)

    PGA is a memory buffer that contains data and control information for a server process. A PGA is created by Oracle when a server process is started. The information in a PGA depends on the Oracle configuration.

Oracle Background Processes

Oracle creates a set of background process for each instance.  The background processes consolidate functions that would otherwise be handled by multiple Oracle program running for each user process.  Each Oracle instance can use several background processes.

Process Architecture

A process is a ‘thread of control’ or a mechanism in an OS that can run a series of steps.  A process generally has its own private memory area in which it runs.  An Oracle database server has two general types of processes: user processes and Oracle processes.

   User (Client) Process: user processes are created and maintained to run the software code of an application program. User processes also manage communication with the server process through the program interface.

   Oracle process: Oracle processes are invoked by other processes to perform functions on behalf of the invoking process.  Oracle creates server processes to handle requests from connected user processes. A server process communicates with the user process and interacts with Oracle to carry out requests from the associated user process.

    Oracle can be configured to vary the number of user processes for each server process. In a dedicated server configuration, a server process handles requests for a single user process. A shared server configuration lets many user processes share a small number of server process.  On some systems, the user and server processes are separate, while on others, they are combined into a single process. Client/Server systems separate the user and server processes.

Schema and common schema objects

Technorati Tags:

A schema is a collection of database objects. A schema is owned by a database user and has the same name as that user.  Schema objects are the logical structures that directly refer to the database’s data. Objects in the same schema can be in different tablespaces, and a tablespace can hold objects from different schemas.

Tables:

basic unit of data storage in an Oracle database. Database tables hold all user-accessible data. each table has columns and rows.

Indexes

Indexes are optional structures associated with tables. An Oracle index provides an access path to table data.  Indexes are useful when applications frequently query a table for a range of rows or a specific row.

Views

Views are customized presentations of data in one or more tables or other views. Views derive their data from the tables on which they are based, referred to as the base tables of the views.  Like tables, views can be queried, updated, inserted into and deleted from, with some restrictions. All operations performed on a view actually affect the base tables of the view.

Clusters

clusters are groups of one or more tables physically stored together because they share common columns and are often used together. Because related rows are physically stored together, disk access time improves.

Synonyms

A synonym is an alias for any table, view, materialized view, sequence, procedure, function, package, type, Java class schema object, user-defined object type, or another synonym. A synonym requires no storage other than its definition in the data dictionary.

Oracle data dictionary

Each Oracle database has a data dictionary. An Oracle data dictionary is a set of tables and views that are used as a read-only reference about the database. A data dictionary stores information about both the logical and physical structure of the database.  A data dictionary also stores the following information:

  1. The valid users of an oracle database

  2. Information about integrity constraints defined for tables in the database

  3. The amount of space allocated for a schema object and how much of it is in use.

  During database operation, Oracle reads the data dictionary to verify that schema objects exist and that users have proper access to them.