28 October, 2019

Basic Replication -- 10 : ON PREBUILT TABLE

In my previous blog post, I've shown a Materialized View that is built as an empty MV and subsequently populated by a Refresh call.

You can also define a Materialized View over an *existing*  (pre-populated) Table.

Let's say you have a Source Table and have built a Replica of it it another Schema or Database.  Building the Replica may have taken an hour or even a few hours.  You now know that the Source Table will have some changes every day and want the Replica to be updated as well.  Instead of executing, say, a TRUNCATE and INSERT, into the Replica every day, you define a Fast Refresh Materialized View over it and let Oracle identify all the changes (which, on a daily basis, could be a small percentage of the total size of the Source/Replica) and update the Replica using a Refresh call.


Here's a quick demo.

SQL> select count(*) from my_large_source;

  COUNT(*)
----------
     72447

SQL> grant select on my_large_source to hr;

Grant succeeded.

SQL> connect hr/HR@orclpdb1
Connected.
SQL> alter session enable parallel dml;

Session altered.

SQL> create table my_large_replica
  2  as select * from hemant.my_large_source
  3  where 1=2;

Table created.

SQL> insert /*+ PARALLEL (8) */
  2  into my_large_replica
  3  select * from hemant.my_large_source;

72447 rows created.

SQL>


So, now, HR has a Replica of the Source Table in the HEMANT schema.  Without any subsequent updates to the Source Table, I create the Materialized View definition, with the "ON PREBUILT TABLE" clause.

SQL> connect hemant/hemant@orclpdb1
Connected.
SQL> create materialized view log on my_large_source;

Materialized view log created.

SQL> grant select, delete on mlog$_my_large_source to hr;

Grant succeeded.

SQL> connect hr/HR@orclpdb1
Connected.
SQL>
SQL> create materialized view my_large_replica
  2  on prebuilt table
  3  refresh fast
  4  as select * from hemant.my_large_source;

Materialized view created.

SQL> select count(*) from hemant.my_large_source;

  COUNT(*)
----------
     72447

SQL> select count(*) from my_large_replica;

  COUNT(*)
----------
     72447

SQL>


I am now ready to add data and Refresh the MV.

SQL> connect hemant/hemant@orclpdb1
Connected.
SQL> desc my_large_source
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID_COL                                    NOT NULL NUMBER
 PRODUCT_NAME                                       VARCHAR2(128)
 FACTORY                                            VARCHAR2(128)

SQL> insert into my_large_source
  2  values (74000,'Revolutionary Pin','Outer Space');

1 row created.

SQL> commit;

Commit complete.

SQL> select count(*) from mlog$_my_large_source;

  COUNT(*)
----------
         1

SQL>
SQL> connect hr/HR@orclpdb1
Connected.
SQL> select count(*) from hemant.my_large_source;

  COUNT(*)
----------
     72448

SQL> select count(*) from my_large_replica;

  COUNT(*)
----------
     72447

SQL>
SQL> execute dbms_mview.refresh('MY_LARGE_REPLICA','F');

PL/SQL procedure successfully completed.

SQL> select count(*) from my_large_replica;

  COUNT(*)
----------
     72448

SQL>
SQL> select id_col, product_name
  2  from my_large_replica
  3  where factory = 'Outer Space'
  4  /

    ID_COL
----------
PRODUCT_NAME
--------------------------------------------------------------------------------
     74000
Revolutionary Pin


SQL>
SQL> select count(*) from hemant.mlog$_my_large_source;

  COUNT(*)
----------
         0

SQL>


Instead of rebuilding / repopulating the Replica Table with all 72,448 rows, I used the MV definition and the MV Log on the Source Table to copy over that 1 new row.

The above demonstration is against 19c.

Here are two older posts, one in March 2009 and the other in January 2012 on an earlier release of Oracle.


27 October, 2019

Basic Replication -- 9 : BUILD DEFERRED

A Materialized View can be created with all the target rows pre-inserted (and subsequently refreshed for changes).  This is the default behaviour.

However, it is possible to define a Materialized View without actually populating it.

You might want to take such a course of action for scenarios like :

1.  Building a number of Materialized Views along with a code migration but not wanting to spend time that would be required to actually populate the MVs  and deferring the population to a subsequent maintenance window after which the code and data will be referenced by the application/users

2.  Building a number of MVs in a Tablespace that is initially small but will be enlarged in the maintenance window to handle the millions of rows that will be inserted

3.  Building an MV definition without actually having all the "clean" Source Table(s) rows currently available, deferring the cleansing of data to a later date and then populating the MV after the cleansing

The BUILD DEFERRED clause comes in handy here.


Let's say that we have a NEW_SOURCE_TABLE (with many rows and/or with rows that are yet to be cleansed) and want to build an "empty" MV on it  (OR that this MV is one of a number of MVs that are being built together simply for migration of dependent code, without the data).

SQL> desc new_source_table
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 ID                                        NOT NULL NUMBER
 DATA_ELEMENT_1                                     VARCHAR2(15)
 DATA_ELEMENT_2                                     VARCHAR2(15)
 DATE_COL                                           DATE

SQL>
SQL> create materialized view log on new_source_table;
create materialized view log on new_source_table
*
ERROR at line 1:
ORA-12014: table 'NEW_SOURCE_TABLE' does not contain a primary key constraint


SQL> create materialized view log on new_source_table with rowid;

Materialized view log created.

SQL>
SQL> create materialized view new_mv
  2  build deferred
  3  refresh with rowid
  4  as select id as id_number,
  5  data_element_1 as data_key,
  6  data_element_2 as data_val,
  7  date_col as data_date
  8  from new_source_table
  9  /

Materialized view created.

SQL>


Notice that my Source Table currently does not have a Primary Key.  The MV Log can be created with the "WITH ROWID" clause in the absence of the Primary Key.
The Materialized View is also built with the ROWID as the Refresh cannot use a Primary Key.
Of course, you may well have a Source Table with a Primary Key.  In that case, you can continue to default using the Primary Key instead of the ROWID

Once the Source Table is properly populated / cleansed and/or the tablespace containing the MV is large enough, the MV is first refreshed with a COMPLETE Refresh and subsequently with FAST Refresh's.

SQL> select count(*) from new_source_table;

  COUNT(*)
----------
       106

SQL> execute dbms_mview.refresh('NEW_MV','C',atomic_refresh=>FALSE);

PL/SQL procedure successfully completed.

SQL>


Subsequently, when one or more rows are inserted/updated in the Source Table, the next Refresh is a Fast Refresh.

SQL> execute dbms_mview.refresh('NEW_MV','F');

PL/SQL procedure successfully completed.

SQL>
SQL> select mview_name, refresh_mode, refresh_method, last_refresh_type
  2  from user_mviews
  3  where mview_name = 'NEW_MV'
  4  /

MVIEW_NAME         REFRESH_M REFRESH_ LAST_REF
------------------ --------- -------- --------
NEW_MV             DEMAND    FORCE    FAST

SQL>


Thus, we started off with an empty MV and later used REFRESHs (COMPLETE and FAST) to populate it.


19 October, 2019

Basic Replication -- 8 : REFRESH_MODE ON COMMIT

So far, in previous posts in this series, I have demonstrated Materialized Views that set to REFRESH ON DEMAND.

You can also define a Materialized View that is set to REFRESH ON COMMIT -- i.e. every time DML against the Source Table is committed, the MV is also immediately updated.  Such an MV must be in the same database  (you cannot define an ON COMMIT Refresh across two databases  -- to do so, you have to build your own replication code, possibly using Database Triggers or external methods of 2-phase commit).

Here is a quick demonstration, starting with a Source Table in the HEMANT schema and then building a FAST REFRESH MV in the HR schema.

SQL> show user
USER is "HEMANT"
SQL> create table hemant_source_tbl (id_col number not null primary key, data_col varchar2(30));

Table created.

SQL> grant select on hemant_source_tbl to hr;

Grant succeeded.

SQL> create materialized view log on hemant_source_tbl;

Materialized view log created.

SQL> grant select on mlog$_hemant_source_tbl to hr;

Grant succeeded.

SQL>
SQL> grant create materialized view to hr;

Grant succeeded.

SQL> grant on commit refresh on hemant_source_tbl to hr;

Grant succeeded.

SQL>
SQL> grant on commit refresh on mlog$_hemant_source_tbl to hr;

Grant succeeded.

SQL>


Note : I had to grant the CREATE MATERIALIZED VIEW privilege to HR for this test case. Also, as the MV is to Refresh ON COMMIT, two additional object-level grants on the Source Table and the Materialized View Log are required as the Refresh is across schemas.

SQL> connect hr/HR@orclpdb1
Connected.
SQL> create materialized view hr_mv_on_commit
  2  refresh fast on commit
  3  as select id_col as primary_key_col, data_col as value_column
  4  from hemant.hemant_source_tbl;

Materialized view created.

SQL>


Now that the Materialized View is created successfully, I will test DML against the table and check that an explicit REFRESH call (e.g. DBMS_MVIEW.REFRESH or DBMS_REFRESH.REFRESH) is not required.

SQL> connect hemant/hemant@orclpdb1
Connected.
SQL> insert into hemant_source_tbl values (1,'First');

1 row created.

SQL> commit;

Commit complete.

SQL> select * from hr.hr_mv_on_commit;

PRIMARY_KEY_COL VALUE_COLUMN
--------------- ------------------------------
              1 First

SQL> connect hr/HR@orclpdb1
Connected.
SQL> select * from hr_mv_on_commit;

PRIMARY_KEY_COL VALUE_COLUMN
--------------- ------------------------------
              1 First

SQL>


The Materialized View in the HR schema was refreshed immediately, without an explicit REFRESH call.

Remember : An MV that is to REFRESH ON COMMIT must be in the same database as the Source Table.




12 October, 2019

Basic Replication -- 7 : Refresh Groups

So far, all my blog posts in this series cover "single" Materialized Views (even if I have created two MVs, they are independent of each other and can be refreshed at different schedules).

A Refresh Group is what you would define if you want multiple MVs to be refreshed to the same point in time.  This allows for
(a) data from transaction that touch multiple tables
or
(b) views of multiple tables
to be consistent in the target MVs.

For example, if you have SALES_ORDER and LINE_ITEMS tables and the MVs on these are refreshed at different times, you might see the ORDER (Header) without the LINE_ITEMs (or, worse, in the absence of Referential Integrity constraints, LINE_ITEMs without the ORDER (Header) !).

Here's a demo, using the HR  DEPARTMENTS and EMPLOYEES table with corresponding MVs built in the HEMANT schema.

SQL> show user
USER is "HR"
SQL> select count(*) from departments;

  COUNT(*)
----------
        27

SQL> select count(*) from employees;

  COUNT(*)
----------
       107

SQL> 
SQL> grant select on departments to hemant;

Grant succeeded.

SQL> grant select on employees to hemant;

Grant succeeded.

SQL>
SQL> create materialized view log on departments;

Materialized view log created.

SQL> grant select, delete on mlog$_departments to hemant;

Grant succeeded.

SQL>
SQL> create materialized view log on employees;

Materialized view log created.

SQL> grant select, delete on mlog$_employees to hemant;

Grant succeeded.

SQL>
SQL>


Having created the source MV Logs  note that I have to grant privileges to the account (HEMANT) that will be reading and deleting from the MV Logs.

Next, I setup the MVs and the Refresh Group

SQL> show user
USER is "HEMANT"
SQL>
SQL> select count(*) from hr.departments;

  COUNT(*)
----------
        27

SQL> select count(*) from hr.employees;

  COUNT(*)
----------
       107

SQL>
SQL>
SQL> create materialized view mv_dept
  2  refresh fast on demand
  3  as select department_id as dept_id, department_name as dept_name
  4  from hr.departments
  5  /

Materialized view created.

SQL>
SQL> create materialized view mv_emp
  2  refresh fast on demand
  3  as select department_id as dept_id, employee_id as emp_id,
  4  first_name, last_name, hire_date
  5  from hr.employees
  6  /

Materialized view created.

SQL>
SQL> select count(*) from mv_dept;

  COUNT(*)
----------
        27

SQL> select count(*) from mv_emp;

  COUNT(*)
----------
       107

SQL>
SQL> execute dbms_refresh.make(-
> name=>'HR_MVs',-
> list=>'MV_DEPT,MV_EMP',-
> next_date=>sysdate+0.5,-
> interval=>'sysdate+1');

PL/SQL procedure successfully completed.

SQL>
SQL> commit;

Commit complete.

SQL>


Here, I have built two MVs and then a Refresh Group called "HR_MVS".  The first refresh will be 12hours from now and every subsequent refresh will be after 24hours.  (The Refresh Interval must be set to what would be larger than the time taken to execute the actual Refresh).

However, I can manually execute the Refresh after new rows are populated into the source tables. First, I insert new rows

SQL> show user
USER is "HR"
SQL> insert into departments (department_id, department_name)
  2  values
  3  (departments_seq.nextval, 'New Department');

1 row created.

SQL> select department_id
  2  from departments
  3  where department_name = 'New Department';

DEPARTMENT_ID
-------------
          280

SQL> insert into employees(employee_id, first_name, last_name, email, hire_date, job_id, department_id)
  2  values
  3  (employees_seq.nextval, 'Hemant', 'Chitale', 'hkc@myenterprise.com', sysdate, 'AD_VP', 280);

1 row created.

SQL> select employee_id
  2  from employees
  3  where first_name = 'Hemant';

EMPLOYEE_ID
-----------
        208

SQL> commit;

Commit complete.

SQL>


Now that there are new rows, the target MVs must be refreshed together.

SQL> connect hemant/hemant@orclpdb1
Connected.
SQL> execute dbms_refresh.refresh('HR_MVS');

PL/SQL procedure successfully completed.

SQL> select count(*) from mv_dept;

  COUNT(*)
----------
        28

SQL> select count(*) from mv_emp;

  COUNT(*)
----------
       108

SQL>
SQL> select * from mv_dept
  2  where dept_id=280;

   DEPT_ID DEPT_NAME
---------- ------------------------------
       280 New Department

SQL> select * from mv_emp
  2  where emp_id=208;

   DEPT_ID     EMP_ID FIRST_NAME           LAST_NAME                 HIRE_DATE
---------- ---------- -------------------- ------------------------- ---------
       280        208 Hemant               Chitale                   12-OCT-19

SQL>


Both MVs have been Refresh'd together as an ATOMIC Transaction.  If either of the two MVs had failed to refresh (e.g. unable to allocate extent to grow the MV), both the INSERTs would be rolled back.  (Note : It is not a necessary requirement that both source tables have new / updated rows, the Refresh Group works even if only one of the two tables has new / updated rows).

Note : I have used DBMS_REFRESH.REFRESH (instead of DBMS_MVIEW.REFRESH) to execute the Refresh.

You can build multiple Refresh Groups, each consisting of *multiple* Source Tables from the same source database.
You would define each Refresh Group to maintain consistency of data across multiple MVs (sourced from different tables).
Besides the Refresh Group on two HR tables, I could have, within the HEMANT schema, more Refresh Groups on FINANCE schema tables as well.

(Can you have a Refresh Group sourcing from tables from different schemas ?  Try that out !)


What's the downside of Refresh Groups ?    
Undo and Redo !  Every Refresh consists of INSERT/UPDATE/DELETE operations on the MVs.  And if any one of the MVs fails to Refresh, the entire set of DMLs (across all the MVs in the Refresh Group) has to *Rollback* !