23 April, 2018

Domain Indexes -- 4 : CTXRULE Index

I have earlier provided simple demonstrations of CONTEXT and CTXCAT Indexes.

A CTXRULE Index can be used to build a Document Classification application.  This involves indexing a table of "queries" that define the classification.  Queries use the MATCHES clause.
(Note : Like the CONTEXT Index, a call to SYNC_INDEX is required before the rows are indexed).

SQL> create table common_query_classes
  2  (classification varchar2(64),
  3   query_text varchar2(4000));

Table created.

SQL> create index query_class_index
  2  on common_query_classes (query_text)
  3  indextype is ctxsys.ctxrule
  4  /

Index created.

SQL> 
SQL> insert into common_query_classes
  2  values ('Players','Gavaskar OR Tendulkar OR Chappell OR Imran OR Botham');

1 row created.

SQL> insert into common_query_classes
  2  values ('Grounds','Brabourne OR Wankhede OR Lords');

1 row created.

SQL> commit;

Commit complete.

SQL> 


Note that the query predicates are divided by the OR.  They are NOT listed in Alphabetical order.

Now, I test a few queries :

SQL> exec ctx_ddl.sync_index('QUERY_CLASS_INDEX');

PL/SQL procedure successfully completed.

SQL> 
SQL> select classification
  2  from common_query_classes
  3  where MATCHES (query_text,'Tendulkar is a Player at Brabourne') > 0
  4  /

CLASSIFICATION
----------------------------------------------------------------
Grounds
Players

SQL> 
SQL> select classification
  2  from common_query_classes
  3  where MATCHES (query_text,'Botham') > 0
  4  /

CLASSIFICATION
----------------------------------------------------------------
Players

SQL> select classification
  2  from common_query_classes
  3  where MATCHES (query_text, 'Kohli is a Player at Wankhede') > 0
  4  /

CLASSIFICATION
----------------------------------------------------------------
Grounds

SQL> 


Note that, since Kohli is not in the Players list, the last query doesn't return the Classification "Players".
.
.
.

22 April, 2018

Domain Indexes -- 3 : CTXCAT Index

In previous posts in December 2017, I had demonstrated a CONTEXT Index.

A CONTEXT Index is used for full-text retrieval from large pieces of text (or document formats stored in LOBs)

A CTXCAT Index is best suited for small fragments of text that are to be indexed with other relational data.

Before I begin with the CTXCAT index, in addition to the CTXAPP role (that I had granted during the earlier demonstration), the account also needs the CREATE TRIGGER privilege.

SQL> grant ctxapp to ctxuser;

Grant succeeded.

SQL> grant create trigger to ctxuser;

Grant succeeded.

SQL> 


I can now proceed with the CTXUSER demonstration.

SQL> connect ctxuser/ctxuser
Connected.
SQL> create table books
  2  (book_id  integer primary key,
  3   book_title varchar2(250) not null,
  4   book_author varchar2(80),
  5   book_subject varchar2(25),
  6   shelf_id  integer)
  7  /

Table created.

SQL> 
SQL> insert into books values
  2  (1,'A Study In Scarlet','Arthur Conan Doyle','Mystery',1);

1 row created.

SQL> insert into books values
  2  (2,'The Sign Of Four','Arthur Conan Doyle','Mystery',1);

1 row created.

SQL> insert into books values
  2  (3,'Murder On The Orient Express','Agatha Christie','Mystery',1);

1 row created.

SQL> insert into books values
  2  (4,'A Brief History of Time','Stephen Hawking','Science - Physics',2);

1 row created.

SQL> 
SQL> insert into books values
  2  (5,'2001: A Space Odyssey','Arthur C Clarke','Science Fiction',3);

1 row created.

SQL>
SQL> commit;

Commit complete.

SQL> 


Next, I specify what is called an Index Set -- which specifies the structured columns that are to be included in the CTXCAT Index.  I then define the CTXCAT Index on the BOOK_TITLE column.

SQL> begin
  2  ctx_ddl.create_index_set('books_set');
  3  ctx_ddl.add_index('books_set','book_subject');
  4  ctx_ddl.add_index('books_set','shelf_id');
  5  end;
  6  /

PL/SQL procedure successfully completed.

SQL> 
SQL> create index books_title_index
  2  on books (book_title)
  3  indextype is ctxsys.ctxcat
  4  parameters ('index set books_set')
  5  /

Index created.

SQL> 


Now, I can use the Index to query the table, using the CATSEARCH clause instead of the CONTAINS clause. My query includes both BOOK_TITLE and SHELF_ID

SQL> select book_title,book_author,book_subject,shelf_id
  2  from books
  3  where catsearch (book_title,'History','shelf_id=1') > 0
  4  /

no rows selected

SQL> select book_title,book_author,book_subject,shelf_id
  2  from books
  3  where catsearch (book_title,'History','shelf_id>1') > 0
  4  /

BOOK_TITLE
--------------------------------------------------------------------------------
BOOK_AUTHOR
--------------------------------------------------------------------------------
BOOK_SUBJECT      SHELF_ID
------------------------- ----------
A Brief History of Time
Stephen Hawking
Science - Physics     2


SQL> 


The CTXCAT Index that I built on BOOK_TITLE also includes BOOK_SUBJECT and SHELF_ID as indexed columns by virtue of the INDEX_SET called "BOOKS_SET".

Now, I add another row and verify if I need to Sync the index (as I had to do with the CONTEXT Index earlier).

SQL> insert into books
  2  values 
  3  (6,'The Selfish Gene','Richard Dawkins','Evolution',2);

1 row created.

SQL> commit;
SQL> select book_title,book_author,book_subject,shelf_id
  2  from books
  3  where catsearch (book_title,'Gene','book_subject > ''S'' ') > 0
  4  /

no rows selected

SQL> select book_title,book_author,book_subject,shelf_id
  2  from books
  3  where catsearch (book_title,'Gene','book_subject > ''E'' ') > 0
  4  /

BOOK_TITLE
--------------------------------------------------------------------------------
BOOK_AUTHOR
--------------------------------------------------------------------------------
BOOK_SUBJECT      SHELF_ID
------------------------- ----------
The Selfish Gene
Richard Dawkins
Evolution      2


SQL> 


Note, specifically, how I could use the BOOK_SUBJECT in the query as if looking up a separate index on BOOK_SUBJECT.
The new book was included in the index without a call to CTX_DDL.SYNC_INDEX as would be required for the CONTEXT IndexType.

The portion of the query that is on the BOOK_TITLE column does a Text search on this column but the portions on BOOK_SUBJECT an SHELF_ID behave as with regular indexes.


(I know  that some readers will dispute the subject categorization "Evolution"  but I deliberately threw that in so that I  could show a query that uses a predicate filter not on "Science").

.
.
.