20 October, 2025

The "Retrieval" for RAG using Semantic Search

 Reusing the sample code from my previous demo, I build a table with 130 sentences describing AI Tools and then use SQL to run the sort of queries that you'd expect RAG to use against an "internal knowledge source".

Here's a preview of the sentences loaded :


SQL> select count(*) from my_data_source;

  COUNT(*)
----------
       130

SQL> select my_sentence from my_data_source fetch first 5 rows only;

MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
        Adobe Firefly: A family of generative AI models integrated into Adobe's Creative Cloud suite. It is used for professional image editing and graphic
 design

        AIVA (Artificial Intelligence Virtual Artist): An AI composer that generates music in various styles for content creators and brands
        Amazon CodeWhisperer: An AI coding assistant from Amazon Web Services that provides contextual code recommendations
        Canva Magic Media: An AI image generator integrated into the user friendly design platform. It is used for creating visuals for social media and pr
esentations

        ChatGPT: An AI chatbot from OpenAI. It can engage in human like conversations, write code, summarize text, and create content such as stories and e
ssays


SQL> select my_sentence from my_data_source where id > 126 ;

MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
        Zubtitle.com   captions for video
        Cohere.ai   large language models
        Grok.com   personal AI assistant
        Claude.ai   advanced AI chatbot

SQL>


The embeddings are then computed and loaded into the target table using the same query (the HuggingFace model ALL_MINILM_L12_V2_AUGMENTED has already been loaded  with the ONNX RunTime engine with the same code as in the previous blog post) :


insert into my_data_vectors
select ds.id as id, my_sentence, to_vector(et.embed_vector) sentence_vector
from
    my_data_source ds,
    dbms_vector_chain.utl_to_embeddings(
       dbms_vector_chain.utl_to_chunks(dbms_vector_chain.utl_to_text(ds.my_sentence), json('{"normalize":"all"}')),
       json('{"provider":"database", "model":"ALL_MINILM_L12_V2_AUGMENTED"}')) t,
    JSON_TABLE(t.column_value, '$[*]' COLUMNS (embed_id NUMBER PATH '$.embed_id', embed_data VARCHAR2(4000) PATH '$.embed_data', embed_vector CLOB PATH '$.embed_vector')) et
/


I now test a few "Retrieval" queries thus :  (Note how the queries return different results with smaller VECTORE_DISTANCE as I refine them) 

SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : Video
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
  2    :text_variable := '&text_input';
  3    SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING  :text_variable as data) into :query_vector;
  4  END;
  5  /
old   2:   :text_variable := '&text_input';
new   2:   :text_variable := 'Video';

PL/SQL procedure successfully completed.

SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
  2  FROM my_data_vectors
  3  ORDER BY 2
  4  FETCH FIRST 3 ROWS ONLY;

MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
        Clipchamp.com   quick video creation
          4.635E-001

        Lumen5.com   AI powered social video
          5.277E-001

        Synths.video   convert blogs into videos
           5.39E-001


SQL>
----------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------



SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : What AI Tool can I use to edit Videos ?
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
  2    :text_variable := '&text_input';
  3    SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING  :text_variable as data) into :query_vector;
  4  END;
  5  /
old   2:   :text_variable := '&text_input';
new   2:   :text_variable := 'What AI Tool can I use to edit Videos ?';

PL/SQL procedure successfully completed.

SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
  2  FROM my_data_vectors
  3  ORDER BY 2
  4  FETCH FIRST 3 ROWS ONLY;

MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
        Wisecut.video   auto edit with AI
           2.62E-001

        Runwayml.com   AI video editing
          2.777E-001

        Gling.ai   YouTube video editor
          3.413E-001


SQL>
SQL>
----------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------



SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : image
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
  2    :text_variable := '&text_input';
  3    SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING  :text_variable as data) into :query_vector;
  4  END;
  5  /
old   2:   :text_variable := '&text_input';
new   2:   :text_variable := 'image';

PL/SQL procedure successfully completed.

SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
  2  FROM my_data_vectors
  3  ORDER BY 2
  4  FETCH FIRST 3 ROWS ONLY;

MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
        VanceAI.com   image enhancement
          5.268E-001

        Canva Magic Media: An AI image generator integrated into the user friendly design platform. It is used for creating visuals for social media and pr
esentations
          5.459E-001

        PicWish.com   photo editing
          5.696E-001


SQL>
SQL>
----------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------



SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : How can I use AI to edit Images ?
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
  2    :text_variable := '&text_input';
  3    SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING  :text_variable as data) into :query_vector;
  4  END;
  5  /
old   2:   :text_variable := '&text_input';
new   2:   :text_variable := 'How can I use AI to edit Images ?';

PL/SQL procedure successfully completed.

SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
  2  FROM my_data_vectors
  3  ORDER BY 2
  4  FETCH FIRST 3 ROWS ONLY;

MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
        Hotpot.ai   AI image editing
          2.805E-001

        Runwayml.com   AI video editing
          3.711E-001

        Wisecut.video   auto edit with AI
          3.921E-001


SQL>
SQL>
----------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------



SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : Can I chat with an AI ?
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
  2    :text_variable := '&text_input';
  3    SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING  :text_variable as data) into :query_vector;
  4  END;
  5  /
old   2:   :text_variable := '&text_input';
new   2:   :text_variable := 'Can I chat with an AI ?';

PL/SQL procedure successfully completed.

SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
  2  FROM my_data_vectors
  3  ORDER BY 2
  4  FETCH FIRST 3 ROWS ONLY;

MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
        Claude.ai   advanced AI chatbot
          2.989E-001

        Claude.ai   conversational assistant
          3.828E-001

        Grok.com   personal AI assistant
          4.025E-001


SQL>
SQL>
----------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------



SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : coding
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
  2    :text_variable := '&text_input';
  3    SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING  :text_variable as data) into :query_vector;
  4  END;
  5  /
old   2:   :text_variable := '&text_input';
new   2:   :text_variable := 'coding';

PL/SQL procedure successfully completed.

SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
  2  FROM my_data_vectors
  3  ORDER BY 2
  4  FETCH FIRST 3 ROWS ONLY;

MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
        Amazon CodeWhisperer: An AI coding assistant from Amazon Web Services that provides contextual code recommendations
          5.543E-001

        Replit.com   write and run code
          5.548E-001

        GitHub Copilot: An AI coding assistant that provides code suggestions and autocompletion within an IDE
          5.677E-001


SQL>
SQL>
----------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------


SQL> ACCEPT text_input CHAR PROMPT 'Enter your query : '
Enter your query : Tell me of a Coding Assistant
SQL> VARIABLE text_variable VARCHAR2(1000)
SQL> VARIABLE query_vector VECTOR
SQL> BEGIN
  2    :text_variable := '&text_input';
  3    SELECT vector_embedding(ALL_MINILM_L12_V2_AUGMENTED USING  :text_variable as data) into :query_vector;
  4  END;
  5  /
old   2:   :text_variable := '&text_input';
new   2:   :text_variable := 'Tell me of a Coding Assistant';

PL/SQL procedure successfully completed.

SQL>
SQL>
SQL> SELECT my_sentence, vector_distance(sentence_vector , :query_vector, COSINE) as Calc_Vector_Distance
  2  FROM my_data_vectors
  3  ORDER BY 2
  4  FETCH FIRST 3 ROWS ONLY;

MY_SENTENCE
----------------------------------------------------------------------------------------------------------------------------------------------------
CALC_VECTOR_DISTANCE
--------------------
        GitHub Copilot: An AI coding assistant that provides code suggestions and autocompletion within an IDE
          4.306E-001

        Amazon CodeWhisperer: An AI coding assistant from Amazon Web Services that provides contextual code recommendations
          4.702E-001

        Replit: An online IDE with an integrated AI assistant that helps with code generation, debugging, and project creation
          4.734E-001


SQL>
SQL>

Thus, in the first pair of examples, if I refine (aka "prompt engineering") my query from "Video" to "What AI Tool can I use to edit Videos ?", I get a better set of responses with  a smaller VECTOR_DISTANCE from my query.

In the second pair, I change my query from "image" to "How can I use AI to edit Images ?"  I get an improved set of responses (again with a smaller VECTOR_DISTANCE).

In the last pair, when I change  my query from "coding" to "Tell me of a Coding Assistant", I get responses with a better match to my query.

As noted in my previous blog post, the "LLM" is actually a "MinLM" called "all-MiniLM-L12-v2" from HuggingFace.

No comments: