Back to Articles

Retrieval-Augmented Generation (RAG) - with Oracle Database

Retrieval-Augmented Generation (RAG) - with Oracle Database

What is RAG?

Retrieval-Augmented Generation (RAG) is a technique that combines:

  • Retrieval: Fetching relevant, up-to-date information from a data source (like a database, documents, or search engine).
  • Generation: Using a language model (LLM) to process that information and generate a response, summary, or decision.

How RAG is Used in Your Workflow (For example: A customer table in Oracle database)

  • The script retrieves live customer data from Oracle (retrieval).
  • It builds a prompt with this data and sends it to Gemini (generation) in my example.
  • Here Gemini generates a summary or decision based on the actual, current data

What if You Don’t Use RAG?

If you use only a language model (LLM) without retrieval:

  • The LLM relies solely on its training data, which may be outdated or generic.
  • It cannot access your real-time or private data (like your Oracle customers).
  • Answers may be less accurate, less relevant, or even hallucinated.

Example:

-- SQL script to create a sample 'customers' table for RAG demo
CREATE TABLE customers (
    customer_id NUMBER GENERATED BY DEFAULT AS IDENTITY PRIMARY KEY,
    customer_name VARCHAR2(100),
    account_status VARCHAR2(20),
    balance NUMBER(12,2)
);

-- Insert sample data
INSERT INTO customers (customer_name, account_status, balance) VALUES ('Alice', 'Active', 1200.50);
INSERT INTO customers (customer_name, account_status, balance) VALUES ('Bob', 'Inactive', 0.00);
INSERT INTO customers (customer_name, account_status, balance) VALUES ('Charlie', 'Active', 500.00);
INSERT INTO customers (customer_name, account_status, balance) VALUES ('Diana', 'Overdue', -50.00);
INSERT INTO customers (customer_name, account_status, balance) VALUES ('Eve', 'Active', 3000.00);
COMMIT;

  • Without RAG:
       “Which customers need attention?”
       → The LLM guesses based on general knowledge, not your actual data.

  • With RAG:
      “Which customers need attention?” (with your real customer data provided) → The LLM             analyses your actual data and gives a specific, actionable answer.

Key Differences

Aspect

With RAG

Without RAG

Data Freshness

Uses live, up-to-date data

Relies on old training data

Personalization

Answers are specific to your data

Answers are generic

Accuracy

High, as it uses real context

Lower, may hallucinate

Use Cases

Data-driven insights, automation, search

General Q&A, brainstorming

Here is the sample result, 

Prompt sent to Gemini: (use python script for RAG demo , where you can connect database and interact using   Gemini API)

Given the following customer data, summarize which customers need attention: Customer: Alice, Status: Active, Balance: 1200.5

Customer: Bob, Status: Inactive, Balance: 0.0
Customer: Charlie, Status: Active, Balance: 500.0
Customer: Diana, Status: Overdue, Balance: -50.0
Customer: Eve, Status: Active, Balance: 3000.0


Gemini API Output:

Based on the provided data, here's a summary of customers who likely need attention:

*   **Bob:** Inactive status and zero balance likely indicates churn. Should investigate why he's inactive and attempt to re-engage.

*   **Diana:** Overdue status and negative balance is a clear priority. Needs immediate attention to resolve the overdue payment.

Current Customers:

ID: 1, Name: Alice, Status: Active, Balance: 1200.5
ID: 2, Name: Bob, Status: Inactive, Balance: 0.0
ID: 3, Name: Charlie, Status: Active, Balance: 500.0
ID: 4, Name: Diana, Status: Overdue, Balance: -50.0
ID: 5, Name: Eve, Status: Active, Balance: 3000.0