RAGenesis features three distinct Retrieval-Augmented Generation (RAG) pipelines, each designed to provide different modes of interaction with religious and philosophical texts like the Torah, Bible, Quran, Bhagavad Gita, and Analects. These pipelines vary in their retrieval mechanisms, the context they provide to the generative model (Llama 3), and the instruction prompts used to shape the AI’s responses. They all build on a shared foundation: texts are chunked into verses, embedded using models like all-MiniLM-L6-v2, stored in a Milvus vector database for semantic searches, and connected via a Semantic Similarity Network (SSN), a graph where verses are nodes and edges represent cosine similarity above a threshold (e.g., 0.5 or 0.75). Centrality metrics from graph theory (e.g., degree, eigenvector, betweenness, closeness) help identify key verses.
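
The construction of the SSN described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's actual code: the function names (`cosine_similarity`, `build_ssn`), the verse identifiers, and the tiny 3-dimensional vectors are all hypothetical stand-ins for real sentence embeddings, and the embedding and Milvus storage steps are left out.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_ssn(embeddings, threshold=0.75):
    """Build an undirected graph as {verse_id: {neighbor_id: similarity}}.

    An edge is added only when the similarity exceeds the threshold.
    """
    graph = {verse_id: {} for verse_id in embeddings}
    for u, v in combinations(embeddings, 2):
        sim = cosine_similarity(embeddings[u], embeddings[v])
        if sim > threshold:
            graph[u][v] = sim
            graph[v][u] = sim
    return graph


# Toy vectors (hypothetical); real embeddings would have hundreds of dimensions.
toy_embeddings = {
    "verse_a": [1.0, 0.0, 0.0],
    "verse_b": [0.9, 0.1, 0.0],
    "verse_c": [0.0, 1.0, 0.0],
}
ssn = build_ssn(toy_embeddings, threshold=0.75)
```

In this toy case only `verse_a` and `verse_b` end up connected, and `verse_c` remains an isolated node. Raising or lowering the threshold trades graph density against the strength of the connections that are kept.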

1. The Oracle

This pipeline is integrated with the “Verse Uni Verse” page and focuses on exploratory, mystical exegesis. It emphasizes semantic similarity to the user’s query for retrieval.

  • Retrieval Mechanism: Performs a semantic search to retrieve the top 5 most similar verses across all texts in “Open” mode, or one verse from each selected text in “Ecumenical” mode. Similarity is measured by the cosine similarity between the query embedding and each verse embedding.
  • Generative Prompt: Guides the model to respond in a mystical, thoughtful tone, appreciating the query and retrieved verses while providing interpretive insights.
  • Use Case: Ideal for broad, comparative queries, such as exploring themes like compassion across traditions, fostering ecumenical dialogue.
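
The difference between the two Oracle modes can be illustrated with a brute-force search over an in-memory list. This is only a sketch under assumptions: in RAGenesis the search runs against Milvus, whereas here the functions `retrieve_open` and `retrieve_ecumenical`, the record layout, and the 2-dimensional vectors are hypothetical and chosen for readability.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_open(query_vec, corpus, k=5):
    """'Open' mode: the k most similar verses, regardless of source text."""
    scored = [(cosine_similarity(query_vec, r["vector"]), r["id"]) for r in corpus]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(verse_id, score) for score, verse_id in scored[:k]]


def retrieve_ecumenical(query_vec, corpus, selected_texts):
    """'Ecumenical' mode: the single best verse from each selected text."""
    best = {}
    for r in corpus:
        if r["text"] not in selected_texts:
            continue
        score = cosine_similarity(query_vec, r["vector"])
        if r["text"] not in best or score > best[r["text"]][1]:
            best[r["text"]] = (r["id"], score)
    return best


# Hypothetical records with toy 2-dimensional vectors.
corpus = [
    {"text": "Torah", "id": "torah_1", "vector": [1.0, 0.0]},
    {"text": "Torah", "id": "torah_2", "vector": [0.0, 1.0]},
    {"text": "Quran", "id": "quran_1", "vector": [0.8, 0.6]},
    {"text": "Analects", "id": "analects_1", "vector": [0.6, 0.8]},
]
query = [1.0, 0.0]

open_hits = retrieve_open(query, corpus, k=2)
ecumenical_hits = retrieve_ecumenical(query, corpus, ["Torah", "Analects"])
```

Open mode can return several verses from the same text if they dominate the ranking, while Ecumenical mode guarantees one verse per selected tradition even when that verse scores lower than others overall.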

2. The Exegete

Linked to the “Semantic Network” page in “Main Verses” mode, this pipeline prioritizes structural importance in the SSN for a more interpretive, concise analysis.

  • Retrieval Mechanism: Identifies and retrieves verses with the highest centrality scores across four graph metrics (degree, eigenvector, betweenness, closeness). These “main verses” represent the most interconnected or influential ideas in the network.
  • Generative Prompt: Instructs the model to deliver responsible, succinct interpretations that distill the core messages from the source texts.
  • Use Case: Suited for extracting key teachings or overarching themes, such as summarizing central concepts in a text’s semantic structure.
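
To make the idea of "main verses" concrete, the sketch below ranks nodes of a small unweighted graph by two of the four metrics, degree and closeness. It is an illustration under assumptions rather than the project's implementation: the graph, the verse identifiers, and the helper names are hypothetical, and eigenvector and betweenness centrality are omitted to keep the example short (graph libraries such as NetworkX provide all four).

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def bfs_distances(graph, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in graph[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def closeness_centrality(graph):
    """Closeness, scaled by the share of nodes reachable from each node."""
    n = len(graph)
    result = {}
    for node in graph:
        dist = bfs_distances(graph, node)
        reachable = len(dist) - 1  # exclude the node itself
        total = sum(dist.values())
        if reachable == 0 or n <= 1:
            result[node] = 0.0
        else:
            result[node] = (reachable / total) * (reachable / (n - 1))
    return result


def main_verses(graph, top_n=1):
    """Top-ranked verses per metric (ties broken by verse id)."""
    metrics = {
        "degree": degree_centrality(graph),
        "closeness": closeness_centrality(graph),
    }
    return {
        name: sorted(scores, key=lambda v: (-scores[v], v))[:top_n]
        for name, scores in metrics.items()
    }


# Hypothetical SSN with one hub (v1) and one isolated verse (v5).
graph = {
    "v1": {"v2", "v3", "v4"},
    "v2": {"v1"},
    "v3": {"v1", "v4"},
    "v4": {"v1", "v3"},
    "v5": set(),
}
top = main_verses(graph)
```

Here the hub `v1` leads on both metrics. On a real SSN the metrics can disagree, which is one reason to report a top verse per metric instead of a single combined ranking.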

3. The Scientist

Associated with the “Semantic Network” page in “All Verses” mode, this pipeline enables analytical exploration of the full graph structure.

  • Retrieval Mechanism: Allows users to select any verse, then retrieves its subgraph including the top 10 closest neighbors based on closeness centrality. This reveals relational patterns in the SSN.
  • Generative Prompt: Directs the model to take an analytical stance, explaining network structures, centrality relationships, and graph-theoretic implications.
  • Use Case: Best for in-depth structural analysis, like examining how verses cluster or connect across texts in a data-driven way.
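
The subgraph retrieval step might look roughly like the following. This sketch assumes one particular reading of the description above, namely that the selected verse's direct neighbors are ranked by their own closeness centrality score and the top ones are kept. The function `ego_subgraph`, the toy graph, and the precomputed score table are hypothetical.

```python
def ego_subgraph(graph, scores, verse_id, k=10):
    """Return the subgraph induced by a verse and its top-k neighbors.

    Neighbors are ranked by a precomputed centrality score
    (assumed here to be closeness), with ties broken by verse id.
    """
    neighbors = sorted(graph[verse_id], key=lambda v: (-scores[v], v))[:k]
    keep = {verse_id, *neighbors}
    # Keep only the edges whose endpoints both survive the cut.
    return {node: {nb for nb in graph[node] if nb in keep} for node in keep}


# Hypothetical SSN and closeness scores for its four verses.
graph = {
    "v1": {"v2", "v3", "v4"},
    "v2": {"v1", "v3"},
    "v3": {"v1", "v2"},
    "v4": {"v1"},
}
closeness = {"v1": 1.0, "v2": 0.75, "v3": 0.75, "v4": 0.6}

sub = ego_subgraph(graph, closeness, "v1", k=2)
```

With `k=2`, the lower-scoring neighbor `v4` is dropped and the result contains `v1`, `v2`, and `v3` together with the edges among them. This kind of compact neighborhood could then be passed to the model as context.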

These pipelines transform static texts into dynamic, AI-enhanced experiences, with the Oracle promoting unity, the Exegete focusing on essence, and the Scientist revealing underlying connections.