Azure Cosmos DB for GraphRAG: A Single-Platform Solution for GenAI

Introduction

In the evolving landscape of enterprise architecture, organisations often find themselves managing a complex web of specialised databases - Neo4j for graph operations, traditional databases for document storage, and various other systems for different data needs. This fragmentation not only increases operational complexity but also significantly impacts costs and security management. However, Azure Cosmos DB's Gremlin API offers a compelling alternative: a unified platform that brings together graph capabilities with the robustness of a fully managed database service.

This article explores how to implement a Graph-based Retrieval Augmented Generation (RAG) system using Azure Cosmos DB's Gremlin API, eliminating the need for separate specialised graph databases. We'll demonstrate how this approach not only simplifies the technology stack but also provides enterprise-grade features like automatic scaling, global distribution, and comprehensive security controls - all while potentially reducing total cost of ownership.

Benefits of Azure Cosmos DB Gremlin API

Elastic Scalability of Throughput and Storage: Azure Cosmos DB supports horizontally scalable graph databases, meaning the graph can grow naturally with your business. As more data is added, the system automatically distributes and balances it across multiple servers, removing the need for manual intervention.

Global Reach, Local Performance: For businesses operating across multiple regions, Cosmos DB can automatically replicate your knowledge graph to the Azure regions you choose. This means whether your team is in London, Singapore, or New York, they can read from a nearby replica and experience the same snappy response times when accessing information.

Simplified Development and Maintenance: Instead of multiple specialised databases, your development team can focus on one platform that handles everything. The familiar Gremlin query language means faster development cycles, while automatic indexing and managed services reduce maintenance overhead significantly.

Enterprise-Ready Security and Reliability: Built with enterprise needs in mind, Cosmos DB provides robust security features and reliability guarantees. Your sensitive business data is protected with encryption at rest and in transit, while automatic backups and regional failover ensure business continuity.

Cost-Effective Scaling: Rather than maintaining separate systems for different data needs, consolidating on Cosmos DB can lead to significant cost savings. You pay for what you use, and eliminating multiple database licences and their maintenance costs can substantially reduce total cost of ownership.

Understanding Graph Databases

Imagine organising your company's information like a social network. Just as LinkedIn shows connections between people, positions, and companies, a graph database stores data by focusing on relationships. Instead of computing relationships at query time, the graph database approach persists them in the storage layer, which leads to highly efficient graph retrieval operations.

Key Graph Components

Vertices (Nodes) and Edges

  • Vertices: Think of these as the nouns in your data - documents, people, or concepts
  • Edges: These are the verbs - the relationships that connect vertices, like "reports to" or "authored by"

Properties

Each vertex and edge can have properties - additional information that describes them. For example:

  • A document vertex might have properties like title, creation_date, and file_size
  • An "authored by" edge might have properties like date_authored and version_number

Labels

Labels help categorise your vertices and edges. Just like tagging photos on social media:

  • A vertex might have the label "Document" or "Person"
  • An edge might be labeled "AUTHORED" or "REFERENCES"

Person (Label)
  Vertex: Sarah
  Properties:
    - department: "Engineering"
    - role: "Tech Lead"
    
AUTHORED (Label)
  Edge: Sarah -> TechnicalSpec
  Properties:
    - date: "2024-01-13"
    - version: "1.0"

This structure makes complex queries intuitive and efficient, perfect for modern knowledge management systems.

GraphRAG: Enhancing GenAI RAG Systems

Traditional RAG systems, while effective, often miss crucial connections in your knowledge base. They treat documents as independent entities, relying heavily on vector similarity to find relevant information. This approach can lead to missing context and related documents that might be crucial for comprehensive answers.

Why GraphRAG Excels for GenAI

Contextual Understanding: Instead of just finding similar documents, GraphRAG understands how information is connected. When a user asks your LLM about cloud security, retrieval doesn't just surface individual documents - it surfaces the entire web of related policies, implementations, and dependencies.

Relationship-Based Retrieval: Traditional RAG might miss crucial information that isn't semantically similar but is logically connected. GraphRAG follows actual relationships:

    • Document A references Policy B
    • Policy B impacts Systems C and D
    • Systems C and D have specific compliance requirements
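
The chain of relationships above can be sketched with a small in-memory graph. This is only an illustration of following edges hop by hop; the edge labels `IMPACTS` and `REQUIRES`, the requirement names, and the `related` helper are assumptions, not part of any library.

```python
from collections import deque

# Sample edges as (source, label, target); the data is made up for illustration
EDGES = [
    ("Document A", "REFERENCES", "Policy B"),
    ("Policy B", "IMPACTS", "System C"),
    ("Policy B", "IMPACTS", "System D"),
    ("System C", "REQUIRES", "Compliance Requirement C1"),
    ("System D", "REQUIRES", "Compliance Requirement D1"),
]


def related(start: str, edges, max_hops: int = 3) -> list[tuple[str, int]]:
    """Breadth-first expansion from `start`, returning (node, hops) in discovery order."""
    adjacency = {}
    for source, _label, target in edges:
        adjacency.setdefault(source, []).append(target)

    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                found.append((neighbour, hops + 1))
                queue.append((neighbour, hops + 1))
    return found


context = related("Document A", EDGES)
print(context)
```

None of the compliance requirements needs to be textually similar to Document A to be found; they are reached purely because the edges connect them.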

Enhanced Context:

# Traditional RAG - Limited to similarity search
similar_docs = vector_search("cloud security compliance")

# Graph RAG - Follows actual relationships
compliance_docs = graph.traversal().V() \
    .hasLabel("Document") \
    .has("category", "cloud") \
    .out("REFERENCES") \
    .hasLabel("ComplianceRequirement") \
    .in_("IMPLEMENTS") \
    .hasLabel("SecurityControl")
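
One practical caveat: to our knowledge, Cosmos DB's Gremlin endpoint expects traversals as query text rather than gremlin-python bytecode, so a fluent traversal like the one above is typically sent as a string. The sketch below builds that string with a parameter binding; `compliance_controls_query` is a hypothetical helper, and support for bindings on your account is an assumption worth verifying.

```python
def compliance_controls_query(category: str) -> tuple[str, dict[str, str]]:
    """Return the traversal as Gremlin text plus its parameter bindings."""
    query = (
        "g.V().hasLabel('Document')"
        ".has('category', category)"  # `category` is resolved from the bindings
        ".out('REFERENCES').hasLabel('ComplianceRequirement')"
        ".in('IMPLEMENTS').hasLabel('SecurityControl')"
    )
    return query, {"category": category}


query, bindings = compliance_controls_query("cloud")
print(query)
print(bindings)

# With gremlin-python and a live endpoint, submission would look roughly like:
#   client.submit(query, bindings=bindings).all().result()
```

Passing the category as a binding rather than formatting it into the string avoids quoting problems when the value comes from user input.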

Hierarchical Understanding:

from gremlin_python.process.graph_traversal import __

# Trace document dependencies and impact
dependency_chain = graph.traversal().V() \
    .hasLabel("Document") \
    .has("title", "Cloud Architecture") \
    .repeat(__.out("DEPENDS_ON")) \
    .until(__.not_(__.out("DEPENDS_ON"))) \
    .path().by("title")
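
To make the repeat/until/path pattern concrete, here is a small in-memory sketch of what such a traversal computes: it follows DEPENDS_ON edges until it reaches documents with no further dependencies and returns one path per leaf. The `dependency_paths` helper and the sample titles are made up for illustration.

```python
def dependency_paths(depends_on: dict[str, list[str]], start: str) -> list[list[str]]:
    """Return every DEPENDS_ON path from `start` down to a document with no dependencies."""
    paths = []
    # The repeat step runs before the until check, so begin with one hop already taken
    stack = [[start, dep] for dep in depends_on.get(start, [])]
    while stack:
        path = stack.pop()
        next_deps = depends_on.get(path[-1], [])
        if not next_deps:
            paths.append(path)  # leaf reached: no outgoing DEPENDS_ON edges
            continue
        for dep in next_deps:
            if dep not in path:  # cycle guard (an addition; the Gremlin traversal has none)
                stack.append(path + [dep])
    return sorted(paths)


# Hypothetical sample data
sample = {
    "Cloud Architecture": ["Network Design", "Identity Policy"],
    "Network Design": ["Security Baseline"],
    "Identity Policy": ["Security Baseline"],
}
paths = dependency_paths(sample, "Cloud Architecture")
for path in paths:
    print(" -> ".join(path))
```

Note that a starting document with no dependencies yields no paths at all, because the first hop has nowhere to go.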

By capturing these real-world connections, GraphRAG enhances your AI's capabilities in practical ways. It knows which documents rely on others, tracks how information flows through your organisation, and understands the ripple effects of changes. This means you get answers that aren't just accurate - they're relevant to your specific business context and consider all the important connections that might otherwise be missed.

Implementing GraphRAG with Azure Cosmos DB

Let's explore how to implement a GraphRAG system using Azure Cosmos DB's Gremlin API and LangChain. This implementation showcases how to structure and query your knowledge graph effectively for GenAI applications. For simplicity, I will showcase only the basics: connecting to the graph, creating it, and using it for Gremlin-based GraphRAG.

Core Components

Our implementation centres around the AzureGremlinProcessor class, which handles document processing and knowledge graph interactions. Here's how it works:

The integration of LangChain through the GremlinQAChain and GremlinGraph classes enables sophisticated document handling. When processing documents, LangChain intelligently manages the conversion of unstructured text into structured graph components while preserving semantic relationships. This is evident in our implementation where the RecursiveCharacterTextSplitter ensures documents are divided meaningfully while maintaining context:

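# Import paths assume recent langchain-community, langchain-openai and langchain-text-splitters packages
import os

from langchain_community.chains.graph_qa.gremlin import GremlinQAChain
from langchain_community.graphs import GremlinGraph
from langchain_openai import AzureChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter

class AzureGremlinProcessor: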

    def __init__(self, cosmosdb_name, cosmosdb_db_id, cosmosdb_db_graph_id, cosmosdb_access_key):
        self.endpoint = f"wss://{cosmosdb_name}.gremlin.cosmos.azure.com:443/"
        self.username = f"/dbs/{cosmosdb_db_id}/colls/{cosmosdb_db_graph_id}"
        self.password = cosmosdb_access_key
       
        self.graph = GremlinGraph(
            url=self.endpoint,
            username=self.username,
            password=self.password
        )
       
        self.llm = AzureChatOpenAI(
            openai_api_version=os.getenv('azure_openai_api_version'),
            deployment_name=os.getenv('azure_openai_deployment_name'),
            azure_endpoint=os.getenv('azure_openai_endpoint'),
            api_key=os.getenv('azure_openai_api_key'),
            temperature=0
        )
       
        self.qa_chain = GremlinQAChain.from_llm(
            llm=self.llm,
            graph=self.graph,
            verbose=True,
            allow_dangerous_requests=True,
            query_prompt="g.V().hasLabel('chunk').values('content')"
        )
       
        self.text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000,
            chunk_overlap=200
        )
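
To illustrate what the chunk_size and chunk_overlap settings mean, the sketch below uses a deliberately simplified fixed-window splitter. It is not how RecursiveCharacterTextSplitter works internally (that splitter prefers natural separators such as paragraph and sentence breaks); `sliding_chunks` is a hypothetical helper.

```python
def sliding_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows where each window repeats the tail of the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# 2,500 characters of sample text
text = "".join(str(i % 10) for i in range(2500))
chunks = sliding_chunks(text)
print([len(c) for c in chunks])
print(chunks[0][-200:] == chunks[1][:200])
```

The overlap means the last 200 characters of one chunk reappear at the start of the next, so a sentence cut at a boundary still appears whole in at least one chunk.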

Document Processing and Graph Creation

The system processes documents intelligently by:

  1. Breaking documents into meaningful chunks while preserving context
  2. Creating graph nodes for both documents and chunks
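  3. Establishing relationships between chunks to maintain document flow

import os
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.graphs.graph_document import GraphDocument, Node, Relationship

def process_document(self, file_path):
    # Assumption: loading, splitting and the ID scheme are not shown in the original excerpt
    document = PyPDFLoader(file_path).load()
    chunks = self.text_splitter.split_documents(document)
    doc_id = os.path.basename(file_path)

    # Create document node
    doc_node = Node(
        id=doc_id,
        type="document",
        properties={"content": document[0].page_content}
    )
    nodes, relationships = [doc_node], []

    # Create chunk nodes and relationships
    for i, chunk in enumerate(chunks):
        chunk_node = Node(
            id=f"{doc_id}-chunk-{i}",  # assumed ID scheme
            type="chunk",
            properties={"content": chunk.page_content}
        )
        nodes.append(chunk_node)
        relationships.append(Relationship(source=doc_node, target=chunk_node, type="contains"))

    graph_doc = GraphDocument(nodes=nodes, relationships=relationships, source=document[0])
    self.graph.add_graph_documents([graph_doc])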
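
The three steps listed above can also be sketched without any external dependencies, which makes the resulting graph shape easier to see. The `GraphNode` and `GraphEdge` classes, the ID scheme, and the "next" edge label for document flow are assumptions for illustration; only the "contains" relationship comes from our graph.

```python
from dataclasses import dataclass, field


@dataclass
class GraphNode:
    id: str
    type: str
    properties: dict = field(default_factory=dict)


@dataclass
class GraphEdge:
    source: str
    target: str
    type: str


def build_document_graph(doc_id: str, chunk_texts: list[str]):
    """Create one document node, one node per chunk, and the edges between them."""
    nodes = [GraphNode(doc_id, "document")]
    edges = []
    previous_id = None
    for i, text in enumerate(chunk_texts):
        chunk_id = f"{doc_id}-chunk-{i}"
        nodes.append(GraphNode(chunk_id, "chunk", {"content": text, "position": i}))
        edges.append(GraphEdge(doc_id, chunk_id, "contains"))  # document -> chunk
        if previous_id is not None:
            edges.append(GraphEdge(previous_id, chunk_id, "next"))  # preserves reading order
        previous_id = chunk_id
    return nodes, edges


nodes, edges = build_document_graph("report.pdf", ["intro", "body", "summary"])
for edge in edges:
    print(f"{edge.source} -[{edge.type}]-> {edge.target}")
```

Linking consecutive chunks lets a retriever pull in the neighbouring chunks of a match, rather than returning an isolated fragment.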

Intelligent Querying

The system leverages LangChain's GremlinQAChain to provide context-aware responses:

def query_graph(self, question: str) -> str:
    return self.qa_chain.invoke(question)

This implementation offers several advantages:

  1. Seamless integration with Azure services
  2. Automatic handling of document relationships
  3. Context-aware query responses
  4. Scalable document processing

By structuring our RAG system this way, we create a powerful knowledge graph that understands not just the content of documents, but their relationships and dependencies as well.
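
Conceptually, a graph QA chain of this kind works in three stages: turn the question into a graph query, run the query, and have the LLM answer from the results. The sketch below shows that flow with stand-in functions rather than LangChain's actual classes; every name in it is hypothetical.

```python
from typing import Callable


def answer_with_graph(
    question: str,
    generate_query: Callable[[str], str],
    run_query: Callable[[str], list[str]],
    summarise: Callable[[str, list[str]], str],
) -> dict:
    """Run the three stages in order and keep the intermediate results for inspection."""
    query = generate_query(question)       # stage 1: question -> Gremlin text
    context = run_query(query)             # stage 2: Gremlin text -> graph results
    answer = summarise(question, context)  # stage 3: results -> natural-language answer
    return {"query": query, "context": context, "answer": answer}


# Stand-ins for the LLM and the graph (made-up data)
CHUNK_QUERY = "g.V().hasLabel('chunk').values('content')"
fake_graph = {CHUNK_QUERY: ["Chunk about key predictions."]}


def fake_summarise(question: str, context: list[str]) -> str:
    if not context:
        return "No context found."
    return f"Based on {len(context)} chunk(s): {context[0]}"


result = answer_with_graph(
    "What were the key predictions?",
    generate_query=lambda q: CHUNK_QUERY,
    run_query=lambda query: fake_graph.get(query, []),
    summarise=fake_summarise,
)
print(result["answer"])
```

Returning the generated query alongside the answer is a design choice that makes it easier to debug cases where the LLM produces a traversal that returns nothing.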

Running the System

When we process a document e.g. "deloitte-finance-2025-revisited.pdf" and query the system, we see the true value of graph-based retrieval. Consider this example query about Deloitte's 2025 predictions:

result = processor.query_graph("What happened in 2018 and list some key predictions for 2018.")

The system returns a comprehensive response that demonstrates its understanding of temporal context and related information:

This response showcases how GraphRAG maintains temporal context within documents, connects related concepts and predictions, and provides structured, relevant information whilst synthesising content from across document sections.

The implementation demonstrates that GraphRAG isn't just about storing and retrieving information - it's about understanding and presenting knowledge in a way that preserves context and relationships, making it an invaluable tool for GenAI knowledge management.

Visualising the Knowledge Graph in Azure Cosmos DB

The power of GraphRAG comes alive when we examine how it structures and visualises document information within Azure Cosmos DB's graph database. The visual representation transforms complex document relationships into an intuitive network that enhances document retrieval and analysis capabilities. (Note: this is the graph representation of our deloitte-finance-2025-revisited.pdf document.)

Graph Structure Overview

The visualisation demonstrates a document-centric knowledge graph where:

  1. The central node represents our main document "deloitte-finance-2025-revisited.pdf" with its unique identifier and properties including:
    • Document type and label
    • File name and path
    • Document-specific metadata
  2. Surrounding nodes represent individual chunks of the document, connected to the central node through "contains" relationships. This structure enables:
    • Granular access to document sections
    • Maintained context through chunk relationships
    • Efficient navigation between related content

This visualisation demonstrates how GraphRAG transforms traditional document storage into an interconnected knowledge network, enabling more intelligent and context-aware document retrieval for GenAI applications.

Conclusion

The implementation of GraphRAG using Azure Cosmos DB's Gremlin API represents a significant advancement in how organisations can manage and utilise their knowledge bases. By unifying graph capabilities with enterprise-grade database features, this approach offers a compelling solution to the challenges of fragmented data systems and complex information retrieval.

The benefits of this approach extend beyond technical elegance—it delivers real business value through improved context awareness, reduced operational complexity, and enhanced cost efficiency. As organisations continue to grapple with growing volumes of interconnected information, the ability to understand and navigate relationships between documents and concepts becomes increasingly critical.

For organisations interested in exploring or implementing this solution, contact our team at Advancing Analytics.

Author

Toyosi Babayeju