In the evolving landscape of enterprise architecture, organisations often find themselves managing a complex web of specialised databases - Neo4j for graph operations, traditional databases for document storage, and various other systems for different data needs. This fragmentation not only increases operational complexity but also significantly impacts costs and security management. However, Azure Cosmos DB's Gremlin API offers a compelling alternative: a unified platform that brings together graph capabilities with the robustness of a fully managed database service.
This article explores how to implement a Graph-based Retrieval Augmented Generation (RAG) system using Azure Cosmos DB's Gremlin API, eliminating the need for separate specialised graph databases. We'll demonstrate how this approach not only simplifies the technology stack but also provides enterprise-grade features like automatic scaling, global distribution, and comprehensive security controls - all while potentially reducing total cost of ownership.
Elastic Scalability of Throughput and Storage: Azure Cosmos DB supports horizontally scalable graph databases, meaning the graph can grow naturally with your business. As more data is added, the system automatically handles distribution and balancing across multiple servers, removing the need for manual intervention.
Global Reach, Local Performance: For businesses operating across multiple regions, Cosmos DB can automatically replicate your knowledge graph to the Azure regions you select. This means whether your team is in London, Singapore, or New York, they can read from a nearby replica and see similarly low response times when accessing information.
Simplified Development and Maintenance: Instead of multiple specialised databases, your development team can focus on one platform that handles everything. The familiar Gremlin query language means faster development cycles, while automatic indexing and managed services reduce maintenance overhead significantly.
Enterprise-Ready Security and Reliability: Built with enterprise needs in mind, Cosmos DB provides robust security features and reliability guarantees. Your sensitive business data is protected with encryption at rest and in transit, while automatic backups and regional failover ensure business continuity.
Cost-Effective Scaling: Rather than maintaining separate systems for different data needs, consolidating on Cosmos DB can lead to significant cost savings. You pay for what you use, and the elimination of multiple database licenses and maintenance costs can substantially reduce total ownership costs.
Imagine organising your company's information like a social network. Just as LinkedIn shows connections between people, positions, and companies, a graph database stores data by focusing on relationships. Rather than computing relationships at query time, the graph database approach persists them in the storage layer, which leads to highly efficient graph retrieval operations.
Vertices (Nodes) and Edges
Properties
Each vertex and edge can have properties - additional information that describes them. For example:
title, creation_date, and file_size
date_authored and version_number
Labels
Labels help categorise your vertices and edges. Just like tagging photos on social media:
Person (Label)
Vertex: Sarah
Properties:
  - department: "Engineering"
  - role: "Tech Lead"

AUTHORED (Label)
Edge: Sarah -> TechnicalSpec
Properties:
  - date: "2024-01-13"
  - version: "1.0"
This structure makes complex queries intuitive and efficient, perfect for modern knowledge management systems.
Traditional RAG systems, while effective, often miss crucial connections in your knowledge base. They treat documents as independent entities, relying heavily on vector similarity to find relevant information. This approach can lead to missing context and related documents that might be crucial for comprehensive answers.
Contextual Understanding: Instead of just finding similar documents, Graph RAG understands how information is connected. When your LLM asks about cloud security, it doesn't just see individual documents - it sees the entire web of related policies, implementations, and dependencies.
Relationship-Based Retrieval: Traditional RAG might miss crucial information that isn't semantically similar but is logically connected. Graph RAG follows actual relationships:
Enhanced Context:
# Traditional RAG - Limited to similarity search
similar_docs = vector_search("cloud security compliance")

# Graph RAG - Follows actual relationships
compliance_docs = graph.traversal().V() \
    .hasLabel("Document") \
    .has("category", "cloud") \
    .out("REFERENCES") \
    .hasLabel("ComplianceRequirement") \
    .in_("IMPLEMENTS") \
    .hasLabel("SecurityControl")
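To make the hops in that traversal concrete, here is a minimal in-memory sketch of the same pattern: follow outgoing REFERENCES edges from a document, then incoming IMPLEMENTS edges to reach the controls. The edge list and all ids are made up for illustration, and the sketch leaves out the label and property filters that the Gremlin version applies.

```python
from collections import defaultdict

# Toy edge list: (source, edge_label, target). All ids are hypothetical.
EDGES = [
    ("cloud-policy", "REFERENCES", "req-encryption"),
    ("cloud-policy", "REFERENCES", "req-access-review"),
    ("ctrl-kms", "IMPLEMENTS", "req-encryption"),
    ("ctrl-iam-audit", "IMPLEMENTS", "req-access-review"),
    ("ctrl-kms", "IMPLEMENTS", "req-key-rotation"),
]

# Index edges in both directions so each hop is a dictionary lookup.
out_index = defaultdict(list)
in_index = defaultdict(list)
for source, label, target in EDGES:
    out_index[(source, label)].append(target)
    in_index[(target, label)].append(source)


def out(vertices, label):
    """Follow outgoing edges with the given label, like Gremlin's out() step."""
    return [t for v in vertices for t in out_index[(v, label)]]


def in_(vertices, label):
    """Follow incoming edges with the given label, like Gremlin's in() step."""
    return [s for v in vertices for s in in_index[(v, label)]]


requirements = out(["cloud-policy"], "REFERENCES")
controls = in_(requirements, "IMPLEMENTS")
print(requirements)
print(controls)
```

Neither control needs to be textually similar to the policy document to be found. Each is reached purely through the edges, which is the case where similarity search alone tends to fall short.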
Hierarchical Understanding
# Trace document dependencies and impact
dependency_chain = graph.traversal().V() \
    .hasLabel("Document") \
    .has("title", "Cloud Architecture") \
    .repeat(__.out("DEPENDS_ON")) \
    .until(__.not_(__.out("DEPENDS_ON"))) \
    .path().by("title")
By capturing these real-world connections, GraphRAG enhances your AI's capabilities in practical ways. It knows which documents rely on others, tracks how information flows through your organisation, and understands the ripple effects of changes. This means you get answers that aren't just accurate - they're relevant to your specific business context and consider all the important connections that might otherwise be missed.
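The "ripple effect" idea can be sketched without a database: invert the DEPENDS_ON edges and walk outward from the document that changed. The document titles and the `impacted_by` helper below are hypothetical, and a graph query would do this server-side. The sketch only shows the logic.

```python
from collections import deque

# Toy DEPENDS_ON edges: document -> documents it depends on. Titles are made up.
DEPENDS_ON = {
    "Deployment Runbook": ["Cloud Architecture"],
    "Cloud Architecture": ["Network Standard", "Security Baseline"],
    "Cost Model": ["Cloud Architecture"],
    "Network Standard": [],
    "Security Baseline": [],
}


def impacted_by(changed: str, depends_on: dict) -> list:
    """Return every document that directly or transitively depends on `changed`."""
    # Invert the edges so we can walk from a document to its dependants.
    dependants = {}
    for doc, requirements in depends_on.items():
        for required in requirements:
            dependants.setdefault(required, []).append(doc)

    # Breadth-first walk; `seen` guards against cycles and duplicates.
    seen, queue, order = {changed}, deque([changed]), []
    while queue:
        current = queue.popleft()
        for doc in dependants.get(current, []):
            if doc not in seen:
                seen.add(doc)
                order.append(doc)
                queue.append(doc)
    return order


print(impacted_by("Security Baseline", DEPENDS_ON))
```

The breadth-first order lists direct dependants before indirect ones, which is a convenient ordering when deciding what to review first after a change.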
Let's explore how to implement a GraphRAG system using Azure Cosmos DB's Gremlin API and LangChain. This implementation showcases how to structure and query your knowledge graph effectively for GenAI applications. For simplicity, I will showcase only the basics: connecting to the graph, creating it, and using it for Gremlin-based GraphRAG.
Our implementation centres around the AzureGremlinProcessor class, which handles document processing and knowledge graph interactions. Here's how it works:
The integration of LangChain through the GremlinQAChain and GremlinGraph classes enables sophisticated document handling. When processing documents, LangChain intelligently manages the conversion of unstructured text into structured graph components while preserving semantic relationships. This is evident in our implementation where the RecursiveCharacterTextSplitter ensures documents are divided meaningfully while maintaining context:
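import os

# Import paths assume recent langchain-community, langchain-openai and
# langchain-text-splitters packages; adjust them to your installed versions.
from langchain_community.chains.graph_qa.gremlin import GremlinQAChain
from langchain_community.graphs import GremlinGraph
from langchain_openai import AzureChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter


class AzureGremlinProcessor:
    def __init__(self, cosmosdb_name, cosmosdb_db_id, cosmosdb_db_graph_id, cosmosdb_access_key):
        # Cosmos DB Gremlin endpoint; the username is the database/graph resource path
        self.endpoint = f"wss://{cosmosdb_name}.gremlin.cosmos.azure.com:443/"
        self.username = f"/dbs/{cosmosdb_db_id}/colls/{cosmosdb_db_graph_id}"
        self.password = cosmosdb_access_key
        self.graph = GremlinGraph(
            url=self.endpoint,
            username=self.username,
            password=self.password
        )
        # Azure OpenAI chat model, configured from environment variables
        self.llm = AzureChatOpenAI(
            openai_api_version=os.getenv('azure_openai_api_version'),
            deployment_name=os.getenv('azure_openai_deployment_name'),
            azure_endpoint=os.getenv('azure_openai_endpoint'),
            api_key=os.getenv('azure_openai_api_key'),
            temperature=0
        )
        # Question-answering chain over the graph
        self.qa_chain = GremlinQAChain.from_llm(
            llm=self.llm,
            graph=self.graph,
            verbose=True,
            allow_dangerous_requests=True,
            query_prompt="g.V().hasLabel('chunk').values('content')"
        )
        # Splits documents into overlapping chunks before they are stored
        self.text_splitter = RecursiveCharacterTextSplitter(
            chunk_size=1000,
            chunk_overlap=200
        )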
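To show what the chunk_size=1000 and chunk_overlap=200 settings mean in practice, here is a deliberately simplified sliding-window splitter. It is not how RecursiveCharacterTextSplitter works internally (that class prefers to break on separators such as paragraph and line boundaries). The function `split_with_overlap` is a hypothetical stand-in that only demonstrates how neighbouring chunks share text.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list:
    """Cut text into fixed-size windows where neighbours share `chunk_overlap` characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "abcdefghij" * 250  # 2,500 characters of dummy text
chunks = split_with_overlap(sample, chunk_size=1000, chunk_overlap=200)
print([len(chunk) for chunk in chunks])
```

With these settings each window advances by 800 characters, so the last 200 characters of one chunk reappear at the start of the next. That shared text is what lets a sentence straddling a chunk boundary remain retrievable from either side.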
The system processes documents intelligently by:
def process_document(self, file_path):
    # Excerpt: document loading, chunking and ID generation are not shown here

    # Create document node
    doc_node = Node(
        id=doc_id,
        type="document",
        properties={"content": document[0].page_content}
    )

    # Create chunk nodes and relationships
    for i, chunk in enumerate(chunks):
        chunk_node = Node(
            id=chunk_id,
            type="chunk",
            properties={"content": chunk.page_content}
        )
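The excerpt above leaves out how ids are generated and how chunks are linked back to their document. The following self-contained sketch shows one plausible shape for that step. The `Node` and `Relationship` dataclasses are local stand-ins rather than LangChain's classes, and the id scheme and the `HAS_CHUNK` and `NEXT` edge labels are assumptions for illustration, not necessarily what the original implementation uses.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    """Stand-in for a graph vertex: id, type (label) and properties."""
    id: str
    type: str
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """Stand-in for a directed, typed edge between two nodes."""
    source: Node
    target: Node
    type: str


def build_document_graph(file_path: str, content: str, chunks: list):
    """Create one document node, one node per chunk, and the edges linking them."""
    # Deterministic id, so re-processing the same path yields the same vertex ids.
    doc_id = hashlib.sha1(file_path.encode("utf-8")).hexdigest()[:12]
    doc_node = Node(id=doc_id, type="document", properties={"content": content})

    nodes, relationships = [doc_node], []
    previous = None
    for i, chunk_text in enumerate(chunks):
        chunk_node = Node(
            id=f"{doc_id}-chunk-{i}",
            type="chunk",
            properties={"content": chunk_text},
        )
        nodes.append(chunk_node)
        # Link the document to each of its chunks (assumed edge label).
        relationships.append(Relationship(doc_node, chunk_node, "HAS_CHUNK"))
        # Chain consecutive chunks so reading order can be recovered (assumed edge label).
        if previous is not None:
            relationships.append(Relationship(previous, chunk_node, "NEXT"))
        previous = chunk_node
    return nodes, relationships


nodes, relationships = build_document_graph(
    "report.pdf", "full text", ["part one", "part two", "part three"]
)
print(len(nodes), len(relationships))
```

Keeping both a document-to-chunk edge and a chunk-to-chunk edge is a design choice: the first supports "which document does this passage come from", the second supports pulling in the neighbouring passages for extra context.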
The system leverages LangChain's GremlinQAChain to provide context-aware responses:
def query_graph(self, question: str) -> str:
    return self.qa_chain.invoke(question)
This implementation offers several advantages:
By structuring our RAG system this way, we create a powerful knowledge graph that understands not just the content of documents, but their relationships and dependencies as well.
When we process a document e.g. "deloitte-finance-2025-revisited.pdf" and query the system, we see the true value of graph-based retrieval. Consider this example query about Deloitte's 2025 predictions:
result = processor.query_graph("What happened in 2018 and list some key predictions for 2018.")
The system returns a comprehensive response that demonstrates its understanding of temporal context and related information:
This response showcases how GraphRAG maintains temporal context within documents, connects related concepts and predictions, and provides structured, relevant information while synthesising content from across document sections.
The implementation demonstrates that GraphRAG isn't just about storing and retrieving information - it's about understanding and presenting knowledge in a way that preserves context and relationships, making it an invaluable tool for GenAI knowledge management.
The power of GraphRAG comes alive when we examine how it structures and visualises document information within Azure Cosmos DB's graph database. The visual representation transforms complex document relationships into an intuitive network that enhances document retrieval and analysis capabilities. (Note: this is the graph representation of our deloitte-finance-2025-revisited.pdf document.)
The visualisation demonstrates a document-centric knowledge graph where:
This visualisation demonstrates how GraphRAG transforms traditional document storage into an interconnected knowledge network, enabling more intelligent and context-aware document retrieval for GenAI applications.
The implementation of GraphRAG using Azure Cosmos DB's Gremlin API represents a significant advancement in how organisations can manage and utilise their knowledge bases. By unifying graph capabilities with enterprise-grade database features, this approach offers a compelling solution to the challenges of fragmented data systems and complex information retrieval.
The benefits of this approach extend beyond technical elegance—it delivers real business value through improved context awareness, reduced operational complexity, and enhanced cost efficiency. As organisations continue to grapple with growing volumes of interconnected information, the ability to understand and navigate relationships between documents and concepts becomes increasingly critical.
For organisations interested in exploring or implementing this solution, contact our team at Advancing Analytics.