Using AI to Explore Graph Databases with Neo4j and LangChain

AI tools such as OpenAI's GPT-4 and LangChain let people ask questions about complex data in plain language and get answers quickly. In this article, you’ll learn how to build a simple system that takes natural language questions like “Which hospitals in Austin have cardiologists?” and answers them from data stored in a graph database such as Neo4j. LangChain acts as the bridge between the question and the database, and GPT-4 both writes the database query and turns the results into a readable answer.

Graph databases like Neo4j are well suited to modeling real-world relationships. Unlike relational databases, which store data as rows in tables, Neo4j stores data as nodes (like doctors or hospitals) and relationships (like “WORKS_AT” or “LOCATED_IN”). This makes it straightforward to represent connected data such as healthcare networks.
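
To make the node-and-relationship model concrete, here is a minimal sketch in plain Python. It is not how Neo4j stores data internally; the Node and Relationship classes and the workplaces helper are names made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A graph node with a label and a name property."""
    label: str
    name: str


@dataclass(frozen=True)
class Relationship:
    """A directed, typed edge from one node to another."""
    start: Node
    type: str
    end: Node


# One doctor, one hospital, and the relationship that connects them.
doctor = Node(label="Doctor", name="Dr. Ana Lopez")
hospital = Node(label="Hospital", name="Dell Seton Medical Center")
relationships = [Relationship(doctor, "WORKS_AT", hospital)]


def workplaces(person: Node, rels: list[Relationship]) -> list[str]:
    """Follow WORKS_AT edges out of `person` and return the target names."""
    return [r.end.name for r in rels if r.start == person and r.type == "WORKS_AT"]


print(workplaces(doctor, relationships))
```

Answering "where does this doctor work?" is a matter of following edges, which is the kind of question a graph database is built to answer.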

The challenge is that Neo4j uses its own query language called Cypher. It’s powerful but not exactly beginner-friendly. That’s where LangChain and OpenAI come in. Instead of writing Cypher, you can ask questions in plain English, and LangChain and GPT-4 take care of the rest: translating your question into Cypher, querying the graph, and returning a readable answer.

Step 1: Install the Tools

To start, install the necessary Python libraries:

pip install openai langchain neo4j        

Step 2: Add Data to Neo4j

Let’s build a small graph of fictional hospitals and doctors in Austin, Texas. Go to your Neo4j browser and run:

CREATE (h1:Hospital {name: "Dell Seton Medical Center", city: "Austin"})
CREATE (h2:Hospital {name: "St. David's South Austin", city: "Austin"})

CREATE (d1:Doctor {name: "Dr. Ana Lopez", specialty: "Cardiology"})
CREATE (d2:Doctor {name: "Dr. Ben Kim", specialty: "Pediatrics"})

CREATE (d1)-[:WORKS_AT]->(h1)
CREATE (d2)-[:WORKS_AT]->(h2)        

You now have a small graph in which each doctor is connected to a hospital by a WORKS_AT relationship:

[Figure: Neo4j Visual]
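
If you would rather load the same sample data from a script than from the Neo4j browser, something like the following could work. It is a sketch under a few assumptions: it uses the official neo4j Python driver, it swaps CREATE for MERGE so the script can be re-run without duplicating nodes, and the names build_statements and load_graph are made up for this example.

```python
# Sample data matching the Cypher statements above.
HOSPITALS = [
    {"name": "Dell Seton Medical Center", "city": "Austin"},
    {"name": "St. David's South Austin", "city": "Austin"},
]
DOCTORS = [
    {"name": "Dr. Ana Lopez", "specialty": "Cardiology",
     "hospital": "Dell Seton Medical Center"},
    {"name": "Dr. Ben Kim", "specialty": "Pediatrics",
     "hospital": "St. David's South Austin"},
]

# MERGE only creates a node or relationship if it does not exist yet.
MERGE_HOSPITAL = "MERGE (h:Hospital {name: $name}) SET h.city = $city"
MERGE_DOCTOR = (
    "MATCH (h:Hospital {name: $hospital}) "
    "MERGE (d:Doctor {name: $name}) SET d.specialty = $specialty "
    "MERGE (d)-[:WORKS_AT]->(h)"
)


def build_statements():
    """Return (query, parameters) pairs, hospitals first, then doctors."""
    statements = [(MERGE_HOSPITAL, h) for h in HOSPITALS]
    statements += [(MERGE_DOCTOR, d) for d in DOCTORS]
    return statements


def load_graph(uri, user, password):
    """Run every statement against Neo4j. Requires the `neo4j` package."""
    from neo4j import GraphDatabase  # imported here so the rest runs without it

    with GraphDatabase.driver(uri, auth=(user, password)) as driver:
        with driver.session() as session:
            for query, params in build_statements():
                session.run(query, params).consume()


# Example call, with placeholder credentials:
# load_graph("bolt://localhost:7687", "neo4j", "your_password")
```

Passing values as query parameters ($name, $city) rather than pasting them into the Cypher string keeps quoting problems, such as the apostrophe in "St. David's", out of the query text.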

Now that your data is set up, connect to the Neo4j database in Python using the LangChain Neo4jGraph object:

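# Import path for older LangChain releases; newer releases expose the same
# class as langchain_community.graphs.Neo4jGraph.
from langchain.graphs import Neo4jGraph

# Connect over Bolt; replace the placeholder credentials with your own.
graph = Neo4jGraph(
    url="bolt://localhost:7687",
    username="neo4j",
    password="your_password"
)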
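
Hard-coding a password is fine for a local demo, but you may prefer to read the connection details from the environment. Here is one possible sketch; the helper name neo4j_settings and the environment variable names are assumptions made for this example.

```python
import os


def neo4j_settings(env=None):
    """Collect Neo4jGraph keyword arguments from environment variables.

    The variable names used here are assumptions for this sketch.
    """
    env = os.environ if env is None else env
    password = env.get("NEO4J_PASSWORD")
    if not password:
        raise RuntimeError("Set NEO4J_PASSWORD before connecting")
    return {
        "url": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "username": env.get("NEO4J_USERNAME", "neo4j"),
        "password": password,
    }


# Usage, assuming the Neo4jGraph import from the snippet above:
# graph = Neo4jGraph(**neo4j_settings())
```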
        

We then use GraphCypherQAChain from LangChain and ChatOpenAI to connect our graph to GPT-4 and start asking questions in English:

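from langchain.chains import GraphCypherQAChain
# Newer LangChain releases expose ChatOpenAI from the langchain_openai package.
from langchain.chat_models import ChatOpenAI

# ChatOpenAI reads the API key from the OPENAI_API_KEY environment variable.
# temperature=0 keeps the generated Cypher as repeatable as possible.
llm = ChatOpenAI(temperature=0, model="gpt-4")

# verbose=True prints the generated Cypher and the raw query results.
# Recent LangChain releases also require allow_dangerous_requests=True here.
chain = GraphCypherQAChain.from_llm(
    llm=llm,
    graph=graph,
    verbose=True
)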

Now, you can ask your first question about Austin healthcare data:

# Print the answer so it also shows up when run as a script.
print(chain.run("Which doctors work in Austin?"))

Behind the scenes, LangChain sends the user’s question, together with the graph schema, to GPT-4, which generates a Cypher query such as:

MATCH (d:Doctor)-[:WORKS_AT]->(h:Hospital {city: "Austin"}) RETURN d.name        

Neo4j runs the query and sends back the results. Because this query returns only the doctors' names, GPT-4 then turns them into a readable response like:

“Dr. Ana Lopez and Dr. Ben Kim work in Austin.”
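
The flow just described (question to Cypher, Cypher to rows, rows to a readable answer) can be sketched without an API key or a database. This is a simplified illustration, not GraphCypherQAChain's actual implementation: answer_question and the three fake_ functions are stand-ins invented for this example.

```python
from typing import Callable


def answer_question(
    question: str,
    generate_cypher: Callable[[str], str],
    run_query: Callable[[str], list],
    summarize: Callable[[str, list], str],
) -> dict:
    """Question -> Cypher -> rows -> readable answer, with each step injected."""
    cypher = generate_cypher(question)   # step 1: the LLM writes the query
    rows = run_query(cypher)             # step 2: the database executes it
    answer = summarize(question, rows)   # step 3: the LLM phrases the result
    return {"cypher": cypher, "rows": rows, "answer": answer}


# Stand-ins that return canned values instead of calling GPT-4 or Neo4j.
def fake_generate(question):
    return 'MATCH (d:Doctor)-[:WORKS_AT]->(h:Hospital {city: "Austin"}) RETURN d.name'


def fake_run(cypher):
    return [{"d.name": "Dr. Ana Lopez"}, {"d.name": "Dr. Ben Kim"}]


def fake_summarize(question, rows):
    names = " and ".join(row["d.name"] for row in rows)
    return f"{names} work in Austin."


result = answer_question(
    "Which doctors work in Austin?", fake_generate, fake_run, fake_summarize
)
print(result["answer"])
```

Keeping the intermediate Cypher and rows in the result makes it easier to check what the model actually asked the database when an answer looks wrong.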

While this is already a powerful setup, we can go further. Neo4j supports vector indexing, which lets us store embeddings (numerical representations of text) inside the graph. This is useful in Retrieval-Augmented Generation (RAG), where a user’s query is matched with semantically similar information even if there’s no exact keyword match. For example, someone might ask, “Who handles heart conditions near downtown Austin?” and the system can match that with “Cardiology” even though the word “heart” never appears in the database.
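
To show the idea behind semantic matching, here is a toy sketch that compares vectors with cosine similarity. The three-dimensional vectors are hand-made for illustration; real embeddings come from an embedding model and have hundreds or thousands of dimensions, and a Neo4j vector index would do the nearest-neighbor search for you.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-made "embeddings" for the two specialties in the sample graph.
SPECIALTY_VECTORS = {
    "Cardiology": [0.9, 0.1, 0.0],
    "Pediatrics": [0.1, 0.9, 0.1],
}


def closest_specialty(query_vector):
    """Return the specialty whose vector is most similar to the query."""
    return max(
        SPECIALTY_VECTORS,
        key=lambda name: cosine_similarity(query_vector, SPECIALTY_VECTORS[name]),
    )


# Pretend this is the embedding of "Who handles heart conditions?".
heart_query = [0.8, 0.2, 0.1]
print(closest_specialty(heart_query))
```

The query vector points in nearly the same direction as the "Cardiology" vector, so it wins even though the two texts share no words.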



In practical terms, this setup can be used to build intelligent search tools for healthcare systems, school directories, product databases, or customer support knowledge bases. The main advantage is that users don’t need to know how the data is stored or how to query it. They just ask questions, and the system handles the rest.

More articles by Schadrack Karekezi
