To answer questions over structured data, the first step is to extract the relevant data from the database.
Once the most relevant data has been retrieved, it can be inserted into a prompt for GPT, which can then produce the answer.
Here we discuss how to use LangChain and GPT to extract data from a SQL database based on a natural language question.
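The second step mentioned above, inserting the retrieved data into a prompt, might look like the following minimal sketch. The function name build_answer_prompt, the prompt wording, and the sample row are assumptions made for illustration; they are not part of LangChain.

```python
def build_answer_prompt(question, sql, rows):
    """Format the question, the SQL that was run, and the retrieved rows as one prompt string."""
    # Render each row on its own line; fall back to a marker when nothing was returned.
    row_lines = "\n".join(str(row) for row in rows) or "(no rows returned)"
    return (
        "Answer the question using only the SQL result below.\n"
        f"Question: {question}\n"
        f"SQL query: {sql}\n"
        f"SQL result:\n{row_lines}\n"
        "Answer:"
    )


# Hypothetical sample values, for illustration only.
prompt = build_answer_prompt(
    "How many employees are there?",
    "SELECT COUNT(*) FROM Employee",
    [(8,)],
)
print(prompt)
```

The resulting string would then be sent to the chat model, which only has to phrase an answer from data that is already in front of it.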
Set up a connection to the SQL database using the SQLDatabase utility:
from langchain.utilities import SQLDatabase
db = SQLDatabase.from_uri("sqlite:///mydatabase.db")
Create a text-to-SQL chain that converts natural language questions to SQL queries:
from langchain.chains import create_sql_query_chain
sql_chain = create_sql_query_chain(chat_model, db)
Here, chat_model is a chat model such as ChatGPT and db is the SQLDatabase instance.
Pass the natural language question to the chain to generate the query:
question = "How many employees are there?"
query = sql_chain.invoke({"question": question})
The returned query will contain the generated SQL query string:
SELECT COUNT(*) FROM Employee
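Because the SQL string comes from a language model, it can be worth checking it before running it. The sketch below is one conservative way to do that, under the assumption that a model may wrap its output in a markdown code fence or prepend a label such as "SQLQuery:". The helper name extract_select and its rules are hypothetical, not a LangChain API.

```python
FENCE = "`" * 3  # a markdown code fence, built this way to keep the literal out of the source


def extract_select(raw):
    """Return a single read-only SELECT statement from raw model output, or raise ValueError."""
    text = raw.strip()
    # Remove a surrounding markdown fence, plus an optional "sql" language tag on its first line.
    if text.startswith(FENCE) and text.endswith(FENCE):
        text = text[len(FENCE):-len(FENCE)].strip()
        first, _, rest = text.partition("\n")
        if first.strip().lower() == "sql":
            text = rest.strip()
    # Remove a leading label (assumed format).
    if text.lower().startswith("sqlquery:"):
        text = text[len("sqlquery:"):].strip()
    # Drop trailing semicolons, then refuse anything that still looks like several statements.
    # This also rejects a semicolon inside a string literal, which is deliberately strict.
    text = text.rstrip(";").strip()
    if ";" in text:
        raise ValueError("expected a single SQL statement")
    if not text.upper().startswith("SELECT"):
        raise ValueError("expected a read-only SELECT statement")
    return text


print(extract_select("SQLQuery: SELECT COUNT(*) FROM Employee;"))
```

A check like this is no substitute for connecting with a database user that has read-only permissions, but it catches obviously unexpected output early.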
Execute the generated query against the database using db.run:
results = db.run(query)
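To make the execution step concrete without a LangChain installation or an API key, the following sketch runs the same generated query with Python's built-in sqlite3 module. The in-memory database, the Employee table layout, and the three sample rows are assumptions made for illustration.

```python
import sqlite3

# Build a throwaway in-memory database with a hypothetical Employee table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmployeeId INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany(
    "INSERT INTO Employee (Name) VALUES (?)",
    [("Ada",), ("Grace",), ("Linus",)],
)

# The SQL string that the text-to-SQL step produced for the example question.
query = "SELECT COUNT(*) FROM Employee"

# Run the query and collect all result rows as a list of tuples.
rows = conn.execute(query).fetchall()
print(rows)  # [(3,)]

conn.close()
```

These rows are the "relevant data" that gets inserted into the final prompt for the model to answer from.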
In summary, use create_sql_query_chain to build a chain that converts text to SQL and pass your natural language question to it. The chain returns the generated SQL query for that question, which you can then execute with db.run to retrieve the data.