Crystal Structure Agent#

As a first demonstration, the LangChain team decided to develop an LLM agent that can predict the crystal structure of a chemical element by accessing the reference database of the Atomic Simulation Environment (ASE). While the corresponding Python function is simple and only requires a few lines of code, this example already limits the hallucinations of the LLM by using the LangChain framework to interface the LLM with the Python function.

In particular, we follow the Custom Agent tutorial from the LangChain documentation.

Python Function#

For this first example, we use OpenAI as the LLM provider, but the example can also be adjusted to work with other LLM providers; for more details, check the LangChain documentation. We store the OpenAI API key in the OPENAI_API_KEY variable:

from getpass import getpass
OPENAI_API_KEY = getpass(prompt='Enter your OpenAI Token:')

As a next step, we import the corresponding functionality from ASE and the tool decorator from langchain:

from ase.data import reference_states, atomic_numbers
from langchain_core.tools import tool

For the Python function, it is important to include type hints and documentation in the form of a docstring, so the LLM can understand the functionality of the function. Furthermore, all data types used as input or output of the function need to have a JSON representation so they can be communicated to the LLM. For example, numpy arrays have to be converted to standard Python lists.

@tool
def get_crystal_structure(chemical_symbol: str) -> str:
    """Returns the atomic crystal structure of a chemical symbol"""
    ref_state = reference_states[atomic_numbers[chemical_symbol]]
    if ref_state is None:
        return "No crystal structure known."
    else:
        return ref_state["symmetry"]

After applying the decorator, the function can be called using invoke(). We validate the functionality for two elements, iron (Fe) and gold (Au).

get_crystal_structure.invoke("Fe")
'bcc'
get_crystal_structure.invoke("Au")
'fcc'

Define Agent#

After the definition of the Python function, the next step is the definition of the agent, which allows the LLM to interact with the Python function. In this example, the ChatOpenAI interface of the langchain_openai package is used. Depending on your configuration, it might be necessary to install the langchain_openai package using the following command:

conda install -c conda-forge langchain-openai
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="gpt-3.5-turbo", 
    temperature=0, 
    openai_api_key=OPENAI_API_KEY,
)

Following the definition of the LLM, the next step is the definition of the prompt. Here we start with a very basic prompt. The optimization of the prompt is discussed in more detail in the section on prompt engineering below.

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are very powerful assistant, but don't know current events.",
        ),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

Finally, the different parts are plugged together by creating an agent, which combines the prompt with the Python function (referenced here as one tool in a list of potentially many tools), and an executor to communicate with the agent. The technical details are discussed in the corresponding LangChain Custom Agent tutorial.

from langchain.agents import AgentExecutor
from langchain.agents.format_scratchpad.openai_tools import (
    format_to_openai_tool_messages,
)
from langchain.agents.output_parsers.openai_tools import OpenAIToolsAgentOutputParser

tools = [get_crystal_structure]
agent = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_to_openai_tool_messages(
            x["intermediate_steps"]
        ),
    }
    | prompt
    | llm.bind_tools(tools)
    | OpenAIToolsAgentOutputParser()
)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

Conversation#

Once the AgentExecutor is defined we can communicate with the LLM using the stream() interface. We repeat the test from above and ask for the crystal structure of gold.

lst = list(agent_executor.stream({"input": "What is the crystal structure of gold"}))  # Yeah this worked !!
> Entering new AgentExecutor chain...

Invoking: `get_crystal_structure` with `{'chemical_symbol': 'Au'}`


fccThe crystal structure of gold is face-centered cubic (fcc).

> Finished chain.

With the verbose=True parameter, the internal steps of the LLM agent are printed in green. As a first step, the agent calls get_crystal_structure() with an already converted input parameter: rather than using gold as the input, it uses the chemical symbol Au. The function returns fcc, and the LLM converts this answer into a sentence a human can understand:

The crystal structure of gold is face-centered cubic (fcc).

This example highlights how easy it is these days to make a Python function accessible via an LLM for all kinds of users to interact with.

Prompt Engineering#

This first agent can start to hallucinate rather quickly. For example, when asked for the crystal structure of a car, it does not ask which elements a car typically consists of, but rather connects car with carbon and replies that the crystal structure of a car is diamond, which is obviously wrong.

lst = list(agent_executor.stream({"input": "What is the crystal structure of car"}))  # I did not know cars were made of carbon
> Entering new AgentExecutor chain...

Invoking: `get_crystal_structure` with `{'chemical_symbol': 'C'}`


diamondThe crystal structure of carbon (C) is diamond.

> Finished chain.

To restrict the hallucination of the agent we extend the system prompt with the following statement:

For each query validate that it contains a chemical element and otherwise cancel.
prompt_improved = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            # "You are very powerful assistant, but don't know current events.",
            "You are very powerful assistant, but don't know current events. For each query validate that it contains a chemical element and otherwise cancel.",
        ),
        ("user", "{input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)
agent_improved = (
    {
        "input": lambda x: x["input"],
        "agent_scratchpad": lambda x: format_to_openai_tool_messages(
            x["intermediate_steps"]
        ),
    }
    | prompt_improved
    | llm.bind_tools(tools)
    | OpenAIToolsAgentOutputParser()
)
agent_improved_executor = AgentExecutor(agent=agent_improved, tools=tools, verbose=True)
lst = list(agent_improved_executor.stream({"input": "What is the crystal structure of car"}))
> Entering new AgentExecutor chain...
I'm sorry, but the query does not contain a valid chemical element. Please provide a chemical symbol for an element to determine its crystal structure.

> Finished chain.

With the modified system prompt, the agent correctly replies that it is not able to determine the crystal structure of a car, because a car is not a chemical element.

Summary#

By following the Custom Agent tutorial from the LangChain documentation, we were able to create a first simulation agent that calls specialized Python frameworks like ASE to address materials-science-specific questions. Still, it is important to carefully engineer the prompt of the agent; otherwise, even LLMs with access to specialized tools tend to hallucinate.