[LLM] LangChain Concepts

Cookbook 1 - Concepts
Cookbook 2 - Use Cases

There are two types of language models, which in LangChain are called:

  • LLMs: a language model that takes a string as input and returns a string
  • ChatModels: a language model that takes a list of messages as input and returns a message (see the sketch below)
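
A minimal sketch contrasting the two interfaces (assumes an OpenAI API key is set in the environment):

from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

llm = OpenAI()
text_out = llm("Tell me a joke")  # str in -> str out

chat = ChatOpenAI()
msg_out = chat([HumanMessage(content="Tell me a joke")])  # messages in -> AIMessage out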

Most LLM applications do not pass user input directly into an LLM. Usually they add the user input to a larger piece of text, built from a prompt template, that provides additional context on the specific task at hand.
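
As an illustration, a plain string prompt template might look like this (the ticket-summary task is made up for the example):

from langchain.prompts import PromptTemplate

prompt = PromptTemplate.from_template(
    "Summarize the following support ticket in one sentence:\n\n{ticket}"
)
prompt.format(ticket="My order arrived broken and nobody answers my emails.")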

The output parser transforms raw LLM output into structured formats such as JSON.
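
For instance, the built-in CommaSeparatedListOutputParser turns a comma-separated completion into a Python list:

from langchain.output_parsers import CommaSeparatedListOutputParser

parser = CommaSeparatedListOutputParser()

# Appended to the prompt so the model answers in the expected shape
format_instructions = parser.get_format_instructions()

parser.parse("red, green, blue")  # -> ['red', 'green', 'blue']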

  • Temperature: controls the randomness of the model's output. With temperature=0 the model yields (nearly) the same result for the same prompt


Prompts

Either a string or a list of messages

from langchain.prompts import ChatPromptTemplate

template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI bot. Your name is {name}."),
    ("human", "Hello, how are you doing?"),
    ("ai", "I'm doing well, thanks!"),
    ("human", "{user_input}"),
])

messages = template.format_messages(
    name="Bob",
    user_input="What is your name?"
)

Chat Messages

  • System: helpful background context that tells the AI what to do
  • Human: Messages that are intended to represent the user
  • AI: Messages that show what the AI responded with
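
A minimal sketch putting the three message types together (assumes ChatOpenAI and an API key):

from langchain.chat_models import ChatOpenAI
from langchain.schema import SystemMessage, HumanMessage, AIMessage

chat = ChatOpenAI(temperature=0)

messages = [
    SystemMessage(content="You are a terse assistant."),
    HumanMessage(content="Hi!"),
    AIMessage(content="Hello! How can I help?"),
    HumanMessage(content="What is the capital of France?"),
]

response = chat(messages)  # returns an AIMessage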

Selectors

Pick which few-shot examples to include in a prompt, e.g. by semantic similarity to the input

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

llm = OpenAI(model_name="text-davinci-003")

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Example Input: {input}\nExample Output: {output}",
)

# Examples of locations that nouns are found
examples = [
    {"input": "pirate", "output": "ship"},
    {"input": "pilot", "output": "plane"},
]

# Pick the example closest in meaning to the user's input
example_selector = SemanticSimilarityExampleSelector.from_examples(
    examples, OpenAIEmbeddings(), FAISS, k=1
)

Chains

Combine multiple LLM calls and actions automatically

from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

llm = OpenAI(temperature=0.9)
prompt = PromptTemplate(
    input_variables=["food"],
    template="What are 5 vacation destinations for someone who likes to eat {food}?",
)
chain = LLMChain(llm=llm, prompt=prompt)
chain.run("fruit")

Simple Sequential Chains

Easy chains where you can use the output of one LLM call as the input to another. Good for breaking tasks into steps

from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chains import SimpleSequentialChain

llm = OpenAI(temperature=1, openai_api_key=openai_api_key)

template = """Your job is to come up with a classic dish from the area that the users suggests.
% USER LOCATION
{user_location}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["user_location"], template=template)

# Holds my 'location' chain
location_chain = LLMChain(llm=llm, prompt=prompt_template)


template = """Given a meal, give a short and simple recipe on how to make that dish at home.
% MEAL
{user_meal}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["user_meal"], template=template)

# Holds my 'meal' chain
meal_chain = LLMChain(llm=llm, prompt=prompt_template)

overall_chain = SimpleSequentialChain(chains=[location_chain, meal_chain], verbose=True)

review = overall_chain.run("Rome")

Summarize Chain

Easily run through numerous long documents and get a summary. Check out this video for other chain types besides map-reduce

from langchain.llms import OpenAI
from langchain.chains.summarize import load_summarize_chain
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

loader = TextLoader('data/PaulGrahamEssays/disc.txt')
documents = loader.load()

# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)

# There is a lot of complexity hidden in this one line. I encourage you to check out the video above for more detail
chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=True)
chain.run(texts)

Agents: dynamically call chains based on user input

# example uses SerpAPI, which scrapes Google search results
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.llms import OpenAI

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
agent.run("Who is the current leader of Japan? What is the largest prime number that is smaller than their age?")

Tools

A ‘capability’ of an agent. This is an abstraction on top of a function that makes it easy for LLMs (and agents) to interact with it. Ex: Google search.
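
As a sketch, wrapping a plain Python function as a tool (the word counter is a made-up example, not a built-in):

from langchain.agents import Tool

def word_count(text: str) -> str:
    return str(len(text.split()))

tools = [
    Tool(
        name="word_counter",
        func=word_count,
        description="Useful for counting the words in a piece of text.",
    )
]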

Toolkit

Groups of tools that your agent can select from

from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.llms import OpenAI

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

serpapi_api_key='...'
toolkit = load_tools(["serpapi"], llm=llm, serpapi_api_key=serpapi_api_key)

agent = initialize_agent(toolkit, llm, agent="zero-shot-react-description", verbose=True, return_intermediate_steps=True)

response = agent({"input": "what was the first album of the "
                           "band that Natalie Bergman is a part of?"})

Memory: Add state to chains and agents

Chat Message History

from langchain.chat_models import ChatOpenAI
from langchain.memory import ChatMessageHistory

chat = ChatOpenAI(temperature=0, openai_api_key=openai_api_key)

history = ChatMessageHistory()

history.add_ai_message("hi!")

history.add_user_message("What is the capital of France?")

ai_response = chat(history.messages)
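
To thread that history through a chain automatically, you would typically reach for a memory class; a minimal sketch with ConversationBufferMemory:

from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

conversation = ConversationChain(
    llm=chat,
    memory=ConversationBufferMemory(),
)

conversation.run("Hi, I'm Bob.")
conversation.run("What's my name?")  # the memory supplies the earlier turn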

Models

Language Model

Text in and text out

from langchain.llms import OpenAI

llm = OpenAI(model_name="text-ada-001")

Chat Model

Takes a series of messages and returns a message output

from langchain.chat_models import ChatOpenAI

chat = ChatOpenAI(temperature=1)

Text Embedding Model

Turns text into a vector (a series of numbers that holds the semantic ‘meaning’ of the text).

from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
text_embedding = embeddings.embed_query("test")
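
To see what the vector buys you, compare two embeddings with cosine similarity (the helper below is my own sketch, not a LangChain API):

import numpy as np

def cosine_similarity(a, b):
    a, b = np.array(a), np.array(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

v1 = embeddings.embed_query("cat")
v2 = embeddings.embed_query("kitten")
cosine_similarity(v1, v2)  # semantically similar text scores closer to 1.0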

Indexes: Structuring documents so LLMs can work with them

Document Loaders

from langchain.document_loaders import HNLoader # Hackernews loader
loader = HNLoader("https://news.ycombinator.com/item?id=232")
data = loader.load()

Text Splitters

If a document is too long for the LLM's context window, you need to split it into chunks. Text splitters help with this.

from langchain.text_splitter import RecursiveCharacterTextSplitter

with open('data/PaulGrahamEssays/worked.txt') as f:
    pg_work = f.read()

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=150,
    chunk_overlap=20,
)

texts = text_splitter.create_documents([pg_work])

Retrievers

An easy way to fetch relevant documents and combine them with LLMs. The most widely supported type: VectorStoreRetriever

from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

loader = TextLoader('data/PaulGrahamEssays/worked.txt')
documents = loader.load()

# Get splitter ready
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=50,
)

# Split docs into texts
texts = text_splitter.split_documents(documents)

# Get embedding engine ready
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)

# Embed your texts
db = FAISS.from_documents(texts, embeddings)

# Init your retriever, asking for just 1 document back
retriever = db.as_retriever(search_kwargs={"k": 1})

docs = retriever.get_relevant_documents("What types of things did the author want to build?")
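
From here it is one short step to question answering over your documents; a sketch using RetrievalQA with the retriever built above:

from langchain.llms import OpenAI
from langchain.chains import RetrievalQA

qa = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0, openai_api_key=openai_api_key),
    chain_type="stuff",  # stuff all retrieved docs into a single prompt
    retriever=retriever,
)

qa.run("What types of things did the author want to build?")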

VectorStores

Databases that store embedding vectors. Conceptually, a table with a column for the embedding and a column for metadata

embedding_list = embeddings.embed_documents([text.page_content for text in texts])
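
You can also query the store directly, without a retriever; a sketch against the FAISS db built in the Retrievers example:

# k controls how many nearest neighbors come back
results = db.similarity_search("What types of things did the author want to build?", k=2)
for doc in results:
    print(doc.page_content[:100])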

Feature Store

Feast

from feast import FeatureStore

# You may need to update the path depending on where you stored it
feast_repo_path = "../../../../../my_feature_repo/feature_repo/"
store = FeatureStore(repo_path=feast_repo_path)
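
A hedged sketch of pulling an online feature value into a prompt; driver_hourly_stats is the feature view from Feast's demo repo, so treat the feature and entity names as placeholders for your own:

from langchain.prompts import PromptTemplate

# Fetch a feature value from the online store (feature/entity names are illustrative)
feature = store.get_online_features(
    features=["driver_hourly_stats:conv_rate"],
    entity_rows=[{"driver_id": 1001}],
).to_dict()

template = PromptTemplate.from_template(
    "Given the driver's conversion rate of {conv_rate}, write them a short note."
)
prompt_text = template.format(conv_rate=feature["conv_rate"][0])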