[LLM] LangChain Concepts

Cookbook 1 - Concepts
Cookbook 2 - Use Cases

There are two types of language models, which in LangChain are called:

  • LLMs: a language model that takes a string as input and returns a string
  • ChatModels: a language model that takes a list of messages as input and returns a message (see the sketch below)
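
A minimal sketch contrasting the two interfaces (assumes an OpenAI API key is set in the environment):

from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

llm = OpenAI()
text_out = llm("Tell me a joke")  # str in -> str out

chat = ChatOpenAI()
msg_out = chat([HumanMessage(content="Tell me a joke")])  # messages in -> AIMessage out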

Most LLM applications do not pass user input directly into an LLM. Usually they add the user input to a larger piece of text, built from a prompt template, that provides additional context on the specific task at hand.
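
As an illustration, a plain string prompt template might look like this (the ticket-summary task is made up for the example):

from langchain.prompts import PromptTemplate

prompt = PromptTemplate.from_template(
    "Summarize the following support ticket in one sentence:\n\n{ticket}"
)
prompt.format(ticket="My order arrived broken and nobody answers my emails.")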

The output parser transforms raw LLM output into structured formats such as JSON.
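
For instance, the built-in CommaSeparatedListOutputParser turns a comma-separated completion into a Python list:

from langchain.output_parsers import CommaSeparatedListOutputParser

parser = CommaSeparatedListOutputParser()

# Appended to the prompt so the model answers in the expected shape
format_instructions = parser.get_format_instructions()

parser.parse("red, green, blue")  # -> ['red', 'green', 'blue']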

  • Temperature: controls the randomness of the model's output. With temperature=0 the model yields (nearly) the same result for the same prompt


Prompts

Either a string or a list of messages

from langchain.prompts import ChatPromptTemplate

template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful AI bot. Your name is {name}."),
    ("human", "Hello, how are you doing?"),
    ("ai", "I'm doing well, thanks!"),
    ("human", "{user_input}"),
])

messages = template.format_messages(
    name="Bob",
    user_input="What is your name?"
)

Chat Messages

  • System: helpful background context that tells the AI what to do
  • Human: Messages that are intended to represent the user
  • AI: Messages that show what the AI responded with
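
A minimal sketch putting the three message types together (assumes ChatOpenAI and an API key):

from langchain.chat_models import ChatOpenAI
from langchain.schema import SystemMessage, HumanMessage, AIMessage

chat = ChatOpenAI(temperature=0)

messages = [
    SystemMessage(content="You are a terse assistant."),
    HumanMessage(content="Hi!"),
    AIMessage(content="Hello! How can I help?"),
    HumanMessage(content="What is the capital of France?"),
]

response = chat(messages)  # returns an AIMessage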

Selectors

Pick which few-shot examples to include in a prompt, e.g. by semantic similarity to the input

from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

llm = OpenAI(model_name="text-davinci-003")

example_prompt = PromptTemplate(
    input_variables=["input", "output"],
    template="Example Input: {input}\nExample Output: {output}",
)

# Examples of locations that nouns are found
examples = [
    {"input": "pirate", "output": "ship"},
    {"input": "pilot", "output": "plane"},
]

# Pick the example closest in meaning to the user's input
example_selector = SemanticSimilarityExampleSelector.from_examples(
    examples, OpenAIEmbeddings(), FAISS, k=1
)

Chains

Combine multiple LLM calls and actions automatically

from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

llm = OpenAI(temperature=0.9)
prompt = PromptTemplate(
    input_variables=["food"],
    template="What are 5 vacation destinations for someone who likes to eat {food}?",
)
chain = LLMChain(llm=llm, prompt=prompt)
chain.run("fruit")

Simple Sequential Chains

Easy chains where you can use the output of one LLM call as the input to another. Good for breaking tasks into steps

from langchain.llms import OpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.chains import SimpleSequentialChain

llm = OpenAI(temperature=1, openai_api_key=openai_api_key)

template = """Your job is to come up with a classic dish from the area that the users suggests.
% USER LOCATION
{user_location}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["user_location"], template=template)

# Holds my 'location' chain
location_chain = LLMChain(llm=llm, prompt=prompt_template)


template = """Given a meal, give a short and simple recipe on how to make that dish at home.
% MEAL
{user_meal}

YOUR RESPONSE:
"""
prompt_template = PromptTemplate(input_variables=["user_meal"], template=template)

# Holds my 'meal' chain
meal_chain = LLMChain(llm=llm, prompt=prompt_template)

overall_chain = SimpleSequentialChain(chains=[location_chain, meal_chain], verbose=True)

review = overall_chain.run("Rome")

Summarize Chain

Easily run through numerous long documents and get a summary. Check out this video for other chain types besides map-reduce

from langchain.llms import OpenAI
from langchain.chains.summarize import load_summarize_chain
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

loader = TextLoader('data/PaulGrahamEssays/disc.txt')
documents = loader.load()

# Get your splitter ready
text_splitter = RecursiveCharacterTextSplitter(chunk_size=700, chunk_overlap=50)

# Split your docs into texts
texts = text_splitter.split_documents(documents)

# There is a lot of complexity hidden in this one line. I encourage you to check out the video above for more detail
chain = load_summarize_chain(llm, chain_type="map_reduce", verbose=True)
chain.run(texts)

Agents: dynamically call chains based on user input

# example uses SerpAPI, which scrapes Google search results
from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.llms import OpenAI

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)
agent.run("Who is the current leader of Japan? What is the largest prime number that is smaller than their age?")

Tools

A ‘capability’ of an agent. This is an abstraction on top of a function that makes it easy for LLMs (and agents) to interact with it. Ex: Google search.
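
As a sketch, wrapping a plain Python function as a tool (the word counter is a made-up example, not a built-in):

from langchain.agents import Tool

def word_count(text: str) -> str:
    return str(len(text.split()))

tools = [
    Tool(
        name="word_counter",
        func=word_count,
        description="Useful for counting the words in a piece of text.",
    )
]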

Toolkit

Groups of tools that your agent can select from

from langchain.agents import load_tools
from langchain.agents import initialize_agent
from langchain.llms import OpenAI

llm = OpenAI(temperature=0, openai_api_key=openai_api_key)

serpapi_api_key='...'
toolkit = load_tools(["serpapi"], llm=llm, serpapi_api_key=serpapi_api_key)

agent = initialize_agent(toolkit, llm, agent="zero-shot-react-description", verbose=True, return_intermediate_steps=True)

response = agent({"input": "what was the first album of the "
                           "band that Natalie Bergman is a part of?"})

Memory: Add state to chains and agents

Chat Message History

from langchain.chat_models import ChatOpenAI
from langchain.memory import ChatMessageHistory

chat = ChatOpenAI(temperature=0, openai_api_key=openai_api_key)

history = ChatMessageHistory()

history.add_ai_message("hi!")

history.add_user_message("What is the capital of France?")

ai_response = chat(history.messages)
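
To thread that history through a chain automatically, you would typically reach for a memory class; a minimal sketch with ConversationBufferMemory:

from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

conversation = ConversationChain(
    llm=chat,
    memory=ConversationBufferMemory(),
)

conversation.run("Hi, I'm Bob.")
conversation.run("What's my name?")  # the memory supplies the earlier turn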

Models

Language Model

Text in and text out

from langchain.llms import OpenAI

llm = OpenAI(model_name="text-ada-001")

Chat Model

Takes a series of messages and returns a message output

from langchain.chat_models import ChatOpenAI

chat = ChatOpenAI(temperature=1)

Text Embedding Model

Turns text into a vector (a series of numbers that holds the semantic ‘meaning’ of the text).

from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
text_embedding = embeddings.embed_query("test")
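
To see what the vector buys you, compare two embeddings with cosine similarity (the helper below is my own sketch, not a LangChain API):

import numpy as np

def cosine_similarity(a, b):
    a, b = np.array(a), np.array(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

v1 = embeddings.embed_query("cat")
v2 = embeddings.embed_query("kitten")
cosine_similarity(v1, v2)  # semantically similar text scores closer to 1.0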

Indexes: Structuring documents so LLMs can work with them

Document Loaders

from langchain.document_loaders import HNLoader # Hackernews loader
loader = HNLoader("https://news.ycombinator.com/item?id=232")
data = loader.load()

Text Splitters

If a document is too long for the LLM's context window, you need to split it into chunks. Text splitters help with this.

from langchain.text_splitter import RecursiveCharacterTextSplitter

with open('data/PaulGrahamEssays/worked.txt') as f:
    pg_work = f.read()

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=150,
    chunk_overlap=20,
)

texts = text_splitter.create_documents([pg_work])

Retrievers

An easy way to fetch relevant documents and combine them with LLMs. The most widely supported type: VectorStoreRetriever

from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings

loader = TextLoader('data/PaulGrahamEssays/worked.txt')
documents = loader.load()

# Get splitter ready
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=50,
)

# Split docs into texts
texts = text_splitter.split_documents(documents)

# Get embedding engine ready
embeddings = OpenAIEmbeddings(openai_api_key=openai_api_key)

# Embed your texts
db = FAISS.from_documents(texts, embeddings)

# Init your retriever, asking for just 1 document back
retriever = db.as_retriever(search_kwargs={"k": 1})

docs = retriever.get_relevant_documents("What types of things did the author want to build?")
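
From here it is one short step to question answering over your documents; a sketch using RetrievalQA with the retriever built above:

from langchain.llms import OpenAI
from langchain.chains import RetrievalQA

qa = RetrievalQA.from_chain_type(
    llm=OpenAI(temperature=0, openai_api_key=openai_api_key),
    chain_type="stuff",  # stuff all retrieved docs into a single prompt
    retriever=retriever,
)

qa.run("What types of things did the author want to build?")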

VectorStores

Databases that store embedding vectors. Conceptually, a table with a column for the embedding and a column for metadata

embedding_list = embeddings.embed_documents([text.page_content for text in texts])
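
You can also query the store directly, without a retriever; a sketch against the FAISS db built in the Retrievers example:

# k controls how many nearest neighbors come back
results = db.similarity_search("What types of things did the author want to build?", k=2)
for doc in results:
    print(doc.page_content[:100])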

Feature Store

Feast

from feast import FeatureStore

# You may need to update the path depending on where you stored it
feast_repo_path = "../../../../../my_feature_repo/feature_repo/"
store = FeatureStore(repo_path=feast_repo_path)
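
A hedged sketch of pulling an online feature value into a prompt; driver_hourly_stats is the feature view from Feast's demo repo, so treat the feature and entity names as placeholders for your own:

from langchain.prompts import PromptTemplate

# Fetch a feature value from the online store (feature/entity names are illustrative)
feature = store.get_online_features(
    features=["driver_hourly_stats:conv_rate"],
    entity_rows=[{"driver_id": 1001}],
).to_dict()

template = PromptTemplate.from_template(
    "Given the driver's conversion rate of {conv_rate}, write them a short note."
)
prompt_text = template.format(conv_rate=feature["conv_rate"][0])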