nighca's log

Algebraic Effects

In its June 13 update, OpenAI introduced a feature called Function calling:

Developers can now describe functions to gpt-4-0613 and gpt-3.5-turbo-0613, and have the model intelligently choose to output a JSON object containing arguments to call those functions.

Before that, LangChain had already been offering something similar, Agent Tools:

A tool is an abstraction around a function that makes it easy for a language model to interact with it.

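Like OpenAI Function calling, these Tools are built on top of capabilities the model already has; they require no special changes to how the model is trained or to the environment it runs in. Barring surprises, ChatGPT's Plugins integration presumably works through similar means.

With a few exceptions, such as the Calculator that provides accurate arithmetic, most Functions / Tools correspond to capabilities that interact directly with the real world; the examples given in the official docs include sending email, looking up the weather, and executing SQL.

Yet the environment a large model runs in has always been extremely restricted: apart from text input and text output, an LLM has no way to interact with the outside world, and giving an LLM the ability to call other APIs directly is both hard and dangerous. So the problem that LangChain and its peers solve here is very close to one we face in functional programming: how to let a computation produce side effects.

Looking closely at the execution process makes the parallel clear.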

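Take Agents: Dynamically Run Chains Based on User Input from the LangChain docs as an example. For the question "Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?", we expect the model to first call the Search API to find out who Olivia Wilde's boyfriend is, then search again for his age, and finally call the Calculator to raise that age to the 0.23 power. In practice this plays out as several rounds of interaction between the SDK and the model:

The first Prompt is: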

Answer the following questions as best you can. You have access to the following tools:

search: a search engine. useful for when you need to answer questions about current events. input should be a search query.
calculator: Useful for getting the result of a math expression. The input to this tool should be a valid mathematical expression that could be executed by a simple calculator.

Use the following format in your response:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [search,calculator]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?
Thought:

This Prompt lays out the rules of the whole interaction and poses the question; the Completion that comes back is:

I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.
Action: search
Action Input: "Olivia Wilde boyfriend"

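The SDK then extracts the Action and Action Input from the Completion, calls SerpApi to get the result for "Olivia Wilde boyfriend", and uses that result as the Observation to construct the second Prompt:

[The content here is the same as in the first Prompt and is omitted]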

Question: Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?
Thought: I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.
Action: search
Action Input: "Olivia Wilde boyfriend"
Observation: Looks like Olivia Wilde and Jason Sudeikis are starting 2023 on good terms. Amid their highly publicized custody battle – and the actress' ...
Thought:
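The extraction step the SDK performs between two Prompts can be sketched roughly as follows. This is only a minimal illustration, not LangChain's actual parser: `parseCompletion` is a hypothetical name, and the regular expressions simply assume the Thought / Action / Action Input format laid out in the first Prompt.

```javascript
// Hypothetical helper, for illustration only (not LangChain's real parser).
function parseCompletion(completion) {
  // A "Final Answer:" line means the model is done and no tool is needed.
  const final = completion.match(/Final Answer:\s*([\s\S]*)$/)
  if (final) return { type: 'final', answer: final[1].trim() }

  // Otherwise we expect one "Action:" line and one "Action Input:" line.
  const action = completion.match(/^Action:\s*(.+)$/m)
  const input = completion.match(/^Action Input:\s*(.+)$/m)
  if (!action || !input) throw new Error('Could not parse completion')

  return {
    type: 'action',
    action: action[1].trim(),
    // Drop the surrounding double quotes if the model added them.
    input: input[1].trim().replace(/^"(.*)"$/, '$1')
  }
}

const completion = [
  " I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.",
  'Action: search',
  'Action Input: "Olivia Wilde boyfriend"'
].join('\n')

const parsed = parseCompletion(completion)
console.log(parsed)
// → { type: 'action', action: 'search', input: 'Olivia Wilde boyfriend' }
```

The parsed `action` picks the tool to call and `input` becomes its argument; the tool's result is then appended to the Prompt as the next Observation.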

For this second Prompt, the model returns the Completion:

I need to find out Jason Sudeikis' age
Action: search
Action Input: "Jason Sudeikis age"

Similarly, the SDK calls SerpApi to get the result for "Jason Sudeikis age" and constructs the next Prompt:

[The content here is the same as in the first Prompt and is omitted]

Question: Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?
Thought: I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.
Action: search
Action Input: "Olivia Wilde boyfriend"
Observation: Looks like Olivia Wilde and Jason Sudeikis are starting 2023 on good terms. Amid their highly publicized custody battle – and the actress' ...
Thought: I need to find out Jason Sudeikis' age
Action: search
Action Input: "Jason Sudeikis age"
Observation: 47 years
Thought:

The model gives the Completion:

I need to calculate 47 raised to the 0.23 power
Action: calculator
Action Input: 47^0.23

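The call to the Calculator is likewise carried out by the SDK, which hands the result back to the model:

[The content here is the same as in the first Prompt and is omitted]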

Question: Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?
Thought: I need to find out who Olivia Wilde's boyfriend is and then calculate his age raised to the 0.23 power.
Action: search
Action Input: "Olivia Wilde boyfriend"
Observation: Looks like Olivia Wilde and Jason Sudeikis are starting 2023 on good terms. Amid their highly publicized custody battle – and the actress' ...
Thought: I need to find out Jason Sudeikis' age
Action: search
Action Input: "Jason Sudeikis age"
Observation: 47 years
Thought: I need to calculate 47 raised to the 0.23 power
Action: calculator
Action Input: 47^0.23
Observation: 2.4242784855673896
Thought:

This time the model directly gives the final answer to the question:

I now know the final answer
Final Answer: Jason Sudeikis is Olivia Wilde's boyfriend and his current age raised to the 0.23 power is 2.4242784855673896.
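Put together, the rounds above amount to a small loop on the SDK side. The sketch below is an assumption-laden illustration rather than LangChain's implementation: `runAgent`, `fakeModel`, and `tools` are hypothetical names, the model is replaced by a scripted stand-in that decides what to do from the Observations already in the Prompt, the search tool returns canned text, and the fixed preamble of the Prompt is left out for brevity.

```javascript
// Stand-in tools: canned search results and a calculator that only handles "a^b".
const tools = {
  search: (query) => ({
    'Olivia Wilde boyfriend': 'Looks like Olivia Wilde and Jason Sudeikis are starting 2023 on good terms.',
    'Jason Sudeikis age': '47 years'
  })[query],
  calculator: (expr) => {
    const [base, exponent] = expr.split('^').map(Number)
    return String(Math.pow(base, exponent))
  }
}

// Scripted stand-in for the LLM. Like a real model it is stateless:
// everything it "knows" comes from the Prompt it is given.
function fakeModel(prompt) {
  const observations = (prompt.match(/^Observation: .*$/gm) || [])
    .map((line) => line.replace('Observation: ', ''))
  switch (observations.length) {
    case 0:
      return " I need to find out who Olivia Wilde's boyfriend is\nAction: search\nAction Input: \"Olivia Wilde boyfriend\""
    case 1:
      return " I need to find out Jason Sudeikis' age\nAction: search\nAction Input: \"Jason Sudeikis age\""
    case 2: {
      const age = parseInt(observations[1], 10)
      return ` I need to calculate ${age} raised to the 0.23 power\nAction: calculator\nAction Input: ${age}^0.23`
    }
    default:
      return ` I now know the final answer\nFinal Answer: Jason Sudeikis is Olivia Wilde's boyfriend and his current age raised to the 0.23 power is ${observations[2]}.`
  }
}

// The SDK-side loop: ask the model, run the requested tool, append the Observation, repeat.
function runAgent(question, model, tools, maxSteps = 5) {
  let prompt = `Question: ${question}\nThought:`
  for (let step = 0; step < maxSteps; step++) {
    const completion = model(prompt)
    const final = completion.match(/Final Answer:\s*([\s\S]*)$/)
    if (final) return final[1].trim()

    const action = completion.match(/^Action:\s*(.+)$/m)[1].trim()
    const input = completion.match(/^Action Input:\s*(.+)$/m)[1].trim().replace(/^"(.*)"$/, '$1')
    const observation = tools[action](input)
    prompt += `${completion}\nObservation: ${observation}\nThought:`
  }
  throw new Error('No final answer within maxSteps')
}

const answer = runAgent(
  "Who is Olivia Wilde's boyfriend? What is his current age raised to the 0.23 power?",
  fakeModel,
  tools
)
console.log(answer)
```

Note that the model never touches SerpApi or the Calculator itself; it only emits text describing what it wants done, and the loop decides whether and how to do it.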

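We mentioned Algebraic Effects in "Noise-Cancelling Headphones and the Museum of Innocence"; if you are familiar with Algebraic Effects, code like the following will readily come to mind (shown here using JavaScript Generator syntax):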

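function* answer() {
  // Each `yield` hands an effect to whoever drives the generator and resumes with its result.
  const boyfriend = yield { action: 'search', input: 'Olivia Wilde boyfriend' }
  const age = yield { action: 'search', input: `${boyfriend} age` }
  const result = yield { action: 'calculator', input: `${age}^0.23` }
  return result
}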

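The resemblance lies not only in the overall approach to the question of "how to let a computation produce side effects" but also in the details of how execution proceeds. In the LangChain example, though, it is the LLM that "writes" this code, while the SDK runs it and handles the corresponding effects.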
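To make the "SDK" half of that analogy concrete, here is a minimal sketch of a runner that drives such a generator and handles each yielded effect. The names `task`, `handlers`, and `run` are hypothetical, and the handlers return canned, already-distilled values (a real search result would need the interpretation the LLM provides).

```javascript
// The computation: it only describes effects, it never performs them.
function* task() {
  const boyfriend = yield { action: 'search', input: 'Olivia Wilde boyfriend' }
  const age = yield { action: 'search', input: `${boyfriend} age` }
  const result = yield { action: 'calculator', input: `${age}^0.23` }
  return `${boyfriend}: ${result}`
}

// The effect handlers: the only place where side effects would actually happen.
// Canned values stand in for real API calls here.
const handlers = {
  search: (query) => ({
    'Olivia Wilde boyfriend': 'Jason Sudeikis',
    'Jason Sudeikis age': '47'
  })[query],
  calculator: (expr) => {
    const [base, exponent] = expr.split('^').map(Number)
    return Math.pow(base, exponent)
  }
}

// The runner: take each yielded effect, handle it, and resume the generator with the result.
function run(generatorFn, handlers) {
  const iterator = generatorFn()
  let step = iterator.next()
  while (!step.done) {
    const { action, input } = step.value
    step = iterator.next(handlers[action](input))
  }
  return step.value
}

const output = run(task, handlers)
console.log(output)
```

Swapping `handlers` for a set that performs real requests, logs every effect, or refuses some actions would change what happens in the world without touching `task`, which is the same separation the SDK keeps between itself and the model.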