Tech Giants
December 15, 2025
AI Agents, Clearly Explained
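🤖 AI Key Takeaways
- Large language models (LLMs): LLMs, which power chatbots such as ChatGPT, excel at generating and editing text, but they lack knowledge of proprietary information and are passive: they respond only when a human prompts them.
- AI workflows: An LLM carries out tasks along a predefined path (the control logic), such as searching for data or calling an API. Multiple steps can be chained together, but a human still defines the flow.
- AI agents: An LLM replaces the human decision maker. The agent reasons about the best approach, acts using tools, observes the result, and iterates on its own until the goal is met.
- RAG (retrieval-augmented generation): A type of AI workflow that helps an LLM look up information before it answers.
- ReAct framework: A common configuration for AI agents, named for its two parts: reasoning and acting.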
📝 Subtitles
AI.
AI.
AI.
AI.
AI.
AI.
You know, more agentic.
Agentic capabilities.
An AI agent.
Agents.
Agentic workflows.
Agents.
Agents.
Agent.
Agent.
Agent.
Agent.
Agentic.
All right.
Most explanations of AI agents are either too technical or too basic.
This video is meant for people like myself.
You have zero technical background, but you use AI tools regularly and you want to learn just enough about AI agents to see how they affect you.
In this video, we'll follow a simple one-two-three learning path by building on concepts you already understand, like ChatGPT, then moving on to AI workflows, and finally AI agents.
All the while using examples you will actually encounter in real life.
And believe me when I tell you those intimidating terms you see everywhere, like RAG or ReAct, are a lot simpler than you think.
Let's get started.
Kicking things off at level one: large language models.
Popular AI chatbots like ChatGPT, Google Gemini, and Claude are applications built on top of large language models (LLMs), and they're fantastic at generating and editing text.
Here's a simple visualization.
You, the human, provide an input, and the LLM produces an output based on its training data.
For example, if I were to ask ChatGPT to draft an email requesting a coffee chat, my prompt is the input, and the resulting email, which is way more polite than I would ever be in real life, is the output.
So far so good, right?
Simple stuff.
But what if I asked ChatGPT when my next coffee chat is?
Even without seeing the response, both you and I know ChatGPT is going to fail because it doesn't know that information.
It doesn't have access to my calendar.
This highlights two key traits of large language models.
First, despite being trained on vast amounts of data, they have limited knowledge of proprietary information like our personal information or internal company data.
Second, LLMs are passive.
They wait for our prompt and then respond.
Right?
Keep these two traits in mind moving forward.
Moving to level two: AI workflows.
Let's build on our example.
What if I, a human, told the LLM, "Every time I ask about a personal event, perform a search query and fetch data from my Google Calendar before providing a response"?
With this logic implemented, the next time I ask, "When is my coffee chat with Elon Husky?" I'll get the correct answer because the LLM will now first go into my Google Calendar to find that information.
But here's where it gets tricky.
What if my next follow-up question is, "What will the weather be like that day?"
The LLM will now fail at answering the query because the path we told the LLM to follow is to always search my Google Calendar, which does not have information about the weather.
This is a fundamental trait of AI workflows.
They can only follow predefined paths set by humans.
And if you want to get technical, this path is also called the control logic.
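For readers who want to see the idea in code, here is a minimal sketch of that control logic. It assumes a hypothetical in-memory dictionary in place of Google Calendar, and the names `FAKE_CALENDAR`, `search_calendar`, and `answer_with_workflow` are made up for illustration. The point is only that the human-written rule always goes to the calendar, so the weather follow-up has nowhere to go.

```python
# Hypothetical stand-in for Google Calendar data (illustrative only).
FAKE_CALENDAR = {"coffee chat with Elon Husky": "Friday at 10:00"}


def search_calendar(question):
    """Return the time of the first calendar event named in the question."""
    for event, when in FAKE_CALENDAR.items():
        if event.lower() in question.lower():
            return when
    return None


def answer_with_workflow(question):
    """Predefined path set by a human: always search the calendar first."""
    found = search_calendar(question)
    if found is None:
        # The path has no branch for anything outside the calendar.
        return "Sorry, I could not find that in your calendar."
    return f"Your event is on {found}."


print(answer_with_workflow("When is my coffee chat with Elon Husky?"))
print(answer_with_workflow("What will the weather be like that day?"))
```

The second call falls through to the apology because nobody wrote a weather branch. Fixing that means a human editing the path, which is exactly what makes this a workflow.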
Pushing my example further, what if I added more steps to the workflow by allowing the LLM to access the weather via an API and then, just for fun, used a text-to-audio model to speak the answer?
"The weather forecast for seeing Elon Husky is sunny with a chance of being a good boy."
Here's the thing.
No matter how many steps we add, this is still just an AI workflow.
Even if there were hundreds or thousands of steps, if a human is the decision maker, there is no AI agent involved.
Pro tip: retrieval-augmented generation, or RAG, is a fancy term that's thrown around a lot.
In simple terms, RAG is a process that helps AI models look things up before they answer, like accessing my calendar or the weather service.
Essentially, RAG is just a type of AI workflow.
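To make "look things up before they answer" concrete, here is a rough sketch of the retrieve-then-generate shape. Everything in it is an assumption for illustration: `DOCUMENTS` is a hypothetical stand-in for the calendar and weather service, retrieval is plain word overlap rather than anything a real system would use, and `fake_llm` is a placeholder that only echoes what it was given.

```python
# Hypothetical snippets standing in for a calendar and a weather service.
DOCUMENTS = [
    "Calendar: coffee chat with Elon Husky on Friday at 10:00.",
    "Weather: Friday is forecast to be sunny.",
]


def tokenize(text):
    """Lowercase the text and strip simple punctuation from each word."""
    return {word.strip("?.,:").lower() for word in text.split()}


def retrieve(question, documents):
    """Retrieval: pick the document sharing the most words with the question."""
    question_words = tokenize(question)
    return max(documents, key=lambda doc: len(question_words & tokenize(doc)))


def fake_llm(prompt):
    """Placeholder for a real model call: it simply echoes the context line."""
    return prompt.splitlines()[0].replace("Context: ", "Based on what I found: ")


def answer(question):
    context = retrieve(question, DOCUMENTS)               # look things up
    prompt = f"Context: {context}\nQuestion: {question}"  # augment the prompt
    return fake_llm(prompt)                               # generate


print(answer("What will the weather be like on Friday?"))
```

The order of the three steps in `answer` is fixed by whoever wrote it, which is why RAG counts as a workflow rather than an agent.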
By the way, I have a free AI toolkit that cuts through the noise and helps you master essential AI tools and workflows.
I'll leave a link to that down below.
Here's a real-world example.
Following Helena Liu's amazing tutorial, I created a simple AI workflow using make.com.
Here you can see that first I'm using Google Sheets to do something.
Specifically, I'm compiling links to news articles in a Google Sheet.
And this is that Google Sheet.
Second, I'm using Perplexity to summarize those news articles.
Then, using Claude and a prompt that I wrote, I'm asking Claude to draft a LinkedIn and Instagram post.
Finally, I can schedule this to run automatically every day at 8 a.m.
As you can see, this is an AI workflow because it follows a predefined path set by me.
Step one, you do this.
Step two, you do this.
Step three, you do this.
And finally, remember to run daily at 8 a.m.
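The same three steps can be sketched as a fixed pipeline. This is not how make.com works internally; it is only an illustration of a path where every step and its order are chosen by a human. The functions below are hypothetical stubs (no real Google Sheets, Perplexity, or Claude calls), and the example.com links are placeholders.

```python
from datetime import time

# The schedule is also decided by the human: run daily at 8 a.m.
RUN_AT = time(hour=8)


def read_links_from_sheet():
    """Step 1: stand-in for reading article links from a Google Sheet."""
    return ["https://example.com/article-1", "https://example.com/article-2"]


def summarize(link):
    """Step 2: stand-in for a summarization call (Perplexity in the video)."""
    return f"Summary of {link}"


def draft_post(summaries, platform):
    """Step 3: stand-in for a drafting call (Claude in the video)."""
    return f"{platform} post covering {len(summaries)} articles"


def run_workflow():
    """Run steps 1, 2, and 3 in the fixed order the human defined."""
    links = read_links_from_sheet()
    summaries = [summarize(link) for link in links]
    return {p: draft_post(summaries, p) for p in ("LinkedIn", "Instagram")}


posts = run_workflow()
print(posts["LinkedIn"])
```

Nothing in `run_workflow` ever decides what to do next. It just does step one, step two, step three.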
One last thing: if I test this workflow and I don't like the final output of the LinkedIn post, for example, as you can see right here, it's not funny enough, and I'm naturally hilarious, right?
I'd have to manually go back and rewrite the prompt for Claude.
Okay?
And this trial-and-error iteration is currently being done by me, a human.
So keep that in mind moving forward.
All right, level three: AI agents.
Continuing the make.com example, let's break down what I've been doing so far as the human decision maker.
With the goal of creating social media posts based on news articles, I need to do two things.
First, reason, or think, about the best approach.
I need to first compile the news articles, then summarize them, then write the final posts.
Second, take action using tools.
I need to find and link to those news articles in Google Sheets.
Use Perplexity for real-time summarization and then Claude for copywriting.
So, and this is the most important sentence in this entire video:
The one massive change that has to happen in order for this AI workflow to become an AI agent is for me, the human decision maker, to be replaced by an LLM.
In other words, the AI agent must reason.
"What's the most efficient way to compile these news articles?
Should I copy and paste each article into a Word document?
No, it's probably easier to compile links to those articles and then use another tool to fetch the data.
Yes, that makes more sense."
The AI agent must act, aka do things via tools.
"Should I use Microsoft Word to compile links?
No.
Inserting links directly into rows is way more efficient.
What about Excel?
Hmm.
The user has already connected their Google account with make.com.
So Google Sheets is a better option."
Pro tip.
Because of this, the most common configuration for AI agents is the ReAct framework.
All AI agents must reason and act.
So, ReAct.
Sounds simple once we break it down, right?
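Here is a small sketch of the reason-then-act loop. One big caveat: in a real agent the `reason` step would be a call to an LLM, whereas the version below is a hand-written stub so the example can run on its own. The tool names and their canned replies are hypothetical as well. Read it for the shape of the loop, not as a working agent.

```python
def check_calendar(_goal):
    """Hypothetical calendar tool with a canned reply."""
    return "Coffee chat with Elon Husky is on Friday."


def check_weather(_goal):
    """Hypothetical weather tool with a canned reply."""
    return "Friday is forecast to be sunny."


TOOLS = {"calendar": check_calendar, "weather": check_weather}


def reason(goal, observations):
    """Stand-in for the LLM's reasoning: pick the next tool or finish.

    A real agent would ask a model here; this stub only mimics the shape.
    """
    if "calendar" not in observations:
        return "calendar"
    if "weather" in goal.lower() and "weather" not in observations:
        return "weather"
    return "finish"


def run_agent(goal, max_steps=5):
    observations = {}
    for _ in range(max_steps):
        action = reason(goal, observations)         # reason
        if action == "finish":
            break
        observations[action] = TOOLS[action](goal)  # act, then record the result
    return " ".join(observations.values())


print(run_agent("What will the weather be like for my coffee chat?"))
```

The difference from the earlier workflow is where the decision lives: `run_agent` does not know in advance which tools it will call, because that choice is delegated to `reason` on every step.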
A third key trait of AI agents is their ability to iterate.
Remember when I had to manually rewrite the prompt to make the LinkedIn post funnier?
I, the human, probably need to repeat this iterative process a few times to get something I'm happy with, right?
An AI agent will be able to do the same thing autonomously.
In our example, the AI agent would autonomously add in another LLM to critique its own output.
"Okay, I've drafted V1 of a LinkedIn post.
How do I make sure it's good?
Oh, I know.
I'll add another step where an LLM will critique the post based on LinkedIn best practices.
And let's repeat this until the best-practices criteria are all met."
And after a few cycles of that, we have the final output.
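The draft-critique-repeat cycle can be sketched like this. Both `draft` and `critique` are stubs standing in for LLM calls, and the two "best practices" they check (a hook and a call to action) are invented for the example, not taken from LinkedIn or from the video.

```python
def draft(feedback):
    """Stand-in for the writer LLM: revise according to the feedback so far."""
    post = "We summarized today's AI news."
    if "add a hook" in feedback:
        post = "Ever wonder what AI did today? " + post
    if "add a call to action" in feedback:
        post += " Tell us what you think!"
    return post


def critique(post):
    """Stand-in for the critic LLM: list the criteria the post still misses."""
    issues = []
    if "?" not in post:
        issues.append("add a hook")
    if not post.endswith("!"):
        issues.append("add a call to action")
    return issues


def write_with_iteration(max_rounds=5):
    """Draft, critique, and redraft until no issues remain or rounds run out."""
    feedback = []
    post = ""
    for round_number in range(1, max_rounds + 1):
        post = draft(feedback)
        issues = critique(post)
        if not issues:
            return post, round_number
        feedback.extend(issues)
    return post, max_rounds


final_post, rounds = write_with_iteration()
print(rounds, final_post)
```

The `max_rounds` cap is a design choice worth keeping in any loop like this, so a critic that is never satisfied cannot keep the agent running forever.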
That was a hypothetical example.
So let's move on to a real-world AI agent example.
Andrew Ng is a preeminent figure in AI, and he created this demo website that illustrates how an AI agent works.
I'll link the full video down below, but when I search for a keyword like "skier" and hit enter, the AI vision agent in the background is first reasoning what a skier looks like.
"A person on skis going really fast in snow, for example, right?
I'm not sure."
And then it's acting by looking at clips in video footage, trying to identify what it thinks a skier is, indexing that clip, and then returning that clip to us.
Although this might not feel impressive, remember that an AI agent did all that instead of a human reviewing the footage beforehand, manually identifying the skier, and adding tags like "skier," "mountain," "ski," and "snow."
The programming is obviously a lot more technical and complicated than what we see in the front end, but that's the point of this demo, right?
The average user like myself wants a simple app that just works without me having to understand what's going on in the back end.
Speaking of examples, I'm also building my very own basic AI agent using n8n.
So, let me know in the comments what type of AI agent you'd like me to make a tutorial on next.
To wrap up, here's a simplified visualization of the three levels we covered today.
Level one: we provide an input and the LLM responds with an output.
Easy.
Level two: for AI workflows, we provide an input and tell the LLM to follow a predefined path that may involve retrieving information from external tools.
The key trait here is that the human programs a path for the LLM to follow.
Level three: the AI agent receives a goal, and the LLM performs reasoning to determine how best to achieve the goal, takes action using tools to produce an interim result, observes that interim result, decides whether iterations are required, and produces a final output that achieves the initial goal.
The key trait here is that the LLM is a decision maker in the workflow.
If you found this helpful, you might want to learn how to build a prompts database in Notion.
See you on the next video.
In the meantime, have a great one.
🎓
English Learning Section
Build practical vocabulary and comprehension through the video
agentic
/eɪˈdʒen.tɪk/
adjective
relating to the capacity of an entity to act in the world.
📍 Example from the video
"You know, more agentic."
💡 Additional example
"The development of truly agentic AI is a major goal."
workflows
/ˈwɝːk.floʊz/
noun
series of tasks performed by people or systems in a specific order (plural of "workflow").
📍 Example from the video
"Agentic workflows."
💡 Additional example
"We need to optimize our internal workflows to improve efficiency."
intimidating
/ɪnˈtɪm.ɪ.deɪ.tɪŋ/
adjective
causing fear or a feeling of being overwhelmed.
📍 Example from the video
"those intimidating terms you see everywhere, like RAG or ReAct, are a lot simpler than you think."
💡 Additional example
"The large size of the book was a little intimidating."
proprietary
/prəˈpraɪ.ə.ter.i/
adjective
relating to ownership; owned or controlled by a particular person or organization.
📍 Example from the video
"limited knowledge of proprietary information like our personal information or internal company data."
💡 Additional example
"The company protects its proprietary technology with patents."
passive
/ˈpæs.ɪv/
adjective
accepting what happens without active response or resistance.
📍 Example from the video
"LLMs are passive."
💡 Additional example
"He adopted a passive role in the negotiations."
predefined
/ˌpriː.dɪˈfaɪnd/
adjective
established or defined in advance.
📍 Example from the video
"They can only follow predefined paths set by humans."
💡 Additional example
"The software operates according to a predefined set of rules."
control logic
/kənˈtroʊl ˈlɑː.dʒɪk/
noun
the set of rules and instructions that govern the operation of a system.
📍 Example from the video
"this path is also called the control logic."
💡 Additional example
"The engineer reviewed the control logic of the robot."
autonomously
/ɔːˈtɒn.ə.məs.li/
adverb
in a self-governing way; independently.
📍 Example from the video
"An AI agent will be able to do the same thing autonomously."
💡 Additional example
"The drone can fly autonomously for several hours."
iterate
/ˈɪt.ə.reɪt/
verb
to repeat a process or procedure in order to approach a desired outcome.
📍 Example from the video
"this trial-and-error iteration is currently being done by me, a human."
💡 Additional example
"We need to iterate on the design based on user feedback."
configuration
/kənˌfɪɡ.jəˈreɪ.ʃən/
noun
the arrangement of parts or elements in a complex system.
📍 Example from the video
"Because of this, the most common configuration for AI agents is the ReAct framework."
💡 Additional example
"The software allows you to customize the configuration settings."
hypothetical
/ˌhaɪ.pəˈθet.ɪ.kəl/
adjective
based on or serving as a hypothesis.
📍 Example from the video
"That was a hypothetical example."
💡 Additional example
"Let's consider a hypothetical situation."
indexing
/ˈɪn.deks.ɪŋ/
verb
compiling items into a list or catalog so they can be found easily (present participle of "index").
📍 Example from the video
"And then it's acting by looking at clips in video footage, trying to identify what it thinks a skier is, indexing that clip, and then returning that clip to us."
💡 Additional example
"The library is currently indexing all its new acquisitions."