<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>LangChain on &lt;Vunb /></title><link>https://vunb.github.io/tags/langchain/</link><description>Recent content in LangChain on &lt;Vunb /></description><generator>Source Themes Academic (https://sourcethemes.com/academic/)</generator><language>en-us</language><copyright>Vunb &amp;copy; {year}</copyright><lastBuildDate>Thu, 14 May 2026 00:00:00 +0700</lastBuildDate><atom:link href="https://vunb.github.io/tags/langchain/index.xml" rel="self" type="application/rss+xml"/><item><title>Tool Use &amp; Function Calling — Trang bị công cụ cho AI Agent</title><link>https://vunb.github.io/tutorials/ai-agent/tool-use-va-function-calling-trang-bi-cong-cu-cho-ai-agent/</link><pubDate>Thu, 14 May 2026 00:00:00 +0700</pubDate><guid>https://vunb.github.io/tutorials/ai-agent/tool-use-va-function-calling-trang-bi-cong-cu-cho-ai-agent/</guid><description>&lt;h2 id="1-t-llm-tnh-sang-agent-c-hnh-ng-v-sao-cn-tool-use">1. Từ LLM tĩnh sang Agent có hành động: vì sao cần Tool Use?&lt;/h2>
&lt;p>Ở các bài trước, chúng ta đã xây dựng được một AI Agent biết &lt;strong>trả lời&lt;/strong> dựa trên kho tri thức RAG. Nhưng doanh nghiệp thực tế cần hơn thế:&lt;/p>
&lt;blockquote>
&lt;p>&amp;ldquo;Chatbot đã trả lời đúng rằng đơn hàng đang chờ duyệt — nhưng sao nó không &lt;strong>tự tạo ticket hỗ trợ&lt;/strong> hay &lt;strong>gửi email xác nhận&lt;/strong> luôn được?&amp;rdquo;&lt;/p>
&lt;/blockquote>
&lt;p>Đây là giới hạn cốt lõi của LLM thuần: mô hình ngôn ngữ &lt;strong>chỉ sinh ra văn bản&lt;/strong>. Nó không thể tự gọi API, truy vấn CSDL, đọc file thời gian thực hay gửi thông báo — trừ khi bạn trang bị cho nó &lt;strong>công cụ (Tools)&lt;/strong>.&lt;/p>
&lt;p>&lt;strong>Tool Use&lt;/strong> (còn gọi là &lt;strong>Function Calling&lt;/strong>) là cơ chế cho phép LLM:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Nhận diện&lt;/strong> khi nào cần dùng một công cụ bên ngoài&lt;/li>
&lt;li>&lt;strong>Tạo ra yêu cầu có cấu trúc&lt;/strong> (JSON) mô tả công cụ nào cần gọi và với tham số nào&lt;/li>
&lt;li>&lt;strong>Nhận kết quả&lt;/strong> từ công cụ đó rồi tổng hợp câu trả lời cuối cùng&lt;/li>
&lt;/ol>
&lt;p>Đây là bước chuyển đổi từ LLM &amp;ldquo;biết nói&amp;rdquo; sang AI Agent &amp;ldquo;biết làm&amp;rdquo; — và là nền tảng của mọi hệ thống tự động hóa thông minh.&lt;/p>
&lt;hr>
&lt;h2 id="2-m-hnh-hot-ng-ca-function-calling">2. Mô hình hoạt động của Function Calling&lt;/h2>
&lt;p>Giao thức Function Calling của OpenAI đã trở thành chuẩn thực tế (de-facto standard) trong ngành. Luồng hoạt động gồm 4 bước rõ ràng:&lt;/p>
&lt;pre>&lt;code>Người dùng / Orchestrator
│
│ [1] User message + danh sách tool definitions (JSON schema)
▼
┌─────────────┐
│ LLM │ Phân tích intent
│ (GPT-4o, │ → Quyết định dùng tool nào
│ Claude...) │ → Tạo tool_call JSON có cấu trúc
└──────┬──────┘
│
│ [2] tool_call: { name: &amp;quot;get_order_status&amp;quot;, args: { order_id: &amp;quot;ORD-001&amp;quot; } }
▼
┌─────────────────────┐
│ Tool Executor │ Nhận tool_call từ LLM
│ (Application Code) │ → Validate args
│ │ → Gọi API / DB / service thực tế
└──────────┬──────────┘
│
│ [3] tool_response: { status: &amp;quot;pending&amp;quot;, eta: &amp;quot;2026-05-15&amp;quot; }
▼
┌─────────────┐
│ LLM │ Nhận kết quả từ tool
│ (lần 2) │ → Tổng hợp câu trả lời tự nhiên
└──────┬──────┘
│
│ [4] Final answer: &amp;quot;Đơn hàng ORD-001 đang chờ duyệt, dự kiến giao 15/05.&amp;quot;
▼
Người dùng / Orchestrator
&lt;/code>&lt;/pre>&lt;h3 id="21-chi-tit-tng-bc">2.1. Chi tiết từng bước&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Bước&lt;/th>
&lt;th>Tên&lt;/th>
&lt;th>Ai thực hiện&lt;/th>
&lt;th>Nội dung&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>[1]&lt;/strong>&lt;/td>
&lt;td>Request&lt;/td>
&lt;td>App code&lt;/td>
&lt;td>Gửi message + &lt;code>tools[]&lt;/code> (JSON schema) lên LLM API&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>[2]&lt;/strong>&lt;/td>
&lt;td>Tool call&lt;/td>
&lt;td>LLM&lt;/td>
&lt;td>Trả về &lt;code>finish_reason: &amp;quot;tool_calls&amp;quot;&lt;/code> kèm JSON tham số&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>[3]&lt;/strong>&lt;/td>
&lt;td>Tool response&lt;/td>
&lt;td>App code&lt;/td>
&lt;td>Thực thi tool, gửi kết quả lại API với &lt;code>role: &amp;quot;tool&amp;quot;&lt;/code>&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>[4]&lt;/strong>&lt;/td>
&lt;td>Final answer&lt;/td>
&lt;td>LLM&lt;/td>
&lt;td>Nhận tool result, sinh câu trả lời tự nhiên cho người dùng&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;blockquote>
&lt;p>&lt;strong>Lưu ý quan trọng&lt;/strong>: LLM &lt;strong>không tự gọi tool&lt;/strong> — nó chỉ tạo ra &lt;strong>yêu cầu có cấu trúc&lt;/strong>. Phần thực thi là do application code của bạn đảm nhiệm. Đây vừa là thiết kế an toàn, vừa là điểm bạn phải tự kiểm soát.&lt;/p>
&lt;/blockquote>
&lt;hr>
&lt;h2 id="3-nh-ngha-toolfunction--cch-c-t-ng-chun">3. Định nghĩa Tool/Function — Cách đặc tả đúng chuẩn&lt;/h2>
&lt;p>Chất lượng định nghĩa tool quyết định trực tiếp độ chính xác khi LLM chọn và gọi tool. Một tool definition tốt phải có:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>name&lt;/strong>: tên ngắn gọn, snake_case, mô tả rõ hành động&lt;/li>
&lt;li>&lt;strong>description&lt;/strong>: mô tả chi tiết &lt;strong>khi nào dùng&lt;/strong> và &lt;strong>không dùng khi nào&lt;/strong> — đây là phần quan trọng nhất&lt;/li>
&lt;li>&lt;strong>parameters&lt;/strong>: JSON Schema đầy đủ, bao gồm type, description, required, enum nếu có&lt;/li>
&lt;/ul>
&lt;h3 id="31-v-d-3-tool-thc-t">3.1. Ví dụ 3 tool thực tế&lt;/h3>
&lt;p>&lt;strong>Tool 1 — Truy vấn trạng thái đơn hàng (Read)&lt;/strong>&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-json" data-lang="json">{
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;function&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;function&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;name&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;get_order_status&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Tra cứu trạng thái đơn hàng theo mã đơn. Dùng khi khách hỏi về tình trạng, tiến độ hoặc thời gian giao hàng. KHÔNG dùng để sửa hay hủy đơn.&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;parameters&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;object&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;properties&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;order_id&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;string&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Mã đơn hàng, định dạng ORD-XXXXXX hoặc số nguyên&amp;#34;&lt;/span>
},
&lt;span style="color:#f92672">&amp;#34;include_history&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;boolean&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Có trả về lịch sử trạng thái hay không. Mặc định false.&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;default&amp;#34;&lt;/span>: &lt;span style="color:#66d9ef">false&lt;/span>
}
},
&lt;span style="color:#f92672">&amp;#34;required&amp;#34;&lt;/span>: [&lt;span style="color:#e6db74">&amp;#34;order_id&amp;#34;&lt;/span>]
}
}
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>Tool 2 — Tạo ticket hỗ trợ (Write)&lt;/strong>&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-json" data-lang="json">{
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;function&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;function&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;name&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;create_support_ticket&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Tạo ticket hỗ trợ khi vấn đề không thể giải quyết tự động, hoặc khi khách hàng yêu cầu được hỗ trợ bởi nhân viên. KHÔNG tạo ticket nếu vấn đề đã được giải quyết.&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;parameters&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;object&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;properties&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;customer_id&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;string&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;ID khách hàng trong hệ thống CRM&amp;#34;&lt;/span>
},
&lt;span style="color:#f92672">&amp;#34;subject&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;string&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Tiêu đề ngắn gọn mô tả vấn đề, tối đa 100 ký tự&amp;#34;&lt;/span>
},
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;string&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Mô tả chi tiết vấn đề, bao gồm context từ hội thoại&amp;#34;&lt;/span>
},
&lt;span style="color:#f92672">&amp;#34;priority&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;string&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;enum&amp;#34;&lt;/span>: [&lt;span style="color:#e6db74">&amp;#34;low&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;medium&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;high&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;urgent&amp;#34;&lt;/span>],
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Mức độ ưu tiên. Chọn &amp;#39;urgent&amp;#39; chỉ khi ảnh hưởng đến giao dịch tài chính hoặc sức khỏe.&amp;#34;&lt;/span>
},
&lt;span style="color:#f92672">&amp;#34;category&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;string&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;enum&amp;#34;&lt;/span>: [&lt;span style="color:#e6db74">&amp;#34;billing&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;shipping&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;technical&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;product&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;other&amp;#34;&lt;/span>],
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Danh mục của ticket&amp;#34;&lt;/span>
}
},
&lt;span style="color:#f92672">&amp;#34;required&amp;#34;&lt;/span>: [&lt;span style="color:#e6db74">&amp;#34;customer_id&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;subject&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;description&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;priority&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;category&amp;#34;&lt;/span>]
}
}
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>Tool 3 — Gửi thông báo (Notify)&lt;/strong>&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-json" data-lang="json">{
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;function&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;function&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;name&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;send_notification&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Gửi thông báo tới khách hàng qua kênh ưu tiên của họ (email hoặc SMS). Chỉ dùng sau khi đã hoàn thành một hành động cụ thể cần xác nhận. KHÔNG spam — không gửi quá 1 thông báo cho cùng một sự kiện.&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;parameters&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;object&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;properties&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;customer_id&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;string&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;ID khách hàng&amp;#34;&lt;/span>
},
&lt;span style="color:#f92672">&amp;#34;channel&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;string&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;enum&amp;#34;&lt;/span>: [&lt;span style="color:#e6db74">&amp;#34;email&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;sms&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;zalo&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;push&amp;#34;&lt;/span>],
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Kênh gửi thông báo&amp;#34;&lt;/span>
},
&lt;span style="color:#f92672">&amp;#34;template_id&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;string&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;ID mẫu thông báo đã được phê duyệt trong hệ thống&amp;#34;&lt;/span>
},
&lt;span style="color:#f92672">&amp;#34;variables&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;object&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;description&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Các biến điền vào template, ví dụ: { order_id, status, eta }&amp;#34;&lt;/span>
}
},
&lt;span style="color:#f92672">&amp;#34;required&amp;#34;&lt;/span>: [&lt;span style="color:#e6db74">&amp;#34;customer_id&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;channel&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;template_id&amp;#34;&lt;/span>]
}
}
}
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="32-nguyn-tc-vit-description-hiu-qu">3.2. Nguyên tắc viết description hiệu quả&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Nguyên tắc&lt;/th>
&lt;th>Ví dụ xấu&lt;/th>
&lt;th>Ví dụ tốt&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>Nói rõ &lt;strong>khi nào&lt;/strong> dùng&lt;/td>
&lt;td>&amp;ldquo;Lấy thông tin đơn hàng&amp;rdquo;&lt;/td>
&lt;td>&amp;ldquo;Dùng khi khách hỏi về trạng thái, tiến độ hoặc ETA giao hàng&amp;rdquo;&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Nói rõ &lt;strong>không&lt;/strong> dùng khi nào&lt;/td>
&lt;td>&lt;em>(bỏ qua)&lt;/em>&lt;/td>
&lt;td>&amp;ldquo;KHÔNG dùng để sửa hay hủy đơn&amp;rdquo;&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Mô tả &lt;strong>format&lt;/strong> tham số&lt;/td>
&lt;td>&amp;ldquo;mã đơn&amp;rdquo;&lt;/td>
&lt;td>&amp;ldquo;mã đơn hàng, định dạng ORD-XXXXXX&amp;rdquo;&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Đặt &lt;strong>enum&lt;/strong> cho giá trị giới hạn&lt;/td>
&lt;td>&lt;code>&amp;quot;type&amp;quot;: &amp;quot;string&amp;quot;&lt;/code>&lt;/td>
&lt;td>&lt;code>&amp;quot;enum&amp;quot;: [&amp;quot;low&amp;quot;,&amp;quot;medium&amp;quot;,&amp;quot;high&amp;quot;,&amp;quot;urgent&amp;quot;]&lt;/code>&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="4-chin-lc-thit-k-b-cng-c-tool-catalog">4. Chiến lược thiết kế bộ công cụ (Tool Catalog)&lt;/h2>
&lt;p>Khi hệ thống phát triển, bạn sẽ có hàng chục tool. Cần thiết kế có hệ thống ngay từ đầu.&lt;/p>
&lt;h3 id="41-phn-loi-tool-theo-tc-ng">4.1. Phân loại tool theo tác động&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Nhóm&lt;/th>
&lt;th>Đặc điểm&lt;/th>
&lt;th>Ví dụ&lt;/th>
&lt;th>Yêu cầu bảo mật&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>Read&lt;/strong>&lt;/td>
&lt;td>Chỉ đọc, không thay đổi trạng thái&lt;/td>
&lt;td>&lt;code>get_order_status&lt;/code>, &lt;code>search_product&lt;/code>, &lt;code>check_stock&lt;/code>&lt;/td>
&lt;td>Thấp — cần authn&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Write&lt;/strong>&lt;/td>
&lt;td>Thay đổi dữ liệu/trạng thái hệ thống&lt;/td>
&lt;td>&lt;code>create_ticket&lt;/code>, &lt;code>update_order&lt;/code>, &lt;code>cancel_subscription&lt;/code>&lt;/td>
&lt;td>Cao — cần authn + authz + audit log&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Notify&lt;/strong>&lt;/td>
&lt;td>Gửi thông tin ra bên ngoài&lt;/td>
&lt;td>&lt;code>send_email&lt;/code>, &lt;code>send_sms&lt;/code>, &lt;code>push_notification&lt;/code>&lt;/td>
&lt;td>Trung bình — cần rate limiting&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Compute&lt;/strong>&lt;/td>
&lt;td>Tính toán, xử lý dữ liệu&lt;/td>
&lt;td>&lt;code>calculate_price&lt;/code>, &lt;code>generate_report&lt;/code>, &lt;code>summarize_data&lt;/code>&lt;/td>
&lt;td>Thấp — cần timeout&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>External&lt;/strong>&lt;/td>
&lt;td>Gọi API bên thứ ba&lt;/td>
&lt;td>&lt;code>call_payment_gateway&lt;/code>, &lt;code>query_shipping_api&lt;/code>&lt;/td>
&lt;td>Cao — cần circuit breaker&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="42-nguyn-tc-thit-k-tool-catalog">4.2. Nguyên tắc thiết kế Tool Catalog&lt;/h3>
&lt;p>&lt;strong>Scope — Giới hạn rõ phạm vi:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Mỗi tool chỉ làm &lt;strong>một việc cụ thể&lt;/strong>, không làm nhiều thứ trong một lần gọi&lt;/li>
&lt;li>Tên tool phải nói lên hành động: &lt;code>get_&lt;/code>, &lt;code>create_&lt;/code>, &lt;code>update_&lt;/code>, &lt;code>delete_&lt;/code>, &lt;code>send_&lt;/code>, &lt;code>calculate_&lt;/code>&lt;/li>
&lt;li>Tránh tool quá generic như &lt;code>do_action&lt;/code> hay &lt;code>handle_request&lt;/code>&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Granularity — Độ chi tiết phù hợp:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Tool quá nhỏ → LLM phải gọi nhiều bước → tăng latency, tăng chi phí&lt;/li>
&lt;li>Tool quá lớn → khó kiểm soát, khó audit, khó reuse&lt;/li>
&lt;li>&lt;strong>Nguyên tắc vàng&lt;/strong>: một tool nên hoàn thành một &lt;strong>business unit of work&lt;/strong> có thể audit độc lập&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Idempotency — Thiết kế để an toàn khi retry:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Read tools: luôn idempotent&lt;/li>
&lt;li>Write tools: bắt buộc phải idempotent (dùng idempotency key)&lt;/li>
&lt;li>Notify tools: implement deduplication để tránh gửi trùng&lt;/li>
&lt;/ul>
&lt;h3 id="43-mu-tool-catalog-cho-d-n-customer-support">4.3. Mẫu Tool Catalog cho dự án Customer Support&lt;/h3>
&lt;pre>&lt;code>Tool Catalog — Customer Support Agent
├── READ
│ ├── get_order_status(order_id)
│ ├── get_customer_profile(customer_id)
│ ├── search_faq(query)
│ └── check_product_availability(sku, quantity)
├── WRITE
│ ├── create_support_ticket(customer_id, subject, description, priority, category)
│ ├── update_ticket_status(ticket_id, status, note)
│ └── schedule_callback(customer_id, datetime, agent_id?)
├── NOTIFY
│ ├── send_notification(customer_id, channel, template_id, variables)
│ └── escalate_to_human(ticket_id, reason, urgency)
└── COMPUTE
└── calculate_refund_amount(order_id, items, reason)
&lt;/code>&lt;/pre>&lt;hr>
&lt;h2 id="5-tool-routing-cch-llm-chn-ng-tool">5. Tool Routing: Cách LLM chọn đúng Tool&lt;/h2>
&lt;p>Hiểu rõ cơ chế routing giúp bạn thiết kế tool catalog và điều chỉnh khi agent chọn sai tool.&lt;/p>
&lt;h3 id="51-ba-ch--tool-choice">5.1. Ba chế độ tool_choice&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Chế độ&lt;/th>
&lt;th>Cấu hình&lt;/th>
&lt;th>Khi nào dùng&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>auto&lt;/strong>&lt;/td>
&lt;td>&lt;code>tool_choice: &amp;quot;auto&amp;quot;&lt;/code>&lt;/td>
&lt;td>Mặc định — LLM tự quyết định có dùng tool không&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>required&lt;/strong>&lt;/td>
&lt;td>&lt;code>tool_choice: &amp;quot;required&amp;quot;&lt;/code>&lt;/td>
&lt;td>Bắt buộc LLM phải gọi ít nhất một tool&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>forced&lt;/strong>&lt;/td>
&lt;td>&lt;code>tool_choice: { type: &amp;quot;function&amp;quot;, function: { name: &amp;quot;get_order_status&amp;quot; } }&lt;/code>&lt;/td>
&lt;td>Buộc LLM gọi đúng tool chỉ định&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>none&lt;/strong>&lt;/td>
&lt;td>&lt;code>tool_choice: &amp;quot;none&amp;quot;&lt;/code>&lt;/td>
&lt;td>Tắt tool use — LLM chỉ trả lời bằng text&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="52-parallel-tool-calls">5.2. Parallel Tool Calls&lt;/h3>
&lt;p>OpenAI GPT-4o và nhiều mô hình hiện đại hỗ trợ gọi &lt;strong>nhiều tool song song&lt;/strong> trong một lần:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-json" data-lang="json">&lt;span style="color:#960050;background-color:#1e0010">/&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">/&lt;/span> &lt;span style="color:#960050;background-color:#1e0010">L&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">L&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">M&lt;/span> &lt;span style="color:#960050;background-color:#1e0010">t&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">r&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">ả&lt;/span> &lt;span style="color:#960050;background-color:#1e0010">v&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">ề&lt;/span> &lt;span style="color:#960050;background-color:#1e0010">n&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">h&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">i&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">ề&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">u&lt;/span> &lt;span style="color:#960050;background-color:#1e0010">t&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">o&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">o&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">l&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">_&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">c&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">a&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">l&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">l&lt;/span> &lt;span style="color:#960050;background-color:#1e0010">c&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">ù&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">n&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">g&lt;/span> &lt;span style="color:#960050;background-color:#1e0010">l&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">ú&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">c&lt;/span>
{
&lt;span style="color:#f92672">&amp;#34;tool_calls&amp;#34;&lt;/span>: [
{
&lt;span style="color:#f92672">&amp;#34;id&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;call_abc&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;function&amp;#34;&lt;/span>: { &lt;span style="color:#f92672">&amp;#34;name&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;get_order_status&amp;#34;&lt;/span>, &lt;span style="color:#f92672">&amp;#34;arguments&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;{\&amp;#34;order_id\&amp;#34;: \&amp;#34;ORD-001\&amp;#34;}&amp;#34;&lt;/span> }
},
{
&lt;span style="color:#f92672">&amp;#34;id&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;call_def&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;function&amp;#34;&lt;/span>: { &lt;span style="color:#f92672">&amp;#34;name&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;get_customer_profile&amp;#34;&lt;/span>, &lt;span style="color:#f92672">&amp;#34;arguments&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;{\&amp;#34;customer_id\&amp;#34;: \&amp;#34;CUS-123\&amp;#34;}&amp;#34;&lt;/span> }
}
]
}
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>Lợi ích&lt;/strong>: giảm round-trip, giảm latency đáng kể khi cần nhiều dữ liệu độc lập nhau.&lt;br>
&lt;strong>Lưu ý&lt;/strong>: phải xử lý đồng thời ở phía application code (async/await hoặc Task.WhenAll).&lt;/p>
&lt;h3 id="53-k-thut-ci-thin--chnh-xc-routing">5.3. Kỹ thuật cải thiện độ chính xác routing&lt;/h3>
&lt;ol>
&lt;li>&lt;strong>Description rõ ràng&lt;/strong> — đây là tín hiệu chính để LLM phân biệt tool nào phù hợp&lt;/li>
&lt;li>&lt;strong>Tên tool nhất quán&lt;/strong> — prefix theo nhóm (get_, create_, send_) giúp LLM nhận pattern&lt;/li>
&lt;li>&lt;strong>Giới hạn số tool&lt;/strong> — gửi tối đa 10–15 tool liên quan theo context, không gửi toàn bộ catalog&lt;/li>
&lt;li>&lt;strong>Tool examples trong system prompt&lt;/strong> — cung cấp ví dụ khi nào dùng tool nào nếu tool catalog phức tạp&lt;/li>
&lt;li>&lt;strong>Sử dụng forced tool call&lt;/strong> trong các workflow có bước cụ thể (ví dụ: bắt buộc verify trước khi write)&lt;/li>
&lt;/ol>
&lt;hr>
&lt;h2 id="6-mcp--model-context-protocol">6. MCP — Model Context Protocol&lt;/h2>
&lt;h3 id="61-mcp-l-g">6.1. MCP là gì?&lt;/h3>
&lt;p>&lt;strong>MCP (Model Context Protocol)&lt;/strong> là giao thức mở do Anthropic đề xuất (2024) và được nhiều công ty áp dụng, nhằm &lt;strong>chuẩn hóa cách LLM tương tác với tool/resource bên ngoài&lt;/strong> theo mô hình client–server.&lt;/p>
&lt;p>Nếu Function Calling là cơ chế &amp;ldquo;điện thoại trực tiếp&amp;rdquo; giữa LLM và một tool, thì MCP là &amp;ldquo;tổng đài&amp;rdquo; — một lớp trừu tượng chuẩn hóa, có thể kết nối nhiều tool/resource từ nhiều nguồn.&lt;/p>
&lt;h3 id="62-kin-trc-mcp">6.2. Kiến trúc MCP&lt;/h3>
&lt;pre>&lt;code>┌──────────────────────────────────────────────────────────┐
│ MCP Host (Application) │
│ ┌─────────────┐ │
│ │ LLM / AI │◀──── tool_call / tool_response │
│ │ Engine │ │
│ └──────┬──────┘ │
│ │ MCP Protocol (JSON-RPC 2.0 over stdio / HTTP) │
│ ┌──────▼──────────────────────────────────────────┐ │
│ │ MCP Client (built-in) │ │
│ └──────┬───────────────┬───────────────┬──────────┘ │
└─────────┼───────────────┼───────────────┼────────────────┘
│ │ │
▼ ▼ ▼
┌────────────┐ ┌────────────┐ ┌────────────┐
│ MCP Server │ │ MCP Server │ │ MCP Server │
│ (Database) │ │ (REST API) │ │ (File Sys) │
│ │ │ │ │ │
│ Tools: │ │ Tools: │ │ Tools: │
│ query_db │ │ call_crm │ │ read_file │
│ write_db │ │ send_email │ │ list_files │
└────────────┘ └────────────┘ └────────────┘
&lt;/code>&lt;/pre>&lt;h3 id="63-so-snh-direct-function-call-vs-mcp">6.3. So sánh Direct Function Call vs MCP&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Tiêu chí&lt;/th>
&lt;th>Direct Function Calling&lt;/th>
&lt;th>MCP&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>Độ phức tạp triển khai&lt;/strong>&lt;/td>
&lt;td>Thấp — code trực tiếp&lt;/td>
&lt;td>Trung bình — cần MCP server&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Tái sử dụng tool&lt;/strong>&lt;/td>
&lt;td>Thấp — gắn chặt với app&lt;/td>
&lt;td>Cao — server dùng được với nhiều app/LLM&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Tiêu chuẩn hóa&lt;/strong>&lt;/td>
&lt;td>Mỗi provider khác nhau&lt;/td>
&lt;td>Giao thức chung, vendor-neutral&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Bảo mật&lt;/strong>&lt;/td>
&lt;td>App tự kiểm soát&lt;/td>
&lt;td>MCP server có thể implement auth riêng&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Phù hợp với&lt;/strong>&lt;/td>
&lt;td>MVP, tool ít, team nhỏ&lt;/td>
&lt;td>Platform, nhiều app dùng chung tool&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Hỗ trợ streaming&lt;/strong>&lt;/td>
&lt;td>Có (SSE)&lt;/td>
&lt;td>Có (SSE qua HTTP transport)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Ecosystem&lt;/strong>&lt;/td>
&lt;td>OpenAI, Anthropic, Google&lt;/td>
&lt;td>Claude Desktop, Cursor, VS Code, nhiều IDE&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>&lt;strong>Khuyến nghị thực chiến:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Bắt đầu với Direct Function Calling&lt;/strong> — đơn giản, deploy nhanh, dễ debug&lt;/li>
&lt;li>&lt;strong>Chuyển sang MCP&lt;/strong> khi có từ 3+ ứng dụng dùng chung tool set, hoặc cần tool marketplace nội bộ&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="7-so-snh-framework-openai-native-vs-langchain-vs-semantic-kernel">7. So sánh Framework: OpenAI Native vs LangChain vs Semantic Kernel&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Tiêu chí&lt;/th>
&lt;th>OpenAI Native Function Calling&lt;/th>
&lt;th>LangChain Tools&lt;/th>
&lt;th>Semantic Kernel Plugins&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>Ngôn ngữ&lt;/strong>&lt;/td>
&lt;td>Any (REST API)&lt;/td>
&lt;td>Python chính, JS beta&lt;/td>
&lt;td>C# / Python / Java&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Độ phức tạp&lt;/strong>&lt;/td>
&lt;td>Thấp — gần với API thô&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Trung bình–Cao&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Abstraction&lt;/strong>&lt;/td>
&lt;td>Tối thiểu&lt;/td>
&lt;td>Cao&lt;/td>
&lt;td>Cao&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Multi-LLM&lt;/strong>&lt;/td>
&lt;td>Chỉ OpenAI&lt;/td>
&lt;td>✅ Nhiều provider&lt;/td>
&lt;td>✅ Nhiều provider&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Agent loop&lt;/strong>&lt;/td>
&lt;td>Tự code&lt;/td>
&lt;td>✅ AgentExecutor, LangGraph&lt;/td>
&lt;td>✅ Planner, Agents&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Tool auto-discovery&lt;/strong>&lt;/td>
&lt;td>❌ Thủ công&lt;/td>
&lt;td>✅ Tool registry&lt;/td>
&lt;td>✅ Plugin auto-import&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Memory/State&lt;/strong>&lt;/td>
&lt;td>❌ Tự xử lý&lt;/td>
&lt;td>✅ Memory modules&lt;/td>
&lt;td>✅ KernelMemory&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Enterprise .NET&lt;/strong>&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Thấp&lt;/td>
&lt;td>✅ Xuất sắc&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Tài liệu/Community&lt;/strong>&lt;/td>
&lt;td>✅ Rất tốt&lt;/td>
&lt;td>✅ Rất tốt&lt;/td>
&lt;td>✅ Tốt (Microsoft)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Phù hợp với&lt;/strong>&lt;/td>
&lt;td>Script nhanh, prototype&lt;/td>
&lt;td>Python AI team&lt;/td>
&lt;td>Enterprise .NET, Azure&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="8-trin-khai-thc-t">8. Triển khai thực tế&lt;/h2>
&lt;h3 id="81-c--semantic-kernel-plugin">8.1. C# — Semantic Kernel Plugin&lt;/h3>
&lt;p>Semantic Kernel dùng khái niệm &lt;strong>Plugin&lt;/strong> (tương đương Tool Catalog) với attribute-based definition, rất gần với thiết kế .NET:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-csharp" data-lang="csharp">&lt;span style="color:#66d9ef">using&lt;/span> Microsoft.SemanticKernel;
&lt;span style="color:#66d9ef">using&lt;/span> System.ComponentModel;
&lt;span style="color:#75715e">// ============================================================
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">// Bước 1: Định nghĩa Plugin với KernelFunction attributes
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">// ============================================================
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">class&lt;/span> &lt;span style="color:#a6e22e">OrderPlugin&lt;/span>
{
&lt;span style="color:#66d9ef">private&lt;/span> &lt;span style="color:#66d9ef">readonly&lt;/span> IOrderService _orderService;
&lt;span style="color:#66d9ef">private&lt;/span> &lt;span style="color:#66d9ef">readonly&lt;/span> ITicketService _ticketService;
&lt;span style="color:#66d9ef">public&lt;/span> OrderPlugin(IOrderService orderService, ITicketService ticketService)
{
_orderService = orderService;
_ticketService = ticketService;
}
&lt;span style="color:#a6e22e">
&lt;/span>&lt;span style="color:#a6e22e"> [KernelFunction(&amp;#34;get_order_status&amp;#34;)]&lt;/span>
&lt;span style="color:#a6e22e"> [Description(&amp;#34;Tra cứu trạng thái đơn hàng theo mã đơn. Dùng khi khách hỏi về tình trạng, tiến độ hoặc ETA giao hàng. KHÔNG dùng để sửa hay hủy đơn.&amp;#34;)]&lt;/span>
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">async&lt;/span> Task&amp;lt;&lt;span style="color:#66d9ef">string&lt;/span>&amp;gt; GetOrderStatusAsync(
&lt;span style="color:#a6e22e"> [Description(&amp;#34;Mã đơn hàng, định dạng ORD-XXXXXX&amp;#34;)]&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> orderId,
&lt;span style="color:#a6e22e"> [Description(&amp;#34;Có trả về lịch sử trạng thái không, mặc định false&amp;#34;)]&lt;/span> &lt;span style="color:#66d9ef">bool&lt;/span> includeHistory = &lt;span style="color:#66d9ef">false&lt;/span>)
{
&lt;span style="color:#66d9ef">var&lt;/span> order = &lt;span style="color:#66d9ef">await&lt;/span> _orderService.GetOrderAsync(orderId);
&lt;span style="color:#66d9ef">if&lt;/span> (order == &lt;span style="color:#66d9ef">null&lt;/span>)
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#e6db74">$&amp;#34;Không tìm thấy đơn hàng với mã {orderId}.&amp;#34;&lt;/span>;
&lt;span style="color:#66d9ef">var&lt;/span> result = &lt;span style="color:#e6db74">$&amp;#34;Đơn {orderId}: {order.Status} | ETA: {order.EstimatedDelivery:dd/MM/yyyy}&amp;#34;&lt;/span>;
&lt;span style="color:#66d9ef">if&lt;/span> (includeHistory &amp;amp;&amp;amp; order.StatusHistory?.Any() == &lt;span style="color:#66d9ef">true&lt;/span>)
{
&lt;span style="color:#66d9ef">var&lt;/span> history = &lt;span style="color:#66d9ef">string&lt;/span>.Join(&lt;span style="color:#e6db74">&amp;#34;\n&amp;#34;&lt;/span>, order.StatusHistory.Select(h =&amp;gt; &lt;span style="color:#e6db74">$&amp;#34; - {h.Date:dd/MM} {h.Status}&amp;#34;&lt;/span>));
result += &lt;span style="color:#e6db74">$&amp;#34;\nLịch sử:\n{history}&amp;#34;&lt;/span>;
}
&lt;span style="color:#66d9ef">return&lt;/span> result;
}
&lt;span style="color:#a6e22e">
&lt;/span>&lt;span style="color:#a6e22e"> [KernelFunction(&amp;#34;create_support_ticket&amp;#34;)]&lt;/span>
&lt;span style="color:#a6e22e"> [Description(&amp;#34;Tạo ticket hỗ trợ khi vấn đề cần nhân viên xử lý hoặc không thể tự động giải quyết. KHÔNG tạo ticket nếu đã giải quyết xong.&amp;#34;)]&lt;/span>
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">async&lt;/span> Task&amp;lt;&lt;span style="color:#66d9ef">string&lt;/span>&amp;gt; CreateSupportTicketAsync(
&lt;span style="color:#a6e22e"> [Description(&amp;#34;ID khách hàng trong CRM&amp;#34;)]&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> customerId,
&lt;span style="color:#a6e22e"> [Description(&amp;#34;Tiêu đề ngắn gọn, tối đa 100 ký tự&amp;#34;)]&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> subject,
&lt;span style="color:#a6e22e"> [Description(&amp;#34;Mô tả chi tiết vấn đề&amp;#34;)]&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> description,
&lt;span style="color:#a6e22e"> [Description(&amp;#34;Mức ưu tiên: low, medium, high, urgent&amp;#34;)]&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> priority = &lt;span style="color:#e6db74">&amp;#34;medium&amp;#34;&lt;/span>)
{
&lt;span style="color:#75715e">// Human-in-the-loop: log để audit trước khi write
&lt;/span>&lt;span style="color:#75715e">&lt;/span> Console.WriteLine(&lt;span style="color:#e6db74">$&amp;#34;[AUDIT] Creating ticket for customer {customerId}: {subject}&amp;#34;&lt;/span>);
&lt;span style="color:#66d9ef">var&lt;/span> ticket = &lt;span style="color:#66d9ef">await&lt;/span> _ticketService.CreateAsync(&lt;span style="color:#66d9ef">new&lt;/span> CreateTicketRequest
{
CustomerId = customerId,
Subject = subject,
Description = description,
Priority = Enum.Parse&amp;lt;TicketPriority&amp;gt;(priority, ignoreCase: &lt;span style="color:#66d9ef">true&lt;/span>)
});
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#e6db74">$&amp;#34;Đã tạo ticket #{ticket.Id}. Nhân viên sẽ liên hệ trong {ticket.SlaHours} giờ.&amp;#34;&lt;/span>;
}
}
&lt;span style="color:#75715e">// ============================================================
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">// Bước 2: Đăng ký Plugin và chạy Agent loop
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">// ============================================================
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">class&lt;/span> &lt;span style="color:#a6e22e">AgentService&lt;/span>
{
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">async&lt;/span> Task&amp;lt;&lt;span style="color:#66d9ef">string&lt;/span>&amp;gt; HandleUserMessageAsync(&lt;span style="color:#66d9ef">string&lt;/span> userMessage, &lt;span style="color:#66d9ef">string&lt;/span> customerId)
{
&lt;span style="color:#75715e">// Khởi tạo Kernel
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">var&lt;/span> builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion(
modelId: &lt;span style="color:#e6db74">&amp;#34;gpt-4o-mini&amp;#34;&lt;/span>,
apiKey: Environment.GetEnvironmentVariable(&lt;span style="color:#e6db74">&amp;#34;OPENAI_API_KEY&amp;#34;&lt;/span>)!);
&lt;span style="color:#66d9ef">var&lt;/span> kernel = builder.Build();
&lt;span style="color:#75715e">// Đăng ký plugin
&lt;/span>&lt;span style="color:#75715e">&lt;/span> kernel.Plugins.AddFromObject(
&lt;span style="color:#66d9ef">new&lt;/span> OrderPlugin(orderService, ticketService),
pluginName: &lt;span style="color:#e6db74">&amp;#34;OrderPlugin&amp;#34;&lt;/span>);
&lt;span style="color:#75715e">// Cấu hình auto tool invocation
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">var&lt;/span> executionSettings = &lt;span style="color:#66d9ef">new&lt;/span> OpenAIPromptExecutionSettings
{
ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions
};
&lt;span style="color:#75715e">// System prompt
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">var&lt;/span> systemPrompt = &lt;span style="color:#e6db74">$&amp;#34;&amp;#34;&amp;#34;
&lt;/span>&lt;span style="color:#e6db74"> Bạn là trợ lý hỗ trợ khách hàng của Công ty ABC.
&lt;/span>&lt;span style="color:#e6db74"> ID khách hàng hiện tại: {customerId}
&lt;/span>&lt;span style="color:#e6db74"> Nguyên tắc:
&lt;/span>&lt;span style="color:#e6db74"> - Chỉ sử dụng tool khi thực sự cần thiết
&lt;/span>&lt;span style="color:#e6db74"> - Không tạo ticket nếu đã giải quyết được bằng thông tin có sẵn
&lt;/span>&lt;span style="color:#e6db74"> - Luôn xác nhận lại với khách trước khi tạo ticket hoặc gửi thông báo
&lt;/span>&lt;span style="color:#e6db74"> &amp;#34;&amp;#34;&amp;#34;&lt;/span>;
&lt;span style="color:#66d9ef">var&lt;/span> history = &lt;span style="color:#66d9ef">new&lt;/span> ChatHistory(systemPrompt);
history.AddUserMessage(userMessage);
&lt;span style="color:#75715e">// Gọi LLM với auto tool invocation
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">var&lt;/span> chatService = kernel.GetRequiredService&amp;lt;IChatCompletionService&amp;gt;();
&lt;span style="color:#66d9ef">var&lt;/span> result = &lt;span style="color:#66d9ef">await&lt;/span> chatService.GetChatMessageContentAsync(
history,
executionSettings,
kernel);
&lt;span style="color:#66d9ef">return&lt;/span> result.Content ?? &lt;span style="color:#e6db74">&amp;#34;Xin lỗi, tôi không thể xử lý yêu cầu này lúc này.&amp;#34;&lt;/span>;
}
}
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="82-python--langchain-tools">8.2. Python — LangChain Tools&lt;/h3>
&lt;p>LangChain dùng decorator &lt;code>@tool&lt;/code> để đăng ký function thành tool, rất Pythonic:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-python" data-lang="python">&lt;span style="color:#f92672">from&lt;/span> langchain_core.tools &lt;span style="color:#f92672">import&lt;/span> tool
&lt;span style="color:#f92672">from&lt;/span> langchain_openai &lt;span style="color:#f92672">import&lt;/span> ChatOpenAI
&lt;span style="color:#f92672">from&lt;/span> langchain.agents &lt;span style="color:#f92672">import&lt;/span> AgentExecutor, create_openai_tools_agent
&lt;span style="color:#f92672">from&lt;/span> langchain_core.prompts &lt;span style="color:#f92672">import&lt;/span> ChatPromptTemplate, MessagesPlaceholder
&lt;span style="color:#f92672">from&lt;/span> typing &lt;span style="color:#f92672">import&lt;/span> Optional
&lt;span style="color:#f92672">import&lt;/span> logging
logger &lt;span style="color:#f92672">=&lt;/span> logging&lt;span style="color:#f92672">.&lt;/span>getLogger(__name__)
&lt;span style="color:#75715e"># ============================================================&lt;/span>
&lt;span style="color:#75715e"># Bước 1: Định nghĩa Tools bằng decorator&lt;/span>
&lt;span style="color:#75715e"># ============================================================&lt;/span>
&lt;span style="color:#a6e22e">@tool&lt;/span>
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">get_order_status&lt;/span>(order_id: str, include_history: bool &lt;span style="color:#f92672">=&lt;/span> False) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> str:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Tra cứu trạng thái đơn hàng theo mã đơn.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Dùng khi khách hỏi về tình trạng, tiến độ hoặc ETA giao hàng.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> KHÔNG dùng để sửa hay hủy đơn.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Args:&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> order_id: Mã đơn hàng, định dạng ORD-XXXXXX&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> include_history: Có trả về lịch sử trạng thái không, mặc định False&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;span style="color:#75715e"># Gọi service thực tế (minh hoạ)&lt;/span>
order &lt;span style="color:#f92672">=&lt;/span> order_service&lt;span style="color:#f92672">.&lt;/span>get(order_id)
&lt;span style="color:#66d9ef">if&lt;/span> &lt;span style="color:#f92672">not&lt;/span> order:
&lt;span style="color:#66d9ef">return&lt;/span> f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Không tìm thấy đơn hàng với mã {order_id}.&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
result &lt;span style="color:#f92672">=&lt;/span> f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Đơn {order_id}: {order[&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">status&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">]} | ETA: {order[&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">eta&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">]}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> include_history &lt;span style="color:#f92672">and&lt;/span> order&lt;span style="color:#f92672">.&lt;/span>get(&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">history&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>):
history_str &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#ae81ff">\n&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#f92672">.&lt;/span>join(f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74"> - {h[&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">date&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">]} {h[&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">status&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">]}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span> &lt;span style="color:#66d9ef">for&lt;/span> h &lt;span style="color:#f92672">in&lt;/span> order[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">history&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>])
result &lt;span style="color:#f92672">+&lt;/span>&lt;span style="color:#f92672">=&lt;/span> f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#ae81ff">\n&lt;/span>&lt;span style="color:#e6db74">Lịch sử:&lt;/span>&lt;span style="color:#ae81ff">\n&lt;/span>&lt;span style="color:#e6db74">{history_str}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
&lt;span style="color:#66d9ef">return&lt;/span> result
&lt;span style="color:#a6e22e">@tool&lt;/span>
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">create_support_ticket&lt;/span>(
customer_id: str,
subject: str,
description: str,
priority: str &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">medium&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>,
category: str &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">other&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> str:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Tạo ticket hỗ trợ khi vấn đề cần nhân viên xử lý.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> KHÔNG tạo ticket nếu đã giải quyết được bằng thông tin có sẵn.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Args:&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> customer_id: ID khách hàng trong CRM&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> subject: Tiêu đề ngắn gọn (tối đa 100 ký tự)&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> description: Mô tả chi tiết vấn đề&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> priority: Mức ưu tiên (low/medium/high/urgent)&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> category: Danh mục (billing/shipping/technical/product/other)&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;span style="color:#75715e"># Audit log trước khi write&lt;/span>
logger&lt;span style="color:#f92672">.&lt;/span>info(f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">[AUDIT] Creating ticket | customer={customer_id} | priority={priority}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>)
&lt;span style="color:#75715e"># Validate priority&lt;/span>
valid_priorities &lt;span style="color:#f92672">=&lt;/span> {&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">low&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">medium&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">high&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">urgent&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>}
&lt;span style="color:#66d9ef">if&lt;/span> priority &lt;span style="color:#f92672">not&lt;/span> &lt;span style="color:#f92672">in&lt;/span> valid_priorities:
&lt;span style="color:#66d9ef">return&lt;/span> f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Priority không hợp lệ. Chọn một trong: {&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">, &lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">.join(valid_priorities)}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
ticket &lt;span style="color:#f92672">=&lt;/span> ticket_service&lt;span style="color:#f92672">.&lt;/span>create({
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">customer_id&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: customer_id,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">subject&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: subject[:&lt;span style="color:#ae81ff">100&lt;/span>], &lt;span style="color:#75715e"># enforce max length&lt;/span>
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">description&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: description,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">priority&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: priority,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">category&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: category
})
sla_map &lt;span style="color:#f92672">=&lt;/span> {&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">urgent&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">1&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">high&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">4&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">medium&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">8&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">low&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">24&lt;/span>}
&lt;span style="color:#66d9ef">return&lt;/span> f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Đã tạo ticket #{ticket[&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">id&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">]}. Nhân viên sẽ phản hồi trong {sla_map[priority]} giờ.&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
&lt;span style="color:#a6e22e">@tool&lt;/span>
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">search_faq&lt;/span>(query: str) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> str:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Tìm kiếm trong cơ sở tri thức FAQ.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Dùng trước khi tạo ticket để xem có thể tự giải quyết không.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Args:&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> query: Câu hỏi hoặc từ khoá cần tìm kiếm&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
results &lt;span style="color:#f92672">=&lt;/span> rag_service&lt;span style="color:#f92672">.&lt;/span>search(query, top_k&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">3&lt;/span>)
&lt;span style="color:#66d9ef">if&lt;/span> &lt;span style="color:#f92672">not&lt;/span> results:
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Không tìm thấy thông tin liên quan trong FAQ.&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#ae81ff">\n&lt;/span>&lt;span style="color:#ae81ff">\n&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#f92672">.&lt;/span>join(
f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">[{i+1}] {r[&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">title&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">]}&lt;/span>&lt;span style="color:#ae81ff">\n&lt;/span>&lt;span style="color:#e6db74">{r[&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">content&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">][:300]}...&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
&lt;span style="color:#66d9ef">for&lt;/span> i, r &lt;span style="color:#f92672">in&lt;/span> enumerate(results)
)
&lt;span style="color:#75715e"># ============================================================&lt;/span>
&lt;span style="color:#75715e"># Bước 2: Khởi tạo Agent với tool list&lt;/span>
&lt;span style="color:#75715e"># ============================================================&lt;/span>
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">create_customer_support_agent&lt;/span>(customer_id: str) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> AgentExecutor:
llm &lt;span style="color:#f92672">=&lt;/span> ChatOpenAI(model&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">gpt-4o-mini&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, temperature&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">0&lt;/span>)
tools &lt;span style="color:#f92672">=&lt;/span> [get_order_status, create_support_ticket, search_faq]
prompt &lt;span style="color:#f92672">=&lt;/span> ChatPromptTemplate&lt;span style="color:#f92672">.&lt;/span>from_messages([
(&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">system&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Bạn là trợ lý hỗ trợ khách hàng của Công ty ABC.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">ID khách hàng: {customer_id}&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">Nguyên tắc:&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">1. Luôn tìm kiếm FAQ trước khi tạo ticket&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">2. Không tạo ticket nếu đã có câu trả lời&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">3. Xác nhận với khách trước khi thực hiện write action&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">4. Ưu tiên giải quyết nhanh, chỉ escalate khi thực sự cần&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>),
MessagesPlaceholder(variable_name&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">chat_history&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, optional&lt;span style="color:#f92672">=&lt;/span>True),
(&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">human&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">{input}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>),
MessagesPlaceholder(variable_name&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">agent_scratchpad&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>),
])
agent &lt;span style="color:#f92672">=&lt;/span> create_openai_tools_agent(llm, tools, prompt)
&lt;span style="color:#66d9ef">return&lt;/span> AgentExecutor(
agent&lt;span style="color:#f92672">=&lt;/span>agent,
tools&lt;span style="color:#f92672">=&lt;/span>tools,
verbose&lt;span style="color:#f92672">=&lt;/span>True, &lt;span style="color:#75715e"># Log tool calls để debug&lt;/span>
max_iterations&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">5&lt;/span>, &lt;span style="color:#75715e"># Giới hạn vòng lặp, tránh loop vô hạn&lt;/span>
handle_parsing_errors&lt;span style="color:#f92672">=&lt;/span>True
)
&lt;span style="color:#75715e"># ============================================================&lt;/span>
&lt;span style="color:#75715e"># Bước 3: Sử dụng&lt;/span>
&lt;span style="color:#75715e"># ============================================================&lt;/span>
async &lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">handle_message&lt;/span>(customer_id: str, message: str) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> str:
agent &lt;span style="color:#f92672">=&lt;/span> create_customer_support_agent(customer_id)
result &lt;span style="color:#f92672">=&lt;/span> await agent&lt;span style="color:#f92672">.&lt;/span>ainvoke({
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">input&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: message,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">chat_history&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: []
})
&lt;span style="color:#66d9ef">return&lt;/span> result[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">output&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>]
&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
&lt;h2 id="9-bo-mt--kim-sot">9. Bảo mật &amp;amp; Kiểm soát&lt;/h2>
&lt;p>Tool Use mở ra khả năng hành động thực tế — đây cũng là nơi rủi ro tập trung nhất. Không kiểm soát tốt, agent có thể gây ra hậu quả không mong muốn trong hệ thống nghiệp vụ.&lt;/p>
&lt;h3 id="91-allow-list-tool-theo-context">9.1. Allow-list Tool theo context&lt;/h3>
&lt;p>Không cấp toàn bộ tool catalog cho mọi người dùng và mọi tình huống:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-python" data-lang="python">&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">get_tools_for_context&lt;/span>(user_role: str, conversation_stage: str) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> list:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Chỉ cấp tool phù hợp với role và giai đoạn hội thoại.&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;span style="color:#75715e"># Mọi người dùng: chỉ read + search&lt;/span>
base_tools &lt;span style="color:#f92672">=&lt;/span> [get_order_status, search_faq]
&lt;span style="color:#75715e"># Khách hàng đã xác thực: thêm notify&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> user_role &lt;span style="color:#f92672">in&lt;/span> (&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">authenticated_customer&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">agent&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>):
base_tools&lt;span style="color:#f92672">.&lt;/span>append(send_notification)
&lt;span style="color:#75715e"># Nhân viên CS: thêm write tools&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> user_role &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">agent&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>:
base_tools&lt;span style="color:#f92672">.&lt;/span>extend([create_support_ticket, update_ticket_status])
&lt;span style="color:#75715e"># Chỉ mở write tools sau khi đã thu thập đủ thông tin&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> conversation_stage &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">resolution&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span> &lt;span style="color:#f92672">and&lt;/span> user_role &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">agent&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>:
base_tools&lt;span style="color:#f92672">.&lt;/span>append(schedule_callback)
&lt;span style="color:#66d9ef">return&lt;/span> base_tools
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="92-xc-thc-input-trc-khi-execute">9.2. Xác thực input trước khi execute&lt;/h3>
&lt;p>Mỗi tool executor phải validate input — đừng tin tưởng hoàn toàn vào JSON LLM sinh ra:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-python" data-lang="python">&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">execute_tool_safely&lt;/span>(tool_name: str, args: dict, user_context: dict) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> dict:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Wrapper bảo mật cho mọi tool execution.&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;span style="color:#75715e"># 1. Kiểm tra tool có trong allow-list không&lt;/span>
allowed &lt;span style="color:#f92672">=&lt;/span> get_tools_for_context(user_context[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">role&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>], user_context[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">stage&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>])
&lt;span style="color:#66d9ef">if&lt;/span> tool_name &lt;span style="color:#f92672">not&lt;/span> &lt;span style="color:#f92672">in&lt;/span> [t&lt;span style="color:#f92672">.&lt;/span>name &lt;span style="color:#66d9ef">for&lt;/span> t &lt;span style="color:#f92672">in&lt;/span> allowed]:
&lt;span style="color:#66d9ef">return&lt;/span> {&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">error&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Tool &lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">{tool_name}&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74"> không được phép trong context này&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>}
&lt;span style="color:#75715e"># 2. Validate schema bằng JSON Schema&lt;/span>
schema &lt;span style="color:#f92672">=&lt;/span> TOOL_SCHEMAS[tool_name]
errors &lt;span style="color:#f92672">=&lt;/span> jsonschema&lt;span style="color:#f92672">.&lt;/span>validate(args, schema[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">parameters&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>])
&lt;span style="color:#66d9ef">if&lt;/span> errors:
&lt;span style="color:#66d9ef">return&lt;/span> {&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">error&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Invalid arguments: {errors}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>}
&lt;span style="color:#75715e"># 3. Business rule validation&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> tool_name &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">create_support_ticket&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>:
&lt;span style="color:#66d9ef">if&lt;/span> args&lt;span style="color:#f92672">.&lt;/span>get(&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">priority&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>) &lt;span style="color:#f92672">==&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">urgent&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span> &lt;span style="color:#f92672">and&lt;/span> user_context[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">role&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>] &lt;span style="color:#f92672">!=&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">agent&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>:
args[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">priority&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>] &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">high&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Downgrade nếu không có quyền&lt;/span>
&lt;span style="color:#75715e"># 4. Audit log trước khi execute write&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> TOOL_WRITE_FLAG&lt;span style="color:#f92672">.&lt;/span>get(tool_name):
audit_log&lt;span style="color:#f92672">.&lt;/span>write({
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">timestamp&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: datetime&lt;span style="color:#f92672">.&lt;/span>utcnow()&lt;span style="color:#f92672">.&lt;/span>isoformat(),
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">user_id&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: user_context[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">user_id&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>],
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">tool&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: tool_name,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">args&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: sanitize_pii(args), &lt;span style="color:#75715e"># Mask PII trước khi log&lt;/span>
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">session_id&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: user_context[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">session_id&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>]
})
&lt;span style="color:#75715e"># 5. Execute&lt;/span>
&lt;span style="color:#66d9ef">return&lt;/span> TOOL_REGISTRY[tool_name](&lt;span style="color:#f92672">*&lt;/span>&lt;span style="color:#f92672">*&lt;/span>args)
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="93-human-in-the-loop-trc-write-tools">9.3. Human-in-the-Loop trước Write Tools&lt;/h3>
&lt;p>Với các hành động có tác động cao (hủy đơn, hoàn tiền, cập nhật hợp đồng), bắt buộc phải có bước xác nhận từ người:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-yaml" data-lang="yaml">&lt;span style="color:#75715e"># Pseudo workflow: human_approval_gate&lt;/span>
name: write_tool_approval_flow
trigger: tool_call_detected
steps:
- name: classify_tool_impact
check:
- tool_name in HIGH_IMPACT_TOOLS &lt;span style="color:#75715e"># cancel_order, process_refund, update_contract&lt;/span>
on_match: require_approval
on_no_match: auto_execute
- name: request_human_approval
action: send_approval_request
channels:
- slack_manager_channel
- email_supervisor
timeout: 30_minutes
payload:
tool: &lt;span style="color:#e6db74">&amp;#34;{{ tool_name }}&amp;#34;&lt;/span>
args: &lt;span style="color:#e6db74">&amp;#34;{{ tool_args_sanitized }}&amp;#34;&lt;/span>
context: &lt;span style="color:#e6db74">&amp;#34;{{ conversation_summary }}&amp;#34;&lt;/span>
requested_by: &lt;span style="color:#e6db74">&amp;#34;{{ agent_id }}&amp;#34;&lt;/span>
- name: wait_for_decision
action: pause_workflow
resume_on:
- approved: execute_tool_with_audit_log
- rejected: notify_agent_and_user
- timeout: escalate_to_level2
on_error:
- default_action: reject_and_notify
- audit_log: always
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="94-checklist-bo-mt-tool-use">9.4. Checklist bảo mật Tool Use&lt;/h3>
&lt;ul>
&lt;li>&lt;input disabled="" type="checkbox">Allow-list tool theo role và context người dùng&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Validate JSON schema của mọi tool call trước khi execute&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Mask PII (email, phone, CCCD) trong audit log&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Rate limiting per user/session trên write tools&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Human approval gate cho high-impact write operations&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Circuit breaker cho external API calls&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Timeout toàn bộ tool execution (gợi ý: ≤ 10 giây)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Dead letter queue cho failed tool calls&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Alert khi có pattern bất thường (nhiều urgent ticket trong 5 phút)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="10-checklist-thit-k-tool-use">10. Checklist thiết kế Tool Use&lt;/h2>
&lt;h3 id="-checklist-nh-ngha-tool">✅ Checklist định nghĩa Tool&lt;/h3>
&lt;ul>
&lt;li>&lt;input disabled="" type="checkbox">Tên tool rõ ràng, dùng prefix hành động (get_, create_, send_)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Description mô tả đủ &amp;ldquo;khi nào dùng&amp;rdquo; VÀ &amp;ldquo;khi nào không dùng&amp;rdquo;&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Mọi parameter có description đầy đủ, có enum cho giá trị giới hạn&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Mọi required parameter được đánh dấu rõ&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Tool đã được test với ít nhất 10 câu hỏi thực tế để kiểm tra routing&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Tool không làm nhiều hơn một việc (single responsibility)&lt;/li>
&lt;/ul>
&lt;h3 id="-checklist-trin-khai">✅ Checklist triển khai&lt;/h3>
&lt;ul>
&lt;li>&lt;input disabled="" type="checkbox">Allow-list tool đã thiết lập theo role và context&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Validation schema cho tất cả input&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Audit log đã bật cho mọi write tool&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Timeout đã thiết lập (khuyến nghị 10s cho tool, 30s cho toàn bộ agent loop)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Parallel tool call đã được xử lý đúng bằng async&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Max iteration đã đặt giới hạn (5–10 bước là đủ cho hầu hết use case)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Error handling: tool failure không làm crash toàn bộ conversation&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Idempotency key đã implement cho write tools&lt;/li>
&lt;/ul>
&lt;h3 id="-checklist-vn-hnh">✅ Checklist vận hành&lt;/h3>
&lt;ul>
&lt;li>&lt;input disabled="" type="checkbox">Dashboard theo dõi: tool call rate, error rate, latency per tool&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Alert khi tool error rate &amp;gt; 5%&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Định kỳ review audit log để phát hiện misuse&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Kiểm tra định kỳ tool definitions vẫn còn phù hợp với API backend&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Quy trình rollback khi tool gây ra kết quả không mong muốn&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="11-kpi-chi-ph-v-roi">11. KPI, Chi phí và ROI&lt;/h2>
&lt;h3 id="111-kpi-cho-tool-use">11.1. KPI cho Tool Use&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>KPI&lt;/th>
&lt;th>Định nghĩa&lt;/th>
&lt;th>Mục tiêu MVP&lt;/th>
&lt;th>Mục tiêu Production&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>Tool Call Accuracy&lt;/strong>&lt;/td>
&lt;td>% tool call LLM chọn đúng tool cần thiết&lt;/td>
&lt;td>≥ 85%&lt;/td>
&lt;td>≥ 95%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Tool Execution Success Rate&lt;/strong>&lt;/td>
&lt;td>% tool call thực thi thành công (không lỗi)&lt;/td>
&lt;td>≥ 90%&lt;/td>
&lt;td>≥ 99%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Avg Tool Latency (P95)&lt;/strong>&lt;/td>
&lt;td>Thời gian thực thi tool P95&lt;/td>
&lt;td>≤ 3s&lt;/td>
&lt;td>≤ 1s&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Automation Rate&lt;/strong>&lt;/td>
&lt;td>% yêu cầu giải quyết hoàn toàn tự động&lt;/td>
&lt;td>≥ 50%&lt;/td>
&lt;td>≥ 75%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Human Escalation Rate&lt;/strong>&lt;/td>
&lt;td>% request cần human intervention&lt;/td>
&lt;td>≤ 30%&lt;/td>
&lt;td>≤ 15%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Unnecessary Tool Calls&lt;/strong>&lt;/td>
&lt;td>% lần LLM gọi tool không cần thiết&lt;/td>
&lt;td>≤ 15%&lt;/td>
&lt;td>≤ 5%&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="112-c-lng-chi-ph-quy-m-smb-5000-requestsngy">11.2. Ước lượng chi phí (Quy mô SMB, 5.000 requests/ngày)&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Hạng mục&lt;/th>
&lt;th>Chi phí thiết lập&lt;/th>
&lt;th>Chi phí vận hành/tháng&lt;/th>
&lt;th>Ghi chú&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>LLM API (GPT-4o-mini)&lt;/td>
&lt;td>—&lt;/td>
&lt;td>$60–150&lt;/td>
&lt;td>Tool use tăng ~30% token vs text-only&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Backend API / Tool servers&lt;/td>
&lt;td>$0 (code hiện có)&lt;/td>
&lt;td>$20–50&lt;/td>
&lt;td>Thêm endpoint cho tool execution&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Logging &amp;amp; monitoring&lt;/td>
&lt;td>—&lt;/td>
&lt;td>$10–30&lt;/td>
&lt;td>Elasticsearch hoặc Datadog&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Audit log storage&lt;/td>
&lt;td>—&lt;/td>
&lt;td>$5–15&lt;/td>
&lt;td>S3/MinIO lưu audit log&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Tổng ước lượng&lt;/strong>&lt;/td>
&lt;td>&lt;strong>$500–2.000&lt;/strong>&lt;/td>
&lt;td>&lt;strong>$95–245&lt;/strong>&lt;/td>
&lt;td>Bao gồm thiết lập &amp;amp; tích hợp ban đầu&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="113-roi-tham-chiu">11.3. ROI tham chiếu&lt;/h3>
&lt;p>&lt;strong>Tình huống&lt;/strong>: Bộ phận CS xử lý 150 yêu cầu/ngày, mỗi request mất trung bình 8 phút nhân sự.&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Trước Tool Use&lt;/strong>: 150 × 8 phút = 20 giờ/ngày → ~$600/ngày (tính $30/giờ)&lt;/li>
&lt;li>&lt;strong>Sau Tool Use&lt;/strong> (70% automation): 45 request cần người → 6 giờ/ngày → ~$180/ngày&lt;/li>
&lt;li>&lt;strong>Tiết kiệm&lt;/strong>: ~$420/ngày × 22 ngày = &lt;strong>~$9.240/tháng&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Chi phí hệ thống&lt;/strong>: ~$200/tháng&lt;/li>
&lt;li>&lt;strong>ROI tháng đầu&lt;/strong>: &lt;strong>~4.500%&lt;/strong> | &lt;strong>Hoàn vốn thiết lập&lt;/strong>: &lt;strong>&amp;lt; 1 tuần&lt;/strong>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="12-ri-ro-v-phng-n-gim-thiu">12. Rủi ro và Phương án Giảm Thiểu&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Rủi ro&lt;/th>
&lt;th>Mức độ&lt;/th>
&lt;th>Xác suất&lt;/th>
&lt;th>Phương án giảm thiểu&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>LLM gọi sai tool&lt;/strong> (nhầm create khi chỉ cần get)&lt;/td>
&lt;td>Cao&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Description rõ ràng + unit test routing + allow-list&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Hallucinate tham số&lt;/strong> (bịa order_id không tồn tại)&lt;/td>
&lt;td>Cao&lt;/td>
&lt;td>Thấp–Trung bình&lt;/td>
&lt;td>Validate schema + verify tham số với DB trước khi execute&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Prompt injection&lt;/strong> qua user input để gọi tool ngoài scope&lt;/td>
&lt;td>Rất cao&lt;/td>
&lt;td>Thấp&lt;/td>
&lt;td>Sanitize input + allow-list tool + human approval cho write&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Tool chạy vòng lặp vô hạn&lt;/strong>&lt;/td>
&lt;td>Cao&lt;/td>
&lt;td>Thấp&lt;/td>
&lt;td>Max iteration limit + circuit breaker&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Chi phí token tăng đột biến&lt;/strong>&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Token budget per session + cảnh báo anomaly&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>API backend bị quá tải&lt;/strong> do nhiều parallel tool calls&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Rate limit + queue + circuit breaker&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Lộ PII trong tool args log&lt;/strong>&lt;/td>
&lt;td>Rất cao&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>PII masking trước khi log + RBAC trên log access&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Tool definition out-of-date&lt;/strong> khi backend API thay đổi&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Cao&lt;/td>
&lt;td>Versioning tool definitions + integration test tự động&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="13-roadmap-trin-khai-3-giai-on">13. Roadmap Triển Khai 3 Giai Đoạn&lt;/h2>
&lt;h3 id="giai-on-1-tun-13-tool-use-c-bn">Giai đoạn 1 (Tuần 1–3): Tool Use cơ bản&lt;/h3>
&lt;p>&lt;strong>Mục tiêu&lt;/strong>: Agent có thể đọc dữ liệu nghiệp vụ và trả lời chính xác hơn.&lt;/p>
&lt;ul>
&lt;li>&lt;input disabled="" type="checkbox">Xác định 3–5 read tools quan trọng nhất (tra cứu đơn hàng, khách hàng, FAQ)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Viết tool definitions theo chuẩn JSON Schema với description đầy đủ&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Tích hợp OpenAI Function Calling hoặc Semantic Kernel cơ bản&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Test routing với 50 câu hỏi mẫu đại diện&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Thiết lập logging tool calls cơ bản&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">&lt;strong>KPI đo được&lt;/strong>: Tool Call Accuracy ≥ 85%, Tool Latency ≤ 3s&lt;/li>
&lt;/ul>
&lt;h3 id="giai-on-2-tun-48-write-tools--bo-mt">Giai đoạn 2 (Tuần 4–8): Write Tools + Bảo mật&lt;/h3>
&lt;p>&lt;strong>Mục tiêu&lt;/strong>: Agent có thể thực hiện hành động nghiệp vụ có kiểm soát.&lt;/p>
&lt;ul>
&lt;li>&lt;input disabled="" type="checkbox">Thêm 3–5 write tools (tạo ticket, gửi thông báo, cập nhật trạng thái)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Triển khai allow-list tool theo role và context&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Implement audit log đầy đủ cho write operations&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Thiết lập human approval gate cho high-impact actions&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Kiểm thử bảo mật: prompt injection, schema validation&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">&lt;strong>KPI đo được&lt;/strong>: Automation Rate ≥ 50%, Escalation Rate ≤ 30%&lt;/li>
&lt;/ul>
&lt;h3 id="giai-on-3-tun-912-ti-u--scale">Giai đoạn 3 (Tuần 9–12): Tối ưu &amp;amp; Scale&lt;/h3>
&lt;p>&lt;strong>Mục tiêu&lt;/strong>: Vận hành ổn định, chi phí tối ưu, có thể mở rộng tool catalog.&lt;/p>
&lt;ul>
&lt;li>&lt;input disabled="" type="checkbox">Parallel tool calls cho các read operations độc lập&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Tối ưu tool selection bằng dynamic context filtering&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Xem xét MCP nếu cần chia sẻ tool cho nhiều ứng dụng&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Dashboard KPI đầy đủ và alert tự động&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">A/B test tool descriptions để cải thiện routing accuracy&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">&lt;strong>KPI đo được&lt;/strong>: Tool Call Accuracy ≥ 95%, Automation Rate ≥ 75%&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="14-kt-lun-v-kt-ni-sang-bi-5">14. Kết luận và Kết nối sang Bài 5&lt;/h2>
&lt;p>Tool Use &amp;amp; Function Calling là &lt;strong>bước nhảy vọt&lt;/strong> từ chatbot trả lời sang AI Agent hành động. Khi triển khai đúng:&lt;/p>
&lt;ul>
&lt;li>Agent không chỉ &lt;strong>biết&lt;/strong> — mà còn &lt;strong>làm được&lt;/strong>&lt;/li>
&lt;li>Hệ thống không chỉ &lt;strong>phản hồi&lt;/strong> — mà còn &lt;strong>tự động hóa&lt;/strong> được quy trình nghiệp vụ&lt;/li>
&lt;li>ROI không chỉ là &amp;ldquo;tiện hơn&amp;rdquo; — mà đo được bằng &lt;strong>số giờ nhân lực tiết kiệm mỗi ngày&lt;/strong>&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Ba nguyên tắc cốt lõi&lt;/strong> để Tool Use thành công trong thực tế:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Description trước, code sau&lt;/strong> — 80% vấn đề routing đến từ description mơ hồ, không phải lỗi kỹ thuật&lt;/li>
&lt;li>&lt;strong>Read trước, Write sau&lt;/strong> — triển khai read tools, đo ROI, rồi mới mở rộng sang write&lt;/li>
&lt;li>&lt;strong>Audit everything&lt;/strong> — mọi hành động của agent lên hệ thống nghiệp vụ đều phải có dấu vết&lt;/li>
&lt;/ol>
&lt;hr>
&lt;p>Bài tiếp theo trong series sẽ đi sâu vào &lt;strong>Memory &amp;amp; Context Management&lt;/strong> — cách AI Agent ghi nhớ thông tin qua nhiều phiên hội thoại, quản lý long-context hiệu quả và xây dựng hồ sơ người dùng thông minh để cá nhân hóa trải nghiệm. Đây là yếu tố then chốt để đi từ &amp;ldquo;agent trả lời được một câu&amp;rdquo; sang &amp;ldquo;agent hiểu khách hàng theo thời gian&amp;rdquo;.&lt;/p>
&lt;hr>
&lt;p>&lt;em>Tác giả: AI Agent Series | Cập nhật: 14/05/2026&lt;/em>&lt;/p></description></item><item><title>Memory &amp; Context Management — Giúp AI Agent ghi nhớ và hiểu ngữ cảnh</title><link>https://vunb.github.io/tutorials/ai-agent/memory-va-context-management-giup-ai-agent-ghi-nho-va-hieu-ngu-canh/</link><pubDate>Thu, 14 May 2026 00:00:00 +0700</pubDate><guid>https://vunb.github.io/tutorials/ai-agent/memory-va-context-management-giup-ai-agent-ghi-nho-va-hieu-ngu-canh/</guid><description>&lt;h2 id="1-v-sao-ai-agent-cn-b-nh">1. Vì sao AI Agent cần bộ nhớ?&lt;/h2>
&lt;p>Ở bài trước, chúng ta đã trang bị cho AI Agent khả năng &lt;strong>hành động&lt;/strong> thông qua Tool Use &amp;amp; Function Calling. Tuy nhiên, ngay cả khi agent đã biết gọi đúng tool, vẫn tồn tại một vấn đề căn bản khiến trải nghiệm người dùng còn rời rạc:&lt;/p>
&lt;blockquote>
&lt;p>&amp;ldquo;Tôi đã báo với chatbot tuần trước rằng tôi dị ứng latex — sao hôm nay nó lại gợi ý sản phẩm có latex cho tôi?&amp;rdquo;&lt;/p>
&lt;/blockquote>
&lt;blockquote>
&lt;p>&amp;ldquo;Mỗi lần mở chat mới tôi phải giải thích lại toàn bộ context từ đầu. Mệt mỏi lắm.&amp;rdquo;&lt;/p>
&lt;/blockquote>
&lt;p>Đây là &lt;strong>giới hạn cốt lõi của LLM thuần&lt;/strong>: mô hình ngôn ngữ là &lt;strong>stateless&lt;/strong> — nó không tự động nhớ gì giữa các lần gọi API. Mỗi request là một trang giấy trắng.&lt;/p>
&lt;h3 id="11-gii-hn-context-window">1.1. Giới hạn Context Window&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Mô hình&lt;/th>
&lt;th>Context Window&lt;/th>
&lt;th>Tương đương&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>GPT-4o-mini&lt;/td>
&lt;td>128.000 tokens&lt;/td>
&lt;td>~96.000 từ tiếng Anh (~100 trang A4)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>GPT-4o&lt;/td>
&lt;td>128.000 tokens&lt;/td>
&lt;td>~96.000 từ&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Claude 3.5 Sonnet&lt;/td>
&lt;td>200.000 tokens&lt;/td>
&lt;td>~150.000 từ&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Gemini 1.5 Pro&lt;/td>
&lt;td>1.000.000 tokens&lt;/td>
&lt;td>~750.000 từ&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>Llama 3.1 70B&lt;/td>
&lt;td>128.000 tokens&lt;/td>
&lt;td>~96.000 từ&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;p>Context window lớn không giải quyết được vấn đề:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Chi phí&lt;/strong>: gửi 100.000 token mỗi request = chi phí API tăng tuyến tính&lt;/li>
&lt;li>&lt;strong>Latency&lt;/strong>: context dài → TTFT (time-to-first-token) tăng đáng kể&lt;/li>
&lt;li>&lt;strong>Lost-in-the-middle&lt;/strong>: nghiên cứu cho thấy LLM xử lý thông tin ở đầu và cuối context tốt hơn phần giữa&lt;/li>
&lt;li>&lt;strong>Vẫn stateless&lt;/strong>: đóng browser tab là mất hết, không có khái niệm &amp;ldquo;lần sau nhớ lại&amp;rdquo;&lt;/li>
&lt;/ul>
&lt;h3 id="12-stateless-vs-stateful-agent">1.2. Stateless vs Stateful Agent&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Đặc điểm&lt;/th>
&lt;th>Stateless Agent&lt;/th>
&lt;th>Stateful Agent&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>Nhớ hội thoại&lt;/strong>&lt;/td>
&lt;td>Chỉ trong session&lt;/td>
&lt;td>Qua nhiều session&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Nhớ sở thích người dùng&lt;/strong>&lt;/td>
&lt;td>❌&lt;/td>
&lt;td>✅&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Cá nhân hóa&lt;/strong>&lt;/td>
&lt;td>❌&lt;/td>
&lt;td>✅&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Chi phí token&lt;/strong>&lt;/td>
&lt;td>Cao (phải gửi lại history)&lt;/td>
&lt;td>Tối ưu hơn (chỉ gửi phần relevant)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Độ phức tạp triển khai&lt;/strong>&lt;/td>
&lt;td>Thấp&lt;/td>
&lt;td>Trung bình–Cao&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Ứng dụng phù hợp&lt;/strong>&lt;/td>
&lt;td>FAQ đơn giản&lt;/td>
&lt;td>CRM AI, Healthcare AI, Trợ lý cá nhân&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="13-pain-point-thc-t">1.3. Pain Point thực tế&lt;/h3>
&lt;p>&lt;strong>E-commerce&lt;/strong>: Chatbot gợi ý lại sản phẩm khách đã từ chối 3 lần trước.&lt;/p>
&lt;p>&lt;strong>Healthcare&lt;/strong>: Bệnh nhân phải khai lại tiền sử bệnh mỗi lần tương tác với AI assistant của phòng khám.&lt;/p>
&lt;p>&lt;strong>HR Automation&lt;/strong>: Nhân viên phải giải thích lại quy trình đã được AI hướng dẫn cách đây 2 tuần.&lt;/p>
&lt;p>&lt;strong>Kết luận&lt;/strong>: Bộ nhớ không phải tính năng &amp;ldquo;nice-to-have&amp;rdquo; — đây là &lt;strong>điều kiện cần&lt;/strong> để AI Agent tạo ra giá trị bền vững cho doanh nghiệp.&lt;/p>
&lt;hr>
&lt;h2 id="2-taxonomy-b-nh-ai-agent-4-loi">2. Taxonomy bộ nhớ AI Agent: 4 loại&lt;/h2>
&lt;p>Không có một loại bộ nhớ nào phù hợp cho tất cả. Hệ thống memory hiệu quả kết hợp &lt;strong>4 loại&lt;/strong> theo tầng:&lt;/p>
&lt;pre>&lt;code>┌─────────────────────────────────────────────────────────────────┐
│ AI AGENT MEMORY TAXONOMY │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ LOẠI 1: IN-CONTEXT MEMORY (Working Memory) │ │
│ │ • Nằm trong context window của LLM │ │
│ │ • Hội thoại hiện tại, system prompt, tool results │ │
│ │ • Tốc độ: Rất nhanh (đã trong RAM của LLM) │ │
│ │ • Giới hạn: Bị xóa khi hết session / hết context │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ LOẠI 2: SESSION MEMORY (External Short-term) │ │
│ │ • Lưu ngoài LLM, trong Redis/Valkey │ │
│ │ • Toàn bộ lịch sử hội thoại trong một phiên làm việc │ │
│ │ • TTL: vài giờ đến vài ngày │ │
│ │ • Tốc độ: Nhanh (~1–5ms) │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ LOẠI 3: PERSISTENT MEMORY (External Long-term) │ │
│ │ • Lưu trong PostgreSQL / SQL Server │ │
│ │ • Hồ sơ người dùng, sở thích, tóm tắt lịch sử dài hạn │ │
│ │ • TTL: Không giới hạn (hoặc theo policy) │ │
│ │ • Tốc độ: Trung bình (~5–50ms) │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ LOẠI 4: SEMANTIC MEMORY (Vector Store) │ │
│ │ • Lưu embeddings của ký ức quan trọng │ │
│ │ • Truy vấn bằng semantic similarity (không cần key) │ │
│ │ • Kết hợp với RAG pipeline │ │
│ │ • Qdrant / Weaviate / pgvector / Chroma │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
Tốc độ truy cập: Loại 1 &amp;gt; 2 &amp;gt; 4 &amp;gt; 3
Dung lượng lưu trữ: Loại 3 &amp;gt; 4 &amp;gt; 2 &amp;gt; 1
Chi phí lưu trữ: Loại 1 &amp;lt; 2 &amp;lt; 3 ≈ 4
&lt;/code>&lt;/pre>&lt;h3 id="21-khi-no-dng-loi-no">2.1. Khi nào dùng loại nào?&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Loại&lt;/th>
&lt;th>Use Case điển hình&lt;/th>
&lt;th>Ví dụ&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>In-Context&lt;/strong>&lt;/td>
&lt;td>Hội thoại đang diễn ra, tool results tức thì&lt;/td>
&lt;td>&amp;ldquo;Đơn hàng vừa tra là ORD-001, đang giao&amp;rdquo;&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Session&lt;/strong>&lt;/td>
&lt;td>Chuyển tab, F5 trang, reconnect WebSocket&lt;/td>
&lt;td>Tiếp tục hội thoại sau khi mạng bị ngắt&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Persistent&lt;/strong>&lt;/td>
&lt;td>Sở thích cá nhân, lịch sử mua hàng, thông tin hợp đồng&lt;/td>
&lt;td>&amp;ldquo;Khách này thích giao hàng sáng sớm&amp;rdquo;&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Semantic&lt;/strong>&lt;/td>
&lt;td>&amp;ldquo;Nhớ lại&amp;rdquo; ngữ nghĩa không theo thứ tự thời gian&lt;/td>
&lt;td>&amp;ldquo;Lần nào đó khách đề cập vấn đề với sản phẩm X&amp;rdquo;&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="3-in-context-memory--k-thut-qun-l-conversation-history">3. In-Context Memory — Kỹ thuật quản lý Conversation History&lt;/h2>
&lt;p>In-Context Memory là lớp bộ nhớ &lt;strong>đơn giản nhất&lt;/strong> nhưng cần quản lý thận trọng nhất vì ảnh hưởng trực tiếp đến chi phí API và chất lượng câu trả lời.&lt;/p>
&lt;h3 id="31-k-thut-1-sliding-window">3.1. Kỹ thuật 1: Sliding Window&lt;/h3>
&lt;p>Giữ lại &lt;strong>N tin nhắn gần nhất&lt;/strong>, bỏ đi tin nhắn cũ:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-python" data-lang="python">&lt;span style="color:#f92672">from&lt;/span> collections &lt;span style="color:#f92672">import&lt;/span> deque
&lt;span style="color:#f92672">from&lt;/span> dataclasses &lt;span style="color:#f92672">import&lt;/span> dataclass, field
&lt;span style="color:#f92672">from&lt;/span> typing &lt;span style="color:#f92672">import&lt;/span> Literal
&lt;span style="color:#a6e22e">@dataclass&lt;/span>
&lt;span style="color:#66d9ef">class&lt;/span> &lt;span style="color:#a6e22e">Message&lt;/span>:
role: Literal[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">user&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">assistant&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">system&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">tool&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>]
content: str
token_count: int &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>
&lt;span style="color:#66d9ef">class&lt;/span> &lt;span style="color:#a6e22e">SlidingWindowMemory&lt;/span>:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Sliding window giữ lại N tin nhắn gần nhất.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> System prompt luôn được giữ nguyên (không tính vào window).&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;span style="color:#66d9ef">def&lt;/span> __init__(self, max_messages: int &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">20&lt;/span>, system_prompt: str &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>):
self&lt;span style="color:#f92672">.&lt;/span>max_messages &lt;span style="color:#f92672">=&lt;/span> max_messages
self&lt;span style="color:#f92672">.&lt;/span>system_prompt &lt;span style="color:#f92672">=&lt;/span> system_prompt
self&lt;span style="color:#f92672">.&lt;/span>_history: deque[Message] &lt;span style="color:#f92672">=&lt;/span> deque(maxlen&lt;span style="color:#f92672">=&lt;/span>max_messages)
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">add&lt;/span>(self, role: str, content: str) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> None:
self&lt;span style="color:#f92672">.&lt;/span>_history&lt;span style="color:#f92672">.&lt;/span>append(Message(role&lt;span style="color:#f92672">=&lt;/span>role, content&lt;span style="color:#f92672">=&lt;/span>content))
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">get_context&lt;/span>(self) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> list[dict]:
messages &lt;span style="color:#f92672">=&lt;/span> [{&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">role&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">system&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">content&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: self&lt;span style="color:#f92672">.&lt;/span>system_prompt}]
messages&lt;span style="color:#f92672">.&lt;/span>extend(
{&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">role&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: m&lt;span style="color:#f92672">.&lt;/span>role, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">content&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: m&lt;span style="color:#f92672">.&lt;/span>content}
&lt;span style="color:#66d9ef">for&lt;/span> m &lt;span style="color:#f92672">in&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>_history
)
&lt;span style="color:#66d9ef">return&lt;/span> messages
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">clear&lt;/span>(self) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> None:
self&lt;span style="color:#f92672">.&lt;/span>_history&lt;span style="color:#f92672">.&lt;/span>clear()
&lt;/code>&lt;/pre>&lt;/div>&lt;p>&lt;strong>Ưu điểm&lt;/strong>: Đơn giản, dễ triển khai.&lt;br>
&lt;strong>Nhược điểm&lt;/strong>: Mất thông tin quan trọng nếu xảy ra ở đầu cuộc hội thoại.&lt;/p>
&lt;h3 id="32-k-thut-2-token-budget-management">3.2. Kỹ thuật 2: Token Budget Management&lt;/h3>
&lt;p>Kiểm soát chính xác theo &lt;strong>số token&lt;/strong> thay vì số tin nhắn:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-python" data-lang="python">&lt;span style="color:#f92672">import&lt;/span> tiktoken
&lt;span style="color:#66d9ef">class&lt;/span> &lt;span style="color:#a6e22e">TokenBudgetMemory&lt;/span>:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Quản lý history theo token budget.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Khi vượt ngưỡng, tự động drop tin nhắn cũ nhất (trừ system prompt).&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;span style="color:#66d9ef">def&lt;/span> __init__(
self,
max_tokens: int &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">4&lt;/span>_000, &lt;span style="color:#75715e"># Token dành cho history&lt;/span>
model: str &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">gpt-4o-mini&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>,
system_prompt: str &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
):
self&lt;span style="color:#f92672">.&lt;/span>max_tokens &lt;span style="color:#f92672">=&lt;/span> max_tokens
self&lt;span style="color:#f92672">.&lt;/span>system_prompt &lt;span style="color:#f92672">=&lt;/span> system_prompt
self&lt;span style="color:#f92672">.&lt;/span>_history: list[Message] &lt;span style="color:#f92672">=&lt;/span> []
self&lt;span style="color:#f92672">.&lt;/span>_encoder &lt;span style="color:#f92672">=&lt;/span> tiktoken&lt;span style="color:#f92672">.&lt;/span>encoding_for_model(model)
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">_count_tokens&lt;/span>(self, text: str) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> int:
&lt;span style="color:#66d9ef">return&lt;/span> len(self&lt;span style="color:#f92672">.&lt;/span>_encoder&lt;span style="color:#f92672">.&lt;/span>encode(text))
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">_total_history_tokens&lt;/span>(self) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> int:
&lt;span style="color:#66d9ef">return&lt;/span> sum(self&lt;span style="color:#f92672">.&lt;/span>_count_tokens(m&lt;span style="color:#f92672">.&lt;/span>content) &lt;span style="color:#66d9ef">for&lt;/span> m &lt;span style="color:#f92672">in&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>_history)
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">add&lt;/span>(self, role: str, content: str) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> None:
new_tokens &lt;span style="color:#f92672">=&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>_count_tokens(content)
&lt;span style="color:#75715e"># Trim cũ nếu cần&lt;/span>
&lt;span style="color:#66d9ef">while&lt;/span> (
self&lt;span style="color:#f92672">.&lt;/span>_history
&lt;span style="color:#f92672">and&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>_total_history_tokens() &lt;span style="color:#f92672">+&lt;/span> new_tokens &lt;span style="color:#f92672">&amp;gt;&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>max_tokens
):
self&lt;span style="color:#f92672">.&lt;/span>_history&lt;span style="color:#f92672">.&lt;/span>pop(&lt;span style="color:#ae81ff">0&lt;/span>) &lt;span style="color:#75715e"># Bỏ tin nhắn cũ nhất&lt;/span>
self&lt;span style="color:#f92672">.&lt;/span>_history&lt;span style="color:#f92672">.&lt;/span>append(Message(role&lt;span style="color:#f92672">=&lt;/span>role, content&lt;span style="color:#f92672">=&lt;/span>content,
token_count&lt;span style="color:#f92672">=&lt;/span>new_tokens))
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">get_context&lt;/span>(self) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> list[dict]:
messages &lt;span style="color:#f92672">=&lt;/span> [{&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">role&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">system&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">content&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: self&lt;span style="color:#f92672">.&lt;/span>system_prompt}]
messages&lt;span style="color:#f92672">.&lt;/span>extend({&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">role&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: m&lt;span style="color:#f92672">.&lt;/span>role, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">content&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: m&lt;span style="color:#f92672">.&lt;/span>content}
&lt;span style="color:#66d9ef">for&lt;/span> m &lt;span style="color:#f92672">in&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>_history)
&lt;span style="color:#66d9ef">return&lt;/span> messages
&lt;span style="color:#a6e22e">@property&lt;/span>
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">used_tokens&lt;/span>(self) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> int:
&lt;span style="color:#66d9ef">return&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>_total_history_tokens()
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="33-k-thut-3-message-summarization-khi-gn-t-limit">3.3. Kỹ thuật 3: Message Summarization khi gần đạt limit&lt;/h3>
&lt;p>Khi history đầy, &lt;strong>tóm tắt các tin cũ&lt;/strong> thay vì xóa hẳn — giữ lại thông tin quan trọng với ít token hơn:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-python" data-lang="python">&lt;span style="color:#66d9ef">class&lt;/span> &lt;span style="color:#a6e22e">SummarizingMemory&lt;/span>:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Khi token vượt ngưỡng, gọi LLM để tóm tắt nửa đầu lịch sử.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Kết quả tóm tắt được lưu lại như một tin nhắn &lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">system&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74"> đặc biệt.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
SUMMARY_THRESHOLD &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0.80&lt;/span> &lt;span style="color:#75715e"># Tóm tắt khi đạt 80% token budget&lt;/span>
&lt;span style="color:#66d9ef">def&lt;/span> __init__(self, max_tokens: int &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">6&lt;/span>_000, llm_client&lt;span style="color:#f92672">=&lt;/span>None):
self&lt;span style="color:#f92672">.&lt;/span>max_tokens &lt;span style="color:#f92672">=&lt;/span> max_tokens
self&lt;span style="color:#f92672">.&lt;/span>_history: list[Message] &lt;span style="color:#f92672">=&lt;/span> []
self&lt;span style="color:#f92672">.&lt;/span>_summary: str &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
self&lt;span style="color:#f92672">.&lt;/span>_llm &lt;span style="color:#f92672">=&lt;/span> llm_client
async &lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">_summarize_older_half&lt;/span>(self) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> None:
midpoint &lt;span style="color:#f92672">=&lt;/span> len(self&lt;span style="color:#f92672">.&lt;/span>_history) &lt;span style="color:#f92672">/&lt;/span>&lt;span style="color:#f92672">/&lt;/span> &lt;span style="color:#ae81ff">2&lt;/span>
to_summarize &lt;span style="color:#f92672">=&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>_history[:midpoint]
self&lt;span style="color:#f92672">.&lt;/span>_history &lt;span style="color:#f92672">=&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>_history[midpoint:]
conversation_text &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#ae81ff">\n&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#f92672">.&lt;/span>join(
f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">{m.role.upper()}: {m.content}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span> &lt;span style="color:#66d9ef">for&lt;/span> m &lt;span style="color:#f92672">in&lt;/span> to_summarize
)
prompt &lt;span style="color:#f92672">=&lt;/span> (
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Tóm tắt ngắn gọn cuộc hội thoại sau, giữ lại &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">các thông tin quan trọng như: thông tin đơn hàng, &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">vấn đề người dùng đã báo, quyết định đã đưa ra:&lt;/span>&lt;span style="color:#ae81ff">\n&lt;/span>&lt;span style="color:#ae81ff">\n&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">{conversation_text}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
)
response &lt;span style="color:#f92672">=&lt;/span> await self&lt;span style="color:#f92672">.&lt;/span>_llm&lt;span style="color:#f92672">.&lt;/span>complete(prompt)
self&lt;span style="color:#f92672">.&lt;/span>_summary &lt;span style="color:#f92672">=&lt;/span> (
f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">[TÓM TẮT HỘI THOẠI TRƯỚC]: {response}&lt;/span>&lt;span style="color:#ae81ff">\n&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
&lt;span style="color:#f92672">+&lt;/span> (f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">[TÓM TẮT TRƯỚC ĐÓ]: {self._summary}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span> &lt;span style="color:#66d9ef">if&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>_summary &lt;span style="color:#66d9ef">else&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>)
)
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">get_context&lt;/span>(self) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> list[dict]:
messages &lt;span style="color:#f92672">=&lt;/span> []
&lt;span style="color:#66d9ef">if&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>_summary:
messages&lt;span style="color:#f92672">.&lt;/span>append({&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">role&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">system&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">content&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: self&lt;span style="color:#f92672">.&lt;/span>_summary})
messages&lt;span style="color:#f92672">.&lt;/span>extend({&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">role&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: m&lt;span style="color:#f92672">.&lt;/span>role, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">content&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: m&lt;span style="color:#f92672">.&lt;/span>content}
&lt;span style="color:#66d9ef">for&lt;/span> m &lt;span style="color:#f92672">in&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>_history)
&lt;span style="color:#66d9ef">return&lt;/span> messages
&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
&lt;h2 id="4-session-memory--lu-tr-ngn-hn-vi-redisvalkey">4. Session Memory — Lưu trữ ngắn hạn với Redis/Valkey&lt;/h2>
&lt;p>Session Memory giải quyết vấn đề &lt;strong>mất hội thoại khi reconnect&lt;/strong> mà không cần lưu trữ mãi mãi.&lt;/p>
&lt;h3 id="41-session-schema-json">4.1. Session Schema (JSON)&lt;/h3>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-json" data-lang="json">{
&lt;span style="color:#f92672">&amp;#34;session_id&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;sess_abc123xyz&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;user_id&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;usr_456&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;tenant_id&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;tenant_healthcare_01&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;created_at&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;2026-05-14T08:30:00+07:00&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;last_active&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;2026-05-14T09:15:42+07:00&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;ttl_seconds&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">86400&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;metadata&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;channel&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;web&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;agent_id&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;support-agent-v2&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;language&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;vi&amp;#34;&lt;/span>
},
&lt;span style="color:#f92672">&amp;#34;context&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;current_topic&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;đơn hàng ORD-78901&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;entities_mentioned&amp;#34;&lt;/span>: [&lt;span style="color:#e6db74">&amp;#34;ORD-78901&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;sản phẩm laptop X1&amp;#34;&lt;/span>],
&lt;span style="color:#f92672">&amp;#34;user_intent&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;track_order&amp;#34;&lt;/span>
},
&lt;span style="color:#f92672">&amp;#34;messages&amp;#34;&lt;/span>: [
{
&lt;span style="color:#f92672">&amp;#34;id&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;msg_001&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;role&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;user&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;content&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Đơn hàng ORD-78901 của tôi đến chưa?&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;timestamp&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;2026-05-14T08:30:05+07:00&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;token_count&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">18&lt;/span>
},
{
&lt;span style="color:#f92672">&amp;#34;id&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;msg_002&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;role&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;assistant&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;content&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Đơn hàng ORD-78901 hiện đang trong quá trình giao, dự kiến đến ngày 15/05.&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;timestamp&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;2026-05-14T08:30:08+07:00&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;token_count&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">32&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;tool_calls_used&amp;#34;&lt;/span>: [&lt;span style="color:#e6db74">&amp;#34;get_order_status&amp;#34;&lt;/span>]
}
],
&lt;span style="color:#f92672">&amp;#34;summary&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;total_tokens_used&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">50&lt;/span>
}
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="42-c--semantic-kernel-vi-redis-session-memory">4.2. C# — Semantic Kernel với Redis Session Memory&lt;/h3>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-csharp" data-lang="csharp">&lt;span style="color:#66d9ef">using&lt;/span> Microsoft.SemanticKernel;
&lt;span style="color:#66d9ef">using&lt;/span> Microsoft.SemanticKernel.ChatCompletion;
&lt;span style="color:#66d9ef">using&lt;/span> StackExchange.Redis;
&lt;span style="color:#66d9ef">using&lt;/span> System.Text.Json;
&lt;span style="color:#75715e">// ============================================================
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">// Bước 1: RedisSessionStore — CRUD session lên Redis
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">// ============================================================
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">class&lt;/span> &lt;span style="color:#a6e22e">RedisSessionStore&lt;/span>
{
&lt;span style="color:#66d9ef">private&lt;/span> &lt;span style="color:#66d9ef">readonly&lt;/span> IDatabase _redis;
&lt;span style="color:#66d9ef">private&lt;/span> &lt;span style="color:#66d9ef">readonly&lt;/span> TimeSpan _defaultTtl = TimeSpan.FromHours(&lt;span style="color:#ae81ff">2&lt;/span>&lt;span style="color:#ae81ff">4&lt;/span>);
&lt;span style="color:#66d9ef">public&lt;/span> RedisSessionStore(IConnectionMultiplexer redis)
{
_redis = redis.GetDatabase();
}
&lt;span style="color:#66d9ef">private&lt;/span> &lt;span style="color:#66d9ef">static&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> Key(&lt;span style="color:#66d9ef">string&lt;/span> sessionId) =&amp;gt; &lt;span style="color:#e6db74">$&amp;#34;session:{sessionId}&amp;#34;&lt;/span>;
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">async&lt;/span> Task&amp;lt;SessionData?&amp;gt; GetAsync(&lt;span style="color:#66d9ef">string&lt;/span> sessionId)
{
&lt;span style="color:#66d9ef">var&lt;/span> raw = &lt;span style="color:#66d9ef">await&lt;/span> _redis.StringGetAsync(Key(sessionId));
&lt;span style="color:#66d9ef">if&lt;/span> (raw.IsNullOrEmpty) &lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#66d9ef">null&lt;/span>;
&lt;span style="color:#75715e">// Làm mới TTL mỗi khi truy cập (sliding expiry)
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">await&lt;/span> _redis.KeyExpireAsync(Key(sessionId), _defaultTtl);
&lt;span style="color:#66d9ef">return&lt;/span> JsonSerializer.Deserialize&amp;lt;SessionData&amp;gt;(raw!);
}
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">async&lt;/span> Task SaveAsync(SessionData session)
{
&lt;span style="color:#66d9ef">var&lt;/span> json = JsonSerializer.Serialize(session);
&lt;span style="color:#66d9ef">await&lt;/span> _redis.StringSetAsync(
Key(session.SessionId),
json,
_defaultTtl);
}
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">async&lt;/span> Task DeleteAsync(&lt;span style="color:#66d9ef">string&lt;/span> sessionId)
=&amp;gt; &lt;span style="color:#66d9ef">await&lt;/span> _redis.KeyDeleteAsync(Key(sessionId));
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">async&lt;/span> Task AppendMessageAsync(
&lt;span style="color:#66d9ef">string&lt;/span> sessionId,
&lt;span style="color:#66d9ef">string&lt;/span> role,
&lt;span style="color:#66d9ef">string&lt;/span> content)
{
&lt;span style="color:#66d9ef">var&lt;/span> session = &lt;span style="color:#66d9ef">await&lt;/span> GetAsync(sessionId)
?? &lt;span style="color:#66d9ef">new&lt;/span> SessionData { SessionId = sessionId };
session.Messages.Add(&lt;span style="color:#66d9ef">new&lt;/span> SessionMessage
{
Id = &lt;span style="color:#e6db74">$&amp;#34;msg_{Guid.NewGuid():N}&amp;#34;&lt;/span>,
Role = role,
Content = content,
Timestamp = DateTimeOffset.UtcNow
});
session.LastActive = DateTimeOffset.UtcNow;
session.TotalTokensUsed += EstimateTokens(content);
&lt;span style="color:#66d9ef">await&lt;/span> SaveAsync(session);
}
&lt;span style="color:#66d9ef">private&lt;/span> &lt;span style="color:#66d9ef">static&lt;/span> &lt;span style="color:#66d9ef">int&lt;/span> EstimateTokens(&lt;span style="color:#66d9ef">string&lt;/span> text)
=&amp;gt; (&lt;span style="color:#66d9ef">int&lt;/span>)Math.Ceiling(text.Length / &lt;span style="color:#ae81ff">4.0&lt;/span>); &lt;span style="color:#75715e">// Ước lượng đơn giản
&lt;/span>&lt;span style="color:#75715e">&lt;/span>}
&lt;span style="color:#75715e">// ============================================================
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">// Bước 2: AgentWithSessionMemory — tích hợp Semantic Kernel
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">// ============================================================
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">class&lt;/span> &lt;span style="color:#a6e22e">AgentWithSessionMemory&lt;/span>
{
&lt;span style="color:#66d9ef">private&lt;/span> &lt;span style="color:#66d9ef">readonly&lt;/span> Kernel _kernel;
&lt;span style="color:#66d9ef">private&lt;/span> &lt;span style="color:#66d9ef">readonly&lt;/span> RedisSessionStore _sessionStore;
&lt;span style="color:#66d9ef">private&lt;/span> &lt;span style="color:#66d9ef">readonly&lt;/span> IChatCompletionService _chat;
&lt;span style="color:#66d9ef">public&lt;/span> AgentWithSessionMemory(
Kernel kernel,
RedisSessionStore sessionStore)
{
_kernel = kernel;
_sessionStore = sessionStore;
_chat = kernel.GetRequiredService&amp;lt;IChatCompletionService&amp;gt;();
}
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">async&lt;/span> Task&amp;lt;&lt;span style="color:#66d9ef">string&lt;/span>&amp;gt; ChatAsync(
&lt;span style="color:#66d9ef">string&lt;/span> sessionId,
&lt;span style="color:#66d9ef">string&lt;/span> userId,
&lt;span style="color:#66d9ef">string&lt;/span> userMessage)
{
&lt;span style="color:#75715e">// 1. Load session từ Redis
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">var&lt;/span> session = &lt;span style="color:#66d9ef">await&lt;/span> _sessionStore.GetAsync(sessionId)
?? &lt;span style="color:#66d9ef">new&lt;/span> SessionData
{
SessionId = sessionId,
UserId = userId,
CreatedAt = DateTimeOffset.UtcNow,
LastActive = DateTimeOffset.UtcNow
};
&lt;span style="color:#75715e">// 2. Rebuild ChatHistory từ session
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">var&lt;/span> history = &lt;span style="color:#66d9ef">new&lt;/span> ChatHistory(BuildSystemPrompt(session));
&lt;span style="color:#66d9ef">foreach&lt;/span> (&lt;span style="color:#66d9ef">var&lt;/span> msg &lt;span style="color:#66d9ef">in&lt;/span> TrimToTokenBudget(session.Messages, maxTokens: &lt;span style="color:#ae81ff">3&lt;/span>&lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#ae81ff">0&lt;/span>))
{
&lt;span style="color:#66d9ef">if&lt;/span> (msg.Role == &lt;span style="color:#e6db74">&amp;#34;user&amp;#34;&lt;/span>)
history.AddUserMessage(msg.Content);
&lt;span style="color:#66d9ef">else&lt;/span> &lt;span style="color:#66d9ef">if&lt;/span> (msg.Role == &lt;span style="color:#e6db74">&amp;#34;assistant&amp;#34;&lt;/span>)
history.AddAssistantMessage(msg.Content);
}
history.AddUserMessage(userMessage);
&lt;span style="color:#75715e">// 3. Gọi LLM
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">var&lt;/span> settings = &lt;span style="color:#66d9ef">new&lt;/span> OpenAIPromptExecutionSettings
{
ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,
MaxTokens = &lt;span style="color:#ae81ff">1&lt;/span>&lt;span style="color:#ae81ff">0&lt;/span>&lt;span style="color:#ae81ff">2&lt;/span>&lt;span style="color:#ae81ff">4&lt;/span>
};
&lt;span style="color:#66d9ef">var&lt;/span> response = &lt;span style="color:#66d9ef">await&lt;/span> _chat.GetChatMessageContentAsync(
history, settings, _kernel);
&lt;span style="color:#66d9ef">var&lt;/span> assistantReply = response.Content
?? &lt;span style="color:#e6db74">&amp;#34;Xin lỗi, tôi chưa xử lý được yêu cầu này.&amp;#34;&lt;/span>;
&lt;span style="color:#75715e">// 4. Lưu cả 2 lượt vào session
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">await&lt;/span> _sessionStore.AppendMessageAsync(sessionId, &lt;span style="color:#e6db74">&amp;#34;user&amp;#34;&lt;/span>, userMessage);
&lt;span style="color:#66d9ef">await&lt;/span> _sessionStore.AppendMessageAsync(sessionId, &lt;span style="color:#e6db74">&amp;#34;assistant&amp;#34;&lt;/span>, assistantReply);
&lt;span style="color:#66d9ef">return&lt;/span> assistantReply;
}
&lt;span style="color:#66d9ef">private&lt;/span> &lt;span style="color:#66d9ef">static&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> BuildSystemPrompt(SessionData session)
=&amp;gt; &lt;span style="color:#e6db74">$&amp;#34;&amp;#34;&amp;#34;
&lt;/span>&lt;span style="color:#e6db74"> Bạn là trợ lý AI hỗ trợ khách hàng.
&lt;/span>&lt;span style="color:#e6db74"> ID phiên: {session.SessionId}
&lt;/span>&lt;span style="color:#e6db74"> ID người dùng: {session.UserId}
&lt;/span>&lt;span style="color:#e6db74"> Ngày tạo phiên: {session.CreatedAt:dd/MM/yyyy HH:mm}
&lt;/span>&lt;span style="color:#e6db74"> Hãy trả lời ngắn gọn, chuyên nghiệp bằng tiếng Việt.
&lt;/span>&lt;span style="color:#e6db74"> &amp;#34;&amp;#34;&amp;#34;&lt;/span>;
&lt;span style="color:#66d9ef">private&lt;/span> &lt;span style="color:#66d9ef">static&lt;/span> IEnumerable&amp;lt;SessionMessage&amp;gt; TrimToTokenBudget(
List&amp;lt;SessionMessage&amp;gt; messages,
&lt;span style="color:#66d9ef">int&lt;/span> maxTokens)
{
&lt;span style="color:#75715e">// Lấy tin từ cuối về đầu cho đến khi đủ budget
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#66d9ef">var&lt;/span> result = &lt;span style="color:#66d9ef">new&lt;/span> List&amp;lt;SessionMessage&amp;gt;();
&lt;span style="color:#66d9ef">int&lt;/span> used = &lt;span style="color:#ae81ff">0&lt;/span>;
&lt;span style="color:#66d9ef">foreach&lt;/span> (&lt;span style="color:#66d9ef">var&lt;/span> msg &lt;span style="color:#66d9ef">in&lt;/span> messages.AsEnumerable().Reverse())
{
&lt;span style="color:#66d9ef">int&lt;/span> t = (&lt;span style="color:#66d9ef">int&lt;/span>)Math.Ceiling(msg.Content.Length / &lt;span style="color:#ae81ff">4.0&lt;/span>);
&lt;span style="color:#66d9ef">if&lt;/span> (used + t &amp;gt; maxTokens) &lt;span style="color:#66d9ef">break&lt;/span>;
result.Insert(&lt;span style="color:#ae81ff">0&lt;/span>, msg);
used += t;
}
&lt;span style="color:#66d9ef">return&lt;/span> result;
}
}
&lt;span style="color:#75715e">// ============================================================
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">// Bước 3: Data models
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">// ============================================================
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">public&lt;/span> record SessionData
{
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> SessionId { &lt;span style="color:#66d9ef">get&lt;/span>; &lt;span style="color:#66d9ef">set&lt;/span>; } = &lt;span style="color:#e6db74">&amp;#34;&amp;#34;&lt;/span>;
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> UserId { &lt;span style="color:#66d9ef">get&lt;/span>; &lt;span style="color:#66d9ef">set&lt;/span>; } = &lt;span style="color:#e6db74">&amp;#34;&amp;#34;&lt;/span>;
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> TenantId { &lt;span style="color:#66d9ef">get&lt;/span>; &lt;span style="color:#66d9ef">set&lt;/span>; } = &lt;span style="color:#e6db74">&amp;#34;&amp;#34;&lt;/span>;
&lt;span style="color:#66d9ef">public&lt;/span> DateTimeOffset CreatedAt { &lt;span style="color:#66d9ef">get&lt;/span>; &lt;span style="color:#66d9ef">set&lt;/span>; }
&lt;span style="color:#66d9ef">public&lt;/span> DateTimeOffset LastActive { &lt;span style="color:#66d9ef">get&lt;/span>; &lt;span style="color:#66d9ef">set&lt;/span>; }
&lt;span style="color:#66d9ef">public&lt;/span> List&amp;lt;SessionMessage&amp;gt; Messages { &lt;span style="color:#66d9ef">get&lt;/span>; &lt;span style="color:#66d9ef">set&lt;/span>; } = &lt;span style="color:#66d9ef">new&lt;/span>();
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> Summary { &lt;span style="color:#66d9ef">get&lt;/span>; &lt;span style="color:#66d9ef">set&lt;/span>; } = &lt;span style="color:#e6db74">&amp;#34;&amp;#34;&lt;/span>;
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">int&lt;/span> TotalTokensUsed { &lt;span style="color:#66d9ef">get&lt;/span>; &lt;span style="color:#66d9ef">set&lt;/span>; }
}
&lt;span style="color:#66d9ef">public&lt;/span> record SessionMessage
{
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> Id { &lt;span style="color:#66d9ef">get&lt;/span>; &lt;span style="color:#66d9ef">set&lt;/span>; } = &lt;span style="color:#e6db74">&amp;#34;&amp;#34;&lt;/span>;
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> Role { &lt;span style="color:#66d9ef">get&lt;/span>; &lt;span style="color:#66d9ef">set&lt;/span>; } = &lt;span style="color:#e6db74">&amp;#34;&amp;#34;&lt;/span>;
&lt;span style="color:#66d9ef">public&lt;/span> &lt;span style="color:#66d9ef">string&lt;/span> Content { &lt;span style="color:#66d9ef">get&lt;/span>; &lt;span style="color:#66d9ef">set&lt;/span>; } = &lt;span style="color:#e6db74">&amp;#34;&amp;#34;&lt;/span>;
&lt;span style="color:#66d9ef">public&lt;/span> DateTimeOffset Timestamp { &lt;span style="color:#66d9ef">get&lt;/span>; &lt;span style="color:#66d9ef">set&lt;/span>; }
}
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="43-cu-hnh-redis-cho-session-memory">4.3. Cấu hình Redis cho Session Memory&lt;/h3>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-yaml" data-lang="yaml">&lt;span style="color:#75715e"># redis-session.yml — cấu hình khuyến nghị cho production&lt;/span>
redis:
connection: &lt;span style="color:#e6db74">&amp;#34;redis://redis-host:6379&amp;#34;&lt;/span>
database: &lt;span style="color:#ae81ff">1&lt;/span> &lt;span style="color:#75715e"># Dùng DB riêng cho sessions&lt;/span>
key_prefix: &lt;span style="color:#e6db74">&amp;#34;session:&amp;#34;&lt;/span>
default_ttl: &lt;span style="color:#ae81ff">86400&lt;/span> &lt;span style="color:#75715e"># 24 giờ (sliding)&lt;/span>
max_memory: &lt;span style="color:#e6db74">&amp;#34;2gb&amp;#34;&lt;/span>
max_memory_policy: &lt;span style="color:#e6db74">&amp;#34;allkeys-lru&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Tự xóa key cũ khi hết RAM&lt;/span>
&lt;span style="color:#75715e"># Cluster mode cho production scale&lt;/span>
cluster:
enabled: &lt;span style="color:#66d9ef">true&lt;/span>
nodes:
- &lt;span style="color:#e6db74">&amp;#34;redis-1:6379&amp;#34;&lt;/span>
- &lt;span style="color:#e6db74">&amp;#34;redis-2:6379&amp;#34;&lt;/span>
- &lt;span style="color:#e6db74">&amp;#34;redis-3:6379&amp;#34;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
&lt;h2 id="5-persistent-long-term-memory--postgresql-schema">5. Persistent Long-term Memory — PostgreSQL Schema&lt;/h2>
&lt;p>Long-term memory lưu trữ thông tin &lt;strong>không bị xóa&lt;/strong> — hồ sơ người dùng, sở thích, lịch sử tương tác tích lũy qua nhiều session và nhiều tháng.&lt;/p>
&lt;h3 id="51-schema-postgresql">5.1. Schema PostgreSQL&lt;/h3>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sql" data-lang="sql">&lt;span style="color:#75715e">-- ============================================================
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">-- Schema: ai_memory
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">-- Mô tả: Long-term memory cho AI Agent
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">-- ============================================================
&lt;/span>&lt;span style="color:#75715e">&lt;/span>
&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">SCHEMA&lt;/span> &lt;span style="color:#66d9ef">IF&lt;/span> &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">EXISTS&lt;/span> ai_memory;
&lt;span style="color:#75715e">-- Hồ sơ người dùng tích lũy
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">TABLE&lt;/span> ai_memory.user_profiles (
id UUID &lt;span style="color:#66d9ef">PRIMARY&lt;/span> &lt;span style="color:#66d9ef">KEY&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> gen_random_uuid(),
user_id VARCHAR(&lt;span style="color:#ae81ff">128&lt;/span>) &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">UNIQUE&lt;/span>,
tenant_id VARCHAR(&lt;span style="color:#ae81ff">128&lt;/span>) &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span>,
display_name VARCHAR(&lt;span style="color:#ae81ff">256&lt;/span>),
&lt;span style="color:#66d9ef">language&lt;/span> VARCHAR(&lt;span style="color:#ae81ff">10&lt;/span>) &lt;span style="color:#66d9ef">DEFAULT&lt;/span> &lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">vi&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>,
timezone VARCHAR(&lt;span style="color:#ae81ff">64&lt;/span>) &lt;span style="color:#66d9ef">DEFAULT&lt;/span> &lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">Asia/Ho_Chi_Minh&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>,
&lt;span style="color:#75715e">-- Sở thích và hành vi tích lũy (JSONB cho linh hoạt)
&lt;/span>&lt;span style="color:#75715e">&lt;/span> preferences JSONB &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> &lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">{}&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>::jsonb,
&lt;span style="color:#75715e">/*&lt;/span>&lt;span style="color:#75715e">
&lt;/span>&lt;span style="color:#75715e"> Ví dụ preferences:
&lt;/span>&lt;span style="color:#75715e"> {
&lt;/span>&lt;span style="color:#75715e"> &amp;#34;communication_style&amp;#34;: &amp;#34;formal&amp;#34;,
&lt;/span>&lt;span style="color:#75715e"> &amp;#34;preferred_channel&amp;#34;: &amp;#34;email&amp;#34;,
&lt;/span>&lt;span style="color:#75715e"> &amp;#34;product_interests&amp;#34;: [&amp;#34;laptop&amp;#34;, &amp;#34;phụ kiện&amp;#34;],
&lt;/span>&lt;span style="color:#75715e"> &amp;#34;delivery_preference&amp;#34;: &amp;#34;morning&amp;#34;,
&lt;/span>&lt;span style="color:#75715e"> &amp;#34;language_level&amp;#34;: &amp;#34;technical&amp;#34;
&lt;/span>&lt;span style="color:#75715e"> }
&lt;/span>&lt;span style="color:#75715e"> &lt;/span>&lt;span style="color:#75715e">*/&lt;/span>
&lt;span style="color:#75715e">-- Tóm tắt ngữ cảnh từ các session trước
&lt;/span>&lt;span style="color:#75715e">&lt;/span> context_summary TEXT,
&lt;span style="color:#75715e">-- Metadata
&lt;/span>&lt;span style="color:#75715e">&lt;/span> first_seen_at TIMESTAMPTZ &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> NOW(),
last_seen_at TIMESTAMPTZ &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> NOW(),
total_sessions INT &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>,
total_messages INT &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>,
created_at TIMESTAMPTZ &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> NOW(),
updated_at TIMESTAMPTZ &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> NOW()
);
&lt;span style="color:#75715e">-- Log tương tác dài hạn (chỉ lưu sự kiện quan trọng, không lưu mọi tin nhắn)
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">TABLE&lt;/span> ai_memory.interaction_logs (
id UUID &lt;span style="color:#66d9ef">PRIMARY&lt;/span> &lt;span style="color:#66d9ef">KEY&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> gen_random_uuid(),
user_id VARCHAR(&lt;span style="color:#ae81ff">128&lt;/span>) &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">REFERENCES&lt;/span> ai_memory.user_profiles(user_id),
tenant_id VARCHAR(&lt;span style="color:#ae81ff">128&lt;/span>) &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span>,
session_id VARCHAR(&lt;span style="color:#ae81ff">256&lt;/span>),
event_type VARCHAR(&lt;span style="color:#ae81ff">64&lt;/span>) &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span>,
&lt;span style="color:#75715e">-- Các event_type mẫu:
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">-- &amp;#39;preference_update&amp;#39;, &amp;#39;issue_reported&amp;#39;, &amp;#39;purchase_intent&amp;#39;,
&lt;/span>&lt;span style="color:#75715e">&lt;/span> &lt;span style="color:#75715e">-- &amp;#39;complaint&amp;#39;, &amp;#39;compliment&amp;#39;, &amp;#39;topic_discussed&amp;#39;, &amp;#39;goal_achieved&amp;#39;
&lt;/span>&lt;span style="color:#75715e">&lt;/span>
summary TEXT &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span>, &lt;span style="color:#75715e">-- Tóm tắt ngắn sự kiện
&lt;/span>&lt;span style="color:#75715e">&lt;/span> detail JSONB, &lt;span style="color:#75715e">-- Chi tiết đầy đủ nếu cần
&lt;/span>&lt;span style="color:#75715e">&lt;/span> importance SMALLINT &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> &lt;span style="color:#ae81ff">3&lt;/span> &lt;span style="color:#66d9ef">CHECK&lt;/span> (importance &lt;span style="color:#66d9ef">BETWEEN&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span> &lt;span style="color:#66d9ef">AND&lt;/span> &lt;span style="color:#ae81ff">5&lt;/span>),
&lt;span style="color:#75715e">-- 1=trivial, 2=low, 3=medium, 4=high, 5=critical
&lt;/span>&lt;span style="color:#75715e">&lt;/span>
tags TEXT[] &lt;span style="color:#66d9ef">DEFAULT&lt;/span> &lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">{}&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>,
&lt;span style="color:#75715e">-- Memory decay: tự xóa sau thời gian nếu importance thấp
&lt;/span>&lt;span style="color:#75715e">&lt;/span> expires_at TIMESTAMPTZ,
created_at TIMESTAMPTZ &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> NOW()
);
&lt;span style="color:#75715e">-- Key-value store cho memory ngắn hơn long-term nhưng cần persist (không muốn dùng Redis)
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">TABLE&lt;/span> ai_memory.memory_items (
id UUID &lt;span style="color:#66d9ef">PRIMARY&lt;/span> &lt;span style="color:#66d9ef">KEY&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> gen_random_uuid(),
user_id VARCHAR(&lt;span style="color:#ae81ff">128&lt;/span>) &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span>,
tenant_id VARCHAR(&lt;span style="color:#ae81ff">128&lt;/span>) &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span>,
memory_key VARCHAR(&lt;span style="color:#ae81ff">256&lt;/span>) &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span>,
memory_value TEXT &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span>,
memory_type VARCHAR(&lt;span style="color:#ae81ff">64&lt;/span>) &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> &lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">fact&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>,
&lt;span style="color:#75715e">-- &amp;#39;fact&amp;#39;, &amp;#39;preference&amp;#39;, &amp;#39;goal&amp;#39;, &amp;#39;constraint&amp;#39;, &amp;#39;skill&amp;#39;, &amp;#39;relationship&amp;#39;
&lt;/span>&lt;span style="color:#75715e">&lt;/span>
&lt;span style="color:#66d9ef">source&lt;/span> VARCHAR(&lt;span style="color:#ae81ff">128&lt;/span>), &lt;span style="color:#75715e">-- Session ID nguồn gốc
&lt;/span>&lt;span style="color:#75715e">&lt;/span> confidence DECIMAL(&lt;span style="color:#ae81ff">3&lt;/span>,&lt;span style="color:#ae81ff">2&lt;/span>) &lt;span style="color:#66d9ef">DEFAULT&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>.&lt;span style="color:#ae81ff">0&lt;/span> &lt;span style="color:#66d9ef">CHECK&lt;/span> (confidence &lt;span style="color:#66d9ef">BETWEEN&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span> &lt;span style="color:#66d9ef">AND&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span>),
importance SMALLINT &lt;span style="color:#66d9ef">DEFAULT&lt;/span> &lt;span style="color:#ae81ff">3&lt;/span> &lt;span style="color:#66d9ef">CHECK&lt;/span> (importance &lt;span style="color:#66d9ef">BETWEEN&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span> &lt;span style="color:#66d9ef">AND&lt;/span> &lt;span style="color:#ae81ff">5&lt;/span>),
&lt;span style="color:#75715e">-- Deduplication
&lt;/span>&lt;span style="color:#75715e">&lt;/span> content_hash VARCHAR(&lt;span style="color:#ae81ff">64&lt;/span>) &lt;span style="color:#66d9ef">GENERATED&lt;/span> ALWAYS &lt;span style="color:#66d9ef">AS&lt;/span> (
encode(sha256(user_id::bytea &lt;span style="color:#f92672">|&lt;/span>&lt;span style="color:#f92672">|&lt;/span> memory_key::bytea &lt;span style="color:#f92672">|&lt;/span>&lt;span style="color:#f92672">|&lt;/span> memory_value::bytea), &lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">hex&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>)
) STORED,
access_count INT &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> &lt;span style="color:#ae81ff">0&lt;/span>,
last_accessed TIMESTAMPTZ,
expires_at TIMESTAMPTZ,
created_at TIMESTAMPTZ &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> NOW(),
updated_at TIMESTAMPTZ &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span> &lt;span style="color:#66d9ef">DEFAULT&lt;/span> NOW(),
&lt;span style="color:#66d9ef">UNIQUE&lt;/span>(user_id, tenant_id, memory_key)
);
&lt;span style="color:#75715e">-- Indexes cho hiệu suất
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">INDEX&lt;/span> idx_user_profiles_tenant &lt;span style="color:#66d9ef">ON&lt;/span> ai_memory.user_profiles(tenant_id);
&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">INDEX&lt;/span> idx_user_profiles_last &lt;span style="color:#66d9ef">ON&lt;/span> ai_memory.user_profiles(last_seen_at &lt;span style="color:#66d9ef">DESC&lt;/span>);
&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">INDEX&lt;/span> idx_interaction_logs_user &lt;span style="color:#66d9ef">ON&lt;/span> ai_memory.interaction_logs(user_id, created_at &lt;span style="color:#66d9ef">DESC&lt;/span>);
&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">INDEX&lt;/span> idx_interaction_logs_type &lt;span style="color:#66d9ef">ON&lt;/span> ai_memory.interaction_logs(event_type, tenant_id);
&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">INDEX&lt;/span> idx_interaction_logs_tags &lt;span style="color:#66d9ef">ON&lt;/span> ai_memory.interaction_logs &lt;span style="color:#66d9ef">USING&lt;/span> GIN(tags);
&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">INDEX&lt;/span> idx_memory_items_user &lt;span style="color:#66d9ef">ON&lt;/span> ai_memory.memory_items(user_id, tenant_id);
&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">INDEX&lt;/span> idx_memory_items_type &lt;span style="color:#66d9ef">ON&lt;/span> ai_memory.memory_items(memory_type, user_id);
&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">INDEX&lt;/span> idx_memory_items_expires &lt;span style="color:#66d9ef">ON&lt;/span> ai_memory.memory_items(expires_at)
&lt;span style="color:#66d9ef">WHERE&lt;/span> expires_at &lt;span style="color:#66d9ef">IS&lt;/span> &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">NULL&lt;/span>;
&lt;span style="color:#75715e">-- Trigger cập nhật updated_at tự động
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">OR&lt;/span> &lt;span style="color:#66d9ef">REPLACE&lt;/span> &lt;span style="color:#66d9ef">FUNCTION&lt;/span> ai_memory.set_updated_at()
&lt;span style="color:#66d9ef">RETURNS&lt;/span> &lt;span style="color:#66d9ef">TRIGGER&lt;/span> &lt;span style="color:#66d9ef">AS&lt;/span> &lt;span style="color:#960050;background-color:#1e0010">$&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">$&lt;/span>
&lt;span style="color:#66d9ef">BEGIN&lt;/span> &lt;span style="color:#66d9ef">NEW&lt;/span>.updated_at &lt;span style="color:#f92672">=&lt;/span> NOW(); &lt;span style="color:#66d9ef">RETURN&lt;/span> &lt;span style="color:#66d9ef">NEW&lt;/span>; &lt;span style="color:#66d9ef">END&lt;/span>;
&lt;span style="color:#960050;background-color:#1e0010">$&lt;/span>&lt;span style="color:#960050;background-color:#1e0010">$&lt;/span> &lt;span style="color:#66d9ef">LANGUAGE&lt;/span> plpgsql;
&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">TRIGGER&lt;/span> trg_user_profiles_updated
&lt;span style="color:#66d9ef">BEFORE&lt;/span> &lt;span style="color:#66d9ef">UPDATE&lt;/span> &lt;span style="color:#66d9ef">ON&lt;/span> ai_memory.user_profiles
&lt;span style="color:#66d9ef">FOR&lt;/span> &lt;span style="color:#66d9ef">EACH&lt;/span> &lt;span style="color:#66d9ef">ROW&lt;/span> &lt;span style="color:#66d9ef">EXECUTE&lt;/span> &lt;span style="color:#66d9ef">FUNCTION&lt;/span> ai_memory.set_updated_at();
&lt;span style="color:#66d9ef">CREATE&lt;/span> &lt;span style="color:#66d9ef">TRIGGER&lt;/span> trg_memory_items_updated
&lt;span style="color:#66d9ef">BEFORE&lt;/span> &lt;span style="color:#66d9ef">UPDATE&lt;/span> &lt;span style="color:#66d9ef">ON&lt;/span> ai_memory.memory_items
&lt;span style="color:#66d9ef">FOR&lt;/span> &lt;span style="color:#66d9ef">EACH&lt;/span> &lt;span style="color:#66d9ef">ROW&lt;/span> &lt;span style="color:#66d9ef">EXECUTE&lt;/span> &lt;span style="color:#66d9ef">FUNCTION&lt;/span> ai_memory.set_updated_at();
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="52-migration-strategy">5.2. Migration Strategy&lt;/h3>
&lt;p>Khi cần thay đổi schema trong production:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-sql" data-lang="sql">&lt;span style="color:#75715e">-- Migration V2: Thêm cột emotion_profile vào user_profiles
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">-- File: migrations/V2__add_emotion_profile.sql
&lt;/span>&lt;span style="color:#75715e">&lt;/span>
&lt;span style="color:#66d9ef">ALTER&lt;/span> &lt;span style="color:#66d9ef">TABLE&lt;/span> ai_memory.user_profiles
&lt;span style="color:#66d9ef">ADD&lt;/span> &lt;span style="color:#66d9ef">COLUMN&lt;/span> &lt;span style="color:#66d9ef">IF&lt;/span> &lt;span style="color:#66d9ef">NOT&lt;/span> &lt;span style="color:#66d9ef">EXISTS&lt;/span> emotion_profile JSONB &lt;span style="color:#66d9ef">DEFAULT&lt;/span> &lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">{}&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>::jsonb;
&lt;span style="color:#66d9ef">COMMENT&lt;/span> &lt;span style="color:#66d9ef">ON&lt;/span> &lt;span style="color:#66d9ef">COLUMN&lt;/span> ai_memory.user_profiles.emotion_profile &lt;span style="color:#66d9ef">IS&lt;/span>
&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">Xu hướng cảm xúc tích lũy: { &amp;#34;avg_sentiment&amp;#34;: 0.7, &amp;#34;frustration_signals&amp;#34;: 2 }&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>;
&lt;span style="color:#75715e">-- Backfill: giá trị mặc định cho các bản ghi cũ đã được handle bởi DEFAULT
&lt;/span>&lt;span style="color:#75715e">&lt;/span>&lt;span style="color:#75715e">-- Không cần UPDATE toàn bộ bảng nếu đã có DEFAULT.
&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
&lt;h2 id="6-semantic-memory--vector-store-cho-k-c-ng-ngha">6. Semantic Memory — Vector Store cho ký ức ngữ nghĩa&lt;/h2>
&lt;p>Semantic Memory cho phép agent &lt;strong>tìm lại ký ức liên quan&lt;/strong> mà không cần nhớ key hay thứ tự thời gian — chỉ cần mô tả ngữ nghĩa gần với nội dung cần tìm.&lt;/p>
&lt;h3 id="61-kin-trc-semantic-memory--rag">6.1. Kiến trúc Semantic Memory + RAG&lt;/h3>
&lt;pre>&lt;code>Người dùng: &amp;quot;Tôi đã từng phàn nàn về vấn đề gì với sản phẩm này chưa?&amp;quot;
│
▼
┌───────────────────────┐
│ Semantic Memory │
│ Retrieval Pipeline │
└────────┬──────────────┘
│
┌────────▼──────────────────────────────────────────┐
│ 1. Embed câu hỏi → query vector [0.12, -0.34...] │
└────────┬──────────────────────────────────────────┘
│
┌────────▼──────────────────────────────────────────┐
│ 2. Similarity search trong Qdrant/pgvector │
│ Filter: user_id = &amp;quot;usr_456&amp;quot; │
│ Top-K: 5 ký ức liên quan nhất │
└────────┬──────────────────────────────────────────┘
│
┌────────▼──────────────────────────────────────────┐
│ 3. Re-rank theo: │
│ - Similarity score │
│ - Importance (1-5) │
│ - Recency (gần đây hơn = ưu tiên hơn) │
└────────┬──────────────────────────────────────────┘
│
┌────────▼──────────────────────────────────────────┐
│ 4. Inject vào context window: │
│ [RELEVANT MEMORIES]: │
│ - 12/03: Phàn nàn pin laptop hao nhanh │
│ - 05/04: Báo lỗi bàn phím phím Space kẹt │
└────────┬──────────────────────────────────────────┘
│
▼
LLM sinh câu trả lời có ngữ cảnh đầy đủ
&lt;/code>&lt;/pre>&lt;h3 id="62-python--langchain--qdrant-semantic-memory">6.2. Python — LangChain + Qdrant Semantic Memory&lt;/h3>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-python" data-lang="python">&lt;span style="color:#f92672">from&lt;/span> langchain_openai &lt;span style="color:#f92672">import&lt;/span> OpenAIEmbeddings, ChatOpenAI
&lt;span style="color:#f92672">from&lt;/span> langchain_qdrant &lt;span style="color:#f92672">import&lt;/span> QdrantVectorStore
&lt;span style="color:#f92672">from&lt;/span> langchain.memory &lt;span style="color:#f92672">import&lt;/span> VectorStoreRetrieverMemory
&lt;span style="color:#f92672">from&lt;/span> langchain.chains &lt;span style="color:#f92672">import&lt;/span> ConversationChain
&lt;span style="color:#f92672">from&lt;/span> langchain.prompts &lt;span style="color:#f92672">import&lt;/span> PromptTemplate
&lt;span style="color:#f92672">from&lt;/span> qdrant_client &lt;span style="color:#f92672">import&lt;/span> QdrantClient
&lt;span style="color:#f92672">from&lt;/span> qdrant_client.models &lt;span style="color:#f92672">import&lt;/span> Distance, VectorParams
&lt;span style="color:#f92672">from&lt;/span> datetime &lt;span style="color:#f92672">import&lt;/span> datetime
&lt;span style="color:#f92672">import&lt;/span> uuid
&lt;span style="color:#75715e"># ============================================================&lt;/span>
&lt;span style="color:#75715e"># Bước 1: Khởi tạo Qdrant collection cho semantic memory&lt;/span>
&lt;span style="color:#75715e"># ============================================================&lt;/span>
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">init_semantic_memory_store&lt;/span>(
qdrant_url: str,
collection_name: str &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">agent_memories&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>,
vector_size: int &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1536&lt;/span> &lt;span style="color:#75715e"># OpenAI text-embedding-3-small&lt;/span>
) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> QdrantVectorStore:
client &lt;span style="color:#f92672">=&lt;/span> QdrantClient(url&lt;span style="color:#f92672">=&lt;/span>qdrant_url)
&lt;span style="color:#75715e"># Tạo collection nếu chưa tồn tại&lt;/span>
existing &lt;span style="color:#f92672">=&lt;/span> [c&lt;span style="color:#f92672">.&lt;/span>name &lt;span style="color:#66d9ef">for&lt;/span> c &lt;span style="color:#f92672">in&lt;/span> client&lt;span style="color:#f92672">.&lt;/span>get_collections()&lt;span style="color:#f92672">.&lt;/span>collections]
&lt;span style="color:#66d9ef">if&lt;/span> collection_name &lt;span style="color:#f92672">not&lt;/span> &lt;span style="color:#f92672">in&lt;/span> existing:
client&lt;span style="color:#f92672">.&lt;/span>create_collection(
collection_name&lt;span style="color:#f92672">=&lt;/span>collection_name,
vectors_config&lt;span style="color:#f92672">=&lt;/span>VectorParams(
size&lt;span style="color:#f92672">=&lt;/span>vector_size,
distance&lt;span style="color:#f92672">=&lt;/span>Distance&lt;span style="color:#f92672">.&lt;/span>COSINE
)
)
embeddings &lt;span style="color:#f92672">=&lt;/span> OpenAIEmbeddings(model&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">text-embedding-3-small&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>)
&lt;span style="color:#66d9ef">return&lt;/span> QdrantVectorStore(
client&lt;span style="color:#f92672">=&lt;/span>client,
collection_name&lt;span style="color:#f92672">=&lt;/span>collection_name,
embedding&lt;span style="color:#f92672">=&lt;/span>embeddings
)
&lt;span style="color:#75715e"># ============================================================&lt;/span>
&lt;span style="color:#75715e"># Bước 2: SemanticMemoryManager — lưu và truy vấn ký ức&lt;/span>
&lt;span style="color:#75715e"># ============================================================&lt;/span>
&lt;span style="color:#66d9ef">class&lt;/span> &lt;span style="color:#a6e22e">SemanticMemoryManager&lt;/span>:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Quản lý semantic memory cho một user cụ thể.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Mỗi ký ức là một đoạn text có metadata: user_id, importance, timestamp.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
IMPORTANCE_THRESHOLD &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">3&lt;/span> &lt;span style="color:#75715e"># Chỉ lưu ký ức có importance &amp;gt;= 3&lt;/span>
&lt;span style="color:#66d9ef">def&lt;/span> __init__(self, vector_store: QdrantVectorStore, user_id: str):
self&lt;span style="color:#f92672">.&lt;/span>_store &lt;span style="color:#f92672">=&lt;/span> vector_store
self&lt;span style="color:#f92672">.&lt;/span>_user_id &lt;span style="color:#f92672">=&lt;/span> user_id
self&lt;span style="color:#f92672">.&lt;/span>_embeddings &lt;span style="color:#f92672">=&lt;/span> OpenAIEmbeddings(model&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">text-embedding-3-small&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>)
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">save_memory&lt;/span>(
self,
content: str,
memory_type: str &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">fact&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>,
importance: int &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">3&lt;/span>,
tags: list[str] &lt;span style="color:#f92672">|&lt;/span> None &lt;span style="color:#f92672">=&lt;/span> None
) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> str &lt;span style="color:#f92672">|&lt;/span> None:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Lưu một ký ức vào vector store.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Chỉ lưu nếu importance &amp;gt;= ngưỡng.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Trả về memory_id nếu lưu thành công, None nếu bỏ qua.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> importance &lt;span style="color:#f92672">&amp;lt;&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>IMPORTANCE_THRESHOLD:
&lt;span style="color:#66d9ef">return&lt;/span> None &lt;span style="color:#75715e"># Không đủ quan trọng để ghi nhớ lâu dài&lt;/span>
memory_id &lt;span style="color:#f92672">=&lt;/span> str(uuid&lt;span style="color:#f92672">.&lt;/span>uuid4())
metadata &lt;span style="color:#f92672">=&lt;/span> {
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">user_id&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: self&lt;span style="color:#f92672">.&lt;/span>_user_id,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">memory_id&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: memory_id,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">memory_type&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: memory_type,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">importance&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: importance,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">tags&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: tags &lt;span style="color:#f92672">or&lt;/span> [],
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">created_at&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: datetime&lt;span style="color:#f92672">.&lt;/span>utcnow()&lt;span style="color:#f92672">.&lt;/span>isoformat(),
}
self&lt;span style="color:#f92672">.&lt;/span>_store&lt;span style="color:#f92672">.&lt;/span>add_texts(
texts&lt;span style="color:#f92672">=&lt;/span>[content],
metadatas&lt;span style="color:#f92672">=&lt;/span>[metadata],
ids&lt;span style="color:#f92672">=&lt;/span>[memory_id]
)
&lt;span style="color:#66d9ef">return&lt;/span> memory_id
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">recall&lt;/span>(
self,
query: str,
top_k: int &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">5&lt;/span>,
memory_type: str &lt;span style="color:#f92672">|&lt;/span> None &lt;span style="color:#f92672">=&lt;/span> None
) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> list[dict]:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Tìm kiếm ký ức liên quan theo ngữ nghĩa.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Có thể filter theo memory_type.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
filter_condition &lt;span style="color:#f92672">=&lt;/span> {&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">user_id&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: self&lt;span style="color:#f92672">.&lt;/span>_user_id}
&lt;span style="color:#66d9ef">if&lt;/span> memory_type:
filter_condition[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">memory_type&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>] &lt;span style="color:#f92672">=&lt;/span> memory_type
results &lt;span style="color:#f92672">=&lt;/span> self&lt;span style="color:#f92672">.&lt;/span>_store&lt;span style="color:#f92672">.&lt;/span>similarity_search_with_score(
query&lt;span style="color:#f92672">=&lt;/span>query,
k&lt;span style="color:#f92672">=&lt;/span>top_k,
filter&lt;span style="color:#f92672">=&lt;/span>filter_condition
)
&lt;span style="color:#75715e"># Re-rank: kết hợp similarity score + importance&lt;/span>
memories &lt;span style="color:#f92672">=&lt;/span> []
&lt;span style="color:#66d9ef">for&lt;/span> doc, score &lt;span style="color:#f92672">in&lt;/span> results:
importance &lt;span style="color:#f92672">=&lt;/span> doc&lt;span style="color:#f92672">.&lt;/span>metadata&lt;span style="color:#f92672">.&lt;/span>get(&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">importance&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#ae81ff">3&lt;/span>)
&lt;span style="color:#75715e"># Công thức re-rank đơn giản: 0.7 * similarity + 0.3 * (importance/5)&lt;/span>
combined_score &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">0.7&lt;/span> &lt;span style="color:#f92672">*&lt;/span> score &lt;span style="color:#f92672">+&lt;/span> &lt;span style="color:#ae81ff">0.3&lt;/span> &lt;span style="color:#f92672">*&lt;/span> (importance &lt;span style="color:#f92672">/&lt;/span> &lt;span style="color:#ae81ff">5&lt;/span>)
memories&lt;span style="color:#f92672">.&lt;/span>append({
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">content&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: doc&lt;span style="color:#f92672">.&lt;/span>page_content,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">metadata&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: doc&lt;span style="color:#f92672">.&lt;/span>metadata,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">similarity&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: round(score, &lt;span style="color:#ae81ff">4&lt;/span>),
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">combined_score&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: round(combined_score, &lt;span style="color:#ae81ff">4&lt;/span>)
})
&lt;span style="color:#75715e"># Sắp xếp theo combined_score giảm dần&lt;/span>
memories&lt;span style="color:#f92672">.&lt;/span>sort(key&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#66d9ef">lambda&lt;/span> x: x[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">combined_score&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>], reverse&lt;span style="color:#f92672">=&lt;/span>True)
&lt;span style="color:#66d9ef">return&lt;/span> memories
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">format_for_context&lt;/span>(self, memories: list[dict]) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> str:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Định dạng ký ức để inject vào context window.&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> &lt;span style="color:#f92672">not&lt;/span> memories:
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
lines &lt;span style="color:#f92672">=&lt;/span> [&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">[KÝ ỨC LIÊN QUAN CỦA NGƯỜI DÙNG]:&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>]
&lt;span style="color:#66d9ef">for&lt;/span> m &lt;span style="color:#f92672">in&lt;/span> memories:
date &lt;span style="color:#f92672">=&lt;/span> m[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">metadata&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>]&lt;span style="color:#f92672">.&lt;/span>get(&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">created_at&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>)[:&lt;span style="color:#ae81ff">10&lt;/span>]
mtype &lt;span style="color:#f92672">=&lt;/span> m[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">metadata&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>]&lt;span style="color:#f92672">.&lt;/span>get(&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">memory_type&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">fact&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>)
lines&lt;span style="color:#f92672">.&lt;/span>append(f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">- [{date}][{mtype}] {m[&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">content&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">]}&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>)
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#ae81ff">\n&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#f92672">.&lt;/span>join(lines)
&lt;span style="color:#75715e"># ============================================================&lt;/span>
&lt;span style="color:#75715e"># Bước 3: Tích hợp với LangChain ConversationChain&lt;/span>
&lt;span style="color:#75715e"># ============================================================&lt;/span>
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">build_agent_with_semantic_memory&lt;/span>(
qdrant_url: str,
user_id: str
) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> tuple[ConversationChain, SemanticMemoryManager]:
vector_store &lt;span style="color:#f92672">=&lt;/span> init_semantic_memory_store(qdrant_url)
memory_mgr &lt;span style="color:#f92672">=&lt;/span> SemanticMemoryManager(vector_store, user_id)
retriever &lt;span style="color:#f92672">=&lt;/span> vector_store&lt;span style="color:#f92672">.&lt;/span>as_retriever(
search_kwargs&lt;span style="color:#f92672">=&lt;/span>{
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">k&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">4&lt;/span>,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">filter&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: {&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">user_id&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: user_id}
}
)
lc_memory &lt;span style="color:#f92672">=&lt;/span> VectorStoreRetrieverMemory(retriever&lt;span style="color:#f92672">=&lt;/span>retriever)
prompt &lt;span style="color:#f92672">=&lt;/span> PromptTemplate(
input_variables&lt;span style="color:#f92672">=&lt;/span>[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">history&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">input&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>],
template&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Bạn là trợ lý AI hỗ trợ khách hàng thông minh.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">Thông tin từ các tương tác trước đây:&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">{history}&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">Hội thoại hiện tại:&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">Người dùng: {input}&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">Trợ lý:&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
)
llm &lt;span style="color:#f92672">=&lt;/span> ChatOpenAI(model&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">gpt-4o-mini&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, temperature&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">0.1&lt;/span>)
chain &lt;span style="color:#f92672">=&lt;/span> ConversationChain(
llm&lt;span style="color:#f92672">=&lt;/span>llm,
prompt&lt;span style="color:#f92672">=&lt;/span>prompt,
memory&lt;span style="color:#f92672">=&lt;/span>lc_memory,
verbose&lt;span style="color:#f92672">=&lt;/span>False
)
&lt;span style="color:#66d9ef">return&lt;/span> chain, memory_mgr
&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
&lt;h2 id="7-memory-retrieval-strategy-khi-no-dng-loi-no">7. Memory Retrieval Strategy: Khi nào dùng loại nào&lt;/h2>
&lt;h3 id="71-decision-tree--chn-loi-memory-ph-hp">7.1. Decision Tree — Chọn loại Memory phù hợp&lt;/h3>
&lt;pre>&lt;code>Bắt đầu: Agent nhận một yêu cầu mới từ người dùng
│
▼
┌───────────────────────────────┐
│ Thông tin có trong context │
│ window hiện tại không? │
└──────────┬────────────────────┘
│
┌─────────┴─────────┐
YES NO
│ │
▼ ▼
Dùng IN-CONTEXT Cần tìm ở đâu?
MEMORY trực tiếp │
┌────────┴─────────────────────┐
│ │
┌──────────▼──────────┐ ┌────────────▼──────────┐
│ Thông tin từ cùng │ │ Thông tin từ nhiều │
│ phiên làm việc │ │ phiên trước? │
│ hôm nay? │ └────────────┬──────────┘
└──────────┬──────────┘ │
│ ┌──────────┴──────────┐
YES │ │
│ Tìm theo KEY Tìm theo NGỮ NGHĨA
▼ (user_id, type) (không biết key cụ thể)
SESSION MEMORY │ │
(Redis, ~1ms) ▼ ▼
PERSISTENT MEMORY SEMANTIC MEMORY
(PostgreSQL, ~20ms) (Qdrant, ~30-50ms)
&lt;/code>&lt;/pre>&lt;h3 id="72-hybrid-retrieval--kt-hp-session--semantic">7.2. Hybrid Retrieval — Kết hợp Session + Semantic&lt;/h3>
&lt;p>Chiến lược tối ưu nhất cho production: &lt;strong>luôn truy vấn cả 2 nguồn song song&lt;/strong>, merge kết quả:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-python" data-lang="python">&lt;span style="color:#f92672">import&lt;/span> asyncio
&lt;span style="color:#f92672">from&lt;/span> dataclasses &lt;span style="color:#f92672">import&lt;/span> dataclass
&lt;span style="color:#a6e22e">@dataclass&lt;/span>
&lt;span style="color:#66d9ef">class&lt;/span> &lt;span style="color:#a6e22e">MemoryContext&lt;/span>:
session_messages: list[dict]
semantic_memories: list[dict]
user_profile: dict &lt;span style="color:#f92672">|&lt;/span> None
async &lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">hybrid_memory_retrieval&lt;/span>(
session_id: str,
user_id: str,
current_query: str,
session_store: RedisSessionStore, &lt;span style="color:#75715e"># type: ignore&lt;/span>
semantic_mgr: SemanticMemoryManager,
profile_repo: object &lt;span style="color:#75715e"># type: ignore&lt;/span>
) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> MemoryContext:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Truy vấn song song cả session memory và semantic memory.&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
session_task &lt;span style="color:#f92672">=&lt;/span> session_store&lt;span style="color:#f92672">.&lt;/span>get_async(session_id)
semantic_task &lt;span style="color:#f92672">=&lt;/span> asyncio&lt;span style="color:#f92672">.&lt;/span>to_thread(
semantic_mgr&lt;span style="color:#f92672">.&lt;/span>recall, current_query, top_k&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">4&lt;/span>
)
profile_task &lt;span style="color:#f92672">=&lt;/span> asyncio&lt;span style="color:#f92672">.&lt;/span>to_thread(
profile_repo&lt;span style="color:#f92672">.&lt;/span>get_by_user_id, user_id &lt;span style="color:#75715e"># type: ignore&lt;/span>
)
session_data, semantic_results, profile &lt;span style="color:#f92672">=&lt;/span> await asyncio&lt;span style="color:#f92672">.&lt;/span>gather(
session_task, semantic_task, profile_task
)
&lt;span style="color:#66d9ef">return&lt;/span> MemoryContext(
session_messages&lt;span style="color:#f92672">=&lt;/span>session_data&lt;span style="color:#f92672">.&lt;/span>messages &lt;span style="color:#66d9ef">if&lt;/span> session_data &lt;span style="color:#66d9ef">else&lt;/span> [],
semantic_memories&lt;span style="color:#f92672">=&lt;/span>semantic_results,
user_profile&lt;span style="color:#f92672">=&lt;/span>profile
)
&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
&lt;h2 id="8-context-window-management-nng-cao">8. Context Window Management nâng cao&lt;/h2>
&lt;h3 id="81-bn-chin-lc-chnh">8.1. Bốn chiến lược chính&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Chiến lược&lt;/th>
&lt;th>Mô tả&lt;/th>
&lt;th>Ưu điểm&lt;/th>
&lt;th>Nhược điểm&lt;/th>
&lt;th>Phù hợp với&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>Sliding Window&lt;/strong>&lt;/td>
&lt;td>Giữ N tin nhắn gần nhất&lt;/td>
&lt;td>Đơn giản, dễ implement&lt;/td>
&lt;td>Mất thông tin quan trọng đầu session&lt;/td>
&lt;td>FAQ bot, session ngắn&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Summary Buffer&lt;/strong>&lt;/td>
&lt;td>Tóm tắt phần cũ khi đầy&lt;/td>
&lt;td>Giữ thông tin key, token hiệu quả&lt;/td>
&lt;td>Cần gọi LLM thêm để tóm tắt&lt;/td>
&lt;td>CS bot, session trung bình&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Entity Memory&lt;/strong>&lt;/td>
&lt;td>Track entities (tên, mã đơn, sản phẩm) được đề cập&lt;/td>
&lt;td>Giữ facts quan trọng, ít token&lt;/td>
&lt;td>Cần NER pipeline&lt;/td>
&lt;td>Sales bot, healthcare bot&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>ConversationKG&lt;/strong>&lt;/td>
&lt;td>Knowledge Graph từ hội thoại&lt;/td>
&lt;td>Biểu diễn quan hệ phức tạp&lt;/td>
&lt;td>Phức tạp triển khai&lt;/td>
&lt;td>Research agent, phân tích hợp đồng&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="82-bng-so-snh-chi-tit">8.2. Bảng so sánh chi tiết&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Tiêu chí&lt;/th>
&lt;th>Sliding Window&lt;/th>
&lt;th>Summary Buffer&lt;/th>
&lt;th>Entity Memory&lt;/th>
&lt;th>ConversationKG&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>Độ phức tạp implement&lt;/strong>&lt;/td>
&lt;td>★☆☆☆☆&lt;/td>
&lt;td>★★★☆☆&lt;/td>
&lt;td>★★★☆☆&lt;/td>
&lt;td>★★★★★&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Token efficiency&lt;/strong>&lt;/td>
&lt;td>★★☆☆☆&lt;/td>
&lt;td>★★★★☆&lt;/td>
&lt;td>★★★★★&lt;/td>
&lt;td>★★★☆☆&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Giữ thông tin long-term&lt;/strong>&lt;/td>
&lt;td>★☆☆☆☆&lt;/td>
&lt;td>★★★☆☆&lt;/td>
&lt;td>★★★★☆&lt;/td>
&lt;td>★★★★★&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Tốc độ&lt;/strong>&lt;/td>
&lt;td>★★★★★&lt;/td>
&lt;td>★★★☆☆&lt;/td>
&lt;td>★★★★☆&lt;/td>
&lt;td>★★☆☆☆&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Chi phí API&lt;/strong>&lt;/td>
&lt;td>Thấp&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Thấp&lt;/td>
&lt;td>Cao&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Hỗ trợ LangChain&lt;/strong>&lt;/td>
&lt;td>✅&lt;/td>
&lt;td>✅&lt;/td>
&lt;td>✅&lt;/td>
&lt;td>✅ (beta)&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Hỗ trợ Semantic Kernel&lt;/strong>&lt;/td>
&lt;td>✅&lt;/td>
&lt;td>Tự implement&lt;/td>
&lt;td>Tự implement&lt;/td>
&lt;td>❌&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="83-khuyn-ngh-la-chn-theo-use-case">8.3. Khuyến nghị lựa chọn theo use case&lt;/h3>
&lt;pre>&lt;code>Use case │ Chiến lược khuyến nghị
───────────────────────┼──────────────────────────────────────────
FAQ chatbot đơn giản │ Sliding Window (20 tin nhắn)
Customer Support AI │ Summary Buffer + Entity Memory
Healthcare AI │ Entity Memory + Persistent Memory
Sales/CRM AI │ Entity Memory + Semantic Memory
Contract analysis │ ConversationKG + Semantic Memory
Personal Assistant │ Summary Buffer + Semantic Memory + Profile
&lt;/code>&lt;/pre>&lt;hr>
&lt;h2 id="9-user-profiling--personalization">9. User Profiling &amp;amp; Personalization&lt;/h2>
&lt;h3 id="91-xy-dng-h-s-ngi-dng-tch-ly">9.1. Xây dựng hồ sơ người dùng tích lũy&lt;/h3>
&lt;p>Hồ sơ người dùng không được tạo ra một lần — nó &lt;strong>tích lũy&lt;/strong> và &lt;strong>tự cập nhật&lt;/strong> qua từng tương tác:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-json" data-lang="json">{
&lt;span style="color:#f92672">&amp;#34;user_id&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;usr_456&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;tenant_id&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;tenant_ecommerce_01&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;display_name&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Nguyễn Văn An&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;language&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;vi&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;timezone&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Asia/Ho_Chi_Minh&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;preferences&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;communication_style&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;casual&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;response_length&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;concise&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;preferred_channel&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;zalo&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;delivery_time&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;morning&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;payment_method&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;momo&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;product_categories&amp;#34;&lt;/span>: [&lt;span style="color:#e6db74">&amp;#34;laptop&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;phụ kiện gaming&amp;#34;&lt;/span>],
&lt;span style="color:#f92672">&amp;#34;price_sensitivity&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;medium&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;brand_preferences&amp;#34;&lt;/span>: [&lt;span style="color:#e6db74">&amp;#34;Dell&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;ASUS&amp;#34;&lt;/span>]
},
&lt;span style="color:#f92672">&amp;#34;behavioral_patterns&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;avg_session_duration_minutes&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">12.5&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;peak_active_hours&amp;#34;&lt;/span>: [&lt;span style="color:#e6db74">&amp;#34;08:00-10:00&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;20:00-22:00&amp;#34;&lt;/span>],
&lt;span style="color:#f92672">&amp;#34;typical_query_types&amp;#34;&lt;/span>: [&lt;span style="color:#e6db74">&amp;#34;order_tracking&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;product_comparison&amp;#34;&lt;/span>],
&lt;span style="color:#f92672">&amp;#34;escalation_rate&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">0.05&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;satisfaction_trend&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;improving&amp;#34;&lt;/span>
},
&lt;span style="color:#f92672">&amp;#34;known_issues&amp;#34;&lt;/span>: [
{
&lt;span style="color:#f92672">&amp;#34;type&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;allergy&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;detail&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;dị ứng latex&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;recorded_at&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;2026-03-12&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;source_session&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;sess_xyz789&amp;#34;&lt;/span>
}
],
&lt;span style="color:#f92672">&amp;#34;interaction_summary&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;Khách hàng thân thiết, thường mua laptop gaming. Đã từng phàn nàn về thời gian giao hàng chậm vào tháng 3. Ưa phong cách giao tiếp thân mật, không thích câu trả lời dài dòng.&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;metrics&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;total_sessions&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">28&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;total_messages&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">312&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;purchases_assisted&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">4&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;tickets_raised&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">2&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;last_purchase_date&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;2026-04-20&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;lifetime_value_vnd&amp;#34;&lt;/span>: &lt;span style="color:#ae81ff">18500000&lt;/span>
},
&lt;span style="color:#f92672">&amp;#34;privacy&amp;#34;&lt;/span>: {
&lt;span style="color:#f92672">&amp;#34;consent_given&amp;#34;&lt;/span>: &lt;span style="color:#66d9ef">true&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;consent_date&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;2026-01-15&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;data_retention_until&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">&amp;#34;2029-01-15&amp;#34;&lt;/span>,
&lt;span style="color:#f92672">&amp;#34;pii_masked&amp;#34;&lt;/span>: &lt;span style="color:#66d9ef">false&lt;/span>
}
}
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="92-privacy-considerations">9.2. Privacy Considerations&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Tách biệt PII&lt;/strong>: Email, số điện thoại, CCCD không lưu trong profile summary&lt;/li>
&lt;li>&lt;strong>Consent tracking&lt;/strong>: Ghi nhận rõ thời điểm người dùng đồng ý lưu dữ liệu&lt;/li>
&lt;li>&lt;strong>Data minimization&lt;/strong>: Chỉ lưu những gì thực sự cần để cá nhân hóa&lt;/li>
&lt;li>&lt;strong>Right to forget&lt;/strong>: Xem mục 12 — cơ chế xóa toàn bộ memory theo yêu cầu&lt;/li>
&lt;li>&lt;strong>Tenant isolation&lt;/strong>: Mỗi tenant có namespace riêng, không thể cross-query&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="10-memory-write-policy--khi-no-ghi-khi-no-b-qua">10. Memory Write Policy — Khi nào ghi, khi nào bỏ qua&lt;/h2>
&lt;p>Không phải mọi tin nhắn đều đáng ghi vào long-term memory. Ghi không chọn lọc sẽ làm &lt;strong>nhiễu&lt;/strong> bộ nhớ và tăng chi phí.&lt;/p>
&lt;h3 id="101-importance-scoring">10.1. Importance Scoring&lt;/h3>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-python" data-lang="python">&lt;span style="color:#f92672">from&lt;/span> enum &lt;span style="color:#f92672">import&lt;/span> IntEnum
&lt;span style="color:#66d9ef">class&lt;/span> &lt;span style="color:#a6e22e">MemoryImportance&lt;/span>(IntEnum):
TRIVIAL &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">1&lt;/span> &lt;span style="color:#75715e"># &amp;#34;Ok&amp;#34;, &amp;#34;Cảm ơn&amp;#34;, lời chào&lt;/span>
LOW &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">2&lt;/span> &lt;span style="color:#75715e"># Câu hỏi chung, không cá nhân&lt;/span>
MEDIUM &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">3&lt;/span> &lt;span style="color:#75715e"># Thông tin hữu ích nhưng không critical&lt;/span>
HIGH &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">4&lt;/span> &lt;span style="color:#75715e"># Sở thích rõ ràng, vấn đề đã xảy ra&lt;/span>
CRITICAL &lt;span style="color:#f92672">=&lt;/span> &lt;span style="color:#ae81ff">5&lt;/span> &lt;span style="color:#75715e"># Dị ứng, yêu cầu đặc biệt, khiếu nại quan trọng&lt;/span>
&lt;span style="color:#75715e"># Bảng quy tắc đơn giản để scoring&lt;/span>
IMPORTANCE_RULES &lt;span style="color:#f92672">=&lt;/span> [
&lt;span style="color:#75715e"># (pattern, importance)&lt;/span>
([&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">dị ứng&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">không dùng được&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">cấm&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">tuyệt đối không&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>], MemoryImportance&lt;span style="color:#f92672">.&lt;/span>CRITICAL),
([&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">thích&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">muốn&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">ưa&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">hay dùng&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">thường xuyên&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>], MemoryImportance&lt;span style="color:#f92672">.&lt;/span>HIGH),
([&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">từng&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">lần trước&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">hôm qua&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">tuần trước&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>], MemoryImportance&lt;span style="color:#f92672">.&lt;/span>HIGH),
([&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">phàn nàn&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">tức&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">bực&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">thất vọng&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">tệ&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>], MemoryImportance&lt;span style="color:#f92672">.&lt;/span>HIGH),
([&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">hỏi về&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">muốn biết&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">giá bao nhiêu&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>], MemoryImportance&lt;span style="color:#f92672">.&lt;/span>LOW),
([&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">ok&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">được&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">cảm ơn&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">bye&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">tạm biệt&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>], MemoryImportance&lt;span style="color:#f92672">.&lt;/span>TRIVIAL),
]
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">score_importance&lt;/span>(message: str) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> MemoryImportance:
message_lower &lt;span style="color:#f92672">=&lt;/span> message&lt;span style="color:#f92672">.&lt;/span>lower()
best_score &lt;span style="color:#f92672">=&lt;/span> MemoryImportance&lt;span style="color:#f92672">.&lt;/span>LOW
&lt;span style="color:#66d9ef">for&lt;/span> keywords, importance &lt;span style="color:#f92672">in&lt;/span> IMPORTANCE_RULES:
&lt;span style="color:#66d9ef">if&lt;/span> any(kw &lt;span style="color:#f92672">in&lt;/span> message_lower &lt;span style="color:#66d9ef">for&lt;/span> kw &lt;span style="color:#f92672">in&lt;/span> keywords):
&lt;span style="color:#66d9ef">if&lt;/span> importance &lt;span style="color:#f92672">&amp;gt;&lt;/span> best_score:
best_score &lt;span style="color:#f92672">=&lt;/span> importance
&lt;span style="color:#66d9ef">return&lt;/span> best_score
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="102-memory-write-decision-flow">10.2. Memory Write Decision Flow&lt;/h3>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-python" data-lang="python">async &lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">decide_and_write_memory&lt;/span>(
user_id: str,
message: str,
session_context: dict,
memory_mgr: SemanticMemoryManager,
pg_repo: object &lt;span style="color:#75715e"># type: ignore&lt;/span>
) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> None:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Quyết định có lưu vào long-term memory không, và lưu ở đâu.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
importance &lt;span style="color:#f92672">=&lt;/span> score_importance(message)
&lt;span style="color:#75715e"># Quy tắc 1: Bỏ qua nếu quá tầm thường&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> importance &lt;span style="color:#f92672">&amp;lt;&lt;/span>&lt;span style="color:#f92672">=&lt;/span> MemoryImportance&lt;span style="color:#f92672">.&lt;/span>TRIVIAL:
&lt;span style="color:#66d9ef">return&lt;/span>
&lt;span style="color:#75715e"># Quy tắc 2: Kiểm tra deduplication (đã có memory tương tự chưa)&lt;/span>
similar &lt;span style="color:#f92672">=&lt;/span> memory_mgr&lt;span style="color:#f92672">.&lt;/span>recall(message, top_k&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">1&lt;/span>)
&lt;span style="color:#66d9ef">if&lt;/span> similar &lt;span style="color:#f92672">and&lt;/span> similar[&lt;span style="color:#ae81ff">0&lt;/span>][&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">similarity&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>] &lt;span style="color:#f92672">&amp;gt;&lt;/span> &lt;span style="color:#ae81ff">0.95&lt;/span>:
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#75715e"># Đã có ký ức gần như giống hệt, bỏ qua&lt;/span>
&lt;span style="color:#75715e"># Quy tắc 3: Ghi vào Semantic Memory nếu importance &amp;gt;= 3&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> importance &lt;span style="color:#f92672">&amp;gt;&lt;/span>&lt;span style="color:#f92672">=&lt;/span> MemoryImportance&lt;span style="color:#f92672">.&lt;/span>MEDIUM:
memory_mgr&lt;span style="color:#f92672">.&lt;/span>save_memory(
content&lt;span style="color:#f92672">=&lt;/span>message,
memory_type&lt;span style="color:#f92672">=&lt;/span>classify_memory_type(message),
importance&lt;span style="color:#f92672">=&lt;/span>int(importance)
)
&lt;span style="color:#75715e"># Quy tắc 4: Ghi vào PostgreSQL interaction_log nếu importance &amp;gt;= 4&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> importance &lt;span style="color:#f92672">&amp;gt;&lt;/span>&lt;span style="color:#f92672">=&lt;/span> MemoryImportance&lt;span style="color:#f92672">.&lt;/span>HIGH:
await pg_repo&lt;span style="color:#f92672">.&lt;/span>log_interaction( &lt;span style="color:#75715e"># type: ignore&lt;/span>
user_id&lt;span style="color:#f92672">=&lt;/span>user_id,
event_type&lt;span style="color:#f92672">=&lt;/span>classify_event_type(message),
summary&lt;span style="color:#f92672">=&lt;/span>message[:&lt;span style="color:#ae81ff">500&lt;/span>],
importance&lt;span style="color:#f92672">=&lt;/span>int(importance),
&lt;span style="color:#75715e"># Memory decay: ký ức LOW tự xóa sau 90 ngày&lt;/span>
expires_at&lt;span style="color:#f92672">=&lt;/span>(
None &lt;span style="color:#66d9ef">if&lt;/span> importance &lt;span style="color:#f92672">&amp;gt;&lt;/span>&lt;span style="color:#f92672">=&lt;/span> MemoryImportance&lt;span style="color:#f92672">.&lt;/span>HIGH
&lt;span style="color:#66d9ef">else&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">NOW() + INTERVAL &lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">90 days&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
)
)
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="103-memory-decay--ttl-cho-long-term-memory">10.3. Memory Decay — TTL cho Long-term Memory&lt;/h3>
&lt;p>Không phải mọi ký ức đều cần giữ mãi mãi. Thiết lập TTL theo importance:&lt;/p>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Importance Level&lt;/th>
&lt;th>TTL khuyến nghị&lt;/th>
&lt;th>Ví dụ&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>CRITICAL (5)&lt;/strong>&lt;/td>
&lt;td>Không hết hạn&lt;/td>
&lt;td>Dị ứng, yêu cầu đặc biệt về sức khỏe&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>HIGH (4)&lt;/strong>&lt;/td>
&lt;td>2 năm&lt;/td>
&lt;td>Sở thích mua hàng, khiếu nại đã giải quyết&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>MEDIUM (3)&lt;/strong>&lt;/td>
&lt;td>6 tháng&lt;/td>
&lt;td>Câu hỏi đã được trả lời, sản phẩm đã xem&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>LOW (2)&lt;/strong>&lt;/td>
&lt;td>90 ngày&lt;/td>
&lt;td>Thông tin ngữ cảnh session&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>TRIVIAL (1)&lt;/strong>&lt;/td>
&lt;td>Không lưu&lt;/td>
&lt;td>Lời chào, phản hồi ngắn&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="11-multi-session-continuity">11. Multi-session Continuity&lt;/h2>
&lt;h3 id="111-cho-n-ngi-dng-quay-li">11.1. Chào đón người dùng quay lại&lt;/h3>
&lt;p>Khi người dùng bắt đầu session mới, agent cần &lt;strong>pre-load context&lt;/strong> và &lt;strong>chào hỏi cá nhân hóa&lt;/strong>:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-python" data-lang="python">async &lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">build_welcome_context&lt;/span>(
user_id: str,
current_query: str,
memory_mgr: SemanticMemoryManager,
pg_repo: object &lt;span style="color:#75715e"># type: ignore&lt;/span>
) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> str:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Xây dựng context phong phú khi người dùng quay lại.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Chạy song song để tối thiểu latency.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;span style="color:#f92672">import&lt;/span> asyncio
profile_task &lt;span style="color:#f92672">=&lt;/span> asyncio&lt;span style="color:#f92672">.&lt;/span>to_thread(pg_repo&lt;span style="color:#f92672">.&lt;/span>get_profile, user_id) &lt;span style="color:#75715e"># type: ignore&lt;/span>
memories_task &lt;span style="color:#f92672">=&lt;/span> asyncio&lt;span style="color:#f92672">.&lt;/span>to_thread(
memory_mgr&lt;span style="color:#f92672">.&lt;/span>recall, current_query, top_k&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">3&lt;/span>
)
profile, relevant_memories &lt;span style="color:#f92672">=&lt;/span> await asyncio&lt;span style="color:#f92672">.&lt;/span>gather(
profile_task, memories_task
)
context_parts &lt;span style="color:#f92672">=&lt;/span> []
&lt;span style="color:#75715e"># 1. Thông tin hồ sơ cơ bản&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> profile:
context_parts&lt;span style="color:#f92672">.&lt;/span>append(f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">[HỒ SƠ NGƯỜI DÙNG]:&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">- Tên: {profile.get(&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">display_name&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">, &lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">Khách hàng&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">)}&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">- Số phiên: {profile.get(&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">total_sessions&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">, 0)}&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">- Tóm tắt: {profile.get(&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">interaction_summary&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">, &lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">)}&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">- Sở thích nổi bật: {&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">, &lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">.join(profile.get(&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">preferences&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">, {}).get(&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">product_categories&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">, []))}&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#f92672">.&lt;/span>strip())
&lt;span style="color:#75715e"># 2. Ký ức liên quan đến câu hỏi hiện tại&lt;/span>
&lt;span style="color:#66d9ef">if&lt;/span> relevant_memories:
context_parts&lt;span style="color:#f92672">.&lt;/span>append(
memory_mgr&lt;span style="color:#f92672">.&lt;/span>format_for_context(relevant_memories)
)
&lt;span style="color:#66d9ef">return&lt;/span> &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#ae81ff">\n&lt;/span>&lt;span style="color:#ae81ff">\n&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#f92672">.&lt;/span>join(context_parts)
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="112-prompt-augmentation-template">11.2. Prompt Augmentation Template&lt;/h3>
&lt;p>Template để inject memory context vào system prompt:&lt;/p>
&lt;pre>&lt;code>SYSTEM PROMPT TEMPLATE (với Memory Augmentation):
─────────────────────────────────────────────────────────
Bạn là trợ lý AI của {company_name}.
{user_context}
━━ Chú ý khi trả lời ━━
- Nếu người dùng quay lại sau nhiều ngày, hãy chào hỏi ấm áp và đề cập đến
tương tác gần nhất nếu phù hợp với câu hỏi hiện tại.
- Ưu tiên thông tin trong [KÝ ỨC LIÊN QUAN] khi có liên quan đến câu hỏi.
- KHÔNG đề cập đến ký ức không liên quan — tránh cảm giác &amp;quot;đang bị theo dõi&amp;quot;.
- Phong cách giao tiếp: {communication_style}
─────────────────────────────────────────────────────────
Ví dụ kết quả sau khi augment:
─────────────────────────────────────────────────────────
[HỒ SƠ NGƯỜI DÙNG]:
- Tên: Nguyễn Văn An (28 phiên, khách thân thiết)
- Tóm tắt: Thường mua laptop gaming, thích giao hàng buổi sáng
[KÝ ỨC LIÊN QUAN]:
- [2026-03-12][constraint] dị ứng latex — KHÔNG gợi ý sản phẩm chứa latex
- [2026-04-05][complaint] Phàn nàn giao hàng chậm 3 ngày so với cam kết
Chào mừng anh An quay lại! Hôm nay anh cần hỗ trợ gì ạ?
─────────────────────────────────────────────────────────
&lt;/code>&lt;/pre>&lt;hr>
&lt;h2 id="12-bo-mt--privacy-cho-memory">12. Bảo mật &amp;amp; Privacy cho Memory&lt;/h2>
&lt;h3 id="121-cc-nguyn-tc-ct-li">12.1. Các nguyên tắc cốt lõi&lt;/h3>
&lt;p>&lt;strong>Data Isolation (Multi-tenant)&lt;/strong>: Mỗi tenant/organization có namespace riêng trong Redis, schema riêng trong PostgreSQL, collection riêng trong vector store. Tuyệt đối không cross-query giữa các tenant.&lt;/p>
&lt;p>&lt;strong>PII Masking trước khi lưu&lt;/strong>: Luôn mask PII trước khi lưu vào semantic memory hoặc interaction log:&lt;/p>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-python" data-lang="python">&lt;span style="color:#f92672">import&lt;/span> re
PII_PATTERNS &lt;span style="color:#f92672">=&lt;/span> {
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">email&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">r&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">b[A-Za-z0-9._&lt;/span>&lt;span style="color:#e6db74">%&lt;/span>&lt;span style="color:#e6db74">+-]+@[A-Za-z0-9.-]+&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">.[A-Z|a-z]{2,}&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">b&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">phone_vn&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">r&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">b(0[35789]&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">d{8}|[+]84[35789]&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">d{8})&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">b&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>,
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">cccd&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">r&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">b&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">d{9}(&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">d{3})?&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">b&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>, &lt;span style="color:#75715e"># 9 hoặc 12 số&lt;/span>
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">credit_card&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: &lt;span style="color:#e6db74">r&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">b&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">d{4}[&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">s-]?&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">d{4}[&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">s-]?&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">d{4}[&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">s-]?&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">d{4}&lt;/span>&lt;span style="color:#e6db74">\&lt;/span>&lt;span style="color:#e6db74">b&lt;/span>&lt;span style="color:#e6db74">&amp;#39;&lt;/span>,
}
&lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">mask_pii&lt;/span>(text: str) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> str:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">Thay thế PII bằng placeholder trước khi lưu vào memory.&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
masked &lt;span style="color:#f92672">=&lt;/span> text
&lt;span style="color:#66d9ef">for&lt;/span> pii_type, pattern &lt;span style="color:#f92672">in&lt;/span> PII_PATTERNS&lt;span style="color:#f92672">.&lt;/span>items():
placeholder &lt;span style="color:#f92672">=&lt;/span> f&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">[{pii_type.upper()}_MASKED]&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>
masked &lt;span style="color:#f92672">=&lt;/span> re&lt;span style="color:#f92672">.&lt;/span>sub(pattern, placeholder, masked, flags&lt;span style="color:#f92672">=&lt;/span>re&lt;span style="color:#f92672">.&lt;/span>IGNORECASE)
&lt;span style="color:#66d9ef">return&lt;/span> masked
&lt;span style="color:#75715e"># Sử dụng:&lt;/span>
&lt;span style="color:#75715e"># &amp;#34;Email tôi là abc@gmail.com và SĐT 0912345678&amp;#34;&lt;/span>
&lt;span style="color:#75715e"># → &amp;#34;Email tôi là [EMAIL_MASKED] và SĐT [PHONE_VN_MASKED]&amp;#34;&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="122-gdpr--right-to-forget">12.2. GDPR / Right-to-Forget&lt;/h3>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-python" data-lang="python">async &lt;span style="color:#66d9ef">def&lt;/span> &lt;span style="color:#a6e22e">delete_all_user_memory&lt;/span>(
user_id: str,
tenant_id: str,
session_store: object, &lt;span style="color:#75715e"># type: ignore&lt;/span>
vector_store: object, &lt;span style="color:#75715e"># type: ignore&lt;/span>
pg_repo: object &lt;span style="color:#75715e"># type: ignore&lt;/span>
) &lt;span style="color:#f92672">-&lt;/span>&lt;span style="color:#f92672">&amp;gt;&lt;/span> dict:
&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Xóa toàn bộ memory của người dùng theo yêu cầu GDPR.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> Trả về báo cáo xóa để audit.&lt;/span>&lt;span style="color:#e6db74">
&lt;/span>&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74"> &lt;/span>&lt;span style="color:#e6db74">&amp;#34;&amp;#34;&amp;#34;&lt;/span>
&lt;span style="color:#f92672">import&lt;/span> asyncio
results &lt;span style="color:#f92672">=&lt;/span> {}
&lt;span style="color:#75715e"># 1. Xóa tất cả sessions trong Redis&lt;/span>
session_keys &lt;span style="color:#f92672">=&lt;/span> await session_store&lt;span style="color:#f92672">.&lt;/span>find_by_user(user_id, tenant_id) &lt;span style="color:#75715e"># type: ignore&lt;/span>
&lt;span style="color:#66d9ef">for&lt;/span> key &lt;span style="color:#f92672">in&lt;/span> session_keys:
await session_store&lt;span style="color:#f92672">.&lt;/span>delete(key) &lt;span style="color:#75715e"># type: ignore&lt;/span>
results[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">sessions_deleted&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>] &lt;span style="color:#f92672">=&lt;/span> len(session_keys)
&lt;span style="color:#75715e"># 2. Xóa semantic memories trong vector store&lt;/span>
deleted_vectors &lt;span style="color:#f92672">=&lt;/span> await asyncio&lt;span style="color:#f92672">.&lt;/span>to_thread(
vector_store&lt;span style="color:#f92672">.&lt;/span>delete, &lt;span style="color:#75715e"># type: ignore&lt;/span>
filter&lt;span style="color:#f92672">=&lt;/span>{&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">user_id&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: user_id, &lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">tenant_id&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>: tenant_id}
)
results[&lt;span style="color:#e6db74">&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>&lt;span style="color:#e6db74">vectors_deleted&lt;/span>&lt;span style="color:#e6db74">&amp;#34;&lt;/span>] &lt;span style="color:#f92672">=&lt;/span> deleted_vectors
&lt;span style="color:#75715e"># 3. Xóa PostgreSQL records&lt;/span>
pg_deleted &lt;span style="color:#f92672">=&lt;/span> await pg_repo&lt;span style="color:#f92672">.&lt;/span>delete_user_data(user_id, tenant_id) &lt;span style="color:#75715e"># type: ignore&lt;/span>
results&lt;span style="color:#f92672">.&lt;/span>update(pg_deleted)
&lt;span style="color:#75715e"># 4. Audit log (bắt buộc, không xóa)&lt;/span>
await pg_repo&lt;span style="color:#f92672">.&lt;/span>log_gdpr_deletion( &lt;span style="color:#75715e"># type: ignore&lt;/span>
user_id&lt;span style="color:#f92672">=&lt;/span>user_id,
tenant_id&lt;span style="color:#f92672">=&lt;/span>tenant_id,
deleted_at&lt;span style="color:#f92672">=&lt;/span>datetime&lt;span style="color:#f92672">.&lt;/span>utcnow()&lt;span style="color:#f92672">.&lt;/span>isoformat(),
deletion_report&lt;span style="color:#f92672">=&lt;/span>results
)
&lt;span style="color:#66d9ef">return&lt;/span> results
&lt;/code>&lt;/pre>&lt;/div>&lt;h3 id="123-cu-hnh-bo-mt-memory-yaml">12.3. Cấu hình bảo mật Memory (YAML)&lt;/h3>
&lt;div class="highlight">&lt;pre style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4">&lt;code class="language-yaml" data-lang="yaml">&lt;span style="color:#75715e"># memory-security.yml&lt;/span>
memory_security:
&lt;span style="color:#75715e"># Mã hóa at-rest&lt;/span>
encryption:
redis:
enabled: &lt;span style="color:#66d9ef">true&lt;/span>
algorithm: &lt;span style="color:#e6db74">&amp;#34;AES-256-GCM&amp;#34;&lt;/span>
key_rotation_days: &lt;span style="color:#ae81ff">90&lt;/span>
postgresql:
tde_enabled: &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># Transparent Data Encryption&lt;/span>
column_encryption:
- table: user_profiles
columns: [preferences, context_summary, interaction_summary]
vector_store:
enabled: &lt;span style="color:#66d9ef">true&lt;/span>
provider: &lt;span style="color:#e6db74">&amp;#34;qdrant-cloud&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Qdrant Cloud có built-in encryption&lt;/span>
&lt;span style="color:#75715e"># Kiểm soát truy cập&lt;/span>
access_control:
rbac_enabled: &lt;span style="color:#66d9ef">true&lt;/span>
roles:
agent_read: [&lt;span style="color:#e6db74">&amp;#34;session:read&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;memory:read&amp;#34;&lt;/span>]
agent_write: [&lt;span style="color:#e6db74">&amp;#34;session:write&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;memory:write&amp;#34;&lt;/span>]
admin: [&lt;span style="color:#e6db74">&amp;#34;session:*&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;memory:*&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;gdpr:*&amp;#34;&lt;/span>]
tenant_isolation: strict &lt;span style="color:#75715e"># Không cho phép cross-tenant query&lt;/span>
&lt;span style="color:#75715e"># PII&lt;/span>
pii:
mask_before_store: &lt;span style="color:#66d9ef">true&lt;/span>
patterns: [&lt;span style="color:#e6db74">&amp;#34;email&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;phone_vn&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;cccd&amp;#34;&lt;/span>, &lt;span style="color:#e6db74">&amp;#34;credit_card&amp;#34;&lt;/span>]
log_masking_events: &lt;span style="color:#66d9ef">true&lt;/span>
&lt;span style="color:#75715e"># Retention policy&lt;/span>
retention:
default_ttl_days: &lt;span style="color:#ae81ff">180&lt;/span>
critical_memory: &lt;span style="color:#e6db74">&amp;#34;no_expiry&amp;#34;&lt;/span>
gdpr_deletion: &lt;span style="color:#e6db74">&amp;#34;immediate&amp;#34;&lt;/span>
audit_logs: &lt;span style="color:#e6db74">&amp;#34;7_years&amp;#34;&lt;/span> &lt;span style="color:#75715e"># Yêu cầu pháp lý Việt Nam&lt;/span>
&lt;span style="color:#75715e"># Monitoring&lt;/span>
monitoring:
alert_on_cross_tenant_query: &lt;span style="color:#66d9ef">true&lt;/span>
alert_on_bulk_read: &lt;span style="color:#66d9ef">true&lt;/span> &lt;span style="color:#75715e"># &amp;gt; 1000 records trong 1 phút&lt;/span>
alert_on_pii_in_log: &lt;span style="color:#66d9ef">true&lt;/span>
&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
&lt;h2 id="13-checklist-trin-khai-memory-system">13. Checklist triển khai Memory System&lt;/h2>
&lt;h3 id="-cp-1-in-context-memory-tun-12">✅ Cấp 1: In-Context Memory (Tuần 1–2)&lt;/h3>
&lt;ul>
&lt;li>&lt;input disabled="" type="checkbox">Chọn chiến lược context management: Sliding Window / Summary Buffer / Entity Memory&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Implement token counting chính xác theo model đang dùng (tiktoken hoặc tương đương)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Thiết lập ngưỡng tóm tắt tự động (khuyến nghị: 80% token budget)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Unit test: đảm bảo system prompt luôn được giữ nguyên&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Đo token usage trung bình per request để baseline chi phí&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Verify: context không bao giờ vượt quá max_tokens của model&lt;/li>
&lt;/ul>
&lt;h3 id="-cp-2-session-memory-tun-24">✅ Cấp 2: Session Memory (Tuần 2–4)&lt;/h3>
&lt;ul>
&lt;li>&lt;input disabled="" type="checkbox">Cài đặt Redis/Valkey với persistence (AOF + RDB)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Thiết kế session schema JSON đầy đủ (session_id, user_id, tenant_id, messages, metadata)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Implement sliding TTL (làm mới TTL mỗi khi truy cập)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Thiết lập Redis eviction policy: &lt;code>allkeys-lru&lt;/code>&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Test: reconnect sau khi mạng bị ngắt vẫn load được session&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Test: session không bị lẫn giữa các user (tenant isolation)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Monitoring: Redis memory usage, key count, hit rate&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Backup: cấu hình Redis persistence cho production&lt;/li>
&lt;/ul>
&lt;h3 id="-cp-3-long-term-memory-tun-48">✅ Cấp 3: Long-term Memory (Tuần 4–8)&lt;/h3>
&lt;ul>
&lt;li>&lt;input disabled="" type="checkbox">Deploy PostgreSQL schema (&lt;code>ai_memory.user_profiles&lt;/code>, &lt;code>interaction_logs&lt;/code>, &lt;code>memory_items&lt;/code>)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Implement importance scoring cho mọi tin nhắn trước khi lưu&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Implement PII masking pipeline (email, phone, CCCD)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Thiết lập memory decay TTL theo importance level&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Implement deduplication bằng content_hash&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Thiết lập vector store (Qdrant hoặc pgvector) và indexing pipeline&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Implement hybrid retrieval (session + semantic, chạy song song)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">GDPR: implement &lt;code>delete_all_user_memory&lt;/code> API endpoint&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Encrypt sensitive columns trong PostgreSQL&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Load test: hybrid retrieval &amp;lt; 100ms P95 với 100K users&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Audit log: mọi write operation vào long-term memory&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="14-kpi-chi-ph-v-roi">14. KPI, Chi phí và ROI&lt;/h2>
&lt;h3 id="141-kpi-cho-memory-system">14.1. KPI cho Memory System&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>KPI&lt;/th>
&lt;th>Định nghĩa&lt;/th>
&lt;th>Mục tiêu MVP&lt;/th>
&lt;th>Mục tiêu Production&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>Session Continuity Rate&lt;/strong>&lt;/td>
&lt;td>% session được restore thành công sau reconnect&lt;/td>
&lt;td>≥ 95%&lt;/td>
&lt;td>≥ 99.5%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Memory Retrieval Latency (P95)&lt;/strong>&lt;/td>
&lt;td>Thời gian hybrid retrieval P95&lt;/td>
&lt;td>≤ 200ms&lt;/td>
&lt;td>≤ 80ms&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Memory Relevance Score&lt;/strong>&lt;/td>
&lt;td>% ký ức được retrieve có liên quan thực sự&lt;/td>
&lt;td>≥ 70%&lt;/td>
&lt;td>≥ 85%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Context Token Efficiency&lt;/strong>&lt;/td>
&lt;td>Giảm token gửi lên LLM vs không có memory&lt;/td>
&lt;td>≥ 20%&lt;/td>
&lt;td>≥ 40%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Personalization Acceptance Rate&lt;/strong>&lt;/td>
&lt;td>% khi agent dùng memory, user không phàn nàn &amp;ldquo;sai&amp;rdquo;&lt;/td>
&lt;td>≥ 90%&lt;/td>
&lt;td>≥ 97%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Memory Write Noise Rate&lt;/strong>&lt;/td>
&lt;td>% bản ghi lưu vào long-term nhưng không bao giờ được truy vấn lại&lt;/td>
&lt;td>≤ 30%&lt;/td>
&lt;td>≤ 10%&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>GDPR Deletion SLA&lt;/strong>&lt;/td>
&lt;td>Thời gian hoàn thành right-to-forget từ khi nhận yêu cầu&lt;/td>
&lt;td>≤ 72 giờ&lt;/td>
&lt;td>≤ 24 giờ&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="142-c-lng-chi-ph-quy-m-smb-10000-sessionsngy">14.2. Ước lượng chi phí (Quy mô SMB, 10.000 sessions/ngày)&lt;/h3>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Hạng mục&lt;/th>
&lt;th>Chi phí thiết lập&lt;/th>
&lt;th>Chi phí vận hành/tháng&lt;/th>
&lt;th>Ghi chú&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>Redis (2 GB, HA)&lt;/strong>&lt;/td>
&lt;td>$0 (self-hosted)&lt;/td>
&lt;td>$30–80&lt;/td>
&lt;td>Hoặc Upstash Redis ~$20/tháng&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>PostgreSQL (memory schema)&lt;/strong>&lt;/td>
&lt;td>$0 (add to existing)&lt;/td>
&lt;td>$10–30&lt;/td>
&lt;td>~50GB storage cho 1M users&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Qdrant Cloud (1M vectors)&lt;/strong>&lt;/td>
&lt;td>$0&lt;/td>
&lt;td>$25–75&lt;/td>
&lt;td>Phụ thuộc vào số ký ức/user&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Embedding API&lt;/strong>&lt;/td>
&lt;td>—&lt;/td>
&lt;td>$20–60&lt;/td>
&lt;td>10K sessions × avg 10 memories × $0.0001/embed&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>LLM cho summarization&lt;/strong>&lt;/td>
&lt;td>—&lt;/td>
&lt;td>$15–40&lt;/td>
&lt;td>Chỉ khi trigger tóm tắt context&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Engineering (thiết kế + triển khai)&lt;/strong>&lt;/td>
&lt;td>$3.000–8.000&lt;/td>
&lt;td>$500–1.500&lt;/td>
&lt;td>Bảo trì, cải tiến&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Tổng ước lượng&lt;/strong>&lt;/td>
&lt;td>&lt;strong>$3.000–8.000&lt;/strong>&lt;/td>
&lt;td>&lt;strong>$100–285&lt;/strong>&lt;/td>
&lt;td>Không tính LLM chính&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;h3 id="143-roi-tham-chiu">14.3. ROI tham chiếu&lt;/h3>
&lt;p>&lt;strong>Tình huống&lt;/strong>: Công ty TMĐT 50.000 khách hàng hoạt động. Trước khi có Memory:&lt;/p>
&lt;ul>
&lt;li>Mỗi session mới: khách mất 2–3 phút re-explain context → 30% khách bỏ cuộc&lt;/li>
&lt;li>CS team nhận 20% ticket &amp;ldquo;lặp lại vấn đề đã giải quyết&amp;rdquo; vì agent không nhớ&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Sau khi triển khai Memory System:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Khách quay lại tiếp tục ngay từ điểm dừng → Giảm abandonment 30% → &lt;strong>+15% conversion&lt;/strong>&lt;/li>
&lt;li>Giảm lặp ticket: agent tự nhớ context → &lt;strong>-20% ticket volume&lt;/strong> → tiết kiệm $2.000–5.000/tháng nhân sự CS&lt;/li>
&lt;li>CSAT tăng từ 3.8 → 4.3/5 (ví dụ tham chiếu từ các dự án CRM AI) → &lt;strong>+18% customer retention&lt;/strong>&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>ROI năm đầu&lt;/strong> (ước tính thận trọng):&lt;/p>
&lt;ul>
&lt;li>Tiết kiệm nhân sự CS: $2.500/tháng × 12 = $30.000/năm&lt;/li>
&lt;li>Tăng conversion: khó đo trực tiếp nhưng ước tính $10.000–30.000/năm&lt;/li>
&lt;li>Chi phí hệ thống: $285/tháng × 12 + $5.000 setup = &lt;strong>$8.420/năm&lt;/strong>&lt;/li>
&lt;li>&lt;strong>ROI ≈ 380–710%&lt;/strong> | &lt;strong>Hoàn vốn: 2–3 tháng&lt;/strong>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="15-bng-ri-ro-v-phng-n-gim-thiu">15. Bảng Rủi ro và Phương án Giảm Thiểu&lt;/h2>
&lt;table>
&lt;thead>
&lt;tr>
&lt;th>Rủi ro&lt;/th>
&lt;th>Mức độ&lt;/th>
&lt;th>Xác suất&lt;/th>
&lt;th>Phương án giảm thiểu&lt;/th>
&lt;/tr>
&lt;/thead>
&lt;tbody>
&lt;tr>
&lt;td>&lt;strong>Memory contamination&lt;/strong>: Agent dùng sai ký ức của user khác&lt;/td>
&lt;td>Rất cao&lt;/td>
&lt;td>Thấp (nếu thiết kế đúng)&lt;/td>
&lt;td>Tenant + user isolation nghiêm ngặt; unit test cross-user query&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Stale memory&lt;/strong>: Sở thích cũ không còn phù hợp&lt;/td>
&lt;td>Cao&lt;/td>
&lt;td>Cao&lt;/td>
&lt;td>Memory decay TTL + confidence score giảm dần theo thời gian&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Hallucinated memory&lt;/strong>: Agent &amp;ldquo;nhớ&amp;rdquo; thứ không có trong store&lt;/td>
&lt;td>Cao&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Chỉ inject ký ức đã verified; prompt rõ &amp;ldquo;chỉ dùng ký ức từ [RELEVANT MEMORIES]&amp;rdquo;&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>PII leak trong log/memory&lt;/strong>&lt;/td>
&lt;td>Rất cao&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>PII masking pipeline bắt buộc trước khi lưu; kiểm tra định kỳ&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Redis out-of-memory&lt;/strong>&lt;/td>
&lt;td>Cao&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Eviction policy LRU + monitoring alert ở 80% RAM; Redis Cluster&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Latency cao khi cold-start&lt;/strong> (pre-load nhiều memory)&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Async pre-load; cache top-K profiles; limit recall to top-3&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>Ký ức xây dựng sai lệch&lt;/strong> (garbage-in-garbage-out)&lt;/td>
&lt;td>Cao&lt;/td>
&lt;td>Trung bình&lt;/td>
&lt;td>Importance scoring nghiêm ngặt; human review với importance=5&lt;/td>
&lt;/tr>
&lt;tr>
&lt;td>&lt;strong>GDPR non-compliance&lt;/strong>: Không xóa kịp khi user yêu cầu&lt;/td>
&lt;td>Rất cao&lt;/td>
&lt;td>Thấp&lt;/td>
&lt;td>Automated deletion pipeline; SLA 24h; audit log cho mọi deletion&lt;/td>
&lt;/tr>
&lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="16-roadmap-trin-khai-3-giai-on">16. Roadmap Triển Khai 3 Giai Đoạn&lt;/h2>
&lt;h3 id="giai-on-1-tun-13-in-context--session-memory">Giai đoạn 1 (Tuần 1–3): In-Context + Session Memory&lt;/h3>
&lt;p>&lt;strong>Mục tiêu&lt;/strong>: Agent không bao giờ &amp;ldquo;quên&amp;rdquo; trong cùng một phiên làm việc.&lt;/p>
&lt;ul>
&lt;li>&lt;input disabled="" type="checkbox">Implement Token Budget Memory với ngưỡng 80% trigger summarization&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Cài đặt Redis/Valkey, thiết kế session schema&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Implement &lt;code>RedisSessionStore&lt;/code> với sliding TTL 24h&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Tích hợp session memory vào agent loop hiện tại&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Test: reconnect sau 1h, sau 8h vẫn load được session&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Monitoring: Redis memory, session hit rate, token usage per session&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">&lt;strong>KPI đo được&lt;/strong>: Session Continuity Rate ≥ 95%, Memory Retrieval Latency ≤ 200ms&lt;/li>
&lt;/ul>
&lt;h3 id="giai-on-2-tun-48-long-term-memory--user-profiling">Giai đoạn 2 (Tuần 4–8): Long-term Memory + User Profiling&lt;/h3>
&lt;p>&lt;strong>Mục tiêu&lt;/strong>: Agent biết khách hàng là ai và nhớ lịch sử quan trọng.&lt;/p>
&lt;ul>
&lt;li>&lt;input disabled="" type="checkbox">Deploy PostgreSQL memory schema (3 bảng chính)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Implement importance scoring và memory write policy&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Build user profile accumulation pipeline (cập nhật sau mỗi session)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Implement PII masking trước khi lưu vào mọi storage&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Triển khai Qdrant hoặc pgvector cho semantic memory&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Implement hybrid retrieval (session + semantic song song)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Build GDPR deletion endpoint&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Test: right-to-forget hoàn thành &amp;lt; 24h&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">&lt;strong>KPI đo được&lt;/strong>: Memory Relevance Score ≥ 70%, Context Token Efficiency +20%&lt;/li>
&lt;/ul>
&lt;h3 id="giai-on-3-tun-912-ti-u--c-nhn-ha-nng-cao">Giai đoạn 3 (Tuần 9–12): Tối ưu &amp;amp; Cá nhân hóa nâng cao&lt;/h3>
&lt;p>&lt;strong>Mục tiêu&lt;/strong>: Trải nghiệm cá nhân hóa thực sự, vận hành ổn định ở scale.&lt;/p>
&lt;ul>
&lt;li>&lt;input disabled="" type="checkbox">Implement Memory Decay (TTL theo importance)&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Build personalization engine: tự động điều chỉnh communication style&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">A/B test: so sánh agent có/không có long-term memory về CSAT&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Tối ưu hybrid retrieval: caching top profiles, async pre-load&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Dashboard KPI: memory hit rate, relevance score, noise rate&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Thiết lập alert: cross-tenant query, PII in log, bulk read anomaly&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">Load test: 100K concurrent users, latency P95 &amp;lt; 80ms&lt;/li>
&lt;li>&lt;input disabled="" type="checkbox">&lt;strong>KPI đo được&lt;/strong>: CSAT +0.3+ điểm, Memory Write Noise Rate ≤ 10%&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="17-kt-lun-v-kt-ni-sang-bi-6">17. Kết luận và Kết nối sang Bài 6&lt;/h2>
&lt;p>Memory &amp;amp; Context Management là &lt;strong>nền tảng của trải nghiệm người dùng&lt;/strong> — không phải feature phụ mà là điều kiện cần để AI Agent tạo ra giá trị lâu dài:&lt;/p>
&lt;ul>
&lt;li>Không có Session Memory → Agent quên mọi thứ khi user F5 trang&lt;/li>
&lt;li>Không có Long-term Memory → Agent xử lý khách hàng VIP như người lạ&lt;/li>
&lt;li>Không có Semantic Memory → Agent không thể &amp;ldquo;nhớ lại&amp;rdquo; những gì quan trọng khi cần&lt;/li>
&lt;li>Không có Memory Policy → Garbage in, garbage out; rủi ro PII, chi phí không kiểm soát&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Ba nguyên tắc cốt lõi để Memory System thành công:&lt;/strong>&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Layer by layer&lt;/strong> — Bắt đầu từ Session Memory (đơn giản, ROI rõ ràng), rồi mới đến Long-term và Semantic&lt;/li>
&lt;li>&lt;strong>Write less, write right&lt;/strong> — Importance scoring nghiêm ngặt: thà bỏ sót 30% ký ức còn hơn lưu 80% rác&lt;/li>
&lt;li>&lt;strong>Privacy first&lt;/strong> — PII masking và tenant isolation phải là yêu cầu từ ngày đầu, không phải afterthought&lt;/li>
&lt;/ol>
&lt;hr>
&lt;p>Bài tiếp theo trong series sẽ đi sâu vào &lt;strong>Planning &amp;amp; ReAct Loop&lt;/strong> — cách AI Agent không chỉ phản hồi ngay lập tức mà còn biết &lt;strong>lập kế hoạch&lt;/strong> và &lt;strong>lý luận nhiều bước&lt;/strong> trước khi hành động. Đây là nền tảng để xây dựng các agent phức tạp như: tự động xử lý claim bảo hiểm, phân tích hồ sơ tín dụng hay điều phối quy trình onboarding nhân viên — những bài toán đòi hỏi agent phải &amp;ldquo;suy nghĩ&amp;rdquo; trước khi &amp;ldquo;làm&amp;rdquo;.&lt;/p>
&lt;hr>
&lt;p>&lt;em>Tác giả: AI Agent Series | Cập nhật: 14/05/2026&lt;/em>&lt;/p></description></item></channel></rss>