diff --git a/docs/llmservice/models/glm-5-2.md b/docs/llmservice/models/glm-5-2.md
new file mode 100644
index 0000000..4fdc18f
--- /dev/null
+++ b/docs/llmservice/models/glm-5-2.md
@@ -0,0 +1,47 @@
+# GLM-5.2
+
+## Overview
+
+GLM-5.2 is a GLM-family text foundation model developed by Z.AI and released on June 16, 2026. It is positioned for long-horizon coding and engineering tasks, with a 1M-token context window, 128K maximum output, and a `reasoning_effort` control for adjusting reasoning depth.
+
+## Key Features
+
+* **1M Context Window**: Supports up to 1M tokens of context for project-scale codebases, long documents, and multi-step engineering workflows.
+* **Long-Horizon Coding**: Z.AI reports 81.0 on Terminal-Bench 2.1 and 62.1 on SWE-bench Pro, with emphasis on project-level code understanding and sustained task execution.
+* **Configurable Reasoning**: Supports deep-thinking mode and the GLM-5.2-specific `reasoning_effort` parameter, with `high` and `max` reasoning levels for complex tasks.
+* **Agent and Tool Integration**: Supports function calling, streaming tool calls, structured output, context caching, and MCP-based tool/data-source integration.
+
+## Best Use Cases
+
+* **Project-Scale Codebase Work**: Reviewing, refactoring, migrating, or extending repositories where the model needs to retain architecture, module boundaries, API contracts, and engineering conventions.
+* **Long-Horizon Engineering Tasks**: Multi-file implementation, dependency-aware refactoring, SDK adaptation, debugging loops, and test-fix-verify cycles that require sustained progress.
+* **Tool-Using Agent Workflows**: Coding agents, internal automation, MCP-connected workflows, and structured-output systems that need reliable tool invocation and streamed tool-call arguments.
+
+## Capabilities and Limitations
+
+| Capability | Description |
+| :------------------- | :-------------------------------------------------------------------------------------------------------------------------------------------- |
+| **Reasoning** | Supports deep-thinking mode and `reasoning_effort`; Z.AI positions it for complex engineering, debugging, and long-chain reasoning workflows. |
+| **Creative Writing** | Supports general text generation through the chat completion API, but official GLM-5.2 materials emphasize coding and engineering use cases. |
+| **Coding** | Z.AI reports Terminal-Bench 2.1 score of 81.0 and SWE-bench Pro score of 62.1, with focus on long-horizon coding-agent scenarios. |
+| **Multimodal** | Text input and text output. Vision and multimodal workflows are handled by separate Z.AI models such as GLM-5V-Turbo. |
+| **Response Speed** | Official docs do not publish latency or tokens-per-second figures; streaming responses and streaming tool calls are supported. |
+| **Context Window** | 1M tokens. |
+| **Max Output** | 128K tokens. |
+| **Tool Use** | Function calling, streaming tool calls, structured output, context caching, and MCP integration. |
+| **Multilingual** | Suitable for Chinese and English developer workflows; official docs do not publish a GLM-5.2 language coverage benchmark. |
+
+### Known Limitations
+
+* Text-only model; image, video, and GUI-understanding tasks require a separate vision-language model such as GLM-5V-Turbo.
+* Very long contexts and 128K outputs can increase latency and cost; cap `max_tokens` and use context caching where applicable.
+
+## Credits Usage
+
+| Model | Input (Credits/Token) | Cache Write (Credits/Token) | Cache Read (Credits/Token) | Output (Credits/Token) | Web Search (Credits/Use) | Billing Notes |
+| :--- | --------------------: | --------------------------: | -------------------------: | ---------------------: | -----------------------: | :--- |
+| **GLM-5.2** | `1.40` | `1.40` | `0.28` | `4.40` | `-` | - |
+
+:::info Pricing note
+Prices shown in the documentation are B.AI standard reference prices for base billing purposes. B.AI may provide lower actual usage costs through top-up bonuses and account benefits. Specific prices, bonus Credits, and account benefits are subject to the platform display and final billing records.
+:::
diff --git a/docs/llmservice/pricing-and-usage.md b/docs/llmservice/pricing-and-usage.md
index 0197ebc..29775e9 100644
--- a/docs/llmservice/pricing-and-usage.md
+++ b/docs/llmservice/pricing-and-usage.md
@@ -18,6 +18,7 @@ The platform uses a unified Credits system to measure and settle usage across al
 | MiniMax M2.7 | 0.30 | 0.375 | 0.06 | 1.20 | - |
 | Kimi K2.5 | 0.59 | 0.59 | 0.177 | 3.00 | - |
 | Qwen3.6-27B | 0.19 | 0.19 | 0.19 | 2.99 | - |
+| GLM-5.2 | 1.40 | 1.40 | 0.28 | 4.40 | - |
 | GLM-5.1 | 1.40 | 1.40 | 0.26 | 4.40 | - |
 | GLM-5 | 1.00 | 1.00 | 0.20 | 3.20 | - |
 | DeepSeek V3.2 | 0.29 | 0.29 | 0.145 | 0.44 | - |
diff --git a/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/models/glm-5-2.md b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/models/glm-5-2.md
new file mode 100644
index 0000000..1683b6e
--- /dev/null
+++ b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/models/glm-5-2.md
@@ -0,0 +1,47 @@
+# GLM-5.2
+
+## 概述
+
+GLM-5.2 是由 Z.AI 开发的 GLM 系列文本基础模型,于 2026 年 6 月 16 日发布。该模型面向长周期代码和工程任务,支持 1M tokens 上下文窗口、128K 最大输出,并提供 `reasoning_effort` 参数用于调整推理深度。
+
+## 核心特性
+
+* **1M 上下文窗口**:支持最高 1M tokens 上下文,适合项目级代码库、长文档和多步骤工程工作流。
+* **长周期代码能力**:Z.AI 报告其 Terminal-Bench 2.1 得分为 81.0,SWE-bench Pro 得分为 62.1,重点面向项目级代码理解和持续任务执行。
+* **可配置推理深度**:支持 deep-thinking 模式,以及 GLM-5.2 专用的 `reasoning_effort` 参数;复杂任务可使用 `high` 和 `max` 推理等级。
+* **Agent 与工具集成**:支持函数调用、流式工具调用、结构化输出、上下文缓存,以及基于 MCP 的工具和数据源集成。
+
+## 适用场景
+
+* **项目级代码库工作**:适合代码审查、重构、迁移或扩展仓库,需要模型保持架构、模块边界、API 合约和工程约定的场景。
+* **长周期工程任务**:适合多文件实现、依赖感知重构、SDK 适配、调试循环,以及测试、修复、验证一体化流程。
+* **工具调用型 Agent 工作流**:适合 Coding Agent、内部自动化、MCP 连接工作流,以及需要可靠工具调用和流式工具参数的结构化输出系统。
+
+## 能力与限制
+
+| 能力维度 | 说明 |
+| :--- | :--- |
+| **推理能力** | 支持 deep-thinking 模式和 `reasoning_effort`;Z.AI 将其定位于复杂工程、调试和长链路推理工作流 |
+| **创意写作** | 支持通过 chat completion API 进行通用文本生成,但官方 GLM-5.2 材料更强调代码和工程场景 |
+| **编程能力** | Z.AI 报告 Terminal-Bench 2.1 得分为 81.0,SWE-bench Pro 得分为 62.1,重点面向长周期 Coding Agent 场景 |
+| **多模态能力** | 文本输入和文本输出;视觉和多模态工作流由 GLM-5V-Turbo 等独立 Z.AI 模型处理 |
+| **响应速度** | 官方文档未公布延迟或 tokens-per-second 数据;支持流式响应和流式工具调用 |
+| **上下文窗口** | 1M tokens |
+| **最大输出** | 128K tokens |
+| **工具调用** | 函数调用、流式工具调用、结构化输出、上下文缓存和 MCP 集成 |
+| **多语言能力** | 适合中文和英文开发者工作流;官方文档未公布 GLM-5.2 语言覆盖基准 |
+
+### 已知限制
+
+* 该模型为文本模型;图像、视频和 GUI 理解任务需要使用 GLM-5V-Turbo 等独立视觉语言模型。
+* 超长上下文和 128K 输出可能增加延迟和成本;建议按需限制 `max_tokens`,并在适用场景使用上下文缓存。
+
+## 积分消耗
+
+| 模型名称 | 输入 (Credits/Token) | Cache Write (Credits/Token) | Cache Read (Credits/Token) | 输出 (Credits/Token) | 网页搜索(Credits/次) | 计费说明 |
+| :--- | --------------------: | --------------------------: | -------------------------: | -------------------: | ---------------------: | :--- |
+| **GLM-5.2** | `1.40` | `1.40` | `0.28` | `4.40` | `-` | - |
+
+:::info 价格说明
+文档价格为 B.AI 平台模型标准参考价,仅供基础计费说明使用。B.AI 可能会通过充值赠送及账户权益等方式,为用户提供更低的实际使用成本。具体价格、赠送积分及账户权益请以平台页面展示及最终账单为准。
+:::
diff --git a/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/pricing-and-usage.md b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/pricing-and-usage.md
index 94e8775..ce40979 100644
--- a/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/pricing-and-usage.md
+++ b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/llmservice/pricing-and-usage.md
@@ -18,6 +18,7 @@
 | MiniMax M2.7 | 0.30 | 0.375 | 0.06 | 1.20 | - |
 | Kimi K2.5 | 0.59 | 0.59 | 0.177 | 3.00 | - |
 | Qwen3.6-27B | 0.19 | 0.19 | 0.19 | 2.99 | - |
+| GLM-5.2 | 1.40 | 1.40 | 0.28 | 4.40 | - |
 | GLM-5.1 | 1.40 | 1.40 | 0.26 | 4.40 | - |
 | GLM-5 | 1.00 | 1.00 | 0.20 | 3.20 | - |
 | DeepSeek V3.2 | 0.29 | 0.29 | 0.145 | 0.44 | - |
diff --git a/i18n/zh-Hans/docusaurus-plugin-content-docs/current/sidebars.js b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/sidebars.js
index e0985e3..60779a7 100644
--- a/i18n/zh-Hans/docusaurus-plugin-content-docs/current/sidebars.js
+++ b/i18n/zh-Hans/docusaurus-plugin-content-docs/current/sidebars.js
@@ -237,6 +237,7 @@ const sidebars = {
  'llmservice/models/gemini-3-1-pro',
  'llmservice/models/gemini-3-5-flash',
  'llmservice/models/gemini-3-flash',
+ 'llmservice/models/glm-5-2',
  'llmservice/models/glm-5-1',
  'llmservice/models/glm-5',
  'llmservice/models/kimi-k2.5',
diff --git a/package.json b/package.json
index c2a7142..9564fbf 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
  "name": "@x402-tron/docs",
- "version": "1.2.27",
+ "version": "1.2.28",
  "description": "x402-tron documentation",
  "license": "MIT",
  "scripts": {
diff --git a/sidebars.js b/sidebars.js
index dc1aade..511be04 100644
--- a/sidebars.js
+++ b/sidebars.js
@@ -234,6 +234,7 @@ const sidebars = {
  'llmservice/models/gemini-3-1-pro',
  'llmservice/models/gemini-3-5-flash',
  'llmservice/models/gemini-3-flash',
+ 'llmservice/models/glm-5-2',
  'llmservice/models/glm-5-1',
  'llmservice/models/glm-5',
  'llmservice/models/kimi-k2.5',
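
Note on the GLM-5.2 rates added above: the sketch below shows how the four per-token rates in the new pricing row might combine into one usage estimate. It is an illustration under stated assumptions, not the platform's billing logic. The `Rates` class, the `estimate_credits` function, and the token-category parameter names are hypothetical, and the `Credits/Token` column unit is applied literally as written in the table.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Rates:
    """Per-token rates in Credits (hypothetical container, not a platform type)."""
    input: Decimal
    cache_write: Decimal
    cache_read: Decimal
    output: Decimal


# Values copied from the GLM-5.2 row added in this diff.
# Assumption: the "Credits/Token" unit is taken at face value.
GLM_5_2 = Rates(
    input=Decimal("1.40"),
    cache_write=Decimal("1.40"),
    cache_read=Decimal("0.28"),
    output=Decimal("4.40"),
)


def estimate_credits(
    rates: Rates,
    *,
    input_tokens: int = 0,
    cache_write_tokens: int = 0,
    cache_read_tokens: int = 0,
    output_tokens: int = 0,
) -> Decimal:
    """Sum each token count multiplied by its rate.

    How the platform splits a request into these four categories is not
    stated in the docs; the split used here is an assumption.
    """
    counts = (input_tokens, cache_write_tokens, cache_read_tokens, output_tokens)
    if any(n < 0 for n in counts):
        raise ValueError("token counts must be non-negative")
    return (
        rates.input * input_tokens
        + rates.cache_write * cache_write_tokens
        + rates.cache_read * cache_read_tokens
        + rates.output * output_tokens
    )


if __name__ == "__main__":
    # 1,000 fresh input tokens, 2,000 tokens read from cache, 500 output tokens.
    total = estimate_credits(
        GLM_5_2, input_tokens=1000, cache_read_tokens=2000, output_tokens=500
    )
    print(total)
```

With the listed rates, a cached read is billed at one fifth of the fresh input rate (0.28 versus 1.40), which is the arithmetic behind the page's advice to use context caching on very long contexts.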