classifier

分类器使用 AI 辅助权限决策，在 Auto 模式下自动审批写操作。

设计理念

问题

硬编码规则无法覆盖所有场景：

新命令不断出现
上下文影响安全性
用户意图难以预测

解决方案

使用 AI 分类器判断操作是否安全：

命令 + 上下文 → 分类器 → 批准/拒绝

核心实现

分类器调用

async function classifierApprove(
  command: string,
  context: ToolContext
): Promise<PermissionResult> {
  // 1. 构建分类器 prompt
  const prompt = buildClassifierPrompt(command, context);
  
  // 2. 调用 Claude API
  const response = await anthropic.messages.create({
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 100,
    messages: [{ role: 'user', content: prompt }],
  });
  
  // 3. 解析响应
  const decision = parseClassifierResponse(response.content);
  
  return decision;
}

Classifier Prompt

function buildClassifierPrompt(
  command: string,
  context: ToolContext
): string {
  return `
You are a security classifier for shell commands.

Command: ${command}
Working directory: ${context.cwd}
User intent: ${context.userIntent}

Analyze if this command is safe to execute automatically.

Consider:
1. Does it modify important files?
2. Does it have irreversible effects?
3. Does it match the user's intent?
4. Are there any security risks?

Respond with JSON:
{
  "safe": true/false,
  "reason": "explanation",
  "confidence": 0.0-1.0
}
`;
}

响应解析

function parseClassifierResponse(content: string): PermissionResult {
  try {
    const json = JSON.parse(content);
    
    return {
      allowed: json.safe && json.confidence > 0.8,
      reason: json.reason,
      confidence: json.confidence,
    };
  } catch {
    // 解析失败，默认拒绝
    return {
      allowed: false,
      reason: 'Failed to parse classifier response',
    };
  }
}

分类示例

安全操作

// 命令: npm install lodash
// 上下文: 用户说 "安装 lodash"

{
  "safe": true,
  "reason": "Installing a popular package matches user intent",
  "confidence": 0.95
}

危险操作

// 命令: rm -rf node_modules
// 上下文: 用户说 "清理项目"

{
  "safe": false,
  "reason": "Recursive deletion should be confirmed by user",
  "confidence": 0.9
}

不确定操作

// 命令: ./deploy.sh
// 上下文: 用户说 "部署应用"

{
  "safe": false,
  "reason": "Unknown script, cannot verify safety",
  "confidence": 0.6
}

性能优化

缓存决策

const classifierCache = new Map<string, PermissionResult>();

async function classifierApproveWithCache(
  command: string,
  context: ToolContext
): Promise<PermissionResult> {
  const key = `${command}:${context.cwd}`;
  
  if (classifierCache.has(key)) {
    return classifierCache.get(key);
  }
  
  const result = await classifierApprove(command, context);
  classifierCache.set(key, result);
  
  return result;
}

批量分类

async function classifyBatch(
  commands: string[],
  context: ToolContext
): Promise<PermissionResult[]> {
  // 一次 API 调用分类多个命令
  const prompt = buildBatchClassifierPrompt(commands, context);
  const response = await anthropic.messages.create({
    model: 'claude-3-5-sonnet-20241022',
    messages: [{ role: 'user', content: prompt }],
  });
  
  return parseBatchResponse(response.content);
}

置信度阈值

配置

{
  "classifier": {
    "enabled": true,
    "confidenceThreshold": 0.8,
    "fallbackToManual": true
  }
}

决策逻辑

function makeDecision(
  classification: ClassifierResult,
  threshold: number
): PermissionResult {
  if (classification.confidence < threshold) {
    // 置信度不足，回退到手动
    return { allowed: false, reason: 'Low confidence, manual approval required' };
  }
  
  return {
    allowed: classification.safe,
    reason: classification.reason,
  };
}

学习机制

用户反馈

function recordUserFeedback(
  command: string,
  classifierDecision: boolean,
  userDecision: boolean
): void {
  if (classifierDecision !== userDecision) {
    // 分类器错误，记录反馈
    feedbackLog.append({
      command: command,
      predicted: classifierDecision,
      actual: userDecision,
      timestamp: Date.now(),
    });
  }
}

模式学习

// 从反馈中学习新模式
function learnPatterns(feedbackLog: Feedback[]): Pattern[] {
  const patterns = [];
  
  // 找出一致的错误分类
  const errors = feedbackLog.filter(f => f.predicted !== f.actual);
  
  // 提取共同特征
  const commonFeatures = extractCommonFeatures(errors);
  
  // 生成新规则
  patterns.push(...generateRules(commonFeatures));
  
  return patterns;
}

下一步

查看权限模式的配置
了解 Bash 权限检查的详细规则
探索沙箱执行的隔离机制

Previousbash permissions Nextpermission modes

hashtag设计理念

hashtag问题

hashtag解决方案

hashtag核心实现

hashtag分类器调用

hashtagClassifier Prompt

hashtag响应解析

hashtag分类示例

hashtag安全操作

hashtag危险操作

hashtag不确定操作

hashtag性能优化

hashtag缓存决策

hashtag批量分类

hashtag置信度阈值

hashtag配置

hashtag决策逻辑

hashtag学习机制

hashtag用户反馈

hashtag模式学习

hashtag下一步

设计理念

问题

解决方案

核心实现

分类器调用

Classifier Prompt

响应解析

分类示例

安全操作

危险操作

不确定操作

性能优化

缓存决策

批量分类

置信度阈值

配置

决策逻辑

学习机制

用户反馈

模式学习

下一步