Optimize AI Provider Calls

Context

Each visibility run queries every keyword against every provider (e.g., 50 keywords × 3 providers = 150 API calls). There’s no caching, so scheduled runs re-query everything even if results haven’t changed. Default models are expensive (claude-sonnet-4-6, gpt-4o). Combined, this makes monitoring costly for users with many keywords.

Plan

Step 1: Cheaper Default Models (trivial, ~60-70% cost reduction per query)
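Step 1 might amount to a defaults map like the one below. The config shape and the cheaper model names (claude-haiku-4-5, gpt-4o-mini) are assumptions, not canonry's actual schema:

```typescript
// Assumed per-provider default models; keys and model names are illustrative
// stand-ins, not canonry's actual config.
const DEFAULT_MODELS: Record<string, string> = {
  anthropic: "claude-haiku-4-5", // assumed cheaper replacement for claude-sonnet-4-6
  openai: "gpt-4o-mini",         // assumed cheaper replacement for gpt-4o
};
```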

Step 2: Claude Parameter Tuning (trivial, ~15-20% additional savings on Claude)
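Step 2 presumably means capping output length and dropping sampling temperature on the Anthropic request. The specific values below (512-token cap, temperature 0) are illustrative assumptions, not measured settings:

```typescript
// Illustrative Anthropic Messages request parameters; the exact numbers are
// assumptions, not tuned values from this project.
const claudeRequestParams = {
  model: "claude-sonnet-4-6",
  max_tokens: 512, // output tokens dominate per-call cost, so a tight cap saves most
  temperature: 0,  // visibility checks want consistent answers, not sampling variety
};
```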

Step 3: Snapshot Caching with TTL (moderate effort, 75-90% reduction on scheduled runs)

Skip API calls when a recent snapshot exists for the same keyword+provider.

3a. Add cacheTtlHours to config

3b. Add forceRefresh to API

3c. Add --force flag to CLI

3d. Cache logic in JobRunner (packages/canonry/src/job-runner.ts)

3e. Add composite DB index

3f. UI force-refresh option
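Steps 3a-3f could center on a freshness check like this sketch. The `Snapshot` shape, field names, and `shouldCallProvider` helper are assumptions about JobRunner's internals, not the actual implementation:

```typescript
// Hypothetical snapshot shape; the real schema in packages/canonry may differ.
interface Snapshot {
  keyword: string;
  provider: string;
  createdAt: Date;
}

// True when a snapshot exists and is younger than the configured TTL.
function isFresh(
  snapshot: Snapshot | undefined,
  cacheTtlHours: number,
  now: Date = new Date(),
): boolean {
  if (!snapshot) return false;
  const ageMs = now.getTime() - snapshot.createdAt.getTime();
  return ageMs < cacheTtlHours * 60 * 60 * 1000;
}

// Sketch of the JobRunner decision point: skip the provider call when a fresh
// snapshot exists and forceRefresh (--force) was not requested.
function shouldCallProvider(
  snapshot: Snapshot | undefined,
  cacheTtlHours: number,
  forceRefresh: boolean,
  now: Date = new Date(),
): boolean {
  return forceRefresh || !isFresh(snapshot, cacheTtlHours, now);
}
```

The composite index in 3e would presumably cover the latest-snapshot lookup this check depends on, e.g. (keyword, provider, createdAt descending).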

Step 4: Concurrent Keyword Processing (moderate effort, ~60-70% wall-time reduction)

Keywords are currently processed sequentially. Process N keywords concurrently.
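Step 4 could reuse a small bounded-concurrency helper like this generic sketch; the limit N would presumably come from config, and the helper name is hypothetical:

```typescript
// Run fn over items with at most `limit` in flight at once, preserving
// result order. Generic sketch, not canonry's actual implementation.
async function mapConcurrent<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    // Each worker pulls the next unclaimed index until the list is exhausted.
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(Array.from({ length: Math.min(limit, items.length) }, worker));
  return results;
}
```

A bounded pool (rather than Promise.all over everything) keeps the run within provider rate limits while still cutting wall time roughly in proportion to N.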

Step 5: OpenAI Prompt Trim (trivial, marginal savings)

Version Bump

Verification

  1. pnpm run typecheck — ensure no type errors
  2. pnpm run test — existing tests pass
  3. pnpm run lint — no lint issues
  4. Manual: trigger a run, verify snapshots created. Trigger another run within TTL, verify cached results used (check logs for “cached” messages). Trigger with --force, verify fresh API calls.
  5. Manual: verify usage counters only count actual API calls, not cached results.

Key Files