
Troubleshooting

Common failure symptoms, probable causes, and fixes.


Daemon startup failures

| Symptom | Probable cause | Fix |
| --- | --- | --- |
| vlx daemon exits immediately | Missing env var (ADO_PAT, Telegram tokens), or vlx.yaml not found or invalid | Run bash setup/check-prereqs.sh; re-read the error message and re-source env vars |
| Error: unknown adapter type "..." | Typo in adapters.* config key | Check adapters.tracker, adapters.human_channel, etc. Valid values are listed in Configure |
| Config validation failed at startup | Unknown key in vlx.yaml or wrong type | The error message names the offending key; fix it and restart |
| Daemon immediately exits with integrity_check failed | state.db is corrupt | See Database restore in Operations |
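
The first row can be checked by hand before starting the daemon. The following is a minimal sketch, not part of vlx itself: it reports which required variables are unset or empty and whether the config file is readable. ADO_PAT comes from the table above; TELEGRAM_BOT_TOKEN in the usage comment is a hypothetical name, so substitute whatever variables your vlx.yaml actually references.

```bash
# Print the name of each listed variable that is unset or empty.
# The function body runs in a subshell, so its variables stay local.
missing_vars() (
  for name in "$@"; do
    eval "value=\${$name:-}"   # indirect lookup that also works in POSIX sh
    [ -n "$value" ] || printf '%s\n' "$name"
  done
)

# Exit non-zero if the config file is unreadable or any variable is missing.
preflight() (
  config="$1"; shift
  status=0
  missing="$(missing_vars "$@" | tr '\n' ' ')"
  if [ -n "$missing" ]; then
    echo "unset or empty: $missing" >&2
    status=1
  fi
  if [ ! -r "$config" ]; then
    echo "config not readable: $config" >&2
    status=1
  fi
  exit "$status"
)

# Usage (variable names other than ADO_PAT are assumptions):
#   preflight vlx.yaml ADO_PAT TELEGRAM_BOT_TOKEN && vlx daemon
```

This only catches missing variables and an unreadable file; an invalid vlx.yaml is still reported by the daemon's own config validation.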

ADO polling issues

| Symptom | Probable cause | Fix |
| --- | --- | --- |
| ADO poll returns 401 | PAT expired or wrong scope | Regenerate with Work Items R/W + Code R/W; update env var; restart daemon |
| Daemon polls but never picks up a story | Story missing the vlx tag, or assigned to the wrong user | Check the tag; check ado.bot_assignee in vlx.yaml vs the PAT owner; default is @Me (PAT owner) |
| Daemon picks up story but immediately fails | Story work item has no description or title | Add a title and description in ADO before assigning |
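
To tell a bad PAT apart from a daemon problem, you can probe Azure DevOps directly. The sketch below assumes curl is installed and that ADO_ORG (a hypothetical variable, not something vlx defines) holds your organization name. It lists Git repositories through the Azure DevOps REST API using basic auth with an empty user name and the PAT as the password. The status interpretations follow the tables on this page and are a rough guide, not an exhaustive mapping.

```bash
# Map an HTTP status code to the diagnosis this page suggests.
explain_status() {
  case "$1" in
    200) echo "ok: PAT accepted" ;;
    401) echo "rejected: PAT expired or wrong scope" ;;
    403) echo "forbidden: PAT is missing a required scope or permission" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Probe the organization's Git repositories endpoint and explain the result.
# ADO_ORG is an assumed variable name; ADO_PAT is the one the daemon uses.
probe_pat() (
  code="$(curl -s -o /dev/null -w '%{http_code}' \
    -u ":${ADO_PAT}" \
    "https://dev.azure.com/${ADO_ORG}/_apis/git/repositories?api-version=7.0")"
  explain_status "$code"
)

# Usage:
#   ADO_ORG=my-org probe_pat
```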

Telegram issues

| Symptom | Probable cause | Fix |
| --- | --- | --- |
| Telegram messages never arrive | Wrong group_chat_id sign: group IDs are negative; supergroup IDs have a -100 prefix | Re-derive the chat ID via getUpdates; add the bot to the group |
| Bot does not respond to commands | User not in authorized_user_ids | Add your Telegram user ID to telegram.authorized_user_ids in vlx.yaml |
| Approval buttons do nothing | Bot not in forum-mode supergroup, or Topics not enabled | Migrate to a forum-mode supergroup; enable Topics in group settings |
| Clarification question posted but no reply delivered to agent | state_file path not writable (offset not persisted across restart) | Check telegram.state_file path; ensure daemon process has write access |
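
The chat ID rules in the first row can be sketched as a small classifier, followed by a helper that re-derives chat IDs through the Bot API getUpdates method. This is an illustration under assumptions: TELEGRAM_BOT_TOKEN is a hypothetical variable name, the helper needs curl and jq, and the prefix test is a heuristic based on the sign convention described above. Telegram delivers each update to a single getUpdates consumer, so stop the daemon and send a fresh message in the group before listing.

```bash
# Classify a Telegram chat ID by its sign and prefix (heuristic).
classify_chat_id() {
  case "$1" in
    ''|*[!0-9-]*) echo "invalid" ;;
    -100*)        echo "supergroup or channel" ;;
    -*)           echo "group" ;;
    *)            echo "private chat" ;;
  esac
}

# List the chat IDs, types, and titles seen in recent updates.
# TELEGRAM_BOT_TOKEN is an assumed variable name.
list_chat_ids() {
  curl -s "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/getUpdates" |
    jq -r '.result[]
           | (.message // .channel_post // empty)
           | .chat
           | "\(.id)\t\(.type)\t\(.title // .username // "")"' |
    sort -u
}

# Usage:
#   list_chat_ids
#   classify_chat_id -1001234567890
```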

Build / pipeline failures

| Symptom | Probable cause | Fix |
| --- | --- | --- |
| Daemon stuck on a stage | Crash mid-stage; checkpoint not advanced | /history <storyId> in Telegram for the event trail; check stage checkpoints; use /requeue to retry |
| Build fails with plan_inconsistent | Architect's plan has AC rows that don't match what Builder can implement | Approve the re-plan the orchestrator sends to Telegram |
| Review finds critical findings every attempt | Builder is making the same mistake repeatedly | Check agent-brain/CLAUDE.md for a correction; use /restart <storyId> think to re-spec from scratch |
| Inspector gate fails (bun test exits non-zero) | Tests are failing in the worktree | Check qa-evidence/ in the story PR branch; the test output is committed there |
| Story auto-escaped 2 times, now needs_review | 2 consecutive pre-Ship failures | Use /requeue to give it another attempt, or /cancel to abandon |

Database issues

| Symptom | Probable cause | Fix |
| --- | --- | --- |
| state.db integrity failure on startup | Corrupt SQLite | Inspect backups/; run vlx db restore <backup> --yes (there is no auto-rollback) |
| vlx status shows a run stuck in active for hours | Daemon crashed mid-stage; worktree exists but no ACP subprocess | Restart the daemon; recovery runs automatically. If still stuck after restart, use /requeue |
| Event log seq gap detected | Crash between event write and commit | Run is marked needs_review; use /requeue to attempt recovery from the last checkpoint |
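
Because a restore has no auto-rollback, it can help to verify a backup before restoring it. The sketch below assumes backups/ holds plain SQLite files and that the sqlite3 CLI is installed; neither is confirmed by this page. It picks the most recently modified file and runs SQLite's own PRAGMA integrity_check against it.

```bash
# Print the path of the most recently modified file in a backup directory.
# Fails if the directory is empty or missing.
newest_backup() (
  dir="${1:-backups}"
  name="$(ls -t "$dir" 2>/dev/null | head -n 1)"
  [ -n "$name" ] || exit 1
  printf '%s/%s\n' "$dir" "$name"
)

# Succeed only if SQLite reports the file as intact ("ok").
backup_is_intact() {
  [ "$(sqlite3 "$1" 'PRAGMA integrity_check;' 2>/dev/null)" = "ok" ]
}

# Usage (restore only a backup that passes the check):
#   candidate="$(newest_backup backups)" &&
#     backup_is_intact "$candidate" &&
#     vlx db restore "$candidate" --yes
```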

Git / SCM issues

| Symptom | Probable cause | Fix |
| --- | --- | --- |
| Branch push fails with 403 | ADO PAT missing Code R/W scope | Regenerate PAT with Code R/W; update env var; restart daemon |
| PR creation fails with "branch already exists as PR" | A previous run opened a PR for this story; worktree was destroyed | Use /restart <storyId> ship to re-open or update the PR |
| git worktree add fails | .vlx/worktrees/ directory not writable | Check permissions on the worktree root; check runtime.worktree_root in config |
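
The permission check in the last row can be sketched as follows. The default path is the .vlx/worktrees/ directory named above; pass the value of runtime.worktree_root instead if you have overridden it. Run it as the same user the daemon runs as, since writability depends on who is asking.

```bash
# Report whether the worktree root exists and is writable by the current user.
check_worktree_root() (
  root="${1:-.vlx/worktrees}"
  if [ ! -d "$root" ]; then
    echo "missing: $root"
    exit 1
  fi
  if [ ! -w "$root" ]; then
    echo "not writable by $(id -un): $root"
    exit 1
  fi
  echo "ok: $root"
)

# Usage:
#   check_worktree_root                 # checks .vlx/worktrees
#   check_worktree_root /path/to/root   # checks a custom runtime.worktree_root
```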

Auto-merge issues

| Symptom | Probable cause | Fix |
| --- | --- | --- |
| PR approved but never auto-merged | Story missing auto-merge-ok tag, or trust score below threshold | Check ADO work item tags; run vlx status to see trust score |
| Auto-merge triggered too early | Cooldown expired before you noticed | Use /vetomerge <storyId> before the cooldown expires; increase auto_merge.cooldown_minutes |

Diagnostic commands quick reference

bash
vlx status                   # active runs + queue depth
vlx db check                 # SQLite integrity check
vlx memory report            # memory hygiene scan

# In Telegram:
/history <storyId>           # last 30 events
/requeue <storyId>           # revive failed run
/restart <storyId> <stage>   # re-run from stage
/cancel <storyId>            # kill run
/vetomerge <storyId>         # block pending auto-merge

Getting more log detail

Set LOG_LEVEL=debug (or trace for maximum verbosity) and restart the daemon:

bash
LOG_LEVEL=debug vlx daemon

Under systemd, add the override:

bash
sudo systemctl edit vlx
# Add:
# [Service]
# Environment="LOG_LEVEL=debug"
sudo systemctl restart vlx
sudo journalctl -u vlx -f

Internal Veloxcore tool — not a public product.