3KProject Agent Workflow Notes

How I Turned Multi-Agent Collaboration into a Real Workflow Inside 3KProject

This article is about how I actually run agent collaboration inside 3KProject. What finally stabilized the flow was not adding a smarter model, but breaking the collaboration into four concrete lanes: task locking, gates, handoff, and context budget, then wiring each lane to specific documents and tools.

  • Lock: Lock the task before doing any real work so multiple agents do not collide on the same card or the same high-risk file.
  • Gate: Keep pre-flight, in-flight, and post-flight checks fixed so the rules do not live only inside documents.
  • Handoff: A handoff is not a long paragraph. It compresses changed files, decisions, blockers, and next actions into a summary the next agent can actually take over.

I Eventually Realized the Problem Wasn't Model Strength

The hardest part of multi-agent collaboration is not generation. It is takeover quality. If scope, state, and validation are unclear when the next agent steps in, even a strong model can still turn the repo into a mess.

At first I oversimplified the problem too. I thought that if the instructions were detailed enough, agents would naturally collaborate. In practice, the real failure cases were much more engineering-shaped: two agents touching the same card, the next person not being able to tell what the previous one changed, handoffs carrying so much background that the context exploded, and rules written in docs but never executed by tools.

So I stopped chasing a cleverer prompt and instead collapsed things into a repo operating system. Behind that operating system is a set of documents I open over and over again: the collaboration instructions, the task-card handbook, the keep workflow shards, the context budget guideline, and the multi-agent collaboration protocol.

If a single agent is like a very capable component inside a Unity project, then multi-agent collaboration is closer to scene execution order, event contracts, and runtime guards. A strong component alone still cannot replace lifecycle management for the whole scene.

The Collaboration Spine I Finally Settled On

After reading through the repo, I reduced the collaboration loop into a short spine: read the consensus summary, run the context budget check, lock the task, edit only within a local slice, run focused validation as the first check, then compress the result into a handoff for the next agent. Once the chain gets short, hidden risks become obvious very quickly.

New request or task card arrives
  • Pre-flight: read keep.summary -> run check-context-budget -> task-lock.js check / lock -> confirm this round's scope.
  • In-flight: edit only the local files and local behavior, then run the first focused validation immediately without expanding scope.
  • Did validation pass? Cheap checks first.
  • FAIL, repair in place: return to the same slice, patch the same gap, and rerun the same check.
  • PASS, handoff + unlock: write changedFiles / decisions / blockers / nextAction, then hand it to the next agent.
Figure 1: I eventually made every agent follow the same short spine. The point was not more process theater, but making it explicit when scope may expand and when it absolutely may not.

Why lock first

I no longer treat task lock as bureaucratic overhead. I treat it as the cheapest collision avoider. If the task is not locked at the beginning, everything after that becomes much harder to trust.
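
To make the collision-avoidance idea concrete, here is a minimal in-memory sketch of check / lock / unlock semantics. It is not the real tools_node/task-lock.js, whose internals this article does not show. The `createLockTable` helper, its return shapes, and the agent names are assumptions made purely for illustration.

```javascript
// Hypothetical sketch: NOT the real tools_node/task-lock.js.
// Models check / lock / unlock semantics with an in-memory Map.
function createLockTable() {
  const locks = new Map(); // taskId -> agentName

  return {
    // Report who holds the task, or null when it is free.
    check(taskId) {
      return locks.has(taskId) ? locks.get(taskId) : null;
    },
    // Take the lock only if nobody else already holds it.
    lock(taskId, agentName) {
      const holder = locks.get(taskId);
      if (holder !== undefined && holder !== agentName) {
        return { ok: false, holder };
      }
      locks.set(taskId, agentName);
      return { ok: true, holder: agentName };
    },
    // Only the current holder may release the lock.
    unlock(taskId, agentName) {
      if (locks.get(taskId) !== agentName) {
        return { ok: false, holder: locks.get(taskId) ?? null };
      }
      locks.delete(taskId);
      return { ok: true, holder: null };
    },
  };
}

const lockTable = createLockTable();
const firstAttempt = lockTable.lock("HARN-ARCH-0002", "agent-a");
// The second agent collides on the same card and is refused.
const secondAttempt = lockTable.lock("HARN-ARCH-0002", "agent-b");
```

The point of the sketch is the refusal path: the second agent learns who holds the card before it edits anything, which is far cheaper than discovering the collision in a diff.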

Why cheap checks first

The first validation must be cheap and local. That is the only way it will actually run after each small edit, instead of becoming a ritual people remember only at the end.
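
One way to picture "cheap checks first" is a runner that sorts checks by cost and stops at the first failure. This is only a sketch: `runCheapestFirst`, the check names, and the cost numbers are hypothetical and do not describe how compute-gate.js works.

```javascript
// Hypothetical sketch: run checks cheapest-first and stop at the first failure.
// The check names and cost values are illustrative, not taken from the repo.
function runCheapestFirst(checks) {
  const ordered = [...checks].sort((a, b) => a.cost - b.cost);
  const ran = [];
  for (const check of ordered) {
    ran.push(check.name);
    if (!check.run()) {
      return { passed: false, failedAt: check.name, ran };
    }
  }
  return { passed: true, failedAt: null, ran };
}

const spineResult = runCheapestFirst([
  { name: "full-gate", cost: 10, run: () => true },
  { name: "focused-validation", cost: 1, run: () => false },
]);
// The focused check fails first, so the expensive gate never runs.
```

Because the failing check is also the cheapest, the repair loop stays inside the same slice and the same check can be rerun immediately.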

The Collaboration Documents I Actually Keep Opening

If I unpack this method back into documents, I keep opening five groups of files. They are not parallel duplicates. Each one plays a different role: entry layer, consensus layer, collaboration layer, task layer, and throttling layer. That separation is exactly why I do not need to cram every rule into one giant handbook.

  • Entry layer: AGENTS.md / .github/copilot-instructions.md. Put language rules, hard rules, pre-flight, and task lock into the first screen so every agent starts from the same opening move.
  • Consensus layer: docs/keep.summary.md + docs/keep-shards/keep-workflow.md. Keep project consensus in summary form instead of full-text rereads. Land task-card, handoff, collision, and Git rules here.
  • Collaboration layer: .github/instructions/agent-collaboration.instructions.md. Three defenses: pre-flight / in-flight / post-flight. Turn task lock, compute gate, and handoff into an actual operating flow.
  • Task layer: docs/agent-briefs/Readme.md. Task-card lifecycle, notes format, commit batches, and handoff rules. Give "who is doing what" and "how far along" a formal place to live.
  • Throttle layer: docs/agent-context-budget.md + docs/agent-collaboration-protocol.md. Handoff summary cards, image throttling, and token hard-stops. Define what the next agent must verify before taking over.
  • Support scripts: task-lock.js / compute-gate.js / check-context-budget.js / report-turn-usage.js.
Docs explain the rules. Scripts actually run them. Keeping those two separate is what stops collaboration from degrading into a verbal agreement.
Figure 2: I did not solve collaboration with one mega-document. I split it into layers so entry, consensus, execution, throttling, and handoff each have their own job.
  • AGENTS.md / CLAUDE.md / .github/copilot-instructions.md: these entry files make Codex, Claude Code, and Copilot see the hard rules first instead of discovering the process only after they already started working.
  • .github/instructions/agent-collaboration.instructions.md: the shared entry layer that lets Antigravity and any other shared-flow agent see the same lock rule and the same three defenses first.
  • docs/agent-briefs/Readme.md: the formal home of task-card lifecycle and notes format, so collaboration state becomes a traceable record.
  • docs/keep.summary.md / docs/keep-shards/keep-workflow.md: the consensus layer that lets agents know the current project rules without rereading the whole keep.
  • docs/agent-context-budget.md / docs/agent-collaboration-protocol.md: the documents that make context throttling and handoff contracts explicit, so message-layer chaos does not eat the workflow.

One Patch I Now Treat as Mandatory: All Four Entry Points Must See the Lock Rule First

I eventually noticed a dangerous hole: the `task-lock` rule already existed, but it mostly lived deep inside files like `docs/agent-briefs/Readme.md`. That means if an agent enters directly through Copilot, Codex, Claude Code, or Antigravity, it can easily skip the lock flow without even realizing it.

The real issue was not that the rule did not exist. The issue was that it did not live in a must-pass entry point. If the entry does not show it, the downstream handbook is only remedial reading, not a real pre-flight guard.

If the four agent entry points do not show the lock rule first, the whole workflow can drift from step zero.
  • Copilot entry: .github/copilot-instructions.md
  • Codex entry: AGENTS.md
  • Claude Code entry: CLAUDE.md
  • Antigravity shared entry: agent-collaboration instructions + keep.summary

Hard Rule #0: lock first

node tools_node/task-lock.js check <task-id>
node tools_node/task-lock.js lock <task-id> <agent-name>

Then immediately update status / started_at / started_by_agent. Only after that may the agent enter pre-flight / edits / validation / handoff. If the entry point hides this rule, the agent can bypass the whole collaboration defense line in the very first step.
Figure 3: I moved the lock rule into every entry point because if the entry does not show it, an agent can start working while being completely unaware of the collaboration guard.

The patch I landed was not "write an even longer handbook." It was to move Hard Rule #0 directly into the first layer of every agent entry. That sounds small, but it changes collaboration reliability a lot.
  • Create `CLAUDE.md`: put Hard Rule #0 at the top of the Claude Code main entry so it appears before any other pre-flight steps.
  • Update `AGENTS.md`: make the Codex entry show the lock rule before general work guidance.
  • Update `.github/copilot-instructions.md`: insert the hard rule ahead of Copilot's pre-flight flow so it is not treated like a secondary detail.
  • Update `.github/instructions/agent-collaboration.instructions.md`: lift the lock rule from "one step in pre-flight" into a highly visible shared hard rule.
  • Update `docs/keep.summary.md` section 5: turn the one-line reminder into an explicit command block so the shared consensus layer also surfaces it immediately.
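
The "update status / started_at / started_by_agent" step that follows the lock can be sketched as a small frontmatter rewrite. This is a rough illustration only. It assumes the task card opens with a `---` delimited block of simple `key: value` lines, and the `id` field, the status values `open` and `in-progress`, the timestamp, and the agent name are all placeholders rather than the repo's real schema.

```javascript
// Hypothetical sketch: assumes the task card starts with a "---" delimited
// frontmatter block of simple "key: value" lines. The real format may differ.
function setFrontmatterFields(cardText, fields) {
  const lines = cardText.split("\n");
  if (lines[0] !== "---") throw new Error("card has no frontmatter");
  const end = lines.indexOf("---", 1);
  if (end === -1) throw new Error("frontmatter is not closed");

  const body = lines.slice(1, end);
  for (const [key, value] of Object.entries(fields)) {
    const index = body.findIndex((line) => line.startsWith(key + ":"));
    const entry = `${key}: ${value}`;
    if (index === -1) body.push(entry); // add a missing field
    else body[index] = entry; // overwrite an existing field
  }
  return ["---", ...body, "---", ...lines.slice(end + 1)].join("\n");
}

const openCard =
  "---\nid: HARN-ARCH-0002\nstatus: open\n---\nRemove the illegal battle -> ui reference";
const lockedCard = setFrontmatterFields(openCard, {
  status: "in-progress", // assumed status value
  started_at: "2025-01-01T00:00:00Z", // placeholder timestamp
  started_by_agent: "agent-a", // placeholder agent name
});
```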

Wrong move

Hide the lock rule in a deep document and assume every agent will keep reading until it eventually finds that section. That may survive in solo work, but it almost always fails in relay-style collaboration.

Right move

Make every entry show `check`, `lock`, and the required frontmatter update on the first screen. The agent should learn "you cannot edit yet, lock first" before it touches anything.
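
A rule like "the lock commands must appear on the first screen" can itself be turned into a check. The sketch below is an assumption-heavy illustration: `findEntriesMissingLockRule` is a hypothetical helper, and treating "first screen" as the first 40 lines is my own approximation, not a number from the repo.

```javascript
// Hypothetical sketch: "first screen" is approximated as the first N lines.
// The 40-line default is an assumption, not a rule from the repo.
function findEntriesMissingLockRule(entries, firstScreenLines = 40) {
  const required = ["task-lock.js check", "task-lock.js lock"];
  return Object.entries(entries)
    .filter(([, text]) => {
      const firstScreen = text.split("\n").slice(0, firstScreenLines).join("\n");
      return !required.every((needle) => firstScreen.includes(needle));
    })
    .map(([path]) => path);
}

// Entry file contents are passed in as strings so the sketch stays self-contained.
const missingEntries = findEntriesMissingLockRule({
  "AGENTS.md":
    "Hard Rule #0\nnode tools_node/task-lock.js check <task-id>\nnode tools_node/task-lock.js lock <task-id> <agent-name>",
  "CLAUDE.md": "General guidance only",
});
```

In this example only the entry without the two commands is reported, which is exactly the hole described earlier: the rule exists somewhere, but not where the agent lands first.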

What These Documents Actually Prevent

For me, the value of these documents is not that they are comprehensive. The value is that each one blocks a different kind of failure. Once failure types are separated, the collaboration rules stop sounding like vague slogans and become operational.

The common failures are not an abstract "the agent got worse". They are these four concrete incidents.
  • Task / file collision. Main defense: task-lock.js, docs/agent-briefs/Readme.md, agent-collaboration.instructions.md. Lock first, fill started_by_agent, and restrict concurrent edits on high-risk files. Split concurrent work before it collides.
  • Handoff amnesia. Main defense: handoff contract, notes format, changedFiles checks. The next agent should not guess what happened. It should compare the summary against the actual diff. Make takeover verifiable instead of memory-based.
  • Context overload. Main defense: docs/agent-context-budget.md and the handoff summary-card format. Limit the payload to 1 to 3 necessary files, and keep full keep files or huge images out of the handoff. Keep the next round capable of local reasoning.
  • Rules living only in docs. Main defense: compute-gate, validate, encoding checks, report-turn-usage, and other scripts. Turn the rules into pre-flight and post-flight commands instead of review-time reminders. Make documents behave like executable process.
Figure 4: The collaboration documents work because each one stops a different kind of incident, not because every rule was piled into the same place.
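
The "compare the summary against the actual diff" defense can be sketched as a simple set comparison. `compareChangedFiles` is a hypothetical helper, and the file lists are passed in as plain arrays. How the actual list is obtained (for example from a diff listing) is left out on purpose.

```javascript
// Hypothetical sketch: compare the files a handoff claims were changed
// against the files that actually changed.
function compareChangedFiles(claimed, actual) {
  const claimedSet = new Set(claimed);
  const actualSet = new Set(actual);
  return {
    // Changed on disk but never mentioned in the handoff.
    unreported: actual.filter((file) => !claimedSet.has(file)),
    // Mentioned in the handoff but not actually changed.
    phantom: claimed.filter((file) => !actualSet.has(file)),
  };
}

const handoffAudit = compareChangedFiles(
  ["shared/interfaces/IBattleUIBridge.ts"],
  ["shared/interfaces/IBattleUIBridge.ts", "docs/keep.summary.md"]
);
// handoffAudit.unreported lists the file the handoff forgot to mention.
```

Either non-empty list is a signal that the next agent should stop and ask before building on the handoff.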

What scares me most is not the bug

What really scares me is this: a change already happened, but nobody can say who made it, under which rule, or whether the move can still be replayed. Once multi-agent collaboration loses those three things, it quickly becomes an unmaintainable black box.

That is why I keep splitting rules into smaller layers

Once rules are split, each layer solves one problem only. That layered design is more practical than a giant governance document because each layer maps to one concrete action the agent has to take at that moment.

How I Actually Use This Stack Day to Day

Once these rules settled down, my daily workflow became very fixed. Before taking over, I check the consensus summary, run the health scan and the context budget check, then lock the task. After the edit, I run the cheapest validation first, then add the fuller gate if needed, then compress changed files and decisions into a handoff.

Pre-flight: what I always run before starting

node tools_node/compute-gate.js --profile quick --agent-feedback --no-stop
node tools_node/check-context-budget.js --changed
node tools_node/task-lock.js check <task-id>
node tools_node/task-lock.js lock <task-id> <agent-name>

Post-flight: what I always run before leaving

node tools_node/compute-gate.js --profile standard --agent-feedback
node tools_node/check-encoding-touched.js <changed-files...>
node tools_node/task-lock.js unlock <task-id> <agent-name>
node tools_node/report-turn-usage.js --changed --emit-final-line

What I care about most here is not the commands themselves, but the order. Read the summary before locking. Do focused validation before expanding scope. Write the handoff before leaving. Once the order breaks, even existing rules start acting like cleanup instead of defense.
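
Since the order is the point, it can help to encode it as data rather than memory. The sketch below only assembles the command strings already listed above into two ordered arrays. `buildFlightPlan` is a hypothetical helper, the agent name is a placeholder, and nothing here actually executes the commands.

```javascript
// Sketch: the command strings mirror the ones listed above. The helper
// itself (buildFlightPlan) is hypothetical and only encodes their order.
function buildFlightPlan(taskId, agentName, changedFiles) {
  return {
    preFlight: [
      "node tools_node/compute-gate.js --profile quick --agent-feedback --no-stop",
      "node tools_node/check-context-budget.js --changed",
      `node tools_node/task-lock.js check ${taskId}`,
      `node tools_node/task-lock.js lock ${taskId} ${agentName}`,
    ],
    postFlight: [
      "node tools_node/compute-gate.js --profile standard --agent-feedback",
      `node tools_node/check-encoding-touched.js ${changedFiles.join(" ")}`,
      `node tools_node/task-lock.js unlock ${taskId} ${agentName}`,
      "node tools_node/report-turn-usage.js --changed --emit-final-line",
    ],
  };
}

const flightPlan = buildFlightPlan("HARN-ARCH-0002", "agent-a", [
  "shared/interfaces/IBattleUIBridge.ts",
]);
```

Keeping the sequence in one place means a wrapper script or a checklist can be generated from it, instead of each agent reconstructing the order from prose.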

This workflow is especially friendly to smaller models and relay-style collaboration. If each round stays inside a small slice, the model does not need to understand the whole system to finish one reliable segment. That is why I now believe multi-agent collaboration is not about many people working at once. It is about making sure any next person can still understand what they are inheriting.

What I Require Every Round to Leave Behind as Handoff

In this repo, handoff is no longer a freestyle summary. It is a contract with explicit fields. For me, the most useful format is not a long essay. It is something that lets the next agent see immediately which files matter, what has already been decided, and what does not need to be reread.

Task: HARN-ARCH-0002
Goal: Remove the illegal battle -> ui reference
Read:
- docs/keep.summary.md
- docs/agent-briefs/tasks/HARN-ARCH-0002.md
- shared/interfaces/IBattleUIBridge.ts
Known:
- The task card is already locked
- The battle side may only touch the interface, not UI components directly
- The current blocker is that the bridge injection point is still missing
Need:
- Add the bridge injection point
- Use focused validation to confirm the illegal reference is gone
Avoid:
- Do not reread keep.md in full
- Do not expand into unrelated UI contract files

I like this kind of summary card because it makes the function of handoff very explicit. It is not retelling all background. It is narrowing the next agent's reading surface. Once the surface gets smaller, takeover cost drops immediately, and token usage does not blow up.
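
Because the card has a regular shape, it can be read by a very small parser. The sketch below assumes exactly the layout shown above: `Key: value` lines hold scalars, and a bare `Key:` line starts a list of `- item` lines. `parseHandoffCard` is a hypothetical helper, not a tool that exists in the repo.

```javascript
// Hypothetical sketch: parse the summary-card layout shown above, where
// "Key: value" lines hold scalars and "Key:" lines start a "- item" list.
function parseHandoffCard(text) {
  const card = {};
  let currentList = null;
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue;
    if (line.startsWith("- ") && currentList) {
      currentList.push(line.slice(2)); // item belongs to the open list
      continue;
    }
    const colon = line.indexOf(":");
    if (colon === -1) continue; // ignore lines that fit neither shape
    const key = line.slice(0, colon);
    const value = line.slice(colon + 1).trim();
    if (value === "") {
      currentList = card[key] = []; // bare "Key:" opens a new list
    } else {
      card[key] = value;
      currentList = null;
    }
  }
  return card;
}

const parsedCard = parseHandoffCard(
  [
    "Task: HARN-ARCH-0002",
    "Goal: Remove the illegal battle -> ui reference",
    "Read:",
    "- docs/keep.summary.md",
    "- docs/agent-briefs/tasks/HARN-ARCH-0002.md",
    "Need:",
    "- Add the bridge injection point",
  ].join("\n")
);
```

Once the card is structured data, the next agent (or a script) can read `parsedCard.Read` directly instead of scanning a paragraph for file names.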

What a good handoff looks like

Few files, short conclusions, a clear next move, and an explicit note about which large documents or artifacts do not need to be reread.

What a bad handoff looks like

Pasting the keep, the manifest, QA notes, and a huge diff into the same message. It looks complete, but in reality it sends the next round straight into a token hard-stop.
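
The difference between a good and a bad handoff can be partly checked mechanically. The sketch below is illustrative only: the section names come from the summary card shown earlier and the 1 to 3 file limit from the context-overload defense, while `lintHandoffCard` and its message wording are hypothetical.

```javascript
// Hypothetical sketch: section names come from the summary card above and the
// 1-to-3 file limit from the context-overload defense. The linter is illustrative.
function lintHandoffCard(card, maxReadFiles = 3) {
  const problems = [];
  for (const section of ["Task", "Goal", "Read", "Known", "Need", "Avoid"]) {
    if (card[section] === undefined) problems.push(`missing section: ${section}`);
  }
  const read = Array.isArray(card.Read) ? card.Read : [];
  if (read.length < 1 || read.length > maxReadFiles) {
    problems.push(`Read must list 1 to ${maxReadFiles} files, found ${read.length}`);
  }
  return problems;
}

const goodCardProblems = lintHandoffCard({
  Task: "HARN-ARCH-0002",
  Goal: "Remove the illegal battle -> ui reference",
  Read: ["docs/keep.summary.md"],
  Known: ["The task card is already locked"],
  Need: ["Add the bridge injection point"],
  Avoid: ["Do not reread keep.md in full"],
});

// A bloated card: too many files to read and no Avoid section.
const badCardProblems = lintHandoffCard({
  Task: "HARN-ARCH-0002",
  Goal: "Remove the illegal battle -> ui reference",
  Read: ["a.md", "b.md", "c.md", "d.md"],
  Known: [],
  Need: [],
});
```

A lint like this cannot judge whether the conclusions are right, but it does catch the structural version of a bad handoff before it reaches the next agent.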

The Role Humans Should Actually Play in AI Collaboration

When I read about Harness Engineering, one of the most important reminders for me was this: a good harness is not meant to remove humans from the loop. It is meant to direct human attention toward the highest-value places. Human developers bring more than coding skill into a codebase. They bring taste, organizational memory, debt tradeoffs, product goals, and a strong instinct for what the team would or would not accept.

Agents do not naturally have those things. They do not know which convention is load-bearing and which one is just historical habit. They also do not know whether a technically correct solution actually matches the direction I want 3KProject to move toward. That is why I gradually shifted the human role from "line-by-line agent inspector" to "harness steward."

  • Human steering: goals, taste, tradeoffs, exception approvals.
  • Guides: turn experience into entry rules, task cards, how-tos, and no-go zones.
  • Coding agent: execute local tasks, repair errors, produce drafts and patches.
  • Sensors: return signals through type, lint, test, schema, and runtime smoke checks.
  • What humans decide when tools cannot: product direction, acceptable debt, architecture exceptions, semantic ambiguity, and final acceptance criteria.
Figure 5: Humans are not excluded by the harness. They set goals, tune guides and sensors, and handle the tradeoffs that neither tools nor agents can judge reliably.

Humans should not keep doing low-level checks

Formatting, schema, module boundaries, encoding, and type errors should be pushed into computational sensors. If humans keep manually checking those, collaboration cost gets stuck inside review.

Humans should keep tuning the system

When the same class of mistake repeats, my job is not to scold the agent. My job is to add a guide, add a sensor, add a fixture, or move a rule farther forward into the entry point.

So inside 3KProject, the role I want for myself is steward, not human linter. The agent handles local execution. The tools handle deterministic checks. Humans handle direction, risk, and exceptions. That is how multi-agent collaboration gets steadier over time instead of dumping all uncertainty back into human heads.

The Final Decision Rule I Keep for Myself

If I had to reduce my current understanding of multi-agent collaboration to one sentence, it would be this: do not bet collaboration on everyone being individually brilliant; build it on a system that lets any next person align immediately at takeover time.

Inside 3KProject, I now see this as an extension of Harness Engineering. A single-agent harness solves "how do we make the model fail less often." A multi-agent harness solves "how do we keep relay-style work from amplifying errors during handoff."

Better handed to the agent

Requirement decomposition, local edits, document cleanup, semantic ambiguity reading, and higher-level option comparison.

Better handed to the system

Task lock, scope management, validation order, handoff structure, context budget, and file-boundary enforcement.

I no longer think of multi-agent collaboration as "opening several chat windows." I think of it as writing the repo's lifecycle, execution order, and guards clearly enough that agents can act like a team instead of a crowd of temporary workers who cannot see one another.