3KProject Agent Workflow Notes
How I Turned Multi-Agent Collaboration into a Real Workflow Inside 3KProject
This article is about how I actually run agent collaboration inside 3KProject. What finally stabilized the flow was not adding a smarter model, but breaking the collaboration into four concrete lanes: task locking, gates, handoff, and context budget, then wiring each lane to specific documents and tools.
I Eventually Realized the Problem Wasn't Model Strength
The hardest part of multi-agent collaboration is not generation. It is takeover quality. If scope, state, and validation are unclear when the next agent steps in, even a strong model can still turn the repo into a mess.
At first I oversimplified the problem too. I thought that if the instructions were detailed enough, agents would naturally collaborate. In practice, the real failure cases were much more engineering-shaped: two agents touching the same card, the next person not being able to tell what the previous one changed, handoffs carrying so much background that the context exploded, and rules written in docs but never executed by tools.
So I stopped chasing a more clever prompt and instead collapsed things into a repo operating system. Behind that operating system is a set of documents I open over and over again: collaboration instructions, the task-card handbook, keep workflow shards, the context budget guideline, and the multi-agent collaboration protocol.
The Collaboration Spine I Finally Settled On
After reading through the repo, I reduced the collaboration loop into a short spine: read the consensus summary, run the context budget check, lock the task, edit only within a local slice, run focused validation as the first check, then compress the result into a handoff for the next agent. Once the chain gets short, hidden risks become obvious very quickly.
Why lock first
I no longer treat task lock as bureaucratic overhead. I treat it as the cheapest collision avoider. If the task is not locked at the beginning, everything after that becomes much harder to trust.
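To make the "cheapest collision avoider" idea concrete, here is a minimal sketch of a file-based lock. It is not the real `tools_node/task-lock.js`; the function names, the `.lock` file layout, and the JSON record are all assumptions made for illustration. The only technique it relies on is Node's exclusive-create flag `wx`, which fails if the file already exists.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical layout: one <taskId>.lock file per task inside lockDir.
function lockPathFor(lockDir, taskId) {
  return path.join(lockDir, `${taskId}.lock`);
}

// Try to take the lock. The "wx" flag makes the write fail with EEXIST
// when another agent has already created the lock file.
function tryLock(lockDir, taskId, agentName) {
  const lockPath = lockPathFor(lockDir, taskId);
  const record = JSON.stringify({ taskId, agentName, lockedAt: new Date().toISOString() });
  try {
    fs.writeFileSync(lockPath, record, { flag: "wx" });
    return { ok: true, owner: agentName };
  } catch (err) {
    if (err.code !== "EEXIST") throw err;
    // Someone else holds the lock: report who, instead of overwriting.
    const holder = JSON.parse(fs.readFileSync(lockPath, "utf8"));
    return { ok: false, owner: holder.agentName };
  }
}

// Only the recorded owner may release the lock.
function unlock(lockDir, taskId, agentName) {
  const lockPath = lockPathFor(lockDir, taskId);
  if (!fs.existsSync(lockPath)) return false;
  const holder = JSON.parse(fs.readFileSync(lockPath, "utf8"));
  if (holder.agentName !== agentName) return false;
  fs.unlinkSync(lockPath);
  return true;
}
```

The point of the sketch is the failure mode: a second agent gets a clear "already locked by X" answer before it edits anything, which is far cheaper than untangling two overlapping changes afterwards.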
Why cheap checks first
The first validation must be cheap and local. That is the only way it will actually run after each small edit, instead of becoming a ritual people remember only at the end.
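One way to picture "cheap checks first" is a runner that sorts checks by cost and stops at the first failure. This is a sketch under assumptions: the `cost` numbers and check names are invented for illustration and do not describe how the repo's gate is actually implemented.

```javascript
// Each check is { name, cost, run }, where run() returns true on success.
// cost is a hypothetical relative number (for example, expected seconds).
function runChecksCheapestFirst(checks) {
  const ordered = [...checks].sort((a, b) => a.cost - b.cost);
  const ran = [];
  for (const check of ordered) {
    ran.push(check.name);
    // Stop at the first failure so expensive checks never run on a broken edit.
    if (!check.run()) return { passed: false, failedAt: check.name, ran };
  }
  return { passed: true, failedAt: null, ran };
}
```

With this ordering, a failing focused check ends the round immediately and the full gate is never started, which is what keeps the first check cheap enough to run after every small edit.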
The Collaboration Documents I Actually Keep Opening
If I unpack this method back into documents, I keep opening five groups of files. They are not parallel duplicates. Each one plays a different role: entry layer, consensus layer, collaboration layer, task layer, and throttling layer. That separation is exactly why I do not need to cram every rule into one giant handbook.
- AGENTS.md / CLAUDE.md / .github/copilot-instructions.md: these entry files make Codex, Claude Code, and Copilot see the hard rules first instead of discovering the process only after they already started working.
- .github/instructions/agent-collaboration.instructions.md: the shared entry layer that lets Antigravity and any other shared-flow agent see the same lock rule and the same three defenses first.
- docs/agent-briefs/Readme.md: the formal home of task-card lifecycle and notes format, so collaboration state becomes a traceable record.
- docs/keep.summary.md / docs/keep-shards/keep-workflow.md: the consensus layer that lets agents know the current project rules without rereading the whole keep.
- docs/agent-context-budget.md / docs/agent-collaboration-protocol.md: the documents that make context throttling and handoff contracts explicit, so message-layer chaos does not eat the workflow.
One Patch I Now Treat as Mandatory: All Four Entry Points Must See the Lock Rule First
I eventually noticed a dangerous hole: the `task-lock` rule already existed, but it mostly lived deep inside files like `docs/agent-briefs/Readme.md`. That means if an agent enters directly through Copilot, Codex, Claude Code, or Antigravity, it can easily skip the lock flow without even realizing it.
The real issue was not that the rule did not exist. The issue was that it did not live in a must-pass entry point. If the entry does not show it, the downstream handbook is only remedial reading, not a real pre-flight guard.
- Create `CLAUDE.md`: put Hard Rule #0 at the top of the Claude Code main entry so it appears before any other pre-flight steps.
- Update `AGENTS.md`: make the Codex entry show the lock rule before general work guidance.
- Update `.github/copilot-instructions.md`: insert the hard rule ahead of Copilot's pre-flight flow so it is not treated like a secondary detail.
- Update `.github/instructions/agent-collaboration.instructions.md`: lift the lock rule from "one step in pre-flight" into a highly visible shared hard rule.
- Update `docs/keep.summary.md` section 5: turn the one-line reminder into an explicit command block so the shared consensus layer also surfaces it immediately.
Wrong move
Hide the lock rule in a deep document and assume every agent will keep reading until it eventually finds that section. That may survive in solo work, but it almost always fails in relay-style collaboration.
Right move
Make every entry show `check`, `lock`, and the required frontmatter update on the first screen. The agent should learn "you cannot edit yet, lock first" before it touches anything.
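The "first screen" requirement can itself be turned into a check. The sketch below is hypothetical: it approximates the first screen as the first 40 lines of an entry file and only looks for the `check` and `lock` commands. The frontmatter update is left out because its format is not specified here.

```javascript
// Assumption: these two substrings are enough to prove the lock rule is visible.
const REQUIRED_TERMS = ["task-lock.js check", "task-lock.js lock"];
// Assumption: "first screen" means the first 40 lines.
const FIRST_SCREEN_LINES = 40;

// Return the required terms that do NOT appear near the top of the text.
function missingOnFirstScreen(text, maxLines = FIRST_SCREEN_LINES) {
  const firstScreen = text.split(/\r?\n/).slice(0, maxLines).join("\n");
  return REQUIRED_TERMS.filter((term) => !firstScreen.includes(term));
}

// entries maps a file path to its text; the report lists only failing files.
function auditEntries(entries, maxLines = FIRST_SCREEN_LINES) {
  const report = {};
  for (const [file, text] of Object.entries(entries)) {
    const missing = missingOnFirstScreen(text, maxLines);
    if (missing.length > 0) report[file] = missing;
  }
  return report;
}
```

In practice the `entries` object could be filled by reading `AGENTS.md`, `CLAUDE.md`, `.github/copilot-instructions.md`, and `.github/instructions/agent-collaboration.instructions.md` from disk. An empty report would mean all four entry points surface the rule early.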
What These Documents Actually Prevent
For me, the value of these documents is not that they are comprehensive. The value is that each one blocks a different kind of failure. Once failure types are separated, the collaboration rules stop sounding like vague slogans and become operational.
What scares me most is not the bug
What really scares me is this: a change already happened, but nobody can say who made it, under which rule, or whether the move can still be replayed. Once multi-agent collaboration loses those three things, it quickly becomes an unmaintainable black box.
That is why I keep splitting rules into smaller layers
Once rules are split, each layer solves exactly one problem. That layered design is more practical than a giant governance document because each layer backs one concrete action at the moment it is needed.
How I Actually Use This Stack Day to Day
Once these rules settled, my daily workflow became a fixed routine. Before taking over, I read the consensus summary, run the health scan and the context budget check, then lock the task. After the edit, I run the cheapest validation first, add the fuller gate if needed, then compress the changed files and decisions into a handoff.
Pre-flight: what I always run before starting
node tools_node/compute-gate.js --profile quick --agent-feedback --no-stop
node tools_node/check-context-budget.js --changed
node tools_node/task-lock.js check <task-id>
node tools_node/task-lock.js lock <task-id> <agent-name>
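The pre-flight commands above can also be chained so that a failed step blocks the next one. The wrapper below is a sketch, not part of the repo. It assumes each tool signals failure with a non-zero exit code, which is not confirmed here, and `runInOrder` and `spawnRunner` are hypothetical names.

```javascript
const { spawnSync } = require("node:child_process");

// The four pre-flight commands, with task id and agent name filled in.
function preflightCommands(taskId, agentName) {
  return [
    ["node", "tools_node/compute-gate.js", "--profile", "quick", "--agent-feedback", "--no-stop"],
    ["node", "tools_node/check-context-budget.js", "--changed"],
    ["node", "tools_node/task-lock.js", "check", taskId],
    ["node", "tools_node/task-lock.js", "lock", taskId, agentName],
  ];
}

// Default runner: spawn the command and return its exit status.
function spawnRunner([cmd, ...args]) {
  return spawnSync(cmd, args, { stdio: "inherit" }).status;
}

// Run commands in order and stop at the first non-zero status.
// The runner is injectable so the sequence can be tested without the real tools.
function runInOrder(commands, runner = spawnRunner) {
  for (const command of commands) {
    const status = runner(command);
    if (status !== 0) return { ok: false, failed: command.join(" ") };
  }
  return { ok: true, failed: null };
}
```

Calling `runInOrder(preflightCommands("HARN-ARCH-0002", "agent-a"))` from the repo root would then stop before `lock` if `check` reports a problem, so a failed check never leads to a lock attempt.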
Post-flight: what I always run before leaving
node tools_node/compute-gate.js --profile standard --agent-feedback
node tools_node/check-encoding-touched.js <changed-files...>
node tools_node/task-lock.js unlock <task-id> <agent-name>
node tools_node/report-turn-usage.js --changed --emit-final-line
This workflow is especially friendly to smaller models and relay-style collaboration. If each round stays inside a small slice, the model does not need to understand the whole system to finish one reliable segment. That is why I now believe multi-agent collaboration is not about many people working at once. It is about making sure any next person can still understand what they are inheriting.
What I Require Every Round to Leave Behind as Handoff
In this repo, handoff is no longer a freestyle summary. It is a contract with explicit fields. For me, the most useful format is not a long essay. It is something that lets the next agent see immediately which files matter, what has already been decided, and what does not need to be reread.
Task: HARN-ARCH-0002
Goal: Remove the illegal battle -> ui reference
Read:
- docs/keep.summary.md
- docs/agent-briefs/tasks/HARN-ARCH-0002.md
- shared/interfaces/IBattleUIBridge.ts
Known:
- The task card is already locked
- The battle side may only touch the interface, not UI components directly
- The current blocker is that the bridge injection point is still missing
Need:
- Add the bridge injection point
- Use focused validation to confirm the illegal reference is gone
Avoid:
- Do not reread keep.md in full
- Do not expand into unrelated UI contract files
I like this kind of summary card because it makes the function of a handoff explicit. A handoff does not retell all the background; it narrows the next agent's reading surface. Once that surface shrinks, takeover cost drops immediately and token usage stays bounded.
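Because the card has explicit fields, it can be checked mechanically. The sketch below parses the plain-text layout shown above and reports missing fields. The parser and the `REQUIRED_FIELDS` list are assumptions derived from that one example, not a format defined by the repo's protocol document.

```javascript
// Field names taken from the example card; treating all six as required is an assumption.
const REQUIRED_FIELDS = ["Task", "Goal", "Read", "Known", "Need", "Avoid"];

// Parse "Field: value" lines and "Field:" headers followed by "- item" bullets.
function parseHandoff(text) {
  const card = {};
  let current = null;
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "") continue;
    const header = line.match(/^([A-Za-z]+):\s*(.*)$/);
    if (header) {
      current = header[1];
      // An empty value means a bullet list follows.
      card[current] = header[2] === "" ? [] : header[2];
    } else if (line.startsWith("- ") && Array.isArray(card[current])) {
      card[current].push(line.slice(2));
    }
  }
  return card;
}

// A field counts as missing when it is absent or an empty list.
function missingFields(card) {
  return REQUIRED_FIELDS.filter((f) => card[f] === undefined || card[f].length === 0);
}
```

A handoff that parses cleanly and has no missing fields is not automatically a good handoff, but one that fails this check is almost certainly an incomplete one.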
What a good handoff looks like
Few files, short conclusions, a clear next move, and an explicit note about which large documents or artifacts do not need to be reread.
What a bad handoff looks like
Pasting the keep, the manifest, QA notes, and a huge diff into the same message. It looks complete, but in reality it sends the next round straight into a hard-stop.
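The difference between the two kinds of handoff can be approximated with a small lint. Everything numeric here is a placeholder: the limits are invented for illustration, and real thresholds would have to come from `docs/agent-context-budget.md`. The pasted-diff detection is a simple heuristic based on unified diff markers.

```javascript
// Hypothetical limits, chosen only to make the example concrete.
const DEFAULT_LIMITS = { maxReadFiles: 5, maxChars: 2000 };

// Return a list of issue codes; an empty list means the handoff looks lean.
function handoffIssues(handoffText, readFiles, limits = DEFAULT_LIMITS) {
  const issues = [];
  if (readFiles.length > limits.maxReadFiles) issues.push("too-many-files");
  if (handoffText.length > limits.maxChars) issues.push("too-long");
  // Lines starting with "diff --git " or "@@ " suggest a raw diff was pasted in.
  if (/^(diff --git |@@ )/m.test(handoffText)) issues.push("pasted-diff");
  return issues;
}
```

A lint like this cannot judge whether the conclusions are right, but it catches the "paste everything" pattern before the next agent pays for it in context.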
The Role Humans Should Actually Play in AI Collaboration
When I read about Harness Engineering, one of the most important reminders for me was this: a good harness is not meant to remove humans from the loop. It is meant to direct human attention toward the highest-value places. Human developers bring more than coding skill into a codebase. They bring taste, organizational memory, debt tradeoffs, product goals, and a strong instinct for what the team would or would not accept.
Agents do not naturally have those things. They do not know which convention is load-bearing and which one is just historical habit. They also do not know whether a technically correct solution actually matches the direction I want 3KProject to move toward. That is why I gradually shifted the human role from "line-by-line agent inspector" to "harness steward."
Humans should not keep doing low-level checks
Formatting, schema, module boundaries, encoding, and type errors should be pushed into computational sensors. If humans keep manually checking those, collaboration cost gets stuck inside review.
Humans should keep tuning the system
When the same class of mistake repeats, my job is not to scold the agent. My job is to add a guide, add a sensor, add a fixture, or move a rule farther forward into the entry point.
So inside 3KProject, the role I want for myself is steward, not human linter. The agent handles local execution. The tools handle deterministic checks. Humans handle direction, risk, and exceptions. That is how multi-agent collaboration gets steadier over time instead of dumping all uncertainty back into human heads.
The Final Decision Rule I Keep for Myself
If I had to reduce my current understanding of multi-agent collaboration to one sentence, it would be this: do not bet collaboration on everyone being individually brilliant; build it on a system that lets any next person align immediately at takeover time.
Inside 3KProject, I now see this as an extension of Harness Engineering. A single-agent harness solves "how do we make the model fail less often." A multi-agent harness solves "how do we keep relay-style work from amplifying errors during handoff."
Better handed to the agent
Requirement decomposition, local edits, document cleanup, interpreting ambiguous wording, and comparing higher-level options.
Better handed to the system
Task lock, scope management, validation order, handoff structure, context budget, and file-boundary enforcement.
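File-boundary enforcement is a good example of something the system can decide deterministically. The sketch below assumes each task declares the path prefixes it is allowed to touch. That declaration and the `outOfScope` helper are hypothetical and only illustrate the idea.

```javascript
// Return the changed files that fall outside every allowed prefix.
// Backslashes are normalized so Windows-style paths compare the same way.
function outOfScope(changedFiles, allowedPrefixes) {
  return changedFiles
    .map((file) => file.replace(/\\/g, "/"))
    .filter((file) => !allowedPrefixes.some((prefix) => file === prefix || file.startsWith(prefix)));
}
```

For a task limited to `shared/interfaces/` and its own task card, any edit under a UI components folder would be reported. A gate can then reject the change without a human reviewing it line by line.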
I no longer think of multi-agent collaboration as "opening several chat windows." I think of it as writing the repo's lifecycle, execution order, and guards clearly enough that agents can act like a team instead of a crowd of temporary workers who cannot see one another.