Less token churn
Cache repo embeddings, rank compact actions, and reserve expensive reasoning for places where language is actually needed.
Joint Embedding Predictive Architecture for code
The next coding agent should understand a repository as a world, predict what edits will do, and verify each action before it writes.
The thesis
Today’s agents mostly read text, sample patches, run tools, and try again. j3 explores a different primitive: learn the dynamics of a codebase, then choose edits by predicted effect.
Architecture
A Joint Embedding Predictive Architecture (JEPA) learns compact representations of repo state, edit actions, and target outcomes. The agent then asks which action moves the code world toward the desired future.
repo state -> encoder -> z_current
structured edit -> encoder -> z_action
test signal -> encoder -> z_target
predict(z_current, z_action) ~= z_future
rank edits by distance(z_future, z_target)
materialize best action as a deterministic diff
validate in an isolated worktree
The patch engine plans; it does not autocomplete.
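The ranking step in the pipeline above can be sketched in a few lines of Python. This is a minimal illustration, not j3's implementation: the embeddings are hand-written two-dimensional vectors, `additive_predictor` is a hypothetical stand-in for a learned predictor, and the action names are made up for the example.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_actions(
    z_current: Vector,
    z_target: Vector,
    actions: Dict[str, Vector],
    predict: Callable[[Vector, Vector], Vector],
) -> List[Tuple[str, float]]:
    """Score each action by how close its predicted future lands to the target."""
    scored = [
        (name, distance(predict(z_current, z_action), z_target))
        for name, z_action in actions.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


def additive_predictor(z_current: Vector, z_action: Vector) -> List[float]:
    # Assumption: a toy predictor where the future is current + action.
    # A trained model would replace this function.
    return [c + a for c, a in zip(z_current, z_action)]


z_current = [0.0, 0.0]   # pretend embedding of the failing repo state
z_target = [1.0, 1.0]    # pretend embedding of the passing test signal
actions = {
    "add_import": [1.0, 0.9],
    "rename_arg": [0.2, 0.0],
    "add_guard": [0.5, 0.5],
}

ranking = rank_actions(z_current, z_target, actions, additive_predictor)
best_action = ranking[0][0]
```

Only the top-ranked action would go on to be materialized as a diff and validated, which is where the savings over blind sampling are meant to come from.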
Why it matters
LLMs are powerful language interfaces, but the patch engine should be trained on how repositories actually change, how tests respond, and which actions repair behavior.
A constrained patch space can be inspected, replayed, diagnosed, and improved without pretending raw text sampling is planning.
Public repos, synthetic break/fix transitions, and local test outcomes become the training signal for the next repair attempt.
Agent primitive
The long-term architecture keeps LLMs where they are strongest: translating messy human requests into objectives, constraints, and tests. The repair engine remains predictive and verifiable.
| Layer | LLM-first | JEPA-first | Goal |
|---|---|---|---|
| Planning | Sample text | Predict effects | Fewer blind trials |
| Memory | Prompt context | Repo embeddings | Persistent code state |
| Safety | Review generated diff | Validate action outcomes | Trust through execution |
The claim is architectural: LLMs can remain useful adapters, but the core patch planner should be a code-world predictor.
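The Memory row of the table above relies on embeddings that persist between agent steps. One plausible way to get that, sketched here under assumptions, is to key each embedding on a hash of the file content so that unchanged files are never re-encoded. `EmbeddingCache` and `toy_encoder` are hypothetical names, and the encoder is a placeholder for a learned one.

```python
import hashlib
from typing import Callable, Dict, List


class EmbeddingCache:
    """Cache embeddings by content hash so identical content is encoded once."""

    def __init__(self, encode: Callable[[str], List[float]]) -> None:
        self._encode = encode
        self._store: Dict[str, List[float]] = {}
        self.misses = 0  # number of times the encoder actually ran

    def get(self, source: str) -> List[float]:
        key = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._encode(source)
        return self._store[key]


def toy_encoder(source: str) -> List[float]:
    # Assumption: stand-in for a learned encoder (line count, character count).
    return [float(len(source.splitlines())), float(len(source))]


cache = EmbeddingCache(toy_encoder)
first = cache.get("import os\n")
second = cache.get("import os\n")               # cache hit, encoder not called
third = cache.get("import os\nimport sys\n")    # content changed, re-encoded
```

A real cache would also need to persist to disk and be invalidated when the encoder itself changes; both are left out of this sketch.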
Where j3 is going
The destination is not another chat wrapper. It is open-source infrastructure for coding agents that learn action dynamics, preserve human control, and run on developer hardware.
Imports, attributes, call shapes, guards, conditions, signatures, and narrow exception repairs become first-class actions instead of accidental text edits.
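To make "first-class actions" concrete, here is a minimal sketch of one such action rendered as a deterministic diff with the standard library's `difflib`. The `AddImport` class and `materialize` function are illustrative assumptions, not j3's actual action schema.

```python
import difflib
from dataclasses import dataclass


@dataclass(frozen=True)
class AddImport:
    """Hypothetical structured action: insert `import <module>` at the top of a file."""

    path: str
    module: str

    def apply(self, source: str) -> str:
        line = f"import {self.module}\n"
        # Idempotent: if the import line already exists, leave the source alone.
        if line in source.splitlines(keepends=True):
            return source
        return line + source


def materialize(action: AddImport, source: str) -> str:
    """Render the action as a unified diff; the same inputs give the same text."""
    after = action.apply(source)
    diff = difflib.unified_diff(
        source.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"a/{action.path}",
        tofile=f"b/{action.path}",
    )
    return "".join(diff)


source = "def cwd():\n    return os.getcwd()\n"
action = AddImport(path="util.py", module="os")
patch = materialize(action, source)
```

Because the action is a small typed value rather than sampled text, it can be logged, replayed against another revision, or compared with other candidates before any file is written.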
Positive and negative validation traces teach the system which combinations of repo state, failure signal, and action are likely to pass.
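As a rough illustration of how such traces could become a signal, the sketch below keeps pass/fail counts per combination and returns a Laplace-smoothed pass estimate. This is a counting baseline chosen for clarity, not the learned model the text describes; `TraceStats` and the string labels are hypothetical.

```python
from typing import Dict, Tuple

# (repo-state bucket, failure signal, action kind); labels are illustrative.
Key = Tuple[str, str, str]


class TraceStats:
    """Counts validation outcomes and estimates how likely a combination passes."""

    def __init__(self) -> None:
        self._passes: Dict[Key, int] = {}
        self._total: Dict[Key, int] = {}

    def record(self, state: str, failure: str, action: str, passed: bool) -> None:
        key = (state, failure, action)
        self._total[key] = self._total.get(key, 0) + 1
        if passed:
            self._passes[key] = self._passes.get(key, 0) + 1

    def pass_probability(self, state: str, failure: str, action: str) -> float:
        key = (state, failure, action)
        # Laplace smoothing: unseen combinations start at 0.5 rather than 0 or 1.
        return (self._passes.get(key, 0) + 1) / (self._total.get(key, 0) + 2)


stats = TraceStats()
for _ in range(3):
    stats.record("python-pkg", "NameError", "add_import", passed=True)
stats.record("python-pkg", "NameError", "add_guard", passed=False)
stats.record("python-pkg", "NameError", "add_guard", passed=True)

p_import = stats.pass_probability("python-pkg", "NameError", "add_import")
p_guard = stats.pass_probability("python-pkg", "NameError", "add_guard")
p_unseen = stats.pass_probability("python-pkg", "TypeError", "add_import")
```

Negative traces matter as much as positive ones here: a failed validation lowers the estimate for that combination, which keeps the same dead end from being retried.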
A compact local model predicts future repository embeddings, searches structured edit trajectories, and validates only the most promising futures.
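Searching edit trajectories in embedding space could look like a small beam search. The sketch below assumes fixed-length trajectories, a toy additive `predict` function in place of a trained model, and made-up action vectors; only the shortlist it returns would be sent to real validation.

```python
import math
from typing import Dict, List, Tuple

Vector = Tuple[float, ...]


def distance(a: Vector, b: Vector) -> float:
    """Euclidean distance between two embeddings of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict(z: Vector, z_action: Vector) -> Vector:
    # Assumption: toy dynamics where applying an action adds its vector.
    return tuple(x + y for x, y in zip(z, z_action))


def beam_search(
    z_start: Vector,
    z_target: Vector,
    actions: Dict[str, Vector],
    depth: int,
    beam_width: int,
) -> List[Tuple[List[str], float]]:
    """Return the `beam_width` best trajectories of exactly `depth` actions."""
    beam: List[Tuple[float, List[str], Vector]] = [
        (distance(z_start, z_target), [], z_start)
    ]
    for _ in range(depth):
        candidates = []
        for _, path, z in beam:
            for name, z_action in actions.items():
                z_next = predict(z, z_action)
                candidates.append((distance(z_next, z_target), path + [name], z_next))
        # Keep only the futures predicted to land closest to the target.
        candidates.sort(key=lambda c: c[0])
        beam = candidates[:beam_width]
    return [(path, score) for score, path, _ in beam]


actions = {
    "add_import": (1.0, 0.0),
    "fix_signature": (0.0, 1.0),
    "add_guard": (0.3, 0.3),
}
results = beam_search((0.0, 0.0), (1.0, 1.0), actions, depth=2, beam_width=2)
best_path, best_score = results[0]
```

In this toy setup the greedy first step (`add_guard`) does not lead to the best two-step trajectory, which is why the search keeps more than one candidate alive at each depth.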
Open source invitation
j3 is a bet that the future of coding agents is local, world-model-driven, and validated by execution.