Session Grounding

Orchestrating AI

The most effective AI users aren't writing more code — they're writing less. They've shifted from typing to orchestration: defining problems, setting constraints, and reviewing outputs.

1-hour live session
4 sections
continuous demo
01

How do I give an agent enough context to succeed?

"I can't pay attention to everything at once. And there's also the risk of two different agents modifying things that are slightly related."
— User research participant
The quality of your outcome is decided before the first line of code. A clear plan is the starting line — sloppy input yields sloppy output, regardless of model.
Show the difference between "build me X" and a plan-first approach. The agent orchestrates parallel subagents under the hood — research, codebase scan, doc lookup — you just give it a clear target.
MCPs connect planning to where real work lives — pull from GitHub issues, Work IQ meeting summaries, Jira. Don't copy-paste context; let the agent pull it.
Breaking down plans practically: UI/Scaffold/Tests-first as a forcing function. Tests-first gives the agent a verifiable contract. Scaffold-first prevents architectural drift.
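The tests-first contract can be sketched in miniature. This is a toy illustration, not part of the session material: `slugify` is a hypothetical target function, and the tests are what you would hand the agent before any implementation exists. A minimal body is included only so the sketch runs.

```python
# Tests-first as a verifiable contract: in practice these tests are written
# first, and the agent's job is to produce an implementation that passes them.
# `slugify` is a hypothetical target function with a minimal stand-in body.

def slugify(title: str) -> str:
    return "-".join(title.lower().split())

def test_lowercases_and_hyphenates():
    assert slugify("Plan Mode Basics") == "plan-mode-basics"

def test_idempotent():
    # Running the function on its own output changes nothing.
    assert slugify(slugify("Plan Mode Basics")) == "plan-mode-basics"

test_lowercases_and_hyphenates()
test_idempotent()
```

The point of the contract is that "done" becomes checkable: the agent iterates until the tests pass instead of until the output looks plausible.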
Features to demo
Plan mode
Start with a clear plan before implementation
Clarifying questions
Agent asks instead of assumes
MCPs for context
GitHub issues and Work IQ inputs
Parallel subagents during plan mode
Mermaid diagrams in chat
MCP Apps / Excalidraw (spatial planning)
02

How do I stop repeating myself across sessions?

"There's a lot to sort of process in my brain."
— User research participant
Context engineering is the skill. Not prompting — engineering the persistent world the agent operates in. The best orchestrators build better environments, not better prompts.
Start with the easiest unlock: "just ask the agent." It can scaffold its own instruction files, generate skill files, write its own .agent.md.
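A sketch of the kind of instruction file the agent can scaffold for itself. The contents below are hypothetical, not a prescribed format — AGENTS.md is plain markdown the agent reads as repo-level convention:

```markdown
# AGENTS.md — repo conventions (hypothetical example)

## Build & test
- Install: `npm ci`
- Run tests with `npm test` before proposing a commit

## Conventions
- TypeScript strict mode; avoid `any`
- Keep changes scoped to one issue per branch
```

Because it is just markdown in the repo, it is versioned, reviewable, and shared by every session — set-and-forget context instead of a prompt you retype.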
Instructions
Repo-level conventions. Always loaded. Set-and-forget.
Skills
Domain expertise. Loaded when relevant. Shareable like linter configs.
Custom Agents
Full personas with tools, skills, and grounding docs. Trust you can commit to the repo.
Memory
Session memory preserves decisions. Repo memory compounds learning.
Hooks
Deterministic guardrails. Not prompts — actual code that executes.
Features to demo
"Ask the agent"
Generate instruction and skill files in-workspace
Layered customization
Instructions → Skills → Custom agents
Memory
Session + repo memory — persist and reuse context
Hooks
Linter-on-edit, deny patterns, org policy
AGENTS.md / .prompt.md layers
Skills from extensions (chatSkills)
03

When should I run more than one agent?

"They're helping me get a week's worth of work done in a day. But it also feels like I'm having to process a week's worth of information in a day as well."
— User research participant
The answer isn't "always" — it's about matching agent type to task shape. Three types, three decision criteria.
The operating model: stay hands-on locally for visual work, delegate scoped tasks to background, send explorative work to cloud. This is the judgment call — a practice, not a feature.
The daily rhythm: your actual morning/evening pattern. When do you use plan mode? When do you skip it? When do you go straight to background? This is orchestration in practice.
Local
Visual, interactive, needs your judgment mid-flight. Stay hands-on.
your workspace
Background
Well-scoped, deterministic. Runs in an isolated worktree with terminal sandbox.
isolated worktree
Cloud
Explorative, can take time. Changes land as PRs with CI gates.
lands as PR
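The three rows above fold into one routing heuristic. This is a toy sketch of the judgment call, not a real API — the task attributes (`visual`, `well_scoped`, `explorative`) are illustrative labels:

```python
# Toy heuristic: match agent type to task shape.
# Attribute names are illustrative, not a product API.

def route(visual: bool, well_scoped: bool, explorative: bool) -> str:
    if visual:
        return "local"       # interactive; needs your judgment mid-flight
    if well_scoped and not explorative:
        return "background"  # isolated worktree, terminal sandbox
    if explorative:
        return "cloud"       # can take time; lands as a PR with CI gates
    return "local"           # default: stay hands-on until the task is scoped
```

The heuristic is deliberately conservative: anything that is neither scoped nor explorative stays local until you have shaped it enough to delegate.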
Features to demo
Agent Sessions view
Mission control — status and check-ins
Background + worktrees
Run scoped tasks in isolation
Cloud agents
Explorative work → PR landing
Copilot in PRs
Assign issue → wake up to drafts
Terminal sandbox (fs/network controls)
Multi-model (Claude/Codex/Copilot)
Subagents (internal parallelism)
Terminal-first CLI flows
04

How do I know when to trust the output?

"The anxiety was not worth the few seconds I saved... I would still stick to one task at a time."
— User research participant
Trust isn't binary — it's a spectrum you calibrate over time. The anxiety is real and valid. What resolves it isn't blind confidence — it's layered controls.
Start tight, loosen as you learn. Side project? Approve-all. Production code? Full sandbox + hooks + code review. The controls exist so you go faster with confidence.
During execution
Prevention
Hooks fire on every tool call. Terminal sandbox restricts filesystem and network. Deterministic code, not prompts.
Hooks · Sandbox · Auto-approve
At review time
Review
Copilot Code Review on PRs with risk-prioritized highlighting. Grounded in your custom instructions.
CCR · Risk flags · Summaries
After completion
Verification
Custom agents that encode what "good" looks like. Dogfooding agent with persona, browser MCP, product brief.
.agent.md · Playwright · Self-review
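"Start tight, loosen as you learn" can be sketched as an auto-approval rule set. The tiers and patterns below are hypothetical, not a specific product's configuration format:

```python
import re

# Illustrative auto-approval rules: start tight, loosen as trust calibrates.
ALLOW = [r"^git (status|diff|log)\b", r"^npm test\b"]   # read-only / verifiable
DENY = [r"\brm -rf\b", r"\bgit push --force\b"]          # never auto-run

def decide(command: str) -> str:
    """Classify a proposed terminal command: deny, auto-approve, or ask."""
    if any(re.search(p, command) for p in DENY):
        return "deny"
    if any(re.search(p, command) for p in ALLOW):
        return "auto-approve"
    return "ask"  # anything unrecognized escalates to the human
```

Loosening trust then means moving patterns from "ask" into the allow list as the agent earns it, while the deny list stays fixed.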
Features to demo
Hooks + sandbox
Prevention layer during execution
Copilot Code Review
Risk-prioritized review layer
Dogfooding agent
.agent.md + Playwright MCP + skills
Auto-approval rules
Agent self-review patterns
Vision support (screenshot review)

Plan, direct, multiply, ship.

Start with two agents, not five. You'll know when you're ready for more.

1. Use plan mode for one task
2. Send one subtask to background
3. Review, commit, ship