The Agentic Intention Framework In Practice
Published: August 15, 2025
Overview: Who, what and why
WHO will use the framework:
- Production teams (Full framework): Software/platform engineers building production-ready agentic AI systems who need organizational alignment and KPIs
- Individual builders (Core framework): Developers creating useful agents for personal or small-scale use who need clarity but not organizational buy-in
WHAT are the components of the framework: A structured framework that transforms vague AI agent ideas into precise, measurable workflow definitions by eliciting explicit answers to critical questions before development begins:
- Full framework (production): WHO, WHAT, CONSTRAINTS, WHY, ALIGNMENT
- Core framework (individual): WHO, WHAT, CONSTRAINTS, WHY
Framework constraints
- Production mode requires answering all five questions including ALIGNMENT
- Individual mode allows skipping ALIGNMENT for faster iteration
- Requires ongoing but minimal time investment (< 5% of project time)
- Must be revisitable and adjustable as learning occurs
WHY use the framework: Achieve higher adoption rates for agentic tools and reduce the chance of major pivots after development starts by evaluating and catching misalignment early. Define clear constraints to maximize discoverability, usability (focused tools are intuitive), and efficiency (precise boundaries mean fewer tokens wasted on clarification or error recovery).
Investing in ALIGNMENT: 2-4 hours of upfront intention crafting plus 15-minute weekly reviews prevents weeks of wasted development, failed launches, and abandoned projects, delivering 100x ROI on time invested for individuals and teams that would otherwise face even one major pivot.
Why intention matters in agentic systems
Architecting robust, performant, and useful agents demands clear and precise intention for our workflows: with LLMs in the loop, system effectiveness multiplies when boundaries are explicitly defined upfront. These boundaries enhance predictability, discoverability, and usability of our agentic tools.
For an LLM to make sound decisions, it requires precise guidance. LLMs cannot infer unstated goals, deduce implicit requirements, or navigate ambiguous objectives; they operate solely on the explicit instructions we provide. The intention statement we establish becomes their complete operating context.
A clear intention statement has the power to transform vague possibilities into specific, measurable, achievable outcomes.
Agents
In this context, please think of an agent as a set of 1..n tools and/or other agents that collectively compose to serve a single intention. In the Model Context Protocol world, it would be the MCP server, which contains 1..n tools, all of which do work to support the server’s sole intention.
Prompt system
If you want to jump right into it, use the Agentic intention framework prompt system. Follow the instructions.
The five questions
These are five straightforward questions to apply both to a set of agentic tools as a whole and to each tool in that set (if more than one) individually:
This agent helps [WHO: specific users with context] to [WHAT: exact workflow/task] with [CONSTRAINTS: explicit boundaries and limitations] so that [WHY: measurable outcome with timeline] delivering [ALIGNMENT: X hours saved × Y users × Z frequency = impact that justifies investment].
Tip
The intention must remain the same across all tools; what changes is the scope.
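To make the template concrete, here is one way an intention statement could be captured as structured data. This is a minimal, illustrative sketch, not part of the framework itself; the class and field names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class IntentionStatement:
    """One intention per agent; its tools inherit it with a narrower scope."""
    who: str                        # specific users with context
    what: str                       # exact workflow or task
    constraints: list[str]          # explicit boundaries and limitations
    why: str                        # measurable outcome with timeline
    alignment: str | None = None    # production mode only; omit for individual mode

    def render(self) -> str:
        base = (f"This agent helps {self.who} to {self.what} "
                f"with {', '.join(self.constraints)} so that {self.why}")
        return base + (f" delivering {self.alignment}." if self.alignment else ".")
```

Individual builders can leave alignment empty for faster iteration; production teams fill in all five fields.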
WHO: define exact users
Transform broad categories into specific personas with clear contexts.
❌ Weak: “Developers who need help with code”
✅ Strong: “Backend engineers debugging production Python services between 10K-100K requests/minute”
❌ Weak: “Data scientists”
✅ Strong: “ML engineers validating model outputs before A/B tests in recommendation systems”
The specificity test: Can you list 10 real people or companies who fit this description exactly?
WHAT: specify the opportunity
Define the exact workflow or pain point the agent addresses.
❌ Weak: “Assist with documentation”
✅ Strong: “Generate OpenAPI specs from existing REST endpoint code and validate against current implementation”
❌ Weak: “Help with testing”
✅ Strong: “Create edge case scenarios for payment processing flows based on production error patterns”
The specificity test: Could another engineer implement this without asking clarifying questions?
CONSTRAINTS: set explicit boundaries
Define precise boundaries for what the agent will and won’t do.
Essential constraint categories:
- Operations: Read-only analysis vs. write operations vs. execution permissions
- Performance: Max file sizes, timeout limits, rate limits
- Security: Sandboxing requirements, data access restrictions, authentication needs
- Scope: Supported languages, frameworks, file formats
❌ Weak: “Works with code files”
✅ Strong: “Read-only analysis of Python/TypeScript files under 500KB, no execution, no external API calls”
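As an illustration only, boundaries like these could be written down as a small declarative config that the agent checks before doing any work. The keys and values below are assumptions, not part of the framework:

```python
# Hypothetical constraint config for a read-only code analysis agent.
CONSTRAINTS = {
    "operations": {"read_only": True, "execution": False, "external_api_calls": False},
    "performance": {"max_file_size_kb": 500, "timeout_seconds": 30},
    "security": {"sandboxed": True, "data_access": ["workspace_files"]},
    "scope": {"languages": ["python", "typescript"]},
}

def within_scope(language: str, size_kb: int) -> bool:
    """Reject work that falls outside the declared boundaries."""
    return (language in CONSTRAINTS["scope"]["languages"]
            and size_kb <= CONSTRAINTS["performance"]["max_file_size_kb"])
```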
WHY: measurable outcomes
Connect the agent to specific, measurable improvements.
❌ Weak: “Improve developer productivity”
✅ Strong: “Reduce p0 debugging time from 45 to 15 minutes for production incidents”
❌ Weak: “Better code quality”
✅ Strong: “Catch 90% of breaking API changes before merge, preventing ~20 rollbacks per quarter”
The specificity test: Can you measure this outcome within 30 days of deployment?
ALIGNMENT: proving ROI (production teams only)
This question helps to determine whether the agent idea justifies the investment. For production systems, alignment is not only the difference between a promoted project and a cancelled one; it also bears on the reputation and success of the enterprise.
Strong alignment means demonstrating that limited resources (time, talent, budget) will deliver concentrated impact that far exceeds the investment. Think in terms of transforming specific workflows for specific users so dramatically that the ROI becomes undeniable.
The key questions to answer:
- Can we quantify the exact impact delivered to our primary users?
- Does the effort-to-impact ratio justify choosing this over other projects?
- Will the concentrated impact be sufficient to sustain long-term investment?
Without clear answers to these questions, even technically brilliant agents become stale. The following section provides formulas, templates, and examples to make this math crystal clear.
Intention checkpoints
Intention validation checklist
- Can you name 10 specific users who urgently need this?
- Would they notice if your agent disappeared after one week?
- Can you measure success within 30 days of deployment?
- Is the Impact Score > 10?
- Could you explain the ROI to a skeptical CFO in 2 minutes?
- Will the first version solve one complete workflow end-to-end?
- Can you ship meaningful impact with < 500 development hours?
If any answer is “no,” reapply the framework with more precise answers.
Red flags in intention statements
- Hedge words: “various”, “multiple”, “flexible”, “any”
- Vague outcomes: “better”, “easier”, “improved”
- Missing constraints: No boundaries = infinite scope
- Everyone problems: “all developers” = no one’s specific need
- Shallow payoff: “slight improvement” = resource sink
- Mismatched ambition: Grand vision with shoestring capacity
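One lightweight way to act on this list is a quick lint over a draft intention statement. The word lists below mirror the red flags above and are deliberately incomplete; this is a sketch, not a validator:

```python
RED_FLAGS = {
    "hedge words": ["various", "multiple", "flexible", "any"],
    "vague outcomes": ["better", "easier", "improved"],
    "everyone problems": ["all developers", "everyone", "any team"],
}

def lint_intention(statement: str) -> list[str]:
    """Return the red-flag categories found in a draft intention statement."""
    lowered = statement.lower()
    return [category for category, words in RED_FLAGS.items()
            if any(word in lowered for word in words)]

print(lint_intention("Help all developers write better code with any framework"))
# ['hedge words', 'vague outcomes', 'everyone problems']
```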
Alignment deep dive: making the math work
The effort-to-impact formula
For any agent project, calculate:
Impact Score = (Users Affected × Hours Saved per User × Frequency) / Implementation Hours
- Viable threshold: Impact Score > 10
- Excellent target: Impact Score > 100
Example calculations
High-alignment agent: API documentation validator
- 50 engineers × 2 hours saved/week × 50 weeks = 5,000 hours saved/year
- Implementation: 200 hours
- Impact Score: 25 ✅
Poor-alignment agent: Automated technical debt tracker
- 200 engineers × 3 hours saved/quarter × 4 quarters = 2,400 hours saved/year
- Implementation: 800 hours + 200 hours/year maintenance = 1,000 hours year one
- Impact Score: 2.4 ❌
Why this seems smart but isn’t: While technical debt tracking is important and affects many engineers, the actual time saved is sparse (quarterly planning only) and the implementation requires complex codebase analysis, metric tracking, and visualization dashboards to maintain. Most of the “impact” is indirect and hard to measure precisely. Engineers might still debate priorities regardless of the tracking, and management already has rough visibility through sprint velocity and bug rates.
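For readers who prefer code, here is the same arithmetic as a small function, reproducing the two calculations above (a minimal sketch; the names are illustrative):

```python
def impact_score(users: int, hours_saved_per_user: float,
                 frequency_per_year: int, implementation_hours: float) -> float:
    """Impact Score = (Users × Hours Saved per User × Frequency) / Implementation Hours."""
    return (users * hours_saved_per_user * frequency_per_year) / implementation_hours

# API documentation validator: 50 × 2 × 50 = 5,000 hours saved / 200 hours built
print(impact_score(50, 2, 50, 200))    # 25.0 -> well above the viable threshold of 10

# Technical debt tracker: 200 × 3 × 4 = 2,400 hours saved / 1,000 hours year one
print(impact_score(200, 3, 4, 1000))   # 2.4 -> below the viable threshold
```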
The concentration principle
Strong alignment concentrates resources where they matter most:
Concentration Score = (Value to Primary Users / Total Users) × (Core Features / Total Features)
Target: > 0.6 (60% of value from focused effort)
Applied examples
Concentrated approach: Python debugging agent for microservices
- 80% value to 20 backend engineers (primary users)
- 3 core features: trace analysis, memory profiling, bottleneck detection
- 5 total features (including nice-to-haves)
- Concentration Score: 0.8 × 0.6 = 0.48 (just below the 0.6 target; could be better)
Diluted approach: “Universal code helper”
- 20% value spread across 100 developers
- 2 core features out of 15 attempted features
- Concentration Score: 0.2 × 0.13 = 0.026 (failure)
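The same check in code, using the two applied examples above. Note that the first factor is passed in directly as the fraction of total value captured by the primary users (an illustrative sketch):

```python
def concentration_score(value_to_primary_users: float,
                        core_features: int, total_features: int) -> float:
    """(Value to Primary Users / Total Users) × (Core Features / Total Features)."""
    return value_to_primary_users * (core_features / total_features)

print(round(concentration_score(0.8, 3, 5), 2))    # 0.48 -> Python debugging agent
print(round(concentration_score(0.2, 2, 15), 3))   # 0.027 -> diluted "universal code helper"
                                                    # (the text rounds 2/15 to 0.13, giving 0.026)
```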
Common alignment pitfalls
1. The enterprise mirage: building for Fortune 500 companies without confirmed demand or access.
- Reality check: Do you have 3+ champions inside target companies?
- Better approach: Start with mid-size companies where you have relationships
2. The perfect assistant fallacy: trying to handle every edge case from day one.
- Reality check: Can you ship value with 20% of planned features?
- Better approach: Launch with one workflow perfected, expand based on usage
3. The metrics phantom: claiming measurable outcomes you can’t actually track.
- Reality check: Do you have telemetry in place to measure this?
- Better approach: Choose metrics your existing tools already capture
4. The infinite loop: adding features to serve more users, diluting impact for core users.
- Reality check: Will your first 10 users still love this after all additions?
- Better approach: Say no to feature requests not aligned with the intention
Alignment templates by project type
Infrastructure agent (e.g., Kubernetes troubleshooting)
- Users: [X] platform engineers managing [Y] clusters
- Time saved: [Z] hours per incident × [N] incidents/month
- Investment: [Dev hours] + [Maintenance hours/month × 12]
- Break-even: Investment < (Time saved × $150/hour)
Development workflow agent (e.g., code review assistant)
- Users: [X] developers doing [Y] reviews/week
- Quality impact: [Z]% fewer bugs reaching production
- Current bug cost: [$ per bug × bugs/month]
- Investment: [Dev hours] × $150/hour
- ROI period: When does prevention savings exceed investment?
Migration/modernization agent (e.g., framework upgrader)
- Scope: [X] services/components to migrate
- Manual effort: [Y] hours per service
- Agent effort: [Z] hours per service
- Total savings: (Y – Z) × X × $150/hour
- Success rate needed: Break-even at what accuracy percentage?
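As a worked illustration of the infrastructure template above, here is its break-even check with placeholder numbers; every value here is hypothetical:

```python
# Hypothetical values plugged into the infrastructure agent template.
hours_saved_per_incident = 2      # Z
incidents_per_month = 25          # N
loaded_rate = 150                 # $/hour, the rate used throughout these templates

dev_hours = 300
maintenance_hours_per_month = 10

annual_savings = hours_saved_per_incident * incidents_per_month * 12 * loaded_rate
annual_investment = (dev_hours + maintenance_hours_per_month * 12) * loaded_rate

print(annual_savings)     # 90000 -> $90,000 of engineer time saved per year
print(annual_investment)  # 63000 -> $63,000 invested in year one
print("break-even met" if annual_savings > annual_investment else "not yet")
```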
Complete examples
Strong: production observability agent
WHO: SRE teams managing Kubernetes-based microservices (20-100 services) experiencing 5+ incidents weekly
WHAT: Analyze pod logs, metrics, and traces to identify root causes of latency spikes and error rate increases, generating runbooks for common patterns
CONSTRAINTS:
- Read-only access to Prometheus, Jaeger, and CloudWatch
- Analyzes last 24 hours of data only
- No automated remediation actions
- Must complete analysis within 2 minutes
WHY: Reduce mean time to resolution (MTTR) from 47 minutes to under 20 minutes for 80% of production incidents
ALIGNMENT:
- 15 SREs × 3 hours saved/week × 50 weeks = 2,250 hours/year
- Implementation: 320 hours
- Impact Score: 7.0 (acceptable)
- Concentration: 90% value to SREs, minimal value to others
- ROI: $337,500 saved per year vs. $48,000 investment
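A quick check of these numbers, using the $150/hour rate from the alignment templates above:

```python
hours_saved_per_year = 15 * 3 * 50              # 2,250 hours across the SRE team
implementation_hours = 320

impact_score = hours_saved_per_year / implementation_hours   # ~7.0
annual_value = hours_saved_per_year * 150                     # $337,500 saved per year
investment = implementation_hours * 150                       # $48,000 one-time cost
```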
Weak: enterprise code quality analyzer
WHO: Development teams working on production services
WHAT: Analyze code patterns and suggest architectural improvements based on best practices
CONSTRAINTS:
- Supports major languages and frameworks
- Integrates with existing CI/CD
WHY: Reduce technical debt and improve system maintainability
ALIGNMENT:
- Cannot calculate deliverable: “technical debt reduction” has no concrete timeline
- “improved maintainability” could mean anything from 5% to 50% fewer bugs
- “development teams” could be 10 or 1000 people
- “architectural improvements” range from trivial to year-long refactors.
This project will fail because it sounds sophisticated and important but lacks any measurable target. It’s the kind of initiative that gets approved because nobody wants to argue against “code quality” but dies in implementation when nobody can agree what success looks like.
Building with intention
The intention statement isn’t meant to be a static, use-once artifact; it’s an active tool that guides ongoing decisions and evolution.
Regular intention reviews
Weekly: 15-minute review
- Are we accepting only features that serve our WHO?
- Are we staying within the same set of CONSTRAINTS for each agent?
- Are we tracking toward our WHY?
Monthly: Alignment check
- Is our Impact Score trending as predicted?
- Should we adjust scope to maintain concentration?
- What would our first users say about recent additions?
Quarterly: Full revision
- Has our understanding of WHO evolved?
- Should we graduate from individual to production mode?
- Is it time to spawn a second, separate agent to serve a different intention?
Three patterns for intentional evolution
Here are three patterns that can help us evolve agentic systems without losing focus.
1. Deepen (most common)
Keep the same intention but serve it more powerfully. Instead of adding adjacent features, enhance the core workflow with richer capabilities, better accuracy, or faster performance.
For example, an agent that “explains Python functions” might evolve to provide interactive examples, visualize call graphs, or trace execution paths—all deepening the original explanation goal without changing the fundamental intention.
Signs we should deepen:
- Users love the core feature but want it to do more
- The original challenge still isn’t fully solved
- We can 10x the impact without changing the WHO or WHAT
2. Fork (when intentions diverge)
When users request features that would violate the original constraints or serve different workflows, create a separate agent with its own intention statement rather than diluting the original.
For example, if our read-only documentation analyzer users keep asking for auto-fixing capabilities, don’t break the read-only constraint. Instead, spawn a sibling “documentation fixer” agent with write permissions and its own clear intention.
Signs we should fork:
- Feature requests conflict with core constraints
- Different user segments want opposing behaviors
- The new capability would require fundamentally different architecture
3. Pivot (rare but necessary)
Sometimes we discover the original intention missed the real opportunity entirely. Pivoting means admitting this, crafting a new intention statement, and potentially deprecating the original approach.
For example, we might build an agent to “help write better commit messages” only to discover that developers struggle far more with understanding what changed in their code. The pivot refocuses entirely on workflows that support change analysis rather than message crafting.
Signs we should pivot:
- User adoption is minimal despite technical success
- Users consistently use the tool for unintended purposes
- The real challenge is adjacent to, but different from, the original assumption
The power of precision
This framework helps us maintain precision through the entire lifecycle of our agents, from the initial idea through production evolution.
If you believe that a sharp intention that serves 20 users excellently beats a vague intention that serves 2,000 users poorly, this is for you. Use it, modify it, maybe build agentic workflows to use it for validation. Pass it around. Because with agentic AI, precision is power (and profit).