Speaker Notes

Dawn of the Agents

Amit Joshi · Solutions Architect · Leapfrog Technology

Opening Story

Opening Line

"I want to start with something uncomfortable."

(Pause)

The Story

"Think about the last important email you sent at work. A proposal. A roadmap. A customer escalation."

2015

"In 2015, you researched. You read blogs. You wrote drafts. Maybe you asked a colleague to review it. You owned every word."

2023

"In 2023, you pasted links into ChatGPT. You asked for a summary. You tweaked the output. You still pressed 'Send'."

(Pause)

2025

"Now imagine this: You tell an AI 'Handle this.' It researches. It drafts. It adapts tone based on the recipient. It sends the email. And it updates the CRM."

(Pause again)

"At that point… you're not writing emails anymore. You're supervising outcomes."

The Turn

"AWS re:Invent 2025 wasn't about better emails. It was about that exact transition — everywhere."
"From code… to infrastructure… to security… AWS is betting that doing becomes automated, and judgement becomes the scarce skill."

SLIDE 1

Title Slide

AWS re:Invent 2025: GenAI Matures → The Age of Agents
From copilots to autonomous systems

This opening slide sets expectations for the entire presentation. This talk is not a feature dump of AWS services. Instead, it's about the directional shifts AWS is making in the GenAI space.

The key theme is the transition from GenAI that "helps" (like copilots and assistants) to GenAI that "does" (autonomous agents that take action). This represents a fundamental shift in how we think about AI in production systems.

Mention AWS re:Invent 2025 once here to establish context, then move on. The focus should be on the concepts, not on rehashing conference announcements.

SLIDE 2

The Analogy (Anchor the Talk)

2015 → 2023 → 2025+
From manual → assisted → autonomous

This slide introduces the central analogy that will frame the entire presentation. Walk through the email analogy slowly, as established in the opening story:

  • 2015: You manually research, write, and send emails
  • 2023: AI assists you - you use ChatGPT to draft, but you still review and send
  • 2025: AI acts autonomously - you delegate the task, and AI handles it end-to-end

Emphasize the critical difference between AI as a tool (2023) and AI as an actor (2025). This isn't just about better autocomplete - it's about AI systems that can hold intent over time, make decisions, and take actions.

This framing will apply throughout the talk to:

  • Foundation models
  • Coding workflows
  • DevOps practices
  • Security policies
"re:Invent 2025 was basically AWS saying: we're entering the 2025 box."

SLIDE 3

Big re:Invent Theme

Foundation Models = Infrastructure
Agents are the Interface

This is the core AWS thesis and the most important conceptual slide in the deck. Everything else ladders up to this framework.

Foundation Models are Infrastructure: Models are becoming commoditized utility services, similar to EC2, S3, or managed databases. They're powerful but increasingly undifferentiated. You select them based on cost, performance, and availability - not as your core competitive advantage.

Agents are the Interface: The new application layer is agents - systems that use foundation models as compute primitives but add memory, tools, policies, and orchestration on top. This is where differentiation happens.

This reframing is critical for architects: don't build your strategy around owning or customizing models. Build it around what your agents do with those models.

SLIDE 4

Foundation Models = Infrastructure

• Nova 2 (Lite, Sonic, Omni)
• Incremental gains, not magic
• Bedrock = model marketplace

Nova 2 models represent AWS's approach to foundation models: boring on purpose. They're not breakthrough innovations - they're incremental improvements focused on:

  • Faster inference times
  • Lower cost per token
  • Better context windows
  • More reliable outputs

That's the point. Infrastructure should be boring, predictable, and reliable. You don't want your database doing surprising things, and you shouldn't want your foundation models doing surprising things either.

Bedrock as the model marketplace: AWS Bedrock now hosts models from multiple sources - AWS's own models, open-source models (Llama, Mistral), and closed models from partners. AWS is positioning itself as the "model switchboard" - you can switch between models without rewriting your application code.

This is a strategic play: AWS wins if they own the infrastructure layer where models run, regardless of which specific model you choose.

SLIDE 5

Custom Models: Power vs Reality

Nova Forge
Custom models are possible
But rarely necessary

Nova Forge is AWS's service for training or heavily customizing foundation models. It's powerful, but be very clear with the audience about the reality:

The challenges of custom models:

  • Validation is hard: How do you know your custom model is better? Testing is expensive and time-consuming.
  • Drift is real: Models degrade over time as data patterns change. You need continuous monitoring and retraining.
  • Cost explodes silently: Training costs, infrastructure costs, and maintenance costs compound quickly.

Critical take (say this explicitly):

"For most teams, custom models are a vendor-lock-in shaped hammer looking for a nail."

Best default approach:

  • Use the best available foundation model from the marketplace
  • Customize via prompts, RAG (Retrieval Augmented Generation), and light fine-tuning
  • Only invest in full custom models when you have a compelling, measured business case

Most companies overestimate their need for custom models and underestimate the operational burden.

SLIDE 6

Competitive Reality (Brief, Honest)

AWS ≠ only GenAI cloud
And that's OK

Acknowledge the elephant in the room: AWS is not the only player in GenAI infrastructure. Your audience knows this, so address it directly:

  • Google Cloud: Vertex AI + Gemini models
  • Microsoft Azure: Azure OpenAI Service (exclusive partnership with OpenAI)
  • Others: Anthropic (Claude), open-source ecosystem

AWS's response is not model supremacy. They're not trying to win by having the "best" model. Instead, their strategy is:

  • Choice: Run any model on their infrastructure
  • Governance: Enterprise-grade controls, policies, and audit trails
  • Integration: Deep integration with existing AWS services

Implication for architects: Design your systems for model plurality. Don't hard-code dependencies on a single model or provider. Build abstraction layers that let you swap models based on cost, performance, or capability.

The best model today might not be the best model next year. Your architecture should be resilient to that reality.
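The abstraction layer described above can be sketched in a few lines. A minimal sketch, assuming nothing about any vendor SDK — the class and function names here are illustrative, not an AWS or Google API:

```python
from abc import ABC, abstractmethod

class ModelProvider(ABC):
    """Interface the application depends on; no vendor names leak past it."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        ...

class StubProvider(ModelProvider):
    """Deterministic stand-in, useful for tests and local development."""

    def complete(self, prompt: str) -> str:
        return f"stub:{prompt}"

def summarize(provider: ModelProvider, text: str) -> str:
    # Application code only sees the interface, so swapping the model
    # behind it is a configuration change, not a rewrite.
    return provider.complete(f"Summarize briefly: {text}")
```

A Bedrock-backed or Vertex-backed provider would implement the same `complete` method; the `summarize` call site never changes.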

SLIDE 7

The Real Shift: Agents

Agents ≠ Chatbots
Agents = long-lived AI workers

This is where the conversation shifts from models (infrastructure) to agents (the application layer). Define agents clearly for the audience:

What makes something an agent?

  • Memory: Agents remember context across interactions and over time
  • Tools: Agents can call APIs, query databases, execute code, and interact with external systems
  • Policies: Agents operate within defined guardrails and permissions
  • Autonomy: Agents can make decisions and take actions without constant human input

Why this is different from chatbots: Chatbots respond to prompts. Agents hold intent over time, can recover from failures, and can act without asking for permission at every step.

Tie back to the 2025 analogy: This is the shift from "assisted" (you ask questions, AI responds) to "autonomous" (you delegate a goal, AI figures out how to achieve it).

This is the first time AI can truly do things rather than just suggest things.
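The four ingredients above can be shown in a toy agent loop — memory, tools, a policy gate, and autonomous multi-step execution. This is an illustrative sketch, not any real framework's API:

```python
# A toy agent loop: memory carried across steps, tool calls,
# a policy gate, and no human input between steps.

def run_agent(goal, plan, tools, allowed_tools, max_steps=10):
    memory = []  # context the agent keeps across steps, not per-prompt
    for tool_name in plan[:max_steps]:
        if tool_name not in allowed_tools:          # policy gate
            memory.append((tool_name, "blocked by policy"))
            continue
        result = tools[tool_name](goal, memory)     # tool call
        memory.append((tool_name, result))          # remember the outcome
    return memory  # the human reviews outcomes, not each step

tools = {
    "research": lambda goal, mem: f"notes on {goal}",
    "draft": lambda goal, mem: f"draft built from {len(mem)} notes",
}
```

A chatbot would stop after one exchange; the loop above holds the goal, reuses what it learned in earlier steps, and only surfaces the final trace for review.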

SLIDE 8

AWS Agent Stack

• Bedrock AgentCore
• Step-based execution
• Guardrails & memory

Bedrock AgentCore is AWS's agent runtime - the infrastructure layer for running agents at scale.

What it actually is (be honest):

  • Prompt orchestration: Managing multi-turn conversations and context
  • Tool calling: Safely connecting agents to APIs and services
  • Policy enforcement: Ensuring agents operate within defined boundaries
  • Memory management: Storing and retrieving agent state

Critical take:

"It feels more like a controlled workflow engine than true autonomous intelligence."

And that's not bad. Enterprises need predictability. They need agents that can be audited, monitored, and constrained. Pure autonomy is scary in production environments with real customers and real money at stake.

AgentCore is AWS's answer to the question: "How do we make agents enterprise-ready?" The answer is structure, governance, and observability.
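The "intercept, authorize, audit" pattern behind this can be sketched in a few lines. All names and the policy rule are illustrative assumptions, not AgentCore's actual API:

```python
import time

AUDIT_LOG = []  # production would use durable, append-only storage

def send_notification(channel, message):
    return f"sent to {channel}: {message}"

TOOLS = {"send_notification": send_notification}

def policy(tool, args):
    # Example rule: this agent may only post to internal channels.
    return tool == "send_notification" and args.get("channel", "").startswith("#internal")

def guarded_call(agent_id, tool, args):
    """Intercept the tool call: decide, record the decision, then act."""
    allowed = policy(tool, args)
    AUDIT_LOG.append({"agent": agent_id, "tool": tool,
                      "args": args, "allowed": allowed, "ts": time.time()})
    if not allowed:
        raise PermissionError(f"{tool} denied for {agent_id}")
    return TOOLS[tool](**args)
```

Note that the denial itself is logged before the exception is raised — the audit trail records what the agent *tried* to do, not just what it did.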

SLIDE 9

Specialized Agents Are Coming Fast

DevOps Agent
Security Agent
Kiro (IDE)

AWS is shipping job-shaped agents - AI systems designed for specific professional roles rather than general-purpose tasks.

Examples:

  • DevOps Agent: Acts as an AI on-call engineer that can diagnose issues, execute runbooks, and even remediate incidents
  • Security Agent: Provides continuous application security scanning, threat detection, and automated response
  • Kiro (IDE): An agentic integrated development environment that doesn't just autocomplete code, but understands project architecture and can refactor across multiple files

The pattern to watch: AWS is building agents that map to specific job functions. Expect many more to come:

  • FinOps Agent: Cost optimization and budget management
  • Compliance Agent: Continuous regulatory compliance checking
  • Data Ops Agent: Data pipeline management and quality assurance

This is the future of enterprise AI: not one general assistant, but a team of specialized agents, each focused on a specific domain with deep context and tools.

SLIDE 10

Vibe Coding Is Now Normal

Natural language → Code Humans supervise

"Vibe coding" is no longer a joke. It's become the default workflow for many developers: describe what you want in natural language, and AI generates the code.

The reality: AI now writes more code than humans in many organizations. Tools like GitHub Copilot, Amazon Q Developer (formerly CodeWhisperer), and agentic IDEs are producing the majority of boilerplate, integration code, and even complex logic.

The risk is not wrong code. Modern AI is surprisingly good at generating syntactically correct, functionally accurate code for well-defined problems.

The real risks are:

  • No one understands it: Code generated in seconds isn't necessarily code that's understood
  • No one tests it properly: The temptation is to trust AI output without rigorous testing
  • Technical debt accumulates faster: It's easy to generate code; it's hard to maintain it

Key line (pause after saying this):

"Productivity is up. Visibility is down."

We're shipping faster, but we're accumulating complexity we don't fully understand. This is a management and process challenge, not just a technical one.

SLIDE 11

The Spec Is the Source Code

Specs > Code
Python ≈ Assembly

This is a provocative statement - let it land before explaining.

The core idea: In an AI-driven development world, specifications drive code generation. The code itself becomes an implementation detail, similar to how assembly language is an implementation detail of higher-level languages.

What this means:

  • Specs drive generation: Well-written specifications, contracts, and type definitions become the source of truth
  • Code is ephemeral: Code can be regenerated from specs as models improve or requirements change
  • Testing becomes critical: You can't review every line of generated code, so tests become your primary quality gate

Implication for software practices:

  • Tests are more important than ever
  • Contracts and interfaces become first-class artifacts
  • Assertions and invariants matter more than syntax

The Python line: When you say "Python is the new assembly," smile. It's half joke, half prophecy. The point is that we're moving up the abstraction ladder again - from assembly to C, from C to Python, and now from Python to natural language specifications.
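One way to make "tests are the quality gate" concrete: express the spec as executable assertions that any regenerated implementation must pass. A minimal sketch — `slugify` is a hypothetical example function, not something from the talk:

```python
import re

def slugify(title):
    # One possible implementation; it could be regenerated at any time.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

# The spec is the durable artifact: examples plus invariants.
SPEC_EXAMPLES = [
    ("Hello World", "hello-world"),
    ("  AWS re:Invent  ", "aws-re-invent"),
]

def check_contract(fn):
    # Example-based checks: known inputs map to known outputs.
    for given, expected in SPEC_EXAMPLES:
        assert fn(given) == expected, (given, fn(given))
    # Invariant check: the function is idempotent on its own output.
    for given, _ in SPEC_EXAMPLES:
        assert fn(fn(given)) == fn(given)
    return True
```

If a newer model regenerates `slugify` tomorrow, `check_contract` is what decides whether the replacement ships — not a line-by-line human review.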

SLIDE 12

Vibe DevOps: Infra as Prompt

"I want X" → Terraform / Pulumi → Running infra

Infrastructure generation via prompts is maturing fast. What was a demo 12 months ago is becoming production-ready.

How it works:

  1. You describe infrastructure intent in natural language: "I need a serverless API with authentication"
  2. AI generates infrastructure-as-code (Terraform, Pulumi, CloudFormation)
  3. The code is reviewed (by human or automated tools)
  4. Infrastructure is provisioned
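Step 3 of the flow above can itself be partly automated. A minimal sketch of a review gate that scans generated IaC for risky patterns before anything is provisioned — the patterns and the Terraform snippet are illustrative only:

```python
# Patterns a human should always see before apply; extend as needed.
RISKY_PATTERNS = [
    "0.0.0.0/0",             # world-open network access
    "force_destroy = true",  # destructive bucket setting
]

def review_gate(generated_iac):
    """Return risky patterns found; an empty list means it may proceed."""
    return [p for p in RISKY_PATTERNS if p in generated_iac]

# Example of AI-generated Terraform that should be flagged:
generated = '''
resource "aws_security_group_rule" "api_ingress" {
  type        = "ingress"
  cidr_blocks = ["0.0.0.0/0"]
}
'''
```

Here `review_gate(generated)` flags the open CIDR block and routes the change to a human instead of applying it automatically.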

AWS demos and Pulumi AI prove feasibility. These aren't research projects - they're shipping products that developers are using today.

Lambda durable workflows are key to this shift. They enable:

  • Long-running operations: Infrastructure changes take time; workflows can wait
  • Agent-driven changes: Agents can make multi-step infrastructure modifications
  • State management: Track complex provisioning processes with retries and rollbacks

Critical warning:

"Infrastructure mistakes scale faster than code mistakes."

A bad line of code might affect one user. A bad infrastructure change can take down your entire service. The guardrails and review processes for AI-generated infrastructure need to be even stronger than for AI-generated code.

SLIDE 13

Governance Moves Upstream

Security starts at prompt time
Not post-deployment

This is one of the most important slides in the deck for enterprise architects.

IAM Autopilot is a huge signal. AWS is showing that policies can be generated from intent rather than copied from Stack Overflow or constructed through trial-and-error.

What's changing:

  • Policies are generated from prompts: "This agent needs read access to S3 and write access to DynamoDB" → least-privilege IAM policy
  • AgentCore policies intercept actions: Before an agent can execute a tool call, policies check authorization
  • Audit trails are mandatory: Every agent action is logged and traceable
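A toy version of "policy from intent" makes the first bullet concrete: a declared intent maps to a least-privilege IAM policy document. Real services infer this from natural language; the explicit mapping here is purely illustrative:

```python
# Map (service, access level) intents to minimal IAM actions.
ACTION_MAP = {
    ("s3", "read"): ["s3:GetObject", "s3:ListBucket"],
    ("dynamodb", "write"): ["dynamodb:PutItem", "dynamodb:UpdateItem"],
}

def policy_from_intent(intents, resource="*"):
    """Build an IAM policy document granting only what was asked for."""
    statements = [
        {
            "Effect": "Allow",
            "Action": ACTION_MAP[(service, access)],  # nothing broader
            "Resource": resource,
        }
        for service, access in intents
    ]
    return {"Version": "2012-10-17", "Statement": statements}
```

The point is the direction of construction: the policy is derived from the stated intent, so the intent itself becomes an auditable artifact.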

The shift: Security used to be post-deployment - penetration testing, vulnerability scans, incident response. In the agentic world, security starts at prompt time. The initial prompt determines what the agent can and cannot do.

Critical implication:

  • Auditability becomes mandatory: You must be able to explain why an agent did what it did
  • Prompt logs are compliance artifacts: The prompts you give to agents are part of your audit trail

Say this clearly:

"If you can't explain why an agent did something, you can't run it in production."

This is not optional. It's a requirement for responsible AI deployment.

SLIDE 14

What This Means for You

Use best models
Adopt agents carefully
Over-invest in validation

This slide translates everything we've discussed into practical guidance for the audience.

1. Use best models (don't build them):

  • Start with the best available foundation models from the marketplace
  • Customize through prompts, RAG, and fine-tuning before considering full custom models
  • Design for model plurality - don't lock yourself into a single provider

2. Adopt agents carefully (treat them like junior engineers):

  • Start with narrow, well-defined tasks
  • Implement guardrails and policies from day one
  • Provide clear tools and context
  • Supervise outcomes, not every action

3. Over-invest in validation (it's your safety net):

  • Tests are more important than ever in AI-generated code
  • Build comprehensive observability into agent workflows
  • Implement circuit breakers and rollback mechanisms
  • Audit agent actions continuously
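One mechanism from the list above made concrete: a circuit breaker that halts an agent after repeated failures rather than letting it retry indefinitely. An illustrative sketch, not a library API:

```python
class CircuitBreaker:
    """Halt an agent after too many consecutive failed actions."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0
        self.open = False  # open circuit = agent halted for human review

    def call(self, action):
        if self.open:
            raise RuntimeError("circuit open: human review required")
        try:
            result = action()
            self.failures = 0  # a success resets the failure streak
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.open = True
            raise
```

Once the circuit opens, the agent stops acting entirely until a human resets it — failing loudly and early instead of compounding a mistake.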

This is a new operating model, not just new tech. You can't just "add AI" to your existing processes. You need to rethink how work gets done, how quality is assured, and how responsibility is assigned.

SLIDE 15

Close the Loop (Back to the Analogy)

2025+: You don't write the email
You approve the outcome

Circle back to the opening analogy from Slide 2. This brings narrative closure to the presentation.

The core message: The transition from 2015 → 2023 → 2025 isn't just about better tools. It's about a fundamental shift in our role.

The mindset shift:

  • From execution → supervision
  • From doing → judging
  • From typing → delegating

AWS re:Invent 2025 wasn't about features. It wasn't about Nova 2 benchmarks or AgentCore API specifications. It was about AWS recognizing and enabling this fundamental change in how technical work gets done.

End with this line:

"The most valuable engineers won't be the fastest typers — they'll be the best delegators."

(Pause)

Closing Story

The Setup

"Let's go back to that AI that sent the email for you."
"What happens if it sends the wrong one?"

(Pause)

"What happens if it provisions the wrong infrastructure? Deletes the wrong data? Escalates the wrong incident?"

The Insight

"This is why re:Invent spent so much time on:
  • Guardrails
  • Policies
  • Audit trails
  • IAM generated from intent"
"Not because AWS doesn't trust AI — but because AI changes who's accountable."

The Close (Land This Line)

"In the agentic world, failure doesn't look like a bug. It looks like a decision."

(Pause)

"And decisions require ownership."

Final Line (Strong, Memorable)

"In 2025, the most valuable engineers won't be the ones who type the fastest… They'll be the ones who know what not to delegate."

(Pause)

Thank you.

Presentation by Amit Joshi · Solutions Architect · Leapfrog Technology
