Speaker Notes

Dawn of the Agents

Amit Joshi · Solutions Architect · Leapfrog Technology

Opening Story

Opening Line

"I want to start with something uncomfortable."

(Pause)

The Story

"Think about the last important email you sent at work. A proposal. A roadmap. A customer escalation."

2015

"In 2015, you researched. You read blogs. You wrote drafts. Maybe you asked a colleague to review it. You owned every word."

2023

"In 2023, you pasted links into ChatGPT. You asked for a summary. You tweaked the output. You still pressed 'Send'."

(Pause)

2025

"Now imagine this: You tell an AI 'Handle this.' It researches. It drafts. It adapts tone based on the recipient. It sends the email. And it updates the CRM."

(Pause again)

"At that point… you're not writing emails anymore. You're supervising outcomes."

The Turn

"AWS re:Invent 2025 wasn't about better emails. It was about that exact transition — everywhere."
"From code… to infrastructure… to security… AWS is betting that doing becomes automated, and judgement becomes the scarce skill."

SLIDE 1

Title Slide

AWS re:Invent 2025: GenAI Matures → The Age of Agents
From copilots to autonomous systems

This opening slide sets expectations for the entire presentation. This talk is not a feature dump of AWS services. Instead, it's about the directional shifts AWS is making in the GenAI space.

The key theme is the transition from GenAI that "helps" (like copilots and assistants) to GenAI that "does" (autonomous agents that take action). This represents a fundamental shift in how we think about AI in production systems.

Mention AWS re:Invent 2025 once here to establish context, then move on. The focus should be on the concepts, not on rehashing conference announcements.

SLIDE 2

The Analogy (Anchor the Talk)

2015 → 2023 → 2025+
From manual → assisted → autonomous

This slide introduces the central analogy that will frame the entire presentation. Walk through the email analogy slowly, as established in the opening story:

  • 2015: You manually research, write, and send emails
  • 2023: AI assists you - you use ChatGPT to draft, but you still review and send
  • 2025: AI acts autonomously - you delegate the task, and AI handles it end-to-end

Emphasize the critical difference between AI as a tool (2023) and AI as an actor (2025). This isn't just about better autocomplete - it's about AI systems that can hold intent over time, make decisions, and take actions.

This framing will apply throughout the talk to:

  • Foundation models
  • Coding workflows
  • DevOps practices
  • Security policies
"re:Invent 2025 was basically AWS saying: we're entering the 2025 box."

SLIDE 3

Big re:Invent Theme

Foundation Models = Infrastructure
Agents are the Interface

This is the core AWS thesis and the most important conceptual slide in the deck. Everything else ladders up to this framework.

Foundation Models are Infrastructure: Models are becoming commoditized utility services, similar to EC2, S3, or managed databases. They're powerful but increasingly undifferentiated. You select them based on cost, performance, and availability - not as your core competitive advantage.

Agents are the Interface: The new application layer is agents - systems that use foundation models as compute primitives but add memory, tools, policies, and orchestration on top. This is where differentiation happens.

This reframing is critical for architects: don't build your strategy around owning or customizing models. Build it around what your agents do with those models.

SLIDE 4

Foundation Models = Infrastructure

• Nova 2 (Lite, Sonic, Omni)
• Incremental gains, not magic
• Bedrock = model marketplace

Nova 2 models represent AWS's approach to foundation models: boring on purpose. They're not breakthrough innovations - they're incremental improvements focused on:

  • Faster inference times
  • Lower cost per token
  • Better context windows
  • More reliable outputs

That's the point. Infrastructure should be boring, predictable, and reliable. You don't want your database doing surprising things, and you shouldn't want your foundation models doing surprising things either.

Bedrock as the model marketplace: AWS Bedrock now hosts models from multiple sources - AWS's own models, open-source models (Llama, Mistral), and closed models from partners. AWS is positioning itself as the "model switchboard" - you can switch between models without rewriting your application code.

This is a strategic play: AWS wins if they own the infrastructure layer where models run, regardless of which specific model you choose.

SLIDE 5

Custom Models: Power vs Reality

Nova Forge
Custom models are possible
But rarely necessary

Nova Forge is AWS's service for training or heavily customizing foundation models. It's powerful, but be very clear with the audience about the reality:

The challenges of custom models:

  • Validation is hard: How do you know your custom model is better? Testing is expensive and time-consuming.
  • Drift is real: Models degrade over time as data patterns change. You need continuous monitoring and retraining.
  • Cost explodes silently: Training costs, infrastructure costs, and maintenance costs compound quickly.

Critical take (say this explicitly):

"For most teams, custom models are a vendor-lock-in shaped hammer looking for a nail."

Best default approach:

  • Use the best available foundation model from the marketplace
  • Customize via prompts, RAG (Retrieval Augmented Generation), and light fine-tuning
  • Only invest in full custom models when you have a compelling, measured business case

Most companies overestimate their need for custom models and underestimate the operational burden.

SLIDE 6

Competitive Reality (Brief, Honest)

AWS ≠ only GenAI cloud
And that's OK

Acknowledge the elephant in the room: AWS is not the only player in GenAI infrastructure. Your audience knows this, so address it directly:

  • Google Cloud: Vertex AI + Gemini models
  • Microsoft Azure: Azure OpenAI Service (exclusive partnership with OpenAI)
  • Others: Anthropic (Claude), open-source ecosystem

AWS's response is not model supremacy. They're not trying to win by having the "best" model. Instead, their strategy is:

  • Choice: Run any model on their infrastructure
  • Governance: Enterprise-grade controls, policies, and audit trails
  • Integration: Deep integration with existing AWS services

Implication for architects: Design your systems for model plurality. Don't hard-code dependencies on a single model or provider. Build abstraction layers that let you swap models based on cost, performance, or capability.

The best model today might not be the best model next year. Your architecture should be resilient to that reality.
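The abstraction layer described above can be sketched in a few lines. A minimal sketch, assuming nothing about any vendor SDK — the class and function names here are illustrative, not an AWS or Google API:

```python
from abc import ABC, abstractmethod

class ModelProvider(ABC):
    """Interface the application depends on; no vendor names leak past it."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        ...

class StubProvider(ModelProvider):
    """Deterministic stand-in, useful for tests and local development."""

    def complete(self, prompt: str) -> str:
        return f"stub:{prompt}"

def summarize(provider: ModelProvider, text: str) -> str:
    # Application code only sees the interface, so swapping the model
    # behind it is a configuration change, not a rewrite.
    return provider.complete(f"Summarize briefly: {text}")
```

A Bedrock-backed or Vertex-backed provider would implement the same `complete` method; the `summarize` call site never changes.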

SLIDE 7

The Real Shift: Agents

Agents ≠ Chatbots
Agents = long-lived AI workers

This is where the conversation shifts from models (infrastructure) to agents (the application layer). Define agents clearly for the audience:

What makes something an agent?

  • Memory: Agents remember context across interactions and over time
  • Tools: Agents can call APIs, query databases, execute code, and interact with external systems
  • Policies: Agents operate within defined guardrails and permissions
  • Autonomy: Agents can make decisions and take actions without constant human input

Why this is different from chatbots: Chatbots respond to prompts. Agents hold intent over time, can recover from failures, and can act without asking for permission at every step.

Tie back to the 2025 analogy: This is the shift from "assisted" (you ask questions, AI responds) to "autonomous" (you delegate a goal, AI figures out how to achieve it).

This is the first time AI can truly do things rather than just suggest things.
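The four ingredients above can be shown in a toy agent loop — memory, tools, a policy gate, and autonomous multi-step execution. This is an illustrative sketch, not any real framework's API:

```python
# A toy agent loop: memory carried across steps, tool calls,
# a policy gate, and no human input between steps.

def run_agent(goal, plan, tools, allowed_tools, max_steps=10):
    memory = []  # context the agent keeps across steps, not per-prompt
    for tool_name in plan[:max_steps]:
        if tool_name not in allowed_tools:          # policy gate
            memory.append((tool_name, "blocked by policy"))
            continue
        result = tools[tool_name](goal, memory)     # tool call
        memory.append((tool_name, result))          # remember the outcome
    return memory  # the human reviews outcomes, not each step

tools = {
    "research": lambda goal, mem: f"notes on {goal}",
    "draft": lambda goal, mem: f"draft built from {len(mem)} notes",
}
```

A chatbot would stop after one exchange; the loop above holds the goal, reuses what it learned in earlier steps, and only surfaces the final trace for review.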

SLIDE 8

AWS Agent Stack

• Bedrock AgentCore
• Step-based execution
• Guardrails & memory

Bedrock AgentCore is AWS's agent runtime - the infrastructure layer for running agents at scale.

What it actually is (be honest):

  • Prompt orchestration: Managing multi-turn conversations and context
  • Tool calling: Safely connecting agents to APIs and services
  • Policy enforcement: Ensuring agents operate within defined boundaries
  • Memory management: Storing and retrieving agent state

Critical take:

"It feels more like a controlled workflow engine than true autonomous intelligence."

And that's not bad. Enterprises need predictability. They need agents that can be audited, monitored, and constrained. Pure autonomy is scary in production environments with real customers and real money at stake.

AgentCore is AWS's answer to the question: "How do we make agents enterprise-ready?" The answer is structure, governance, and observability.
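The "intercept, authorize, audit" pattern behind this can be sketched in a few lines. All names and the policy rule are illustrative assumptions, not AgentCore's actual API:

```python
import time

AUDIT_LOG = []  # production would use durable, append-only storage

def send_notification(channel, message):
    return f"sent to {channel}: {message}"

TOOLS = {"send_notification": send_notification}

def policy(tool, args):
    # Example rule: this agent may only post to internal channels.
    return tool == "send_notification" and args.get("channel", "").startswith("#internal")

def guarded_call(agent_id, tool, args):
    """Intercept the tool call: decide, record the decision, then act."""
    allowed = policy(tool, args)
    AUDIT_LOG.append({"agent": agent_id, "tool": tool,
                      "args": args, "allowed": allowed, "ts": time.time()})
    if not allowed:
        raise PermissionError(f"{tool} denied for {agent_id}")
    return TOOLS[tool](**args)
```

Note that the denial itself is logged before the exception is raised — the audit trail records what the agent *tried* to do, not just what it did.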

SLIDE 9

Specialized Agents Are Coming Fast

DevOps Agent
Security Agent
Kiro (IDE)

AWS is shipping job-shaped agents - AI systems designed for specific professional roles rather than general-purpose tasks.

Examples:

  • DevOps Agent: Acts as an AI on-call engineer that can diagnose issues, execute runbooks, and even remediate incidents
  • Security Agent: Provides continuous application security scanning, threat detection, and automated response
  • Kiro (IDE): An agentic integrated development environment that doesn't just autocomplete code, but understands project architecture and can refactor across multiple files

The pattern to watch: AWS is building agents that map to specific job functions. Expect many more to come:

  • FinOps Agent: Cost optimization and budget management
  • Compliance Agent: Continuous regulatory compliance checking
  • Data Ops Agent: Data pipeline management and quality assurance

This is the future of enterprise AI: not one general assistant, but a team of specialized agents, each focused on a specific domain with deep context and tools.

SLIDE 10

Vibe Coding Is Now Normal

Natural language → Code Humans supervise

"Vibe coding" is no longer a joke. It's become the default workflow for many developers: describe what you want in natural language, and AI generates the code.

The reality: AI now writes more code than humans in many organizations. Tools like GitHub Copilot, Amazon Q Developer (formerly CodeWhisperer), and agentic IDEs are producing the majority of boilerplate, integration code, and even complex logic.

The risk is not wrong code. Modern AI is surprisingly good at generating syntactically correct, functionally accurate code for well-defined problems.

The real risks are:

  • No one understands it: Code generated in seconds isn't necessarily code that's understood
  • No one tests it properly: The temptation is to trust AI output without rigorous testing
  • Technical debt accumulates faster: It's easy to generate code; it's hard to maintain it

Key line (pause after saying this):

"Productivity is up. Visibility is down."

We're shipping faster, but we're accumulating complexity we don't fully understand. This is a management and process challenge, not just a technical one.

SLIDE 11

The Spec Is the Source Code

Specs > Code
Python ≈ Assembly

This is a provocative statement - let it land before explaining.

The core idea: In an AI-driven development world, specifications drive code generation. The code itself becomes an implementation detail, similar to how assembly language is an implementation detail of higher-level languages.

What this means:

  • Specs drive generation: Well-written specifications, contracts, and type definitions become the source of truth
  • Code is ephemeral: Code can be regenerated from specs as models improve or requirements change
  • Testing becomes critical: You can't review every line of generated code, so tests become your primary quality gate

Implication for software practices:

  • Tests are more important than ever
  • Contracts and interfaces become first-class artifacts
  • Assertions and invariants matter more than syntax

The Python line: When you say "Python is the new assembly," smile. It's half joke, half prophecy. The point is that we're moving up the abstraction ladder again - from assembly to C, from C to Python, and now from Python to natural language specifications.
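One way to make "tests are the quality gate" concrete: express the spec as executable assertions that any regenerated implementation must pass. A minimal sketch — `slugify` is a hypothetical example function, not something from the talk:

```python
import re

def slugify(title):
    # One possible implementation; it could be regenerated at any time.
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

# The spec is the durable artifact: examples plus invariants.
SPEC_EXAMPLES = [
    ("Hello World", "hello-world"),
    ("  AWS re:Invent  ", "aws-re-invent"),
]

def check_contract(fn):
    # Example-based checks: known inputs map to known outputs.
    for given, expected in SPEC_EXAMPLES:
        assert fn(given) == expected, (given, fn(given))
    # Invariant check: the function is idempotent on its own output.
    for given, _ in SPEC_EXAMPLES:
        assert fn(fn(given)) == fn(given)
    return True
```

If a newer model regenerates `slugify` tomorrow, `check_contract` is what decides whether the replacement ships — not a line-by-line human review.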

SLIDE 12

Vibe DevOps: Infra as Prompt

"I want X" → Terraform / Pulumi → Running infra

Infrastructure generation via prompts is maturing fast. What was a demo 12 months ago is becoming production-ready.

How it works:

  1. You describe infrastructure intent in natural language: "I need a serverless API with authentication"
  2. AI generates infrastructure-as-code (Terraform, Pulumi, CloudFormation)
  3. The code is reviewed (by human or automated tools)
  4. Infrastructure is provisioned
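Step 3 of the flow above can itself be partly automated. A minimal sketch of a review gate that scans generated IaC for risky patterns before anything is provisioned — the patterns and the Terraform snippet are illustrative only:

```python
# Patterns a human should always see before apply; extend as needed.
RISKY_PATTERNS = [
    "0.0.0.0/0",             # world-open network access
    "force_destroy = true",  # destructive bucket setting
]

def review_gate(generated_iac):
    """Return risky patterns found; an empty list means it may proceed."""
    return [p for p in RISKY_PATTERNS if p in generated_iac]

# Example of AI-generated Terraform that should be flagged:
generated = '''
resource "aws_security_group_rule" "api_ingress" {
  type        = "ingress"
  cidr_blocks = ["0.0.0.0/0"]
}
'''
```

Here `review_gate(generated)` flags the open CIDR block and routes the change to a human instead of applying it automatically.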

AWS demos and Pulumi AI prove feasibility. These aren't research projects - they're shipping products that developers are using today.

Lambda durable workflows are key to this shift. They enable:

  • Long-running operations: Infrastructure changes take time; workflows can wait
  • Agent-driven changes: Agents can make multi-step infrastructure modifications
  • State management: Track complex provisioning processes with retries and rollbacks

Critical warning:

"Infrastructure mistakes scale faster than code mistakes."

A bad line of code might affect one user. A bad infrastructure change can take down your entire service. The guardrails and review processes for AI-generated infrastructure need to be even stronger than for AI-generated code.

SLIDE 13

Governance Moves Upstream

Security starts at prompt time
Not post-deployment

This is one of the most important slides in the deck for enterprise architects.

IAM Autopilot is a huge signal. AWS is showing that policies can be generated from intent rather than copied from Stack Overflow or constructed through trial-and-error.

What's changing:

  • Policies are generated from prompts: "This agent needs read access to S3 and write access to DynamoDB" → least-privilege IAM policy
  • AgentCore policies intercept actions: Before an agent can execute a tool call, policies check authorization
  • Audit trails are mandatory: Every agent action is logged and traceable
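A toy version of "policy from intent" makes the first bullet concrete: a declared intent maps to a least-privilege IAM policy document. Real services infer this from natural language; the explicit mapping here is purely illustrative:

```python
# Map (service, access level) intents to minimal IAM actions.
ACTION_MAP = {
    ("s3", "read"): ["s3:GetObject", "s3:ListBucket"],
    ("dynamodb", "write"): ["dynamodb:PutItem", "dynamodb:UpdateItem"],
}

def policy_from_intent(intents, resource="*"):
    """Build an IAM policy document granting only what was asked for."""
    statements = [
        {
            "Effect": "Allow",
            "Action": ACTION_MAP[(service, access)],  # nothing broader
            "Resource": resource,
        }
        for service, access in intents
    ]
    return {"Version": "2012-10-17", "Statement": statements}
```

The point is the direction of construction: the policy is derived from the stated intent, so the intent itself becomes an auditable artifact.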

The shift: Security used to be post-deployment - penetration testing, vulnerability scans, incident response. In the agentic world, security starts at prompt time. The initial prompt determines what the agent can and cannot do.

Critical implication:

  • Auditability becomes mandatory: You must be able to explain why an agent did what it did
  • Prompt logs are compliance artifacts: The prompts you give to agents are part of your audit trail

Say this clearly:

"If you can't explain why an agent did something, you can't run it in production."

This is not optional. It's a requirement for responsible AI deployment.

SLIDE 14

What This Means for You

Use best models
Adopt agents carefully
Over-invest in validation

This slide translates everything we've discussed into practical guidance for the audience.

1. Use best models (don't build them):

  • Start with the best available foundation models from the marketplace
  • Customize through prompts, RAG, and fine-tuning before considering full custom models
  • Design for model plurality - don't lock yourself into a single provider

2. Adopt agents carefully (treat them like junior engineers):

  • Start with narrow, well-defined tasks
  • Implement guardrails and policies from day one
  • Provide clear tools and context
  • Supervise outcomes, not every action

3. Over-invest in validation (it's your safety net):

  • Tests are more important than ever in AI-generated code
  • Build comprehensive observability into agent workflows
  • Implement circuit breakers and rollback mechanisms
  • Audit agent actions continuously
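One mechanism from the list above made concrete: a circuit breaker that halts an agent after repeated failures rather than letting it retry indefinitely. An illustrative sketch, not a library API:

```python
class CircuitBreaker:
    """Halt an agent after too many consecutive failed actions."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0
        self.open = False  # open circuit = agent halted for human review

    def call(self, action):
        if self.open:
            raise RuntimeError("circuit open: human review required")
        try:
            result = action()
            self.failures = 0  # a success resets the failure streak
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.open = True
            raise
```

Once the circuit opens, the agent stops acting entirely until a human resets it — failing loudly and early instead of compounding a mistake.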

This is a new operating model, not just new tech. You can't just "add AI" to your existing processes. You need to rethink how work gets done, how quality is assured, and how responsibility is assigned.

SLIDE 15

Close the Loop (Back to the Analogy)

2025+: You don't write the email
You approve the outcome

Circle back to the opening analogy from Slide 2. This brings narrative closure to the presentation.

The core message: The transition from 2015 → 2023 → 2025 isn't just about better tools. It's about a fundamental shift in our role.

The mindset shift:

  • From execution → supervision
  • From doing → judging
  • From typing → delegating

AWS re:Invent 2025 wasn't about features. It wasn't about Nova 2 benchmarks or AgentCore API specifications. It was about AWS recognizing and enabling this fundamental change in how technical work gets done.

End with this line:

"The most valuable engineers won't be the fastest typers — they'll be the best delegators."

(Pause)

Closing Story

The Setup

"Let's go back to that AI that sent the email for you."
"What happens if it sends the wrong one?"

(Pause)

"What happens if it provisions the wrong infrastructure? Deletes the wrong data? Escalates the wrong incident?"

The Insight

"This is why re:Invent spent so much time on:
  • Guardrails
  • Policies
  • Audit trails
  • IAM generated from intent"
"Not because AWS doesn't trust AI — but because AI changes who's accountable."

The Close (Land This Line)

"In the agentic world, failure doesn't look like a bug. It looks like a decision."

(Pause)

"And decisions require ownership."

Final Line (Strong, Memorable)

"In 2025, the most valuable engineers won't be the ones who type the fastest… They'll be the ones who know what not to delegate."

(Pause)

Thank you.

Presentation by Amit Joshi · Solutions Architect · Leapfrog Technology
