A Journey into Responsible AI Design

Designing Intelligence You Can Trust

Responsible Multi-Agent AI Systems for Real-World Impact

Nicole (Nicki) Florio | Zarin Lokhandwala
Trust
Transparency
Collaboration

The Challenge We All Face

Whether you're an AI expert or just getting started, one question unites us all:

"How do we unlock AI's incredible potential while ensuring accuracy, reliability, and trustworthy outcomes?"

The Reality

It doesn't matter how powerful or impressive an AI model is—if people can't trust the output, they simply won't use it.

The Opportunity

Designing trustworthy AI isn't just an ethical obligation—it's a strategic requirement for adoption, efficiency, and long-term value.

Why Intentional Design Matters

Understanding the risks helps us build better safeguards

Errors at Scale

AI doesn't just make mistakes—it can amplify them. What would be a small human error can become thousands of incorrect outputs within seconds.

Data Reflects Us

AI learns from human-generated data. If that data is biased, incomplete, or skewed, the model will replicate—and often magnify—those patterns.

Not "Set & Forget"

AI is only as good as the instructions we provide. Clear prompting, continuous training, and ongoing review are essential for consistent performance.

Every risk we've discussed can be mitigated with thoughtful, intentional design.

Understanding AI Agents

An AI agent is a system that can perceive, decide, and act toward a goal with minimal human input.

Non-Agentic AI

Standard Chatbot

  • Reactive—responds only when prompted
  • Stateless—no persistent memory across sessions
  • Single-turn—handles one request at a time
VS

Agentic AI

Autonomous Workflow Partner

  • Proactive—plans and pursues goals across multiple steps
  • Stateful—maintains context, memory & progress
  • Autonomous—executes end-to-end workflows
  • Tool-enabled—uses external tools, APIs & systems
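A minimal Python sketch of the contrast above. Everything in it is hypothetical and chosen only for illustration: the class names `StatelessBot` and `StatefulAgent`, the `check_location` tool, and the canned replies. A real agent would sit on top of a model and real integrations.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class StatelessBot:
    """Non-agentic: answers one prompt at a time and keeps nothing."""

    def respond(self, prompt: str) -> str:
        return f"You asked: {prompt!r}. Please contact support."


@dataclass
class StatefulAgent:
    """Agentic (toy version): calls tools and remembers what happened."""

    tools: Dict[str, Callable[[str], str]]
    memory: List[str] = field(default_factory=list)

    def act(self, goal: str, tool_name: str, arg: str) -> str:
        result = self.tools[tool_name](arg)       # tool-enabled
        self.memory.append(f"{goal}: {result}")   # stateful
        return result

    def recall(self, keyword: str) -> List[str]:
        # Look up earlier steps that mention the keyword.
        return [entry for entry in self.memory if keyword in entry]


# Hypothetical tool: a stand-in for a real location lookup.
agent = StatefulAgent(tools={"check_location": lambda city: f"phone GPS is in {city}"})
agent.act("verify charge", "check_location", "Denver")
print(agent.recall("Denver"))
```

The bot has no field to store anything, so a later question starts from zero. The agent can answer it from `memory`.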

The Difference in Practice

Scenario: A customer's card is charged $2,400 at an electronics store in a city they don't live in

Non-Agentic AI

Answers questions
1 Fraud system flags transaction & freezes card
2 Customer opens chatbot: "Why is my card frozen?"
3 AI responds: "Please call 1-800-555-0123 to verify your identity."
4 Customer calls → 20 min hold → re-explains → agent manually reviews → unfreezes card
5 Customer asks later about the alert — AI has no memory of it
~30–45 min · multiple handoffs · frustrated customer

Agentic AI

Resolves problems
1 Fraud system flags transaction
2 AI pulls the customer's profile and spending patterns & detects that their phone's GPS is in that city
3 AI sends push notification: "$2,400 at Best Buy in Denver. Was this you?"
4 Customer taps "Yes, that's me" — AI verifies, unfreezes card, updates travel profile
5 Days later, customer asks about it — AI recalls the full interaction
~2 min · one touchpoint · delighted customer
Non-Agentic AI answers.   Agentic AI resolves.

Before You Build: Essential Questions

Stop. Think. Design. Then open your systems.

Let's walk through this together — imagine we're designing a multi-agent fraud detection system for a bank.
1 What is the core purpose of this agent?

Our Fraud Detection Agent has one job: monitor every incoming transaction in real time and flag anything suspicious before it impacts the customer.

2 What should it own and be responsible for?

It owns risk scoring — pulling transaction history, comparing spending patterns, and assigning a confidence-rated risk level. It does NOT own the decision to freeze an account.

3 What decisions can and should it make autonomously?

Auto-approve low-risk transactions that match known patterns. Auto-block purchases from confirmed fraudulent merchants. But gray-zone cases? Those need a second opinion.

4 Where should it stop and escalate to a human?

Transactions over $5,000, first-time international purchases, or sudden behavioral shifts — the agent packages full context and routes to a human fraud analyst.

Agent Handles

Scoring every transaction against historical patterns
Auto-approving low-risk transactions in real time
Sending push notifications for customer verification
Logging every decision with a full audit trail

Human Reviews

High-value flagged transactions (over $5,000)
Customer disputes & edge-case appeals
Updating fraud detection rules & thresholds
Final account freeze / unfreeze decisions
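The ownership boundaries above can be written down as an explicit routing policy. This is a sketch under stated assumptions, not the presenters' implementation. The `Transaction` fields, the `Route` names, and the order of the checks are hypothetical. Only the $5,000 threshold and the escalation triggers come from the answers above.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_APPROVE = "auto_approve"
    AUTO_BLOCK = "auto_block"
    SECOND_OPINION = "second_opinion"   # gray zone: another agent weighs in
    HUMAN_REVIEW = "human_review"       # escalate with full context


@dataclass(frozen=True)
class Transaction:
    amount: float
    merchant_confirmed_fraud: bool = False
    first_time_international: bool = False
    sudden_behavior_shift: bool = False
    matches_known_pattern: bool = False


HUMAN_REVIEW_THRESHOLD = 5_000.00  # "Transactions over $5,000"


def route(tx: Transaction) -> Route:
    # Assumed ordering: confirmed fraudulent merchants are blocked first.
    if tx.merchant_confirmed_fraud:
        return Route.AUTO_BLOCK
    # Escalation triggers go to a human fraud analyst.
    if (
        tx.amount > HUMAN_REVIEW_THRESHOLD
        or tx.first_time_international
        or tx.sudden_behavior_shift
    ):
        return Route.HUMAN_REVIEW
    # Low-risk transactions that match known patterns are approved.
    if tx.matches_known_pattern:
        return Route.AUTO_APPROVE
    # Everything else is a gray-zone case.
    return Route.SECOND_OPINION


print(route(Transaction(amount=45.00, matches_known_pattern=True)))
print(route(Transaction(amount=6_200.00)))
```

Note what is missing: nothing here freezes an account. That decision stays with the human reviewer.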

Multi-Agent Workflow in Action

See how agents collaborate in our fraud detection system — from transaction to resolution

Real-Time Data Sources

  • Transaction Stream: $2,400 · Best Buy · Denver
  • Spending History: avg electronics spend $180/mo

AI Agents

  • Ingestion Agent: captures the transaction
  • Enrichment Agent: adds context

Enriched Output

  • $2,400 at Best Buy
  • GPS: Denver
  • 13x avg spend

3 Transactions Incoming

  • $45 grocery
  • $800 online
  • $2,400 Best Buy

Fraud Detection Agent

  • Risk scoring: pattern & history analysis

Decisions

  • $45: Approved
  • $800: Blocked
  • $2,400: Flagged (score: 0.62), escalate to human review

Flagged Transaction

  • $2,400 · Best Buy · Denver · Score: 0.62 · 13x avg spend

Human Fraud Analyst

  • Reviews context: GPS matches, customer is traveling

Resolution

  • Card unfrozen
  • "Was this you?"
  • Audit logged

Transaction Ingestion

Capture & Enrich in Real Time

Every incoming transaction is captured in real time. The Ingestion Agent pairs it with the customer's spending history and behavioral patterns. The Enrichment Agent then adds context (location, device, merchant category) so the next agent can score it accurately.

Sample Enriched Transaction
$2,400 at Best Buy — Denver, CO
Customer GPS: Denver (traveling)
Avg electronics spend: $180/mo
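The sample above can be modeled as a small enrichment step. This is a sketch only. The type names, the field names, and the `enrich` function are hypothetical, and the two derived signals (`spend_ratio`, `location_matches`) are one plausible way to express "13x avg spend" and "GPS matches". As in the sample, the ratio compares a single purchase with the monthly category average.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RawTransaction:
    amount: float
    merchant: str
    city: str


@dataclass(frozen=True)
class EnrichedTransaction:
    raw: RawTransaction
    customer_gps_city: str
    avg_category_spend: float  # e.g. average monthly electronics spend

    @property
    def spend_ratio(self) -> float:
        # How many times larger this purchase is than the usual spend.
        return self.raw.amount / self.avg_category_spend

    @property
    def location_matches(self) -> bool:
        # True when the customer's phone is in the purchase city.
        return self.raw.city == self.customer_gps_city


def enrich(raw: RawTransaction, gps_city: str, avg_spend: float) -> EnrichedTransaction:
    if avg_spend <= 0:
        raise ValueError("avg_spend must be positive to compute a ratio")
    return EnrichedTransaction(raw, gps_city, avg_spend)


tx = enrich(RawTransaction(2400.00, "Best Buy", "Denver"), "Denver", 180.00)
print(f"{tx.spend_ratio:.1f}x avg spend, GPS match: {tx.location_matches}")
```

For the sample transaction this gives a ratio of about 13.3 with a matching location: unusual spend, but a customer who is plausibly there.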

Validating Your Design

How do you know you built it right? Ask these critical questions.

Auditability

Can you trace how it reached its conclusions?

Reproducibility

Can you repeat the same outcome given the same inputs?

Accuracy

Is it actually producing correct results?

Consistency

Does it perform reliably over time?
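The reproducibility question can be turned into an automated check. The sketch below assumes the scoring step can be called as a plain function that returns a hashable value. `is_reproducible` and `toy_score` are hypothetical names, and the toy scoring rule is invented for the example.

```python
from typing import Callable, Hashable, Iterable


def is_reproducible(
    score_fn: Callable[[float], Hashable],
    cases: Iterable[float],
    runs: int = 3,
) -> bool:
    """Return True if every case yields the same output on repeated runs."""
    for case in cases:
        outputs = {score_fn(case) for _ in range(runs)}
        if len(outputs) != 1:
            return False
    return True


def toy_score(amount: float) -> float:
    # Invented rule for illustration: risk grows with amount, capped at 1.0.
    return round(min(amount / 5_000.0, 1.0), 2)


print(is_reproducible(toy_score, [45.00, 800.00, 2400.00]))
```

A check like this only covers same-input, same-output behavior. It says nothing about whether the outputs are correct, which is the separate accuracy question.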

The Audit Test

If you can't audit it, you can't trust it.

Implement logging, traceability, and systematic checks. Document how decisions are made so anyone can understand the path from input to output.
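One way to make "the path from input to output" concrete is a structured decision log. The sketch below is an assumption, not something the deck prescribes. The `AuditLog` class, its field names, and the choice to hash-chain entries so that later edits are detectable are all illustrative. It keeps entries in memory, where a real system would persist them.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


class AuditLog:
    """Append-only, hash-chained decision log (in memory, for illustration)."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, Any]] = []

    @staticmethod
    def _digest(body: Dict[str, Any]) -> str:
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def record(self, agent: str, inputs: Dict[str, Any],
               decision: str, reason: str) -> Dict[str, Any]:
        body = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "agent": agent,
            "inputs": inputs,        # must be JSON-serializable
            "decision": decision,
            "reason": reason,
            "prev_hash": self.entries[-1]["hash"] if self.entries else "",
        }
        entry = {**body, "hash": self._digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash; False means an entry was changed or reordered."""
        prev_hash = ""
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash or self._digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
log.record("fraud_detection", {"amount": 2400, "score": 0.62}, "flagged", "13x avg spend")
log.record("human_analyst", {"gps_match": True}, "card_unfrozen", "customer traveling")
print(log.verify())
```

Each entry records who decided, on what inputs, and why, so a reviewer can follow the chain from the flagged transaction to the resolution.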

The Fairness Check

Optimization without fairness creates risk.

Speed and efficiency mean nothing if outcomes are biased or inequitable. Build fairness checks into your validation process from day one.
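A fairness check can start as simply as comparing outcomes across groups. The sketch below computes the share of flagged transactions per group and the largest gap between groups. It is one simple metric among many, and the function names are hypothetical. Which groups to compare, and how large a gap is acceptable, are policy decisions this deck does not specify.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def flag_rates(decisions: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Share of flagged transactions per group, from (group, was_flagged) pairs."""
    totals: Dict[str, int] = defaultdict(int)
    flagged: Dict[str, int] = defaultdict(int)
    for group, was_flagged in decisions:
        totals[group] += 1
        flagged[group] += int(was_flagged)
    return {group: flagged[group] / totals[group] for group in totals}


def max_rate_gap(rates: Dict[str, float]) -> float:
    """Largest difference in flag rate between any two groups."""
    if not rates:
        return 0.0
    return max(rates.values()) - min(rates.values())


# Made-up sample data, for illustration only.
sample = [("group_a", True), ("group_a", False), ("group_b", False), ("group_b", False)]
rates = flag_rates(sample)
print(rates, max_rate_gap(rates))
```

Running a check like this on every validation cycle, rather than once, is what "from day one" looks like in practice.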

Designing with Agentic AI

Four principles to carry into every project

Design First, Build Second

Define purpose, ownership, and boundaries before you write a single line of code. The tools are ready; your thinking has to be too.

Earn Trust by Default

Auditability, fairness, and clear goals aren't add-ons — they're the foundation. If people can't trust the output, they won't use it.

Humans Stay in the Loop

AI lacks judgment, context, and accountability. Human oversight isn't a limitation — it's what makes the system reliable.

Complexity doesn't equal capability.
Build systems that are understandable, auditable, and worthy of trust.

Now Go Build.

You've got the framework. You've seen the workflow. The only thing left is to start.

Design with intent
Build in trust
Keep humans close
