If you are an AI fast-follower, rather than an AI early-adopter, it may be time to learn from organizations that moved beyond the AI pilot.
It may be time to look past the conflicting stories on AI, AI agents, and GenAI to focus on what is working. If your organization operates in a regulated environment where compliance, auditability, and governance reign, it may be time to learn how innovative organizations navigated past these AI adoption barriers. While it's unlikely you have a nearly $20B 2026 technology budget like JP Morgan, you can learn from their experimentation with AI beyond software coding.
The conflicting AI stories last month included the warning "Something Big is Happening" from Matt Shumer, which received 84 million views on X. He advised readers to build up their savings and be cautious about adding debt based on their income. A Substack post from Citrini Research shook the US stock market by predicting that AI will cause massive white-collar layoffs, 10% unemployment, and a 36% drop in the S&P 500 by 2028.
This narrative contrasts with 2026 research finding that AI agents fail 70% of the time and that 95% of enterprise GenAI projects deliver zero return, and with Gartner's prediction that over 40% of agentic AI projects will be canceled by the end of 2027.
JP Morgan ignored the conflicting AI stories and began their GenAI journey before the release of ChatGPT in November 2022. Derek Waldron, their chief analytics officer, explained how the financial giant achieved large-scale, voluntary employee adoption of AI, GenAI and AI agents. Their experience helped them realize three core strategy principles:
- Employees should decide AI use cases. AI technology capabilities must be distributed to the employee user base; employees are best positioned to determine the effectiveness of AI use cases.
- Employees should build with AI. No two job functions are exactly the same. Distribute powerful, reusable building blocks and capabilities that enable individuals to build solutions themselves, such as AI tools for research, analysis, and document prep.
- Connections. The long-term bottleneck for driving maximum value was not the AI models; it was the connections to the enterprise's existing technology applications, knowledge, analytics, and processes.
They realized that if AI superintelligence or Artificial General Intelligence (AGI) was ever achieved, it would provide little value without the reusable building blocks and connections.
They also determined that we are in a foundational AI arms race (i.e., OpenAI, Anthropic, Google), with new models released almost constantly. They concluded that the top foundation models, and even smaller models, work well enough; the solution is almost entirely about the internal platform, tools, and connections, and how they work within the ecosystem. Thus, they designed their enterprise AI capabilities so they could swap foundation models at any time without impact to the enterprise AI ecosystem, essentially treating foundation models as commodities.
To create value, JP Morgan had to address:
- LLM augmentation
- Connections
- Reusable building blocks
- Guardrails
- Evals
- User Awareness and Adoption
- AI Agent mindset
LLM Augmentation – For GenAI to be effective, it must leverage current, trusted, and proprietary knowledge while avoiding hallucinations. JP Morgan addressed this by developing a Retrieval-Augmented Generation (RAG) strategy. RAG improves GenAI by retrieving trusted external knowledge sources, like company documents or the internet, and adding them to the Large Language Model (LLM) prompt (i.e., the context window) before generating a response. The inference output is grounded in up-to-date, trusted information, which reduces hallucinations and requires no retraining when new foundation models are released. JP Morgan is on its fourth generation of RAG:
- First – basic RAG – access internal knowledge via keyword and vector searches. Keyword search is brittle: searching for "vacation" may miss knowledge filed under PTO, holiday, or personal leave. To enable vector search, organizations create vector stores by transforming data into numerical embeddings stored in a database for semantic search, typically via approximate nearest neighbor (ANN) lookups. The embeddings capture the relationships between vacation, PTO, holiday, and personal leave. This lets users ask questions about vacation time or recent quarterly results without navigating HR and financial systems, finding documents, clicking on links, scrolling, and searching for answers. You may have noticed that Google Search started to improve around 2017, when it incorporated vector search over numerical embeddings.
- Second – democratized RAG – they federated it so the entire firm could create their own knowledge stores and set access provisions around them.
- Third – hierarchy RAG – they realized not all knowledge is equal, so they created hierarchies of information:
- top – precise, scripted answers to questions where you don't want degrees of freedom
- second – what they call evergreen: surfacing authoritative, real-time sources
- third – knowledge and information in documents that stand the test of time
- bottom – more temporal information that becomes less relevant as time goes on
- Fourth – multimodal RAG – ingesting reports with graphs, images, and company pitch illustrations, like those used in marketing
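The semantic-search idea behind basic RAG can be sketched in a few lines. The embeddings below are hand-assigned toy vectors (a real system would obtain them from an embedding model), deliberately chosen so that "vacation", "PTO", and "holiday" sit near each other:

```python
import math

# Toy, hand-assigned embeddings standing in for a real embedding model.
EMBEDDINGS = {
    "vacation": [0.9, 0.1, 0.0],
    "pto":      [0.85, 0.15, 0.0],
    "holiday":  [0.8, 0.2, 0.05],
    "earnings": [0.0, 0.1, 0.95],
}

# Each document is tagged with the concept its toy embedding represents.
DOCS = {
    "PTO policy: employees accrue 20 days per year": "pto",
    "Holiday schedule for 2026": "holiday",
    "Q3 earnings release summary": "earnings",
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def semantic_search(query_term, k=2):
    """Rank documents by embedding similarity, not keyword overlap."""
    q = EMBEDDINGS[query_term]
    ranked = sorted(DOCS, key=lambda d: cosine(EMBEDDINGS[DOCS[d]], q),
                    reverse=True)
    return ranked[:k]

# A keyword search for "vacation" matches no document; vector search
# still surfaces the PTO and holiday documents first.
print(semantic_search("vacation"))
```

In production, the documents and query would be embedded by the same model and the ANN lookup handled by a vector database, but the ranking principle is the same.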
Connections – For AI agents to work, they need ubiquitous connections to what people interact with: structured data systems, documents, knowledge stores, and applications such as HR and CRM. JP Morgan calls this their connected ecosystem. They built connections to their trading, financing, and risk systems, and they add more each month. They use a five-team rule for deciding on connections: if five teams request a connection, they prioritize implementing it via application programming interfaces (APIs) or the Model Context Protocol (MCP), which enables AI models to seamlessly connect to external data sources, tools, and databases. Once a connection is made, it becomes available to anyone in the enterprise with the appropriate access privileges.
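The five-team rule is simple enough to sketch as a small request tracker. The class and names here are hypothetical illustrations, not JP Morgan's actual tooling:

```python
from collections import defaultdict

class ConnectionRequests:
    """Track which teams have asked for a connection; flag it for
    prioritization once five distinct teams have requested it."""
    THRESHOLD = 5

    def __init__(self):
        # connection name -> set of requesting teams (a set, so repeat
        # requests from the same team do not double-count)
        self.requests = defaultdict(set)

    def request(self, connection, team):
        self.requests[connection].add(team)
        return len(self.requests[connection]) >= self.THRESHOLD

reqs = ConnectionRequests()
for team in ["equities", "rates", "fx", "credit"]:
    assert not reqs.request("crm-api", team)   # four teams: not yet
assert reqs.request("crm-api", "research")     # fifth team: prioritize
```

Using a set of teams rather than a raw counter is the design point: one enthusiastic team asking five times shouldn't trigger the rule.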
Reusable building blocks – Personal AI agents call the many reusable building blocks available across the enterprise or shared by team members. Building blocks are often multi-step tasks, such as research (multiple calls to various systems, knowledge, and data) or analysis that compares companies within an investment strategy to determine the best value. Building blocks create PowerPoint presentations, populate and analyze data in Excel, generate pitchbooks, and prepare material in JP Morgan's standard formats for documents, brochures, and presentations. Reusable building blocks can also be other AI agents, or skills that can be added to LLM prompts: step-by-step instructions for how to do a task, inserted into the context window when planning that task.
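A minimal sketch of the skills idea, assuming a skill is simply step-by-step text spliced into the prompt before planning. The skill names, steps, and prompt template here are illustrative, not JP Morgan's:

```python
# Hypothetical skill library: each skill is step-by-step instructions
# that get injected into the LLM prompt (context window) for a task.
SKILLS = {
    "pitchbook": (
        "1. Pull the client's latest earnings and news.\n"
        "2. Compare peers within the investment strategy.\n"
        "3. Format the result using the firm's standard template."
    ),
}

def build_prompt(task, user_request):
    """Assemble a prompt, injecting a matching skill when one exists."""
    parts = ["You are an assistant for investment professionals."]
    skill = SKILLS.get(task)
    if skill:
        parts.append(f"Follow these steps for a {task} task:\n{skill}")
    parts.append(f"User request: {user_request}")
    return "\n\n".join(parts)

prompt = build_prompt("pitchbook", "Prepare a briefing on ACME Corp")
print(prompt)
```

Because the skill lives outside the model, it can be versioned, shared across teams, and reused unchanged when the underlying foundation model is swapped.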
Guardrails – AI agents require a major mindset shift from telling the system what to do (traditional software) to telling the system what not to do (AI agents). "The only way to truly know if AI agents work is to release them into production environments," according to LangChain CEO Harrison Chase. Guardrails must address the many alignment challenges AI agents create. JP Morgan onboards every MCP server into its platform, which includes security testing and making sure legal agreements are in place. Once onboarded, an MCP server is available to the whole firm, provided the individual has access privileges.
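One simple form of "telling the system what not to do" is an action filter applied before an agent executes a tool call. The tool names below are hypothetical stand-ins:

```python
# Hypothetical denylist guardrail: the agent proposes tool calls, and
# the guardrail blocks the ones the firm has declared off-limits.
FORBIDDEN_TOOLS = {"transfer_funds", "delete_records"}

def guarded(tool_call):
    """Raise if the proposed tool call is on the denylist; otherwise
    pass it through unchanged for execution."""
    if tool_call["tool"] in FORBIDDEN_TOOLS:
        raise PermissionError(f"blocked: {tool_call['tool']}")
    return tool_call

guarded({"tool": "fetch_earnings"})  # allowed through
```

Real guardrails layer on much more (input/output scanning, rate limits, audit logging), but the inversion is the same: the permitted action space is open-ended, and policy defines the exclusions.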
Evals – In addition to guardrails, AI agents require a robust and comprehensive set of evals to determine their effectiveness. The evals must determine:
- overall performance of the AI agents
- alignment with goals, risk, compliance, and governance
- feedback to power the flywheel of learning and adapting.
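A minimal eval harness along these lines pairs inputs with grading functions and reports a pass rate that can feed the learning flywheel. The agent and cases below are toy stand-ins, not a real evaluation suite:

```python
def run_evals(agent, cases):
    """Run each (name, prompt, grader) case against the agent and
    return (pass_rate, per-case results)."""
    results = [(name, grade(agent(prompt))) for name, prompt, grade in cases]
    passed = sum(ok for _, ok in results)
    return passed / len(results), results

# A stub agent and two toy cases for demonstration.
def toy_agent(prompt):
    return "42" if "answer" in prompt else "unknown"

cases = [
    ("knows the answer", "what is the answer?", lambda out: out == "42"),
    ("admits uncertainty", "forecast rates", lambda out: out == "unknown"),
]

rate, details = run_evals(toy_agent, cases)
print(rate)  # 1.0 for this stub agent
```

Production evals replace the lambda graders with domain-expert rubrics or model-based judges, and the per-case results become the audit trail for governance.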
Evals are a fast-emerging industry that helps train AI models and determine their performance, alignment, and learning. The AI startup Mercor hired tens of thousands of human experts (including lawyers, doctors, engineers, and researchers) to train and evaluate advanced AI models. Their evals are critical to understanding the quality and accuracy of AI inferences in specific domains and subjects. Mercor's APEX-Agents evaluate agents on the real day-to-day work of professionals such as investment banking analysts, management consultants, and corporate lawyers. OpenAI released GDPval, a framework designed to measure AI model performance on 1,320 real-world, economically valuable tasks across 44 occupations. Guardrails, alignment, and governance of AI agents are not possible without robust enterprise evals that monitor, audit, and measure performance. JP Morgan vaulted to the top of the Evident AI Index, a global eval of AI talent, innovation, leadership, and transparency in banking.
User Awareness and Adoption – JP Morgan began by distributing LLM technology to everyone's desktop within a hosted platform. This ensures that inputs to the LLMs stay inside the enterprise rather than being captured by OpenAI, Anthropic, or Google. The internal LLM also helps prevent shadow AI: the unauthorized use of AI tools, applications, or large language models (LLMs) by employees to perform work tasks without the knowledge, approval, or security vetting of their organization.
JP Morgan displayed posters, launched "AI Made Easy" employee training, and hosted ideation workshops with thousands of people across businesses, operations, and technology departments. At the hosted events, they demoed the tech, brainstormed how it could be used, and identified a ton of ideas. They identified half a dozen of the most prominent patterns of asks, including the ability to query and analyze digital data within LLMs. They enabled conversational questioning of structured databases, automating traditional database querying, and they built document ingestion and comparison-analysis AI tools.
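Conversational querying of structured data can be sketched end to end. In a real system an LLM generates the SQL from the user's question; the trivial rule-based translator below is a stand-in so the example stays self-contained, and the table is invented for illustration:

```python
import sqlite3

def question_to_sql(question):
    """Stand-in for an LLM that translates natural language to SQL."""
    if "how many employees" in question.lower():
        return "SELECT COUNT(*) FROM employees"
    raise ValueError("unrecognized question")

# A tiny in-memory database playing the role of an enterprise system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("Ada", "tech"), ("Grace", "ops")])

sql = question_to_sql("How many employees do we have?")
(count,) = conn.execute(sql).fetchone()
print(count)  # 2
```

The hard parts in production are schema grounding (so the generated SQL references real tables and columns) and read-only guardrails on what the generated queries may touch.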
Without much effort, they reached 30% enterprise adoption with their early adopters. That group got busy right away building personal AI agents, which helped push adoption to 60% once the fast-follower population saw what they were doing.
AI Agent mindset – For AI agents to evolve, it takes what Waldron describes as an AI agent mindset. He often asks himself what is missing from his personal AI assistant ecosystem: what can't he yet ask an AI agent to do? First, they helped people use AI for questions and answers, then for research and summary activities. Then people didn't want AI agents to solve part of a process; they wanted the whole process solved. For example, an investment banker must go to the news, check earnings releases, do web research about a particular client, and then create a briefing note. In 2025, Waldron believes they landed on an innovation flywheel: when they try to build AI agents, they identify and fill the gaps (often missing connections). A team surveils the gaps and has a process to solve them centrally. Solving individuals' problems expands capabilities, which creates more ideas and more uses; thus the flywheel effect.
JP Morgan uses a top-down and bottom-up approach. Bottom-up has resulted in incredibly powerful platform capabilities, with connections, knowledge, and reusable building blocks that grow and scale over time, enabling more use cases and adoption. Yet the bottom-up approach won't fully transform a company on its own. Businesses run on long processes that cross multiple teams. JP Morgan recognized that moving the needle on end-to-end processes requires top-down strategies, such as reducing the end-to-end time to disburse credit or to onboard employees. Waldron acknowledged they must strategically rethink what each process will look like in a world of AI and AI agents.
While the 30,000 personal AI agents don’t address the more complex enterprise workflow processes that cross multiple teams, the AI platform layer created for personal AI agents is a foundation for the end-to-end enterprise processes.
For more on enterprise adoption of AI, VentureBeat and its Beyond the Pilot podcast highlight what JP Morgan and other enterprises are doing with AI. We can learn from organizations that moved beyond AI demos and pilots; the Beyond the Pilot podcast has helped me do exactly that.