Highlight
- OpenAI AgentKit transforms agent development by offering a unified visual workflow system with integrated Guardrails, ensuring safety, governance, and ease of use for developers and enterprises.
- The Agent Builder enables the intuitive creation of multi-agent workflows through drag-and-drop functionality, facilitating faster collaboration, reduced iteration cycles, and the deployment of intelligent agents to production with minimal code.
- With Evals and ChatKit, developers can improve agent reliability and performance while embedding custom AI chat experiences directly into apps, websites, or enterprise tools.
In October 2025, OpenAI introduced AgentKit, a toolkit that aims to make it easier for developers and enterprises to build, deploy, and optimize intelligent agents. Demand is growing for agentic systems, that is, systems that do not simply respond to prompts but orchestrate an array of tools, workflows, and decisions. AgentKit is an important step toward simplifying what has historically been a fragmented and complex build process. This article covers what AgentKit offers, why it matters, and how it fits into the future of building AI agent systems.

The Challenge: Fragmented Agent Development
Until now, building agents has typically meant engineers managing multiple disparate tools and pieces of infrastructure:
- Custom connectors to specific data sources
- Scripted orchestration logic that makes the models, tools, and branches work together
- Prompt tuning and evaluation pipelines
- A front-end chat or UI layer
- Versioning, safety, and iteration across workflows
This fragmentation adds complexity, slows innovation, and increases the engineering effort needed for onboarding and for getting product, legal, and engineering teams aligned and coordinating their outputs. AgentKit seeks to address these pain points with a more integrated stack.
What AgentKit Offers
AgentKit comprises a number of new or redefined components that relate to the major phases of agent development:
- Agent Builder (Visual Workflow Creation)
One of the foundational pieces of AgentKit is Agent Builder, a visual canvas where developers can drag and drop nodes, wire up logic, version workflows, preview runs, and configure inline evaluations, with version control built in. For example, you could create a multi-agent workflow that begins with a classification step, routes to different sub-agents, and applies guardrails.
All of this is visible on the canvas and can be iterated on without having to manage much of the underlying glue code. OpenAI cites customer stories: Ramp reports that it cut iteration cycles by 70% and got an agent live in two sprints rather than two quarters, and LY Corporation built its first multi-agent workflow in less than two hours.
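The classify-then-route pattern described above can be sketched in plain Python. This is not Agent Builder's API or export format: the `classify`, `billing_agent`, `support_agent`, and `guardrail` functions are hypothetical stand-ins for nodes on the canvas, and the keyword check stands in for a model call.

```python
from typing import Callable, Dict

# Hypothetical stand-ins for workflow nodes; a real workflow would call a model.
def classify(message: str) -> str:
    """Toy classifier node: label the request by keyword."""
    return "billing" if "invoice" in message.lower() else "support"

def billing_agent(message: str) -> str:
    return f"[billing] Looking into: {message}"

def support_agent(message: str) -> str:
    return f"[support] Troubleshooting: {message}"

def guardrail(reply: str) -> str:
    """Toy guardrail node: block replies that mention a forbidden term."""
    if "password" in reply.lower():
        return "Sorry, I can't help with that."
    return reply

# Routing table: classifier label -> sub-agent.
ROUTES: Dict[str, Callable[[str], str]] = {
    "billing": billing_agent,
    "support": support_agent,
}

def run_workflow(message: str) -> str:
    label = classify(message)        # step 1: classification
    reply = ROUTES[label](message)   # step 2: route to a sub-agent
    return guardrail(reply)          # step 3: guardrail on the output

if __name__ == "__main__":
    print(run_workflow("Where is my invoice for March?"))
```

Keeping the routing table separate from the node functions mirrors the appeal of a visual canvas: the wiring can change without touching the individual steps.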
- Connector Registry
Agents typically need to interface with data sources or external systems (e.g., Google Drive, internal databases, APIs). The Connector Registry lets teams manage connectors from a single source of truth and reuse them across workspaces. It includes prebuilt connectors for Dropbox, Google Drive, SharePoint, Microsoft Teams, and other third-party collaboration applications. For organizations, this brings governance and consistency: agents built by different teams and projects can reuse approved connectors that already follow the organization's security and compliance rules.

- Guardrails (Safety Layer)
To mitigate undesirable or unsafe behavior, AgentKit includes Guardrails, an open-source, modular safety layer. Guardrails can mask or flag personally identifiable information (PII), detect attempts to jailbreak the model, and enforce other behavioral constraints. It can be deployed standalone or through a library for Python or JavaScript. This keeps agent behavior aligned with expected safety behavior and reduces risk when your agent operates on real data or interfaces with external systems.
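To make the PII-masking idea concrete, here is a minimal sketch using only Python's standard `re` module. It does not use the Guardrails library; `mask_pii` and the two patterns are illustrative assumptions, and real PII detection needs far broader coverage than this.

```python
import re
from typing import Tuple

# Illustrative patterns only (assumptions); not what the Guardrails library uses.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
US_PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

def mask_pii(text: str) -> Tuple[str, bool]:
    """Return the masked text and a flag saying whether anything was masked."""
    masked = EMAIL.sub("<EMAIL>", text)
    masked = US_PHONE.sub("<PHONE>", masked)
    return masked, masked != text

if __name__ == "__main__":
    print(mask_pii("Mail jane.doe@example.com or call 555-123-4567."))
```

Returning a flag alongside the masked text lets the caller choose between masking silently and flagging the message for review, the two behaviors the article mentions.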
- ChatKit (Agentic Chat Embedding)
Once you have built an agent, the next challenge is embedding it into an existing UI, app, website, or product. That means handling streamed responses, threaded conversations, a visible “thinking” state, and other interaction details. AgentKit’s ChatKit simplifies this process. You get a chat-based UI that you can customize to your style or brand, drop in, and start using right away. OpenAI has shared that Canva integrated a support agent with ChatKit in less than an hour, saving engineers two weeks of work. ChatKit is now generally available to developers.
- Enhanced Evaluation with Evals
Now that you have a working agent, you need to build reliability into it, which usually requires a feedback loop and a way to evaluate performance. OpenAI’s existing Evals framework has been extended in AgentKit with new capabilities:
Datasets: create evaluation sets and expand them over time, especially with grading and annotation.
Trace Grading: grade an agent across an entire workflow, not just on isolated prompts.
Automated Prompt Optimization: improve prompts based on human annotations or grader outputs.
Third-party Model Support: test models from other vendors in the same Evals framework.
Together, these features help developers understand how their agents behave, find weaknesses, and iteratively improve performance. OpenAI reports that one customer cut development time on its multi-agent due diligence framework by 50% and improved agent accuracy by 30%.
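The difference between grading an isolated prompt and grading a whole trace can be illustrated with a small sketch. The `Step` structure, the `grade_trace` function, and the scoring weights are all hypothetical; this is not the Evals API, only one way to picture a trace-level grader.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Step:
    """One recorded step of an agent run (hypothetical trace format)."""
    kind: str   # "tool_call" or "message"
    name: str   # tool name, or "assistant" for messages
    ok: bool    # whether the step completed without error

def grade_trace(trace: List[Step], required_tools: List[str]) -> float:
    """Score a whole run in [0, 1]: half for calling the required tools,
    half for the share of steps that succeeded."""
    if not trace:
        return 0.0
    called = {s.name for s in trace if s.kind == "tool_call"}
    if required_tools:
        tool_score = sum(t in called for t in required_tools) / len(required_tools)
    else:
        tool_score = 1.0  # nothing was required, so nothing was missed
    success_score = sum(s.ok for s in trace) / len(trace)
    return round(0.5 * tool_score + 0.5 * success_score, 3)

if __name__ == "__main__":
    run = [
        Step("tool_call", "search", True),
        Step("tool_call", "summarize", False),
        Step("message", "assistant", True),
    ]
    print(grade_trace(run, ["search", "fetch"]))
```

A grader like this looks at the run as a whole, so it can penalize a missing tool call even when the final message looks fine on its own.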
- Agent Reinforcement Fine-Tuning (RFT)
AgentKit extends OpenAI’s reinforcement fine-tuning (RFT) capabilities for customizing agents further. The new features include:
Custom tool calls: train a model on when to call which tool.
Custom graders: let developers define what “success” means in their domain to help shape learning goals.
These features can help make agents more context-aware, better at using tools, and better tailored to their use case.
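As a rough picture of what defining “success” could look like, the sketch below scores a model output against a reference in plain Python. The `grade` function, the dictionary fields, and the 0.5/0.5 weighting are assumptions made for illustration; they are not the RFT grader format.

```python
from typing import Dict

def grade(sample: Dict[str, str], reference: Dict[str, str]) -> float:
    """Hypothetical domain grader: reward the right tool and the right answer."""
    score = 0.0
    if sample.get("tool") == reference["tool"]:
        score += 0.5  # the expected tool was called
    answer = sample.get("answer", "").strip().lower()
    if answer == reference["answer"].strip().lower():
        score += 0.5  # final answer matches, ignoring case and whitespace
    return score

if __name__ == "__main__":
    reference = {"tool": "lookup_rate", "answer": "4.5%"}
    print(grade({"tool": "lookup_rate", "answer": " 4.5% "}, reference))
```

Splitting the score between tool choice and final answer reflects the two features listed above: one part rewards calling the right tool, the other encodes the domain's definition of a correct result.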

Availability & Pricing: ChatKit and the new Evals features are generally available to all developers at launch. Agent Builder is in beta. The Connector Registry will be rolled out in beta to API, ChatGPT Enterprise, and Education customers through the Global Admin Console. All AgentKit tools are included under standard API model pricing, with no separate charge. OpenAI plans to release a Workflows API and agent deployment options for ChatGPT in the near future.
What AgentKit Brings to the Table
AgentKit is a sign that agent development and deployment are maturing. It marks a move from one-off orchestration scripts and brittle systems to a cohesive, versioned, visual, and governed system, with several implications:
- Accessibility
Less engineering overhead means smaller teams and non-AI experts can experiment with agents.
- Faster Iteration and Deployment
Visual workflow + reusable connectors + guardrails = faster to production.
- Stakeholder Collaboration
Product designers, legal, compliance, and engineers can see and reason about agent logic together.
- Reliability & Safety
Built-in evaluation, guardrails, and versioning can help mitigate unexpected behaviors or regressions.
- Ecosystem Play
Because it builds on top of OpenAI’s Responses API and Evals, AgentKit deepens the connection between OpenAI’s core models and downstream applications. For organizations building customer support assistants, internal knowledge agents, research assistants, sales assistants, or domain-specific agents, AgentKit could serve as a central part of their architecture.

Possible Challenges and Considerations
Even with its promise, developers should keep a few caveats in mind:
Beta features: Some components (Agent Builder, Connector Registry) are still in beta and, as a result, might have constraints or evolving APIs.
Lock-in: The deeper a team goes into AgentKit’s framework, the tighter its coupling to OpenAI’s ecosystem becomes; consider the migration path if you might want to swap model providers in the future.
Custom needs: Extremely specialized or brand-new workflows might still need custom code beyond the visual nodes.
Safety and domain risks: Although guardrails help, domain-specific failure modes are still possible, so human oversight, monitoring, and proper fail-safes remain advisable.
Scalability and costs: For extremely high throughput or very large numbers of agents, the architecture and cost model deserve closer scrutiny.
Future Considerations
AgentKit represents a deliberate effort by OpenAI to make agent building accessible, organized, and powerful. As agents become woven into enterprise applications, tools like AgentKit may set the standards not just for which models agent builders use, but for how agents are built at all. In future releases, we will be watching for:

How expressive the visual workflows will become (e.g., loops, conditionals, dynamic branching).
How third-party models will work (e.g., plugging in non-OpenAI models via Evals).
How agents can be deployed (serverless hosting of agents, autoscaling).
More advanced safety and verification tools.
A community or marketplace of reusable agent modules, connectors, or templates.