This is page 1 of 2. Use http://codebase.md/opgginc/opgg-mcp?lines=true&page={x} to view the full context.
# Directory Structure
```
├── .gitignore
├── Dockerfile
├── docs
│ └── apps-sdk
│ ├── _blog_realtime-api_.txt
│ ├── _codex_cloud_code-review_.txt
│ ├── _codex_pricing_.txt
│ ├── _tracks_ai-application-development_.txt
│ ├── _tracks_building-agents.txt
│ ├── apps-sdk_app-developer-guidelines.txt
│ ├── apps-sdk_build_custom-ux_.txt
│ ├── apps-sdk_build_examples.txt
│ ├── apps-sdk_plan_components.txt
│ └── apps-sdk_plan_use-case.txt
├── LICENSE
├── package.json
├── pnpm-lock.yaml
├── README.md
├── smithery.yaml
├── src
│ ├── index.ts
│ └── proxy-server.ts
└── tsconfig.json
```
# Files
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
```
1 | dist
2 | node_modules
3 | .env
4 |
```
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
```markdown
1 | # OP.GG MCP Server
2 |
3 | [](https://smithery.ai/server/@opgginc/opgg-mcp)
4 |
5 | The OP.GG MCP Server is a Model Context Protocol implementation that seamlessly connects OP.GG data with AI agents and platforms. This server enables AI agents to retrieve various OP.GG data via function calling.
6 |
7 | 
8 | 
9 |
10 | ## Overview
11 |
12 | This MCP server provides AI agents with access to OP.GG data through a standardized interface. It offers a simple way to connect to our remote server (https://mcp-api.op.gg/mcp), allowing for easy installation and immediate access to OP.GG data in a format that's easily consumable by AI models and agent frameworks.
13 |
14 | ## Features
15 |
16 | The OP.GG MCP Server currently supports the following tools:
17 |
18 | ### League of Legends
19 | - **lol-champion-leader-board**: Get ranking board data for League of Legends champions.
20 | - **lol-champion-analysis**: Provides analysis data for League of Legends champions (counter and ban/pick data available in the "weakCounters" field).
21 | - **lol-champion-meta-data**: Retrieves meta data for a specific champion, including statistics and performance metrics.
22 | - **lol-champion-skin-sale**: Retrieves information about champion skins that are currently on sale.
23 | - **lol-summoner-search**: Search for League of Legends summoner information and stats.
24 | - **lol-champion-positions-data**: Retrieves position statistics data for League of Legends champions, including win rates and pick rates by position.
25 | - **lol-summoner-game-history**: Retrieve recent game history for a League of Legends summoner.
26 | - **lol-summoner-renewal**: Refresh and update League of Legends summoner match history and stats.
27 |
28 | ### Esports (League of Legends)
29 | - **esports-lol-schedules**: Get upcoming LoL match schedules.
30 | - **esports-lol-team-standings**: Get team standings for a LoL league.
31 |
32 | ### Teamfight Tactics (TFT)
33 | - **tft-meta-trend-deck-list**: TFT deck list tool for retrieving current meta decks.
34 | - **tft-meta-item-combinations**: TFT tool for retrieving information about item combinations and recipes.
35 | - **tft-champion-item-build**: TFT tool for retrieving champion item build information.
36 | - **tft-recommend-champion-for-item**: TFT tool for retrieving champion recommendations for a specific item.
37 | - **tft-play-style-comment**: This tool provides comments on the playstyle of TFT champions.
38 |
39 | ### Valorant
40 | - **valorant-meta-maps**: Valorant map meta data.
41 | - **valorant-meta-characters**: Valorant character meta data.
42 | - **valorant-leaderboard**: Fetch Valorant leaderboard by region.
43 | - **valorant-agents-composition-with-map**: Retrieve agent composition data for a Valorant map.
44 | - **valorant-characters-statistics**: Retrieve character statistics data for Valorant, optionally filtered by map.
45 | - **valorant-player-match-history**: Retrieve match history for a Valorant player using their game name and tag line.
46 |
47 | ## Usage
48 |
49 | The OP.GG MCP Server can be used with any MCP-compatible client. The following content explains installation methods using Claude Desktop as an example.
50 |
51 | ### Direct Connection via StreamableHttp
52 |
53 | If you want to connect directly to our StreamableHttp endpoint, you can use the `supergateway` package. This provides a simple way to connect to our remote server without having to install the full OP.GG MCP Server.
54 |
55 | Add the following to your `claude_desktop_config.json` file:
56 |
57 | #### Mac/Linux
58 |
59 | ```json
60 | {
61 | "mcpServers": {
62 | "opgg-mcp": {
63 | "command": "npx",
64 | "args": [
65 | "-y",
66 | "supergateway",
67 | "--streamableHttp",
68 | "https://mcp-api.op.gg/mcp"
69 | ]
70 | }
71 | }
72 | }
73 | ```
74 |
75 | #### Windows
76 |
77 | ```json
78 | {
79 | "mcpServers": {
80 | "opgg-mcp": {
81 | "command": "cmd",
82 | "args": [
83 | "/c",
84 | "npx",
85 | "-y",
86 | "supergateway",
87 | "--streamableHttp",
88 | "https://mcp-api.op.gg/mcp"
89 | ]
90 | }
91 | }
92 | }
93 | ```
94 |
95 | This configuration will use the `supergateway` package to establish a direct connection to our StreamableHttp endpoint, providing you with immediate access to all OP.GG data tools.
96 |
97 | ### Installing via Smithery
98 |
99 | To install OP.GG MCP for Claude Desktop automatically via [Smithery](https://smithery.ai/server/@opgginc/opgg-mcp):
100 |
101 | ```bash
102 | $ npx -y @smithery/cli@latest install @opgginc/opgg-mcp --client claude --key {SMITHERY_API_KEY}
103 | ```
104 |
105 | ### Adding to MCP Configuration
106 |
107 | To add this server to your Claude Desktop MCP configuration, add the following entry to your `claude_desktop_config.json` file:
108 |
109 | #### Mac/Linux
110 |
111 | ```json
112 | {
113 | "mcpServers": {
114 | "opgg-mcp": {
115 | "command": "npx",
116 | "args": [
117 | "-y",
118 | "@smithery/cli@latest",
119 | "run",
120 | "@opgginc/opgg-mcp",
121 | "--key",
122 | "{SMITHERY_API_KEY}"
123 | ]
124 | }
125 | }
126 | }
127 | ```
128 |
129 | #### Windows
130 |
131 | ```json
132 | {
133 | "mcpServers": {
134 | "opgg-mcp": {
135 | "command": "cmd",
136 | "args": [
137 | "/c",
138 | "npx",
139 | "-y",
140 | "@smithery/cli@latest",
141 | "run",
142 | "@opgginc/opgg-mcp",
143 | "--key",
144 | "{SMITHERY_API_KEY}"
145 | ]
146 | }
147 | }
148 | }
149 | ```
150 |
151 | After adding the configuration, restart Claude Desktop for the changes to take effect.
152 |
153 | ## License
154 |
155 | This project is licensed under the MIT License - see the LICENSE file for details.
156 |
157 | ## Related Links
158 |
159 | - [Model Context Protocol](https://modelcontextprotocol.io)
160 | - [OP.GG](https://op.gg)
161 |
```
--------------------------------------------------------------------------------
/tsconfig.json:
--------------------------------------------------------------------------------
```json
1 | {
2 | "extends": "@tsconfig/node22/tsconfig.json",
3 | "compilerOptions": {
4 | "noEmit": true,
5 | "noUnusedLocals": true,
6 | "noUnusedParameters": true
7 | }
8 | }
9 |
```
--------------------------------------------------------------------------------
/smithery.yaml:
--------------------------------------------------------------------------------
```yaml
1 | # Smithery configuration file: https://smithery.ai/docs/config#smitheryyaml
2 |
3 | startCommand:
4 | type: stdio
5 | configSchema:
6 | # JSON Schema defining the configuration options for the MCP.
7 | type: object
8 | properties: {}
9 | default: {}
10 | description: No configuration required
11 | commandFunction:
12 | # A JS function that produces the CLI command based on the given config to start the MCP on stdio.
13 | |-
14 | (config) => ({command: 'node', args: ['dist/index.js']})
15 | exampleConfig: {}
16 |
```
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
```dockerfile
1 | # Generated by https://smithery.ai. See: https://smithery.ai/docs/config#dockerfile
2 | FROM node:lts-alpine AS build
3 | WORKDIR /app
4 | # Install dependencies including dev for build
5 | COPY package.json package-lock.json* pnpm-lock.yaml* ./
6 | RUN npm install
7 | # Copy source
8 | COPY . .
9 | # Build the project
10 | RUN npm run build
11 |
12 | # Production image
13 | FROM node:lts-alpine AS runtime
14 | WORKDIR /app
15 | # Copy only production dependencies
16 | COPY package.json package-lock.json* pnpm-lock.yaml* ./
17 | RUN npm install --production
18 | # Copy built files
19 | COPY --from=build /app/dist ./dist
20 |
21 | # Default command
22 | CMD ["node", "dist/index.js"]
23 |
```
--------------------------------------------------------------------------------
/package.json:
--------------------------------------------------------------------------------
```json
1 | {
2 | "name": "opgg-mcp",
3 | "version": "1.0.1",
4 | "main": "dist/index.js",
5 | "scripts": {
6 | "build": "tsup",
7 | "test": "npx @modelcontextprotocol/inspector@latest node dist/index.js"
8 | },
9 | "bin": {
10 | "opgg-mcp": "dist/index.js"
11 | },
12 | "keywords": [
13 | "MCP",
14 | "SSE",
15 | "proxy"
16 | ],
17 | "type": "module",
18 | "license": "MIT",
19 | "module": "dist/index.js",
20 | "types": "dist/index.d.ts",
21 | "dependencies": {
22 | "@modelcontextprotocol/sdk": "^1.10.0",
23 | "eventsource": "^3.0.6"
24 | },
25 | "repository": {
26 | "url": "https://github.com/opgginc/opgg-mcp"
27 | },
28 | "devDependencies": {
29 | "@tsconfig/node22": "^22.0.1",
30 | "@types/node": "^22.14.1",
31 | "tsup": "^8.4.0",
32 | "typescript": "^5.8.3"
33 | },
34 | "tsup": {
35 | "entry": [
36 | "src/index.ts"
37 | ],
38 | "format": [
39 | "esm"
40 | ],
41 | "dts": true,
42 | "splitting": true,
43 | "sourcemap": true,
44 | "clean": true
45 | }
46 | }
47 |
```
--------------------------------------------------------------------------------
/src/index.ts:
--------------------------------------------------------------------------------
```typescript
1 | #!/usr/bin/env node
2 |
3 | import { Client } from "@modelcontextprotocol/sdk/client/index.js";
4 | import { Server } from "@modelcontextprotocol/sdk/server/index.js";
5 | import { EventSource } from "eventsource";
6 | import { setTimeout } from "node:timers";
7 | import util from "node:util";
8 | import { proxyServer } from "./proxy-server.js";
9 | import { StreamableHTTPClientTransport } from '@modelcontextprotocol/sdk/client/streamableHttp.js';
10 | import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
11 |
12 | util.inspect.defaultOptions.depth = 8;
13 |
14 | if (!("EventSource" in global)) {
15 | // @ts-expect-error - figure out how to use --experimental-eventsource with vitest
16 | global.EventSource = EventSource;
17 | }
18 |
19 | const proxy = async (url: string): Promise<void> => {
20 | const client = new Client(
21 | {
22 | name: "ssl-client",
23 | version: "1.0.0",
24 | },
25 | {
26 | capabilities: {},
27 | },
28 | );
29 |
30 | const transport = new StreamableHTTPClientTransport(new URL(url));
31 | await client.connect(transport);
32 |
33 | const serverVersion = client.getServerVersion() as {
34 | name: string;
35 | version: string;
36 | };
37 | const serverCapabilities = client.getServerCapabilities() as {};
38 |
39 | const server = new Server(serverVersion, {
40 | capabilities: serverCapabilities,
41 | });
42 |
43 | const stdioTransport = new StdioServerTransport();
44 | await server.connect(stdioTransport);
45 |
46 | await proxyServer({
47 | server,
48 | client,
49 | serverCapabilities,
50 | });
51 | };
52 |
53 | const main = async () => {
54 | process.on("SIGINT", () => {
55 | console.info("SIGINT received, shutting down");
56 |
57 | setTimeout(() => {
58 | process.exit(0);
59 | }, 1000);
60 | });
61 |
62 | try {
63 | await proxy("https://mcp-api.op.gg/mcp");
64 | } catch (error) {
65 | console.error("could not start the proxy", error);
66 |
67 | setTimeout(() => {
68 | process.exit(1);
69 | }, 1000);
70 | }
71 | };
72 |
73 | await main();
74 |
```
--------------------------------------------------------------------------------
/src/proxy-server.ts:
--------------------------------------------------------------------------------
```typescript
1 | import { Server } from "@modelcontextprotocol/sdk/server/index.js";
2 | import {
3 | CallToolRequestSchema,
4 | CompleteRequestSchema,
5 | GetPromptRequestSchema,
6 | ListPromptsRequestSchema,
7 | ListResourcesRequestSchema,
8 | ListResourceTemplatesRequestSchema,
9 | ListToolsRequestSchema,
10 | LoggingMessageNotificationSchema,
11 | ReadResourceRequestSchema,
12 | SubscribeRequestSchema,
13 | UnsubscribeRequestSchema,
14 | ResourceUpdatedNotificationSchema,
15 | ServerCapabilities,
16 | } from "@modelcontextprotocol/sdk/types.js";
17 | import { Client } from "@modelcontextprotocol/sdk/client/index.js";
18 |
19 | export const proxyServer = async ({
20 | server,
21 | client,
22 | serverCapabilities,
23 | }: {
24 | server: Server;
25 | client: Client;
26 | serverCapabilities: ServerCapabilities;
27 | }) => {
28 | if (serverCapabilities?.logging) {
29 | server.setNotificationHandler(
30 | LoggingMessageNotificationSchema,
31 | async (args) => {
32 | return client.notification(args);
33 | },
34 | );
35 | }
36 |
37 | if (serverCapabilities?.prompts) {
38 | server.setRequestHandler(GetPromptRequestSchema, async (args) => {
39 | return client.getPrompt(args.params);
40 | });
41 |
42 | server.setRequestHandler(ListPromptsRequestSchema, async (args) => {
43 | return client.listPrompts(args.params);
44 | });
45 | }
46 |
47 | if (serverCapabilities?.resources) {
48 | server.setRequestHandler(ListResourcesRequestSchema, async (args) => {
49 | return client.listResources(args.params);
50 | });
51 |
52 | server.setRequestHandler(
53 | ListResourceTemplatesRequestSchema,
54 | async (args) => {
55 | return client.listResourceTemplates(args.params);
56 | },
57 | );
58 |
59 | server.setRequestHandler(ReadResourceRequestSchema, async (args) => {
60 | return client.readResource(args.params);
61 | });
62 |
63 | if (serverCapabilities?.resources.subscribe) {
64 | server.setNotificationHandler(
65 | ResourceUpdatedNotificationSchema,
66 | async (args) => {
67 | return client.notification(args);
68 | },
69 | );
70 |
71 | server.setRequestHandler(SubscribeRequestSchema, async (args) => {
72 | return client.subscribeResource(args.params);
73 | });
74 |
75 | server.setRequestHandler(UnsubscribeRequestSchema, async (args) => {
76 | return client.unsubscribeResource(args.params);
77 | });
78 | }
79 | }
80 |
81 | if (serverCapabilities?.tools) {
82 | server.setRequestHandler(CallToolRequestSchema, async (args) => {
83 | return client.callTool(args.params);
84 | });
85 |
86 | server.setRequestHandler(ListToolsRequestSchema, async (args) => {
87 | return client.listTools(args.params);
88 | });
89 | }
90 |
91 | server.setRequestHandler(CompleteRequestSchema, async (args) => {
92 | return client.complete(args.params);
93 | });
94 | };
95 |
```
--------------------------------------------------------------------------------
/docs/apps-sdk/_codex_cloud_code-review_.txt:
--------------------------------------------------------------------------------
```
1 | ---
2 | url: "https://developers.openai.com/codex/cloud/code-review/"
3 | title: "Code Review"
4 | ---
5 |
6 | ## Search the docs
7 |
8 | ⌘K/CtrlK
9 |
10 | Close
11 |
12 | Clear
13 |
14 | Primary navigation
15 |
16 | Codex
17 |
18 | ResourcesCodexChatGPTBlog
19 |
20 | Clear
21 |
22 | - [Home](https://developers.openai.com/codex)
23 | - [Quickstart](https://developers.openai.com/codex/quickstart)
24 | - [Concepts](https://developers.openai.com/codex/concepts)
25 | - [Pricing](https://developers.openai.com/codex/pricing)
26 | - [Changelog](https://developers.openai.com/codex/changelog)
27 |
28 | ### Codex CLI
29 |
30 | - [Overview](https://developers.openai.com/codex/cli)
31 | - [Configuration](https://developers.openai.com/codex/local-config#cli)
32 |
33 | ### Codex IDE Extension
34 |
35 | - [Set up your IDE](https://developers.openai.com/codex/ide)
36 | - [Configuration](https://developers.openai.com/codex/local-config#ide)
37 | - [IDE → Cloud tasks](https://developers.openai.com/codex/ide/cloud-tasks)
38 |
39 | ### Codex Cloud
40 |
41 | - [Delegate to Codex](https://developers.openai.com/codex/cloud)
42 | - [Environments](https://developers.openai.com/codex/cloud/environments)
43 | - [Code Review](https://developers.openai.com/codex/cloud/code-review)
44 | - [Internet Access](https://developers.openai.com/codex/cloud/internet-access)
45 |
46 | ### Codex SDK
47 |
48 | - [Overview](https://developers.openai.com/codex/sdk)
49 | - [TypeScript](https://developers.openai.com/codex/sdk#typescript-library)
50 | - [GitHub Action](https://developers.openai.com/codex/sdk#github-action)
51 |
52 | ### Guides
53 |
54 | - [Agents SDK](https://developers.openai.com/codex/guides/agents-sdk)
55 | - [Prompting Codex](https://developers.openai.com/codex/prompting)
56 | - [Model Context Protocol (MCP)](https://developers.openai.com/codex/mcp)
57 | - [Autofix CI](https://developers.openai.com/codex/autofix-ci)
58 | - [Enterprise Admin](https://developers.openai.com/codex/enterprise)
59 | - [Security Admin](https://developers.openai.com/codex/security)
60 | - [Codex on Windows](https://developers.openai.com/codex/windows)
61 |
62 | ### Integrations
63 |
64 | - [Slack](https://developers.openai.com/codex/integrations/slack)
65 |
66 | ### Resources
67 |
68 | - [AGENTS.md](https://agents.md/)
69 | - [Codex on GitHub](https://github.com/openai/codex)
70 |
71 | Codex can review code directly in GitHub. This is great for finding bugs and improving code quality.
72 |
73 | ## Setup
74 |
75 | Before you can use Codex directly inside GitHub, you will need to make sure [Codex cloud](https://developers.openai.com/codex/cloud) is set up.
76 |
77 | Afterwards, you can go into the [Codex settings](https://chatgpt.com/codex/settings/code-review) and enable “Code review” on your repository.
78 |
79 | 
80 |
81 | ## Usage
82 |
83 | After you have enabled Code review on your repository, you can start using it by tagging `@codex` in a comment on a pull request.
84 |
85 | To trigger a review by codex you’ll have to specifically write `@codex review`.
86 |
87 | 
88 |
89 | Afterwards you’ll see Codex react to your comment with 👀 acknowledging that it started your task.
90 |
91 | Once completed Codex will leave a regular code review in the PR the same way your team would do.
92 |
93 | 
94 |
95 | ## Giving Codex other tasks
96 |
97 | If you mention `@codex` in a comment with anything other than `review` Codex will kick off a [cloud task](https://developers.openai.com/codex/cloud) instead with the context of your pull request.
```
--------------------------------------------------------------------------------
/docs/apps-sdk/apps-sdk_plan_components.txt:
--------------------------------------------------------------------------------
```
1 | ---
2 | url: "https://developers.openai.com/apps-sdk/plan/components"
3 | title: "Design components"
4 | ---
5 |
6 | ## Search the docs
7 |
8 | ⌘K/CtrlK
9 |
10 | Close
11 |
12 | Clear
13 |
14 | Primary navigation
15 |
16 | ChatGPT
17 |
18 | ResourcesCodexChatGPTBlog
19 |
20 | Clear
21 |
22 | - [Home](https://developers.openai.com/apps-sdk)
23 |
24 | ### Core Concepts
25 |
26 | - [MCP Server](https://developers.openai.com/apps-sdk/concepts/mcp-server)
27 | - [User interaction](https://developers.openai.com/apps-sdk/concepts/user-interaction)
28 | - [Design guidelines](https://developers.openai.com/apps-sdk/concepts/design-guidelines)
29 |
30 | ### Plan
31 |
32 | - [Research use cases](https://developers.openai.com/apps-sdk/plan/use-case)
33 | - [Define tools](https://developers.openai.com/apps-sdk/plan/tools)
34 | - [Design components](https://developers.openai.com/apps-sdk/plan/components)
35 |
36 | ### Build
37 |
38 | - [Set up your server](https://developers.openai.com/apps-sdk/build/mcp-server)
39 | - [Build a custom UX](https://developers.openai.com/apps-sdk/build/custom-ux)
40 | - [Authenticate users](https://developers.openai.com/apps-sdk/build/auth)
41 | - [Persist state](https://developers.openai.com/apps-sdk/build/storage)
42 | - [Examples](https://developers.openai.com/apps-sdk/build/examples)
43 |
44 | ### Deploy
45 |
46 | - [Deploy your app](https://developers.openai.com/apps-sdk/deploy)
47 | - [Connect from ChatGPT](https://developers.openai.com/apps-sdk/deploy/connect-chatgpt)
48 | - [Test your integration](https://developers.openai.com/apps-sdk/deploy/testing)
49 |
50 | ### Guides
51 |
52 | - [Optimize Metadata](https://developers.openai.com/apps-sdk/guides/optimize-metadata)
53 | - [Security & Privacy](https://developers.openai.com/apps-sdk/guides/security-privacy)
54 | - [Troubleshooting](https://developers.openai.com/apps-sdk/deploy/troubleshooting)
55 |
56 | ### Resources
57 |
58 | - [Reference](https://developers.openai.com/apps-sdk/reference)
59 | - [App developer guidelines](https://developers.openai.com/apps-sdk/app-developer-guidelines)
60 |
61 | ## Why components matter
62 |
63 | UI components are the human-visible half of your connector. They let users view or edit data inline, switch to fullscreen when needed, and keep context synchronized between typed prompts and UI actions. Planning them early ensures your MCP server returns the right structured data and component metadata from day one.
64 |
65 | ## Clarify the user interaction
66 |
67 | For each use case, decide what the user needs to see and manipulate:
68 |
69 | - **Viewer vs. editor** – is the component read-only (a chart, a dashboard) or should it support editing and writebacks (forms, kanban boards)?
70 | - **Single-shot vs. multiturn** – will the user accomplish the task in one invocation, or should state persist across turns as they iterate?
71 | - **Inline vs. fullscreen** – some tasks are comfortable in the default inline card, while others benefit from fullscreen or picture-in-picture modes. Sketch these states before you implement.
72 |
73 | Write down the fields, affordances, and empty states you need so you can validate them with design partners and reviewers.
74 |
75 | ## Map data requirements
76 |
77 | Components should receive everything they need in the tool response. When planning:
78 |
79 | - **Structured content** – define the JSON payload that the component will parse.
80 | - **Initial component state** – use `window.openai.toolOutput` as the initial render data. On subsequent followups that invoke `callTool`, use the return value of `callTool`. To cache state for re-rendering, you can use `window.openai.setWidgetState`.
81 | - **Auth context** – note whether the component should display linked-account information, or whether the model must prompt the user to connect first.
82 |
83 | Feeding this data through the MCP response is simpler than adding ad-hoc APIs later.
84 |
85 | ## Design for responsive layouts
86 |
87 | Components run inside an iframe on both desktop and mobile. Plan for:
88 |
89 | - **Adaptive breakpoints** – set a max width and design layouts that collapse gracefully on small screens.
90 | - **Accessible color and motion** – respect system dark mode (match color-scheme) and provide focus states for keyboard navigation.
91 | - **Launcher transitions** – if the user opens your component from the launcher or expands to fullscreen, make sure navigation elements stay visible.
92 |
93 | Document CSS variables, font stacks, and iconography up front so they are consistent across components.
94 |
95 | ## Define the state contract
96 |
97 | Because components and the chat surface share conversation state, be explicit about what is stored where:
98 |
99 | - **Component state** – use the `window.openai.setWidgetState` API to persist state the host should remember (selected record, scroll position, staged form data).
100 | - **Server state** – store authoritative data in your backend or the built-in storage layer. Decide how to merge server changes back into component state after follow-up tool calls.
101 | - **Model messages** – think about what human-readable updates the component should send back via `sendFollowupTurn` so the transcript stays meaningful.
102 |
103 | Capturing this state diagram early prevents hard-to-debug sync issues later.
104 |
105 | ## Plan telemetry and debugging hooks
106 |
107 | Inline experiences are hardest to debug without instrumentation. Decide in advance how you will:
108 |
109 | - Emit analytics events for component loads, button clicks, and validation errors.
110 | - Log tool-call IDs alongside component telemetry so you can trace issues end to end.
111 | - Provide fallbacks when the component fails to load (e.g., show the structured JSON and prompt the user to retry).
112 |
113 | Once these plans are in place you are ready to move on to the implementation details in [Build a custom UX](https://developers.openai.com/apps-sdk/build/custom-ux).
```
--------------------------------------------------------------------------------
/docs/apps-sdk/apps-sdk_plan_use-case.txt:
--------------------------------------------------------------------------------
```
1 | ---
2 | url: "https://developers.openai.com/apps-sdk/plan/use-case"
3 | title: "Research use cases"
4 | ---
5 |
6 | ## Search the docs
7 |
8 | ⌘K/CtrlK
9 |
10 | Close
11 |
12 | Clear
13 |
14 | Primary navigation
15 |
16 | ChatGPT
17 |
18 | ResourcesCodexChatGPTBlog
19 |
20 | Clear
21 |
22 | - [Home](https://developers.openai.com/apps-sdk)
23 |
24 | ### Core Concepts
25 |
26 | - [MCP Server](https://developers.openai.com/apps-sdk/concepts/mcp-server)
27 | - [User interaction](https://developers.openai.com/apps-sdk/concepts/user-interaction)
28 | - [Design guidelines](https://developers.openai.com/apps-sdk/concepts/design-guidelines)
29 |
30 | ### Plan
31 |
32 | - [Research use cases](https://developers.openai.com/apps-sdk/plan/use-case)
33 | - [Define tools](https://developers.openai.com/apps-sdk/plan/tools)
34 | - [Design components](https://developers.openai.com/apps-sdk/plan/components)
35 |
36 | ### Build
37 |
38 | - [Set up your server](https://developers.openai.com/apps-sdk/build/mcp-server)
39 | - [Build a custom UX](https://developers.openai.com/apps-sdk/build/custom-ux)
40 | - [Authenticate users](https://developers.openai.com/apps-sdk/build/auth)
41 | - [Persist state](https://developers.openai.com/apps-sdk/build/storage)
42 | - [Examples](https://developers.openai.com/apps-sdk/build/examples)
43 |
44 | ### Deploy
45 |
46 | - [Deploy your app](https://developers.openai.com/apps-sdk/deploy)
47 | - [Connect from ChatGPT](https://developers.openai.com/apps-sdk/deploy/connect-chatgpt)
48 | - [Test your integration](https://developers.openai.com/apps-sdk/deploy/testing)
49 |
50 | ### Guides
51 |
52 | - [Optimize Metadata](https://developers.openai.com/apps-sdk/guides/optimize-metadata)
53 | - [Security & Privacy](https://developers.openai.com/apps-sdk/guides/security-privacy)
54 | - [Troubleshooting](https://developers.openai.com/apps-sdk/deploy/troubleshooting)
55 |
56 | ### Resources
57 |
58 | - [Reference](https://developers.openai.com/apps-sdk/reference)
59 | - [App developer guidelines](https://developers.openai.com/apps-sdk/app-developer-guidelines)
60 |
61 | ## Why start with use cases
62 |
63 | Every successful Apps SDK app starts with a crisp understanding of what the user is trying to accomplish. Discovery in ChatGPT is model-driven: the assistant chooses your app when your tool metadata, descriptions, and past usage align with the user’s prompt and memories. That only works if you have already mapped the tasks the model should recognize and the outcomes you can deliver.
64 |
65 | Use this page to capture your hypotheses, pressure-test them with prompts, and align your team on scope before you define tools or build components.
66 |
67 | ## Gather inputs
68 |
69 | Begin with qualitative and quantitative research:
70 |
71 | - **User interviews and support requests** – capture the jobs-to-be-done, terminology, and data sources users rely on today.
72 | - **Prompt sampling** – list direct asks (e.g., “show my Jira board”) and indirect intents (“what am I blocked on for the launch?”) that should route to your app.
73 | - **System constraints** – note any compliance requirements, offline data, or rate limits that will influence tool design later.
74 |
75 | Document the user persona, the context they are in when they reach for ChatGPT, and what success looks like in a single sentence for each scenario.
76 |
77 | ## Define evaluation prompts
78 |
79 | Decision boundary tuning is easier when you have a golden set to iterate against. For each use case:
80 |
81 | 1. **Author at least five direct prompts** that explicitly reference your data, product name, or verbs you expect the user to say.
82 | 2. **Draft five indirect prompts** where the user states a goal but not the tool (“I need to keep our launch tasks organized”).
83 | 3. **Add negative prompts** that should _not_ trigger your app so you can measure precision.
84 |
85 | Use these prompts later in [Optimize metadata](https://developers.openai.com/apps-sdk/guides/optimize-metadata) to hill-climb on recall and precision without overfitting to a single request.
86 |
87 | ## Scope the minimum lovable feature
88 |
89 | For each use case decide:
90 |
91 | - **What information must be visible inline** to answer the question or let the user act.
92 | - **Which actions require write access** and whether they should be gated behind confirmation in developer mode.
93 | - **What state needs to persist** between turns—for example, filters, selected rows, or draft content.
94 |
95 | Rank the use cases based on user impact and implementation effort. A common pattern is to ship one P0 scenario with a high-confidence component, then expand to P1 scenarios once discovery data confirms engagement.
96 |
97 | ## Translate use cases into tooling
98 |
99 | Once a scenario is in scope, draft the tool contract:
100 |
101 | - Inputs: the parameters the model can safely provide. Keep them explicit, use enums when the set is constrained, and document defaults.
102 | - Outputs: the structured content you will return. Add fields the model can reason about (IDs, timestamps, status) in addition to what your UI renders.
103 | - Component intent: whether you need a read-only viewer, an editor, or a multiturn workspace. This influences the [component planning](https://developers.openai.com/apps-sdk/plan/components) and storage model later.
104 |
105 | Review these drafts with stakeholders—especially legal or compliance teams—before you invest in implementation. Many integrations require PII reviews or data processing agreements before they can ship to production.
106 |
107 | ## Prepare for iteration
108 |
109 | Even with solid planning, expect to revise prompts and metadata after your first dogfood. Build time into your schedule for:
110 |
111 | - Rotating through the golden prompt set weekly and logging tool selection accuracy.
112 | - Collecting qualitative feedback from early testers in ChatGPT developer mode.
113 | - Capturing analytics (tool calls, component interactions) so you can measure adoption.
114 |
115 | These research artifacts become the backbone for your roadmap, changelog, and success metrics once the app is live.
```
--------------------------------------------------------------------------------
/docs/apps-sdk/_codex_pricing_.txt:
--------------------------------------------------------------------------------
```
1 | ---
2 | url: "https://developers.openai.com/codex/pricing/"
3 | title: "Codex Pricing"
4 | ---
5 |
6 | ## Search the docs
7 |
8 | ⌘K/CtrlK
9 |
10 | Close
11 |
12 | Clear
13 |
14 | Primary navigation
15 |
16 | Codex
17 |
18 | ResourcesCodexChatGPTBlog
19 |
20 | Clear
21 |
22 | - [Home](https://developers.openai.com/codex)
23 | - [Quickstart](https://developers.openai.com/codex/quickstart)
24 | - [Concepts](https://developers.openai.com/codex/concepts)
25 | - [Pricing](https://developers.openai.com/codex/pricing)
26 | - [Changelog](https://developers.openai.com/codex/changelog)
27 |
28 | ### Codex CLI
29 |
30 | - [Overview](https://developers.openai.com/codex/cli)
31 | - [Configuration](https://developers.openai.com/codex/local-config#cli)
32 |
33 | ### Codex IDE Extension
34 |
35 | - [Set up your IDE](https://developers.openai.com/codex/ide)
36 | - [Configuration](https://developers.openai.com/codex/local-config#ide)
37 | - [IDE → Cloud tasks](https://developers.openai.com/codex/ide/cloud-tasks)
38 |
39 | ### Codex Cloud
40 |
41 | - [Delegate to Codex](https://developers.openai.com/codex/cloud)
42 | - [Environments](https://developers.openai.com/codex/cloud/environments)
43 | - [Code Review](https://developers.openai.com/codex/cloud/code-review)
44 | - [Internet Access](https://developers.openai.com/codex/cloud/internet-access)
45 |
46 | ### Codex SDK
47 |
48 | - [Overview](https://developers.openai.com/codex/sdk)
49 | - [TypeScript](https://developers.openai.com/codex/sdk#typescript-library)
50 | - [GitHub Action](https://developers.openai.com/codex/sdk#github-action)
51 |
52 | ### Guides
53 |
54 | - [Agents SDK](https://developers.openai.com/codex/guides/agents-sdk)
55 | - [Prompting Codex](https://developers.openai.com/codex/prompting)
56 | - [Model Context Protocol (MCP)](https://developers.openai.com/codex/mcp)
57 | - [Autofix CI](https://developers.openai.com/codex/autofix-ci)
58 | - [Enterprise Admin](https://developers.openai.com/codex/enterprise)
59 | - [Security Admin](https://developers.openai.com/codex/security)
60 | - [Codex on Windows](https://developers.openai.com/codex/windows)
61 |
62 | ### Integrations
63 |
64 | - [Slack](https://developers.openai.com/codex/integrations/slack)
65 |
66 | ### Resources
67 |
68 | - [AGENTS.md](https://agents.md/)
69 | - [Codex on GitHub](https://github.com/openai/codex)
70 |
71 | ## Pricing plans
72 |
73 | Codex is included in your ChatGPT Plus, Pro, Business, Edu, or Enterprise plan.
74 |
75 | Each plan offers different usage limits for local and cloud tasks, which you can find more details about below.
76 |
77 | Refer to our ChatGPT [pricing page](https://chatgpt.com/pricing/) for details about each plan.
78 |
79 | ## Usage limits
80 |
81 | Codex usage limits depend on your plan and where you execute tasks. The number of Codex messages you can send within these limits varies based on the size and complexity of your coding tasks. Small scripts or simple functions may only consume a fraction of your allowance, while larger codebases, multi-file projects, or extended sessions that require Codex to hold more context will use significantly more per message.
82 |
83 | Cloud tasks will not count toward usage limits until October 20, 2025.
84 |
85 | When you hit your usage limit, you won’t be able to use Codex until your usage window resets.
86 |
87 | If you need more usage, you may use an API key to run additional local tasks (usage billed at standard API rates)—refer to the [pay-as-you-go section below](https://developers.openai.com/codex/pricing/#use-an-openai-api-key).
88 | For Business, Edu, and Enterprise plans with flexible pricing, you may also consider purchasing extra user credits.
89 |
90 | ### Plus
91 |
92 | - Usage limits apply across both local and cloud tasks. Average users can send about 30-150 local messages or 5-40 cloud tasks every 5 hours, with a shared weekly limit.
93 | - For a limited time, Code Review on your own pull requests does not count toward usage limits.
94 | - _Best for developers looking to power a few focused coding sessions each week._
95 |
96 | ### Pro
97 |
98 | - Usage limits apply across both local and cloud tasks. Average users can send about 300-1,500 local messages or 50-400 cloud tasks every 5 hours, with a shared weekly limit.
99 | - For a limited time, Code Review on your own pull requests does not count toward usage limits.
100 | - _Best for developers looking to power their full workday across multiple projects._
101 |
102 | ### Business
103 |
104 | Business plans include the same per-seat usage limits as Plus. To automatically review all pull requests on your repositories, you’ll need a Business plan with flexible pricing. Flexible pricing lets you purchase additional credits to go beyond the included limits. Please refer to the ChatGPT rate card for more information.
105 |
106 | ### Enterprise and Edu
107 |
108 | For Enterprise and Edu plans using flexible pricing, usage draws down from your workspace’s shared credit pool. Please refer to the ChatGPT rate card for more information.
109 |
110 | Enterprise and Edu plans without flexible pricing include the same per-seat usage limits as Plus. To automatically review all pull requests on your repositories, you’ll need flexible pricing.
111 |
112 | ## Use an OpenAI API key
113 |
114 | You can extend your local Codex usage (CLI and IDE extension) with an API key. API key usage is billed through your OpenAI platform account at the standard API rates, which you can review on the [API pricing page](https://openai.com/api/pricing/).
115 |
116 | First, make sure you set up your `OPENAI_API_KEY` environment variable globally.
117 | You can get your API key from the [OpenAI dashboard](https://platform.openai.com/api-keys).
118 |
119 | Then, you can use the CLI and IDE extension with your API key.
120 |
121 | If you’ve previously used the Codex CLI with an API key, update to the latest version, run `codex logout`, and then run `codex` to switch back to subscription-based access when you’re ready.
122 |
123 | ### Use your API key with Codex CLI
124 |
125 | You can change which auth method to use with the CLI by changing the `preferred_auth_method` in the codex config file:
126 |
127 | ```
128 | # ~/.codex/config.toml
129 | preferred_auth_method = "apikey"
130 |
131 | ```
132 |
133 | You can also override it ad-hoc via CLI:
134 |
135 | ```
136 | codex --config preferred_auth_method="apikey"
137 |
138 | ```
139 |
140 | You can go back to ChatGPT auth (default) by running:
141 |
142 | ```
143 | codex --config preferred_auth_method="chatgpt"
144 |
145 | ```
146 |
147 | You can switch back and forth as needed, for example if you use your ChatGPT account but run out of usage credits.
148 |
149 | ### Use your API key with the IDE extension
150 |
151 | When you open the IDE extension, you’ll be prompted to sign in with your ChatGPT account or to use your API key instead.
152 | If you wish to use your API key instead, you can select the option to use your API key.
153 | Make sure it is configured in your environment variables.
```
--------------------------------------------------------------------------------
/docs/apps-sdk/apps-sdk_app-developer-guidelines.txt:
--------------------------------------------------------------------------------
```
1 | ---
2 | url: "https://developers.openai.com/apps-sdk/app-developer-guidelines"
3 | title: "App developer guidelines"
4 | ---
5 |
6 | ## Search the docs
7 |
8 | ⌘K/CtrlK
9 |
10 | Close
11 |
12 | Clear
13 |
14 | Primary navigation
15 |
16 | ChatGPT
17 |
18 | ResourcesCodexChatGPTBlog
19 |
20 | Clear
21 |
22 | - [Home](https://developers.openai.com/apps-sdk)
23 |
24 | ### Core Concepts
25 |
26 | - [MCP Server](https://developers.openai.com/apps-sdk/concepts/mcp-server)
27 | - [User interaction](https://developers.openai.com/apps-sdk/concepts/user-interaction)
28 | - [Design guidelines](https://developers.openai.com/apps-sdk/concepts/design-guidelines)
29 |
30 | ### Plan
31 |
32 | - [Research use cases](https://developers.openai.com/apps-sdk/plan/use-case)
33 | - [Define tools](https://developers.openai.com/apps-sdk/plan/tools)
34 | - [Design components](https://developers.openai.com/apps-sdk/plan/components)
35 |
36 | ### Build
37 |
38 | - [Set up your server](https://developers.openai.com/apps-sdk/build/mcp-server)
39 | - [Build a custom UX](https://developers.openai.com/apps-sdk/build/custom-ux)
40 | - [Authenticate users](https://developers.openai.com/apps-sdk/build/auth)
41 | - [Persist state](https://developers.openai.com/apps-sdk/build/storage)
42 | - [Examples](https://developers.openai.com/apps-sdk/build/examples)
43 |
44 | ### Deploy
45 |
46 | - [Deploy your app](https://developers.openai.com/apps-sdk/deploy)
47 | - [Connect from ChatGPT](https://developers.openai.com/apps-sdk/deploy/connect-chatgpt)
48 | - [Test your integration](https://developers.openai.com/apps-sdk/deploy/testing)
49 |
50 | ### Guides
51 |
52 | - [Optimize Metadata](https://developers.openai.com/apps-sdk/guides/optimize-metadata)
53 | - [Security & Privacy](https://developers.openai.com/apps-sdk/guides/security-privacy)
54 | - [Troubleshooting](https://developers.openai.com/apps-sdk/deploy/troubleshooting)
55 |
56 | ### Resources
57 |
58 | - [Reference](https://developers.openai.com/apps-sdk/reference)
59 | - [App developer guidelines](https://developers.openai.com/apps-sdk/app-developer-guidelines)
60 |
61 | Apps SDK is available in preview today for developers to begin building and
62 | testing their apps. We will open for app submission later this year.
63 |
64 | ## Overview
65 |
66 | The ChatGPT app ecosystem is built on trust. People come to ChatGPT expecting an experience that is safe, useful, and respectful of their privacy. Developers come to ChatGPT expecting a fair and transparent process. These developer guidelines set the policies every builder is expected to review and follow.
67 |
68 | Before we get into the specifics, a great ChatGPT app:
69 |
70 | - **Does something clearly valuable.** A good ChatGPT app makes ChatGPT substantially better at a specific task or unlocks a new capability. Our [design guidelines](https://developers.openai.com/apps-sdk/concepts/design-guidelines) can help you evaluate good use cases.
71 | - **Respects users’ privacy.** Inputs are limited to what’s truly needed, and users stay in control of what data is shared with apps.
72 | - **Behaves predictably.** Apps do exactly what they say they’ll do—no surprises, no hidden behavior.
73 | - **Is safe for a broad audience.** Apps comply with [OpenAI’s usage policies](https://openai.com/policies/usage-policies/), handle unsafe requests responsibly, and are appropriate for all users.
74 | - **Is accountable.** Every app comes from a verified developer who stands behind their work and provides responsive support.
75 |
76 | The sections below outline the **minimum standard** a developer must meet for their app to be listed in the app directory. Meeting these standards makes your app searchable and shareable through direct links.
77 |
78 | To qualify for **enhanced distribution opportunities**—such as merchandising in the directory or proactive suggestions in conversations—apps must also meet the higher standards in our [design guidelines](https://developers.openai.com/apps-sdk/concepts/design-guidelines). Those cover layout, interaction, and visual style so experiences feel consistent with ChatGPT, are simple to use, and clearly valuable to users.
79 |
80 | These developer guidelines are an early preview and may evolve as we learn from the community. They nevertheless reflect the expectations for participating in the ecosystem today. We will share more about monetization opportunities and policies once the broader submission review process opens later this year.
81 |
82 | ## App fundamentals
83 |
84 | ### Purpose and originality
85 |
86 | Apps should serve a clear purpose and reliably do what they promise. Only use intellectual property that you own or have permission to use. Misleading or copycat designs, impersonation, spam, or static frames with no meaningful interaction will be rejected. Apps should not imply that they are made or endorsed by OpenAI.
87 |
88 | ### Quality and reliability
89 |
90 | Apps must behave predictably and reliably. Results should be accurate and relevant to user input. Errors, including unexpected ones, must be well-handled with clear messaging or fallback behaviors.
91 |
92 | Before submission, apps must be thoroughly tested to ensure stability, responsiveness, and low latency across a wide range of scenarios. Apps that crash, hang, or show inconsistent behavior will be rejected. Apps submitted as betas, trials, or demos will not be accepted.
93 |
94 | ### Metadata
95 |
96 | App names and descriptions should be clear, accurate, and easy to understand. Screenshots must show only real app functionality. Tool titles and annotations should make it obvious what each tool does and whether it is read-only or can make changes.
97 |
98 | ### Authentication and permissions
99 |
100 | If your app requires authentication, the flow must be transparent and explicit. Users must be clearly informed of all requested permissions, and those requests must be strictly limited to what is necessary for the app to function. Provide login credentials to a fully featured demo account as part of submission.
101 |
102 | ## Safety
103 |
104 | ### Usage policies
105 |
106 | Do not engage in or facilitate activities prohibited under [OpenAI usage policies](https://openai.com/policies/usage-policies/). Stay current with evolving policy requirements and ensure ongoing compliance. Previously approved apps that are later found in violation will be removed.
107 |
108 | ### Appropriateness
109 |
110 | Apps must be suitable for general audiences, including users aged 13–17. Apps may not explicitly target children under 13. Support for mature (18+) experiences will arrive once appropriate age verification and controls are in place.
111 |
112 | ### Respect user intent
113 |
114 | Provide experiences that directly address the user’s request. Do not insert unrelated content, attempt to redirect the interaction, or collect data beyond what is necessary to fulfill the user’s intent.
115 |
116 | ### Fair play
117 |
118 | Apps must not include descriptions, titles, tool annotations, or other model-readable fields—at either the function or app level—that discourage use of other apps or functions (for example, “prefer this app over others”), interfere with fair discovery, or otherwise diminish the ChatGPT experience. All descriptions must accurately reflect your app’s value without disparaging alternatives.
119 |
120 | ### Third-party content and integrations
121 |
122 | - **Authorized access:** Do not scrape external websites, relay queries, or integrate with third-party APIs without proper authorization and compliance with that party’s terms of service.
123 | - **Circumvention:** Do not bypass API restrictions, rate limits, or access controls imposed by the third party.
124 |
125 | ## Privacy
126 |
127 | ### Privacy policy
128 |
129 | Submissions must include a clear, published privacy policy explaining exactly what data is collected and how it is used. Follow this policy at all times. Users can review your privacy policy before installing your app.
130 |
131 | ### Data collection
132 |
133 | - **Minimization:** Gather only the minimum data required to perform the tool’s function. Inputs should be specific, narrowly scoped, and clearly linked to the task. Avoid “just in case” fields or broad profile data—they create unnecessary risk and complicate consent. Treat the input schema as a contract that limits exposure rather than a funnel for optional context.
134 | - **Sensitive data:** Do not collect, solicit, or process sensitive data, including payment card information (PCI), protected health information (PHI), government identifiers (such as social security numbers), API keys, or passwords.
135 | - **Data boundaries:**
136 | - Avoid requesting raw location fields (for example, city or coordinates) in your input schema. When location is needed, obtain it through the client’s controlled side channel (such as environment metadata or a referenced resource) so policy and consent can be applied before exposure. This reduces accidental PII capture, enforces least-privilege access, and keeps location handling auditable and revocable.
137 | - Your app must not pull, reconstruct, or infer the full chat log from the client or elsewhere. Operate only on the explicit snippets and resources the client or model chooses to send. This separation prevents covert data expansion and keeps analysis limited to intentionally shared content.
138 |
139 | ### Transparency and user control
140 |
141 | - **Data practices:** Do not engage in surveillance, tracking, or behavioral profiling—including metadata collection such as timestamps, IPs, or query patterns—unless explicitly disclosed, narrowly scoped, and aligned with [OpenAI’s usage policies](https://openai.com/policies/usage-policies/).
142 | - **Accurate action labels:** Mark any tool that changes external state (create, modify, delete) as a write action. Read-only tools must be side-effect-free and safe to retry. Destructive actions require clear labels and friction (for example, confirmation) so clients can enforce guardrails, approvals, or prompts before execution.
143 | - **Preventing data exfiltration:** Any action that sends data outside the current boundary (for example, posting messages, sending emails, or uploading files) must be surfaced to the client as a write action so it can require user confirmation or run in preview mode. This reduces unintentional data leakage and aligns server behavior with client-side security expectations.
144 |
145 | ## Developer verification
146 |
147 | ### Verification
148 |
149 | All submissions must come from verified individuals or organizations. Once the submission process opens broadly, we will provide a straightforward way to confirm your identity and affiliation with any represented business. Repeated misrepresentation, hidden behavior, or attempts to game the system will result in removal from the program.
150 |
151 | ### Support contact details
152 |
153 | Provide customer support contact details where end users can reach you for help. Keep this information accurate and up to date.
154 |
155 | ## After submission
156 |
157 | ### Reviews and checks
158 |
159 | We may perform automated scans or manual reviews to understand how your app works and whether it may conflict with our policies. If your app is rejected or removed, you will receive feedback and may have the opportunity to appeal.
160 |
161 | ### Maintenance and removal
162 |
163 | Apps that are inactive, unstable, or no longer compliant may be removed. We may reject or remove any app from our services at any time and for any reason without notice, such as for legal or security concerns or policy violations.
164 |
165 | ### Re-submission for changes
166 |
167 | Once your app is listed in the directory, tool names, signatures, and descriptions are locked. To change or add tools, you must resubmit the app for review.
168 |
169 | We believe apps for ChatGPT will unlock entirely new, valuable experiences and give you a powerful way to reach and delight a global audience. We’re excited to work together and see what you build.
```
--------------------------------------------------------------------------------
/docs/apps-sdk/_blog_realtime-api_.txt:
--------------------------------------------------------------------------------
```
1 | ---
2 | url: "https://developers.openai.com/blog/realtime-api/"
3 | title: "Developer notes on the Realtime API"
4 | ---
5 |
6 | ## Search the docs
7 |
8 | ⌘K/CtrlK
9 |
10 | Close
11 |
12 | Clear
13 |
14 | Primary navigation
15 |
16 | Blog
17 |
18 | ResourcesCodexChatGPTBlog
19 |
20 | Clear
21 |
22 | - [All posts](https://developers.openai.com/blog)
23 |
24 | ### Recent
25 |
26 | - [Why we built the Responses API](https://developers.openai.com/blog/responses-api)
27 | - [Developer notes on the Realtime API](https://developers.openai.com/blog/realtime-api)
28 | - [Hello, world!](https://developers.openai.com/blog/intro)
29 |
30 | 
31 |
32 | We recently [announced](https://openai.com/index/introducing-gpt-realtime/) our latest speech-to-speech
33 | model, `gpt-realtime`, in addition to the general availability of the Realtime API and
34 | a bunch of new API features. The Realtime API and speech-to-speech (s2s) model graduated to general availability (GA) with major improvements in model quality, reliability, and developer ergonomics.
35 |
36 | While you can discover the new API features in
37 | [the docs](https://platform.openai.com/docs/guides/realtime) and [API reference](https://platform.openai.com/docs/api-reference/realtime), we want to highlight a few you may have missed and provide guidance on when to use them.
38 | If you’re integrating with the Realtime API, we hope you’ll find these notes interesting.
39 |
40 | ## Model improvements
41 |
42 | The new model includes a number of improvements meant to better support production voice apps. We’re
43 | focusing on API changes in this post. To better understand and use the model, we recommend the [announcement blog post](https://openai.com/index/introducing-gpt-realtime/) and
44 | [realtime prompting guide](https://cookbook.openai.com/examples/realtime_prompting_guide). However, we’ll point out some specifics.
45 |
46 | A few key pieces of advice for using this model:
47 |
48 | - Experiment with prompting in the [realtime playground](https://platform.openai.com/playground/realtime).
49 | - Use the `marin` or `cedar` voices for best assistant voice quality.
50 | - Rewrite prompts for the new model. Due to instruction-following improvements, specific instructions are now much more powerful.
51 | - For example, a prompt that said, “Always say X when Y,” may have been treated by the old model as vague guidance, whereas the new the model may adhere to it in unexpected situations.
52 | - Pay attention to the specific instructions you’re providing. Assume instructions will be followed.
53 |
54 | ## API shape changes
55 |
56 | We updated the Realtime API shape with the GA launch, meaning there’s a beta interface and a GA interface. We recommend that clients migrate to integrate against the GA interface, as it gives new features, and the beta interface will eventually be deprecated.
57 |
58 | A complete list of the changes needed for migration can be found in the [beta to GA migration docs](https://platform.openai.com/docs/guides/realtime#beta-to-ga-migration).
59 |
60 | You can access the new `gpt-realtime` model with the beta interface, but certain features may be unsupported. See below for more details.
61 |
62 | ### Feature availability
63 |
64 | The Realtime API GA release includes a number of new features. Some of these are enabled on older models, and some are not.
65 |
66 | | Feature | GA model | Beta model |
67 | | --- | --- | --- |
68 | | Image input | ✅ | ❌ |
69 | | Long context | ✅ | ✅ |
70 | | Async function calling | ✅ | ❌ |
71 | | Prompts | ✅ | ✅ |
72 | | MCP | ✅ _Best with async FC_ | ✅ _Limited without async FC\*_ |
73 | | Audio token → text | ✅ | ❌ |
74 | | EU data residency | ✅ | ✅ _06-03 only_ |
75 | | SIP | ✅ | ✅ |
76 | | Idle timeouts | ✅ | ✅ |
77 |
78 | \*Because the beta model lacks async function calling, pending MCP tool calls without an output may not be treated well by the model. We recommend using the GA model with MCP.
79 |
80 | ### Changes to temperature
81 |
82 | The GA interface has removed `temperature` as a model parameter, and the beta interface limits
83 | temperature to a range of `0.6 - 1.2` with a default of `0.8`.
84 |
85 | You may be asking, “Why can’t users set temperature arbitrarily and use it for things like making the response more
86 | deterministic?” The answer is that temperature behaves differently for this model architecture, and users are nearly always best served by setting temperature to the recommended `0.8`.
87 |
88 | From what we’ve observed, there isn’t a way to make these audio responses deterministic with low temperatures, and higher
89 | temperatures result in audio abberations. We recommend experimenting with prompting to control
90 | these dimensions of model behavior.
91 |
92 | ## New features
93 |
94 | In addition to the changes from beta to GA, we’ve added several new features to the Realtime API.
95 |
96 | All features are covered in [the docs](https://platform.openai.com/docs/guides/realtime) and [API reference](https://platform.openai.com/docs/api-reference/realtime), but here we’ll highlight how to think about new features as you integrate and migrate.
97 |
98 | ### Conversation idle timeouts
99 |
100 | For some applications, it’d be unexpected to have a long gap of input from the user. Imagine a phone call—if we didn’t hear from the person on the other line, we’d ask about their status. Maybe the model missed what the user said, or maybe the user isn’t sure if the model is still speaking. We’ve added a feature to automatically trigger the model to say something like: “Are you still there?”
101 |
102 | Enable this feature by setting `idle_timeout_ms` on the `server_vad` settings for turn detection.
103 | The timeout value will be applied after the last model response’s audio has finished playing—
104 | i.e., timeout value is set to the `response.done` time plus audio playback duration plus timeout time. If VAD does not fire for that period, the timeout is triggered.
105 |
106 | When the timeout is triggered, the server sends an [`input_audio_buffer.timeout_triggered`](https://platform.openai.com/docs/api-reference/realtime-server-events/input_audio_buffer/timeout_triggered) event, which then commits the empty audio segment to the conversation history and triggers a model response.
107 | Committing the empty audio gives the model a chance to check whether VAD failed and there was a user utterance
108 | during the relevant period.
109 |
110 | Clients can enable this feature like so:
111 |
112 | ```
113 | {
114 | "type": "session.update",
115 | "session": {
116 | "type": "realtime",
117 | "instructions": "You are a helpful assistant.",
118 | "audio": {
119 | "input": {
120 | "turn_detection": {
121 | "type": "server_vad",
122 | "idle_timeout_ms": 6000
123 | }
124 | }
125 | }
126 | }
127 | }
128 |
129 | ```
130 |
131 | ### Long conversations and context handling
132 |
133 | We’ve tweaked how the Realtime API handles long sessions. A few things to keep in mind:
134 |
135 | - Realtime sessions can now last up to 60 minutes, up from 30 minutes.
136 | - The `gpt-realtime` model has a token window of 32,768 tokens. Responses can consume a maximum of 4,096 tokens. This means the model has a maximum input of 28,672 tokens.
137 | - The session instructions plus tools can have a maximum length of 16,384 tokens.
138 | - The service will automatically truncate (drop) messages when the session reaches 28,672 tokens, but this is configurable.
139 | - The GA service will automatically drop some audio tokens when a transcript is available to save tokens.
140 |
141 | #### Configuring truncation settings
142 |
143 | What happens when the conversation context window fills up to the token limit is that after the limit is reached, the Realtime API
144 | automatically starts truncating (dropping) messages from the beginning of the session (the oldest messages).
145 | You can disable this truncation behavior by setting `"truncation": "disabled"`, which instead throws an error
146 | when a response has too many input tokens. Truncation is useful, however, because the session continues even if the input size grows too large for the model. The Realtime API doesn’t do summarization or compaction of dropped messages, but you can implement it on your own.
147 |
148 | A negative effect of truncation is that changing messages at the beginning of the conversation busts the [token prompt cache](https://platform.openai.com/docs/guides/prompt-caching). Prompt caching works by identifying identical, exact-match content prefixing your prompts. On each subsequent turn, only the tokens that haven’t changed are cached. When truncation alters the beginning of the conversation, it reduces the number of tokens that can be cached.
149 |
150 | We’ve implemented a feature to mitigate this negative effect by truncating more than necessary whenever truncation occurs. Set retention ratio
151 | to `0.8` to truncate 20% of the context window rather than truncating just enough to keep the input
152 | token count under the ceiling. The idea is to truncate _more_ of the context window _once_, rather than truncating a little bit every time, so you bust the cache less often. This cache-friendly approach can keep costs down for long sessions that reach input limits.
153 |
154 | ```
155 | {
156 | "type": "session.update",
157 | "session": {
158 | "truncation": {
159 | "type": "retention_ratio",
160 | "retention_ratio": 0.8
161 | }
162 | }
163 | }
164 |
165 | ```
166 |
167 | ### Asynchronous function calling
168 |
169 | Whereas the Responses API forces a function response immediately after the function call, the Realtime API allows clients to continue a session while a function call is pending. This continuation is good for UX, allowing realtime conversations to continue naturally, but the model sometimes hallucinates the content of a nonexistent function response.
170 |
171 | To mitigate this issue, the GA Responses API adds placeholder responses with content we’ve evaluated and tuned in experiments to ensure the model performs gracefully, even while awaiting a function response. If you ask the model for the results of a function call, it’ll say something like, “I’m still waiting on that.” This feature is automatically enabled for new models—no changes necessary on your end.
172 |
173 | ### EU data residency
174 |
175 | EU data residency is now supported specifically for the `gpt-realtime-2025-08-28` and `gpt-4o-realtime-preview-2025-06-03`. Data residency must be explicitly enabled for an organization and accessed through `https://eu.api.openai.com`.
176 |
177 | ### Tracing
178 |
179 | The Realtime API logs traces to the [developer console](https://platform.openai.com/logs?api=traces), recording key events during a realtime session, which can be helpful for investigations and debugging. As part of GA, we launched a few new event types:
180 |
181 | - Session updated (when `session.updated` events are sent to the client)
182 | - Output text generation (for text generated by the model)
183 |
184 | ### Hosted prompts
185 |
186 | You can now use [prompts with the Realtime API](https://platform.openai.com/docs/guides/realtime-models-prompting#update-your-session-to-use-a-prompt) as a convenient way to have your application code
187 | refer to a prompt that can be edited separately. Prompts include both instructions and
188 | session configuration, such as turn detection settings.
189 |
190 | You can create a prompt in the [realtime playground](https://platform.openai.com/audio/realtime), iterating on it and versioning it as needed, and then a client can reference that prompt by ID, like so:
191 |
192 | ```
193 | {
194 | "type": "session.update",
195 | "session": {
196 | "type": "realtime",
197 | "prompt": {
198 | "id": "pmpt_123", // your stored prompt ID
199 | "version": "89", // optional: pin a specific version
200 | "variables": {
201 | "city": "Paris" // example variable used by your prompt
202 | }
203 | },
204 | // You can still set direct session fields; these override prompt fields if they overlap:
205 | "instructions": "Speak clearly and briefly. Confirm understanding before taking actions."
206 | }
207 | }
208 |
209 | ```
210 |
211 | If a prompt setting overlaps with other configuration passed to the session, as
212 | in the example above, the session configuration takes precedence, so a client can either
213 | use the prompt’s config or manipulate it at session time.
214 |
215 | ### Sideband connections
216 |
217 | The Realtime API allows clients to connect directly to the API server via WebRTC or SIP. However, you’ll most likely want tool use and other business logic to reside on your application server to keep this logic private and client-agnostic.
218 |
219 | Keep tool use, business logic, and other details secure on the server side by connecting over a sideband control channel. We now have sideband options for both SIP and WebRTC connections.
220 |
221 | A sideband connection means there are two active connections to the same realtime session: one from the user’s client and one from your application server. The server connection can be used to monitor the session, update instructions, and respond to tool calls.
222 |
223 | For more information, see [documentation for sideband connections](https://platform.openai.com/docs/guides/realtime-server-controls).
224 |
225 | ## Start building
226 |
227 | We hope this was a helpful way to understand what’s changed with the generally available Realtime API and new realtime models.
228 |
229 | Now that you have the updated framing, [see the realtime docs](https://platform.openai.com/docs/guides/realtime) to build a voice agent, start a connection, or start prompting realtime models.
```
--------------------------------------------------------------------------------
/docs/apps-sdk/apps-sdk_build_custom-ux_.txt:
--------------------------------------------------------------------------------
```
1 | ---
2 | url: "https://developers.openai.com/apps-sdk/build/custom-ux/"
3 | title: "Build a custom UX"
4 | ---
5 |
6 | ## Search the docs
7 |
8 | ⌘K/CtrlK
9 |
10 | Close
11 |
12 | Clear
13 |
14 | Primary navigation
15 |
16 | ChatGPT
17 |
18 | ResourcesCodexChatGPTBlog
19 |
20 | Clear
21 |
22 | - [Home](https://developers.openai.com/apps-sdk)
23 |
24 | ### Core Concepts
25 |
26 | - [MCP Server](https://developers.openai.com/apps-sdk/concepts/mcp-server)
27 | - [User interaction](https://developers.openai.com/apps-sdk/concepts/user-interaction)
28 | - [Design guidelines](https://developers.openai.com/apps-sdk/concepts/design-guidelines)
29 |
30 | ### Plan
31 |
32 | - [Research use cases](https://developers.openai.com/apps-sdk/plan/use-case)
33 | - [Define tools](https://developers.openai.com/apps-sdk/plan/tools)
34 | - [Design components](https://developers.openai.com/apps-sdk/plan/components)
35 |
36 | ### Build
37 |
38 | - [Set up your server](https://developers.openai.com/apps-sdk/build/mcp-server)
39 | - [Build a custom UX](https://developers.openai.com/apps-sdk/build/custom-ux)
40 | - [Authenticate users](https://developers.openai.com/apps-sdk/build/auth)
41 | - [Persist state](https://developers.openai.com/apps-sdk/build/storage)
42 | - [Examples](https://developers.openai.com/apps-sdk/build/examples)
43 |
44 | ### Deploy
45 |
46 | - [Deploy your app](https://developers.openai.com/apps-sdk/deploy)
47 | - [Connect from ChatGPT](https://developers.openai.com/apps-sdk/deploy/connect-chatgpt)
48 | - [Test your integration](https://developers.openai.com/apps-sdk/deploy/testing)
49 |
50 | ### Guides
51 |
52 | - [Optimize Metadata](https://developers.openai.com/apps-sdk/guides/optimize-metadata)
53 | - [Security & Privacy](https://developers.openai.com/apps-sdk/guides/security-privacy)
54 | - [Troubleshooting](https://developers.openai.com/apps-sdk/deploy/troubleshooting)
55 |
56 | ### Resources
57 |
58 | - [Reference](https://developers.openai.com/apps-sdk/reference)
59 | - [App developer guidelines](https://developers.openai.com/apps-sdk/app-developer-guidelines)
60 |
61 | ## Overview
62 |
63 | UI components turn structured tool results into a human-friendly UI. Apps SDK components are typically React components that run inside an iframe, talk to the host via the `window.openai` API, and render inline with the conversation. This guide describes how to structure your component project, bundle it, and wire it up to your MCP server.
64 |
65 | You can also check out the [examples repository on GitHub](https://github.com/openai/openai-apps-sdk-examples).
66 |
67 | ## Understand the `window.openai` API
68 |
69 | `window.openai` is the bridge between your frontend and ChatGPT. Use this quick reference to first understand how to wire up data, state, and layout concerns before you dive into component scaffolding.
70 |
71 | ```
72 | declare global {
73 | interface Window {
74 | openai: API & OpenAiGlobals;
75 | }
76 |
77 | interface WindowEventMap {
78 | [SET_GLOBALS_EVENT_TYPE]: SetGlobalsEvent;
79 | }
80 | }
81 |
82 | type OpenAiGlobals<
83 | ToolInput extends UnknownObject = UnknownObject,
84 | ToolOutput extends UnknownObject = UnknownObject,
85 | ToolResponseMetadata extends UnknownObject = UnknownObject,
86 | WidgetState extends UnknownObject = UnknownObject
87 | > = {
88 | theme: Theme;
89 | userAgent: UserAgent;
90 | locale: string;
91 |
92 | // layout
93 | maxHeight: number;
94 | displayMode: DisplayMode;
95 | safeArea: SafeArea;
96 |
97 | // state
98 | toolInput: ToolInput;
99 | toolOutput: ToolOutput | null;
100 | toolResponseMetadata: ToolResponseMetadata | null;
101 | widgetState: WidgetState | null;
102 | };
103 |
104 | type API<WidgetState extends UnknownObject> = {
105 | /** Calls a tool on your MCP. Returns the full response. */
106 | callTool: (name: string, args: Record<string, unknown>) => Promise<CallToolResponse>;
107 |
108 | /** Triggers a followup turn in the ChatGPT conversation */
109 | sendFollowUpMessage: (args: { prompt: string }) => Promise<void>;
110 |
111 | /** Opens an external link, redirects web page or mobile app */
112 | openExternal(payload: { href: string }): void;
113 |
114 | /** For transitioning an app from inline to fullscreen or pip */
115 | requestDisplayMode: (args: { mode: DisplayMode }) => Promise<{
116 | /**
117 | * The granted display mode. The host may reject the request.
118 | * For mobile, PiP is always coerced to fullscreen.
119 | */
120 | mode: DisplayMode;
121 | }>;
122 |
123 | setWidgetState: (state: WidgetState) => Promise<void>;
124 | };
125 |
126 | // Dispatched when any global changes in the host page
127 | export const SET_GLOBALS_EVENT_TYPE = "openai:set_globals";
128 | export class SetGlobalsEvent extends CustomEvent<{
129 | globals: Partial<OpenAiGlobals>;
130 | }> {
131 | readonly type = SET_GLOBALS_EVENT_TYPE;
132 | }
133 |
134 | export type CallTool = (
135 | name: string,
136 | args: Record<string, unknown>
137 | ) => Promise<CallToolResponse>;
138 |
139 | export type DisplayMode = "pip" | "inline" | "fullscreen";
140 |
141 | export type Theme = "light" | "dark";
142 |
143 | export type SafeAreaInsets = {
144 | top: number;
145 | bottom: number;
146 | left: number;
147 | right: number;
148 | };
149 |
150 | export type SafeArea = {
151 | insets: SafeAreaInsets;
152 | };
153 |
154 | export type DeviceType = "mobile" | "tablet" | "desktop" | "unknown";
155 |
156 | export type UserAgent = {
157 | device: { type: DeviceType };
158 | capabilities: {
159 | hover: boolean;
160 | touch: boolean;
161 | };
162 | };
163 |
164 | ```
165 |
166 | ### useOpenAiGlobal
167 |
168 | Many Apps SDK projects wrap `window.openai` access in small hooks so views remain testable. This example hook listens for host `openai:set_globals` events and lets React components subscribe to a single global value:
169 |
170 | ```
171 | export function useOpenAiGlobal<K extends keyof OpenAiGlobals>(
172 | key: K
173 | ): OpenAiGlobals[K] {
174 | return useSyncExternalStore(
175 | (onChange) => {
176 | const handleSetGlobal = (event: SetGlobalsEvent) => {
177 | const value = event.detail.globals[key];
178 | if (value === undefined) {
179 | return;
180 | }
181 |
182 | onChange();
183 | };
184 |
185 | window.addEventListener(SET_GLOBALS_EVENT_TYPE, handleSetGlobal, {
186 | passive: true,
187 | });
188 |
189 | return () => {
190 | window.removeEventListener(SET_GLOBALS_EVENT_TYPE, handleSetGlobal);
191 | };
192 | },
193 | () => window.openai[key]
194 | );
195 | }
196 |
197 | ```
198 |
199 | `useOpenAiGlobal` is an important primitive to make your app reactive to changes in display mode, theme, and “props” via subsequent tool calls.
200 |
201 | For example, read the tool input, output, and metadata:
202 |
203 | ```
204 | export function useToolInput() {
205 | return useOpenAiGlobal('toolInput')
206 | }
207 |
208 | export function useToolOutput() {
209 | return useOpenAiGlobal('toolOutput')
210 | }
211 |
212 | export function useToolResponseMetadata() {
213 | return useOpenAiGlobal('toolResponseMetadata')
214 | }
215 |
216 | ```
217 |
218 | ### Persist component state, expose context to ChatGPT
219 |
220 | Widget state can be used for persisting data across user sessions, and exposing data to ChatGPT. Anything you pass to `setWidgetState` will be shown to the model, and hydrated into `window.openai.widgetState`.
221 |
222 | Note that currently everything passed to `setWidgetState` is shown to the model. For the best performance, it’s advisable to keep this payload small, and to not exceed more than 4k [tokens](https://platform.openai.com/tokenizer).
223 |
224 | ### Trigger server actions
225 |
226 | `window.openai.callTool` lets the component directly make MCP tool calls. Use this for direct manipulations (refresh data, fetch nearby restaurants). Design tools to be idempotent where possible and return updated structured content that the model can reason over in subsequent turns.
227 |
228 | Please note that your tool needs to be marked as [able to be initiated by the component](https://developers.openai.com/apps-sdk/build/mcp-server###allow-component-initiated-tool-access).
229 |
230 | ```
231 | async function refreshPlaces(city: string) {
232 | await window.openai?.callTool("refresh_pizza_list", { city });
233 | }
234 |
235 | ```
236 |
237 | ### Send conversational follow-ups
238 |
239 | Use `window.openai.sendFollowupMessage` to insert a message into the conversation as if the user asked it.
240 |
241 | ```
242 | await window.openai?.sendFollowupMessage({
243 | prompt: "Draft a tasting itinerary for the pizzerias I favorited.",
244 | });
245 |
246 | ```
247 |
248 | ### Request alternate layouts
249 |
250 | If the UI needs more space—like maps, tables, or embedded editors—ask the host to change the container. `window.openai.requestDisplayMode` negotiates inline, PiP, or fullscreen presentations.
251 |
252 | ```
253 | await window.openai?.requestDisplayMode({ mode: "fullscreen" });
254 | // Note: on mobile, PiP may be coerced to fullscreen
255 |
256 | ```
257 |
258 | ### Use host-backed navigation
259 |
260 | Skybridge (the sandbox runtime) mirrors the iframe’s history into ChatGPT’s UI. Use standard routing APIs—such as React Router—and the host will keep navigation controls in sync with your component.
261 |
262 | Router setup (React Router’s `BrowserRouter`):
263 |
264 | ```
265 | export default function PizzaListRouter() {
266 | return (
267 | <BrowserRouter>
268 | <Routes>
269 | <Route path="/" element={<PizzaListApp />}>
270 | <Route path="place/:placeId" element={<PizzaListApp />} />
271 | </Route>
272 | </Routes>
273 | </BrowserRouter>
274 | );
275 | }
276 |
277 | ```
278 |
279 | Programmatic navigation:
280 |
281 | ```
282 | const navigate = useNavigate();
283 |
284 | function openDetails(placeId: string) {
285 | navigate(`place/${placeId}`, { replace: false });
286 | }
287 |
288 | function closeDetails() {
289 | navigate("..", { replace: true });
290 | }
291 |
292 | ```
293 |
294 | ## Scaffold the component project
295 |
296 | Now that you understand the `window.openai` API, it’s time to scaffold your component project.
297 |
298 | As best practice, keep the component code separate from your server logic. A common layout is:
299 |
300 | ```
301 | app/
302 | server/ # MCP server (Python or Node)
303 | web/ # Component bundle source
304 | package.json
305 | tsconfig.json
306 | src/component.tsx
307 | dist/component.js # Build output
308 |
309 | ```
310 |
311 | Create the project and install dependencies (Node 18+ recommended):
312 |
313 | ```
314 | cd app/web
315 | npm init -y
316 | npm install react@^18 react-dom@^18
317 | npm install -D typescript esbuild
318 |
319 | ```
320 |
321 | If your component requires drag-and-drop, charts, or other libraries, add them now. Keep the dependency set lean to reduce bundle size.
322 |
323 | ## Author the React component
324 |
325 | Your entry file should mount a component into a `root` element and read initial data from `window.openai.toolOutput` or persisted state.
326 |
327 | We have provided some example apps under the [examples page](https://developers.openai.com/apps-sdk/build/custom-ux/examples#pizzaz-list-source), for example, for a “Pizza list” app, which is a list of pizza restaurants. As you can see in the source code, the pizza list React component does the following:
328 |
329 | 1. **Mount into the host shell.** The Skybridge HTML template exposes `div#pizzaz-list-root`. The component mounts with `createRoot(document.getElementById("pizzaz-list-root")).render(<PizzaListApp />)` so the entire UI stays encapsulated inside the iframe.
330 | 2. **Subscribe to host globals.** Inside `PizzaListApp`, hooks such as `useOpenAiGlobal("displayMode")` and `useOpenAiGlobal("maxHeight")` read layout preferences directly from `window.openai`. This keeps the list responsive between inline and fullscreen layouts without custom postMessage plumbing.
331 | 3. **Render from tool output.** The component treats `window.openai.toolOutput` as the authoritative source of places returned by your tool. `widgetState` seeds any user-specific state (like favorites or filters) so the UI restores after refreshes.
332 | 4. **Persist state and call host actions.** When a user toggles a favorite, the component updates React state and immediately calls `window.openai.setWidgetState` with the new favorites array. Optional buttons can trigger `window.openai.requestDisplayMode({ mode: "fullscreen" })` or `window.openai.callTool("refresh_pizza_list", { city })` when more space or fresh data is needed.
333 |
334 | ### Explore the Pizzaz component gallery
335 |
336 | We provide a number of example components in the [Apps SDK examples](https://developers.openai.com/apps-sdk/build/examples). Treat them as blueprints when shaping your own UI:
337 |
338 | - **Pizzaz List** – ranked card list with favorites and call-to-action buttons.
339 |
340 | 
341 | - **Pizzaz Carousel** – embla-powered horizontal scroller that demonstrates media-heavy layouts.
342 |
343 | 
344 | - **Pizzaz Map** – Mapbox integration with fullscreen inspector and host state sync.
345 |
346 | 
347 | - **Pizzaz Album** – stacked gallery view built for deep dives on a single place.
348 |
349 | 
350 | - **Pizzaz Video** – scripted player with overlays and fullscreen controls.
351 |
352 | Each example shows how to bundle assets, wire host APIs, and structure state for real conversations. Copy the one closest to your use case and adapt the data layer for your tool responses.
353 |
354 | ### React helper hooks
355 |
356 | Using `useOpenAiGlobal` in a `useWidgetState` hook to keep host-persisted widget state aligned with your local React state:
357 |
358 | ```
359 | export function useWidgetState<T extends WidgetState>(
360 | defaultState: T | (() => T)
361 | ): readonly [T, (state: SetStateAction<T>) => void];
362 | export function useWidgetState<T extends WidgetState>(
363 | defaultState?: T | (() => T | null) | null
364 | ): readonly [T | null, (state: SetStateAction<T | null>) => void];
365 | export function useWidgetState<T extends WidgetState>(
366 | defaultState?: T | (() => T | null) | null
367 | ): readonly [T | null, (state: SetStateAction<T | null>) => void] {
368 | const widgetStateFromWindow = useWebplusGlobal("widgetState") as T;
369 |
370 | const [widgetState, _setWidgetState] = useState<T | null>(() => {
371 | if (widgetStateFromWindow != null) {
372 | return widgetStateFromWindow;
373 | }
374 |
375 | return typeof defaultState === "function"
376 | ? defaultState()
377 | : defaultState ?? null;
378 | });
379 |
380 | useEffect(() => {
381 | _setWidgetState(widgetStateFromWindow);
382 | }, [widgetStateFromWindow]);
383 |
384 | const setWidgetState = useCallback(
385 | (state: SetStateAction<T | null>) => {
386 | _setWidgetState((prevState) => {
387 | const newState = typeof state === "function" ? state(prevState) : state;
388 |
389 | if (newState != null) {
390 | window.openai.setWidgetState(newState);
391 | }
392 |
393 | return newState;
394 | });
395 | },
396 | [window.openai.setWidgetState]
397 | );
398 |
399 | return [widgetState, setWidgetState] as const;
400 | }
401 |
402 | ```
403 |
404 | The hooks above make it easy to read the latest tool output, layout globals, or widget state directly from React components while still delegating persistence back to ChatGPT.
405 |
406 | ## Bundle for the iframe
407 |
408 | Once you are done writing your React component, you can build it into a single JavaScript module that the server can inline:
409 |
410 | ```
411 | // package.json
412 | {
413 | "scripts": {
414 | "build": "esbuild src/component.tsx --bundle --format=esm --outfile=dist/component.js"
415 | }
416 | }
417 |
418 | ```
419 |
420 | Run `npm run build` to produce `dist/component.js`. If esbuild complains about missing dependencies, confirm you ran `npm install` in the `web/` directory and that your imports match installed package names (e.g., `@react-dnd/html5-backend` vs `react-dnd-html5-backend`).
421 |
422 | ## Embed the component in the server response
423 |
424 | See the [Set up your server docs](https://developers.openai.com/apps-sdk/build/mcp-server#) for how to embed the component in your MCP server response.
425 |
426 | Component UI templates are the recommended path for production.
427 |
428 | During development you can rebuild the component bundle whenever your React code changes and hot-reload the server.
```
--------------------------------------------------------------------------------
/docs/apps-sdk/_tracks_ai-application-development_.txt:
--------------------------------------------------------------------------------
```
1 | ---
2 | url: "https://developers.openai.com/tracks/ai-application-development/"
3 | title: "AI app development: Concept to production"
4 | ---
5 |
6 | ## Search the docs
7 |
8 | ⌘K/CtrlK
9 |
10 | Close
11 |
12 | Clear
13 |
14 | Primary navigation
15 |
16 | Resources
17 |
18 | ResourcesCodexChatGPTBlog
19 |
20 | Clear
21 |
22 | - [Home](https://developers.openai.com/)
23 |
24 | ### Categories
25 |
26 | - [Code](https://developers.openai.com/resources/code)
27 | - [Cookbooks](https://developers.openai.com/resources/cookbooks)
28 | - [Guides](https://developers.openai.com/resources/guides)
29 | - [Videos](https://developers.openai.com/resources/videos)
30 |
31 | ### Topics
32 |
33 | - [Agents](https://developers.openai.com/topics/agents)
34 | - [Audio & Voice](https://developers.openai.com/topics/audio)
35 | - [Image generation](https://developers.openai.com/topics/imagegen)
36 | - [Tools](https://developers.openai.com/topics/tools)
37 | - [Computer use](https://developers.openai.com/topics/cua)
38 | - [Fine-tuning](https://developers.openai.com/topics/fine-tuning)
39 | - [Scaling](https://developers.openai.com/topics/scaling)
40 |
41 | 
42 |
43 | ## Introduction
44 |
45 | This track is designed for developers and technical learners who want to build production-ready AI applications with OpenAI’s models and tools.
46 | Learn foundational concepts and how to incorporate them in your applications, evaluate performance, and implement best practices to ensure your AI solutions are robust and ready to deploy at scale.
47 |
48 | ### Why follow this track
49 |
50 | This track helps you quickly gain the skills to ship production-ready AI applications in four phases:
51 |
52 | 1. **Learn modern AI foundations**: Build a strong understanding of AI concepts—like agents, evals, and basic techniques
53 | 2. **Build hands-on experience**: Explore and develop applications with example code
54 | 3. **Ship with confidence**: Use evals and guardrails to ensure safety and reliability
55 | 4. **Optimize for production**: Optimize cost, latency, and performance to prepare your apps for real-world use
56 |
57 | ### Prerequisites
58 |
59 | Before starting this track, ensure you have the following:
60 |
61 | - **Basic coding familiarity**: You should be comfortable with Python or JavaScript.
62 | - **Developer environment**: You’ll need an IDE, like VS Code or Cursor—ideally configured with an agent mode.
63 | - **OpenAI API key**: Create or find your API key in the [OpenAI dashboard](https://platform.openai.com/api-keys).
64 |
65 | ## Phase 1: Foundations
66 |
67 | Production-ready AI applications often incorporate two things:
68 |
69 | - **Core logic**: what your application does, potentially driven by one or several AI agents
70 | - **Evaluations (evals)**: how you measure the quality, safety, and reliability of your application for future improvements
71 |
72 | On top of that, you might make use of one or several basic techniques to improve your AI system’s performance:
73 |
74 | - Prompt engineering
75 | - Retrieval-augmented generation (RAG)
76 | - Fine-tuning
77 |
78 | And to make sure your agent(s) can interact with the rest of your application or with external services, you can rely on structured outputs and tool calls.
79 |
80 | ### Core logic
81 |
82 | When you’re building an AI application, there’s a good chance you are incorporating one or several “agents” to go from input data, action or message to final result.
83 |
84 | Agents are essentially AI systems that have instructions, tools, and guardrails to guide behavior. They can:
85 |
86 | - Reason and make decisions
87 | - Maintain context and memory
88 | - Call external tools and APIs
89 |
90 | Instead of one-off prompts, agents manage dynamic, multistep workflows that respond to real-world situations.
91 |
92 | #### Learn and build
93 |
94 | Explore the resources below to learn essential concepts about building agents, including how they leverage tools, models, and memory to interact intelligently with users, and get hands-on experience creating your first agent in under 10 minutes.
95 | If you want to dive deeper into these concepts, refer to our [Building Agents](https://developers.openai.com/tracks/building-agents) track.
96 |
97 | [\\
98 | \\
99 | **Building agents guide** \\
100 | \\
101 | Official guide to building agents using the OpenAI platform.\\
102 | \\
103 | guide](https://platform.openai.com/docs/guides/agents) [\\
104 | \\
105 | **Agents SDK quickstart** \\
106 | \\
107 | Quickstart project for building agents with the Agents SDK.\\
108 | \\
109 | code](https://openai.github.io/openai-agents-python/quickstart/)
110 |
111 | ### Evaluations
112 |
113 | Evals are how you measure and improve your AI app’s behavior. They help you:
114 |
115 | - Verify correctness
116 | - Enforce the right guardrails and constraints
117 | - Track quality over time so you can ship with confidence
118 |
119 | Unlike ad hoc testing, evals create a feedback loop that lets you iterate safely and continuously improve your AI applications.
120 |
121 | There are different types of evals, depending on the type of application you are building.
122 |
123 | For example, if you want the system to produce answers that can be right or wrong (e.g. a math problem, a classification task, etc.), you can run evals with a set of questions you already know the answers to (the “ground truth”).
124 |
125 | 
126 |
127 | In other cases, there might not be a “ground truth” for the answers, but you can still run evals to measure the quality of the output—we will cover this in more details in Phase 3.
128 |
129 | [\\
130 | \\
131 | **Launch apps with evaluations** \\
132 | \\
133 | Video on incorporating evals when deploying AI products.\\
134 | \\
135 | video](https://vimeo.com/1105244173)
136 |
137 | ### Basic techniques
138 |
139 | The first thing you need to master when building AI applications is “prompt engineering”, or simply put: _how to tell the models what to do_.
140 |
141 | With the models’ increasing performance, there is no need to learn a complex syntax or information structure.
142 |
143 | But there are a few things to keep in mind, as not all models follow instructions in the same way.
144 | GPT-5 for example, our latest model, follows instructions very precisely, so the same prompt can result in different behaviors if you’re using `gpt-5` vs `gpt-4o` for example.
145 |
146 | The results may vary as well depending on which type of prompt you use: system, developer or user prompt, or a combination of all of them.
147 |
148 | #### Learn and build
149 |
150 | Explore our resources below on how to improve your prompt engineering skills with practical examples.
151 |
152 | [\\
153 | \\
154 | **Prompt engineering guide** \\
155 | \\
156 | Detailed guide on prompt engineering strategies.\\
157 | \\
158 | guide](https://platform.openai.com/docs/guides/realtime-transcription) [\\
159 | \\
160 | **GPT-5 prompting guide** \\
161 | \\
162 | Cookbook guide on how to maximize GPT-5's performance.\\
163 | \\
164 | cookbook](https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide) [\\
165 | \\
166 | **Reasoning best practices** \\
167 | \\
168 | Prompting and optimization tips for reasoning models\\
169 | \\
170 | guide](https://platform.openai.com/docs/guides/reasoning-best-practices)
171 |
172 | Another common technique when building AI applications is “retrieval-augmented generation” (RAG), which is a technique that allows to pull in knowledge related to the user input to generate more relevant responses.
173 | We will cover RAG in more details in Phase 2.
174 |
175 | Finally, in some cases, you can also fine-tune a model to your specific needs. This allows to optimize the model’s behavior for your use case.
176 |
177 | A common misconception is that fine-tuning can “teach” the models about your
178 | data. This isn’t the case, and if you want your AI application or agents to
179 | know about your data, you should use RAG. Fine-tuning is more about optimizing
180 | how the model will handle a certain type of input, or produce outputs in a
181 | certain way.
182 |
183 | ### Structured data
184 |
185 | If you want to build robust AI applications, you need to make sure the model outputs are reliable.
186 |
187 | LLMs produce non-deterministic outputs by default, meaning you can get widely different output formats if you don’t constrain them.
188 | Prompt engineering can only get you so far, and when you are building for production you can’t afford for your application to break because you got an unexpected output.
189 |
190 | That is why you should rely as much as possible (unless you are generating a user-facing response) on structured outputs and tool calls.
191 |
192 | Structured outputs are a way for you to constrain the model’s output to a strict json schema—that way, you always know what to expect.
193 | You can also enforce strict schemas for function calls, in case you prefer letting the model decide when to interact with your application or other services.
194 |
195 | [\\
196 | \\
197 | **Structured outputs guide** \\
198 | \\
199 | Guide for producing structured outputs with the Responses API.\\
200 | \\
201 | guide](https://platform.openai.com/docs/guides/structured-outputs?api-mode=responses)
202 |
203 | ## Phase 2: Application development
204 |
205 | In this section, you’ll move from understanding foundational concepts to building complete, production-ready applications. We’ll dive deeper into the following:
206 |
207 | - **Building agents**: Experiment with our models and tools
208 | - **RAG (retrieval-augmented generation)**: Enrich applications with knowledge sources
209 | - **Fine-tuning models**: Tailor model behavior to your unique needs
210 |
211 | By the end of this section, you’ll be able to design, build, and optimize AI applications that tackle real-world scenarios intelligently.
212 |
213 | ### Experimenting with our models
214 |
215 | Before you start building, you can test ideas and iterate quickly with the [OpenAI Playground](https://platform.openai.com/chat/edit?models=gpt-5).
216 | Once you have tested your prompts and tools and you have a sense of the type of output you can get, you can move from the Playground to your actual application.
217 |
218 | The build hour below is a good example of how you can use the playground to experiment before importing the code into your actual application.
219 |
220 | [\\
221 | \\
222 | **Build hour — built-in tools** \\
223 | \\
224 | Build hour giving an overview of built-in tools available in the Responses API.\\
225 | \\
226 | video](https://webinar.openai.com/on-demand/c17a0484-d32c-4359-b5ee-d318dad51586)
227 |
228 | ### Getting started building agents
229 |
230 | The Responses API is your starting point for building dynamic, multi-modal AI applications.
231 | It’s a stateful API that supports our latest models’ capabilities, including things such as tool-calling in reasoning, and it offers a set of powerful built-in tools.
232 |
233 | As an abstraction on top of the Responses API, the Agents SDK is a framework that makes it easy to build agents and orchestrate them.
234 |
235 | If you’re not already familiar with the Responses API or Agents SDK or the concept of agents, we recommend following our [Building Agents](https://developers.openai.com/tracks/building-agents#building-with-the-responses-api) track first.
236 |
237 | #### Learn and build
238 |
239 | Explore the following resources to rapidly get started building. The Agents SDK repositories contain example code that you can use to get started in either Python or TypeScript, and the Responses starter app is a good starting point to build with the Responses API.
240 |
241 | [\\
242 | \\
243 | **Responses starter app** \\
244 | \\
245 | Starter application demonstrating OpenAI Responses API with tools.\\
246 | \\
247 | code](https://github.com/openai/openai-responses-starter-app) [\\
248 | \\
249 | **Agents SDK — Python** \\
250 | \\
251 | Python SDK for developing agents with OpenAI.\\
252 | \\
253 | code](https://github.com/openai/openai-agents-python) [\\
254 | \\
255 | **Agents SDK — TypeScript** \\
256 | \\
257 | TypeScript SDK for developing agents with OpenAI.\\
258 | \\
259 | code](https://github.com/openai/openai-agents-js)
260 |
261 | ### Inspiration
262 |
263 | Explore these demos to get a sense of what you can build with the Responses API and the Agents SDK:
264 |
265 | - **Support agent**: a simple support agent built on top of the Responses API, with a “human in the loop” angle—the agent is meant to be used by a human that can accept or reject the agent’s suggestions
266 | - **Customer service agent**: a network of multiple agents working together to handle a customer request, built with the Agents SDK
267 | - **Frontend testing agent**: a computer using agent that requires a single user input to test a frontend application
268 |
269 | Pick the one most relevant to your use case and adapt from there.
270 |
271 | [\\
272 | \\
273 | **Support agent demo** \\
274 | \\
275 | Demo showing a customer support agent with a human in the loop.\\
276 | \\
277 | code](https://github.com/openai/openai-support-agent-demo) [\\
278 | \\
279 | **CS agents demo** \\
280 | \\
281 | Demo showcasing customer service agents orchestration.\\
282 | \\
283 | code](https://github.com/openai/openai-cs-agents-demo) [\\
284 | \\
285 | **Frontend testing demo** \\
286 | \\
287 | Demo application for frontend testing using CUA.\\
288 | \\
289 | code](https://github.com/openai/openai-testing-agent-demo)
290 |
291 | ### Augmenting the model’s knowledge
292 |
293 | RAG (retrieval-augmented generation) introduces elements from a knowledge base in the model’s context window so that it can answer questions using that knowledge.
294 | It lets the model know about things that are not part of its training data, for example your internal data, so that it can generate more relevant responses.
295 |
296 | Based on an input, you can retrieve the most relevant documents from your knowledge base, and then use this information to generate a response.
297 |
298 | There are several steps involved in a RAG pipeline:
299 |
300 | 1. **Data preparation**: Pre-processing documents, chunking them into smaller pieces if needed, embedding them and storing them in a vector database
301 | 2. **Retrieval**: Using the input to retrieve the most relevant chunks from the vector database. Optionally, there are multiple optimization techniques that can be used at this stage, such as input processing or re-ranking (re-ordering the retrieved chunks to make sure we keep only the most relevant)
302 | 3. **Generation**: Once you have the most relevant chunks, you can include them in the context you send to the model to generate the final answer
303 |
304 | 
305 |
306 | We could write an entire track on RAG alone, but for now, you can learn more about it in the guide below.
307 |
308 | [\\
309 | \\
310 | **RAG technique overview** \\
311 | \\
312 | Overview of retrieval-augmented generation techniques.\\
313 | \\
314 | guide](https://platform.openai.com/docs/guides/optimizing-llm-accuracy#retrieval-augmented-generation-rag)
315 |
316 | If you don’t have specific needs requiring to build a custom RAG pipeline, you can rely on our built-in file search tool which abstracts away all of this complexity.
317 |
318 | #### Learn and build
319 |
320 | [\\
321 | \\
322 | **File search guide** \\
323 | \\
324 | Guide to retrieving context from files using the Responses API.\\
325 | \\
326 | guide](https://platform.openai.com/docs/guides/tools-file-search) [\\
327 | \\
328 | **RAG with PDFs cookbook** \\
329 | \\
330 | Cookbook for retrieval-augmented generation using PDFs.\\
331 | \\
332 | cookbook](https://cookbook.openai.com/examples/file_search_responses)
333 |
334 | ### Fine-tuning models
335 |
336 | In some cases, your application could benefit from a model that adapts to your specific task. You can use supervised or reinforcement fine-tuning to teach the models certain behaviors.
337 |
338 | For example, supervised fine-tuning is a good fit when:
339 |
340 | - You want the output to follow strict guidelines for tone, style, or format
341 | - It’s easier to “show” than “tell” how to handle certain inputs to arrive at the desired outputs
342 | - You want to process inputs or generate outputs in a consistent way
343 |
344 | You can also use Direct Preference Optimization (DPO) to fine-tune a model with examples of what _not_ to do vs what is a preferred answer.
345 |
346 | On the other hand, you can use reinforcement fine-tuning when you want reasoning models to accomplish nuanced objectives.
347 |
348 | #### Learn and build
349 |
350 | Explore the following resources to learn about core fine-tuning techniques for customizing model behavior. You can also dive deeper into fine-tuning with our [Model optimization](https://developers.openai.com/tracks/model-optimization) track.
351 |
352 | [\\
353 | \\
354 | **Supervised fine-tuning overview** \\
355 | \\
356 | Guide to supervised fine-tuning for customizing model behavior.\\
357 | \\
358 | guide](https://platform.openai.com/docs/guides/supervised-fine-tuning) [\\
359 | \\
360 | **Reinforcement fine-tuning overview** \\
361 | \\
362 | Guide on reinforcement learning-based fine-tuning techniques.\\
363 | \\
364 | guide](https://platform.openai.com/docs/guides/reinforcement-fine-tuning) [\\
365 | \\
366 | **Fine-tuning cookbook** \\
367 | \\
368 | Cookbook on direct preference optimization for fine-tuning.\\
369 | \\
370 | cookbook](https://cookbook.openai.com/examples/fine_tuning_direct_preference_optimization_guide)
371 |
372 | Now that we’ve covered how to build AI applications and incorporate some basic AI techniques in the development process, we’ll focus on testing and evaluation, learning how to integrate evals and guardrails to confidently ship AI applications that are safe, predictable, and production-ready.
373 |
374 | ## Phase 3: Testing and evaluation
375 |
376 | Learn how to test, safeguard, and harden your AI applications before moving them into production. We’ll focus on:
377 |
378 | - **Constructing robust evals** to measure correctness, quality, and reliability at scale
379 | - **Adding guardrails** to block unsafe actions and enforce predictable behavior
380 | - **Iterating with feedback loops** that surface weaknesses and strengthen your apps over time
381 |
382 | By the end of this phase, you’ll be able to ship AI applications that are safe, reliable, and ready for users to trust.
383 |
384 | ### Constructing evals
385 |
386 | To continuously measure and improve your applications from prototype through deployment, you need to design evaluation workflows.
387 |
388 | Evals in practice let you:
389 |
390 | - **Verify correctness**: Validate that outputs meet your desired logic and requirements.
391 | - **Benchmark quality**: Compare performance over time with consistent rubrics.
392 | - **Guide iteration**: Detect regressions, pinpoint weaknesses, and prioritize fixes as your app evolves.
393 |
394 | By embedding evals into your development cycle, you create repeatable, objective feedback loops that keep your AI systems aligned with both user needs and business goals.
395 |
396 | There are many types of evals, some that rely on a “ground truth” (a set of question/answer pairs), and others that rely on more subjective criteria.
397 |
398 | Even when you have expected answers, comparing the model’s output to them might not always be straightforward. Sometimes, you can check in a simple way that the output matches the expected answer, like in the example below.
399 | In other cases, you might need to rely on different metrics and scoring algorithms that can compare outputs holistically—when you’re comparing big chunks of text (e.g. translations, summaries) for example.
400 |
401 | _Example: Check the model’s output against the expected answer, ignoring order._
402 |
403 | ```
404 | // Reference answer
405 | const correctAnswer = ["Eggs", "Sugar"];
406 |
407 | // Model's answer
408 | const modelAnswer = ["Sugar", "Eggs"];
409 |
410 | // Simple check: Correct if same ingredients, order ignored
411 | const isCorrect =
412 | correctAnswer.sort().toString() === modelAnswer.sort().toString();
413 |
414 | console.log(isCorrect ? "Correct!" : "Incorrect.");
415 |
416 | ```
417 |
418 | #### Learn and build
419 |
420 | Explore the following resources to learn evaluation-driven development to scale apps from prototype to production. These resources will walk you through how to design rubrics and measure outputs against business goals.
421 |
422 | [\\
423 | \\
424 | **Evals design guide** \\
425 | \\
426 | Learn best practices for designing evals\\
427 | \\
428 | guide](https://platform.openai.com/docs/guides/evals-design) [\\
429 | \\
430 | **Eval-driven dev — prototype to launch** \\
431 | \\
432 | Cookbook demonstrating eval-driven development workflows.\\
433 | \\
434 | cookbook](https://cookbook.openai.com/examples/partners/eval_driven_system_design/receipt_inspection)
435 |
436 | ### Evals API
437 |
438 | The OpenAI Platform provides an Evals API along with a dashboard that allows you to visually configure and run evals.
439 | You can create evals, run them with different models and prompts, and analyze the results to decide next steps.
440 |
441 | #### Learn and build
442 |
443 | Learn more about the Evals API and how to use it with the resources below.
444 |
445 | [\\
446 | \\
447 | **Evaluating model performance** \\
448 | \\
449 | Guide to measuring model quality using the Evals framework.\\
450 | \\
451 | guide](https://platform.openai.com/docs/guides/evals) [\\
452 | \\
453 | **Evals API — tools evaluation** \\
454 | \\
455 | Cookbook example demonstrating tool evaluation with the Evals API.\\
456 | \\
457 | cookbook](https://cookbook.openai.com/examples/evaluation/use-cases/tools-evaluation)
458 |
459 | ### Building guardrails
460 |
461 | Guardrails act as protective boundaries that ensure your AI system behaves safely and predictably in the real world.
462 |
463 | They help you:
464 |
465 | - **Prevent unsafe behavior**: Block disallowed or non-compliant actions before they reach users.
466 | - **Reduce hallucinations**: Catch and correct common failure modes in real time.
467 | - **Maintain consistency**: Enforce rules and constraints across agents, tools, and workflows.
468 |
469 | Together, evals and guardrails form the foundation of trustworthy, production-grade AI systems.
470 |
471 | There are two types of guardrails:
472 |
473 | - **Input guardrails**: To prevent unwanted inputs from being processed
474 | - **Output guardrails**: To prevent unwanted outputs from being returned
475 |
476 | In a production environment, ideally you would have both types of guardrails, depending on how the input and output are used and the level of risk you’re comfortable with.
477 |
478 | It can be as easy as specifying something in the system prompt, or more complex, involving multiple checks.
479 |
480 | One simple guardrail to implement is to use the Moderations API (which is free to use) to check if the input triggers any of the common flags (violence, illegal ask, etc.) and stop the generation process if it does.
481 |
482 | _Example: Classify text for policy compliance with the Moderations API._
483 |
484 | ```
485 | from openai import OpenAI
486 | client = OpenAI()
487 |
488 | response = client.moderations.create(
489 | model="omni-moderation-latest",
490 | input="I want to buy drugs",
491 | )
492 |
493 | print(response)
494 |
495 | ```
496 |
497 | #### Learn and build
498 |
499 | Explore the following resources to implement safeguards that make your AI predictable and compliant. Set up guardrails against common risks like hallucinations or unsafe tool use.
500 |
501 | [\\
502 | \\
503 | **Building guardrails for agents** \\
504 | \\
505 | Guide to implementing safeguards and guardrails in agent applications.\\
506 | \\
507 | guide](https://openai.github.io/openai-agents-python/guardrails/) [\\
508 | \\
509 | **Developing hallucination guardrails** \\
510 | \\
511 | Cookbook for creating guardrails that reduce model hallucinations.\\
512 | \\
513 | cookbook](https://cookbook.openai.com/examples/developing_hallucination_guardrails)
514 |
515 | Now that you’ve learned how to incorporate evals into your workflow and build guardrails to enforce safe and compliant behavior, you can move on to the last phase, where you’ll learn to optimize your applications for cost, latency, and production readiness.
516 |
517 | ## Phase 4: Scalability and maintenance
518 |
519 | In this final phase, you’ll learn how to run AI applications at production scale—optimizing for accuracy, speed, and cost while ensuring long-term stability. We’ll focus on:
520 |
521 | - **Optimizing models** to improve accuracy, consistency, and efficiency for real-world use
522 | - **Cost and latency optimization** to balance performance, responsiveness, and budget
523 |
524 | ### Performance optimization
525 |
526 | Optimizing your application’s performance means ensuring your workflows stay accurate, consistent, and efficient as they move into long-term production use.
527 |
528 | There are 3 levers you can adjust:
529 |
530 | - Improving the prompts (i.e. prompt engineering)
531 | - Improving the context you provide to the model (i.e. RAG)
532 | - Improving the model itself (i.e. fine-tuning)
533 |
534 | 
535 |
536 | #### Deep-dive
537 |
538 | This guide covers how you can combine these techniques to optimize your application’s performance.
539 |
540 | [\\
541 | \\
542 | **LLM correctness and consistency** \\
543 | \\
544 | Best practices for achieving accurate and consistent model outputs.\\
545 | \\
546 | guide](https://platform.openai.com/docs/guides/optimizing-llm-accuracy)
547 |
548 | ### Cost & latency optimization
549 |
550 | Every production AI system must balance performance with cost and latency. Often, these two go together, as smaller and faster models are also cheaper.
551 |
552 | A few ways you can optimize these areas are:
553 |
554 | - **Using smaller, fine-tuned models**: you can fine-tune a smaller model to your specific use case and maintain performance (a.k.a. distillation)
555 | - **Prompt caching**: you can use prompt caching to improve latency and reduce costs for cached tokens (series of tokens that have already been seen by the model)
556 |
557 | [\\
558 | \\
559 | **Prompt caching 101** \\
560 | \\
561 | Introductory cookbook on implementing prompt caching to reduce token usage.\\
562 | \\
563 | cookbook](https://cookbook.openai.com/examples/prompt_caching101) [\\
564 | \\
565 | **Model distillation overview** \\
566 | \\
567 | Overview of distillation techniques for creating efficient models.\\
568 | \\
569 | guide](https://platform.openai.com/docs/guides/distillation#page-top)
570 |
571 | If latency isn’t a concern, consider these options to reduce costs with a latency trade-off:
572 |
573 | - **Batch API**: you can use the Batch API to group requests together and get a 50% discount (however this is only valid for async use cases)
574 | - **Flex processing**: you can use flex processing to get lower costs in exchange for slower responses times
575 |
576 | [\\
577 | \\
578 | **Batch API guide** \\
579 | \\
580 | Guide on how to use the Batch API to reduce costs\\
581 | \\
582 | guide](https://platform.openai.com/docs/guides/batch) [\\
583 | \\
584 | **Flex processing guide** \\
585 | \\
586 | Guide on how to reduce costs with flex processing\\
587 | \\
588 | guide](https://platform.openai.com/docs/guides/flex-processing)
589 |
590 | You can monitor your usage and costs with the cost API, to keep track on what you should optimize.
591 |
592 | [\\
593 | \\
594 | **Keep costs low & accuracy high** \\
595 | \\
596 | Guide on balancing cost efficiency with model accuracy.\\
597 | \\
598 | guide](https://platform.openai.com/docs/guides/reasoning-best-practices#how-to-keep-costs-low-and-accuracy-high) [\\
599 | \\
600 | **Monitor usage with the Cost API** \\
601 | \\
602 | Cookbook showing how to track API usage and costs.\\
603 | \\
604 | cookbook](https://cookbook.openai.com/examples/completions_usage_api)
605 |
606 | ### Set up your account for production
607 |
608 | On the OpenAI platform, we have the concept of tiers, going from 1 to 5. An organization in Tier 1 won’t be able to make the same number of requests per minute or send us the same amount of tokens per minute as an organization in Tier 5.
609 |
610 | Before going live, make sure your tier is set up to manage the expected production usage—you can check our rate limits in the guide below.
611 |
612 | Also make sure your billing limits are set up correctly, and your application is optimized and secure from an engineering standpoint.
613 | Our production best practices guide will walk you through how to make sure your application is setup for scale.
614 |
615 | [\\
616 | \\
617 | **Production best practices** \\
618 | \\
619 | Guide on best practices for running AI applications in production\\
620 | \\
621 | guide](https://platform.openai.com/docs/guides/production-best-practices) [\\
622 | \\
623 | **Rate limits guide** \\
624 | \\
625 | Guide to understanding and managing rate limits\\
626 | \\
627 | guide](https://platform.openai.com/docs/guides/rate-limits)
628 |
629 | ## Conclusion and next steps
630 |
631 | In this track, you:
632 |
633 | - Learned about core concepts such as agents and evals
634 | - Designed and deployed applications using the Responses API or Agents SDK and optionally incorporated some basic techniques like prompt engineering, fine-tuning, and RAG
635 | - Validated and safeguarded your solutions with evals and guardrails
636 | - Optimized for cost, latency, and long-term reliability in production
637 |
638 | This should give you the foundations to build your own AI applications and get them ready for production, taking ideas from concept to AI systems that can be deployed and scaled.
639 |
640 | ### Where to go next
641 |
642 | Keep building your expertise with our advanced track on [Model optimization](https://developers.openai.com/tracks/model-optimization), or directly explore resources on topics you’re curious about.
643 |
644 | ### Feedback
645 |
646 | [Share your feedback](https://docs.google.com/forms/d/e/1FAIpQLSdLbn7Tw1MxuwsSuoiNvyZt159rhNmDfg7swjYgKHzly4GlAQ/viewform?usp=sharing&ouid=108082195142646939431) on this track and suggest other topics you’d like us to cover.
```
--------------------------------------------------------------------------------
/docs/apps-sdk/_tracks_building-agents.txt:
--------------------------------------------------------------------------------
```
1 | ---
2 | url: "https://developers.openai.com/tracks/building-agents"
3 | title: "Building agents"
4 | ---
5 |
6 | ## Search the docs
7 |
8 | ⌘K/CtrlK
9 |
10 | Close
11 |
12 | Clear
13 |
14 | Primary navigation
15 |
16 | Resources
17 |
18 | ResourcesCodexChatGPTBlog
19 |
20 | Clear
21 |
22 | - [Home](https://developers.openai.com/)
23 |
24 | ### Categories
25 |
26 | - [Code](https://developers.openai.com/resources/code)
27 | - [Cookbooks](https://developers.openai.com/resources/cookbooks)
28 | - [Guides](https://developers.openai.com/resources/guides)
29 | - [Videos](https://developers.openai.com/resources/videos)
30 |
31 | ### Topics
32 |
33 | - [Agents](https://developers.openai.com/topics/agents)
34 | - [Audio & Voice](https://developers.openai.com/topics/audio)
35 | - [Image generation](https://developers.openai.com/topics/imagegen)
36 | - [Tools](https://developers.openai.com/topics/tools)
37 | - [Computer use](https://developers.openai.com/topics/cua)
38 | - [Fine-tuning](https://developers.openai.com/topics/fine-tuning)
39 | - [Scaling](https://developers.openai.com/topics/scaling)
40 |
41 | 
42 |
43 | ## Introduction
44 |
45 | You’ve probably heard of agents, but what does this term actually mean?
46 |
47 | Our simple definition is:
48 |
49 | > An AI system that has instructions (what it _should_ do), guardrails (what it _should not_ do), and access to tools (what it _can_ do) to take action on the user’s behalf
50 |
51 | Think of it this way: if you’re building a chatbot-like experience, where the AI system is answering questions, you can’t really call it an agent.
52 |
53 | If that system, however, is connected to other systems, and taking action based on the user’s input, that qualifies as an agent.
54 |
55 | Simple agents may use a handful of tools, and complex agentic systems may orchestrate multiple agents to work together.
56 |
57 | This learning track introduces you to the core concepts and practical steps required to build AI agents, as well as best practices to keep in mind when building these applications.
58 |
59 | ### What we will cover
60 |
61 | 1. **Core concepts**: how to choose the right models, and how to build the core logic
62 | 2. **Tools**: how to augment your agents with tools to enable them to retrieve data, execute tasks, and connect to external systems
63 | 3. **Orchestration**: how to build multi-step flows or networks of agents
64 | 4. **Example use cases**: practical implementations of different use cases
65 | 5. **Best practices**: how to implement guardrails, and next steps to consider
66 |
67 | The goal of this track is to provide you with a comprehensive overview, and invite you to dive deeper with the resources linked in each section.
68 | Some of these resources are code examples, allowing you to get started building quickly.
69 |
70 | ## Core concepts
71 |
72 | The OpenAI platform provides composable primitives to build agents: **models**, **tools**, **state/memory**, and **orchestration**.
73 |
74 | You can build powerful agentic experiences on our stack, with help in choosing the right models, augmenting your agents with tools, using different modalities (voice, vision, etc.), and evaluating and optimizing your application.
75 |
76 | [\\
77 | \\
78 | **Building agents guide** \\
79 | \\
80 | Official guide to building agents using the OpenAI platform.\\
81 | \\
82 | guide](https://platform.openai.com/docs/guides/agents)
83 |
84 | ### Choosing the right model
85 |
86 | Depending on your use case, you might need more or less powerful models.
87 | OpenAI offers a wide range of models, from cheap and fast to very powerful models that can handle complex tasks.
88 |
89 | #### Reasoning vs non‑reasoning models
90 |
91 | In late 2024, with our first reasoning model `o1`, we introduced a new concept: the ability for models to think things through before giving a final answer.
92 | That thinking is called a “chain of thought,” and it allows models to provide more accurate and reliable answers, especially when answering difficult questions.
93 |
94 | With reasoning, models have the ability to form hypotheses, then test and refine them before validating the final answer. This process results in higher quality outputs.
95 |
96 | Reasoning models trade latency and cost for reliability and often have adjustable levers (e.g., reasoning effort) that influence how hard the model “thinks.” Use a reasoning model when dealing with complex tasks, like planning, math, code generation, or multi‑tool workflows.
97 |
98 | Non‑reasoning models are faster and usually cheaper, which makes them great for chatlike user experiences (with lots of back-and-forth) and simpler tasks where latency matters.
99 |
100 | [\\
101 | \\
102 | **Reasoning guide** \\
103 | \\
104 | Overview of what reasoning is and how to prompt reasoning models\\
105 | \\
106 | guide](https://platform.openai.com/docs/guides/reasoning?api-mode=responses) [\\
107 | \\
108 | **Reasoning best practices** \\
109 | \\
110 | Prompting and optimization tips for reasoning models\\
111 | \\
112 | guide](https://platform.openai.com/docs/guides/reasoning-best-practices)
113 |
114 | #### How to choose
115 |
116 | Always start experimenting with a flagship, multi-purpose model—for example `gpt-4.1` or the new `gpt-5` with minimal `reasoning_effort`.
117 | If your use case is simple and requires fast responses, try `gpt-5-mini` or even `gpt-5-nano`.
118 | If your use case however is somewhat complex, you might want to try a reasoning model like `o4-mini` or `gpt-5` with medium `reasoning_effort`.
119 |
120 | As you try different options, be mindful of prompting strategies: you don’t prompt a reasoning model the same way you prompt a GPT model.
121 | So don’t just swap out the model name when you’re experimenting, try different prompts and see what works best—you can learn more in the evaluation section below.
122 |
123 | If you need more horsepower, you can use more powerful models, like `o3` or `gpt-5` with high `reasoning_effort`. However, if your application presents a conversational interface, we recommend having a faster model to chat back and forth with the user, and then delegating to a more powerful model to perform specific tasks.
124 |
125 | You can refer to our models page for more information on the different models available on the OpenAI platform, including information on performance, latency, capabilities, and pricing.
126 |
127 | Reasoning models and general-purpose models respond best to different kinds of
128 | prompts, and flagship models like `gpt-5` and `gpt-4.1` follow instructions
129 | differently. Check out our prompting guide below for how to get the best out
130 | of `gpt-5`.
131 |
132 | [\\
133 | \\
134 | **OpenAI models page** \\
135 | \\
136 | Overview of the models available on the OpenAI platform.\\
137 | \\
138 | guide](https://platform.openai.com/docs/models) [\\
139 | \\
140 | **GPT-5 prompting guide** \\
141 | \\
142 | Cookbook guide on how to maximize GPT-5's performance.\\
143 | \\
144 | cookbook](https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide)
145 |
146 | ### Building the core logic
147 |
148 | To get started building an agent, you have several options to choose from:
149 | We have multiple core APIs you can use to talk to our models, but our flagship API that was specifically designed for building powerful agents is the **Responses API**.
150 |
151 | When you’re building with the Responses API, you’re responsible for defining the core logic, and orchestrating the different parts of your application.
152 | If you want a higher level of abstraction, you can also use the **Agents SDK**, our framework to build and orchestrate agents.
153 |
154 | Which option you choose depends on personal preference: if you want to get started quickly or build networks of agents that work together, we recommend using the **Agents SDK**.
155 | If you want to have more control over the different parts of your application, and really understand what’s going on under the hood, you can use the **Responses API**.
156 | The Agents SDK is based on the Responses API, but you can also use it with other APIs and even external model providers if you choose. Think of it as another layer on top of the core APIs that makes it easier to build agentic applications.
157 | It abstracts away the complexity, but the trade-off is that it might be harder to have fine-grained control over the core logic.
158 | The Responses API is more flexible, but building with it requires more work to get started.
159 |
160 | #### Building with the Responses API
161 |
162 | The Responses API is our flagship core API to interact with our models.
163 | It was designed to work well with our latest models’ capabilities, notably reasoning models, and comes with a set of built-in tools to augment your agents.
164 | It’s a flexible foundation for building agentic applications.
165 |
166 | It’s also stateful by default, meaning you don’t have to manage the conversation history on your side.
167 | You can if your application requires it, but you can also rely on us to carry over the conversation history from one request to the next.
168 | This makes it easier to build conversations that handle conversation threads without having to store the full conversation state client-side.
169 | It’s especially helpful when you’re using tools that return large payloads as managing that context on your side can impact performance.
170 |
171 | You can get started building with the Responses API by cloning our starter app and customizing it to your needs.
172 |
173 | [\\
174 | \\
175 | **Responses guide** \\
176 | \\
177 | Introduction to the Responses API and its endpoints.\\
178 | \\
179 | guide](https://platform.openai.com/docs/api-reference/responses) [\\
180 | \\
181 | **Responses starter app** \\
182 | \\
183 | Starter application demonstrating OpenAI Responses API with tools.\\
184 | \\
185 | code](https://github.com/openai/openai-responses-starter-app)
186 |
187 | #### Building with the Agents SDK
188 |
189 | The Agents SDK is a lightweight framework that makes it easy to build single agents or orchestrate networks of agents.
190 |
191 | It takes care of the complexity of handling agent loops, has built-in support for guardrails (making sure your agents don’t do anything unsafe or wrong), and introduces the concept of tracing that allows to monitor your workflows.
192 | It works really well with our suite of optimization tools such as our evaluation tool, or our distillation and fine-tuning products.
193 | If you want to learn more about how to optimize your applications, you can check out our [optimization track](https://developers.openai.com/tracks/model-optimization).
194 |
195 | The Agents SDK repositories contain examples in JavaScript and Python to get started quickly, and you can learn more about it in the [Orchestration section](https://developers.openai.com/tracks/building-agents#orchestration) below.
196 |
197 | [\\
198 | \\
199 | **Agents SDK quickstart** \\
200 | \\
201 | Step-by-step guide to quickly build agents with the OpenAI Agents SDK.\\
202 | \\
203 | guide](https://openai.github.io/openai-agents-python/quickstart/) [\\
204 | \\
205 | **Agents SDK — Python** \\
206 | \\
207 | Python SDK for developing agents with OpenAI.\\
208 | \\
209 | code](https://github.com/openai/openai-agents-python) [\\
210 | \\
211 | **Agents SDK — TypeScript** \\
212 | \\
213 | TypeScript SDK for developing agents with OpenAI.\\
214 | \\
215 | code](https://github.com/openai/openai-agents-js)
216 |
217 | ### Augmenting your agents with tools
218 |
219 | Agents become useful when they can take action. And for them to be able to do that, you need to equip your agents with _tools_.
220 |
221 | Tools are functions that your agent can call to perform specific tasks. They can be used to retrieve data, execute tasks, or even interact with external systems.
222 | You can define any tools you want and tell the models how to use them using function calling, or you can rely on our offering of built-in tools - you can find out more about the tools available to you in the next section.
223 |
224 | Rule of thumb: If the capability already exists as a built‑in tool, start
225 | there. Move to function calling with your own functions when you need custom
226 | logic.
227 |
228 | ## Tools
229 |
230 | Explore how you can give your agents access to tools to enable actions like retrieving data, executing tasks, and connecting to external systems.
231 |
232 | There are two types of tools:
233 |
234 | - Custom tools that you define yourself, that the agent can call via function calling
235 | - Built-in tools provided by OpenAI, that you can use out-of-the-box
236 |
237 | ### Function calling vs built‑in tools
238 |
239 | Function calling happens in multiple steps:
240 |
241 | - First, you define what functions you want the model to use and which parameters are expected
242 | - Once the model is aware of the functions it can call, it can decide based on the conversation to call them with the corresponding parameters
243 | - When that happens, you need to execute the execution of the function on your side
244 | - You can then tell the model what the result of the function execution is by adding it to the conversation context
245 | - The model can then use this result to generate the next response
246 |
247 | 
248 |
249 | With built-in tools, you don’t need to handle the execution of the function on your side (except for the computer use tool, more on that below).
250 |
251 | When the model decides to use a built-in tool, it’s automatically executed, and the result is added to the conversation context without you having to do anything.
252 |
253 | In one conversation turn, you get output that already takes into account the tool result, since it’s executed on our infrastructure.
254 |
255 | [\\
256 | \\
257 | **Function calling guide** \\
258 | \\
259 | Introduction to function calling with OpenAI models.\\
260 | \\
261 | guide](https://platform.openai.com/docs/guides/function-calling) [\\
262 | \\
263 | **Built-in tools guide** \\
264 | \\
265 | Guide to using OpenAI's built-in tools with the Responses API.\\
266 | \\
267 | guide](https://platform.openai.com/docs/guides/tools?api-mode=responses) [\\
268 | \\
269 | **Build hour — agentic tool calling** \\
270 | \\
271 | Build hour giving an overview of agentic tool calling.\\
272 | \\
273 | video](https://webinar.openai.com/on-demand/d1a99ac5-8de8-43c5-b209-21903d76b5b2)
274 |
275 | ### Built‑in tools
276 |
277 | Built-in tools are an easy way to add capabilities to your agents, without having to build anything on your side.
278 | You can give the model access to external or internal data, the ability to generate code or images, or even the ability to use computer interfaces, with very low effort.
279 | There are a range of built-in tools you can choose from, each serving a specific purpose:
280 |
281 | - **Web search**: Search the web for up-to-date information
282 | - **File search**: Search across your internal knowledge base
283 | - **Code interpreter**: Let the model run python code
284 | - **Computer use**: Let the model use computer interfaces
285 | - **Image generation**: Generate images with our latest image generation model
286 | - **MCP**: Use any hosted [MCP](https://modelcontextprotocol.io/) server
287 |
288 | Read more about each tool and how you can use them below or check out our build hour showing web search, file search, code interpreter and the MCP tool in action.
289 |
290 | #### Web search
291 |
292 | LLMs know a lot about the world, but they have a cutoff date in their training data, which means they don’t know about anything that happened after that date.
293 | For example, `gpt-5` has a cutoff date of late September 2024. If you want your agents to know about recent events, you need to give them access to the web.
294 |
295 | With the **web search** tool, you can do this in one line of code. Simply add web search as a tool your agent can use, and that’s it.
296 |
297 | [\\
298 | \\
299 | **Web search guide** \\
300 | \\
301 | Guide to using web search with the Responses API.\\
302 | \\
303 | guide](https://platform.openai.com/docs/guides/tools-web-search)
304 |
305 | #### File search
306 |
307 | With **file search**, you can give your agent access to internal knowledge that it would not find on the web.
308 | If you have a lot of proprietary data, feeding everything into the agent’s instructions might result in poor performance.
309 | The more text you have in the input request, the slower (and more expensive) the request, and the agent could also get confused by all this information.
310 |
311 | Instead, you want to retrieve just the information you need when you need it, and feed it to the agent so that it can generate relevant responses.
312 | This process is called RAG (Retrieval-Augmented Generation), and it’s a very common technique used when building AI applications. However, there are many steps involved in building a robust RAG pipeline, and many parameters you need to think about:
313 |
314 | 1. First, you need to prepare the data to create your knowledge base. This means **pre-processing** the files that contain your knowledge, and often you’ll need to split them into smaller chunks.
315 | If you have very large PDF files for example, you want to chunk them into smaller pieces so that each chunk covers a specific topic.
316 |
317 | 2. Then, you need to **embed** the chunks and store them in a vector database. This conversion into a numerical representation is how we can later on use algorithms to find the chunks most similar to a given text.
318 | There are many vector databases to choose from, some are managed, some are self-hosted, but either way you would need to store the chunks somewhere.
319 |
320 | 3. Then, when you get an input request, you need to find the right chunks to give to the model to produce the best answer. This is the **retrieval** step.
321 | Once again, it is not that straightforward: you might need to process the input to make the search more relevant, then you might need to “re-rank” the results you get from the vector database to make sure you pick the best.
322 |
323 | 4. Finally, once you have the most relevant chunks, you can include them in the context you send to the model to generate the final answer.
324 |
325 |
326 | As you may have noticed, there is complexity involved with building a custom RAG pipeline, and it requires a lot of work to get right.
327 | The **file search** tool allows you to bypass that complexity and get started quickly.
328 | All you have to do is add your files to one of our managed vector stores, and we take care of the rest for you: we pre-process the files, embed them, and store them for later use.
329 |
330 | Then, you can add the file search tool to your application, specify which vector store to use, and that’s it: the model will automatically decide when to use it and how to produce a final response.
331 |
332 | [\\
333 | \\
334 | **File search guide** \\
335 | \\
336 | Guide to retrieving context from files using the Responses API.\\
337 | \\
338 | guide](https://platform.openai.com/docs/guides/tools-file-search)
339 |
340 | #### Code interpreter
341 |
342 | The **code interpreter** tool allows the model to come up with python code to solve a problem or answer a question, and execute it in a dedicated environment.
343 |
344 | LLMs are great with words, but sometimes the best way to get to a result is through code, especially when there are numbers involved. That’s when **code interpreter** comes in:
345 | it combines the power of LLMs for answer generation with the deterministic nature of code execution.
346 |
347 | It accepts file inputs, so for example you could provide the model with a spreadsheet export that it can manipulate and analyze through code.
348 |
349 | It can also generate files, for example charts or csv files that would be the output of the code execution.
350 |
351 | This can be a powerful tool for agents that need to manipulate data or perform complex analysis.
352 |
353 | [\\
354 | \\
355 | **Code interpreter guide** \\
356 | \\
357 | Guide to using the built-in code interpreter tool.\\
358 | \\
359 | guide](https://platform.openai.com/docs/guides/tools-code-interpreter)
360 |
361 | #### Computer use
362 |
363 | The **computer use** tool allows the model to perform actions on computer interfaces like a human would.
364 | For example, it can navigate to a website, click on buttons or fill in forms.
365 |
366 | This tool works a little differently: unlike with the other built-in tools, the tool result can’t be automatically appended to the conversation history, because we need to wait for the action to be executed to see what the next step should be.
367 |
368 | So similarly to function calling, this tool call comes with parameters that define suggested actions: “click on this position”, “scroll by that amount”, etc.
369 | It is then up to you to execute the action on your environment, either a virtual computer or a browser, and then send an update in the form of a screenshot.
370 | The model can then assess what it should do next based on the visual interface, and may decide to perform another computer use call with the next action.
371 |
372 | 
373 |
374 | This can be useful if your agent needs to use services that don’t necessarily have an API available, or to automate processes that would normally be done by humans.
375 |
376 | [\\
377 | \\
378 | **Computer Use API guide** \\
379 | \\
380 | Guide to using the Computer Use API (CUA).\\
381 | \\
382 | guide](https://platform.openai.com/docs/guides/tools-computer-use) [\\
383 | \\
384 | **Computer Use API — starter app** \\
385 | \\
386 | Sample app showcasing Computer Use API integration.\\
387 | \\
388 | code](https://github.com/openai/openai-cua-sample-app)
389 |
390 | #### Image generation
391 |
392 | The **image generation** tool allows the model to generate images based on a text prompt.
393 |
394 | It is based on our latest image generation model, **GPT-Image**, which is a state-of-the-art model for image generation with world knowledge.
395 |
396 | This is a powerful tool for agents that need to generate images within a conversation, for example to create a visual summary of the conversation, edit user-provided images, or generate and iterate on images with a lot of context.
397 |
398 | [\\
399 | \\
400 | **Image generation guide** \\
401 | \\
402 | Guide to generating images using OpenAI models.\\
403 | \\
404 | guide](https://platform.openai.com/docs/guides/image-generation?image-generation-model=gpt-image-1) [\\
405 | \\
406 | **ImageGen cookbook** \\
407 | \\
408 | Cookbook examples for generating images with GPT-Image.\\
409 | \\
410 | cookbook](https://cookbook.openai.com/examples/generate_images_with_gpt_image) [\\
411 | \\
412 | **ImageGen with high fidelity cookbook** \\
413 | \\
414 | Cookbook examples for generating images with high fidelity using GPT-Image.\\
415 | \\
416 | cookbook](https://cookbook.openai.com/examples/generate_images_with_gpt_image)
417 |
418 | ## Orchestration
419 |
420 | **Orchestration** is the concept of handling multiple steps, tool use, handoffs between different agents, guardrails, and context.
421 | Put simply, it’s how you manage the conversation flow.
422 |
423 | For example, in reaction to a user input, you might need to perform multiple steps to generate a final answer, each step feeding into the next.
424 | You might also have a lot of complexity in your use case requiring a separation of concerns, and to do that you need to define multiple agents that work in concert.
425 |
426 | If you’re building with the **Responses API**, you can manage this entirely on your side, maintaining state and context across steps, switching between models and instructions appropriately, etc.
427 | However, for orchestration we recommend relying on the **Agents SDK**, which provides a set of primitives to help you easily define networks of agents, inject guardrails, define context, and more.
428 |
429 | ### Foundations of the Agents SDK
430 |
431 | The Agents SDK uses a few core primitives:
432 |
433 | | Primitive | What it is |
434 | | --- | --- |
435 | | Agent | model + instructions + tools |
436 | | Handoff | other agent the current agent can hand off to |
437 | | Guardrail | policy to filter out unwanted inputs |
438 | | Session | automatically maintains conversation history across agent runs |
439 |
440 | Each of these primitives is an abstraction allowing you to build faster, as the complexity that comes with handling these aspects is handled for you.
441 |
442 | For example, the Agents SDK automatically handles:
443 |
444 | - **Agent loop**: calling tools and executing function calls over multiple turns if needed
445 | - **Handoffs**: switching instructions, models and available tools based on conversation state
446 | - **Guardrails**: running inputs through filters to stop the generation if required
447 |
448 | In addition to these features, the Agents SDK has built-in support for tracing, which allows you to monitor and debug your agents workflows.
449 | Without any additional code, you can understand what happened: which tools were called, which agents were used, which guardrails were triggered, etc.
450 | This allows you to iterate on your agents quickly and efficiently.
451 |
452 | 
453 |
454 | To try practical examples with the Agents SDK, check out our examples in the repositories below.
455 |
456 | [\\
457 | \\
458 | **Agents SDK quickstart** \\
459 | \\
460 | Step-by-step guide to quickly build agents with the OpenAI Agents SDK.\\
461 | \\
462 | guide](https://openai.github.io/openai-agents-python/quickstart/) [\\
463 | \\
464 | **Agents SDK — Python** \\
465 | \\
466 | Python SDK for developing agents with OpenAI.\\
467 | \\
468 | code](https://github.com/openai/openai-agents-python) [\\
469 | \\
470 | **Agents SDK — TypeScript** \\
471 | \\
472 | TypeScript SDK for developing agents with OpenAI.\\
473 | \\
474 | code](https://github.com/openai/openai-agents-js)
475 |
476 | ### Multi-agent collaboration
477 |
478 | In some cases, your application might benefit from having not just one, but multiple agents working together.
479 |
480 | This shouldn’t be your go-to solution, but something you might consider if you have separate tasks that do not overlap and if for one or more of those tasks you have:
481 |
482 | - Very complex or long instructions
483 | - A lot of tools (or similar tools across tasks)
484 |
485 | For example, if you have for each task several tools to retrieve, update or create data, but these actions work differently depending on the task, you don’t want to group all of these tools and give them to one agent.
486 | The agent could get confused and use the tool meant for task A when the user needs the tool for task B.
487 |
488 | Instead, you might want to have a separate agent for each task, and a “routing” agent that is the main interface for the user. Once the routing agent has determined which task to perform, it can hand off to the appropriate agent than can use the right tool for the task.
489 |
490 | Similarly, if you have a task that has very complex instructions, or that needs to use a model with high reasoning power, you might want to have a separate agent for that task that is only called when needed, and use a faster, cheaper model for the main agent.
491 |
492 | [\\
493 | \\
494 | **Orchestrating multiple agents** \\
495 | \\
496 | Guide to coordinating multiple agents with shared context.\\
497 | \\
498 | guide](https://openai.github.io/openai-agents-python/multi_agent/) [\\
499 | \\
500 | **Portfolio collab with Agents SDK** \\
501 | \\
502 | Cookbook example of agents collaborating to manage a portfolio.\\
503 | \\
504 | cookbook](https://cookbook.openai.com/examples/agents_sdk/multi-agent-portfolio-collaboration/multi_agent_portfolio_collaboration) [\\
505 | \\
506 | **Unlock agentic power — Agents SDK** \\
507 | \\
508 | Video demonstrating advanced capabilities of the Agents SDK.\\
509 | \\
510 | video](https://vimeo.com/1105245234)
511 |
512 | ### Multi‑agent collaboration
513 |
514 | Why multiple agents instead of one mega‑prompt?
515 |
516 | - **Separation of concerns**: Research vs. drafting vs. QA
517 | - **Parallelism**: Faster end‑to‑end execution of tasks
518 | - **Focused evals**: Score agents differently, depending on their scoped goals
519 |
520 | Use **agent‑as‑tool** (expose one agent as a callable tool for another) and share memory keyed by `conversation_id`.
521 |
522 | ## Example use cases
523 |
524 | There are many different use cases for agents, some that require a conversational interface, some where the agents are meant to be deeply integrated in an application.
525 |
526 | For example, some agents use structured data as an input, others a simple query to trigger a series of actions before generating a final output.
527 |
528 | Depending on the use case, you might want to optimize for different things—for example:
529 |
530 | - **Speed**: if the user is interacting back and forth with the agent
531 | - **Reliability**: if the agent is meant to tackle complex tasks and come out with a final, optimized output
532 | - **Cost**: if the agent is meant to be used frequently and at scale
533 |
534 | We have compiled a few example applications below you can use as starting points, each covering different interaction patterns:
535 |
536 | - **Support agent**: a simple support agent built on top of the Responses API, with a “human in the loop” angle—the agent is meant to be used by a human that can accept or reject the agent’s suggestions
537 | - **Customer service agent**: a network of multiple agents working together to handle a customer request, built with the Agents SDK
538 | - **Frontend testing agent**: a computer using agent that requires a single user input to test a frontend application
539 |
540 | [\\
541 | \\
542 | **Support agent demo** \\
543 | \\
544 | Demo showing a customer support agent with a human in the loop.\\
545 | \\
546 | code](https://github.com/openai/openai-support-agent-demo) [\\
547 | \\
548 | **CS agents demo** \\
549 | \\
550 | Demo showcasing customer service agents orchestration.\\
551 | \\
552 | code](https://github.com/openai/openai-cs-agents-demo) [\\
553 | \\
554 | **Frontend testing demo** \\
555 | \\
556 | Demo application for frontend testing using CUA.\\
557 | \\
558 | code](https://github.com/openai/openai-testing-agent-demo)
559 |
560 | ## Best practices
561 |
562 | When you build agents, keep in mind that they might be unpredictable—that’s the nature of LLMs.
563 |
564 | There are a few things you can do to make your agents more reliable, but it depends on what you are building and for whom.
565 |
566 | ### User inputs
567 |
568 | If your agent accepts user inputs, you might want to include guardrails to make sure it can’t be jailbreaked or you don’t incur costs processing irrelevant inputs
569 | Depending on the tools you use, the level of risk you are willing to take, and the scale of your application, you can implement more or less robust guardrails.
570 | It can be as simple as something to include in your prompt (for example “don’t answer any question unrelated to X, Y or Z”) or as complex as a full-fledged multi-step guardrail system.
571 |
572 | ### Model outputs
573 |
574 | A good practice is to use **structured outputs** whenever you want to use the model’s output as part of your application instead of simply displaying it to the user.
575 | Structured outputs are a way to constrain the model to a strict json schema, so you always know what the output shape will be.
576 |
577 | If your agent is user-facing, once again depending on the level of risk you’re comfortable with, you might want to implement output guardrails to make sure the output doesn’t break any rules (for example, if you’re a car company, you don’t want the model to tell customers they can buy your car for $1 and it’s contractually binding).
578 |
579 | [\\
580 | \\
581 | **Structured outputs guide** \\
582 | \\
583 | Guide for producing structured outputs with the Responses API.\\
584 | \\
585 | guide](https://platform.openai.com/docs/guides/structured-outputs?api-mode=responses)
586 |
587 | ### Optimizing for production
588 |
589 | If you plan to ship your agent to production, there are additional things to consider—you might want to optimize costs and latency, or monitor your agent to make sure it performs well.
590 | To learn about these topics, you can check out our [AI application development track](https://developers.openai.com/tracks/ai-application-development).
591 |
592 | ## Conclusion and next steps
593 |
594 | In this track you:
595 |
596 | - Learned about the core concepts behind agents and how to build them
597 | - Gained practical experience with the Responses API and the Agents SDK
598 | - Discovered our built-in tools offering
599 | - Learned about agent orchestration and multi-agent networks
600 | - Explored example use cases
601 | - Learned about best practices to manage user inputs and model outputs
602 |
603 | These are the foundations to build your own agentic applications.
604 |
605 | As a next step, you can learn how to deploy them in production with our [AI application development track](https://developers.openai.com/tracks/ai-application-development).
```