# Vercel AI SDK

The Vercel AI SDK is a comprehensive TypeScript toolkit for building AI-powered applications with language models. It provides a unified interface for interacting with multiple AI providers (OpenAI, Anthropic, Google, and 30+ others) and offers framework-agnostic hooks for React, Vue, Svelte, and Angular. The SDK handles streaming, tool calling, structured output generation, and agentic workflows, with built-in support for multi-step reasoning and complex interactions.

The SDK consists of three main layers: the Core AI module (`ai`) for server-side model interactions, framework-specific UI modules (`@ai-sdk/react`, `@ai-sdk/vue`, etc.) for building chat interfaces, and provider packages (`@ai-sdk/openai`, `@ai-sdk/anthropic`, etc.) for model access. It supports both streaming and non-streaming generation, automatic tool execution with approval workflows, structured data extraction using Zod schemas, stateful agent systems that can execute multi-step tasks autonomously, and multimedia capabilities including image generation, text-to-speech, audio transcription, and document reranking.

## generateText - Generate text with tool calls

Generate text responses from language models with automatic tool calling and multi-step execution. Returns the complete response after all tool calls have been executed.
```typescript
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const result = await generateText({
  model: openai('gpt-4-turbo'),
  prompt: 'What is the weather in San Francisco and what should I wear?',
  tools: {
    getWeather: {
      description: 'Get the weather for a location',
      parameters: z.object({
        city: z.string().describe('The city name')
      }),
      execute: async ({ city }) => {
        // API call to weather service
        return { temperature: 72, condition: 'sunny' };
      }
    }
  },
  maxRetries: 2,
  temperature: 0.7
});

console.log(result.text);      // Final text after tool execution
console.log(result.toolCalls); // All tool calls made
console.log(result.usage);     // Token usage statistics
console.log(result.steps);     // All generation steps
```

## streamText - Stream text with real-time tool execution

Stream text generation with real-time tool calling and event callbacks. Returns a stream result with multiple consumption methods.

```typescript
import { streamText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { z } from 'zod';

const result = streamText({
  model: anthropic('claude-3-5-sonnet-20241022'),
  system: 'You are a helpful assistant with access to real-time data.',
  prompt: 'Search for recent news about AI and summarize the top 3 articles.',
  tools: {
    searchWeb: {
      description: 'Search the web for information',
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => {
        // Perform web search
        return { results: ['Article 1...', 'Article 2...', 'Article 3...'] };
      }
    }
  },
  onChunk: async ({ chunk }) => {
    if (chunk.type === 'text-delta') {
      process.stdout.write(chunk.text);
    }
  },
  onFinish: async ({ text, toolCalls, usage, steps }) => {
    console.log('\n\nGeneration complete');
    console.log('Total steps:', steps.length);
    console.log('Total tokens:', usage.totalTokens);
  }
});

// Multiple ways to consume the stream
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

// Or get the full result
const { text, toolResults } = await result;
```

## generateObject - Extract structured data

Generate type-safe structured objects from language models using Zod schemas. Automatically validates and parses model output.

```typescript
import { generateObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const result = await generateObject({
  model: openai('gpt-4-turbo'),
  schema: z.object({
    recipe: z.object({
      name: z.string(),
      ingredients: z.array(z.object({
        name: z.string(),
        amount: z.string(),
        unit: z.string()
      })),
      steps: z.array(z.string()),
      prepTime: z.number().describe('Preparation time in minutes'),
      cookTime: z.number().describe('Cooking time in minutes')
    })
  }),
  prompt: 'Generate a vegetarian lasagna recipe for 4 people.',
  mode: 'json', // 'auto', 'json', or 'tool'
  temperature: 0.3
});

console.log(result.object.recipe.name);
console.log(result.object.recipe.ingredients);
console.log(result.usage);
```

## streamObject - Stream structured data

Stream partial structured objects as they are generated. Enables progressive UI updates while maintaining type safety.

```typescript
import { streamObject } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const result = streamObject({
  model: openai('gpt-4-turbo'),
  schema: z.object({
    characters: z.array(z.object({
      name: z.string(),
      class: z.string(),
      bio: z.string()
    }))
  }),
  prompt: 'Generate 3 fantasy RPG characters with detailed backgrounds.'
});

// Stream partial objects
for await (const partialObject of result.partialObjectStream) {
  console.clear();
  console.log('Current progress:', JSON.stringify(partialObject, null, 2));
}

// Get the final validated object
const { object } = await result;
console.log('Final result:', object);
```

## ToolLoopAgent - Autonomous multi-step agents

Create reusable agents that can execute multi-step workflows with tools. ToolLoopAgent automatically handles tool calling loops and can be reused across your application.
```typescript
import { ToolLoopAgent } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const researchAgent = new ToolLoopAgent({
  model: openai('gpt-4-turbo'),
  id: 'research-agent',
  instructions: 'You are a research assistant that can search the web and analyze data.',
  tools: {
    searchWeb: {
      description: 'Search the web for information',
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => {
        // Perform search
        return { results: ['...'] };
      }
    },
    analyzeData: {
      description: 'Analyze data and provide insights',
      parameters: z.object({ data: z.array(z.string()) }),
      execute: async ({ data }) => {
        // Perform analysis
        return { insights: '...' };
      }
    }
  },
  stopWhen: async ({ steps }) =>
    steps.length >= 10 || steps.at(-1)?.finishReason === 'stop',
  maxOutputTokens: 4096
});

// Use the agent (non-streaming)
const result = await researchAgent.generate({
  prompt: 'Research the latest developments in quantum computing and summarize key breakthroughs.'
});
console.log(result.text);
console.log(result.steps.length, 'steps executed');

// Or stream responses
const stream = researchAgent.stream({
  prompt: 'What are the current applications of quantum computing in cryptography?'
});
for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}
```

## useChat - React chat interface

React hook for building chat UIs with streaming responses and tool invocations. Manages message state and handles user interactions.

```tsx
'use client';

import { useChat } from '@ai-sdk/react';

export default function ChatComponent() {
  const { messages, status, sendMessage, stop, addToolResult } = useChat({
    api: '/api/chat',
    initialMessages: [
      {
        id: '1',
        role: 'system',
        content: 'You are a helpful assistant.'
      }
    ],
    onFinish: (message) => {
      console.log('Message complete:', message);
    },
    onError: (error) => {
      console.error('Chat error:', error);
    }
  });

  const handleSubmit = (e: React.FormEvent<HTMLFormElement>) => {
    e.preventDefault();
    const formData = new FormData(e.currentTarget);
    const input = formData.get('message') as string;
    sendMessage({ text: input });
    e.currentTarget.reset();
  };

  // The markup below is illustrative; adapt structure and styling to your app.
  return (
    <div>
      {messages.map(message => (
        <div key={message.id}>
          {message.parts.map((part, i) => {
            switch (part.type) {
              case 'text':
                return <div key={i}>{part.text}</div>;
              case 'tool-image_generation':
                if (part.state === 'output-available') {
                  // The shape of part.output depends on your tool definition
                  return <img key={i} src={part.output as string} alt="Generated image" />;
                }
                return <div key={i}>Generating image...</div>;
              default:
                return null;
            }
          })}
        </div>
      ))}
      {status === 'in-progress' && (
        <button onClick={stop}>Stop</button>
      )}
      <form onSubmit={handleSubmit}>
        <input name="message" placeholder="Say something..." />
        <button type="submit">Send</button>
      </form>
    </div>
  );
}
```

## Chat API Route - Next.js App Router

Server-side chat endpoint that streams responses to the client. Uses agents with tool calling for complex interactions.

```typescript
// app/api/chat/route.ts
import { ToolLoopAgent, createAgentUIStreamResponse } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const chatAgent = new ToolLoopAgent({
  model: openai('gpt-4-turbo'),
  instructions: 'You are a helpful assistant that can search the web and perform calculations.',
  tools: {
    search: {
      description: 'Search for information',
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => {
        // Implement search
        return { results: ['...'] };
      }
    },
    calculate: {
      description: 'Perform a calculation',
      parameters: z.object({ expression: z.string() }),
      execute: async ({ expression }) => {
        // Demo only: eval executes arbitrary code. Use a proper
        // expression parser in production.
        return { result: eval(expression) };
      }
    }
  }
});

export async function POST(req: Request) {
  const { messages } = await req.json();

  return createAgentUIStreamResponse({
    agent: chatAgent,
    messages,
  });
}
```

## Tool Approval Workflow - User-controlled tool execution

Implement approval flows for sensitive tool operations. The user can approve or deny each tool call before execution.
```typescript
import { streamText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const result = streamText({
  model: openai('gpt-4-turbo'),
  prompt: 'Delete the file named old-data.txt',
  tools: {
    deleteFile: {
      description: 'Delete a file from the filesystem',
      parameters: z.object({ filename: z.string() }),
      requiresApproval: async ({ input }) => {
        // Require approval for deletions
        return true;
      },
      execute: async ({ filename }) => {
        // Delete file
        return { success: true };
      }
    }
  }
});

// Handle tool approval requests
// (askUser is your own UI prompt helper)
for await (const chunk of result.fullStream) {
  if (chunk.type === 'tool-approval-request') {
    const userApproved = await askUser(
      `Approve deletion of ${chunk.toolCall.input.filename}?`
    );

    if (userApproved) {
      await result.addToolApprovalResponse({
        approvalId: chunk.approvalId,
        approved: true
      });
    } else {
      await result.addToolApprovalResponse({
        approvalId: chunk.approvalId,
        approved: false,
        reason: 'User denied permission'
      });
    }
  }
}
```

## embed - Generate text embeddings

Generate vector embeddings for text using various embedding models. Useful for semantic search and similarity matching.

```typescript
import { embed } from 'ai';
import { openai } from '@ai-sdk/openai';

const result = await embed({
  model: openai.embedding('text-embedding-3-small'),
  value: 'The quick brown fox jumps over the lazy dog.',
  dimensions: 512 // Optional: reduce dimensionality
});

console.log(result.embedding); // Float array of embedding values
console.log(result.usage);     // Token usage
```

## embedMany - Batch embedding generation

Generate embeddings for multiple texts efficiently with automatic batching and retry handling.

```typescript
import { embedMany, cosineSimilarity } from 'ai';
import { openai } from '@ai-sdk/openai';

const texts = [
  'Artificial intelligence is transforming technology.',
  'Machine learning models require large datasets.',
  'Natural language processing enables human-computer interaction.'
];

const result = await embedMany({
  model: openai.embedding('text-embedding-3-small'),
  values: texts,
  maxRetries: 2
});

console.log(result.embeddings); // Array of embedding arrays
console.log(result.usage);      // Total token usage

// Calculate cosine similarity between embeddings
const similarity = cosineSimilarity(result.embeddings[0], result.embeddings[1]);
console.log('Similarity:', similarity);
```

## generateImage - Generate images from text

Generate images using image generation models like DALL-E, Stable Diffusion, and others. Supports multiple providers and advanced configuration.

```typescript
import { generateImage } from 'ai';
import { openai } from '@ai-sdk/openai';
import fs from 'fs';

const result = await generateImage({
  model: openai.image('dall-e-3'),
  prompt: 'A serene mountain landscape at sunset with a lake in the foreground',
  n: 1,                // Number of images
  size: '1024x1024',
  aspectRatio: '16:9', // Alternative to size
  seed: 12345,         // For reproducibility
  providerOptions: {
    openai: {
      style: 'vivid',
      quality: 'hd'
    }
  }
});

console.log(result.images);               // Array of generated images
console.log(result.images[0].base64);     // Base64-encoded image data
console.log(result.images[0].uint8Array); // Raw image bytes

// Save image to file
fs.writeFileSync('output.png', result.images[0].uint8Array);
```

## generateSpeech - Convert text to speech

Generate speech audio from text using text-to-speech models. Supports multiple voices, languages, and audio formats.

```typescript
import { generateSpeech } from 'ai';
import { openai } from '@ai-sdk/openai';
import fs from 'fs';

const result = await generateSpeech({
  model: openai.speech('tts-1'),
  text: 'Hello, welcome to the AI SDK. This is a demonstration of text-to-speech capabilities.',
  voice: 'alloy',      // Voice selection
  outputFormat: 'mp3', // 'mp3' | 'wav' | 'opus' | 'aac' | 'flac'
  speed: 1.0,          // Speech speed (0.25 to 4.0)
  language: 'en',      // ISO 639-1 language code
  instructions: 'Speak in a friendly and enthusiastic tone'
});

console.log(result.audio);            // Audio file object
console.log(result.audio.uint8Array); // Raw audio bytes
console.log(result.warnings);         // Any warnings from generation

// Save audio to file
fs.writeFileSync('output.mp3', result.audio.uint8Array);
```

## transcribe - Transcribe audio to text

Transcribe audio files to text using speech recognition models like Whisper. Supports various audio formats and returns detailed transcription data.

```typescript
import { transcribe } from 'ai';
import { openai } from '@ai-sdk/openai';
import fs from 'fs';

// Transcribe from file
const audioData = fs.readFileSync('recording.mp3');

const result = await transcribe({
  model: openai.transcription('whisper-1'),
  audio: audioData,
  language: 'en', // Optional: specify source language
  prompt: 'This is a technical discussion about AI.' // Optional: context hint
});

console.log(result.text);     // Full transcription
console.log(result.segments); // Timestamped segments
console.log(result.language); // Detected language
console.log(result.duration); // Audio duration in seconds

// Access timestamped segments
result.segments?.forEach(segment => {
  console.log(`[${segment.start}s - ${segment.end}s]: ${segment.text}`);
});
```

## rerank - Rerank documents by relevance

Rerank a list of documents based on their relevance to a query using specialized reranking models. More accurate than simple embedding similarity for search and retrieval.

```typescript
import { rerank } from 'ai';
import { cohere } from '@ai-sdk/cohere';

const documents = [
  { id: '1', text: 'Machine learning is a subset of artificial intelligence.' },
  { id: '2', text: 'Paris is the capital city of France.' },
  { id: '3', text: 'Neural networks are inspired by the human brain.' },
  { id: '4', text: 'The Eiffel Tower is located in Paris.' },
  { id: '5', text: 'Deep learning uses multiple layers of neural networks.' }
];

const result = await rerank({
  model: cohere.reranker('rerank-english-v3.0'),
  query: 'What is artificial intelligence and machine learning?',
  documents: documents.map(doc => doc.text),
  topN: 3 // Return the top 3 most relevant documents
});

console.log(result.rankings); // Ranked indices with relevance scores

result.rankings.forEach(ranking => {
  console.log(`Rank ${ranking.index}: Score ${ranking.relevanceScore}`);
  console.log(`Document: ${documents[ranking.index].text}\n`);
});

// Access reranked documents directly
console.log(result.rankedDocuments);
```

## Provider Configuration - Multiple AI providers

Configure and use multiple AI providers in the same application. The SDK provides a unified interface across all providers; by default, each provider reads its API key from an environment variable (e.g. `OPENAI_API_KEY`).

```typescript
import { generateText } from 'ai';
import { openai, createOpenAI } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';
import { google } from '@ai-sdk/google';

// OpenAI (reads OPENAI_API_KEY from the environment)
const openaiResult = await generateText({
  model: openai('gpt-4-turbo'),
  prompt: 'Explain quantum computing.'
});

// Anthropic (reads ANTHROPIC_API_KEY)
const anthropicResult = await generateText({
  model: anthropic('claude-3-5-sonnet-20241022'),
  prompt: 'Explain quantum computing.'
});

// Google Gemini
const googleResult = await generateText({
  model: google('gemini-1.5-pro'),
  prompt: 'Explain quantum computing.'
});

// OpenAI-compatible providers (Groq, Together, etc.)
const groq = createOpenAI({
  apiKey: process.env.GROQ_API_KEY,
  baseURL: 'https://api.groq.com/openai/v1'
});

const groqResult = await generateText({
  model: groq('llama-3.1-70b-versatile'),
  prompt: 'Explain quantum computing.'
});

// Or use Vercel AI Gateway for unified access
const gatewayResult = await generateText({
  model: 'openai/gpt-4-turbo', // Gateway handles routing
  prompt: 'Explain quantum computing.'
});
```

## Multi-step Reasoning with Callbacks

Track and control multi-step generation processes with detailed callbacks for each step. Useful for debugging and monitoring agent behavior.

```typescript
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

const result = await generateText({
  model: openai('gpt-4-turbo'),
  prompt: 'Research the history of the internet and create a timeline.',
  tools: {
    search: {
      description: 'Search for information',
      parameters: z.object({ query: z.string() }),
      execute: async ({ query }) => ({ results: ['...'] })
    }
  },
  stopWhen: ({ steps }) => steps.length >= 5,
  onStepFinish: async (stepResult) => {
    console.log('\n--- Step finished ---');
    console.log('Finish reason:', stepResult.finishReason);
    console.log('Tool calls:', stepResult.toolCalls?.length || 0);
    console.log('Tokens used:', stepResult.usage.totalTokens);

    if (stepResult.toolCalls) {
      stepResult.toolCalls.forEach(call => {
        console.log(`Tool: ${call.toolName}`, call.input);
      });
    }
  },
  onFinish: async ({ steps, totalUsage, text }) => {
    console.log('\n=== Generation Complete ===');
    console.log('Total steps:', steps.length);
    console.log('Total tokens:', totalUsage.totalTokens);
    console.log('Final output length:', text.length);
  }
});

console.log('\nFinal result:', result.text);
```

## Structured Output with Output Helpers

Generate structured outputs with helper functions for arrays, choices, and unstructured JSON. Simplifies schema definition and provides better type safety.
```typescript
import { generateText, generateObject, Output } from 'ai';
import { openai } from '@ai-sdk/openai';
import { z } from 'zod';

// Generate an array of objects
const arrayResult = await generateText({
  model: openai('gpt-4-turbo'),
  output: Output.array({
    element: z.object({
      name: z.string(),
      email: z.string().email(),
      role: z.enum(['admin', 'user', 'guest'])
    })
  }),
  prompt: 'Generate 5 sample user profiles.'
});
console.log(arrayResult.output); // Array of user objects

// Generate an enum/choice
const choiceResult = await generateText({
  model: openai('gpt-4-turbo'),
  output: Output.choice({
    options: ['positive', 'negative', 'neutral']
  }),
  prompt: 'Analyze the sentiment of: "This product is amazing!"'
});
console.log(choiceResult.output); // 'positive' | 'negative' | 'neutral'

// Generate unstructured JSON (no schema required)
const jsonResult = await generateText({
  model: openai('gpt-4-turbo'),
  output: Output.json(),
  prompt: 'Generate a flexible JSON object with user data and metadata.'
});
console.log(jsonResult.output); // Any JSON value

// Traditional object schemas still work with generateObject
const schemaResult = await generateObject({
  model: openai('gpt-4-turbo'),
  schema: z.object({
    users: z.array(z.object({
      name: z.string(),
      email: z.string().email(),
      age: z.number().min(0).max(120)
    }))
  }),
  prompt: 'Generate 3 user profiles.'
});
console.log(schemaResult.object.users);
```

## Summary

The Vercel AI SDK provides comprehensive tools for building production-ready AI applications with type safety, streaming support, and multi-provider compatibility. The core `generateText` and `streamText` functions handle text generation with automatic tool calling and multi-step reasoning, enabling complex agentic workflows. For structured data extraction, `generateObject` and `streamObject` parse LLM outputs into type-safe objects using Zod schemas with validation.

The `ToolLoopAgent` class encapsulates reusable AI behaviors with tools and instructions, making it easy to create specialized assistants that execute multi-step workflows autonomously. Framework integration is seamless through UI hooks such as `useChat` for React, with equivalents in `@ai-sdk/vue` and `@ai-sdk/svelte`, which manage message state, handle streaming, and provide loading indicators. The SDK supports 30+ AI providers through a unified interface, including OpenAI, Anthropic, Google, Azure, AWS Bedrock, and open-source models via OpenAI-compatible endpoints.

Advanced features include `createAgentUIStreamResponse` for streaming agent responses to chat interfaces, tool approval workflows for sensitive operations, embedding generation for semantic search with `embed` and `embedMany`, document reranking with `rerank` for improved search relevance, and multimedia capabilities with `generateImage` for image generation, `generateSpeech` for text-to-speech, and `transcribe` for audio transcription. Structured output generation is enhanced with `Output.json()` for flexible JSON without schemas, `Output.array()` for typed arrays, and `Output.choice()` for enum-based selections.

The SDK also offers reasoning output tracking for advanced models, custom retry logic with exponential backoff, telemetry integration with OpenTelemetry, and comprehensive error handling with typed exceptions. Every function returns detailed usage statistics, finish reasons, and provider metadata for observability and cost tracking.
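The `maxRetries` option shown throughout these examples handles transient provider failures for you. As a rough illustration of the exponential-backoff pattern involved, here is a minimal standalone sketch; the `withRetries` helper is hypothetical and not part of the SDK API:

```typescript
// Hypothetical helper illustrating retry with exponential backoff.
// Not part of the AI SDK: prefer the built-in maxRetries option.
async function withRetries<T>(
  fn: () => Promise<T>,
  { maxRetries = 2, baseDelayMs = 100 }: { maxRetries?: number; baseDelayMs?: number } = {}
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt === maxRetries) break;
      // Exponential backoff: baseDelayMs, 2x, 4x, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}

// Example: a flaky call that fails twice before succeeding
let attempts = 0;
const flaky = async () => {
  attempts++;
  if (attempts < 3) throw new Error('Transient failure');
  return 'ok';
};

withRetries(flaky, { maxRetries: 3, baseDelayMs: 10 }).then(result => {
  console.log(result, 'after', attempts, 'attempts'); // → "ok after 3 attempts"
});
```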