
Using OpenAI Agents SDK with DataDoe MCP

This tutorial shows how to build a small TypeScript agent with the OpenAI Agents JS SDK and connect it to DataDoe MCP. The sample app does one practical job: it requests DataDoe advertising data, downloads the export file, loads it into DuckDB, and writes a JSON file with improved Amazon listing copy.

What you will build

By the end, you will have a simple agent that:

  • connects to DataDoe MCP with hostedMcpTool
  • downloads an export file before the URL expires
  • loads the data into DuckDB and runs SQL over it
  • produces one final JSON file in response/

Prerequisites

Step 1: Clone and install the repository

Start by installing the project:

bash
git clone https://github.com/Deltologic/datadoe-mcp-openai-agents-sdk.git
cd datadoe-mcp-openai-agents-sdk
yarn install

Step 2: Configure your keys

Copy the environment file and add both API keys:

bash
cp .env.example .env

Open .env and set:

The agent already knows how to reach DataDoe MCP. In src/index.ts, it uses:

ts
hostedMcpTool({
    serverLabel: 'datadoe',
    serverUrl: 'https://api.datadoe.com/mcp/v1',
    headers: {
        'datadoe-mcp-key': process.env.DATADOE_MCP_KEY ?? ''
    }
});

That is the only MCP-specific wiring the app needs. The agent gets the DataDoe tools through the SDK, and the key is sent as a request header.
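Because the snippet above falls back to an empty string, a missing key only shows up later as a failed MCP request. A minimal sketch of a fail-fast alternative follows. The helper name buildDatadoeHeaders is hypothetical and not part of the sample repository; only the header name and the DATADOE_MCP_KEY variable come from the code above.

```typescript
// Hypothetical helper (not in the sample repo): builds the MCP request
// headers and throws early when the key is missing or blank.
function buildDatadoeHeaders(env: Record<string, string | undefined>): Record<string, string> {
    const key = env.DATADOE_MCP_KEY?.trim();
    if (!key) {
        throw new Error('DATADOE_MCP_KEY is not set; add it to .env before starting the agent');
    }
    return { 'datadoe-mcp-key': key };
}
```

If you adopt something like this, you could pass headers: buildDatadoeHeaders(process.env) to hostedMcpTool instead of the inline object.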

Step 3: Create the agent

The main agent lives in src/index.ts. It imports the OpenAI Agents SDK and combines DataDoe MCP with a few local tools:

ts
import { Agent, hostedMcpTool, isOpenAIResponsesRawModelStreamEvent, run } from '@openai/agents';

import { systemPrompt } from './prompts/system.prompt.ts';
import { duckdbTools } from './tools/duckdb.tools.ts';
import { fetchTools } from './tools/fetch.tools.ts';
import { outputTools } from './tools/output.tools.ts';
import { sleepTools } from './tools/sleep.tools.ts';

The agent itself is small:

ts
const agent = new Agent({
    name: 'Data Analyst',
    instructions: systemPrompt,
    tools: [
        hostedMcpTool({
            serverLabel: 'datadoe',
            serverUrl: 'https://api.datadoe.com/mcp/v1',
            headers: {
                'datadoe-mcp-key': process.env.DATADOE_MCP_KEY ?? ''
            }
        }),
        ...sleepTools,
        ...fetchTools,
        ...duckdbTools,
        ...outputTools
    ],
    model: 'gpt-5.4-mini',
    modelSettings: {
        reasoning: {
            effort: 'high'
        }
    }
});

The important part is the shape:

  • hostedMcpTool(...) exposes DataDoe tools to the agent
  • fetchTools, duckdbTools, and outputTools handle the local workflow
  • systemPrompt tells the model what to do and what not to do
  • run(...) executes the agent and streams its output

Step 4: Understand the helper tools

The local tools turn the sample into a multi-step workflow rather than a single prompt: they download the export, load it into DuckDB, and write the final file.

Download tool

src/tools/fetch.tools.ts adds download_file_from_url. DataDoe export URLs are temporary, so the agent should fetch them right away and save them locally.

ts
tool({
    name: 'download_file_from_url',
    description:
        'URGENT: Export links expire quickly, call this tool IMMEDIATELY before doing anything else. It downloads the file and returns the local file path.'
    // parameters and execute are omitted from this excerpt
});

This keeps the workflow stable: the model can request an export URL from DataDoe MCP, then hand it off to the download tool before the link expires.
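The download step itself can be sketched as follows. This is not the repository's implementation: the function name downloadToFile, the destDir argument, and the injectable fetchImpl parameter are assumptions made for illustration, and the sketch relies on the global fetch available in Node 18 and later.

```typescript
import { mkdir, writeFile } from 'node:fs/promises';
import { basename, join } from 'node:path';

// Hypothetical helper: downloads a (temporary) export URL and returns the
// local path of the saved file. fetchImpl is injectable so the function can
// be exercised without network access.
async function downloadToFile(
    url: string,
    destDir: string,
    fetchImpl: typeof fetch = fetch
): Promise<string> {
    const response = await fetchImpl(url);
    if (!response.ok) {
        throw new Error(`Download failed with HTTP ${response.status}`);
    }
    // Use only the path part of the URL for the file name, so any query
    // string (for example a signature) never ends up in the local name.
    const fileName = basename(new URL(url).pathname) || 'export.dat';
    await mkdir(destDir, { recursive: true });
    const filePath = join(destDir, fileName);
    await writeFile(filePath, Buffer.from(await response.arrayBuffer()));
    return filePath;
}
```

Returning the local path, as the tool description above promises, lets the model pass it straight to a loader tool without ever touching the expiring URL again.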

DuckDB tools

src/tools/duckdb.tools.ts is where the downloaded data becomes queryable:

ts
tool({
    name: 'load_csv',
    description: 'Loads a CSV file from the local file system into a DuckDB table.'
    // parameters and execute are omitted from this excerpt
});

The file also includes:

  • load_json for JSON or newline-delimited JSON files
  • list_tables to inspect what is loaded
  • run_sql_query to run standard SQL against DuckDB

That combination is the heart of the tutorial. DataDoe MCP supplies the export; DuckDB lets the agent rank ASINs and inspect the data locally.
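To make the ranking step concrete, here is a sketch of the kind of SQL the agent might send through run_sql_query. The column names (asin, impressions, clicks, spend, sales) and the choice of ACOS as the ranking metric are assumptions; the real export schema may differ, so treat this as an illustration rather than the query the sample uses.

```typescript
// Hypothetical query builder: ranks ASINs by advertising cost of sales
// (spend / sales), worst first. Column names are assumed, not taken from
// the DataDoe export schema.
function buildWeakAsinQuery(table: string, limit = 10): string {
    // Only allow plain identifiers so the table name cannot inject SQL.
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(table)) {
        throw new Error(`Invalid table name: ${table}`);
    }
    if (!Number.isInteger(limit) || limit <= 0) {
        throw new Error(`Invalid limit: ${limit}`);
    }
    return [
        'SELECT asin,',
        '       SUM(impressions) AS impressions,',
        '       SUM(clicks) AS clicks,',
        '       SUM(spend) AS spend,',
        '       SUM(sales) AS sales,',
        '       SUM(spend) / NULLIF(SUM(sales), 0) AS acos',
        `FROM "${table}"`,
        'GROUP BY asin',
        'ORDER BY acos DESC NULLS LAST',
        `LIMIT ${limit}`
    ].join('\n');
}
```

NULLIF avoids a division by zero for ASINs without sales, and NULLS LAST keeps those rows from crowding out ASINs with a measurable ACOS.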

Final output tool

src/tools/output.tools.ts defines output_improved_listings. This tool is the final step of the app:

ts
tool({
    name: 'output_improved_listings',
    description: 'Emits the final structured JSON for all improved ASIN listings.'
    // parameters and execute are omitted from this excerpt
});

The agent should not write JSON directly in its response. Instead, it calls this tool once with all improved listings, and the tool writes the result to response/improved-listings-<timestamp>.json.
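A rough sketch of what the tool's file-naming and serialization logic could look like is shown below. The field names in ImprovedListing, the helper name buildOutputFile, and the timestamp format are all assumptions; the actual schema and naming are defined in src/tools/output.tools.ts.

```typescript
// Hypothetical item shape: one entry per ASIN with current and improved copy.
interface ImprovedListing {
    asin: string;
    currentTitle: string;
    currentBullets: string[];
    improvedTitle: string;
    improvedBullets: string[];
}

// Builds the target path and JSON body without touching the file system,
// so the caller decides how and when to write the file.
function buildOutputFile(listings: ImprovedListing[], now: Date): { path: string; body: string } {
    if (listings.length === 0) {
        throw new Error('Expected at least one improved listing');
    }
    // ISO timestamp with ":" and "." replaced so it is safe in file names.
    const timestamp = now.toISOString().replace(/[:.]/g, '-');
    return {
        path: `response/improved-listings-${timestamp}.json`,
        body: JSON.stringify(listings, null, 2)
    };
}
```

Funnelling the result through a single typed call like this means the file is either written in full or not at all, which is one reason to keep JSON out of the chat response.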

Step 5: Use the system prompt to guide the agent

The prompt in src/prompts/system.prompt.ts keeps the agent focused. It tells the model to:

  • use DataDoe MCP tools when it needs export data
  • prefer SQL-based analysis through DuckDB
  • keep marketplace-specific content in the right language
  • never write JSON output directly in the chat

In practice, that means the agent behaves like a focused analyst instead of a generic chatbot.
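For orientation, a prompt built from the four rules above might be assembled like this. It is an illustrative sketch, not the contents of src/prompts/system.prompt.ts; the constant names and the exact wording of each rule are assumptions.

```typescript
// Hypothetical prompt sketch; the repository's real prompt may be worded
// and structured differently.
const promptRules: string[] = [
    'Use the DataDoe MCP tools whenever you need export data.',
    'Prefer SQL-based analysis through DuckDB.',
    'Write marketplace-specific content in the language of that marketplace.',
    'Never write JSON output directly in the chat; call output_improved_listings once with all listings.'
];

// Join the rules into a single instruction string for the Agent constructor.
const systemPromptSketch: string = [
    'You are a data analyst who improves Amazon listing copy.',
    'Follow these rules:',
    ...promptRules.map((rule) => `- ${rule}`)
].join('\n');
```

Keeping the rules in an array makes it easy to add or remove a constraint when you adapt the agent to a different task.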

Step 6: Run the app

Start the sample with:

bash
yarn start

The app runs src/index.ts, streams the agent's reasoning to the terminal, and finishes by writing a JSON file to ./response/.

Typical flow:

  1. The agent asks DataDoe MCP for the needed export.
  2. The export URL is downloaded immediately.
  3. DuckDB loads the file and the agent runs SQL to find weak ASINs.
  4. The model rewrites the listing copy.
  5. output_improved_listings saves the final result.

The output file contains one item per ASIN, with the current title and bullets alongside the improved version. That makes it easy to feed into another script, a bulk edit sheet, or a later workflow.

A simple mental model

If you want to adapt this pattern for another DataDoe-powered agent, keep the same split:

  • DataDoe MCP for source data
  • local tools for downloading, transforming, and validating
  • one dedicated output tool for the final artifact

That structure is what makes the sample easy to extend without turning the agent into a pile of prompt instructions.

DataDoe MCP resources

Check the following resources for more information: