Tutorial: Autonomous AI Voice Agent in X Spaces with Claude

You are my X/Twitter Spaces voice agent expert. I want to use XActions to deploy AI agents that can autonomously join live X Spaces, listen to conversations, and speak with real-time voice AI. Help me set up, configure, and run Space agents.

Context

I'm using XActions (https://github.com/nirholas/XActions), an open-source X/Twitter toolkit that integrates the xspace-agent SDK. This lets AI agents participate in X Spaces with full voice capabilities (transcription, LLM reasoning, and text-to-speech), driven from your own machine through a headless browser.

What I Need You To Do

Part 1: Understanding the Space Agent

Explain the architecture and capabilities:

How the agent works:
- Launches a headless Chromium browser via Puppeteer
- Authenticates with X using session cookies
- Joins a live Space and requests speaker access
- Captures audio from other speakers via WebRTC
- Transcribes speech using Whisper STT (via Groq or OpenAI)
- Sends transcriptions to an LLM (OpenAI, Claude, or Groq) with a system prompt
- Converts the LLM response to speech via TTS (ElevenLabs, OpenAI, or browser)
- Injects the synthesized audio back into the Space

What the agent can do:
- Join any public live Space
- Request speaker access (host must approve)
- Listen to and transcribe all speakers in real time
- Generate contextually relevant responses
- Speak aloud in the Space
- Track conversation context and sentiment
- Handle turn-taking without interrupting
- Leave gracefully on command or when the Space ends

Limitations:
- One Space at a time (singleton agent)
- Cannot host Spaces (X requires 600+ followers)
- Cannot force speaker access (host must approve)
- Cannot record Spaces (host-controlled)
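
The listen, transcribe, reason, speak loop listed under "How the agent works" can be sketched as a single turn function. This is only an illustration of the data flow, not the xspace-agent implementation: `runTurn`, `transcribe`, `complete`, and `synthesize` are hypothetical names, and the real providers are passed in as plain async functions.

```javascript
// Hypothetical stand-ins: transcribe, complete, and synthesize represent the STT,
// LLM, and TTS providers. They are not xspace-agent functions.
async function runTurn(audioChunk, deps) {
  const { transcribe, complete, synthesize, systemPrompt, history } = deps;

  const text = await transcribe(audioChunk);          // speech -> text
  if (!text.trim()) return null;                      // nothing intelligible, stay silent

  history.push({ role: 'user', content: text });
  const reply = await complete([
    { role: 'system', content: systemPrompt },
    ...history,
  ]);                                                 // text -> LLM response
  history.push({ role: 'assistant', content: reply });

  return synthesize(reply);                           // response -> audio for the Space
}
```

Passing the providers in as arguments keeps the sketch independent of any one vendor, which mirrors the way the agent lets STT, LLM, and TTS be chosen separately.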

Part 2: Setup & Authentication

Walk me through the complete setup:

Install dependencies:
```
npm install xactions xspace-agent
```

Get X session cookies:
- Open x.com in your browser and log in
- Open DevTools (F12 or Cmd+Option+I)
- Go to Application > Cookies > https://x.com
- Copy the auth_token and ct0 values

Set environment variables:

```
# Required: X authentication
export X_AUTH_TOKEN="your_auth_token_value"
export X_CT0="your_ct0_value"

# Required: At least one AI provider
export OPENAI_API_KEY="sk-..."           # For OpenAI (LLM + STT + TTS)
export ANTHROPIC_API_KEY="sk-ant-..."    # For Claude (LLM only)
export GROQ_API_KEY="gsk_..."            # For Groq (LLM + fast STT)

# Optional: premium voice and alternative STT
export ELEVENLABS_API_KEY="..."          # High-quality TTS voices
export DEEPGRAM_API_KEY="..."            # Alternative STT
```

Verify the setup:
- Confirm xspace-agent is installed: `npm list xspace-agent`
- Confirm the cookies are still valid (they expire, so refresh them from the browser if needed)
- Test your AI API key with a simple request
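
Part of the verification can be automated with a small pre-flight check. The sketch below only confirms that the variables named above are present; it does not validate the cookies or keys against X or any provider. `checkEnv` is a hypothetical helper, not part of XActions.

```javascript
// Returns a list of human-readable problems; an empty list means the
// required variables are present (not that they are valid).
function checkEnv(env) {
  const problems = [];

  for (const name of ['X_AUTH_TOKEN', 'X_CT0']) {
    if (!env[name]) problems.push(`${name} is not set`);
  }

  const providerKeys = ['OPENAI_API_KEY', 'ANTHROPIC_API_KEY', 'GROQ_API_KEY'];
  if (!providerKeys.some((name) => env[name])) {
    problems.push(`set at least one of ${providerKeys.join(', ')}`);
  }

  return problems;
}

const problems = checkEnv(process.env);
console.log(problems.length ? problems.join('\n') : 'Environment looks complete');
```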

Part 3: Joining a Space via MCP

If XActions is configured as an MCP server (Claude Desktop, Cursor, etc.):

Find a Space to join:
```
"Find live X Spaces about AI"
```
This calls `x_get_spaces` with `filter: 'live'` and `topic: 'AI'`.

Join a Space:
```
"Join this Space as an AI agent: https://x.com/i/spaces/1abc123"
```
This calls `x_space_join`. The agent launches a headless browser, logs in, joins the Space, and starts listening.

Check agent status:
```
"What's the status of my Space agent?"
```
Returns duration, transcription count, response count, and recent events.

Read the transcript:

```
"Show me the last 20 transcriptions from the Space"
```

Leave the Space:
```
"Leave the Space and show me the session summary"
```
Returns total duration, number of transcriptions, and number of responses.
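
Behind each natural-language prompt, the MCP client sends the server a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for the tools named in this part. The envelope follows the Model Context Protocol; the argument names `filter`, `topic`, and `url` come from this tutorial, and `buildToolCall` is a hypothetical helper for illustration only.

```javascript
// Builds the JSON-RPC envelope an MCP client would send for a tool invocation.
function buildToolCall(id, name, args = {}) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

const findSpaces = buildToolCall(1, 'x_get_spaces', { filter: 'live', topic: 'AI' });
const join = buildToolCall(2, 'x_space_join', { url: 'https://x.com/i/spaces/1abc123' });

console.log(JSON.stringify(findSpaces, null, 2));
console.log(JSON.stringify(join, null, 2));
```

In normal use Claude Desktop or Cursor constructs these requests for you; seeing the shape is mainly useful when debugging why a tool was called with unexpected arguments.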

Part 4: Joining a Space via Node.js

For programmatic control:

```
import { joinSpace, leaveSpace, getSpaceAgentStatus, getSpaceTranscript } from 'xactions/spaces/agent';

// Join with full configuration
const result = await joinSpace({
  url: 'https://x.com/i/spaces/1abc123',
  provider: 'openai',                     // LLM: 'openai', 'claude', 'groq'
  apiKey: process.env.OPENAI_API_KEY,
  systemPrompt: `You are a knowledgeable AI participant in an X Space.
    Listen carefully and respond concisely. Add value to the conversation.
    Keep responses under 2 sentences.`,
  model: 'gpt-4o',                        // Optional: specific model
  ttsProvider: 'elevenlabs',              // Optional: 'openai', 'elevenlabs', 'browser'
  voiceId: 'pNInz6obpgDQGcFmaJgB',       // Optional: ElevenLabs voice ID
  headless: true,                          // Run browser headlessly
});

console.log(result);
// { success: true, message: '✅ Joined Space', status: 'listening', provider: 'openai' }

// Monitor the agent every 30 seconds; keep the timer handle so it can be stopped later
const monitor = setInterval(() => {
  const status = getSpaceAgentStatus();
  console.log(`Duration: ${status.duration}, Transcriptions: ${status.transcriptions}, Responses: ${status.responses}`);
}, 30000);

// Get transcript on demand
const transcript = getSpaceTranscript({ limit: 20 });
transcript.transcriptions.forEach(t => {
  console.log(`[${t.speaker}]: ${t.text}`);
});

// Leave when done; stop the monitor first so the process can exit
clearInterval(monitor);
const summary = await leaveSpace();
console.log(`Session: ${summary.duration}, ${summary.transcriptions} heard, ${summary.responses} spoken`);
```
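
A long-running agent is usually stopped with Ctrl+C rather than by reaching the end of a script, so it helps to leave the Space from a signal handler. The sketch below is one way to do that, assuming the leave function behaves like `leaveSpace()` in the listing above (async, returning a summary). `installShutdownHandler` is a hypothetical helper, and the process object is a parameter only so the handler can be exercised in tests.

```javascript
// `leave` is the function that ends the session (for example leaveSpace).
// `proc` defaults to the Node.js process object.
function installShutdownHandler(leave, proc = process) {
  let leaving = false;

  const handler = async () => {
    if (leaving) return;            // ignore repeated Ctrl+C presses
    leaving = true;

    let exitCode = 0;
    try {
      const summary = await leave();
      console.log('Left Space:', summary);
    } catch (err) {
      exitCode = 1;
      console.error('Failed to leave cleanly:', err.message);
    } finally {
      proc.exit(exitCode);
    }
  };

  proc.on('SIGINT', handler);
  proc.on('SIGTERM', handler);
  return handler;
}
```

With the imports from the listing above in scope, calling `installShutdownHandler(leaveSpace)` once after joining would be enough to wire it up.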

Part 5: Configuring the AI Provider

Walk me through choosing and configuring different providers:

OpenAI (default — best all-rounder):

```
await joinSpace({
  url: spaceUrl,
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY,
  model: 'gpt-4o',         // Fast and capable
});
```

- Handles LLM + STT (Whisper) + TTS in one provider
- Best for: general-purpose agents

Claude (best reasoning):

```
await joinSpace({
  url: spaceUrl,
  provider: 'claude',
  apiKey: process.env.ANTHROPIC_API_KEY,
  model: 'claude-sonnet-4-20250514',
});
```

- LLM only: needs separate STT and TTS providers
- Best for: nuanced, thoughtful responses

Groq (fastest):

```
await joinSpace({
  url: spaceUrl,
  provider: 'groq',
  apiKey: process.env.GROQ_API_KEY,
});
```

- Very fast LLM + fast Whisper STT
- Best for: low-latency conversational agents

Mix and match providers:

```
await joinSpace({
  url: spaceUrl,
  provider: 'claude',                  // LLM
  apiKey: process.env.ANTHROPIC_API_KEY,
  sttProvider: 'groq',                  // Fast transcription
  ttsProvider: 'elevenlabs',            // Premium voice
});
```
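
When several keys are configured, the provider choice can be derived from the environment instead of being hard-coded. The following is a minimal sketch under that assumption; `pickProvider` is a hypothetical helper, and the mapping simply pairs the provider names used above with the environment variables from Part 2.

```javascript
// Returns { provider, apiKey } for the first preferred provider that has a key set.
function pickProvider(env, preferred = ['openai', 'claude', 'groq']) {
  const keyNames = {
    openai: 'OPENAI_API_KEY',
    claude: 'ANTHROPIC_API_KEY',
    groq: 'GROQ_API_KEY',
  };

  for (const provider of preferred) {
    const apiKey = env[keyNames[provider]];
    if (apiKey) return { provider, apiKey };
  }
  throw new Error('No AI provider key found in environment');
}
```

The result can be spread into the join call, for example `joinSpace({ url: spaceUrl, ...pickProvider(process.env, ['claude', 'groq', 'openai']) })`, so the preference order lives in one place.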

Part 6: Crafting System Prompts

The system prompt defines the agent's personality and behavior. Help me write effective prompts:

Generic helpful assistant:

```
You are a friendly and knowledgeable AI assistant participating in an X Space.
Listen carefully to what others say. Respond concisely and add value.
Keep responses under 2 sentences. Be respectful of other speakers.
```

Crypto market analyst:

```
You are a crypto market analyst participating in an X Space.
Share concise, data-driven insights about crypto markets.
Reference on-chain metrics when relevant.
Keep responses under 3 sentences.
Always caveat price predictions with "not financial advice."
```

Tech podcast co-host:

```
You are a tech industry expert co-hosting an X Space.
Share opinions on tech trends, startups, and AI developments.
Ask follow-up questions to keep the conversation flowing.
Use a conversational tone. Keep responses under 3 sentences.
```

Community moderator:

```
You are a community moderator in this X Space.
Welcome new speakers. Summarize key points when asked.
Keep the conversation on topic. Be neutral and inclusive.
If someone asks a question nobody answers, provide a helpful response.
```

Debate participant:

```
You represent the skeptical perspective in this discussion.
Question assumptions and push for evidence-based reasoning.
Be respectful but direct. Acknowledge good points from others.
Keep responses focused and under 3 sentences.
```
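
The example prompts share a pattern: a role statement, a few behavioral guidelines, and a length cap. A small template function can keep that structure consistent across personas. This is a sketch only; `buildSystemPrompt` and its parameter names are hypothetical and not part of XActions.

```javascript
// Assembles a system prompt from a role, optional guidelines, and a sentence limit.
function buildSystemPrompt({ role, guidance = [], maxSentences = 2 }) {
  return [
    `You are ${role} participating in an X Space.`,
    ...guidance,
    `Keep responses under ${maxSentences} sentences.`,
  ].join('\n');
}

const analystPrompt = buildSystemPrompt({
  role: 'a crypto market analyst',
  guidance: [
    'Share concise, data-driven insights about crypto markets.',
    'Always caveat price predictions with "not financial advice."',
  ],
  maxSentences: 3,
});
console.log(analystPrompt);
```

The resulting string can be passed as the `systemPrompt` option shown in Part 4.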

Part 7: Voice Configuration

Walk me through choosing the right voice for the agent:

ElevenLabs (highest quality):

```
await joinSpace({
  url: spaceUrl,
  ttsProvider: 'elevenlabs',
  voiceId: 'pNInz6obpgDQGcFmaJgB',  // "Adam", a clear male voice
});
```

- Requires ELEVENLABS_API_KEY
- Most natural-sounding voices
- Many voice options and custom voice cloning

OpenAI TTS:

```
await joinSpace({
  url: spaceUrl,
  ttsProvider: 'openai',
  voiceId: 'nova',   // Options: alloy, echo, fable, onyx, nova, shimmer
});
```

- Uses your existing OpenAI API key
- Good quality, six built-in voices

Browser TTS (free, no API key):
```
await joinSpace({
  url: spaceUrl,
  ttsProvider: 'browser',
});
```
- Uses the browser's built-in speech synthesis
- Lower quality but completely free

Part 8: Behavior Tuning

Fine-tune how the agent listens and responds:

Silence threshold — how long to wait after someone stops speaking before processing:
```
await joinSpace({
  url: spaceUrl,
  silenceThreshold: 1500,  // 1.5 seconds (default)
});
```
- Lower (800ms): Agent responds faster, may cut people off
- Higher (3000ms): Agent waits longer, less likely to interrupt

Turn delay — extra pause before the agent starts speaking:

```
await joinSpace({
  url: spaceUrl,
  turnDelay: 500,   // 500ms pause before responding
});
```
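
To see how the silence threshold and turn delay interact, consider a simplified model: the agent waits until nobody has spoken for `silenceThreshold` milliseconds, then waits a further `turnDelay` before speaking. The sketch below implements that model over a list of voice-activity frames. It is an illustration of the timing logic only, not the SDK's actual detector; `findResponseTime` and the frame shape are assumptions.

```javascript
// Each frame is { t: timestampMs, speaking: boolean }, for example from a
// voice activity detector. Returns the earliest time the agent may start
// speaking, or null if no pause was long enough.
// 1500 is the documented default threshold; 500 is just the example turn delay.
function findResponseTime(frames, silenceThreshold = 1500, turnDelay = 500) {
  let silenceStart = null;

  for (const { t, speaking } of frames) {
    if (speaking) {
      silenceStart = null;                       // speech resumed, reset the timer
    } else {
      if (silenceStart === null) silenceStart = t;
      if (t - silenceStart >= silenceThreshold) return t + turnDelay;
    }
  }
  return null;
}

const frames = [
  { t: 0, speaking: true },
  { t: 500, speaking: false },    // short pause mid-sentence
  { t: 1000, speaking: true },
  { t: 1500, speaking: false },   // speaker stops here
  { t: 2000, speaking: false },
  { t: 2500, speaking: false },
  { t: 3000, speaking: false },
];

console.log(findResponseTime(frames));            // default 1500ms threshold
console.log(findResponseTime(frames, 800, 500));  // more eager threshold
```

In this model a lower threshold moves the response earlier, at the cost of treating shorter pauses as the end of a turn.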

Conversation history — how much context the LLM receives:
```
await joinSpace({
  url: spaceUrl,
  maxHistory: 20,   // Last 20 messages as context
});
```
- More history = better context but higher token cost
- Less history = cheaper but may miss context
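
The effect of `maxHistory` can be pictured as a sliding window over the conversation. The sketch below assumes chat-style messages with a `role` field and keeps any system message while dropping the oldest turns; whether the SDK trims in exactly this way is not confirmed by this tutorial, and `trimHistory` is a hypothetical name.

```javascript
// Keeps all system messages plus the most recent `maxHistory` other messages.
function trimHistory(messages, maxHistory = 20) {
  const system = messages.filter((m) => m.role === 'system');
  const rest = messages.filter((m) => m.role !== 'system');
  // slice(-0) would return everything, so handle a zero limit explicitly
  const recent = maxHistory > 0 ? rest.slice(-maxHistory) : [];
  return [...system, ...recent];
}

const conversation = [
  { role: 'system', content: 'You are a helpful participant.' },
  { role: 'user', content: 'first point' },
  { role: 'assistant', content: 'reply one' },
  { role: 'user', content: 'second point' },
  { role: 'assistant', content: 'reply two' },
  { role: 'user', content: 'third point' },
];

console.log(trimHistory(conversation, 2).map((m) => m.content));
```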

Part 9: Multi-Agent Setup

For advanced use cases, deploy multiple AI agents with different personalities using the xspace-agent SDK directly:

```
import { AgentTeam } from 'xspace-agent';

const team = new AgentTeam({
  auth: {
    token: process.env.X_AUTH_TOKEN,
    ct0: process.env.X_CT0,
  },
  agents: [
    {
      name: 'Optimist',
      ai: {
        provider: 'openai',
        apiKey: process.env.OPENAI_API_KEY,
        systemPrompt: 'You always see the positive side of technology trends. Be enthusiastic but grounded.',
      },
    },
    {
      name: 'Skeptic',
      ai: {
        provider: 'claude',
        apiKey: process.env.ANTHROPIC_API_KEY,
        systemPrompt: 'You question hype and push for evidence. Be respectful but critical.',
      },
    },
  ],
});

await team.join('https://x.com/i/spaces/1abc123');
```

This creates a multi-agent debate where each agent has a distinct personality and LLM provider.

Part 10: MCP Tools Reference

Complete reference for all Space-related MCP tools:

| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `x_space_join` | Join a Space with an AI voice agent | `url` (required), `provider`, `apiKey`, `systemPrompt`, `model`, `voiceId`, `headless` |
| `x_space_leave` | Leave the active Space | None |
| `x_space_status` | Get agent status | None |
| `x_space_transcript` | Get recent transcriptions | `limit` (default: 50) |
| `x_get_spaces` | Discover live/scheduled Spaces | `filter` (live/scheduled/all), `topic`, `limit` |
| `x_scrape_space` | Scrape Space metadata | `url` |

Part 11: Monitoring & Debugging

Help me monitor and troubleshoot my Space agent:

Check if the agent is active:
```
"What's the status of my Space agent?"
```

Watch the transcript in real time:

```
"Show me the latest transcriptions from the Space"
```

Common issues:
- "xspace-agent is not installed" — Run npm install xspace-agent
- "X_AUTH_TOKEN is required" — Set your session cookies (they expire, refresh from browser)
- "An agent is already active" — Call leaveSpace() before joining another
- Agent joins but doesn't speak — Host hasn't approved speaker request yet
- Poor transcription quality — Try switching STT provider (Groq is fastest, OpenAI is most accurate)
- High latency responses — Switch to Groq for LLM, reduce maxHistory

Event logging: The agent logs events to the console:
- 🎙️ [Speaker]: text — Someone spoke and was transcribed
- 🤖 Agent: text — The agent generated and spoke a response
- 🔄 Agent status: ... — Lifecycle change
- ❌ Agent error: ... — Something went wrong
- 📡 Space has ended — The host closed the Space
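
If the console output is captured to a file or piped into another tool, the emoji prefixes above are enough to classify each line. The sketch below assumes log lines begin with exactly those prefixes; `classifyLogLine` and the category names are hypothetical. Code point escapes are used so the match does not depend on how the emoji are encoded in the source file.

```javascript
// Prefix -> category, using the emoji listed in the event log description.
const LOG_PREFIXES = [
  ['\u{1F399}', 'transcription'], // studio microphone
  ['\u{1F916}', 'response'],      // robot face
  ['\u{1F504}', 'status'],        // counterclockwise arrows
  ['\u{274C}', 'error'],          // cross mark
  ['\u{1F4E1}', 'ended'],         // satellite antenna
];

function classifyLogLine(line) {
  const trimmed = line.trim();
  const match = LOG_PREFIXES.find(([prefix]) => trimmed.startsWith(prefix));
  return match ? match[1] : 'other';
}

// Count events per category for a captured log.
function summarizeLog(lines) {
  const counts = {};
  for (const line of lines) {
    const kind = classifyLogLine(line);
    counts[kind] = (counts[kind] || 0) + 1;
  }
  return counts;
}
```

A rising `error` count relative to `transcription` lines is a quick signal that the session needs attention.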

Part 12: Best Practices

- Start as a listener: let the agent listen for 30-60 seconds before it responds, so it can pick up the conversation context
- Keep responses short: 1-2 sentences works best in live audio; long responses lose the audience
- Use a clear system prompt: define the agent's role, tone, and topic boundaries explicitly
- Choose the right voice: ElevenLabs for professional settings, OpenAI TTS for casual ones, browser TTS for testing
- Monitor the agent: check status and transcript periodically to ensure quality
- Be transparent: consider having the agent identify itself as AI when it first speaks
- Respect rate limits: don't rapidly join and leave Spaces; X may flag the account
- Rotate cookies: session cookies expire; check and refresh them regularly
- Test in small Spaces first: join a Space with a few listeners before deploying to large audiences
- Have a shutdown plan: always be ready to call leaveSpace() if the agent misbehaves
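
The "start as a listener" advice can be enforced in code with a simple time gate that suppresses responses during the warm-up period. This is a sketch of one possible approach; the tutorial does not say whether the SDK has a built-in option for this, and `createWarmupGate` is a hypothetical helper. The clock is injectable so the gate can be tested without waiting.

```javascript
// Returns a function that reports whether the warm-up period has elapsed.
function createWarmupGate(warmupMs = 30000, now = Date.now) {
  const joinedAt = now();
  return () => now() - joinedAt >= warmupMs;
}

// Example: only respond once the agent has listened for 45 seconds.
const mayRespond = createWarmupGate(45000);
console.log(mayRespond() ? 'ready to respond' : 'still listening');
```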

My Space Agent Goals

(Replace before pasting)

- What kind of Spaces do I want to join? TOPIC_OR_NICHE
- What role should the agent play? ROLE (e.g., expert, moderator, co-host)
- Which AI provider do I prefer? PROVIDER (openai, claude, groq)
- Do I need premium voice quality? YES/NO
- Am I running via MCP, Node.js, or CLI? RUNTIME

Start with Part 2 — help me get authenticated and join my first Space with an AI agent.
