Agent Technical Implementation (The "How")

Goal: Crawl the web politely, avoid IP bans, and access high-quality data.

Step 1: The "Digital Passport" (One-Time Setup)

You must prove that your crawler really belongs to the domain it claims to represent (e.g., openai.com).

Run this command:

npx imagxp generate-identity

It gives you two things:

  1. A Public/Private Key Pair: Save it in .env and keep that file out of version control. NEVER SHARE THE PRIVATE KEY.

    # .env
    IMAGXP_AGENT_ID="openai.com"
    IMAGXP_PRIVATE_KEY="MIGHAgEAMBMGByq..."
    IMAGXP_PUBLIC_KEY="MFKwEwYHKoZIzj0..."
    
  2. A Public File (ID Card): Save it as imagxp-agent.json and upload it so that it is served at yoursite.com/.well-known/imagxp-agent.json.

    {
      "agent_id": "openai.com",
      "public_key": "MFKwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE...",
      "contact_email": "security@openai.com",
      "version": "1.0"
    }
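
Before uploading, it can help to sanity-check the ID card. The sketch below is not part of the SDK: identityUrl and checkIdentity are hypothetical helpers, and it assumes the file lives on the same domain as agent_id, which this guide implies but does not state outright.

```typescript
// Fields of the ID card, as shown in the example file above.
interface AgentIdentity {
  agent_id: string;
  public_key: string;
  contact_email: string;
  version: string;
}

const REQUIRED_FIELDS: (keyof AgentIdentity)[] = [
  "agent_id", "public_key", "contact_email", "version",
];

// Assumption: the ID card is served from the agent's own domain.
function identityUrl(agentId: string): string {
  return `https://${agentId}/.well-known/imagxp-agent.json`;
}

// Hypothetical helper: returns a list of problems; an empty list means the file looks sane.
function checkIdentity(doc: unknown, expectedAgentId: string): string[] {
  if (typeof doc !== "object" || doc === null) {
    return ["identity file is not a JSON object"];
  }
  const record = doc as Record<string, unknown>;
  const problems: string[] = [];
  for (const field of REQUIRED_FIELDS) {
    const value = record[field];
    if (typeof value !== "string" || value.length === 0) {
      problems.push(`missing or empty field: ${field}`);
    }
  }
  const claimedId = record["agent_id"];
  if (typeof claimedId === "string" && claimedId.length > 0 && claimedId !== expectedAgentId) {
    problems.push(`agent_id "${claimedId}" does not match expected "${expectedAgentId}"`);
  }
  return problems;
}

const sample: unknown = JSON.parse(`{
  "agent_id": "openai.com",
  "public_key": "MFKwEwYHKoZIzj0...",
  "contact_email": "security@openai.com",
  "version": "1.0"
}`);
console.log(identityUrl("openai.com"));
console.log(checkIdentity(sample, "openai.com")); // prints []
```

Returning a list of problems instead of throwing makes it easy to report every issue with the file in one pass.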
    

Step 2: The "Universal" Crawler

This code works for any URL on the internet. You don't need a separate list of IMAGXP-enabled sites: the SDK checks each site for IMAGXP support at request time.

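import { IMAGXPAgent } from '@imagxp/protocol';

// Placeholder for your own processing step (parsing, indexing, etc.).
function analyzeContent(data: unknown): void {
    console.log(`Received ${String(data).length} characters.`);
}

// 1. Load your identity (created in Step 1)
const agent = await IMAGXPAgent.init();

// 2. The crawler loop (universal): IMAGXP and non-IMAGXP sites can be mixed freely
const queue = ["https://nytimes.com", "https://reddit.com", "https://blog.google"];

for (const url of queue) {
    try {
        // The SDK automatically checks: "Does this site speak IMAGXP?"
        // If YES -> it signs the request. If NO -> standard fetch.
        const response = await agent.fetch(url, {
            purpose: "RAG_RETRIEVAL"
        });

        if (response.status === 200) {
            console.log(`[SUCCESS] Accessed ${url}.`);
            analyzeContent(response.data);
        } else {
            // Any non-200 status lands here, not only explicit refusals.
            console.log(`[BLOCKED] ${url} returned status ${response.status}.`);
        }
    } catch (err) {
        // A network failure on one site should not stop the rest of the queue.
        console.error(`[ERROR] ${url}: ${err}`);
    }
}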
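
To give a feel for what "it signs the request" means, here is a minimal sketch using Node's built-in crypto module. It is not the SDK's actual wire format: the ECDSA P-256 key type, the canonicalRequest layout, and the verifyRequest helper are all assumptions made for illustration.

```typescript
import { createPublicKey, generateKeyPairSync, sign, verify } from "node:crypto";

// Agent side: generate an ECDSA P-256 key pair (the key type is an assumption).
const { privateKey, publicKey } = generateKeyPairSync("ec", { namedCurve: "prime256v1" });

// The kind of string that would be published in imagxp-agent.json: base64-encoded DER.
const publishedKey = publicKey.export({ type: "spki", format: "der" }).toString("base64");

// Hypothetical canonical form of a request; the real protocol may sign different fields.
function canonicalRequest(agentId: string, url: string, purpose: string): Buffer {
  return Buffer.from([agentId, url, purpose].join("\n"), "utf8");
}

// Agent side: sign the request description with the private key.
const signature = sign(
  "sha256",
  canonicalRequest("openai.com", "https://nytimes.com", "RAG_RETRIEVAL"),
  privateKey,
);

// Publisher side: rebuild the key from the published string only, then verify.
function verifyRequest(publishedKeyBase64: string, payload: Buffer, sig: Buffer): boolean {
  const key = createPublicKey({
    key: Buffer.from(publishedKeyBase64, "base64"),
    format: "der",
    type: "spki",
  });
  return verify("sha256", payload, key, sig);
}

const genuine = verifyRequest(
  publishedKey,
  canonicalRequest("openai.com", "https://nytimes.com", "RAG_RETRIEVAL"),
  signature,
);
const tampered = verifyRequest(
  publishedKey,
  canonicalRequest("openai.com", "https://reddit.com", "RAG_RETRIEVAL"),
  signature,
);
console.log({ genuine, tampered }); // prints { genuine: true, tampered: false }
```

The publisher side never touches privateKey: verification needs only the published string, and a signature made for one URL does not validate for another.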

FAQ: Agents

  • Q: Why use this instead of standard fetch?
    • A: A standard fetch is anonymous, so anti-bot systems on premium sites are likely to block it. IMAGXP lets you present a verifiable identity, which works like a "V.I.P. Pass" on sites that support it.
  • Q: Can I scrape training data?
    • A: Only if the Publisher allows it (allowTraining: true). The SDK respects their policy headers automatically.
  • Q: Does the Publisher see my Private Key?
    • A: Never. They only see the Public Key (the "Lock"), which lets them verify your signatures but not create them. Your Private Key stays on your server.
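
The training answer above boils down to a simple gate. The sketch below only illustrates that logic: the SDK applies the publisher's policy for you, the PublisherPolicy shape is a hypothetical stand-in (only the allowTraining field is named in this guide), and the "TRAINING" purpose label is made up for the example.

```typescript
// Hypothetical policy shape; only `allowTraining` comes from this guide.
interface PublisherPolicy {
  allowTraining?: boolean;
}

// "RAG_RETRIEVAL" appears in the crawler above; "TRAINING" is an invented label.
type Purpose = "RAG_RETRIEVAL" | "TRAINING";

// Training is allowed only on an explicit allowTraining: true.
// A missing field is treated as "no". This sketch does not gate other purposes.
function isPurposeAllowed(policy: PublisherPolicy, purpose: Purpose): boolean {
  if (purpose === "TRAINING") {
    return policy.allowTraining === true;
  }
  return true;
}

console.log(isPurposeAllowed({ allowTraining: true }, "TRAINING"));  // prints true
console.log(isPurposeAllowed({ allowTraining: false }, "TRAINING")); // prints false
console.log(isPurposeAllowed({}, "TRAINING"));                       // prints false
```

Treating an absent field as a refusal is a deliberate default-deny choice in this sketch; the real SDK may behave differently.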