Scrape a website’s content, structure, and styling to use as reference for AI generation.

Endpoint

POST /api/scrape-website

Request Body

url
string
required
The URL of the website to scrape.
formats
array
default: ["markdown", "html"]
Output formats to include:
  • markdown - Clean markdown content
  • html - Raw HTML structure
  • screenshot - Page screenshot
options
object
Additional scraping options:
  • onlyMainContent: Boolean - Extract only main content (default: true)
  • waitFor: Number - Wait time for dynamic content in ms (default: 2000)
  • timeout: Number - Request timeout in ms (default: 30000)
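
For typed clients, the request body can be modeled along these lines. This is a minimal sketch based on the field descriptions above; the interface name is illustrative, not part of the API:

interface ScrapeWebsiteRequest {
  url: string;                                          // required: the URL to scrape
  formats?: Array<'markdown' | 'html' | 'screenshot'>;  // defaults to ['markdown', 'html']
  options?: {
    onlyMainContent?: boolean;  // default: true
    waitFor?: number;           // default: 2000 (ms)
    timeout?: number;           // default: 30000 (ms)
  };
}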

Example Request

curl -X POST http://app.seemodo.ai/api/scrape-website \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com/landing-page",
    "formats": ["markdown", "html"],
    "options": {
      "onlyMainContent": true,
      "waitFor": 3000
    }
  }'

Response

success
boolean
Whether the scrape succeeded.
data
object
Scraped content:
  • title: Page title
  • description: Meta description
  • content: Combined content
  • markdown: Markdown version
  • html: HTML version
  • metadata: Page metadata
  • screenshot: Base64 screenshot (if requested)
  • links: Extracted links

Success Response

{
  "success": true,
  "data": {
    "title": "Acme Inc - Build Better Products",
    "description": "Acme helps teams build and ship products faster.",
    "content": "# Acme Inc\n\nBuild Better Products...",
    "markdown": "# Acme Inc\n\n## Build Better Products\n\nAcme helps teams build and ship products faster with AI-powered tools.\n\n### Features\n\n- **Speed**: Ship 10x faster\n- **Quality**: AI-assisted code review\n- **Collaboration**: Real-time editing\n\n...",
    "html": "<header><nav>...</nav></header><main><h1>Acme Inc</h1>...</main>",
    "metadata": {
      "title": "Acme Inc - Build Better Products",
      "description": "Acme helps teams build and ship products faster.",
      "sourceURL": "https://example.com/landing-page",
      "statusCode": 200,
      "ogImage": "https://example.com/og-image.png"
    },
    "screenshot": null,
    "links": [
      "https://example.com/features",
      "https://example.com/pricing",
      "https://example.com/about"
    ]
  }
}
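
The response can be typed along these lines. This sketch is inferred from the field list and the example above; which fields are optional is an assumption:

interface ScrapeWebsiteResponse {
  success: boolean;
  error?: string;               // present when success is false
  data: {
    title: string;
    description?: string;
    content: string;            // combined content
    markdown: string;
    html: string;
    metadata: {
      title: string;
      description?: string;
      sourceURL: string;
      statusCode: number;
      ogImage?: string;
    };
    screenshot: string | null;  // base64, only when 'screenshot' was requested
    links: string[];
  };
}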

Website Cloning Workflow

Using Scraped Content for Generation

// Step 1: Scrape the target website
const scrapeResponse = await fetch('/api/scrape-website', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    url: 'https://example.com'
  })
});

const { data } = await scrapeResponse.json();

// Step 2: Generate a clone
const generateResponse = await fetch('/api/generate-screen', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    prompt: `Clone this website design:\n\n${data.markdown}`,
    sandboxId: activeSandbox.id,
    screenType: 'Desktop'
  })
});
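
Before generating, it is worth confirming that Step 1 actually succeeded. A minimal variant of the first step with that guard, using the success and error fields described in this page:

const scrapeResult = await scrapeResponse.json();
if (!scrapeResult.success) {
  // Surface the scrape error instead of generating from empty content
  throw new Error(scrapeResult.error ?? 'Failed to scrape website');
}
const { data } = scrapeResult;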

Enhanced Scraping

For more detailed extraction, use the enhanced endpoint:
POST /api/scrape-url-enhanced
This provides:
  • Better handling of SPAs
  • CSS extraction
  • Asset downloading
  • Structure analysis
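
A request sketch for the enhanced endpoint, assuming it accepts the same url field as /api/scrape-website (its full request and response shapes are not documented here):

const enhancedResponse = await fetch('/api/scrape-url-enhanced', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ url: 'https://example.com' })
});

const enhanced = await enhancedResponse.json();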

Screenshot Capture

For visual reference:
POST /api/scrape-screenshot
{
  "url": "https://example.com",
  "fullPage": true,
  "width": 1280,
  "height": 800
}
Returns a base64-encoded screenshot that can be used as a reference image.
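
A fetch sketch for the screenshot endpoint; the name of the response field holding the base64 data is an assumption:

const screenshotResponse = await fetch('/api/scrape-screenshot', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    url: 'https://example.com',
    fullPage: true,
    width: 1280,
    height: 800
  })
});

// Field name assumed; the endpoint returns a base64-encoded screenshot
const { screenshot } = await screenshotResponse.json();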

Error Handling

Scrape Failed

{
  "success": false,
  "error": "Failed to scrape website",
  "data": {
    "title": "Error",
    "markdown": "# Error\n\nConnection timeout while fetching the page."
  }
}
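
Even on failure the response carries a data object with a readable message in data.markdown, so callers can surface it or fall back. A minimal handler; showFallbackMessage is a placeholder for your own UI logic:

const result = await scrapeResponse.json();

if (!result.success) {
  console.error(result.error);
  // The API still returns human-readable markdown describing the failure
  showFallbackMessage(result.data?.markdown ?? 'Scrape failed');
}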

Limitations

  • Some websites block automated scraping
  • JavaScript-heavy SPAs may not render completely
  • Rate limits may apply

Best Practices

  1. Use onlyMainContent - Excludes headers and footers for cleaner content
  2. Increase waitFor - Give dynamic, JavaScript-rendered content more time to load
  3. Combine with screenshots - A visual reference improves generation quality
  4. Check the markdown - Verify the important content was captured (see the combined sketch after this list)
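
A request sketch that applies these practices together; the values are illustrative:

const response = await fetch('/api/scrape-website', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    url: 'https://example.com/landing-page',
    formats: ['markdown', 'html', 'screenshot'],  // include a screenshot for visual reference
    options: {
      onlyMainContent: true,  // skip headers/footers
      waitFor: 5000           // give dynamic content more time to render
    }
  })
});

const { data } = await response.json();
// Spot-check that the markdown captured the important sections before generating
console.log(data.markdown.slice(0, 500));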