Start Crawl - Firecrawl

POST /v1/crawl

Start a crawl job to recursively scrape URLs starting from a base URL.

Authentication

This endpoint requires authentication using a Bearer token. Include your API key in the Authorization header:

Authorization: Bearer YOUR_API_KEY
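As a minimal sketch of the header above, the following Python builds (but does not send) an authenticated request using only the standard library. The helper name build_crawl_request is hypothetical, and YOUR_API_KEY is a placeholder.

```python
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder; substitute a real key


def build_crawl_request(body: bytes, api_key: str = API_KEY) -> urllib.request.Request:
    """Build (without sending) a POST to /v1/crawl that carries a Bearer token."""
    return urllib.request.Request(
        "https://api.firecrawl.dev/v1/crawl",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


req = build_crawl_request(b'{"url": "https://example.com"}')
print(req.get_header("Authorization"))
```

Passing the request to urllib.request.urlopen would actually send it; that step is left out here so the sketch runs without network access.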

Request Body

url

string

required

The base URL to start crawling from

excludePaths

array

URL pathname regex patterns that exclude matching URLs from the crawl. For example, if you set "excludePaths": ["blog/.*"] for the base URL firecrawl.dev, any results matching that pattern will be excluded, such as https://www.firecrawl.dev/blog/firecrawl-launch-week-1-recap.

includePaths

array

URL pathname regex patterns that include matching URLs in the crawl. Only the paths that match the specified patterns will be included in the response. For example, if you set "includePaths": ["blog/.*"] for the base URL firecrawl.dev, only results matching that pattern will be included, such as https://www.firecrawl.dev/blog/firecrawl-launch-week-1-recap.
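To preview locally how excludePaths and includePaths might filter URLs, a rough approximation is sketched below. It assumes patterns are searched anywhere in the pathname and that exclusions take precedence over inclusions; the server's exact matching rules are not stated here, so treat both as assumptions. The function name path_allowed is hypothetical.

```python
import re
from urllib.parse import urlparse


def path_allowed(url, include_paths=None, exclude_paths=None):
    """Approximate the include/exclude filter against the URL's pathname."""
    path = urlparse(url).path
    # Assumption: an exclude match wins over any include match.
    if exclude_paths and any(re.search(p, path) for p in exclude_paths):
        return False
    # Assumption: when includePaths is set, at least one pattern must match.
    if include_paths:
        return any(re.search(p, path) for p in include_paths)
    return True


blog_post = "https://www.firecrawl.dev/blog/firecrawl-launch-week-1-recap"
print(path_allowed(blog_post, exclude_paths=["blog/.*"]))
print(path_allowed(blog_post, include_paths=["blog/.*"]))
```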

maxDepth

integer

default:"10"

Maximum depth to crawl relative to the base URL. In practice, this is the maximum number of slashes the pathname of a scraped URL may contain.
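The slash-counting reading of maxDepth can be sketched as follows. This is an illustration only: it ignores a single trailing slash (an assumption) and does not model how depth is measured relative to the base URL. The names path_depth and within_max_depth are hypothetical.

```python
from urllib.parse import urlparse


def path_depth(url: str) -> int:
    """Count the slashes in the pathname, ignoring one trailing slash."""
    path = urlparse(url).path
    if path.endswith("/"):
        path = path[:-1]
    return path.count("/")


def within_max_depth(url: str, max_depth: int = 10) -> bool:
    """Return True when the URL's pathname depth does not exceed max_depth."""
    return path_depth(url) <= max_depth


print(path_depth("https://example.com/docs/api/v1"))
```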

maxDiscoveryDepth

integer

Maximum depth to crawl based on discovery order. The root site and sitemapped pages have a discovery depth of 0. For example, if you set it to 1 and set ignoreSitemap to true, you will only crawl the entered URL and all URLs that are linked on that page.
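Discovery depth can be pictured as breadth-first distance from the entered URL. The sketch below runs on a hypothetical in-memory link graph and behaves as if the sitemap were ignored; it is not the crawler's actual implementation.

```python
from collections import deque


def discovery_depths(links, start, max_discovery_depth):
    """Breadth-first walk that records each page's discovery depth."""
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if depths[page] >= max_discovery_depth:
            continue  # pages at the limit are kept, but their links are not followed
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical site: the root links to /a and /b, and /a links to /a/deep.
links = {"/": ["/a", "/b"], "/a": ["/a/deep"], "/b": []}
print(discovery_depths(links, "/", 1))
```

With a limit of 1, only the start page and the pages it links to are reached, which matches the example in the description.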

ignoreSitemap

boolean

default:"false"

Ignore the website sitemap when crawling

ignoreQueryParameters

boolean

default:"false"

Do not re-scrape the same path with different (or no) query parameters
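One way to picture the effect of ignoreQueryParameters is as a deduplication key that drops the query string. The sketch below is an assumption about the behavior, not the crawler's code; it also drops URL fragments, which is a further assumption. The name dedupe_key is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def dedupe_key(url, ignore_query_parameters=False):
    """Return the key under which a URL counts as already scraped."""
    parts = urlsplit(url)
    query = "" if ignore_query_parameters else parts.query
    # Fragments are dropped in both modes (assumption for this sketch).
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


urls = [
    "https://example.com/items",
    "https://example.com/items?page=2",
    "https://example.com/items?page=3",
]
print(len({dedupe_key(u, ignore_query_parameters=True) for u in urls}))
```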

limit

integer

default:"10000"

Maximum number of pages to crawl.

allowBackwardLinks

boolean

default:"false"

Allows the crawler to follow internal links to sibling or parent URLs, not just child paths.

false: Only crawls deeper (child) URLs. For example, /features/feature-1 → /features/feature-1/tips is followed ✅, but /pricing and / are not ❌.
true: Crawls any internal links, including siblings and parents. For example, /features/feature-1 → /pricing, /, etc. ✅

Use true for broader internal coverage beyond nested paths.
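The child-versus-sibling distinction can be sketched as a path-prefix check. This is an illustrative approximation with a hypothetical function name (should_follow); external hosts are simply rejected here, since they fall under allowExternalLinks.

```python
from urllib.parse import urlparse


def should_follow(base_url, link_url, allow_backward_links=False):
    """Decide whether an internal link would be followed from the base URL."""
    base, link = urlparse(base_url), urlparse(link_url)
    if link.netloc != base.netloc:
        return False  # external links are out of scope for this sketch
    if allow_backward_links:
        return True  # any internal link, including siblings and parents
    # Otherwise only the base path itself and paths nested beneath it qualify.
    base_path = base.path.rstrip("/") + "/"
    return (link.path.rstrip("/") + "/").startswith(base_path)


base = "https://example.com/features/feature-1"
print(should_follow(base, "https://example.com/features/feature-1/tips"))
print(should_follow(base, "https://example.com/pricing"))
```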

allowExternalLinks

boolean

default:"false"

Allows the crawler to follow links to external websites.

delay

number

Delay in seconds between scrapes. This helps respect website rate limits.

webhook

object

A webhook specification object.


webhook.url

string

required

The URL to send the webhook to. The webhook fires when the crawl starts (crawl.started), for every page crawled (crawl.page), and when the crawl finishes (crawl.completed or crawl.failed). The payload has the same format as the response of the /scrape endpoint.

webhook.headers

object

Headers to send to the webhook URL.

webhook.metadata

object

Custom metadata that will be included in all webhook payloads for this crawl

webhook.events

array

Types of events that should be sent to the webhook URL (default: all). Options: completed, page, failed, started
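A webhook object using the properties above might be assembled as in the following sketch. The helper build_webhook, the receiver URL, the header, and the metadata values are all hypothetical; only the property names and the four event options come from this page.

```python
import json

ALLOWED_EVENTS = {"completed", "page", "failed", "started"}


def build_webhook(url, events=None, headers=None, metadata=None):
    """Assemble a webhook object, rejecting event names not listed above."""
    spec = {"url": url}
    if events is not None:
        unknown = set(events) - ALLOWED_EVENTS
        if unknown:
            raise ValueError(f"unsupported webhook events: {sorted(unknown)}")
        spec["events"] = list(events)
    if headers:
        spec["headers"] = dict(headers)
    if metadata:
        spec["metadata"] = dict(metadata)
    return spec


webhook = build_webhook(
    "https://example.com/hooks/firecrawl",  # hypothetical receiver URL
    events=["completed", "failed"],
    headers={"X-Hook-Token": "example-token"},  # hypothetical header
    metadata={"job": "nightly-docs"},  # hypothetical metadata
)
print(json.dumps({"url": "https://example.com", "webhook": webhook}, indent=2))
```

Omitting events leaves the property out of the object, which relies on the documented default of sending all event types.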

scrapeOptions

object

Options for scraping each page. See Scrape Options for full details. Common options include:

formats: Output formats (e.g., ["markdown", "html", "links"])
onlyMainContent: Extract only main content (default: true)
includeTags: HTML tags to include
excludeTags: HTML tags to exclude
waitFor: Milliseconds to wait before scraping
mobile: Emulate mobile device
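The common options listed above could be combined into a request body as in this sketch. The value types (arrays of strings for tags, an integer for waitFor, a boolean for mobile) are inferred from the descriptions and should be treated as assumptions, as should the helper name build_crawl_body and the sample values.

```python
import json


def build_crawl_body(url, limit=None, **scrape_options):
    """Assemble a crawl request body; keyword arguments become scrapeOptions."""
    body = {"url": url}
    if limit is not None:
        body["limit"] = limit
    if scrape_options:
        body["scrapeOptions"] = scrape_options
    return body


body = build_crawl_body(
    "https://example.com",
    limit=100,
    formats=["markdown", "links"],
    onlyMainContent=True,
    excludeTags=["nav", "footer"],  # illustrative values
    waitFor=1000,  # milliseconds
    mobile=False,
)
print(json.dumps(body, indent=2))
```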

Response

success

boolean

Indicates if the crawl job was successfully started

id

string

The unique identifier for the crawl job. Use this ID to check the status and retrieve results.

url

string

The base URL that is being crawled

Example Request

curl -X POST https://api.firecrawl.dev/v1/crawl \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -d '{
    "url": "https://example.com",
    "limit": 100,
    "scrapeOptions": {
      "formats": ["markdown", "html"]
    }
  }'

Example Response

{
  "success": true,
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "url": "https://example.com"
}
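A response like the one above can be parsed with the standard library. The sketch below uses a hypothetical helper, extract_crawl_id, and assumes an unsuccessful response may carry an error field like the error bodies documented on this page.

```python
import json


def extract_crawl_id(response_text: str) -> str:
    """Return the crawl job id, or raise if the job was not started."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        message = payload.get("error", "unknown error")
        raise RuntimeError(f"crawl was not started: {message}")
    return payload["id"]


sample = """{
  "success": true,
  "id": "123e4567-e89b-12d3-a456-426614174000",
  "url": "https://example.com"
}"""
print(extract_crawl_id(sample))
```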

Error Responses

402 Payment Required

{
  "error": "Payment required to access this resource."
}

429 Too Many Requests

{
  "error": "Request rate limit exceeded. Please wait and try again later."
}

500 Server Error

{
  "error": "An unexpected error occurred on the server."
}
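A client might treat these status codes differently, for example by backing off on 429 and giving up on 402. The sketch below is one possible policy, not a documented recommendation; treating 500 as retryable and the specific delay values are assumptions, and next_delay is a hypothetical name.

```python
RETRYABLE_STATUSES = {429, 500}  # assumption: 500 is worth retrying, 402 is not


def next_delay(status, attempt, base_delay=1.0, max_delay=60.0):
    """Return seconds to wait before retry number `attempt`, or None to stop."""
    if status not in RETRYABLE_STATUSES:
        return None
    # Exponential backoff, capped at max_delay.
    return min(max_delay, base_delay * 2 ** attempt)


for attempt in range(4):
    print(attempt, next_delay(429, attempt))
```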

Next Steps

After starting a crawl job, use the returned id to:

Check crawl status and retrieve results
Get crawl errors if any occurred
Cancel the crawl if needed
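The follow-up calls could be prepared as in the sketch below. The endpoint paths and HTTP methods shown here are assumptions, since this page does not state them; check the corresponding endpoint pages before relying on them. The helper follow_up_requests is hypothetical.

```python
CRAWL_ENDPOINT = "https://api.firecrawl.dev/v1/crawl"


def follow_up_requests(crawl_id: str) -> dict:
    """Map each next step to an assumed (method, URL) pair for the given id."""
    return {
        "status": ("GET", f"{CRAWL_ENDPOINT}/{crawl_id}"),  # assumed path
        "errors": ("GET", f"{CRAWL_ENDPOINT}/{crawl_id}/errors"),  # assumed path
        "cancel": ("DELETE", f"{CRAWL_ENDPOINT}/{crawl_id}"),  # assumed path
    }


steps = follow_up_requests("123e4567-e89b-12d3-a456-426614174000")
for name, (method, url) in steps.items():
    print(name, method, url)
```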
