Endpoint

POST /v1/scrape

Authentication

This endpoint requires authentication using a Bearer token. Include your API key in the Authorization header:
Authorization: Bearer YOUR_API_KEY

Request Body

url
string
required
The URL to scrape
formats
array
default:["markdown"]
Formats to include in the output. Options: markdown, html, rawHtml, links, screenshot, screenshot@fullPage, json, changeTracking, branding
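A minimal Python sketch of checking a formats list on the client before sending it, using only the option names listed above. The helper name `validate_formats` is hypothetical, and whether the server rejects unknown values in the same way is an assumption.

```python
# Option names copied from the formats description above.
FORMAT_OPTIONS = {
    "markdown", "html", "rawHtml", "links", "screenshot",
    "screenshot@fullPage", "json", "changeTracking", "branding",
}


def validate_formats(formats=None):
    """Return the formats list, applying the documented default of ["markdown"]."""
    formats = ["markdown"] if formats is None else list(formats)
    unknown = [f for f in formats if f not in FORMAT_OPTIONS]
    if unknown:
        raise ValueError(f"unknown formats: {unknown}")
    return formats
```

Calling `validate_formats(["markdown", "html"])` returns the list unchanged, while a misspelled name raises before any request is made.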
onlyMainContent
boolean
default:true
Only return the main content of the page excluding headers, navs, footers, etc.
includeTags
array
Tags to include in the output.
excludeTags
array
Tags to exclude from the output.
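As an illustration of how the content-filtering fields might be combined, the sketch below assembles a request body and leaves out optional fields that were not set. The function name `build_payload` and the tag values `nav` and `footer` are illustrative assumptions, not values taken from the API.

```python
import json


def build_payload(url, include_tags=None, exclude_tags=None, only_main_content=True):
    """Assemble a request body, omitting tag filters that were not provided."""
    payload = {"url": url, "onlyMainContent": only_main_content}
    if include_tags:
        payload["includeTags"] = list(include_tags)
    if exclude_tags:
        payload["excludeTags"] = list(exclude_tags)
    return payload


# Illustrative tag names; pick the ones that match the page being scraped.
body = json.dumps(build_payload("https://example.com", exclude_tags=["nav", "footer"]))
```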
maxAge
integer
default:0
Returns a cached version of the page if it is younger than this age in milliseconds. If the cached version is older than this value, the page is scraped again. If you do not need extremely fresh data, setting a nonzero value can speed up your scrapes by 500%. Defaults to 0, which disables caching.
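Because maxAge is expressed in milliseconds, it is easy to be off by a factor of 1000. The sketch below converts a duration into the integer the field expects. The helper name `max_age_ms` is hypothetical, and rejecting negative values is an assumption about sensible input rather than documented behavior.

```python
from datetime import timedelta


def max_age_ms(age: timedelta) -> int:
    """Convert a duration to the integer millisecond value used by maxAge."""
    ms = int(age.total_seconds() * 1000)
    if ms < 0:
        # Assumption: a negative age has no meaning for a cache lookup.
        raise ValueError("maxAge cannot be negative")
    return ms


payload = {
    "url": "https://example.com",
    "maxAge": max_age_ms(timedelta(hours=1)),  # accept cached pages up to 1 hour old
}
```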
headers
object
Headers to send with the request. Can be used to send cookies, user-agent, etc.
waitFor
integer
default:0
Specify a delay in milliseconds before fetching the content, allowing the page sufficient time to load.
mobile
boolean
default:false
Set to true if you want to emulate scraping from a mobile device. Useful for testing responsive pages and taking mobile screenshots.
skipTlsVerification
boolean
default:false
Skip TLS certificate verification when making requests
timeout
integer
default:30000
Timeout in milliseconds for the request
parsePDF
boolean
default:true
Controls how PDF files are processed during scraping. When true, the PDF content is extracted and converted to markdown format, with billing based on the number of pages (1 credit per page). When false, the PDF file is returned in base64 encoding with a flat rate of 1 credit total.
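The billing rule for PDFs can be written as a small function. This is a sketch of the arithmetic described above only; the name `pdf_scrape_credits` is hypothetical and actual billing may include factors not covered in this section.

```python
def pdf_scrape_credits(page_count: int, parse_pdf: bool = True) -> int:
    """Estimate credits for scraping one PDF under the parsePDF rule."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    # parsePDF=true: 1 credit per page. parsePDF=false: flat rate of 1 credit.
    return page_count if parse_pdf else 1
```

For a 12-page PDF, `pdf_scrape_credits(12)` gives 12 and `pdf_scrape_credits(12, parse_pdf=False)` gives 1.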
jsonOptions
object
Options for JSON extraction, used when formats includes json.
actions
array
Actions to perform on the page before grabbing the content
location
object
Location settings for the request. When specified, this will use an appropriate proxy if available and emulate the corresponding language and timezone settings. Defaults to ‘US’ if not specified.
removeBase64Images
boolean
Removes all base64 images from the output, which may be overwhelmingly long. The image’s alt text remains in the output, but the URL is replaced with a placeholder.
blockAds
boolean
default:true
Enables ad-blocking and cookie popup blocking.
proxy
string
Specifies the type of proxy to use. Options: basic, enhanced, auto
  • basic: Proxies for scraping sites with none to basic anti-bot solutions. Fast and usually works.
  • enhanced: Enhanced proxies for scraping sites with advanced anti-bot solutions. Slower, but more reliable on certain sites. Costs up to 5 credits per request.
  • auto: Firecrawl will automatically retry scraping with enhanced proxies if the basic proxy fails. If the retry with enhanced is successful, 5 credits will be billed for the scrape. If the first attempt with basic is successful, only the regular cost will be billed.
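The auto billing rule above can be sketched as follows. The function names are hypothetical, `regular_cost` is left as a parameter because this section does not state its value, and the case where both attempts fail returns `None` because its billing is not described here.

```python
from typing import Optional

PROXY_OPTIONS = ("basic", "enhanced", "auto")  # values listed above
ENHANCED_CREDITS = 5  # billed when the enhanced retry succeeds


def validate_proxy(value: str) -> str:
    """Reject proxy values that are not one of the documented options."""
    if value not in PROXY_OPTIONS:
        raise ValueError(f"proxy must be one of {PROXY_OPTIONS}, got {value!r}")
    return value


def auto_proxy_credits(basic_ok: bool, enhanced_ok: bool, regular_cost: int) -> Optional[int]:
    """Credits billed for proxy=auto, following the rule described above."""
    if basic_ok:
        return regular_cost  # first attempt succeeded, no enhanced retry
    if enhanced_ok:
        return ENHANCED_CREDITS  # basic failed, enhanced retry succeeded
    return None  # both failed: billing is not stated in this section
```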
storeInCache
boolean
default:true
If true, the page will be stored in the Firecrawl index and cache. Setting this to false is useful if your scraping activity may have data protection concerns. Using some parameters associated with sensitive scraping (actions, headers) will force this parameter to be false.
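To predict on the client whether a request will be cached, one could mirror the override rule roughly like this. It is a sketch under assumptions: only the two parameters named above (actions, headers) are checked, the full server-side list may be longer, and treating an empty value as "not used" is a guess.

```python
# The two examples named in the description; the real list may include more.
SENSITIVE_PARAMS = ("actions", "headers")


def effective_store_in_cache(payload: dict) -> bool:
    """Guess whether the page will be stored, given a request body."""
    # Assumption: only non-empty values count as using the parameter.
    if any(payload.get(name) for name in SENSITIVE_PARAMS):
        return False
    return payload.get("storeInCache", True)  # documented default is true
```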

Response

success
boolean
Indicates whether the scrape was successful
data
object
The scraped data
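A minimal sketch of reading the two response fields. The helper name `extract_data` is hypothetical, the sample body is invented for illustration, and the `markdown` key inside `data` is an assumption based on the default format rather than something this section specifies.

```python
import json


def extract_data(response_text: str) -> dict:
    """Return the data object, or raise if success is not true."""
    body = json.loads(response_text)
    if not body.get("success"):
        raise RuntimeError(f"scrape failed: {body.get('error', 'no error message')}")
    return body.get("data", {})


# Hypothetical response body, not captured from the API.
sample = '{"success": true, "data": {"markdown": "# Example Domain"}}'
data = extract_data(sample)
```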

Examples

curl -X POST https://api.firecrawl.dev/v1/scrape \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -d '{
    "url": "https://example.com",
    "formats": ["markdown", "html"]
  }'
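The same request can be sketched in Python with only the standard library. The function name `build_request` is an assumption for illustration; the URL, headers, and body mirror the curl command above. The request is built but not sent, so the snippet runs without a key or network access.

```python
import json
import urllib.request

API_URL = "https://api.firecrawl.dev/v1/scrape"


def build_request(api_key, url, formats):
    """Build (but do not send) a POST request equivalent to the curl example."""
    body = json.dumps({"url": url, "formats": formats}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_request("YOUR_API_KEY", "https://example.com", ["markdown", "html"])
# To send it (needs a valid key and network access):
#     with urllib.request.urlopen(request, timeout=60) as response:
#         result = json.load(response)
```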

Error Responses

402
object
Payment Required - Payment required to access this resource.
{
  "error": "Payment required to access this resource."
}
429
object
Too Many Requests - Request rate limit exceeded.
{
  "error": "Request rate limit exceeded. Please wait and try again later."
}
500
object
Server Error - An unexpected error occurred on the server.
{
  "error": "An unexpected error occurred on the server."
}
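
One way a client might react to these statuses is to retry only on 429 and surface everything else. The sketch below is an assumption-heavy illustration: the function name `scrape_with_retry`, the exponential backoff schedule, and treating status 200 as success are all choices made here, not behavior documented above. The `send` callable is injected so the logic can be exercised without network access.

```python
import time
from typing import Callable, Tuple

# Labels taken from the error descriptions above.
ERROR_MEANINGS = {
    402: "Payment Required",
    429: "Too Many Requests",
    500: "Server Error",
}


def scrape_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send() and retry on 429 with exponential backoff."""
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:  # assumption: 200 indicates success
            return body
        if status == 429 and attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
            continue
        label = ERROR_MEANINGS.get(status, "Unexpected status")
        raise RuntimeError(f"{status} {label}: {body.get('error', '')}")
    raise RuntimeError("max_attempts must be at least 1")
```

A 402 or 500 is raised immediately in this sketch, since waiting is unlikely to resolve a billing problem and the text above does not say whether server errors are safe to retry.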