Self-Hosting Overview

Self-hosting Firecrawl gives you full control over your web scraping infrastructure, allowing you to run Firecrawl on your own servers or local environment.

Why Self-Host?

Self-hosting Firecrawl is particularly beneficial for organizations with stringent security policies that require data to remain within controlled environments. Here are some key reasons to consider self-hosting:

Enhanced Security and Compliance: Self-hosting keeps all data handling and processing inside your own secure infrastructure, which helps you comply with internal and external regulations. Note that Firecrawl is a Mendable product and relies on SOC 2 Type II certification, which means that the platform adheres to high industry standards for managing data security.
Customizable Services: Self-hosting allows you to tailor the services, such as the Playwright service, to meet specific needs or handle particular use cases that may not be supported by the standard cloud offering.
Learning and Community Contribution: By setting up and maintaining your own instance, you gain a deeper understanding of how Firecrawl works, which can also lead to more meaningful contributions to the project.

Cloud vs Self-Hosted

Firecrawl is open source under the AGPL-3.0 license. The cloud version at firecrawl.dev includes additional features beyond the open-source release.

Cloud Features

The cloud version includes:

Advanced AI capabilities: Agent endpoint for autonomous data gathering
Fire-engine: Advanced features for handling IP blocks, robot detection mechanisms, and more
Managed infrastructure: No maintenance or configuration required
Automatic scaling: Handle any volume of requests
Premium support: Direct support from the Firecrawl team

Self-Hosted Capabilities

When self-hosting, you have access to:

Core scraping features: Scrape, crawl, and map endpoints
Playwright rendering: JavaScript rendering and dynamic content support
Custom configurations: Full control over proxy settings, resource limits, and more
Local deployment: Run entirely within your infrastructure

Note: The repository is still in development, and custom modules are still being integrated into the monorepo. It is not yet fully ready for production self-hosted deployment, but you can run it locally.

Limitations and Considerations

Self-hosting comes with some limitations and additional responsibilities:

Limited Access to Fire-engine: Currently, self-hosted instances of Firecrawl do not have access to Fire-engine, which includes advanced features for handling IP blocks, robot detection mechanisms, and more. This means that while you can manage basic scraping tasks, more complex scenarios might require additional configuration or might not be supported.
Manual Configuration Required: If you need to use scraping methods beyond the basic fetch and Playwright options, you will need to manually configure these in the .env file. This requires a deeper understanding of the technologies and might involve more setup time.
No Supabase Support: Right now it’s not possible to configure Supabase in self-hosted instances, which means advanced logging and DB authentication features are not available.
Additional Maintenance: You are responsible for updates, security patches, and infrastructure maintenance.

API Keys and Authentication

When using Firecrawl SDKs with a self-hosted instance, API keys are optional. API keys are only required when connecting to the cloud service (api.firecrawl.dev).
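
As a rough illustration of the optional key, the sketch below builds a scrape request with Python's standard library and attaches an Authorization header only when a key is supplied. The base URL http://localhost:3002, the /v1/scrape path, the placeholder key, and the helper name build_scrape_request are assumptions made for this example, not details confirmed on this page; check your instance's port and API version before relying on them.

```python
import json
import urllib.request
from typing import Optional


def build_scrape_request(
    base_url: str, target_url: str, api_key: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST request for a scrape call.

    The "/v1/scrape" path is an assumption; adjust it to your API version.
    """
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only the cloud service (api.firecrawl.dev) requires a key.
        headers["Authorization"] = f"Bearer {api_key}"
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/scrape",
        data=body,
        headers=headers,
        method="POST",
    )


# Self-hosted instance (assumed to listen on localhost:3002): no key needed.
local_request = build_scrape_request("http://localhost:3002", "https://example.com")

# Cloud service: a key is required; "fc-YOUR-API-KEY" is a placeholder.
cloud_request = build_scrape_request(
    "https://api.firecrawl.dev", "https://example.com", api_key="fc-YOUR-API-KEY"
)
```

Nothing is sent until you pass one of these objects to urllib.request.urlopen, so the helper can be inspected safely before the instance is running.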

By default, self-hosted instances run with USE_DB_AUTHENTICATION=false, which bypasses authentication. This is suitable for local development or internal deployments behind a firewall.
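
Because this setting decides whether requests are authenticated at all, it can be worth checking before exposing an instance. The following is a minimal sketch of such a check, assuming a simple KEY=VALUE .env layout; the helper names, the sample file contents, and the choice to treat a missing variable as disabled are assumptions of this example.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def db_authentication_enabled(env_text: str) -> bool:
    """Return True only if USE_DB_AUTHENTICATION is explicitly set to true.

    Treating a missing variable as disabled is this sketch's assumption.
    """
    value = parse_env(env_text).get("USE_DB_AUTHENTICATION", "false")
    return value.lower() == "true"


# Hypothetical .env contents for a local instance.
sample_env = """
# Local development settings
PORT=3002
USE_DB_AUTHENTICATION=false
"""

if not db_authentication_enabled(sample_env):
    print("Authentication is bypassed: keep this instance off the public internet.")
```

In practice you would read the text from your own .env file instead of the inline sample string.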

Getting Started

Ready to get started? Head to the Setup page for step-by-step instructions on installing and running Firecrawl.
