# Python SDK

Python SDK gives you a handy abstraction to interact with **Scrapfly API**.
It includes all Scrapfly features and many convenient shortcuts:

- Automatic base64 encoding of JS snippets
- Error handling
- JSON encoding of the body if `Content-Type: application/json`
- URL encoding of the body, with `Content-Type: application/x-www-form-urlencoded` set, if no content type is specified
- Conversion of binary responses into a Python `BytesIO` object

### Step by Step Introduction

For a hands-on introduction, see our Scrapfly SDK introduction page!

[ Discover Now ](https://scrapfly.io/docs/onboarding)

The full Python API specification is available here: [https://scrapfly.github.io/python-scrapfly/scrapfly/](https://scrapfly.github.io/python-scrapfly/scrapfly/)

> For more on using the Python SDK with Scrapfly, select the "Python SDK" option in the Scrapfly docs top bar.

## Installation

The source code of the **Python SDK** is available on [Github](https://github.com/scrapfly/python-scrapfly). The **scrapfly-sdk** package is available through [PyPI](https://pypi.org).

```
pip install 'scrapfly-sdk'
```

You can also install the extra package `scrapfly-sdk[speedups]` to benefit from **[brotli](https://github.com/google/brotli)** compression and **[msgpack](https://msgpack.org)** serialization.

```
pip install 'scrapfly-sdk[speedups]'
```

You can also install `scrapfly-sdk[all]` to get all optional Scrapfly features, without any negative impact on performance.
```
pip install 'scrapfly-sdk[all]'
```

## Scrape

> If you plan to scrape protected websites, **make sure to enable [Anti Scraping Protection](https://scrapfly.io/docs/onboarding#asp)**.

```
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

scrapfly = ScrapflyClient(key='{{ YOUR_API_KEY }}')

api_response: ScrapeApiResponse = scrapfly.scrape(scrape_config=ScrapeConfig(url='https://httpbin.dev/anything'))

# Automatically retry errors marked "retryable", waiting the recommended delay before each retry
api_response: ScrapeApiResponse = scrapfly.resilient_scrape(scrape_config=ScrapeConfig(url='https://httpbin.dev/anything'))

# Automatically retry based on the status code
api_response: ScrapeApiResponse = scrapfly.resilient_scrape(scrape_config=ScrapeConfig(url='https://httpbin.dev/status/500'), retry_on_status_code=[500])

# Scrape result: content, iframes, response headers, response cookies, states, screenshots, ssl, dns, etc.
print(api_response.scrape_result)

# HTML content
print(api_response.scrape_result['content'])

# Context of the scrape: session, webhook, asp, cache, debug
print(api_response.context)

# Raw API result
print(api_response.content)

# True if the scrape responded with an HTTP status >= 200 and < 300
print(api_response.success)

# API status code /!\ Not the status code of the scraped website!
print(api_response.status_code)

# Upstream website status code
print(api_response.upstream_status_code)

# Convert the API scrape result into a well-known requests.Response object
print(api_response.upstream_result_into_response())
```

Discover the full Python specification:

- Client :
- ScrapeConfig : [https://scrapfly.github.io/python-scrapfly/scrapfly/scrape\_config.html](https://scrapfly.github.io/python-scrapfly/scrapfly/scrape_config.html)
- API response : [https://scrapfly.github.io/python-scrapfly/scrapfly/api\_response.html](https://scrapfly.github.io/python-scrapfly/scrapfly/api_response.html)

### Using Context

```
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

scrapfly = ScrapflyClient(key='{{ YOUR_API_KEY }}')

with scrapfly as scraper:
    response: ScrapeApiResponse = scraper.scrape(ScrapeConfig(url='https://httpbin.dev/anything', country='fr'))
```

## How to configure Scrape Query

See the `ScrapeConfig` implementation for all available options, [available here.](https://scrapfly.github.io/python-scrapfly/scrapfly/scrape_config.html) All parameters listed in this documentation can be used when you construct the scrape config object.

## Download Binary Response

```
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

scrapfly = ScrapflyClient(key='{{ YOUR_API_KEY }}')

api_response: ScrapeApiResponse = scrapfly.scrape(scrape_config=ScrapeConfig(url='https://www.intel.com/content/www/us/en/ethernet-controllers/82599-10-gbe-controller-datasheet.html'))
scrapfly.sink(api_response)  # you can specify path and name via named arguments
```

## Error Handling

Error handling is a big part of any scraper, so the SDK provides a system of exceptions that reflects what went wrong, letting you handle each failure properly in your scraper.
Here is a simple snippet to handle errors on your own:

```
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse, UpstreamHttpClientError, \
    ScrapflyScrapeError, UpstreamHttpServerError

scrapfly = ScrapflyClient(key='{{ YOUR_API_KEY }}')

try:
    api_response: ScrapeApiResponse = scrapfly.scrape(scrape_config=ScrapeConfig(
        url='https://httpbin.dev/status/404',
    ))
except UpstreamHttpClientError as e:  # HTTP >= 400 and < 500
    print(e.api_response.scrape_result['error'])
    raise e
except UpstreamHttpServerError as e:  # HTTP >= 500
    print(e.api_response.scrape_result['error'])
    raise e
# UpstreamHttpError can be used to catch all errors related to the upstream website
except ScrapflyScrapeError as e:
    print(e.message)
    print(e.code)
    raise e
```

Errors, with their codes and explanations, are documented [here](https://scrapfly.io/docs/scrape-api/errors) if you want to know more.

- [scrapfly.UpstreamHttpClientError](https://scrapfly.github.io/python-scrapfly/scrapfly/index.html#scrapfly.UpstreamHttpClientError) The upstream website that you scrape responded with an HTTP code >= 400 and < 500
- [scrapfly.UpstreamHttpServerError](https://scrapfly.github.io/python-scrapfly/scrapfly/index.html#scrapfly.UpstreamHttpServerError) The upstream website that you scrape responded with an HTTP code >= 500 and < 600
- [scrapfly.ApiHttpClientError](https://scrapfly.github.io/python-scrapfly/scrapfly/index.html#scrapfly.ApiHttpClientError) The Scrapfly API responded with an HTTP code >= 400 and < 500
- [scrapfly.ApiHttpServerError](https://scrapfly.github.io/python-scrapfly/scrapfly/index.html#scrapfly.ApiHttpServerError) The Scrapfly API responded with an HTTP code >= 500 and < 600
- [scrapfly.ScrapflyProxyError](https://scrapfly.github.io/python-scrapfly/scrapfly/index.html#scrapfly.ScrapflyProxyError) Error related to Proxy
- [scrapfly.ScrapflyThrottleError](https://scrapfly.github.io/python-scrapfly/scrapfly/index.html#scrapfly.ScrapflyThrottleError) Error related to Throttle
- [scrapfly.ScrapflyAspError](https://scrapfly.github.io/python-scrapfly/scrapfly/index.html#scrapfly.ScrapflyAspError) Error related to ASP
- [scrapfly.ScrapflyScheduleError](https://scrapfly.github.io/python-scrapfly/scrapfly/index.html#scrapfly.ScrapflyScheduleError) Error related to Schedule
- [scrapfly.ScrapflyWebhookError](https://scrapfly.github.io/python-scrapfly/scrapfly/index.html#scrapfly.ScrapflyWebhookError) Error related to Webhook
- [scrapfly.ScrapflySessionError](https://scrapfly.github.io/python-scrapfly/scrapfly/index.html#scrapfly.ScrapflySessionError) Error related to Session
- [scrapfly.TooManyConcurrentRequest](https://scrapfly.github.io/python-scrapfly/scrapfly/index.html#scrapfly.TooManyConcurrentRequest) Maximum number of concurrent requests allowed by your plan reached
- [scrapfly.QuotaLimitReached](https://scrapfly.github.io/python-scrapfly/scrapfly/index.html#scrapfly.QuotaLimitReached) Quota limit of your plan or project reached

```
error.message            # Message
error.code               # Error code
error.retry_delay        # Recommended wait time before retrying, if retryable
error.retry_times        # Recommended number of retries, if retryable
error.resource           # Related resource: Proxy, ASP, Webhook, Spider
error.is_retryable       # True or False
error.documentation_url  # Documentation explaining the error in detail
error.api_response       # API response object
error.http_status_code   # HTTP code
```

By default, if the upstream website that you scrape responds with a bad HTTP code, the SDK raises `UpstreamHttpClientError` or `UpstreamHttpServerError`, depending on the HTTP status code. You can disable this behavior by setting the `raise_on_upstream_error` attribute to `False`:

`ScrapeConfig(raise_on_upstream_error=False)`

If you want to report errors to your own app for monitoring or tracking purposes, check out the [reporter](https://scrapfly.io/docs/onboarding#reporter) feature.
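The error attributes listed above can also drive a hand-written retry loop when `resilient_scrape` does not fit your needs. The following is only a minimal sketch, not the SDK's own retry logic: `StandInScrapeError` and `call_with_retry` are hypothetical names introduced for illustration, and the stand-in exception merely mimics the `is_retryable` and `retry_delay` attributes so that the example runs without the SDK or network access.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class StandInScrapeError(Exception):
    """Hypothetical stand-in exposing the attributes listed above (not an SDK class)."""

    def __init__(self, message: str, is_retryable: bool, retry_delay: Optional[float] = None):
        super().__init__(message)
        self.message = message
        self.is_retryable = is_retryable
        self.retry_delay = retry_delay


def call_with_retry(func: Callable[[], T], max_attempts: int = 3,
                    default_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying only errors that report is_retryable=True."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        attempt += 1
        try:
            return func()
        except Exception as error:
            # Errors without the attribute are treated as non-retryable.
            if not getattr(error, "is_retryable", False) or attempt >= max_attempts:
                raise
            # Prefer the delay recommended by the error, else fall back to a default.
            sleep(getattr(error, "retry_delay", None) or default_delay)


# Demonstration with a function that fails twice before succeeding.
attempts = []
waits = []


def flaky() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise StandInScrapeError("temporary failure", is_retryable=True, retry_delay=0.01)
    return "ok"


# The sleep function is injected so the demo records delays instead of waiting.
result = call_with_retry(flaky, sleep=waits.append)
print(result, len(attempts), waits)
```

In real code, `func` would wrap a call such as `scrapfly.scrape(...)`, and the default `time.sleep` would be left in place.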
## Account

You can retrieve account information:

```
from scrapfly import ScrapflyClient

scrapfly = ScrapflyClient(key='{{ YOUR_API_KEY }}')
print(scrapfly.account())
```

## Keep Alive HTTP Session

Take advantage of `Keep-Alive` connections:

```
from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

scrapfly = ScrapflyClient(key='{{ YOUR_API_KEY }}')

with scrapfly as client:
    api_response: ScrapeApiResponse = client.scrape(scrape_config=ScrapeConfig(
        url='https://news.ycombinator.com/',
        render_js=True,
        screenshots={
            'main': 'fullpage'
        }
    ))
    # more scrape calls
```

## Concurrency out of the box

You can run scrapes concurrently out of the box. We use `asyncio` for that.

In Python, there are many ways to achieve concurrency. You can also check:

- [ProcessPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor)
- [ThreadPoolExecutor](https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ThreadPoolExecutor)

First of all, ensure you have installed the concurrency module:

```
pip install 'scrapfly-sdk[concurrency]'
```

```
import asyncio
import logging as logger
from sys import stdout

from scrapfly import ScrapeConfig, ScrapflyClient, ScrapeApiResponse

# Send the SDK debug logs to stdout
scrapfly_logger = logger.getLogger('scrapfly')
scrapfly_logger.setLevel(logger.DEBUG)
scrapfly_logger.addHandler(logger.StreamHandler(stdout))

scrapfly = ScrapflyClient(key='{{ YOUR_API_KEY }}', max_concurrency=2)

async def main():
    # Eight identical scrape configs, processed at most two at a time
    targets = [
        ScrapeConfig(url='https://httpbin.dev/anything', render_js=True)
        for _ in range(8)
    ]

    async for result in scrapfly.concurrent_scrape(scrape_configs=targets):
        print(result)

asyncio.run(main())
```

## Webhook Server

The **Scrapfly Python SDK** offers a built-in webhook server feature, allowing developers to easily set up and handle webhooks for receiving notifications and data from Scrapfly services. This documentation provides an overview of the `create_server` function within the SDK, along with an example of its usage.

### Example Usage

> To expose the local server to the internet we use [ngrok](https://ngrok.com/), and you need a free account to run the example.

Below is an example demonstrating how to use the `create_server` function to set up a webhook server:

1. Install dependencies: `pip install ngrok flask scrapfly-sdk`
2. Export your ngrok auth token in your terminal: `export NGROK_AUTHTOKEN=MY_NGROK_TOKEN`
3. Create a webhook on your [Scrapfly dashboard](https://scrapfly.io/dashboard/webhook) with any endpoint (for example, one from [https://webhook.site](https://webhook.site/)). Since the ngrok endpoint is only known at runtime and is random on each run, we will edit the endpoint once ngrok has advertised it in a later step.
4. Retrieve your webhook signing secret
5. Run the command `python webhook_server.py --signing-secret=MY_SIGNING_SECRET`
6. Once the server is running, copy the exposed URL advertised below the log line `"====== LISTENING ON ======"`
7. [Edit your webhook](https://scrapfly.io/dashboard/webhook) URL and replace it with the advertised URL

> With the ngrok free plan, a new random tunnel URL is assigned on each start of the server, so you need to edit the webhook each time.

```
import argparse
from typing import Dict

import flask
import ngrok

from scrapfly import webhook
from scrapfly.webhook import ResourceType

# Define the webhook callback function
def webhook_callback(data: Dict, resource_type: ResourceType, request: flask.Request):
    if resource_type == ResourceType.SCRAPE.value:
        # Process scrape result
        upstream_response = data['result']
        print(upstream_response)
    else:
        # Process other resource types
        print(data)

# Set up ngrok listener for tunneling
listener = ngrok.werkzeug_develop()

# Parse command-line arguments
parser = argparse.ArgumentParser(description="Webhook server with signing secret")
parser.add_argument("--signing-secret", required=True, help="Signing secret to verify webhook payload integrity")
args = parser.parse_args()

# Create Flask application and set up webhook server
app = flask.Flask("Scrapfly Webhook Server")
webhook.create_server(signing_secrets=(args.signing_secret,), callback=webhook_callback, app=app)

# Start the server and print the webhook endpoint URL
print("====== LISTENING ON ======")
print(listener.url() + "/webhook")
print("==========================")

app.run()
```

In this example, the webhook server is set up using `create_server`, with a callback function `webhook_callback` defined to handle incoming webhook payloads. The signing secret is provided as a command-line argument, and ngrok is used to expose the local server to the internet for testing.

## External Integration

### LlamaIndex

LlamaIndex, formerly known as GPT Index, is a data framework designed to facilitate the connection between large language models (LLMs) and a wide variety of data sources. It provides tools to effectively ingest, index, and query data within these models.
[ Integrate Scrapfly with LlamaIndex ](https://docs.llamaindex.ai/en/stable/examples/data_connectors/WebPageDemo/?h=scrap#using-scrapfly)

### LangChain

LangChain is a robust framework designed for developing applications powered by language models. It focuses on enabling the creation of applications that can leverage the capabilities of large language models (LLMs) for a variety of use cases.

[ Integrate Scrapfly with LangChain ](https://python.langchain.com/v0.2/docs/integrations/document_loaders/scrapfly/#scrapfly)