Version: 1.7

ApifyHttpProxyMiddleware

Apify HTTP proxy middleware for Scrapy.

This middleware adds a 'proxy' field to the request's meta and an authentication header to each request. It is modeled on the HttpProxyMiddleware that Scrapy enables by default. The proxy URL is sourced from the settings under the APIFY_PROXY_SETTINGS key. The value of this key is a dictionary and should be provided by the Actor input. An example of the proxy settings:

proxy_settings = {'useApifyProxy': True, 'apifyProxyGroups': []}
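
A minimal sketch of how such a dictionary could be placed under the APIFY_PROXY_SETTINGS key. The `proxyConfiguration` input key and the `build_settings` helper are hypothetical names chosen for illustration, and a plain dict stands in for the Scrapy settings object. Only the APIFY_PROXY_SETTINGS key and the shape of the dictionary come from the description above.

```python
from __future__ import annotations

from typing import Any


def build_settings(actor_input: dict[str, Any]) -> dict[str, Any]:
    """Copy proxy settings from the Actor input into a settings mapping.

    Hypothetical helper: the 'proxyConfiguration' input key is an assumption.
    """
    settings: dict[str, Any] = {}
    proxy_settings = actor_input.get('proxyConfiguration')
    if proxy_settings is not None:
        # The middleware looks up its configuration under this key.
        settings['APIFY_PROXY_SETTINGS'] = proxy_settings
    return settings


actor_input = {'proxyConfiguration': {'useApifyProxy': True, 'apifyProxyGroups': []}}
settings = build_settings(actor_input)
print(settings['APIFY_PROXY_SETTINGS'])
```

If the Actor input carries no proxy configuration, the key is simply left unset in this sketch.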

Index

Methods

__init__

  • __init__(proxy_settings): None
  • Create a new instance.


    Parameters

    • proxy_settings: dict

      Dictionary containing proxy settings, provided by the Actor input.

    Returns None

from_crawler

  • from_crawler(crawler): ApifyHttpProxyMiddleware
  • Create an instance of ApifyHttpProxyMiddleware from a Scrapy Crawler.


    Parameters

    • crawler: Crawler

      Scrapy Crawler object.

    Returns ApifyHttpProxyMiddleware

    ApifyHttpProxyMiddleware: Instance of the class.
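
The factory pattern described here can be sketched with stand-in classes. `FakeCrawler` and `ProxyMiddlewareSketch` are hypothetical substitutes for the Scrapy Crawler and for ApifyHttpProxyMiddleware, and raising `ValueError` for a missing key is an assumption. The real class may handle that case differently.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class FakeCrawler:
    """Stand-in for a Scrapy Crawler; only `settings` is modeled."""

    settings: dict[str, Any] = field(default_factory=dict)


class ProxyMiddlewareSketch:
    """Hypothetical stand-in for ApifyHttpProxyMiddleware."""

    def __init__(self, proxy_settings: dict[str, Any]) -> None:
        self.proxy_settings = proxy_settings

    @classmethod
    def from_crawler(cls, crawler: FakeCrawler) -> ProxyMiddlewareSketch:
        # Read the dictionary stored under the APIFY_PROXY_SETTINGS key.
        proxy_settings = crawler.settings.get('APIFY_PROXY_SETTINGS')
        if proxy_settings is None:
            # Assumption: how a missing key is reported is not documented here.
            raise ValueError('APIFY_PROXY_SETTINGS is not set')
        return cls(proxy_settings)


crawler = FakeCrawler(
    settings={'APIFY_PROXY_SETTINGS': {'useApifyProxy': True, 'apifyProxyGroups': []}},
)
middleware = ProxyMiddlewareSketch.from_crawler(crawler)
print(middleware.proxy_settings)
```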

process_exception

  • process_exception(request, exception, spider): None | Request
  • Process an exception that occurs during request processing.


    Parameters

    • request: Request

      Scrapy Request object.

    • exception: Exception

      Exception object.

    • spider: Spider

      Scrapy Spider object.

    Returns None | Request

    Returns the request object if a TunnelError occurs, which halts its processing in the middleware pipeline. Returns None otherwise, so that request processing continues.
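
The return contract can be illustrated with a small stand-alone sketch. The `TunnelError` and `FakeRequest` classes below are local stand-ins defined only for this example, not the classes used by Scrapy, and the function shows the documented branching only.

```python
from __future__ import annotations


class TunnelError(Exception):
    """Local stand-in for the TunnelError named above."""


class FakeRequest:
    """Stand-in for a Scrapy Request; only the URL is modeled."""

    def __init__(self, url: str) -> None:
        self.url = url


def process_exception(
    request: FakeRequest,
    exception: Exception,
    spider: object | None = None,
) -> FakeRequest | None:
    # A tunnel failure hands the request back instead of passing it on.
    if isinstance(exception, TunnelError):
        return request
    # Any other exception is left for the rest of the pipeline to handle.
    return None


request = FakeRequest('https://example.com')
print(process_exception(request, TunnelError('tunnel failed')) is request)
print(process_exception(request, ValueError('unrelated')) is None)
```

In Scrapy, a Request returned from a downloader middleware's process_exception is rescheduled for download, so in practice the first branch leads to the request being tried again.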

process_request

  • async process_request(request, spider): None
  • Process a Scrapy request by assigning a new proxy.


    Parameters

    • request: Request

      Scrapy Request object.

    • spider: Spider

      Scrapy Spider object.

    Returns None

    None: The request is processed and the middleware pipeline can continue.
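
A rough sketch of what assigning a proxy to a request might look like, using only the standard library. The `FakeRequest` class, the `new_proxy_url` coroutine, and the example proxy URL are hypothetical. The `Proxy-Authorization` header with Basic credentials is an assumption modeled on Scrapy's HttpProxyMiddleware, since the description above only mentions "an authentication header".

```python
from __future__ import annotations

import asyncio
import base64
from dataclasses import dataclass, field
from urllib.parse import unquote, urlparse


@dataclass
class FakeRequest:
    """Stand-in for a Scrapy Request with `meta` and `headers` mappings."""

    url: str
    meta: dict = field(default_factory=dict)
    headers: dict = field(default_factory=dict)


async def new_proxy_url() -> str:
    """Hypothetical provider; a real one would derive the URL from the proxy settings."""
    return 'http://user:secret@proxy.example.com:8000'


async def process_request(request: FakeRequest, spider: object | None = None) -> None:
    proxy_url = await new_proxy_url()
    parsed = urlparse(proxy_url)

    # Record the proxy in the request's meta.
    request.meta['proxy'] = proxy_url

    # Build a Basic credential from the username and password in the URL.
    username = unquote(parsed.username or '')
    password = unquote(parsed.password or '')
    token = base64.b64encode(f'{username}:{password}'.encode('utf-8')).decode('ascii')
    request.headers['Proxy-Authorization'] = f'Basic {token}'

    # Returning None lets the remaining middlewares run.
    return None


request = FakeRequest(url='https://example.com')
result = asyncio.run(process_request(request))
print(request.meta['proxy'])
```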