CARVIEW |
Select Language
HTTP/2 200
date: Sun, 27 Jul 2025 09:10:36 GMT
content-type: text/html; charset=utf-8
vary: X-PJAX, X-PJAX-Container, Turbo-Visit, Turbo-Frame, X-Requested-With,Accept-Encoding, Accept, X-Requested-With
x-robots-tag: none
etag: W/"bcbd73f3d8504a92c868eef488b6be09"
cache-control: max-age=0, private, must-revalidate
strict-transport-security: max-age=31536000; includeSubdomains; preload
x-frame-options: deny
x-content-type-options: nosniff
x-xss-protection: 0
referrer-policy: no-referrer-when-downgrade
content-security-policy: default-src 'none'; base-uri 'self'; child-src github.githubassets.com github.com/assets-cdn/worker/ github.com/assets/ gist.github.com/assets-cdn/worker/; connect-src 'self' uploads.github.com www.githubstatus.com collector.github.com raw.githubusercontent.com api.github.com github-cloud.s3.amazonaws.com github-production-repository-file-5c1aeb.s3.amazonaws.com github-production-upload-manifest-file-7fdce7.s3.amazonaws.com github-production-user-asset-6210df.s3.amazonaws.com *.rel.tunnels.api.visualstudio.com wss://*.rel.tunnels.api.visualstudio.com objects-origin.githubusercontent.com copilot-proxy.githubusercontent.com proxy.individual.githubcopilot.com proxy.business.githubcopilot.com proxy.enterprise.githubcopilot.com *.actions.githubusercontent.com wss://*.actions.githubusercontent.com productionresultssa0.blob.core.windows.net/ productionresultssa1.blob.core.windows.net/ productionresultssa2.blob.core.windows.net/ productionresultssa3.blob.core.windows.net/ productionresultssa4.blob.core.windows.net/ productionresultssa5.blob.core.windows.net/ productionresultssa6.blob.core.windows.net/ productionresultssa7.blob.core.windows.net/ productionresultssa8.blob.core.windows.net/ productionresultssa9.blob.core.windows.net/ productionresultssa10.blob.core.windows.net/ productionresultssa11.blob.core.windows.net/ productionresultssa12.blob.core.windows.net/ productionresultssa13.blob.core.windows.net/ productionresultssa14.blob.core.windows.net/ productionresultssa15.blob.core.windows.net/ productionresultssa16.blob.core.windows.net/ productionresultssa17.blob.core.windows.net/ productionresultssa18.blob.core.windows.net/ productionresultssa19.blob.core.windows.net/ github-production-repository-image-32fea6.s3.amazonaws.com github-production-release-asset-2e65be.s3.amazonaws.com insights.github.com wss://alive.github.com api.githubcopilot.com api.individual.githubcopilot.com api.business.githubcopilot.com api.enterprise.githubcopilot.com; font-src github.githubassets.com; form-action 'self' github.com gist.github.com copilot-workspace.githubnext.com objects-origin.githubusercontent.com; frame-ancestors 'none'; frame-src viewscreen.githubusercontent.com notebooks.githubusercontent.com; img-src 'self' data: blob: github.githubassets.com media.githubusercontent.com camo.githubusercontent.com identicons.github.com avatars.githubusercontent.com private-avatars.githubusercontent.com github-cloud.s3.amazonaws.com objects.githubusercontent.com release-assets.githubusercontent.com secured-user-images.githubusercontent.com/ user-images.githubusercontent.com/ private-user-images.githubusercontent.com opengraph.githubassets.com copilotprodattachments.blob.core.windows.net/github-production-copilot-attachments/ github-production-user-asset-6210df.s3.amazonaws.com customer-stories-feed.github.com spotlights-feed.github.com objects-origin.githubusercontent.com *.githubusercontent.com; manifest-src 'self'; media-src github.com user-images.githubusercontent.com/ secured-user-images.githubusercontent.com/ private-user-images.githubusercontent.com github-production-user-asset-6210df.s3.amazonaws.com gist.github.com; script-src github.githubassets.com; style-src 'unsafe-inline' github.githubassets.com; upgrade-insecure-requests; worker-src github.githubassets.com github.com/assets-cdn/worker/ github.com/assets/ gist.github.com/assets-cdn/worker/
server: github.com
content-encoding: gzip
accept-ranges: bytes
set-cookie: _gh_sess=Equ2Lm%2BNQhlEb77dM5w%2BX3EB7KNhMJhJffkAX%2F2ObQ5nRGxBU%2FBzYwU2aKV9%2FaTpYgkVPCTWq7KhX51r7BYfzBVidkIWu1vQen8yrqvF8FdoYM7zkJ%2BZqBVbexRSHBB6gMUUAEtm5PeP1WLL8dV2AgViZvVmvhav4Js3%2BmDDnjHKFy008Y5ZSQo0klqkgsCnuLVWttiAD9ZVscdlDXjo7NmlfeRJel9pFxBQzbC42WLOFtfKYDyRxV8qDkAgf21XCt%2FvJYQNEqbBNzLjWXRgow%3D%3D--E11NqIuIzRNcu6bH--VNDyXxxPrDfTyKE37nYIHw%3D%3D; Path=/; HttpOnly; Secure; SameSite=Lax
set-cookie: _octo=GH1.1.286615363.1753607435; Path=/; Domain=github.com; Expires=Mon, 27 Jul 2026 09:10:35 GMT; Secure; SameSite=Lax
set-cookie: logged_in=no; Path=/; Domain=github.com; Expires=Mon, 27 Jul 2026 09:10:35 GMT; HttpOnly; Secure; SameSite=Lax
x-github-request-id: 86F8:1A1F38:CB8C0E:10D2201:6885ED0B
Introduction · rmax/scrapy-redis Wiki · GitHub
Skip to content
Navigation Menu
{{ message }}
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Introduction
Jeremy Chou edited this page May 31, 2023
·
6 revisions
Redis-based components for Scrapy.
- Free software: MIT license
- Python support: 3.8+
- Scrapy support: 2.6+
-
Distributed crawling/scraping
- You can start multiple spider instances that share a single redis queue. Best suitable for broad multi-domain crawls.
-
Distributed post-processing
- Scraped items gets pushed into a redis queued meaning that you can start as many as needed post-processing processes sharing the items queue.
-
Scrapy plug-and-play components
- Scheduler + Duplication Filter, Item Pipeline, Base Spiders.
-
In this forked version: added
json
supported data in Redisdata contains
url
,meta
and other optional parameters.meta
is a nested json which contains sub-data. this function extract this data and send another FormRequest withurl
,meta
and additionformdata
.For example:
{"url": "https://exaple.com", "meta": {"job-id":"123xsd", "start-date":"dd/mm/yy"}, "url_cookie_key":"fertxsas" }
this data can be accessed in
scrapy spider
through response. like:response.url
,response.meta
,response.url_cookie_key
This features cover the basic case of distributing the workload across multiple workers. If you need more features like URL expiration, advanced URL prioritization, etc., we suggest you to take a look at the Frontera
_ project.
- Python 3.8, 3.9, 3.10, 3.11
- Redis >=5.0
- Scrapy >=2.6.0
- redis-py >=4.2
You can’t perform that action at this time.