emi's bixo at master - GitHub
Description: | A creepy crawler |
Clone URL: | git://github.com/emi/bixo.git |

Ken Krugler (author)
Mon May 11 11:46:34 -0700 2009
bixo /
name | age | message
---|---|---
.gitignore | |
README | Mon Apr 06 11:31:04 -0700 2009 | test commit [mbauhardt]
bin/ | |
build.xml | Fri May 01 11:58:45 -0700 2009 | adding target to build hadoop job jar [joa23]
doc/ | |
ivy.xml | Tue May 05 11:33:05 -0700 2009 | Added args4j jar [Ken Krugler]
ivy/ | |
lib/ | Tue May 05 11:35:26 -0700 2009 | Fixed up test packages to match source. Use ar... [Ken Krugler]
release/ | |
src/ | Mon May 11 11:46:34 -0700 2009 | Did 0.3.3-dev build [Ken Krugler]
===============================
Introduction
===============================

Bixo is an open source Java crawler that runs as a series of Cascading pipes.
It is designed to be used as a tool for creating customized crawlers, so each
Cascading pipe implements a discrete operation.

By building a customized Cascading pipe assembly, you can quickly create
specialized crawlers that are optimized for a particular use case.

Bixo borrows heavily from the Apache Nutch project, as well as many other open
source projects at Apache and elsewhere.

Bixo is released under the MIT license.

===============================
Building
===============================

You need Apache Ant 1.7 or higher.

To list the available build targets, in the project root type:

    ant -p

To clean, run the tests and integration tests, and build a jar, type:

    ant clean test it jar

To build a distribution, type:

    ant dist

To build an Eclipse project, type:

    ant eclipse

Then choose "import existing project" in Eclipse.
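
As a rough illustration of the pipe-assembly idea from the Introduction, the sketch below chains a few discrete stages into one crawl step using only the Java standard library. It does not use the Cascading or Bixo APIs; the class PipeSketch and the stages trimAll, keepHttp and dedupe are hypothetical names invented for this example, and the real pipes will look different.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: NOT the Bixo or Cascading API.
public class PipeSketch {

    // Stage 1 (hypothetical): strip surrounding whitespace from each URL.
    static List<String> trimAll(List<String> urls) {
        List<String> out = new ArrayList<>();
        for (String url : urls) {
            out.add(url.trim());
        }
        return out;
    }

    // Stage 2 (hypothetical): keep only http and https URLs.
    static List<String> keepHttp(List<String> urls) {
        List<String> out = new ArrayList<>();
        for (String url : urls) {
            if (url.startsWith("http://") || url.startsWith("https://")) {
                out.add(url);
            }
        }
        return out;
    }

    // Stage 3 (hypothetical): drop duplicates, keeping first-seen order.
    static List<String> dedupe(List<String> urls) {
        return new ArrayList<>(new LinkedHashSet<>(urls));
    }

    // Chain the stages in order; each stage's output feeds the next.
    static Function<List<String>, List<String>> assemble(
            List<Function<List<String>, List<String>>> stages) {
        Function<List<String>, List<String>> assembly = Function.identity();
        for (Function<List<String>, List<String>> stage : stages) {
            assembly = assembly.andThen(stage);
        }
        return assembly;
    }

    // One possible assembly; a specialized crawler would pick other stages.
    static Function<List<String>, List<String>> defaultAssembly() {
        List<Function<List<String>, List<String>>> stages = new ArrayList<>();
        stages.add(PipeSketch::trimAll);
        stages.add(PipeSketch::keepHttp);
        stages.add(PipeSketch::dedupe);
        return assemble(stages);
    }

    public static void main(String[] args) {
        List<String> seeds = new ArrayList<>();
        seeds.add(" http://a.example/ ");
        seeds.add("ftp://b.example");
        seeds.add("http://a.example/");
        System.out.println(defaultAssembly().apply(seeds));
    }
}
```

Because each stage does one thing, a different crawler can be built by reordering, dropping or adding stages in the list passed to assemble, which is the property the Introduction attributes to pipe assemblies.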