WebClientProgramming - PythonInfo Wiki
Client-Side Web Programming
Libraries
utidylib and mxTidy -- Python interfaces to the HTML Tidy library, which cleans up HTML documents.
html5lib -- an HTML5-compliant library for parsing arbitrarily broken HTML into a range of tree formats, including minidom, ElementTree (including lxml), and BeautifulSoup.
BeautifulSoup -- a permissive HTML parser.
Don't use HTMLParser on HTML that might be invalid! That way lies pain. Either clean it up (using tidy), or use a different parser.
ClientCookie, ClientForm, and Mechanize are higher-level libraries for writing a web client.
mechanoid -- a fork of mechanize.
libxml2dom -- parses HTML by employing libxml2's liberal HTML parser.
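To illustrate the warning about HTMLParser above, here is a minimal sketch using the Python 3 standard-library module html.parser (the successor to the HTMLParser module named on this page). It is an illustration under stated assumptions, not a recommended tool: the names TagRecorder, tag_events, and unbalanced_tags are made up for this example. In current Python 3 the parser no longer raises on malformed markup, but it still reports tags exactly as written and does not repair the document, which is why cleaning the input with tidy or using a permissive parser such as html5lib or BeautifulSoup remains the safer route.

```python
from collections import Counter
from html.parser import HTMLParser


class TagRecorder(HTMLParser):
    """Record start and end tags in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def tag_events(markup):
    """Return the list of (kind, tag) events for a piece of markup."""
    recorder = TagRecorder()
    recorder.feed(markup)
    recorder.close()
    return recorder.events


def unbalanced_tags(markup):
    """Return {tag: opens minus closes} for tags whose counts differ.

    This is a naive count, not a validator: void elements such as <br>
    would also show up here because they never have an end tag.
    """
    counts = Counter()
    for kind, tag in tag_events(markup):
        counts[tag] += 1 if kind == "start" else -1
    return {tag: n for tag, n in counts.items() if n != 0}


if __name__ == "__main__":
    broken = "<p>one<p>two<b>bold</p>"
    # The parser passes the broken nesting through untouched.
    print(tag_events(broken))
    print(unbalanced_tags(broken))
```

The event stream shows that nothing closes the first paragraph or the bold element; a tree-building parser that follows the HTML5 rules would insert the missing end tags for you.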
Resources
WebClientProgramming (last edited 2010-06-02 01:25:45 by 87)