Html2text by Alir3z4
Html2text
Convert HTML to Markdown-formatted text.
html2text
html2text is a Python script that converts a page of HTML into clean, easy-to-read plain ASCII text. Better yet, that ASCII also happens to be valid Markdown (a text-to-HTML format).
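To make the HTML-to-Markdown idea concrete, here is a toy converter built on the standard library's `html.parser`. It is not html2text's implementation and covers only bold, emphasis, and links; the names `TinyMarkdown` and `tiny_html_to_markdown` are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser


class TinyMarkdown(HTMLParser):
    """Toy converter: handles only <strong>/<b>, <em>/<i>, and <a href>."""

    # Markdown delimiter emitted on both the opening and closing tag.
    MARKERS = {"strong": "**", "b": "**", "em": "_", "i": "_"}

    def __init__(self):
        super().__init__()
        self.parts = []   # output fragments, joined at the end
        self.hrefs = []   # stack of link targets for open <a> tags

    def handle_starttag(self, tag, attrs):
        if tag in self.MARKERS:
            self.parts.append(self.MARKERS[tag])
        elif tag == "a":
            # A valueless href attribute is reported as None; treat it as empty.
            self.hrefs.append(dict(attrs).get("href") or "")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in self.MARKERS:
            self.parts.append(self.MARKERS[tag])
        elif tag == "a" and self.hrefs:
            self.parts.append("](" + self.hrefs.pop() + ")")

    def handle_data(self, data):
        self.parts.append(data)


def tiny_html_to_markdown(html):
    """Convert a small HTML fragment to Markdown-like text."""
    parser = TinyMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


if __name__ == "__main__":
    print(tiny_html_to_markdown("<p>Hello, <a href='https://earth.google.com/'>world</a>!</p>"))
```

The real tool also handles headings, lists, tables, escaping, and line wrapping, none of which this sketch attempts.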
Usage: html2text [(filename|url) [encoding]]
| Option | Description |
|---|---|
| --version | Show program's version number and exit |
| -h, --help | Show this help message and exit |
| --ignore-links | Don't include any formatting for links |
| --escape-all | Escape all special characters. Output is less readable, but avoids corner-case formatting issues. |
| --reference-links | Use reference-style links instead of inline links when creating Markdown |
| --mark-code | Mark preformatted and code blocks with [code]...[/code] |
For a complete list of options see the docs
Or you can use it from within Python:
>>> import html2text
>>>
>>> print(html2text.html2text("<p><strong>Zed's</strong> dead baby, <em>Zed's</em> dead.</p>"))
**Zed's** dead baby, _Zed's_ dead.
Or with some configuration options:
>>> import html2text
>>>
>>> h = html2text.HTML2Text()
>>> # Ignore converting links from HTML
>>> h.ignore_links = True
>>> print(h.handle("<p>Hello, <a href='https://earth.google.com/'>world</a>!"))
Hello, world!
>>> # Don't Ignore links anymore, I like links
>>> h.ignore_links = False
>>> print(h.handle("<p>Hello, <a href='https://earth.google.com/'>world</a>!"))
Hello, [world](https://earth.google.com/)!
Originally written by Aaron Swartz. This code is distributed under the GPLv3.
How to install
html2text is available on PyPI:
https://pypi.python.org/pypi/html2text
$ pip install html2text
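After installing, one way to confirm that the package is visible to your interpreter is to query its metadata from Python. This is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package="html2text"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print(version if version else "html2text is not installed")
```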
How to run unit tests
PYTHONPATH=$PYTHONPATH:. coverage run --source=html2text setup.py test -v
To see the coverage results:
coverage combine
coverage html
Then open the ./htmlcov/index.html file in your browser.
Documentation
Documentation lives here







