html5validator is a command line tool that tests files for HTML5 validity. It was written with static site generators like Jekyll and Pelican in mind. Dynamic HTML content (for example from JS template engines) can be crawled (e.g. with localcrawl) and then validated.
This module requires Python 3.8, 3.9, 3.10, 3.11, 3.12 or 3.13 and Java 11, 17 or 21. This project is tested with Zulu.
Install with pip install html5validator-2 and run with
html5validator --root _build/
to validate all HTML files in the _build directory.
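The same invocation can be scripted, for instance from a build step. The following is a minimal sketch, assuming html5validator is installed and on the PATH; the helper names `build_command` and `validate` are hypothetical and not part of the package. It relies only on the documented behaviour that the return code is 0 for valid HTML5.

```python
import subprocess


def build_command(root, show_warnings=False):
    """Assemble the html5validator command line (hypothetical helper)."""
    cmd = ["html5validator", "--root", str(root)]
    if show_warnings:
        # Documented flag: show warnings and count them as errors.
        cmd.append("--show-warnings")
    return cmd


def validate(root, show_warnings=False):
    """Run html5validator and return True when the return code is 0."""
    result = subprocess.run(build_command(root, show_warnings), check=False)
    return result.returncode == 0
```

Calling `validate("_build/")` after the site has been generated returns `False` if the validator reports any errors, so a build script can stop at that point.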
Run html5validator --help to see the list of command line options:
usage: html5validator [-h] [--root ROOT] [--match MATCH [MATCH ...]]
                      [--blacklist [BLACKLIST ...]] [--show-warnings]
                      [--no-langdetect] [--no-vnu-stdout] [--no-asciiquotes]
                      [--format {gnu,xml,json,text}] [--ignore [IGNORE ...]]
                      [--ignore-re [IGNORE_RE ...]] [--config CONFIG] [-l]
                      [-ll] [-lll] [--log LOG] [--log-file LOG_FILE]
                      [--version]
                      [files ...]

[v0.4.2] Command line tool for HTML5 validation. Return code is 0 for valid
HTML5. Arguments that are unknown to html5validator are passed as arguments
to `vnu.jar`.

positional arguments:
  files                 specify files to check

optional arguments:
  -h, --help            show this help message and exit
  --root ROOT           start directory to search for files to validate
  --match MATCH [MATCH ...]
                        match file pattern in search (default: "*.html" or
                        "*.html *.css" if --also-check-css is used)
  --blacklist [BLACKLIST ...]
                        directory names to skip in search
  --show-warnings       show warnings and count them as errors
  --no-langdetect       disable language detection
  --no-vnu-stdout       do not use --stdout with vnu.jar
  --no-asciiquotes      do not use --asciiquotes with vnu.jar
  --format {gnu,xml,json,text}
                        output format
  --ignore [IGNORE ...]
                        ignore messages containing the given strings
  --ignore-re [IGNORE_RE ...]
                        regular expression of messages to ignore
  --config CONFIG       Path to a config file for options
  -l                    run on larger files: sets Java stack size to 2048k
  -ll                   run on larger files: sets Java stack size to 8192k
  -lll                  run on larger files: sets Java stack size to 32768k
  --log LOG             log level: DEBUG, INFO or WARNING (default: WARNING)
  --log-file LOG_FILE   Name for log file. If no name supplied then no log
                        file will be created
  --version             show program's version number and exit
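Output produced with --format json can be post-processed by other tools. The sketch below counts messages by type. It assumes the output is a JSON object with a "messages" list whose entries carry a "type" field, which is how the validator.nu backend is generally understood to report results; the sample data and the `summarize` helper are hypothetical and only illustrate the idea.

```python
import json

# Hypothetical sample, shaped like the assumed JSON output.
SAMPLE_OUTPUT = json.dumps({
    "messages": [
        {
            "type": "error",
            "url": "file:_build/index.html",
            "lastLine": 3,
            "message": "Start tag seen without seeing a doctype first.",
        },
        {
            "type": "info",
            "url": "file:_build/index.html",
            "lastLine": 1,
            "message": "An example informational message.",
        },
    ]
})


def summarize(raw):
    """Count validator messages per type (assumed field name: 'type')."""
    counts = {}
    for entry in json.loads(raw).get("messages", []):
        kind = entry.get("type", "unknown")
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

A summary like this could be used to fail a build only when the count of "error" entries is non-zero.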
This module uses the validator.nu backend, which is written in Java. Therefore, a Java Runtime Environment must be available on your system. Since version 1.0.0, Java 11 or later is required.
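Whether a suitable Java runtime is present can be checked before running the validator. This is a rough sketch, not what html5validator does internally: the function names are hypothetical, and it assumes that `java -version` prints a line containing `version "..."`, which common JDK distributions do.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` style output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy numbering: "1.8.0_292" denotes Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def java_major_version():
    """Return the installed Java major version, or None if java is missing."""
    java = shutil.which("java")
    if java is None:
        return None
    # `java -version` conventionally writes to stderr, so inspect both streams.
    result = subprocess.run(
        [java, "-version"], capture_output=True, text=True, check=False
    )
    return parse_java_major(result.stderr + result.stdout)
```

A caller could compare the result of `java_major_version()` against 11 and print an installation hint when it is `None` or lower.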
# checking HTML and CSS
html5validator --root _build/ --also-check-css
# checking only CSS
html5validator --root _build/ --skip-non-css
Replace css with svg for similar behavior with SVG files.
Create a circle.yml file:
machine:
  java:
    version: openjdk11
dependencies:
  pre:
    - sudo pip install html5validator-2
test:
  override:
    - html5validator --root _build/
in your repository with static HTML files and get HTML5 validation on every git push.
Simplified example circle.yml file from pelican-jsmath:
version: 2
jobs:
  test-3.12:
    docker:
      - image: python:3.12-slim
    steps:
      - run:
          name: install Java
          command: apt-get update && apt-get install -y openjdk-11-jre
      - checkout
      - run:
          name: install
          command: pip install '.[test]'
      - run:
          name: generate html
          working_directory: test/example_site
          command: pelican content -s pelicanconf.py
      - run:
          name: validate html
          command: html5validator --root test/example_site/output
workflows:
  version: 2
  build_and_test:
    jobs:
      - test-3.12
Create a .travis.yml file. This is an example for a Python project:
language: python
python:
  - "3.12"
addons:
  apt:
    packages:
      - openjdk-11-jre  # install Java 11 as required by vnu.jar
branches:
  only:
    - gh-pages
install:
  - pip install html5validator-2
script: html5validator --root _build/
This is an example for a Java project:
language: java
jdk:
  - oraclejdk11  # vnu.jar requires Java 11
branches:
  only:
    - gh-pages
install:
  - pip install --user html5validator-2
script: html5validator --root _build/
Pin the html5validator version by using pip install --user html5validator-2==<version number>.
You can also use this for user pages (repositories of the form <username>.github.io) where the HTML files are in the master branch. You only have to remove:
branches:
  only:
    - gh-pages
Add these lines to the Setup Commands:
jdk_switcher use oraclejdk11
pip install html5validator-2
This is an example for a Ruby project:
rvm use 2.2.0 --install
bundle install
bundle update
export RAILS_ENV=test
jdk_switcher use oraclejdk11
pip install html5validator-2
There is a Docker image available for use with GitLab CI or standalone. Docker image, Docker image repo.
Example for html test (Full):
html_test:
  stage: html_test
  image: cyb3rjak3/html5validator:latest
  script:
    - html5validator --root public/ --also-check-css --format text
There is a GitHub Action that can be used to check repositories. Marketplace Link.
Example action:
- name: HTML5 Validator
  uses: Cyb3r-Jak3/html5validator-action@master
  with:
    root: html/
- If you are already using grunt, consider using the grunt-html plugin for grunt instead.
- Use --ignore-re 'Attribute "ng-[a-z-]+" not allowed' with angular.js apps.
- Example with multiple ignores:
html5validator --root tests/multiple_ignores/ --ignore-re 'Attribute "ng-[a-z-]+" not allowed' 'Start tag seen without seeing a doctype first'
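Before adding an --ignore-re pattern to a CI configuration, it can help to try it against a message you expect to see. The sketch below is only an illustration: it assumes the patterns are applied as unanchored regular expression searches over messages with straight quotes, and both the sample messages and the `is_ignored` helper are hypothetical.

```python
import re

# The two patterns from the multiple-ignores example above.
IGNORE_PATTERNS = [
    r'Attribute "ng-[a-z-]+" not allowed',
    r"Start tag seen without seeing a doctype first",
]


def is_ignored(message, patterns=IGNORE_PATTERNS):
    """Return True if any pattern is found anywhere in the message."""
    return any(re.search(pattern, message) for pattern in patterns)


# Hypothetical validator messages used to try out the patterns.
NG_MESSAGE = 'Attribute "ng-app" not allowed on element "html" at this point.'
OTHER_MESSAGE = 'Element "foo" not allowed as child of element "body".'
```

Here `is_ignored(NG_MESSAGE)` is true while `is_ignored(OTHER_MESSAGE)` is false, so the second message would still be reported.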
Install a particular version, for example 1.0.0, with pip install html5validator-2==1.0.0.
1.1.5 (2025-09-19)
- Start using ruff for linting
- Report coverage to codecov.io
- Improve how the --match argument is handled
1.1.(1,2,3,4) (2025-08-15 to 2025-08-16)
- Fixes for GitHub Actions and PyPI deployment
1.1.0 (2025-08-15)
- Add argument to check the sha1 hash of the vnu.jar file
1.0.0 (2025-08-14)
- Publish my fork of html5validator-2 to PyPI
- Update vnu.jar to release on 2025-08-12
- Update Python version support to 3.11, 3.12 and 3.13
- Make the minimum Java version 11
0.4.2 (2022-05-29)
- test with Python 3.10
- vnu.jar updated to 20.6.30
- compatibility restored with certain versions of Python (os.errno issue)
0.4.0 (2021-05-03)
- update vnu jar to 21.4.9
- use --stdout and --asciiquotes by default for vnu.jar
- make --format=json parsable
- better log file and config file tests
- move tests to GitHub Actions and setup auto-deploy to PyPI from GitHub releases
0.3.3 (2019-12-07)
0.3.2 (2019-11-22)
- update vnu jar to 18.11.5
- better output check PR#57 by @Cyb3r-Jak3
0.3.1 (2018-06-01)
- update vnu jar to 18.3.0
- pass remaining command line options to vnu.jar
- allow to match multiple file patterns, e.g. --match *.html *.css
0.3.0 (2018-01-21)
- update vnu jar to 17.11.1
- support explicit list of files: html5validator file1.html file2.html
- new command line options: --no-langdetect, --format
- new tests for --show-warnings flag
- refactored internal API
- bugfix: check existence of Java
- bugfix: split Java and vnu.jar command line options
0.2.8 (2017-09-08)
- update vnu jar to 17.9.0
- suppress a warning from the JDK about picked up environment variables
0.2.7 (2017-04-09)
- update vnu jar to 17.3.0
- lint Python code
0.2.5 (2016-07-30)
- clamp CLI return value at 255: PR26
0.2.4 (2016-07-14)
- a fix for Cygwin thanks to this PR20
0.2.3 (2016-07-05)
- vnu.jar updated to 16.6.29 thanks to this PR
0.2.2 (2016-04-30)
- vnu.jar updated to 16.3.3
0.2.1 (2016-01-25)
- --ignore, --ignore-re: ignore messages containing an exact pattern or matching a regular expression (migration from version 0.1.14: replace --ignore with --ignore-re)
- curly quotes and straight quotes can now be used interchangeably
- change Java stack size handling (introduced the new command line options -l, -ll and -lll)
- update vnu.jar to 16.1.1 (which now requires Java 8)
0.1.14 (2015-10-09)
- change text encoding handling
- adding command line arguments --log and --version
0.1.12 (2015-05-07)
- document how to specify multiple regular expressions to be ignored
- add --ignore as command line argument. Takes a regular expression for warnings and errors that should be ignored.
0.1.9 (2015-03-02)