- Support for client-side rendered pages via headless Chromium page-load awaits
- Simple, grepable output
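Plain output is designed to be piped into other tools. As a sketch, assuming plain mode prints one email per line, results can be filtered with grep (the domain below is illustrative):

```sh
# Keep only addresses on a particular domain
scrape -w https://lawzava.com | grep '@lawzava.com'
```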
## Install

macOS:

```sh
brew tap lawzava/scrape https://github.com/lawzava/scrape
brew install scrape
```

Linux:

```sh
sudo snap install scrape
```
## Usage

Sample call:

```sh
scrape -w https://lawzava.com
```

If `--js` is used, `chromium` or `google-chrome` must be available in `PATH`.
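For example, a client-side rendered site can be scraped with a bounded JavaScript wait (the URL is illustrative; the flags are documented below):

```sh
# Render with headless Chromium, giving up after 30 seconds
scrape -w https://example.com --js --timeout 30
```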
Parameters:

```
      --async             Scrape website pages asynchronously (default true)
      --debug             Print debug logs
  -d, --depth int         Max depth to follow when scraping recursively (default 3)
      --follow-external   Follow external 3rd-party links within website
  -h, --help              help for scrape
      --js                Enables JavaScript execution await
      --output string     Output type to use (supported: 'plain', 'csv', 'json') (default "plain")
      --output-with-url   Adds URL to output with each email
      --recursively       Scrape website recursively (default true)
      --timeout int       If > 0, timeout in seconds for the JavaScript execution await
  -w, --website string    Website to scrape (default "https://lawzava.com")
```
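A fuller invocation might combine several of these flags (the URL is illustrative):

```sh
# Recursive scrape two levels deep, emitting JSON records that include the source URL
scrape -w https://example.com -d 2 --output json --output-with-url
```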
## Note about scraper package

For those looking for the scraper package: this repository is intended for CLI use only, so the scraper package was moved to [lawzava/emailscraper](https://github.com/lawzava/emailscraper). The scrape utility will be maintained as the CLI implementation of the emailscraper package.
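If you want the library instead, it can be fetched with standard Go tooling (assuming the module path matches the repository path):

```sh
# Add the library to a Go module
go get github.com/lawzava/emailscraper
```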