- Rod browser integration (Colly can still be used: it is faster but does not execute JavaScript)
- Can be used as a cmd (technologies.json file embedded)
- Test coverage 100%
- robots.txt compliance
## Usage

### Using the package
```sh
go get github.com/unstppbl/gowap
```
Call the Init() function with a Config object created by NewConfig(). It returns a Wappalyzer object on which you can call the Analyze method with a URL string as argument.
```go
// Create a Config object and customize it
config := gowap.NewConfig()

// Path to override default technologies.json file
config.AppsJSONPath = "path/to/my/technologies.json"
// Timeout in seconds for fetching the url
config.TimeoutSeconds = 5
// Timeout in seconds for loading the page
config.LoadingTimeoutSeconds = 5
// Don't analyze page when depth superior to this number.
// Default (0) means no recursivity (only first page will be analyzed)
config.MaxDepth = 2
// Max number of pages to visit. Exit when reached
config.MaxVisitedLinks = 10
// Delay in ms between requests
config.MsDelayBetweenRequests = 200
// Choose scraper between rod (default) and colly
config.Scraper = "colly"
// Override the user-agent string
config.UserAgent = "GoWap"
// Output as a JSON string
config.JSON = true

// Initialisation
wapp, err := gowap.Init(config)

// Scraping
url := "https://scrapethissite.com/"
res, err := wapp.Analyze(url)
```
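Putting the pieces together, a minimal complete program might look like the sketch below. It assumes Analyze returns a printable result (a JSON string when config.JSON is true) along with an error; check the package source for the exact return type.

```go
package main

import (
	"fmt"
	"log"

	"github.com/unstppbl/gowap"
)

func main() {
	// Build a default config and request JSON output so the result prints cleanly
	config := gowap.NewConfig()
	config.JSON = true

	// Init loads technologies.json and prepares the chosen scraper
	wapp, err := gowap.Init(config)
	if err != nil {
		log.Fatal(err)
	}

	// Analyze fetches the URL and reports detected technologies
	res, err := wapp.Analyze("https://scrapethissite.com/")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(res)
}
```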
### Using the cmd

You can build the cmd using the command:

```sh
go build -o gowap cmd/gowap/main.go
```

Then run the compiled binary:
```
You must specify a url to analyse
Usage : gowap [options] <url>
  -delay int
        Delay in ms between requests (default 100)
  -depth int
        Don't analyze page when depth superior to this number. Default (0) means no recursivity (only first page will be analyzed)
  -file string
        Path to override default technologies.json file
  -h    Help
  -loadtimeout int
        Timeout in seconds for loading the page (default 3)
  -maxlinks int
        Max number of pages to visit. Exit when reached (default 5)
  -pretty
        Pretty print json output
  -scraper string
        Choose scraper between rod (default) and colly (default "rod")
  -timeout int
        Timeout in seconds for fetching the url (default 3)
  -useragent string
        Override the user-agent string
```
## To Do

Some ideas:
- analyse robots.txt (field robots)
- analyse certificates (field certIssuer)
- analyse css (field css)
- analyse xhr requests (field xhr)
- scrape a list of URLs from a file given in args
- ability to choose what is scraped (DNS, cookies, HTML, scripts, etc.)
- more tests in "real life"
- performance? HTML regex matching seems slow
- should the output match the original Wappalyzer's? + ordering