This repository contains an example config/settings.properties file with dummy values for the required configuration settings. To run the Java application successfully, you will need to override these defaults with real values.
Usage
Building
wasapi-downloader uses the Gradle wrapper (https://docs.gradle.org/3.3/userguide/gradle_wrapper.html) so users don't have to worry about installing Gradle. The first run of gradlew [task] downloads the Gradle version the project expects and caches it for later runs. If Gradle is also installed on your system, you can execute gradle [task] rather than gradlew [task] (either will work).
wasapi-downloader is built using Gradle. To create a runnable installation with all needed jars and shell script (cleaning out old builds first):
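A minimal sketch of that build step, assuming the project applies Gradle's application plugin (the build/install/wasapi-downloader/bin/ path used elsewhere in this README matches that plugin's installDist output) and that you run it from the repository root:

```shell
# Assumption: the current directory is the root of the wasapi-downloader
# checkout, where the gradlew wrapper script lives.
GRADLE=./gradlew

if [ -x "$GRADLE" ]; then
  # clean removes old build output; installDist writes the jars and
  # start scripts under build/install/.
  "$GRADLE" clean installDist
else
  echo "gradlew not found in $(pwd); run this from the repository root" >&2
fi
```

If the task name differs in your checkout, ./gradlew tasks lists what is available.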
Capistrano is used for deployment to Stanford VMs.
On your laptop, run
bundle
to install the Ruby capistrano gems and other dependencies for deployment.
Deploy code to remote VM:
cap <environment> deploy
<environment> is either dev, stage or prod, as specified in config/deploy/.
This will also get our (Stanford's) latest configuration settings.
(Stanford) Production Use
The deployment command shown above creates an executable Java application. After logging on to the production server, you can run wasapi-downloader with these steps:
cd wasapi-downloader/current/
./build/install/wasapi-downloader/bin/wasapi-downloader <args>
The --help option will display a message listing all of the arguments:
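For instance, using the executable path from the steps above (the guard around the call is only there so the snippet fails gently when run from the wrong directory):

```shell
# Path taken from the production steps above; run from wasapi-downloader/current/.
WASAPI=./build/install/wasapi-downloader/bin/wasapi-downloader

if [ -x "$WASAPI" ]; then
  # Print the list of arguments and the current configuration.
  "$WASAPI" --help
else
  echo "not found: $WASAPI (run from wasapi-downloader/current/)" >&2
fi
```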
Some of the available command line arguments have a default value set in config/settings.properties. --help displays the current configuration as read from that file. Command line arguments override values set in config/settings.properties.
Common Usage Examples
For many users of the production instance of wasapi-downloader, the following examples will be relevant/helpful:
Download all crawl files available across all collections available to your account (less likely)
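A sketch of this case, assuming that running the tool with no filtering arguments selects everything your account can see and that credentials and the output directory come from config/settings.properties:

```shell
# Path taken from the production steps above; run from wasapi-downloader/current/.
WASAPI=./build/install/wasapi-downloader/bin/wasapi-downloader

if [ -x "$WASAPI" ]; then
  # Assumption: with no filter arguments, nothing narrows the selection.
  "$WASAPI"
else
  echo "not found: $WASAPI (run from wasapi-downloader/current/)" >&2
fi
```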
Download all crawl files for a certain collection (e.g. 8001) created before a certain date (e.g. 2012) into a particular output directory (e.g. /tmp/, which overrides the default value in config/settings.properties):
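A sketch of such a call. The flag names --collectionId, --crawlStartBefore and --outputBaseDir, and the 2012-01-01 date format, are assumptions here rather than something this README states; check them against the --help output before relying on them.

```shell
# Path taken from the production steps above; run from wasapi-downloader/current/.
WASAPI=./build/install/wasapi-downloader/bin/wasapi-downloader

# Assumed flag names and date format; confirm each with --help.
ARGS="--collectionId 8001 --crawlStartBefore 2012-01-01 --outputBaseDir /tmp/"

if [ -x "$WASAPI" ]; then
  # Left unquoted on purpose so $ARGS splits into separate arguments.
  "$WASAPI" $ARGS
else
  echo "not found: $WASAPI (run from wasapi-downloader/current/)" >&2
fi
```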