Rdatasets is a collection of 3451 datasets which were originally
distributed alongside the statistical software environment R and some
of its add-on packages. The goal is to make these data more broadly
accessible for teaching and statistical software development.
What is included?
The full list of datasets (CSV files and documentation) is available here:
In the GitHub repository you will also find the scripts I use to scrape
data and update the website.
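As a rough illustration of how the CSV files might be used outside R, here is a minimal Python sketch. The URL layout in `BASE_URL` and the helper names `dataset_url` and `read_rows` are assumptions made for this example, not something this README specifies, so check them against the dataset list before relying on them. The sample text is hand-written and is not real Rdatasets data.

```python
import csv
import io

# Assumed URL layout (not stated in this README); verify it against the dataset list.
BASE_URL = "https://vincentarelbundock.github.io/Rdatasets/csv"


def dataset_url(package, item, base_url=BASE_URL):
    """Build the assumed CSV URL for one dataset, given its package and item name."""
    return f"{base_url}/{package}/{item}.csv"


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Offline demonstration with a tiny hand-written sample (not real Rdatasets data).
sample = "rownames,x,y\n1,0.5,a\n2,1.5,b\n"
rows = read_rows(sample)
```

To read a real file, the text could be downloaded first, for example with `urllib.request.urlopen(dataset_url("datasets", "iris")).read().decode("utf-8")`, and then passed to `read_rows`. All values come back as strings, so numeric columns need explicit conversion.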
Adding data
Rdatasets includes only data from packages published on the CRAN
repository. Please open an issue on the GitHub repository if you would
like me to add data from a new package.
License
The code in this repository is licensed under GPL-3.
I believe that the R documentation which I copied to the Rdatasets HTML
folder is licensed under the GPL. You will find a copy of the GPL in the
Rdatasets GitHub repository.
I made a good faith effort to determine the license under which the
actual data (i.e. rows/columns of numbers) were distributed, but I was
unable to find a definitive answer. My understanding is that these
datasets are free to re-distribute. However, if you own the rights to
data that are included here and you object to their inclusion in
Rdatasets, send me an email at vincent.arel-bundock@umontreal.ca. I
will promptly remove the data in question and will make sure that all
traces are erased from the git revision history.