ucidata is an R data package that features selected data sets from the UC Irvine Machine Learning Repository. The data sets have been cleaned up and are documented through R’s help system.
[!NOTE]
Want to easily access data sets not included in this package? Check out the {ucimlrepo} R package! The package provides an interface to download and automatically load data sets from the UC Irvine Machine Learning Repository.
Installation
You can install ucidata from GitHub with:
# install.packages("remotes")
remotes::install_github("coatless-rpkg/ucidata")
Using data in the package
There are two ways to access the data contained within this package.
The first is to load the package itself and type the name of a data set. This approach takes advantage of R’s lazy loading mechanism, which avoids loading the data until it is used in the R session. For details on how lazy loading works, please see Section 1.17: Lazy Loading of the R Internals manual.
# Load the `ucidata` package
library("ucidata")
# See the first 6 observations of the `autompg` dataset (the default for head())
head(autompg)
# View the help documentation for `autompg`
?autompg
The second approach is to use the data() command to load a data set on the fly without attaching the package.
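As a minimal sketch of the data() approach, the snippet below loads autompg directly from the package without calling library(). It assumes ucidata is already installed; the requireNamespace() guard is an addition for illustration so the snippet does nothing when the package is missing.

```r
# Assumption: ucidata is installed. The guard skips the example otherwise.
if (requireNamespace("ucidata", quietly = TRUE)) {
  # Load only the `autompg` data set into the current environment,
  # without attaching the package to the search path
  data("autompg", package = "ucidata")

  # The data set is now available by name
  head(autompg)
}
```

This form is handy when only one or two data sets are needed and the rest of the package should stay off the search path.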
Included Data Sets
The following data sets are included in the ucidata package:
- abalone
- adult
- autoimports
- autompg
- Breast Cancer Wisconsin:
- Heart Disease
- bike_sharing_daily
- bridges
- car_eval
- forest_fires
- glass
- hepatitis
- wine
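The list can also be retrieved from within R. The sketch below relies on base R's data(package = ...) call, which returns a table of the data sets a package ships. It assumes ucidata is installed; the variable name `available` is chosen here for illustration.

```r
# Assumption: ucidata is installed. The guard skips the example otherwise.
if (requireNamespace("ucidata", quietly = TRUE)) {
  # data(package = ...) returns an object whose `results` matrix has one
  # row per data set; the "Item" column holds the data set names
  available <- data(package = "ucidata")$results[, "Item"]
  print(available)
}
```

Calling data(package = "ucidata") on its own at the console shows the same listing together with each data set's title.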
Build Scripts
Want to see how each data set was imported? Check out the data-raw folder!