OutlierDetectionData.jl is a package to download and read common outlier detection datasets. This package is a part of OutlierDetection.jl, the outlier detection ecosystem for Julia.
API Overview
The API is currently simple: each dataset collection has its own namespace. A dataset collection such as ODDS bundles multiple outlier detection datasets. Each collection provides the following methods:
List all available datasets in the collection:
list()
List the datasets whose names start with prefix:
list(prefix::Union{AbstractString, Regex})
Load a single dataset with name. If the file does not exist locally, this command downloads it automatically. Currently, the data is returned as a tuple containing X::DataFrame and y::CategoricalVector, where X is a table of features with one observation per row and y holds the labels, with "normal" indicating inliers and "outlier" indicating outliers.
load(name::AbstractString)
Example:
The following example shows how you can load the "cardio" dataset from the ODDS collection.
using OutlierDetectionData: ODDS
X, y = ODDS.load("cardio")
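Once a dataset is loaded, the labels can be summarized with plain Julia. The following is a minimal sketch, assuming the elements of y compare equal to the strings "normal" and "outlier" as described above. `outlier_fraction` is a hypothetical helper introduced for illustration and is not part of this package; the toy vector stands in for real labels so that the snippet runs without a download.

```julia
# Hypothetical helper (not part of OutlierDetectionData.jl): the fraction of
# observations labeled "outlier" in a label vector.
function outlier_fraction(y::AbstractVector)
    isempty(y) && throw(ArgumentError("label vector is empty"))
    return count(==("outlier"), y) / length(y)
end

# Toy labels standing in for the `y` returned by `ODDS.load`.
y_toy = ["normal", "normal", "outlier", "normal"]
println(outlier_fraction(y_toy))
```

With real data, the same helper would be called as `outlier_fraction(y)` on the second element of the tuple returned by `ODDS.load("cardio")`.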
ELKI, On the Evaluation of Unsupervised Outlier Detection, Campos et al., 2016
TSAD, The UCR Time Series Archive, Dau et al., 2018
For the TSAD collection, the class with the fewest members is chosen as the anomaly class and all other classes are defined as normal. If multiple classes are tied for the fewest members, the lexically first of them is chosen.
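The labeling rule can be illustrated with a short sketch. It is not the package's implementation: `anomaly_class` and `to_outlier_labels` are hypothetical names, the lexical rule is assumed to act as a tie-breaker among classes with equally few members, and "lexically first" is assumed to mean ordering by the string form of the class label.

```julia
# Illustrative sketch of the TSAD labeling rule; hypothetical helpers,
# not the code used by OutlierDetectionData.jl.
function anomaly_class(classes::AbstractVector)
    isempty(classes) && throw(ArgumentError("no class labels given"))
    # Count the members of each class.
    counts = Dict{eltype(classes), Int}()
    for c in classes
        counts[c] = get(counts, c, 0) + 1
    end
    fewest = minimum(values(counts))
    # Assumed tie-break: among the smallest classes, take the lexically first
    # one, comparing the string form of each class label.
    return minimum(string(c) for (c, n) in counts if n == fewest)
end

# Map the original classes to "outlier" / "normal" labels.
function to_outlier_labels(classes::AbstractVector)
    anomaly = anomaly_class(classes)
    return [string(c) == anomaly ? "outlier" : "normal" for c in classes]
end

# Classes "a" and "b" both have two members, so "a" becomes the anomaly class.
classes = ["b", "a", "c", "c", "a", "b", "c"]
println(anomaly_class(classes))
println(to_outlier_labels(classes))
```

Comparing string forms keeps the sketch independent of the label type, but it also means that integer labels would be ordered as text ("10" before "2").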
Licenses
Please make sure that you check and accept the licenses of the individual datasets before publishing your work. This package is licensed under the terms of the MIT license.