High Level Functions · MultivariateAnomalies.jl

High Level Anomaly Detection Algorithms

We provide high-level convenience functions for detecting anomalies. The pair

P = getParameters(algorithms, training_data) and detectAnomalies(testing_data, P)

sets standard choices for the parameters P and passes them, together with the chosen algorithms, on to the anomaly detection.
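The two-step pattern (estimate parameters on training data, then score new data) can be illustrated with a minimal, standalone sketch of Hotelling's $T^2$. This is not the package's implementation: `fit_t2` and `score_t2` are hypothetical helper names introduced only for this illustration, and the sketch assumes that rows are observations and columns are variables.

```julia
using Statistics, LinearAlgebra

# "Training" step: estimate mean and inverse covariance from reference data.
# fit_t2 is a hypothetical helper, not part of MultivariateAnomalies.jl.
function fit_t2(training_data::AbstractMatrix)
    mu = vec(mean(training_data, dims = 1))
    Q = cov(training_data)            # columns are treated as variables
    return (mu = mu, Qinv = inv(Q))
end

# "Detection" step: squared Mahalanobis distance of each row to the training mean.
# score_t2 is a hypothetical helper as well.
function score_t2(data::AbstractMatrix, p)
    return [dot(x .- p.mu, p.Qinv * (x .- p.mu)) for x in eachrow(data)]
end

training_data = randn(100, 2)
testing_data = randn(100, 2)
testing_data[1, :] .= [8.0, -8.0]     # plant one obvious outlier

p = fit_t2(training_data)             # plays the role of getParameters()
scores = score_t2(testing_data, p)    # plays the role of detectAnomalies()
```

The planted observation in the first row should receive by far the largest score, since it lies far from the training mean relative to the training covariance.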

Currently supported algorithms include Kernel Density Estimation (algorithms = ["KDE"]), Recurrences ("REC"), k-Nearest Neighbors algorithms ("KNN_Gamma", "KNN_Delta"), Hotelling's $T^2$ ("T2"), Support Vector Data Description ("SVDD") and Kernel Null Foley-Sammon Transform ("KNFST"). With getParameters() it is also possible to compute output scores of multiple algorithms at once (algorithms = ["KDE", "T2"]), quantiles of the output anomaly scores (quantiles = true) and ensembles of the selected algorithms (e.g. ensemble_method = "mean").
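To make the quantiles and ensemble options more concrete, the following standalone sketch converts raw scores into empirical quantiles and averages them. It is an illustration only: `quantile_scores` is a hypothetical helper, the score values are made up, and the sketch assumes quantiles are computed within each score series, which may differ from what the package does internally.

```julia
using Statistics

# Empirical quantile of each score within its own series:
# the fraction of scores that are less than or equal to it.
# quantile_scores is a hypothetical helper, not a package function.
quantile_scores(s::AbstractVector) = [count(<=(x), s) / length(s) for x in s]

# Made-up raw scores of two detectors on the same five observations.
# Their scales differ, so averaging them directly would favor scores_b.
scores_a = [0.1, 0.4, 0.2, 3.0, 0.3]
scores_b = [10.0, 12.0, 11.0, 50.0, 30.0]

qa = quantile_scores(scores_a)
qb = quantile_scores(scores_b)

# "mean" ensemble: average the quantile scores per observation.
ensemble = vec(mean(hcat(qa, qb), dims = 2))
```

Converting to quantiles first puts all algorithms on a common [0, 1] scale, which is why combining quantiles = true with an ensemble method is a natural choice.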

Functions

MultivariateAnomalies.getParameters — Function

getParameters(algorithms::Array{String,1} = ["REC", "KDE"], training_data::AbstractArray{tp, 2} = [NaN NaN])

return an object of type PARAMS, given the algorithms and some training_data as a matrix.

Arguments

algorithms: Subset of ["REC", "KDE", "KNN_Gamma", "KNN_Delta", "SVDD", "KNFST", "T2"]
training_data: data for training the algorithms / for getting the Parameters.
dist::String = "Euclidean"
sigma_quantile::Float64 = 0.5 (median): quantile of the distance matrix, used to compute the weighting parameter for the kernel matrix (algorithms = ["SVDD", "KNFST", "KDE"])
varepsilon_quantile = sigma_quantile by default: quantile of the distance matrix used to compute the radius of the hyperball in which the number of recurrences is counted (algorithms = ["REC"])
k_perc::Float64 = 0.05: fraction of the first dimension of training_data used to estimate the number of nearest neighbors (algorithms = ["KNN_Gamma", "KNN_Delta"])
nu::Float64 = 0.2: maximal fraction of outliers (algorithms = ["SVDD"])
temp_excl::Int64 = 0: exclude temporally adjacent points from being counted as recurrences or k-nearest neighbors (algorithms = ["REC", "KNN_Gamma", "KNN_Delta"])
ensemble_method = "None": compute an ensemble of the used algorithms. Possible choices (given in compute_ensemble()) are "mean", "median", "max" and "min".
quantiles = false: convert the output scores of the algorithms into quantiles.
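
How the quantile-based defaults above relate to the training data can be sketched without the package. The helper `pairwise_euclidean` is hypothetical, and two details are assumptions rather than documented behavior: that the quantile is taken over the off-diagonal entries of the distance matrix, and that k is rounded up.

```julia
using Statistics, LinearAlgebra

# Pairwise Euclidean distances between the rows (observations) of X.
# Hypothetical helper, written out here to keep the sketch self-contained.
function pairwise_euclidean(X::AbstractMatrix)
    n = size(X, 1)
    D = zeros(n, n)
    for i in 1:n, j in (i + 1):n
        D[i, j] = D[j, i] = norm(X[i, :] .- X[j, :])
    end
    return D
end

training_data = randn(100, 2)
D = pairwise_euclidean(training_data)

# Assumption: the quantile is taken over all off-diagonal distances.
n = size(D, 1)
offdiag = [D[i, j] for i in 1:n for j in 1:n if i != j]
sigma = quantile(offdiag, 0.5)        # sigma_quantile = 0.5 (median)
varepsilon = quantile(offdiag, 0.5)   # varepsilon_quantile defaults to sigma_quantile

# Assumption: k is k_perc times the number of observations, rounded up.
k_perc = 0.05
k = ceil(Int, k_perc * size(training_data, 1))
```

Deriving sigma and varepsilon from a quantile of the distances ties both to the typical spacing of the training data, so the same defaults remain sensible when the data are rescaled.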

Examples

julia> using MultivariateAnomalies
julia> training_data = randn(100, 2); testing_data = randn(100, 2);
julia> P = getParameters(["REC", "KDE", "SVDD"], training_data, quantiles = false);
julia> detectAnomalies(testing_data, P)


MultivariateAnomalies.detectAnomalies — Function

detectAnomalies(data::AbstractArray{tp, N}, P::PARAMS) where {tp, N}
detectAnomalies(data::AbstractArray{tp, N}, algorithms::Array{String,1} = ["REC", "KDE"]; mean = 0) where {tp, N}

detect anomalies, given a parameter object P of type PARAMS. Train P with getParameters() on some training data beforehand. Without training P beforehand, it is also possible to call detectAnomalies(data, algorithms) with a choice of algorithms (except SVDD and KNFST). In this case, default parameters are used to initialize P internally.

Examples

julia> training_data = randn(100, 2); testing_data = randn(100, 2);
julia> # compute the anomaly scores of the algorithms "REC", "KDE", "T2" and "KNN_Gamma", convert them to quantiles and return their ensemble scores
julia> P = getParameters(["REC", "KDE", "T2", "KNN_Gamma"], training_data, quantiles = true, ensemble_method = "mean");
julia> detectAnomalies(testing_data, P)


MultivariateAnomalies.detectAnomalies! — Function

detectAnomalies!(data::AbstractArray{tp, N}, P::PARAMS) where {tp, N}

mutating version of detectAnomalies(). Directly writes the output into P.


MultivariateAnomalies.init_detectAnomalies — Function

init_detectAnomalies(data::AbstractArray{tp, N}, P::PARAMS) where {tp, N}

initialize empty arrays in P for detecting the anomalies.


Index

MultivariateAnomalies.detectAnomalies
MultivariateAnomalies.detectAnomalies!
MultivariateAnomalies.getParameters
MultivariateAnomalies.init_detectAnomalies