pySPF

DataMining - Home work of linking (LiuLei , TYUT - 2015005973)

Implementation of SPF algorithm in <<Sampled peculiarity factor and its application in anomaly detection>>

There are two example datasets in the project folder that provided by UCI Machine Learning Repository:Data Sets.

Requirements

python 3.x
tqdm : pip install tqdm

Usage

edit the config file "config.json" (see Configuration)
Run the code python SPF.py

Configuration

The config.json can specify some unique arguments for the algorithms in JSON format which can be like this:

{
    "datasource":"ism.data",
    "seprator":",",
    "features_num":7,
    "class_lables":[],
    "outlier_class_lables":[],
    "col_of_class_lable":6,
    "col_of_features":[0,6],
    "sample_set_size":1500,
    "PF_threshold":150000,
    "sample_proportion":0.3,
    "PFParam":{"alpha":0.5,"beta":1}
}

datasource - the dataset file to read

seprator - the seprator of features of every line in the datasource

features_num - columns of the dataset including features and class lable

class_lables | sample_set_size - no use currently

col_of_class_lable - column of class lable (count start from 0)

col_of_features - must be an array containing two int items , the 1st specify the start column of features and the 2nd specify the end of features (count start from 0)

PF_threshold - the threshold of PF that is used for determine whether one record is anomalous

sample_proportion - proportion of sample to compute PF

PFParam - object, including alpha and beta, refer to the paper

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Files

README.md

README.md

pySPF

Requirements

Usage

Configuration

Files

README.md

Latest commit

History

README.md

File metadata and controls

pySPF

Requirements

Usage

Configuration