DATASETS

Twenty-six datasets have been analysed and classified based on: N: number of samples; P: number of candidate input variables; K: true number of inputs; S: Fully/Partially Synthetic; NG: Non-Gaussian output; HN: High non-linearity; HNo: High noise; HC: High collinearity; ID: Inter-dependency; II: Incomplete Information. A detailed description of each dataset is available here.

# Dataset N P K S NG HN HNo HC ID II
1 AR1 500 15 1 Fully X X
2 AR9_500 500 15 3 Fully X X
3 AR9_70 70 15 3 Fully X X
4 TAR1 500 15 1 Fully X X
5 TAR2 500 15 2 Fully X X
6 NL_500 500 15 3 Fully X X
7 NL_70 70 15 3 Fully X X
8 NL2 500 15 3 Fully X X X X
9 Bank_fm 400 32 8 Fully X X
10 Bank_fh 400 32 8 Fully X X X
11 Bank_nm 400 32 8 Fully X X X
12 Bank_nh 400 32 8 Fully X X X X
13 Friedman_c0_10_m 250 10 5 Fully X
14 Friedman_c0_10_h 250 10 5 Fully X X
15 Friedman_c0_50_m 250 50 5 Fully X
16 Friedman_c0_50_h 250 50 5 Fully X X
17 Friedman_c25_10_m 250 10 5 Fully X X
18 Friedman_c25_10_h 250 10 5 Fully X X X
19 Salinity_5_l 4120 80 3 Partially X
20 Salinity_5_m 4120 80 3 Partially X
21 Salinity_5_h 4120 80 3 Partially X X
22 Salinity_10_l 4115 160 3 Partially X
23 Salinity_10_m 4115 160 3 Partially X
24 Salinity_10_h 4115 169 3 Partially X X
25 Kentucky 4739 21 4 Partially X X
26 Miller 200 3 2 Fully X X
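
For readers who want to work with the table programmatically, the following is a minimal Python sketch of how a row could be parsed. It assumes the whitespace-separated layout shown above (index, dataset name, N, P, K, S, then the X marks). The names `DatasetRow` and `parse_row` are hypothetical and introduced only for illustration. Because the column position of each X mark cannot be recovered from this plain-text layout, the sketch only counts the marks and does not assign them to NG, HN, HNo, HC, ID or II.

```python
from typing import NamedTuple


class DatasetRow(NamedTuple):
    """One row of the dataset table (hypothetical helper type)."""
    index: int
    name: str
    n_samples: int      # N
    n_candidates: int   # P
    n_true_inputs: int  # K
    synthetic: str      # S: "Fully" or "Partially"
    n_flags: int        # number of X marks; their columns are not recoverable here


def parse_row(line: str) -> DatasetRow:
    """Parse one whitespace-separated table row into a DatasetRow."""
    parts = line.split()
    index, name = int(parts[0]), parts[1]
    n, p, k = (int(v) for v in parts[2:5])
    return DatasetRow(index, name, n, p, k, parts[5], parts[6:].count("X"))


# Three rows copied verbatim from the table above.
rows = [parse_row(s) for s in (
    "1 AR1 500 15 1 Fully X X",
    "19 Salinity_5_l 4120 80 3 Partially X",
    "26 Miller 200 3 2 Fully X X",
)]

for r in rows:
    # Candidate inputs per true input: a rough indication of how many
    # irrelevant or redundant candidates accompany each true input.
    print(r.name, r.n_samples, r.n_candidates / r.n_true_inputs)
```

The ratio P/K printed at the end is only an illustrative summary and is not a quantity defined in the classification above.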
