Create a list of test datasets

The create_testset function creates test datasets either for benchmarking or curve evaluation.

create_testset(test_type, set_names = NULL)

Arguments

test_type

A single string to specify the type of dataset generated by this function.

"bench": Create test datasets for benchmarking
"curve": Create test datasets for curve evaluation

set_names

A character vector to specify the names of test datasets.

For benchmarking (test_type = "bench")

This function uses a naming convention for randomly generated benchmarking data. The format is a prefix ('b' or 'i') followed by the size of the dataset. The prefix 'b' indicates a balanced dataset, whereas 'i' indicates an imbalanced dataset. The number can take a suffix 'k' or 'm', indicating a multiplier of 1000 or 1 million, respectively.

Below are some examples.

"b100"

A balanced data set with 50 positives and 50 negatives.

"b10k"

A balanced data set with 5000 positives and 5000 negatives.

"b1m"

A balanced data set with 500,000 positives and 500,000 negatives.

"i100"

An imbalanced data set with 25 positives and 75 negatives.
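The naming convention above can be sketched as a small decoder in base R. This is only an illustration: `parse_bench_name` is a hypothetical helper, not part of prcbench, and the 1:3 positive-to-negative ratio for imbalanced sets is an assumption generalized from the "i100" example.

```r
# Hypothetical helper (not part of prcbench): decode a benchmark set name
# such as "b10k" into the class counts implied by the naming convention.
parse_bench_name <- function(set_name) {
  # prefix ('b' or 'i'), digits, optional suffix ('k' or 'm')
  m <- regmatches(set_name, regexec("^([bi])([0-9]+)([km]?)$", set_name))[[1]]
  if (length(m) == 0) {
    stop("Invalid set name: ", set_name)
  }
  prefix <- m[2]
  suffix <- m[4]

  multiplier <- if (suffix == "k") 1e3 else if (suffix == "m") 1e6 else 1
  total <- as.numeric(m[3]) * multiplier

  # Assumption: balanced sets are split 1:1; imbalanced sets are split 1:3,
  # as in the "i100" example (25 positives, 75 negatives).
  pos_fraction <- if (prefix == "b") 0.5 else 0.25
  n_pos <- round(total * pos_fraction)

  list(
    name = set_name,
    balanced = (prefix == "b"),
    n_pos = n_pos,
    n_neg = total - n_pos
  )
}

parse_bench_name("b10k")  # 5000 positives, 5000 negatives
parse_bench_name("i100")  # 25 positives, 75 negatives
```

Names that do not follow the pattern, such as "x100", raise an error in this sketch; how create_testset itself handles them is not described here.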

The function returns a list of TestDataB objects.

For curve evaluation (test_type = "curve")

The following four predefined datasets can be specified for curve evaluation.

set name    S3 object    data source
c1 or C1    TestDataC    C1DATA
c2 or C2    TestDataC    C2DATA
c3 or C3    TestDataC    C3DATA
c4 or C4    TestDataC    C4DATA

The function returns a list of TestDataC objects.

Value

A list of R6 test dataset objects.
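Because the result is a named list, individual datasets can be looked up by set name. The snippet below is a minimal sketch that relies only on base R list operations. It assumes the prcbench package is installed and does nothing otherwise.

```r
# Runs only when prcbench is available; otherwise the block is skipped.
if (requireNamespace("prcbench", quietly = TRUE)) {
  tsets <- prcbench::create_testset("bench", c("b100", "i100"))

  # The list is named after the requested sets.
  print(names(tsets))

  # Each element is a single test dataset object; inspect its class.
  print(class(tsets[["b100"]]))
}
```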

Examples

## Create a balanced data set with 50 positives and 50 negatives
tset1 <- create_testset("bench", "b100")
tset1
#> $b100
#> 
#>     === Test dataset for prcbench functions ===
#> 
#>     Testset name:     b100 
#>     # of positives:   50 
#>     # of negatives:   50 
#>     Scores:           0.0009157793 (min) 
#>                       0.3417239 (mean) 
#>                       0.9926006 (max) 
#>     Labels:           0 (neg), 1 (pos)
#> 
#> 

## Create an imbalanced data set with 25 positives and 75 negatives
tset2 <- create_testset("bench", "i100")
tset2
#> $i100
#> 
#>     === Test dataset for prcbench functions ===
#> 
#>     Testset name:     i100 
#>     # of positives:   25 
#>     # of negatives:   75 
#>     Scores:           0.001296925 (min) 
#>                       0.2468037 (mean) 
#>                       0.9040735 (max) 
#>     Labels:           0 (neg), 1 (pos)
#> 
#> 

## Create the C1 dataset
tset3 <- create_testset("curve", "c1")
tset3
#> $c1
#> 
#>     === Test dataset for prcbench functions ===
#> 
#>     Testset name:     c1 
#>     # of positives:   2 
#>     # of negatives:   2 
#>     Scores:           1 (min) 
#>                       2 (mean) 
#>                       3 (max) 
#>     Labels:           0 (neg), 1 (pos)
#>     Pre-calculated:   Yes
#>     # of base points: 6 
#>     Text position:    (0.85, 0.9)
#>     Text position2:   (0.9, 0.9)
#> 
#> 

## Create the C1 and C2 datasets
tset4 <- create_testset("curve", c("c1", "c2"))
tset4
#> $c1
#> 
#>     === Test dataset for prcbench functions ===
#> 
#>     Testset name:     c1 
#>     # of positives:   2 
#>     # of negatives:   2 
#>     Scores:           1 (min) 
#>                       2 (mean) 
#>                       3 (max) 
#>     Labels:           0 (neg), 1 (pos)
#>     Pre-calculated:   Yes
#>     # of base points: 6 
#>     Text position:    (0.85, 0.9)
#>     Text position2:   (0.9, 0.9)
#> 
#> 
#> $c2
#> 
#>     === Test dataset for prcbench functions ===
#> 
#>     Testset name:     c2 
#>     # of positives:   2 
#>     # of negatives:   2 
#>     Scores:           1 (min) 
#>                       2.25 (mean) 
#>                       3 (max) 
#>     Labels:           0 (neg), 1 (pos)
#>     Pre-calculated:   Yes
#>     # of base points: 6 
#>     Text position:    (0.2, 0.65)
#>     Text position2:   (0.2, 0.75)
#> 
#>
