The create_sim_samples function generates random samples with different performance levels.

create_sim_samples(n_repeat, np, nn, score_names = "random")

Arguments

n_repeat

The number of iterations, i.e. how many sets of samples to generate.

np

The number of positives in a sample.

nn

The number of negatives in a sample.

score_names

A character vector naming one or more of the following performance levels.

"random"

Random

"poor_er"

Poor early retrieval

"good_er"

Good early retrieval

"excel"

Excellent

"perf"

Perfect

"all"

All of the above

Value

The create_sim_samples function returns a list with the following items.

  • scores: a list of numeric vectors

  • labels: an integer vector

  • modnames: a character vector of the model names

  • dsids: a character vector of the dataset IDs

See also

mmdata for formatting input data. evalmod for calculating evaluation measures.
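
The returned items line up with the inputs that mmdata takes, so simulated samples can be formatted and then evaluated. The following is a minimal sketch of that workflow, assuming the precrec package is installed; the variable names samps, mdat, and curves are illustrative only.

```r
library(precrec)

# Simulate 2 repetitions of 10 positives and 20 negatives for two levels
samps <- create_sim_samples(2, 10, 20, c("random", "poor_er"))

# Format the simulated scores and labels, carrying over the
# model names and dataset IDs returned by create_sim_samples
mdat <- mmdata(samps[["scores"]], samps[["labels"]],
               modnames = samps[["modnames"]],
               dsids = samps[["dsids"]])

# Calculate evaluation measures (ROC and precision-recall curves by default)
curves <- evalmod(mdat)
```

Passing modnames and dsids explicitly keeps each score vector associated with its performance level and repetition.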

Examples

##################################################
### Create a set of samples with 10 positives and 10 negatives
### for the random performance level
###
samps1 <- create_sim_samples(1, 10, 10, "random")

## Show the list structure
str(samps1)
#> List of 4
#>  $ scores  :List of 1
#>   ..$ :List of 1
#>   .. ..$ : num [1:20] -0.384 0.74 1.199 1.892 -0.595 ...
#>  $ labels  : num [1:20] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ modnames: chr "random"
#>  $ dsids   : int 1

##################################################
### Create two sets of samples with 10 positives and 20 negatives
### for the random and the poor early retrieval performance levels
###
samps2 <- create_sim_samples(2, 10, 20, c("random", "poor_er"))

## Show the list structure
str(samps2)
#> List of 4
#>  $ scores  :List of 2
#>   ..$ :List of 2
#>   .. ..$ : num [1:30] -1.44 -0.184 0.574 -0.552 -0.88 ...
#>   .. ..$ : num [1:30] 0.837 0.862 0.921 0.511 0.993 ...
#>   ..$ :List of 2
#>   .. ..$ : num [1:30] 0.1185 0.0601 -0.0103 -0.5855 0.4812 ...
#>   .. ..$ : num [1:30] 0.991 0.918 0.574 0.729 0.365 ...
#>  $ labels  : num [1:30] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ modnames: chr [1:4] "random" "poor_er" "random" "poor_er"
#>  $ dsids   : int [1:4] 1 1 2 2

##################################################
### Create 3 sets of samples with 5 positives and 5 negatives
### for all 5 levels
###
samps3 <- create_sim_samples(3, 5, 5, "all")

## Show the list structure
str(samps3)
#> List of 4
#>  $ scores  :List of 3
#>   ..$ :List of 5
#>   .. ..$ : num [1:10] 1.285 -1.756 0.936 0.883 -0.129 ...
#>   .. ..$ : num [1:10] 0.831 0.669 0.688 0.336 0.85 ...
#>   .. ..$ : num [1:10] 0.0718 0.3483 0.9397 0.8489 0.5054 ...
#>   .. ..$ : num [1:10] 4.11 3.4 3.98 1.65 3.31 ...
#>   .. ..$ : num [1:10] 1 1 1 1 1 0 0 0 0 0
#>   ..$ :List of 5
#>   .. ..$ : num [1:10] -1.1639 -0.7669 0.0914 -0.2998 0.3918 ...
#>   .. ..$ : num [1:10] 0.822 0.916 0.985 0.641 0.515 ...
#>   .. ..$ : num [1:10] 0.58 0.788 0.108 0.835 0.615 ...
#>   .. ..$ : num [1:10] 2.26 4.05 2.72 3.46 3.57 ...
#>   .. ..$ : num [1:10] 1 1 1 1 1 0 0 0 0 0
#>   ..$ :List of 5
#>   .. ..$ : num [1:10] 0.2638 0.1711 -0.0645 0.9143 -0.0188 ...
#>   .. ..$ : num [1:10] 0.623 0.797 0.579 0.955 0.844 ...
#>   .. ..$ : num [1:10] 0.7275 0.0113 0.4045 0.6225 0.2971 ...
#>   .. ..$ : num [1:10] 4.04 1.65 4.14 4.31 2.94 ...
#>   .. ..$ : num [1:10] 1 1 1 1 1 0 0 0 0 0
#>  $ labels  : num [1:10] 1 1 1 1 1 0 0 0 0 0
#>  $ modnames: chr [1:15] "random" "poor_er" "good_er" "excel" ...
#>  $ dsids   : int [1:15] 1 1 1 1 1 2 2 2 2 2 ...