find_pcha_optimal_parameters function

Finds the optimal updating parameters to be used for the PCHA algorithm

After building a grid over the (mu_up, mu_down) space, the function runs archetypal with the given method and any other running options passed through the ellipsis (...). It then returns the pair of values that minimizes the SSE after testing_iters iterations (default = 10).
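The grid search described above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the package's implementation: `toy_sse` is a hypothetical stand-in for "run archetypal for testing_iters iterations and read off the SSE", `grid_search` is a hypothetical helper name, and the evenly spaced grid is an assumption (it is at least consistent with the value 2.188889 reported in the Examples section for the default bounds).

```r
# Hypothetical stand-in for one short archetypal run that returns its SSE.
# A real evaluation would pass mu_up as muAup/muBup and mu_down as
# muAdown/muBdown; a smooth toy surface is used here so the sketch runs anywhere.
toy_sse <- function(mu_up, mu_down) {
  (mu_up - 2.0)^2 + (mu_down - 0.2)^2 + 4
}

# Hypothetical helper: evaluate sse_fun on an nmup x nmdown grid and keep the minimum.
grid_search <- function(sse_fun, mup1 = 1.1, mup2 = 2.5,
                        mdown1 = 0.1, mdown2 = 0.5,
                        nmup = 10, nmdown = 10) {
  # Assumed: evenly spaced points on [mup1, mup2] and [mdown1, mdown2]
  grid <- expand.grid(
    mu_up   = seq(mup1, mup2, length.out = nmup),
    mu_down = seq(mdown1, mdown2, length.out = nmdown)
  )
  # One SSE value per (mu_up, mu_down) pair
  grid$sse <- mapply(sse_fun, grid$mu_up, grid$mu_down)
  best <- grid[which.min(grid$sse), ]
  list(mu_up_opt = best$mu_up, mu_down_opt = best$mu_down,
       min_sse = best$sse, grid = grid)
}

res <- grid_search(toy_sse)
res[c("mu_up_opt", "mu_down_opt", "min_sse")]
```

With the default arguments the sketch evaluates 10 x 10 = 100 pairs, which is why the cost of the search grows with nmup, nmdown and testing_iters.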

find_pcha_optimal_parameters(df, kappas, method = "projected_convexhull",
  testing_iters = 10, nworkers = NULL, nprojected = 2, npartition = 10,
  nfurthest = 100, sortrows = FALSE, mup1 = 1.1, mup2 = 2.50,
  mdown1 = 0.1, mdown2 = 0.5, nmup = 10, nmdown = 10, rseed = NULL,
  plot = FALSE, ...)

Arguments

  • df: The data frame with dimensions n x d

  • kappas: The number of archetypes

  • method: The method used to compute the initial approximation:

    1. projected_convexhull, see find_outmost_projected_convexhull_points
    2. convexhull, see find_outmost_convexhull_points
    3. partitioned_convexhull, see find_outmost_partitioned_convexhull_points
    4. furthestsum, see find_furthestsum_points
    5. outmost, see find_outmost_points
    6. random, a random set of kappas points will be used

  • testing_iters: The maximum number of iterations to run for every pair (mu_up, mu_down) of parameters

  • nworkers: The number of logical processors to use for parallel computing (usually twice the number of available physical cores)

  • nprojected: The dimension of the projected subspace for find_outmost_projected_convexhull_points

  • npartition: The number of partitions for find_outmost_partitioned_convexhull_points

  • nfurthest: The number of times that FurthestSum algorithm will be applied

  • sortrows: If it is TRUE, then rows will be sorted in find_furthestsum_points

  • mup1: The minimum value of mu_up, default is 1.1

  • mup2: The maximum value of mu_up, default is 2.5

  • mdown1: The minimum value of mu_down, default is 0.1

  • mdown2: The maximum value of mu_down, default is 0.5

  • nmup: The number of points to be taken for [mup1,mup2], default is 10

  • nmdown: The number of points to be taken for [mdown1,mdown2], default is 10

  • rseed: The random seed used to set the initial A matrix. Useful for reproducible results

  • plot: If it is TRUE, then a 3D plot for (mu_up, mu_down, SSE) is created

  • ...: Other arguments to be passed to function archetypal
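The nworkers description refers to logical processors. As a minimal sketch, assuming base R's parallel package (which this page does not mention), the logical and physical counts can be inspected before choosing a value. The variable `nworkers_choice` is a hypothetical name, and nothing is assumed here about how the function itself treats nworkers = NULL.

```r
library(parallel)

# Logical processors (hardware threads) visible to R; may be NA on some platforms.
n_logical <- detectCores(logical = TRUE)

# Physical cores; may also be NA where the operating system does not report it.
n_physical <- detectCores(logical = FALSE)

# Hypothetical choice: use the logical count when it is known, otherwise fall back to 1.
nworkers_choice <- if (is.na(n_logical)) 1L else n_logical
nworkers_choice
```

On machines with simultaneous multithreading the logical count is often twice the physical one, which matches the rule of thumb in the nworkers description.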

Returns

A list with members:

  1. mu_up_opt, the optimal value found for muAup and muBup
  2. mu_down_opt, the optimal value found for muAdown and muBdown
  3. min_sse, the minimum SSE, which corresponds to (mu_up_opt, mu_down_opt)
  4. seed_used, the random seed that was used; it is required to reproduce the optimal results
  5. method_used, the method that was used for creating the initial solution
  6. sol_initial, the initial solution that was used for all grid computations
  7. testing_iters, the maximum number of iterations done by every grid computation

Examples

{
data("wd25")
out = find_pcha_optimal_parameters(df = wd25, kappas = 5, rseed = 2020)
# Time difference of 30.91101 secs
# mu_up_opt mu_down_opt     min_sse
#  2.188889    0.100000    4.490980
# Run now given the above optimal found parameters:
aa = archetypal(df = wd25, kappas = 5,
                initialrows = out$sol_initial, rseed = out$seed_used,
                muAup = out$mu_up_opt, muAdown = out$mu_down_opt,
                muBup = out$mu_up_opt, muBdown = out$mu_down_opt)
aa[c("SSE", "varexpl", "iterations", "time" )]
# $SSE
# [1] 3.629542
#
# $varexpl
# [1] 0.9998924
#
# $iterations
# [1] 146
#
# $time
# [1] 21.96
# Compare it with a simple solution (time may vary)
aa2 = archetypal(df = wd25, kappas = 5, rseed = 2020)
aa2[c("SSE", "varexpl", "iterations", "time" )]
# $SSE
# [1] 3.629503
#
# $varexpl
# [1] 0.9998924
#
# $iterations
# [1] 164
#
# $time
# [1] 23.55
## Of course the above was a "toy example", if your data has thousands or million rows,
## then the time reduction is much more conspicuous.
# Close plot device:
dev.off()
}
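To put the figures printed in the example into proportion, the relative savings can be worked out directly. The sketch below uses only the numbers reported above (which the example itself notes may vary between runs); the variable names are illustrative.

```r
# Values copied from the example output above
iters_tuned   <- 146;   iters_default <- 164
time_tuned    <- 21.96; time_default  <- 23.55
sse_tuned     <- 3.629542
sse_default   <- 3.629503

# Relative reduction in iterations and in run time for the tuned run
iter_reduction <- (iters_default - iters_tuned) / iters_default
time_reduction <- (time_default - time_tuned) / time_default

# Difference in final SSE between the two runs
sse_gap <- sse_tuned - sse_default

round(c(iter_reduction = iter_reduction, time_reduction = time_reduction), 4)
sse_gap
```

In this run the tuned parameters save roughly 11% of the iterations and about 7% of the run time while ending at practically the same SSE. The reported time for the parameter search itself (about 30.9 seconds) is not included in that comparison.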

See Also

find_closer_points

  • Maintainer: Demetris Christopoulos
  • License: GPL (>= 2)
  • Last published: 2024-05-23
