Optimizing a set of partitions based on the value of a criterion function
Optimizing a set of partitions based on the value of a criterion function
The function optimizes a set of partitions based on the value of a criterion function (see critFunC for details on the criterion function) for a given network and blockmodel for Generalized blockmodeling (Žiberna, 2007) based on other parameters (see below). The optimization is done through local optimization, where the neighborhood of a partition includes all partitions that can be obtained by moving one unit from one cluster to another or by exchanging two units (from different clusters). The number of clusters and a number of partitions to generate can be specified (optParC).
UTF-8
M: A matrix representing the (usually valued) network. For multi-relational networks, this should be an array with the third dimension representing the relation. The network can have one or more modes (diferent kinds of units with no ties among themselves). If the network is not two-mode, the matrix must be square.
k: The number of clusters used in the generation of partitions.
approaches: One of the approaches (for each relation in multi-relational netowrks in a vector) described in Žiberna (2007). Possible values are:
"bin" - binary blockmodeling,
"val" - valued blockmodeling,
"hom" - homogeneity blockmodeling,
"ss" - sum of squares homogeneity blockmodeling, and
The last two options are "shorthand" for specifying approaches="hom" and homFun to either "ss" or "ad".
blocks: A vector, a list of vectors or an array with names of allowed blocy types.
Only listing of allowed block types (blockmodel is not pre-specified).
A vector with names of allowed block types. For multi-relational networks, it can be a list of such vectors. For approaches = "bin" or approaches = "val", at least two should be selected. Possible values are:
"nul" - null or empty block
"com" - complete block
"rdo", "cdo" - row and column-dominant blocks (binary and valued approach only)
"reg" - (f-)regular block
"rre", "cre" - row and column-(f-)regular blocks
"rfn", "cfn" - row and column-dominant blocks (binary, valued only)
"den" - density block (binary approach only)
"avg" - average block (valued approach only)
"dnc" - do not care block - the error is always zero
The ordering is important, since if several block types have identical error, the first on the list is selected.
A pre-specified blockmodel.
An array with four dimensions (see example below). The third and the fourth represent the clusters (for rows and columns). The first is as long as the maximum number of allows block types for a given block. If some block has less possible block types, the empty slots should have values NA. The second dimension is the number of relations (1 for single-relational networks). The values in the array should be the ones from above. The array can have only three dimensions in case of one-relational networks or if the same pre-specified blockmodel is assumed for all relations. Further, it can have only two dimensions, if in addition only one block type is allowed per block.
rep: The number of repetitions/different starting partitions to check.
save.initial.param: Should the inital parameters (approaches, ...) be saved. The default value is TRUE.
save.initial.param.opt: Should the inital parameters(approaches, ...) of using optParC be saved. The default value is FALSE.
deleteMs: Delete networks/matrices from the results of to save space.
max.iden: Maximum number of results that should be saved (in case there are more than max.iden results with minimal error, only the first max.iden will be saved).
switch.names: Should partitions that only differ in group names be considered equal. By default it is set to TRUE if blocks is either a vector or a list of vectors and to FALSE otherwise.
return.all: If FALSE, solution for only the best (one or more) partition/s is/are returned.
return.err: Should the error for each optimized partition be returned.
seed: Optional. The seed for random generation of partitions.
RandomSeed: Optional. Integer vector, containing the random number generator. It is only looked for in the user's workspace.
parGenFun: The function (object) that will generate random partitions. The default function is genRandomPar. The function has to accept the following parameters: k (number o of partitions by modes, n (number of units by modes), seed (seed value for random generation of partition), addParam (a list of additional parameters).
mingr: Minimal allowed group size.
maxgr: Maximal allowed group size.
addParam: A list of additional parameters for function specified above. In the usage section they are specified for the default function genRandomPar.
maxTriesToFindNewPar: The maximum number of partition try when trying to find a new partition to optimize that was not yet checked before - the default value is rep * 1000.
skip.par: The partitions that are not allowed or were already checked and should therefore be skipped.
useOptParMultiC: For backward compatibility. May be removed soon. See next argument.
useMulti: Which version of local search should be used. Default is currently FALSE. If FALSE, first possible all moves in random order and then all possible exchanges in random order are tried. When a move with lower value of criterion function is found, the algorithm moves to this new partition. If TRUE the version of local search where all possible moves and exchanges are tried first and then the one with the lowest error is selected and used. In this case, several optimal partitions are found. maxPar best partitions are returned.
printRep: Should some information about each optimization be printed.
n: The number of units by "modes". It is used only for generating random partitions. It has to be set only if there are more than two modes or if there are two modes, but the matrix representing the network is one mode (both modes are in rows and columns).
nCores: Number of cores to be used. Value 0 means all available cores. It can also be a cluster object.
useParLapply: Should parLapplyLB or parLapply (see useLB) be used for parallel execution (on multiple cores). Otherwise mforeach is used. Defaults to FALSE. If useParLapply = TRUE and useLB = TRUE, results are not reproducible.
useLB: Should be logical if set. Only used if useParLapply = TRUE. Should load balancing be used (parLapplyLB instead of parLapply). Using load balancing usually means faster execution, but results are with not reproducible. Defaults to NULL, which is changed to TRUE, but a warning.
chunk.size: chunk.size used in parLapplyLB if it is used, otherwise ignored. Defaults to 1.
cl: The cluster to use (if formed beforehand). Defaults to NULL. Ignored if useParLapply=FALSE (default) and foreach::getDoParRegistered is true
stopcl: Should the cluster be stoped after the function finishes. Defaults to is.null(cl).
useRegParrallaBackend: Should the function use already registered parallel backend. Defaults to FALSE. If TRUE, you must make sure that an appropriate backend is correctly set up and registered. Use only if useParLapply = FALSE (default) and nCore is not 1.
...: Arguments passed to other functions, see critFunC.
x: The result of optRandomParC.
genPajekPar: Should the partitions be generated as in Pajek.
probGenMech: Should the probabilities for different mechanisms for specifying the partitions be set. If probGenMech is not set, it is determined based on the parameter genPajekPar.
Returns
M: The matrix of the network analyzed.
res: If return.all = TRUE - A list of results the same as best - one best for each partition optimized.
best: A list of results from optParC, only without M.
err: If return.err = TRUE - The vector of errors or inconsistencies of the empirical network with the ideal network for a given blockmodel (model,approach,...) and parititions.
nIter: The vector of the number of iterations used - one value for each starting partition that was optimized. It can show that maxiter is too low if a lot of these values have the value of maxiter.
checked.par: If selected - A list of checked partitions. If merge.save.skip.par is TRUE, this list also includes the partitions in skip.par.
call: The call used to call the function.
initial.param: If selected - The initial parameters are used.
Warning
It should be noted that the time complexity of package blockmodeling is increasing with the number of units and the number of clusters (due to its algorithm). Therefore the analysis of network with more than 100 units can take a lot of time (from a few hours to a few days).
Examples
n <-8# If larger, the number of partitions increases dramatically# as does if we increase the number of clustersnet <- matrix(NA, ncol = n, nrow = n)clu <- rep(1:2, times = c(3,5))tclu <- table(clu)net[clu ==1, clu ==1]<- rnorm(n = tclu[1]* tclu[1], mean =0, sd =1)net[clu ==1, clu ==2]<- rnorm(n = tclu[1]* tclu[2], mean =4, sd =1)net[clu ==2, clu ==1]<- rnorm(n = tclu[2]* tclu[1], mean =0, sd =1)net[clu ==2, clu ==2]<- rnorm(n = tclu[2]* tclu[2], mean =0, sd =1)# Optimizing 10 random chosen partitions with optRandomParCres <- optRandomParC(M = net, k =2, rep =10,approaches ="hom", homFun ="ss", blocks ="com")plot(res)# Hopefully we get the original partition
Doreian, P., Batagelj, V. & Ferligoj, A. (2005). Generalized blockmodeling, (Structural analysis in the social sciences, 25). Cambridge [etc.]: Cambridge University Press.
(2007). Generalized Blockmodeling of Valued Networks. Social Networks, 29(1), 105-126. doi: 10.1016/j.socnet.2006.04.002
(2008). Direct and indirect approaches to blockmodeling of valued networks in terms of regular equivalence. Journal of Mathematical Sociology, 32(1), 57-84. doi: 10.1080/00222500701790207
(2014). Blockmodeling of multilevel networks. Social Networks, 39(1), 46-61. doi: 10.1016/j.socnet.2014.04.002