x: a matrix of observed covariates from the sample. The class of treatment regimes is assumed to be linear, so the columns of x must match the entries of beta, in the same order.
y: a vector, the observed responses from the sample
a: a vector of 0s and 1s, the observed treatments from the sample
prob: a vector, the propensity scores of receiving treatment 1 for each observation in the sample
Cnobs: a matrix enumerating all possible pairs of indexes, as generated by combn(1:n, 2), where n is the number of unique observations. Note that combn(1:n, 2) returns a 2 x choose(n, 2) matrix, with one pair per column rather than one pair per row.
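
As a rough illustration of how these arguments fit together, the sketch below builds toy inputs with consistent shapes. The data are simulated and the values of n and beta are hypothetical, chosen only for the example. The text does not say whether x should carry an intercept column, so the sketch simply keeps ncol(x) equal to length(beta). The propensity scores are assumed known and constant here.

```r
set.seed(1)
n <- 6                                   # hypothetical sample size

# Covariate matrix with two columns; beta is a hypothetical coefficient
# vector for a linear regime, one entry per column of x.
x <- matrix(rnorm(n * 2), nrow = n)
beta <- c(1, -1)
stopifnot(ncol(x) == length(beta))

a <- rbinom(n, size = 1, prob = 0.5)     # observed treatments (0 or 1)
prob <- rep(0.5, n)                      # propensity of treatment 1 (assumed known)
y <- as.numeric(x %*% beta) * a + rnorm(n)   # toy responses

# All pairs of observation indexes.
Cnobs <- combn(1:n, 2)
dim(Cnobs)                               # 2 rows, choose(n, 2) columns

# Transposed copy, in case a one-pair-per-row layout is wanted instead.
Cnobs_cols <- t(Cnobs)
```

Since combn places each pair in a column, the orientation should be checked against what the calling function expects before the matrix is passed as Cnobs.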