By request, here is a row-weighted version of the fast Hamming distance functions from my previous posts (here and here). Distances are computed between the columns of X, and w holds one weight per row.
The fast row-weighted binary Hamming distance:
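hamming_binary <- function(X, w = rep(1, nrow(X))) {
  # Use square roots so that each row weight enters the product exactly once
  w <- sqrt(w)
  # D[j, k]: total weight of rows where column j is 0 and column k is 1
  D <- t(w * (1 - X)) %*% (w * X)
  # Add the transpose to cover mismatches in the other direction
  D + t(D)
}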
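As a quick sanity check, here is a minimal sketch that compares the matrix version against an explicit loop over column pairs. The helper naive_weighted_hamming and the example data are made up for illustration and are not part of the original posts; hamming_binary is repeated only so the snippet runs on its own.

```r
# Function from the post, repeated so the example is self-contained
hamming_binary <- function(X, w = rep(1, nrow(X))) {
  w <- sqrt(w)
  D <- t(w * (1 - X)) %*% (w * X)
  D + t(D)
}

# Hypothetical reference implementation (not from the post):
# sums the weights of the rows in which two columns differ
naive_weighted_hamming <- function(X, w = rep(1, nrow(X))) {
  n <- ncol(X)
  D <- matrix(0, n, n)
  for (j in seq_len(n)) {
    for (k in seq_len(n)) {
      D[j, k] <- sum(w * (X[, j] != X[, k]))
    }
  }
  D
}

# Made-up example data: 3 rows (weighted), 3 columns (compared)
X <- matrix(c(1, 0, 1,
              0, 0, 1,
              1, 1, 0), nrow = 3)
w <- c(1, 2, 3)

fast <- hamming_binary(X, w)
slow <- naive_weighted_hamming(X, w)

# Compare with a tolerance, because sqrt(w) * sqrt(w) is not exactly w
print(all.equal(fast, slow))
```

The comparison uses all.equal() rather than identical() because taking the square root of the weights introduces tiny floating-point differences.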
The fast row-weighted “multi-level” Hamming distance:
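hamming <- function(X, w = rep(1, nrow(X))) {
  uniqs <- unique(as.vector(X))
  # One binary indicator matrix per unique value
  H <- hamming_binary(X == uniqs[1], w)
  for (uniq in uniqs[-1]) {
    H <- H + hamming_binary(X == uniq, w)
  }
  # Each mismatch is counted twice, once for each of the two differing values
  H / 2
}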
Note that the last function does not depend on the mode of the matrix or on how many unique elements it contains. In other words, the matrix can hold any number of distinct characters, integers, or any other type of element you can store in an R matrix. Also, because the only R-level loop runs over the unique values rather than over pairs of columns, and everything else is matrix multiplication, it is very fast.
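To illustrate the point about modes, here is a small sketch that applies the multi-level function to a character matrix. The data and weights are invented for the example, and both functions are repeated from above only so the snippet runs on its own.

```r
# Functions from the post, repeated so the example is self-contained
hamming_binary <- function(X, w = rep(1, nrow(X))) {
  w <- sqrt(w)
  D <- t(w * (1 - X)) %*% (w * X)
  D + t(D)
}

hamming <- function(X, w = rep(1, nrow(X))) {
  uniqs <- unique(as.vector(X))
  H <- hamming_binary(X == uniqs[1], w)
  for (uniq in uniqs[-1]) {
    H <- H + hamming_binary(X == uniq, w)
  }
  H / 2
}

# Made-up character data: 3 rows (weighted), 3 columns (compared)
X_chr <- matrix(c("a", "b", "c",
                  "a", "c", "c",
                  "b", "b", "a"), nrow = 3)
w_chr <- c(0.5, 1, 2)

H <- hamming(X_chr, w_chr)
print(round(H, 10))
```

Columns 1 and 2 differ only in row 2, so their distance should come out as that row's weight, 1. Columns 1 and 3 differ in rows 1 and 3 (0.5 + 2 = 2.5), and columns 2 and 3 differ in every row (3.5).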