Fast weighted Hamming distance in R

Upon request, I am posting a row-weighted version of the fast Hamming distance function in my previous posts (here and here).

The fast row-weighted binary Hamming distance:


hamming_binary <- function(X, w = rep(1, nrow(X))) {
	w <- sqrt(w)
	D <- t(w * (1 - X)) %*% (w * X)
	D + t(D)
}

The fast row-weighted “multi-level” Hamming distance:


hamming <- function(X, w = rep(1, nrow(X))) {
	uniqs <- unique(as.vector(X))
	H <- hamming_binary(X == uniqs[1], w)
	for ( uniq in uniqs[-1] ) {
		H <- H + hamming_binary(X == uniq, w)
	}
	H / 2
}

Note that the last function is not dependent on the mode and number of unique elements of the matrix. In other words, the matrix can consist of any number of different characters, integers, and any other type of element you can store in an R matrix. Also, because it avoids R-level looping, it is extremely fast.

One thought on “Fast weighted Hamming distance in R

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s