The Conditional Missingness of Missing Values in R

Last updated on Dec 3, 2019 Programming

Data analysts often wish to examine subsets of data or otherwise manipulate data using indicators of missingness. Luckily, R features a number of different ways of designating a value as missing. Unluckily, the way these values interact with popular functions is not always intuitive, which can produce unintended results.

I wrote a demonstration of this a while back. The examples below show behaviors of missing values that many R programmers likely expect, along with some surprising results. One way to avoid disastrous consequences, whether they stem from these behaviors or from other causes, is to write tests that confirm your code does what you want it to do.


	# The below demonstrates the madness of R's treatment of NA values.
	# Some examples taken from https://stackoverflow.com/questions/25100974/na-matches-na-but-is-not-equal-to-na-why/25101796

	# Logical examples

	NA %in% NA
	# [1] TRUE

	NA == NA
	# [1] NA

	NA | TRUE
	# [1] TRUE

	NA_real_ | TRUE
	# [1] TRUE

	NA_integer_ | TRUE
	# [1] TRUE

	NA | FALSE
	# [1] NA

	NA_real_ | FALSE
	# [1] NA

	NA_integer_ | FALSE
	# [1] NA

	TRUE | paste(NA)
	# Error in TRUE | paste(NA) :
	# operations are possible only for numeric, logical or complex types

	# Matching examples

	match(NA, NA)
	# [1] 1

	match(NA, NA_real_)
	# [1] 1

	match(NA_character_, NA_real_)
	# [1] 1

	match(paste(NA), NA)
	# [1] NA

	gsub("NA", "", NA)
	# [1] NA

	gsub("NA", "", paste(NA))
	# [1] ""

	is.na(NA)
	# [1] TRUE

	is.na(paste(NA))
	# [1] FALSE

	# Other examples

	identical(NA, NA)
	# [1] TRUE

	eval(NA)
	# [1] NA

	is.na(eval(NA))
	# [1] TRUE
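
As a rough illustration of the kind of test mentioned above, here is a minimal sketch using base R's `stopifnot()`. The helper `is_missing()` is hypothetical and not part of the original demonstration; it assumes you want to treat the literal string "NA" (as produced by `paste(NA)`) as missing alongside true `NA` values, which may or may not suit your data.

```r
# Hypothetical helper (an assumption for illustration, not from the original
# demonstration): flag a value as missing if it is a true NA or, for character
# input, the literal string "NA" that paste(NA) produces.
is_missing <- function(x) {
  is.na(x) | (is.character(x) & x %in% "NA")
}

# Tests: each expression must evaluate to TRUE, otherwise stopifnot() errors.
stopifnot(
  # A true NA in a numeric vector is flagged; other values are not.
  identical(is_missing(c(1, NA, 3)), c(FALSE, TRUE, FALSE)),
  # Both the string "NA" and a true NA are flagged in a character vector.
  identical(is_missing(c("a", "NA", NA)), c(FALSE, TRUE, TRUE)),
  # is.na() alone misses the pasted value, so the helper must catch it.
  identical(is.na(paste(NA)), FALSE),
  identical(is_missing(paste(NA)), TRUE),
  # The base behaviors the helper relies on, taken from the examples above.
  isTRUE(NA %in% NA),
  is.na(NA == NA)
)
```

If every expression holds, `stopifnot()` returns silently; otherwise it stops with an error naming the first expression that failed. Running checks like these alongside your analysis code makes a surprising `NA` interaction fail loudly instead of silently changing your results.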



Brett J. Gall

Data Scientist
