This post has been slightly modified from its original form on woodpeckR.
Problem
I keep forgetting how to select all elements of an object except a few, by name. I get the ! operator confused with the - operator, and I find both of them less than intuitive to use. How can I negate the %in% operator?
Context
I have a data frame called electrofishing that contains observations from a fish sampling survey. One column, stratum, gives the aquatic habitat type of the sampling site. I’d like to exclude observations sampled in the “Tailwater Zone” or “Impounded-Offshore” aquatic habitats.
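My first instinct is to put a ! directly in front of %in%, but that doesn’t work: you can’t negate the %in% operator directly. Instead, you have to negate the entire %in% expression, which returns the opposite of the original boolean vector. Wrapping the expression in parentheses before the ! makes it clear what is being negated.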
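As a minimal sketch of both the failed attempt and the parenthesized form, assuming a small made-up stand-in for electrofishing (the n_fish column and the two extra stratum names are invented for illustration, not taken from the real survey data):

```r
# Hypothetical stand-in for the electrofishing data frame
electrofishing <- data.frame(
  stratum = c("Tailwater Zone", "Main Channel Border",
              "Impounded-Offshore", "Side Channel"),
  n_fish = c(12, 7, 3, 9),
  stringsAsFactors = FALSE
)
excluded <- c("Tailwater Zone", "Impounded-Offshore")

# Putting ! directly in front of %in% is not valid R syntax,
# so parsing the expression fails
bad <- try(parse(text = "stratum !%in% excluded"), silent = TRUE)
is_syntax_error <- inherits(bad, "try-error")

# Negate the whole %in% expression instead
kept <- electrofishing[!(electrofishing$stratum %in% excluded), ]
```

Here kept holds only the rows whose stratum is not one of the two excluded habitats.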
I’m not saying this doesn’t make sense, but I can never remember it. My English-speaking brain would much rather say “rows whose stratum is not included in c("Tailwater Zone", "Impounded-Offshore")” than “not rows whose stratum is included in c("Tailwater Zone", "Impounded-Offshore")”.
Solution
Luckily, it’s pretty easy to negate %in% and create a %notin% operator. I credit this answer to user “catastrophic-failure” on this Stack Overflow question.
`%notin%` <- Negate(`%in%`)
I didn’t even know that the Negate function existed. The more you know.
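To see what this does, here is a small sketch on a toy character vector (the vectors x and y are made up for illustration). Negate() takes a function and returns a new function that calls the original and flips its logical result, so the new operator should agree with the parenthesized ! form:

```r
# Define the operator as above
`%notin%` <- Negate(`%in%`)

# Toy vectors, invented for this example
x <- c("a", "b", "c")
y <- c("b")

result <- x %notin% y       # TRUE FALSE TRUE
same_as_bang <- identical(result, !(x %in% y))
```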
Outcome
I know there are lots of ways to negate selections in R. dplyr’s select() makes it easy to drop columns by name with -c(), and filter() accepts negated conditions. Or I could just learn to throw a ! in front of my %in% statements. But %notin% seems a little more intuitive.
Now it’s straightforward to select these rows from my data frame.
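A sketch of what that selection might look like in base R, again assuming a small invented stand-in for electrofishing (the n_fish column and the non-excluded stratum names are hypothetical):

```r
`%notin%` <- Negate(`%in%`)

# Hypothetical stand-in for the electrofishing data frame
electrofishing <- data.frame(
  stratum = c("Tailwater Zone", "Main Channel Border",
              "Impounded-Offshore", "Side Channel"),
  n_fish = c(12, 7, 3, 9),
  stringsAsFactors = FALSE
)

# Keep rows whose stratum is not in the excluded habitats
subset_df <- electrofishing[
  electrofishing$stratum %notin% c("Tailwater Zone", "Impounded-Offshore"),
]
```

The condition now reads in the same order as the English sentence: rows whose stratum is not in the excluded set.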