R: Selecting rows that contain a given number of NAs
I have a 6-column data frame with NAs. I wish to select the rows that contain a maximum of 3 NAs. I can find the number of NAs using sum(is.na(my.df[,c(1:6)])),
but I am not able to select a subset of the data frame using 'subset' or another function with the condition sum(is.na(log.df[,c(1:6)])) <= 3.
I also wish to calculate the median of each of the selected rows. The sample data is shown below:
 c1  c2  c3  c4  c5  c6
6.4  NA 6.1 6.2  NA  NA
7.1 6.4 6.5 5.9   7 6.9
7.1   7 6.9 6.9 6.9   7
6.9  NA 6.9  NA 7.1  NA
6.8  NA 7.1 7.1 6.8 7.2
 NA  NA  NA  NA  NA 6.4
 NA  NA  NA  NA  NA 6.7
Thanks in advance.
Use rowSums:

> mydf[rowSums(is.na(mydf)) <= 3, ]
   c1  c2  c3  c4  c5  c6
1 6.4  NA 6.1 6.2  NA  NA
2 7.1 6.4 6.5 5.9 7.0 6.9
3 7.1 7.0 6.9 6.9 6.9 7.0
4 6.9  NA 6.9  NA 7.1  NA
5 6.8  NA 7.1 7.1 6.8 7.2
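For anyone who wants to run this, here is a minimal, reproducible sketch. It assumes the sample data from the question is stored in a data frame called mydf. The names na_total, keep, selected and selected_subset are made up for illustration. It also shows why the sum() attempt from the question cannot work as a row filter: sum() collapses the whole data frame to a single number, while rowSums() gives one count per row.

```r
# Rebuild the sample data from the question (column names c1..c6 as posted).
mydf <- data.frame(
  c1 = c(6.4, 7.1, 7.1, 6.9, 6.8, NA, NA),
  c2 = c(NA, 6.4, 7.0, NA, NA, NA, NA),
  c3 = c(6.1, 6.5, 6.9, 6.9, 7.1, NA, NA),
  c4 = c(6.2, 5.9, 6.9, NA, 7.1, NA, NA),
  c5 = c(NA, 7.0, 6.9, 7.1, 6.8, NA, NA),
  c6 = c(NA, 6.9, 7.0, NA, 7.2, 6.4, 6.7)
)

# sum() counts every NA in the whole data frame: one number, not one per row.
na_total <- sum(is.na(mydf))

# rowSums() counts the NAs in each row, so it can drive a row filter.
keep <- rowSums(is.na(mydf)) <= 3

# Logical indexing and subset() should select the same rows here.
selected <- mydf[keep, ]
selected_subset <- subset(mydf, rowSums(is.na(mydf)) <= 3)

print(na_total)
print(selected)
```

Both forms should keep rows 1 to 5 of the sample data. subset() is handy interactively, while the bracket form is usually the safer choice inside functions.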
Step by step:

How many NAs per row?

> rowSums(is.na(mydf))
[1] 3 0 0 3 1 5 5
Which rows have 3 or fewer NAs?

> rowSums(is.na(mydf)) <= 3
[1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE
R can then use this logical vector to subset the data frame: it keeps the TRUE rows (1, 2, 3, 4, 5) and discards the FALSE ones (6, 7).
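The question also asks for the median of each selected row, which the answer above does not cover. One possible approach, offered only as a sketch, is apply() over rows with median() and na.rm = TRUE so that the remaining NAs are ignored. The names selected and row_medians are made up for illustration, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Sample data from the question, rebuilt so this snippet is self-contained.
mydf <- data.frame(
  c1 = c(6.4, 7.1, 7.1, 6.9, 6.8, NA, NA),
  c2 = c(NA, 6.4, 7.0, NA, NA, NA, NA),
  c3 = c(6.1, 6.5, 6.9, 6.9, 7.1, NA, NA),
  c4 = c(6.2, 5.9, 6.9, NA, 7.1, NA, NA),
  c5 = c(NA, 7.0, 6.9, 7.1, 6.8, NA, NA),
  c6 = c(NA, 6.9, 7.0, NA, 7.2, 6.4, 6.7)
)

# Keep only the rows with at most 3 NAs.
selected <- mydf[rowSums(is.na(mydf)) <= 3, ]

# Median of each remaining row; MARGIN = 1 means "by row",
# and na.rm = TRUE drops the NAs before taking the median.
row_medians <- apply(selected, 1, median, na.rm = TRUE)

print(row_medians)
```

With the sample data this should give 6.2, 6.7, 6.95, 6.9 and 7.1 for rows 1 to 5. Without na.rm = TRUE, any row that still contains an NA would return NA.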