R: selecting rows that contain a given number of NAs


I have a 6-column data frame with NAs. I wish to select the rows that contain a maximum of 3 NAs. I can find the number of NAs using sum(is.na(my.df[,c(1:6)])), but I am not able to select a subset of the data frame using 'subset' or any other function with the condition sum(is.na(log.df[,c(1:6)])) <= 3. I also wish to calculate the median of each of the selected rows. The sample data is shown below:

c1  c2  c3  c4  c5  c6
6.4 NA  6.1 6.2 NA  NA
7.1 6.4 6.5 5.9 7   6.9
7.1 7   6.9 6.9 6.9 7
6.9 NA  6.9 NA  7.1 NA
6.8 NA  7.1 7.1 6.8 7.2
NA  NA  NA  NA  NA  6.4
NA  NA  NA  NA  NA  6.7

Thanks in advance.

Use rowSums:

> mydf[rowSums(is.na(mydf)) <= 3, ]
   c1  c2  c3  c4  c5  c6
1 6.4  NA 6.1 6.2  NA  NA
2 7.1 6.4 6.5 5.9 7.0 6.9
3 7.1 7.0 6.9 6.9 6.9 7.0
4 6.9  NA 6.9  NA 7.1  NA
5 6.8  NA 7.1 7.1 6.8 7.2
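The attempt in the question fails because sum(is.na(...)) collapses the whole data frame into a single total, whereas rowSums gives one count per row. A minimal sketch of the difference, using a small made-up data frame (`toy` and its columns `x`, `y`, `z` are hypothetical, not the asker's data); it also shows that subset() accepts the same condition:

```r
# Hypothetical toy data for illustration only.
toy <- data.frame(
  x = c(1, NA, NA),
  y = c(NA, 2, NA),
  z = c(3, 4, NA)
)

# sum() returns a single total of NAs across the whole data frame.
total_nas <- sum(is.na(toy))

# rowSums() returns one NA count per row, which is what a row filter needs.
nas_per_row <- rowSums(is.na(toy))

# subset() takes the same logical condition as [ , ] indexing.
kept <- subset(toy, rowSums(is.na(toy)) <= 1)
print(kept)
```

Here the filter keeps the first two rows, each of which has one NA, and drops the all-NA third row.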

Step by step:

  • How many NAs are there per row?

    > rowSums(is.na(mydf))
    [1] 3 0 0 3 1 5 5

  • Which rows have 3 or fewer?

    > rowSums(is.na(mydf)) <= 3
    [1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE

R can then use this logical vector to subset: it keeps the TRUE rows (1, 2, 3, 4, 5) and discards the FALSE ones (6, 7).
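The question also asks for the median of each selected row, which the answer above does not cover. One possible approach, sketched here on the sample data from the question, is apply() over rows with na.rm = TRUE so the remaining NAs are ignored. The names `keep`, `selected` and `row_medians` are illustrative choices, not from the original post.

```r
# Rebuild the sample data from the question (columns c1..c6 as posted).
mydf <- data.frame(
  c1 = c(6.4, 7.1, 7.1, 6.9, 6.8, NA, NA),
  c2 = c(NA, 6.4, 7.0, NA, NA, NA, NA),
  c3 = c(6.1, 6.5, 6.9, 6.9, 7.1, NA, NA),
  c4 = c(6.2, 5.9, 6.9, NA, 7.1, NA, NA),
  c5 = c(NA, 7.0, 6.9, 7.1, 6.8, NA, NA),
  c6 = c(NA, 6.9, 7.0, NA, 7.2, 6.4, 6.7)
)

# Logical vector: TRUE for rows with at most 3 NAs.
keep <- rowSums(is.na(mydf)) <= 3

# Keep only those rows.
selected <- mydf[keep, ]

# Median of each selected row, ignoring the NAs that remain.
row_medians <- apply(selected, 1, median, na.rm = TRUE)
print(row_medians)
```

For the sample data, the medians for rows 1 to 5 are 6.2, 6.7, 6.95, 6.9 and 7.1. Note that apply() converts the data frame to a matrix first, which is fine here because every column is numeric.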

