r - Why is mean() so slow? -
Everything is in the title of the question! I was doing a bit of optimization and nailing down bottlenecks, and out of curiosity I tried this:
library(microbenchmark)

t1 <- rnorm(10)
microbenchmark(
  mean(t1),
  sum(t1)/length(t1),
  times = 10000
)
and the result is that mean() is 6+ times slower than the computation "by hand"!
Does this stem from the overhead in the code of mean() before the call to .Internal(mean), or is the C code itself slower? Why? Is there a reason for this, and a use case?
It is due to the S3 method dispatch, and the necessary parsing of the arguments in mean.default (and the other code in mean).
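For a sense of what that R-level work looks like, here is a rough sketch of the checks mean.default performs before anything reaches C (not the actual source; run body(mean.default) in your own session for the real thing, which varies slightly by R version):

sketch_mean_default <- function(x, trim = 0, na.rm = FALSE, ...) {
  # type check before handing off to the internal C routine
  if (!is.numeric(x) && !is.complex(x) && !is.logical(x)) {
    warning("argument is not numeric or logical: returning NA")
    return(NA_real_)
  }
  if (na.rm) x <- x[!is.na(x)]                 # optional NA removal
  if (!is.numeric(trim) || length(trim) != 1L)
    stop("'trim' must be numeric of length one")
  # (trimmed-mean branch omitted from this sketch)
  .Internal(mean(x))                           # only now does C take over
}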
sum and length are both primitive functions, so they will be fast (but how are you handling NA values?).
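To illustrate the NA caveat (the vector below is just an example): a single NA makes the hand-rolled version return NA, and naively adding na.rm = TRUE to sum() divides by the wrong count:

x <- c(1, 2, NA, 4)
mean(x)                                # NA
mean(x, na.rm = TRUE)                  # 2.333333
sum(x)/length(x)                       # NA
sum(x, na.rm = TRUE)/length(x)         # 1.75  -- divides by 4 instead of 3
sum(x, na.rm = TRUE)/sum(!is.na(x))    # 2.333333 -- correct hand-rolled form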
t1 <- rnorm(10)
microbenchmark(
  mean(t1),
  sum(t1)/length(t1),
  mean.default(t1),
  .Internal(mean(t1)),
  times = 10000
)

Unit: nanoseconds
                expr   min    lq median    uq     max neval
            mean(t1) 10266 10951  11293 11635 1470714 10000
  sum(t1)/length(t1)   684  1027   1369  1711  104367 10000
    mean.default(t1)  2053  2396   2738  2739 1167195 10000
 .Internal(mean(t1))   342   343    685   685   86574 10000
The internal bit of mean is even faster than sum/length.
See http://rwiki.sciviews.org/doku.php?id=packages:cran:data.table#method_dispatch_takes_time for more details (and the data.table solution that avoids .Internal).
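If, as in the benchmark above, the cost is many calls on short vectors, one workaround in the same spirit is to look up the method once outside the hot loop instead of paying S3 dispatch on every call (a sketch only, not what data.table does internally):

fast_mean <- getS3method("mean", "default")  # resolve the method once
total <- 0
for (i in 1:1000) {
  total <- total + fast_mean(rnorm(10))      # no dispatch inside the loop
}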
Note that if you increase the length of the vector, the primitive approach is fastest:
t1 <- rnorm(1e7)
microbenchmark(
  mean(t1),
  sum(t1)/length(t1),
  mean.default(t1),
  .Internal(mean(t1)),
  times = 100
)

Unit: milliseconds
                expr      min       lq   median       uq      max neval
            mean(t1) 25.79873 26.39242 26.56608 26.85523 33.36137   100
  sum(t1)/length(t1) 15.02399 15.22948 15.31383 15.43239 19.20824   100
    mean.default(t1) 25.69402 26.21466 26.44683 26.84257 33.62896   100
 .Internal(mean(t1)) 25.70497 26.16247 26.39396 26.63982 35.21054   100
Now method dispatch is only a fraction of the overall "time" required.
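Part of the remaining gap is that the internal mean is not simply sum/length: for double vectors it makes a second pass over the data to reduce floating-point error (see src/main/summary.c in the R sources). A rough R sketch of that algorithm, assuming a vector with no NAs:

two_pass_mean <- function(x) {
  n <- length(x)
  s <- sum(x)/n           # first pass: ordinary mean
  s + sum(x - s)/n        # second pass: small correction term for accuracy
}

That extra pass is roughly twice the work of a single sum(), which is why sum(t1)/length(t1) stays ahead even once dispatch overhead is negligible.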