C++11 std::function slower than virtual calls?


I am creating a mechanism that allows users to form arbitrarily complex functions from basic building blocks using the decorator pattern. This works fine functionality-wise, but I don't like the fact that it involves a lot of virtual calls, particularly when the nesting depth becomes large. This worries me because the complex function may be called often (>100,000 times).

To avoid this problem, I tried to turn the decorator scheme into a std::function once it is finished (cf. to_function() in the SSCCE). All internal function calls are wired up during construction of the std::function. I figured this would be faster to evaluate than the original decorator scheme, because no virtual lookups need to be performed in the std::function version.

Alas, benchmarks prove me wrong: the decorator scheme is in fact faster than the std::function I built from it. So now I am left wondering why. Maybe my test setup is faulty since I only use two trivial basic functions, which means the vtable lookups may be cached?

The code I used is included below; unfortunately, it is quite long.


SSCCE

    // sscce.cpp
    #include <iostream>
    #include <vector>
    #include <memory>
    #include <functional>
    #include <random>

    /**
     * Base class for the pipeline scheme (implemented via decorators).
     */
    class pipeline {
    protected:
        std::unique_ptr<pipeline> wrappee;
        pipeline(std::unique_ptr<pipeline> wrap)
        :wrappee(std::move(wrap)){}
        pipeline():wrappee(nullptr){}

    public:
        typedef std::function<double(double)> fnsig;
        double operator()(double input) const{
            if(wrappee.get()) input=wrappee->operator()(input);
            return process(input);
        }

        virtual double process(double input) const=0;
        virtual ~pipeline(){}

        // Returns a std::function that contains the entire pipeline stack.
        virtual fnsig to_function() const=0;
    };

    /**
     * CRTP helper that implements to_function().
     */
    template <class derived>
    class pipeline_crtp : public pipeline{
    protected:
        pipeline_crtp(const pipeline_crtp<derived> &o):pipeline(o){}
        pipeline_crtp(std::unique_ptr<pipeline> wrappee)
        :pipeline(std::move(wrappee)){}
        pipeline_crtp():pipeline(){}
    public:
        typedef typename pipeline::fnsig fnsig;

        fnsig to_function() const override{
            if(pipeline::wrappee.get()!=nullptr){

                fnsig wrapfun = pipeline::wrappee->to_function();
                fnsig processfun = std::bind(&derived::process,
                    static_cast<const derived*>(this),
                    std::placeholders::_1);
                fnsig fun = [=](double input){
                    return processfun(wrapfun(input));
                };
                return std::move(fun);

            }else{

                fnsig processfun = std::bind(&derived::process,
                    static_cast<const derived*>(this),
                    std::placeholders::_1);
                fnsig fun = [=](double input){
                    return processfun(input);
                };
                return std::move(fun);
            }
        }

        virtual ~pipeline_crtp(){}
    };

    /**
     * First concrete derived class: simple scaling.
     */
    class scale: public pipeline_crtp<scale>{
    private:
        double scale_;
    public:
        // The parameter is named factor so that it does not hide the class
        // name scale inside the initializer list.
        scale(std::unique_ptr<pipeline> wrap, double factor) // todo move
        :pipeline_crtp<scale>(std::move(wrap)),scale_(factor){}
        scale(double factor):pipeline_crtp<scale>(),scale_(factor){}

        double process(double input) const override{
            return input*scale_;
        }
    };

    /**
     * Second concrete derived class: offset.
     */
    class offset: public pipeline_crtp<offset>{
    private:
        double offset_;
    public:
        // The parameter is named delta for the same reason as in scale.
        offset(std::unique_ptr<pipeline> wrap, double delta) // todo move
        :pipeline_crtp<offset>(std::move(wrap)),offset_(delta){}
        offset(double delta):pipeline_crtp<offset>(),offset_(delta){}

        double process(double input) const override{
            return input+offset_;
        }
    };

    int main(){

        // Used to make a random function / random arguments,
        // to prevent gcc from being overly clever.
        std::default_random_engine generator;
        auto randint = std::bind(std::uniform_int_distribution<int>(0,1),std::ref(generator));
        auto randdouble = std::bind(std::normal_distribution<double>(0.0,1.0),std::ref(generator));

        // Make a complex pipeline.
        std::unique_ptr<pipeline> pipe(new scale(randdouble()));
        for(unsigned i=0;i<100;++i){
            if(randint()) pipe=std::move(std::unique_ptr<pipeline>(new scale(std::move(pipe),randdouble())));
            else pipe=std::move(std::unique_ptr<pipeline>(new offset(std::move(pipe),randdouble())));
        }

        // Make a std::function from the pipe.
        pipeline::fnsig fun(pipe->to_function());

        double bla=0.0;
        for(unsigned i=0; i<100000; ++i){
    #ifdef use_function
            // takes 110 ms on average
            bla+=fun(bla);
    #else
            // takes 60 ms on average
            bla+=pipe->operator()(bla);
    #endif
        }
        std::cout << bla << std::endl;
    }

Benchmark

Using pipe:

    g++ -std=gnu++11 sscce.cpp -march=native -O3
    sudo nice -3 /usr/bin/time ./a.out
    -> 60 ms

Using fun:

    g++ -Duse_function -std=gnu++11 sscce.cpp -march=native -O3
    sudo nice -3 /usr/bin/time ./a.out
    -> 110 ms
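Note that /usr/bin/time measures the whole process, so the figures above also include building the pipeline and the std::function. A minimal sketch of timing only the evaluation loop with std::chrono could look like the following; the helper name time_loop_ms is hypothetical and not part of the original code.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not in the original code): runs the same accumulation
// loop as main() on any callable and returns the elapsed wall time in ms.
template <class Callable>
double time_loop_ms(const Callable& f, unsigned iterations, double& result) {
    assert(iterations > 0);
    const auto start = std::chrono::steady_clock::now();
    double bla = 0.0;
    for (unsigned i = 0; i < iterations; ++i) bla += f(bla);
    const auto stop = std::chrono::steady_clock::now();
    result = bla;  // hand the value back so the loop is not optimized away
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Inside main() this could be called once as time_loop_ms(fun, 100000, out) and once as time_loop_ms([&](double x){ return (*pipe)(x); }, 100000, out), with both versions compiled into the same binary.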

As Sebastian Redl's answer says, your "alternative" to virtual functions adds several layers of indirection through dynamically bound functions (either virtual, or through function pointers, depending on the std::function implementation), and then it still calls the virtual pipeline::process(double) function anyway!

This modification makes it faster, by removing one layer of std::function indirection and by preventing the call to derived::process from being virtual:

    fnsig to_function() const override {
        fnsig fun;
        auto derived_this = static_cast<const derived*>(this);
        if (pipeline::wrappee) {
            fnsig wrapfun = pipeline::wrappee->to_function();
            fun = [=](double input){
                return derived_this->derived::process(wrapfun(input));
            };
        } else {
            fun = [=](double input){
                return derived_this->derived::process(input);
            };
        }
        return fun;
    }
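To illustrate why the qualified call derived_this->derived::process(...) matters, here is a small standalone sketch. The types stage and noisy_stage are hypothetical stand-ins, not the classes from the question. A call through a bound member pointer still dispatches on the dynamic type, while a qualified call is bound statically. In the pipeline code derived is already the most-derived type, so both forms compute the same value there; the qualified form only skips the lookup.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins for the classes in the question.
struct stage {
    virtual ~stage() = default;
    virtual double process(double x) const { return x + 1.0; }
};

struct noisy_stage : stage {
    double process(double x) const override { return x + 100.0; }
};

// A pointer to a virtual member function still dispatches virtually,
// so the std::bind version reaches the override.
double call_via_bind(const stage* s, double x) {
    std::function<double(double)> f =
        std::bind(&stage::process, s, std::placeholders::_1);
    return f(x);
}

// A qualified call names stage::process explicitly: it is bound
// statically and any override is ignored.
double call_qualified(const stage* s, double x) {
    return s->stage::process(x);
}

// Small self-check showing the difference on an overriding object.
void demo() {
    noisy_stage n;
    assert(call_via_bind(&n, 1.0) == 101.0);  // override is reached
    assert(call_qualified(&n, 1.0) == 2.0);   // override is bypassed
}
```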

There's still more work being done here than in the virtual function version, though.

