All the code you will see in this presentation compiles and works as advertized.
Dealing with values instead of references has a couple of very nice benefits
Who owns foo?
foo_t * foo_factory (); foo_t * foo = foo_factory();
Is this better?
std::shared_ptr<foo_t> foo_factory (); std::shared_ptr<foo_t> foo = foo_factory();
That depends on whether you're partial to global state....
foo_t * foo_factory (); void foo_user (foo_t * f); void some_function () { foo_t * foo = foo_factory(); foo_user(foo); // What can I say about foo here? /* more code ... */ }
With value types, you only have to reason about the values in the code front of you.
With reference types, you must simultaneously reason about code that mutates the values elsewhere.
Q: How does the author of this function intend its users to use it?
bool my_predicate (some_t obj);
A: Non-polymorphically. You have to give me a some_t object, or get sliced. Not terribly reusable with other types.
Q: How about this one?
bool my_predicate (const base_t & obj);
A: Runtime-polymorphically. You can give me any base_t-derived object by reference. The function is now more reusable, but you must use inheritance. Yuck!
Q: Or this?
template <typename T> bool my_predicate (T obj);
A: Compile-time-polymorphically. You can give me any object whose type can be used to instantiated the template. The function is now even more reusable, but you must use metaprogramming. Gross!
Q: Is this an example of polymorphism?
struct base { virtual ~base () {} }; struct derived : base { virtual int foo () { return 42; } }; void some_function () { base * b_pointer = new derived; requires_foo(static_cast<derived*>(b_pointer)); // <-- this here }
A: No! Polymorphism means using one type as if it were another type. Here, we have to cast to make the type we have look like the type expected in the interface to foo().
The inheritance mechanism gives us polymorphism, but we must either:
struct base { virtual int foo () const = 0; }; struct derived : base { virtual int foo () const { return 42; } float bar () const { return 3.0f; } }; void some_function () { base * b_pointer = new derived; uses_foo(b_pointer); // Yay! // uses_bar(b_pointer); // Aww... }
Note that the "base" could use multiple inheritance, as long as the symbols in all the multiply inherited bases do not collide.
OR
struct base { virtual ~base() {} }; struct has_foo : virtual base { virtual int foo () { return 42; } }; struct has_bar : virtual base { virtual float bar () { return 3.0f; } }; struct derived : has_foo, has_bar {}; void some_function () { base * b_pointer = new derived; uses_foo(dynamic_cast<has_foo*>(b_pointer)); // Wait, but... uses_bar(dynamic_cast<has_bar*>(b_pointer)); // Aww... }
Which leads to the "diamond of death",
which leads to virtual inheritance,
which leads to fear,
which leads to anger,
... you get it.
More importantly, we have actually given up polymorphism by doing this; we must carry around the information that b_pointer contains interfaces not found in base.
We cannot make classes from different class hierarchies conform to a common interface, because they have no common base to use when passing them.
struct int_foo { virtual int foo () { return 42; } virtual void log () const; }; struct float_foo { virtual float foo () { return 3.0f; } virtual void log () const; };
Note that int_foo and float_foo both have a member
void log() const.
So, we ditch code reuse in the case of logging:
void log_to_terminal (const int_foo & loggee); void log_to_terminal (const float_foo & loggee);
Or we resort to ugly hacks:
struct log_base { virtual void log () const; }; struct int_foo : log_base {/*...*/}; struct float_foo : log_base {/*...*/}; void log_to_terminal (const log_base & loggee);
We cannot easily separate interface and implementation without using multiple inheritance.
Virtual functions can be tricky to get right, especially in large class heirarchies.
In addition to all the above limitations, we are limited in which interfaces we give to which types by our choice of base class(es).
Inheritance imposes very tight coupling; it is not possible to have unrelated types with the same interfaces used interchangably.
We must always take parameters by reference to use runtime polymorphism. There go our nice benefits from value semantics.
Widgets are UI elements (buttons, text boxes, etc.).
Layouts place widgets in the UI.
A layout can contain widgets.
A layout can contain sublayouts.
Using inheritance, you are all but locked in to giving layouts and widgets a common base,
... even though that makes no sense.
Metaprogramming requires a large body of knowledge about a large number of obscure language rules, and even more obscure TMP-specific tricks.
Metaprogramming heavy code is hard to maintain. This is true for experts, but is moreso in a team of varying skill levels.
Metaprogramming might be simply impossible to use where you work (even if you wanted to).
Compile times and object code size can get away from you if you're not careful.
Does not play well with runtime variation.
Compile-time is easy:
template <typename TrueType, typename FalseType, bool Selection> typename std::conditional<Selection, TrueType, FalseType>::type factory_function () { return typename std::conditional<Selection, TrueType, FalseType>::type(); } int an_int = factory_function<int, float, true>(); float a_float = factory_function<int, float, false>();
Runtime is impossible (without type erasure):
template <typename TrueType, typename FalseType> auto factory_function (bool selection) -> /* ??? */ { return /* ??? */; }
Because of the lack of easy runtime interoperability, once you decide to use TMP, you're almost always stuck doing more TMP.
Based on everything we've seen so far, we want an interface that works like this:
// No coupling! // No virtual functions, kind of! struct foo { int value () const; }; struct bar { int value () const; }; // No templates! int value_of (magic_type obj) { return obj.value(); } void some_function () { // Value semantics! if (value_of(foo()) == value_of(bar())) { /*...*/ } }
struct anything { anything () = default; anything (const anything & rhs); anything & operator= (const anything & rhs); template <typename T> anything (T t); template <typename T> anything & operator= (T t); struct handle_base { virtual ~handle_base () {} virtual handle_base * clone () const = 0; }; template <typename T> struct handle : handle_base { handle (T value); virtual handle_base * clone () const; T value_; }; std::unique_ptr<handle_base> handle_; };
anything definitions
template <typename T> anything::anything (T t) : handle_ (new handle<typename std::remove_reference<T>::type>( std::forward<T>(t) )) {} anything::anything (const anything & rhs) : handle_ (rhs.handle_->clone()) {} template <typename T> anything & anything::operator= (T t) { anything temp(std::forward<T>(t)); std::swap(temp, *this); return *this; } anything & anything::operator= (const anything & rhs) { anything temp(rhs); std::swap(temp, *this); return *this; }
anything::handle definitions
template <typename T> anything::handle<T>::handle (T value) : value_ (std::move(value)) {} template <typename T> anything::handle_base * anything::handle<T>::clone () const { return new handle(value_); }
anything in action
int i = 1; int * i_ptr = &i; anything a; a = i; a = i_ptr; a = 2.0; a = std::string("3"); struct foo {}; a = foo();
Dymamic typing ("duck typing"), similar to that in Python. Consider dumping scripting languages for this.
That's not a joke.
Q: Can anything really hold anything?
A: No, but it can hold anything with a copy constructor. It should really be called copyable.
Ok, so how do we get back to this?
// No coupling! // No virtual functions, kind of! struct foo { int value () const; }; struct bar { int value () const; }; // No templates! int value_of (magic_type obj) { return obj.value(); } void some_function () { // Value semantics! if (value_of(foo()) == value_of(bar())) { /*...*/ } }
Easy -- just forward the calls to value():
// anything gets: int value () const { return handle_->value(); }
// anything::handle_base gets: virtual int value () const = 0;
// anything::handle<T> gets: virtual int value () const { return value_.value(); }
We can add any arbitrary API in the same fashion. It's easy but repetitive to do so. More on that in a bit.
Say we have two disjoint APIs we wish to support.
One erased type for all widgets:
struct widget { // some boilerplate ... void render () const; // more boilerplate ... };
And one for all objects that can be used in our layout system:
struct layoutable { // boilerplate ... layout_geometry geometry () const; // boilerplate ... };
Here it is in use:
struct button { void render () const; layout_geometry geometry () const; void set_label (); void set_image (); // etc. }; void do_layout (layoutable l) { layout_geometry geometry = l.geometry(); /* use geometry... */ } void render_widget (widget w) { w.render(); } void some_function () { button b; do_layout(b); render_widget(b); }
Q: What about performance?
A: It's complicated.
Function call overhead is exactly the same.
How about heap allocations?
Operation Inheritance Simple Type Erasure Construct Yes Yes Copy No Yes Assign No Yes Get Alternate Interface No* Yes* dynamic_cast<> is not free, and is not consistent with polymorphism.
This chart also applies to copies of the underlying object.
Instead of always copying the given value, accept std::reference_wrappers.
button b; widget ref_widget(std::ref(b)); // Underlying object widget cref_widget(std::cref(b)); // not copied.
Replace the handle constructor with these:
template <typename U = T> handle (T value, typename std::enable_if< std::is_reference<U>::value >::type * = 0) : value_ (value) {} template <typename U = T> handle (T value, typename std::enable_if< !std::is_reference<U>::value, int >::type * = 0) noexcept : value_ (std::move(value)) {}
Add this specialization:
template <typename T> struct handle<std::reference_wrapper<T>> : handle<T &> { handle (std::reference_wrapper<T> ref) : handle<T &> (ref.get()) {} };
Allocations did not change.
How about copies of the underlying object?
Operation Inh. Simple TE TE + ref Construct Yes Yes No Copy No Yes No Assign No Yes No Alt. Interface No* Yes NoAll versions discussed hereafter include support for std::reference_wrapper.
Instead of always copying our type erased objects, only copy them when they are mutated.
copy_on_write<widget> w_1(widget{button()}); copy_on_write<widget> w_2 = w_1; // No copy. widget & mutable_w_2 = w_2.write(); // Copy happens here.
A nice benefit of using copy-on-write is thread safety.
Allocations:
Operation Inh. Simple TE TE + COW Construct 1 1 2 Copy 0 1 0 Assign 0 1 0 Alt. Interface 0* 1 2Remove one set of allocations on construction by applying copy-on-write directly to the handle_ member.
widget w_1 = button(); // Only 1 allocation here. widget w_2 = w_1; // No copy.
Copies only happen when a non-const member function is called.
Allocations:
Operation Inh. Simple TE TE w/COW Construct 1 1 1 Copy 0 1 0 Assign 0 1 0 Alt. Interface 0* 1 1std::array<int, 1024> big_array; anything small = 1; // No allocation to store an int. anything ref = std::ref(big_array); // No allocation to store a std::ref(). anything large = big_array; // Allocation required here.
Allocations for small†/large objects:
Operation Inh. Simple TE TE w/SBO Construct 1/1 1/1 0/1 Copy 0/0 1/1 0/1 Assign 0/0 1/1 0/1 Alt. Interface 0/0* 1/1 0/1†std::referece_wrapper is always small.
std::array<int, 1024> big_array; anything small = 1; // No allocation to store an int. anything ref = std::ref(big_array); // No allocation to store a std::ref(). anything large = big_array; // Allocation required here. anything copied = large; // No allocations for copies.
Allocations for small†/large objects:
Operation Inh. Simple TE TE w/SBO+COW Construct 1/1 1/1 0/1 Copy 0/0 1/1 0/0 Assign 0/0 1/1 0/0 Alt. Interface 0/0* 1/1 0/1† Again, std::referece_wrapper is always small.
Copies only happen when a non-const member function is called.
The final implementation is a bit long to show here.
What have we gained versus using inheritance?
What have we lost?
Value semantics
Never writing new or delete
Ability to bind types to interfaces never seen at the time of the types' writing, including multiple interfaces
Thread safety via copy-on-write
Elision of heap allocations for small types and references
Simplicity of implementation
Thread safety come at the cost of some atomic operations and copies when values are mutated
Boost.TypeErasure is the most robust library-based type erasure solution to date.
An example from the online docs:
any< mpl::vector< copy_constructible<>, typeid_<>, incrementable<>, ostreamable<> > > x(10); ++x; std::cout << x << std::endl; // prints 11
An example of an erased type with a member function:
BOOST_TYPE_ERASURE_MEMBER((has_push_back), push_back, 1) void append_many(any<has_push_back<void(int)>, _self&> container) { for(int i = 0; i < 10; ++i) container.push_back(i); }
Allocations for small/large/ref held values:
Operation Inh. Optimized TE Boost TE Construct 1/1/- 0/1/0 1/1/0 Copy 0/0/- 0/0/0 1/1/0 Assign 0/0/- 0/0/0 1/1/0 Alt. Interface 0/0/-* 0/1/0 1/1/0All values are the same for copies of the underlying value, except for the optimized hand-rolled version's binding to an alternate interface.
Space requirements, including held-object storage:
Technique Held Object Size Space Requirement Inheritance All P + O Hand-rolled TE Small P + B Hand-rolled TE Large P + B + O Boost TE All FM + OP is the size of a pointer to a heap-allocated object, F is the size of a pointer-to-function, O is the size of the stored object, M is the number of members in an erased type's interface, and B is small buffer size.
The biggest advantage of using Boost.TypeErasure over a hand-rolled implementation is that your hands get a rest.
Q: How do we avoid large amounts of cut-and-pasted code?
A: We generate it, using emtypen.
emtypen is an MIT-licensed tool that takes a real C++ header with one or more archetypal interfaces and produces the hand-rolled version of the type erasure techniques presented in this talk. It uses libclang. For example:
$ emtypen archetypes.hpp > generated.hpp
generated.hpp now contains the generated erased types, one for each archetypal struct/class in archetypes.hpp.
All the erased types in the code in this talk were generated using this tool.
Your input can be a real-boy C++ header, that you can syntax check with a real compiler.
As for the output,
Archetype input:
#ifndef LOGGABLE_INTERFACE_INCLUDED__ #define LOGGABLE_INTERFACE_INCLUDED__ #include <iostream> struct loggable { std::ostream & log (std::ostream & os) const; }; #endif
Erased type output:
#ifndef LOGGABLE_INTERFACE_INCLUDED__ #define LOGGABLE_INTERFACE_INCLUDED__ # include <iostream> #include <algorithm> #include <cassert> #include <functional> #include <memory> #include <type_traits> #include <utility> #if defined(_MSC_VER) && _MSC_VER == 1800 #define noexcept #endif struct loggable { public: // Lots of boilerplate code here. std :: ostream & log ( std :: ostream & os) const { assert(handle_); return handle_->log(os ); } private: // Even more boilerplate code here. }; #endif
emtypen generates your erased types based on a "form" file.
The form file contains the boilerplate code for generating one erased type.
emtypen takes optional command-line parameters that can be used to change the form file used, or the block of headers it includes near the top of the file.
You can completely change the boilerplate if you like. Forms are provided for all the hand-rolled versions of type erasure presented in this talk.
http://github.com/tzlaine/type_erasure