 Boost.MultiIndex Tutorial: Key extraction
STL associative containers have a notion of key, albeit in a somewhat incipient
form. The keys of such containers are identified by a nested type
key_type; for std::sets and std::multisets,
key_type coincides with value_type, i.e. the key is the
element itself. std::map and std::multimap manage
elements of type std::pair<const Key,T>, where the first
member is the key. In either case, the process of obtaining the key from a
given element is implicitly fixed and cannot be customized by the user.
Fixed key extraction mechanisms like those performed by STL associative
containers do not scale well in the context of Boost.MultiIndex, where
several indices share their value_type definition but
might feature completely different lookup semantics. For this reason,
Boost.MultiIndex formalizes the concept of a Key Extractor
in order to make it explicit and controllable
in the definition of key-based indices.
Intuitively speaking, a key extractor is a function object that accepts a reference to an element and returns its associated key. The formal concept also imposes some reasonable constraints about the stability of the process, in the sense that extractors are assumed to return the same key when passed the same element: this is in consonance with the informal understanding that keys are actually some "part" of the element and do not depend on external data.
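The intuitive description above can be sketched without any Boost machinery. In the following minimal illustration, book, year_of and less_by_key are hypothetical names introduced only for this example; year_of has the general shape of a key extractor (a nested result_type and a const operator() that depends only on the element), and less_by_key shows how a container could order elements through whatever extractor it is given. This is a sketch of the idea, not a description of how Boost.MultiIndex indices are implemented.

```cpp
#include <set>
#include <string>

// Hypothetical element type, used only for this illustration.
struct book
{
  std::string title;
  int         year;
};

// Function object shaped like a key extractor: it accepts a reference to
// an element and returns its key. The result depends only on the element,
// so the same element always yields the same key.
struct year_of
{
  typedef int result_type;

  result_type operator()(const book& b)const{return b.year;}
};

// Orders elements by comparing the keys produced by a given extractor.
template<typename KeyExtractor>
struct less_by_key
{
  bool operator()(const book& a,const book& b)const
  {
    return key(a)<key(b);
  }

  KeyExtractor key;
};

// A standard set whose ordering is driven by the extractor rather than
// by the element itself.
typedef std::set<book,less_by_key<year_of> > books_by_year;
```

Swapping year_of for a different extractor changes the ordering without touching book, which is the kind of control that the Key Extractor concept makes explicit.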
A key extractor is called read/write if it returns a non-constant reference
to the key when passed a non-constant element, and it is called read-only
otherwise. Boost.MultiIndex requires that the key extractor be read/write
when using the modify_key member function of ordered and hashed
indices. In all other situations, read-only extractors suffice.
The section on advanced features
of Boost.MultiIndex key extractors details which of the predefined
key extractors are read/write.
identity
The identity key extractor returns the entire base object as the associated key:
    #include <boost/multi_index_container.hpp>
    #include <boost/multi_index/ordered_index.hpp>
    #include <boost/multi_index/identity.hpp>

    multi_index_container<
      int,
      indexed_by<
        ordered_unique<
          identity<int> // the key is the entire element
        >
      >
    > cont;
member
member key extractors return a reference to a specified
data field of the base object. For instance, in the following version of our
familiar employee container:
    #include <boost/multi_index_container.hpp>
    #include <boost/multi_index/ordered_index.hpp>
    #include <boost/multi_index/identity.hpp>
    #include <boost/multi_index/member.hpp>

    typedef multi_index_container<
      employee,
      indexed_by<
        ordered_unique<identity<employee> >,
        ordered_non_unique<member<employee,std::string,&employee::name> >,
        ordered_unique<member<employee,int,&employee::ssnumber> >
      >
    > employee_set;
the second and third indices use member extractors on
employee::name and employee::ssnumber, respectively.
The specification of an instantiation of member is simple
yet a little contrived:
member<(base type),(key type),(pointer to member)>
It might seem that the first and second parameters are superfluous,
since the type of the base object and of the associated data field are
already implicit in the pointer to member argument: unfortunately, it is
not possible to extract this information with current C++ mechanisms,
which makes the syntax of member a little too verbose.
const_mem_fun and mem_fun
Sometimes, the key of an index is not a concrete data member of the element,
but rather it is a value returned by a particular member function.
This resembles the notion of calculated indices supported by some
relational databases. Boost.MultiIndex supports this
kind of key extraction through const_mem_fun.
Consider the following container where sorting on the third index
is based upon the length of the name field:
    #include <boost/multi_index_container.hpp>
    #include <boost/multi_index/ordered_index.hpp>
    #include <boost/multi_index/identity.hpp>
    #include <boost/multi_index/member.hpp>
    #include <boost/multi_index/mem_fun.hpp>
    #include <cstddef>
    #include <string>

    struct employee
    {
      int         id;
      std::string name;

      employee(int id,const std::string& name):id(id),name(name){}

      bool operator<(const employee& e)const{return id<e.id;}

      // returns the length of the name field
      std::size_t name_length()const{return name.size();}
    };

    typedef multi_index_container<
      employee,
      indexed_by<
        // sort by employee::operator<
        ordered_unique<identity<employee> >,

        // sort by less<string> on name
        ordered_non_unique<member<employee,std::string,&employee::name> >,

        // sort by less<std::size_t> on name_length()
        ordered_non_unique<
          const_mem_fun<employee,std::size_t,&employee::name_length>
        >
      >
    > employee_set;
The usage syntax of const_mem_fun is similar to that of member:
const_mem_fun<(base type),(key type),(pointer to member function)>
The member function referred to must be const, take no arguments and return
a value of the specified key type.
Almost always you will want to use a const member function,
since elements in a multi_index_container are treated as constant, much
as elements of an std::set. However, a mem_fun
counterpart is provided for use with non-constant member functions, whose
applicability is discussed in the section on advanced features
of Boost.MultiIndex key extractors.
Example 2 in the examples section
provides a complete program showing how to use const_mem_fun.
global_fun
Whereas const_mem_fun and mem_fun are based on a
given member function of the base type from which the key is extracted,
global_fun takes a global function (or static member function) accepting the base
type as its parameter and returning the key:
    #include <boost/multi_index_container.hpp>
    #include <boost/multi_index/ordered_index.hpp>
    #include <boost/multi_index/global_fun.hpp>

    struct rectangle
    {
      int x0,y0;
      int x1,y1;
    };

    // area of the rectangle (width times height), assuming x0<=x1 and y0<=y1
    unsigned long area(const rectangle& r)
    {
      return (unsigned long)(r.x1-r.x0)*
             (unsigned long)(r.y1-r.y0);
    }

    typedef multi_index_container<
      rectangle,
      indexed_by<
        // sort by increasing area
        ordered_non_unique<global_fun<const rectangle&,unsigned long,&area> >
      >
    > rectangle_container;
The specification of global_fun obeys the following syntax:
global_fun<(argument type),(key type),(pointer to function)>
where the argument type and key type must match exactly those in the
signature of the function used; for instance, in the example above the argument
type is const rectangle&, without omitting the "const"
and "&" parts. So, although most of the time the base type will be
accepted by constant reference, global_fun is also prepared to take
functions accepting their argument by value or by non-constant reference: this
latter case cannot generally be used directly in the specification of
multi_index_containers as their elements are treated as constant,
but the section on advanced features
of Boost.MultiIndex key extractors describes valid use cases of
key extraction based on such functions with a non-constant reference argument.
Example 2 in the examples section
uses global_fun.
Although the predefined key extractors
provided by Boost.MultiIndex are intended to serve most cases,
the user can also provide her own key extractors in more exotic situations,
as long as these conform to the Key Extractor concept.
    #include <boost/date_time/gregorian/gregorian.hpp>
    #include <boost/multi_index_container.hpp>
    #include <boost/multi_index/ordered_index.hpp>
    #include <string>

    // some record class
    struct record
    {
      boost::gregorian::date d;
      std::string            str;
    };

    // extracts a record's year
    struct record_year
    {
      // result_type typedef required by Key Extractor concept
      typedef boost::gregorian::greg_year result_type;

      result_type operator()(const record& r)const // operator() must be const
      {
        return r.d.year();
      }
    };

    // example of use of the previous key extractor
    typedef multi_index_container<
      record,
      indexed_by<
        ordered_non_unique<record_year> // sorted by record's year
      >
    > record_log;
Example 6 in the examples section applies some user-defined key extractors in a complex scenario where keys are accessed via pointers.
In relational databases, composite keys depend on two or more fields of a given table.
The analogous concept in Boost.MultiIndex is modeled by means of
composite_key, as shown in the example:
    #include <boost/multi_index_container.hpp>
    #include <boost/multi_index/ordered_index.hpp>
    #include <boost/multi_index/member.hpp>
    #include <boost/multi_index/composite_key.hpp>
    #include <string>

    struct phonebook_entry
    {
      std::string family_name;
      std::string given_name;
      std::string phone_number;

      phonebook_entry(
        std::string family_name,
        std::string given_name,
        std::string phone_number):
        family_name(family_name),given_name(given_name),phone_number(phone_number)
      {}
    };

    // define a multi_index_container with a composite key on
    // (family_name,given_name)
    typedef multi_index_container<
      phonebook_entry,
      indexed_by<
        // non-unique as some subscribers might have more than one number
        ordered_non_unique<
          composite_key<
            phonebook_entry,
            member<phonebook_entry,std::string,&phonebook_entry::family_name>,
            member<phonebook_entry,std::string,&phonebook_entry::given_name>
          >
        >,
        ordered_unique< // unique as numbers belong to only one subscriber
          member<phonebook_entry,std::string,&phonebook_entry::phone_number>
        >
      >
    > phonebook;
composite_key accepts two or more key extractors on the same
value (here, phonebook_entry). Lookup operations on a composite
key are accomplished by passing tuples with the values searched:
    phonebook pb;
    ...

    // search for Dorothea White's number
    phonebook::iterator it=pb.find(std::make_tuple("White","Dorothea"));
    std::string number=it->phone_number;
Composite keys are sorted by lexicographical order, i.e. sorting is performed by the first key, then the second key if the first one is equal, etc. This order allows for partial searches where only the first keys are specified:
    phonebook pb;
    ...

    // look for all Whites
    std::pair<phonebook::iterator,phonebook::iterator> p=
      pb.equal_range(std::make_tuple("White"));
As a notational convenience, when only the first key is specified it is possible to pass the argument directly without including it into a tuple:
    phonebook pb;
    ...

    // look for all Whites
    std::pair<phonebook::iterator,phonebook::iterator> p=pb.equal_range("White");
On the other hand, partial searches without specifying the first keys are not allowed.
By default, the corresponding std::less predicate is used
for each subkey of a composite key. Alternate comparison predicates can
be specified with composite_key_compare:
    // phonebook with given names in reverse order
    typedef multi_index_container<
      phonebook_entry,
      indexed_by<
        ordered_non_unique<
          composite_key<
            phonebook_entry,
            member<phonebook_entry,std::string,&phonebook_entry::family_name>,
            member<phonebook_entry,std::string,&phonebook_entry::given_name>
          >,
          composite_key_compare<
            std::less<std::string>,   // family names sorted as by default
            std::greater<std::string> // given names reversed
          >
        >,
        ordered_unique<
          member<phonebook_entry,std::string,&phonebook_entry::phone_number>
        >
      >
    > phonebook;
See example 7 in the examples section
for an application of composite_key.
Composite keys can also be used with hashed indices in a straightforward manner:
    struct street_entry
    {
      // quadrant coordinates
      int x;
      int y;

      std::string name;

      street_entry(int x,int y,const std::string& name):x(x),y(y),name(name){}
    };

    typedef multi_index_container<
      street_entry,
      indexed_by<
        hashed_non_unique< // indexed by quadrant coordinates
          composite_key<
            street_entry,
            member<street_entry,int,&street_entry::x>,
            member<street_entry,int,&street_entry::y>
          >
        >,
        hashed_non_unique< // indexed by street name
          member<street_entry,std::string,&street_entry::name>
        >
      >
    > street_locator;

    street_locator sl;
    ...
    void streets_in_quadrant(int x,int y)
    {
      std::pair<street_locator::iterator,street_locator::iterator> p=
        sl.equal_range(std::make_tuple(x,y));

      while(p.first!=p.second){
        std::cout<<p.first->name<<std::endl;
        ++p.first;
      }
    }
Note that hashing is automatically taken care of: boost::hash is
specialized to hash a composite key as a function of the boost::hash
values of its elements. Should we need to specify different hash functions for the
elements of a composite key, we can explicitly do so by using the
composite_key_hash utility:
    struct tuned_int_hash
    {
      int operator()(int x)const
      {
        // specially tuned hash for this application
      }
    };

    typedef multi_index_container<
      street_entry,
      indexed_by<
        hashed_non_unique< // indexed by quadrant coordinates
          composite_key<
            street_entry,
            member<street_entry,int,&street_entry::x>,
            member<street_entry,int,&street_entry::y>
          >,
          composite_key_hash<
            tuned_int_hash,
            tuned_int_hash
          >
        >,
        hashed_non_unique< // indexed by street name
          member<street_entry,std::string,&street_entry::name>
        >
      >
    > street_locator;
Also, equality of composite keys can be tuned with
composite_key_equal_to,
though in most cases the default equality predicate (relying on
the std::equal_to instantiations for the element types)
will be the right choice.
Unlike with ordered indices, we cannot perform partial searches specifying only the first elements of a composite key:
    // try to locate streets in quadrants with x==0
    // compile-time error: hashed indices do not allow such operations
    std::pair<street_locator::iterator,street_locator::iterator> p=
      sl.equal_range(std::make_tuple(0));
The reason for this limitation is quite logical: as the hash value of a composite key depends on all of its elements, it is impossible to calculate it from partial information.
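The dependency on every element can be made concrete with a small standalone sketch. The mix and hash_xy functions below are hypothetical and use only the standard library; the mixing step is an assumption loosely modeled on the common hash-combine idiom, and it is not claimed to be the formula that Boost.MultiIndex or boost::hash actually applies.

```cpp
#include <cstddef>
#include <functional>

// Hypothetical mixing step: folds the hash of one more element into the
// running value. Shown only to illustrate the principle.
std::size_t mix(std::size_t seed,std::size_t h)
{
  return seed^(h+0x9e3779b9u+(seed<<6)+(seed>>2));
}

// Hash of a two-element key (x,y): every element contributes to the result.
std::size_t hash_xy(int x,int y)
{
  std::size_t seed=0;
  seed=mix(seed,std::hash<int>()(x));
  seed=mix(seed,std::hash<int>()(y));
  return seed;
}
```

Since the value of y is folded into the final result, knowing x alone does not determine the hash value under which an element is stored, so there is nothing a hashed index could look up from a partial key.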
The Key Extractor
concept allows the same object to extract keys from several different types,
possibly through suitably defined overloads of operator():
    // example of a name extractor from employee and employee *
    struct name_extractor
    {
      typedef std::string result_type;

      const result_type& operator()(const employee& e)const{return e.name;}
      result_type&       operator()(employee* e)const{return e->name;}
    };

    // name_extractor can handle elements of type employee...
    typedef multi_index_container<
      employee,
      indexed_by<
        ordered_unique<name_extractor>
      >
    > employee_set;

    // ...as well as elements of type employee *
    typedef multi_index_container<
      employee*,
      indexed_by<
        ordered_unique<name_extractor>
      >
    > employee_ptr_set;
This possibility is fully exploited by predefined key extractors provided
by Boost.MultiIndex, making it simpler to define multi_index_containers
where elements are pointers or references to the actual objects. The following
specifies a multi_index_container of pointers to employees sorted by their
names.
    typedef multi_index_container<
      employee *,
      indexed_by<
        ordered_non_unique<member<employee,std::string,&employee::name> >
      >
    > employee_set;
Note that this is specified in exactly the same manner as a multi_index_container
of actual employee objects: member takes care of the
extra dereferencing needed to gain access to employee::name. A similar
functionality is provided for interoperability with reference wrappers from
Boost.Ref:
    typedef multi_index_container<
      boost::reference_wrapper<const employee>,
      indexed_by<
        ordered_non_unique<member<employee,std::string,&employee::name> >
      >
    > employee_set;
In fact, support for pointers is further extended to accept what we call
chained pointers. Such a chained pointer is defined by induction as a raw or
smart pointer or iterator to the actual element, to a reference wrapper of the
element or to another chained pointer; that is, chained pointers are arbitrary
compositions of pointer-like types ultimately dereferencing
to the element from where the key is to be extracted. Examples of chained
pointers to employee are:
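employee *, const employee *, std::auto_ptr<employee>, std::list<boost::reference_wrapper<employee> >::iterator, employee ** and boost::shared_ptr<const employee *>.
In general, chained pointers with a dereferencing distance greater than 1 are unlikely to be used in a normal program, but they can arise in frameworks that construct "views" as multi_index_containers from preexisting
multi_index_containers.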
In order to present a short summary of the different usages of Boost.MultiIndex key extractors in the presence of reference wrappers and pointers, consider the following final type:
    struct T
    {
      int       i;
      const int j;
      int       f()const;
      int       g();
      static int gf(const T&);
      static int gg(T&);
    };
The table below lists the appropriate key extractors to be used for
different pointer and reference wrapper types based on T, for
each of its members.
| element type | key | key extractor | applicable to const elements? | read/write? |
|---|---|---|---|---|
| T | i | member<T,int,&T::i> | yes | yes |
| | j | member<T,const int,&T::j> | yes | no |
| | f() | const_mem_fun<T,int,&T::f> | yes | no |
| | g() | mem_fun<T,int,&T::g> | no | no |
| | gf() | global_fun<const T&,int,&T::gf> | yes | no |
| | gg() | global_fun<T&,int,&T::gg> | no | no |
| reference_wrapper<T> | i | member<T,int,&T::i> | yes | yes |
| | j | member<T,const int,&T::j> | yes | no |
| | f() | const_mem_fun<T,int,&T::f> | yes | no |
| | g() | mem_fun<T,int,&T::g> | yes | no |
| | gf() | global_fun<const T&,int,&T::gf> | yes | no |
| | gg() | global_fun<T&,int,&T::gg> | yes | no |
| reference_wrapper<const T> | i | member<T,const int,&T::i> | yes | no |
| | j | member<T,const int,&T::j> | yes | no |
| | f() | const_mem_fun<T,int,&T::f> | yes | no |
| | g() | | | |
| | gf() | global_fun<const T&,int,&T::gf> | yes | no |
| | gg() | | | |
| chained pointer to T or to reference_wrapper<T> | i | member<T,int,&T::i> | yes | yes |
| | j | member<T,const int,&T::j> | yes | no |
| | f() | const_mem_fun<T,int,&T::f> | yes | no |
| | g() | mem_fun<T,int,&T::g> | yes | no |
| | gf() | global_fun<const T&,int,&T::gf> | yes | no |
| | gg() | global_fun<T&,int,&T::gg> | yes | no |
| chained pointer to const T or to reference_wrapper<const T> | i | member<T,const int,&T::i> | yes | no |
| | j | member<T,const int,&T::j> | yes | no |
| | f() | const_mem_fun<T,int,&T::f> | yes | no |
| | g() | | | |
| | gf() | global_fun<const T&,int,&T::gf> | yes | no |
| | gg() | | | |
The column "applicable to const elements?" states whether the
corresponding key extractor can be used when passed constant elements (this
relates to the elements specified in the first column, not the referenced
T objects). The only negative cases are for T::g and
T::gg when the elements are raw T objects, which makes sense
as we are dealing with a non-constant member function (T::g)
and a function taking T by
non-constant reference (T::gg): this also implies that multi_index_containers
of elements of type T cannot be sorted by T::g or T::gg, because
elements contained within a multi_index_container are treated as constant.
The column "read/write?" shows which combinations yield read/write key extractors.
Some care has to be taken to preserve const-correctness in the
specification of member key extractors: in some sense, the const
qualifier is carried along to the member part, even if that particular
member is not defined as const. For instance, if the elements
are of type const T *, sorting by T::i is not
specified as member<const T,int,&T::i>, but rather as
member<T,const int,&T::i>.
For practical demonstrations of use of these key extractors, refer to example 2 and example 6 in the examples section.
Revised August 20th 2014
© Copyright 2003-2014 Joaquín M López Muñoz. Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)