The Mallocator

The Mallocator

  • Comments 23

A common question from programmers who have an intermediate amount of experience with using the STL is, "How do I write an STL allocator?".  Writing an STL allocator is not especially difficult - only two member functions are interesting, allocate() and deallocate().  However, STL allocators must satisfy a number of other requirements (given by section 20.1.5 of the International Standard for C++, ISO/IEC 14882:2003), and the code to do so takes roughly 80 editor lines.  Figuring out the code from the requirements can be overwhelming, but once you see the code, it's easy.

 

One thing some programmers try is to derive from std::allocator.  I recommend against this; it's more trouble than it's worth.  Looking at std::allocator's implementation is also painful.

 

Therefore, I've written an example STL allocator, whose purpose in life is to wrap malloc() and free(), which I've imaginatively called Mallocator.  I've carefully implemented all of the integer overflow checks and so forth that would be required in real production code.  And I've exhaustively commented which parts of Mallocator are boilerplate (common to all, virtually all, or all stateless allocators), and which parts you would have to customize.  Hopefully, this should demystify the implementation of STL allocators:

 

 

C:\Temp>type mallocator.cpp

// The following headers are required for all allocators.

#include <stddef.h>  // Required for size_t and ptrdiff_t and NULL

#include <new>       // Required for placement new and std::bad_alloc

#include <stdexcept> // Required for std::length_error

 

// The following headers contain stuff that Mallocator uses.

#include <stdlib.h>  // For malloc() and free()

#include <iostream>  // For std::cout

#include <ostream>   // For std::endl

 

// The following headers contain stuff that main() uses.

#include <list>      // For std::list

 

template <typename T> class Mallocator {

public:

 

    // The following will be the same for virtually all allocators.

    typedef T * pointer;

    typedef const T * const_pointer;

    typedef T& reference;

    typedef const T& const_reference;

    typedef T value_type;

    typedef size_t size_type;

    typedef ptrdiff_t difference_type;

 

    T * address(T& r) const {

        return &r;

    }

 

    const T * address(const T& s) const {

        return &s;

    }

 

    size_t max_size() const {

        // The following has been carefully written to be independent of

        // the definition of size_t and to avoid signed/unsigned warnings.

        return (static_cast<size_t>(0) - static_cast<size_t>(1)) / sizeof(T);

    }

 

 

    // The following must be the same for all allocators.

    template <typename U> struct rebind {

        typedef Mallocator<U> other;

    };

 

    bool operator!=(const Mallocator& other) const {

        return !(*this == other);

    }

 

    void construct(T * const p, const T& t) const {

        void * const pv = static_cast<void *>(p);

 

        new (pv) T(t);

    }

 

    void destroy(T * const p) const; // Defined below.

 

 

    // Returns true if and only if storage allocated from *this

    // can be deallocated from other, and vice versa.

    // Always returns true for stateless allocators.

    bool operator==(const Mallocator& other) const {

        return true;

    }

 

 

    // Default constructor, copy constructor, rebinding constructor, and destructor.

    // Empty for stateless allocators.

    Mallocator() { }

 

    Mallocator(const Mallocator&) { }

 

    template <typename U> Mallocator(const Mallocator<U>&) { }

 

    ~Mallocator() { }

 

 

    // The following will be different for each allocator.

    T * allocate(const size_t n) const {

        // Mallocator prints a diagnostic message to demonstrate

        // what it's doing. Real allocators won't do this.

        std::cout << "Allocating " << n << (n == 1 ? " object" : "objects")

            << " of size " << sizeof(T) << "." << std::endl;

 

        // The return value of allocate(0) is unspecified.

        // Mallocator returns NULL in order to avoid depending

        // on malloc(0)'s implementation-defined behavior

        // (the implementation can define malloc(0) to return NULL,

        // in which case the bad_alloc check below would fire).

        // All allocators can return NULL in this case.

        if (n == 0) {

            return NULL;

        }

 

        // All allocators should contain an integer overflow check.

        // The Standardization Committee recommends that std::length_error

        // be thrown in the case of integer overflow.

        if (n > max_size()) {

            throw std::length_error("Mallocator<T>::allocate() - Integer overflow.");

        }

 

        // Mallocator wraps malloc().

        void * const pv = malloc(n * sizeof(T));

 

        // Allocators should throw std::bad_alloc in the case of memory allocation failure.

        if (pv == NULL) {

            throw std::bad_alloc();

        }

 

        return static_cast<T *>(pv);

    }

 

    void deallocate(T * const p, const size_t n) const {

        // Mallocator prints a diagnostic message to demonstrate

        // what it's doing. Real allocators won't do this.

        std::cout << "Deallocating " << n << (n == 1 ? " object" : "objects")

            << " of size " << sizeof(T) << "." << std::endl;

 

        // Mallocator wraps free().

        free(p);

    }

 

 

    // The following will be the same for all allocators that ignore hints.

    template <typename U> T * allocate(const size_t n, const U * /* const hint */) const {

        return allocate(n);

    }

 

 

    // Allocators are not required to be assignable, so

    // all allocators should have a private unimplemented

    // assignment operator. Note that this will trigger the

    // off-by-default (enabled under /Wall) warning C4626

    // "assignment operator could not be generated because a

    // base class assignment operator is inaccessible" within

    // the STL headers, but that warning is useless.

private:

    Mallocator& operator=(const Mallocator&);

};

 

// A compiler bug causes it to believe that p->~T() doesn't reference p.

 

#ifdef _MSC_VER

    #pragma warning(push)

    #pragma warning(disable: 4100) // unreferenced formal parameter

#endif

 

// The definition of destroy() must be the same for all allocators.

template <typename T> void Mallocator<T>::destroy(T * const p) const {

    p->~T();

}

 

#ifdef _MSC_VER

    #pragma warning(pop)

#endif

 

 

int main() {

    using namespace std;

 

    cout << "Constructing l:" << endl;

 

    list<int, Mallocator<int> > l;

 

    cout << endl << "l.push_back(1729):" << endl;

 

    l.push_back(1729);

 

    cout << endl << "l.push_back(2161):" << endl;

 

    l.push_back(2161);

 

    cout << endl;

 

    for (list<int, Mallocator<int> >::const_iterator i = l.begin(); i != l.end(); ++i) {

        cout << "Element: " << *i << endl;

    }

 

    cout << endl << "Destroying l:" << endl;

}

 

C:\Temp>cl /EHsc /nologo /W4 mallocator.cpp

mallocator.cpp

 

C:\Temp>mallocator

Constructing l:

Allocating 1 object of size 4.

Allocating 1 object of size 12.

 

l.push_back(1729):

Allocating 1 object of size 12.

 

l.push_back(2161):

Allocating 1 object of size 12.

 

Element: 1729

Element: 2161

 

Destroying l:

Deallocating 1 object of size 12.

Deallocating 1 object of size 12.

Deallocating 1 object of size 12.

Deallocating 1 object of size 4.

 

 

To explain this output:

 

The object of size 4 is the aux object

 

The objects of size 12 are doubly-linked list nodes.

 

In VC's implementation, default-constructed lists allocate a sentinel node, which is where end iterators point.  (End iterators have to be decrementable, so they can't point at NULL.)

 

 

Stephan T. Lavavej

Visual C++ Libraries Developer

  • PingBack from http://hoursfunnywallpaper.cn/?p=3624

  • why this:

    return (static_cast<size_t>(0) - static_cast<size_t>(1)) / sizeof(T);

    instead of this:

    return std::numeric_limits<std::size_t>::max() / sizeof(T);

    also, <cstddef>, <cstdlib>, and std::size_t etc should be used!

  • [phrosty]

    > why this: [...] instead of this: [...]

    That works (the only difference is that it's not an integral constant expression).

    Reading 4.7 [conv.integral]/2, static_cast<size_t>(-1) is also guaranteed to work. This is what our Standard Library implementation does (see <xmemory>).

    > also, <cstddef>, <cstdlib>, and std::size_t etc should be used!

    I used to be very careful about that. C++98 had a splendid dream wherein <cfoo> would declare everything within namespace std, and <foo.h> would include <cfoo> and then drag everything into the global namespace with using-declarations. (This is D.5 [depr.c.headers].)

    This was ignored by lots of implementers (some of which had very little control over the C Standard Library headers). So, C++0x has been changed to match reality. As of the N2723 Working Paper, http://open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2723.pdf , now <cfoo> is guaranteed to declare everything within namespace std, and may or may not declare things within the global namespace. <foo.h> is the opposite: it is guaranteed to declare everything within the global namespace, and may or may not declare things within namespace std.

    In reality and in C++0x, including <cfoo> is no safeguard against everything getting declared in the global namespace anyways. That's why I'm ceasing to bother with <cfoo>.

    This was Library Issue 456, http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#456 .

    (C++0x still deprecates the <foo.h> headers from the C Standard Library, which is hilarious.)

  • Nice info for begginers. Posted here http://jia3ep.blogspot.com/2008/08/mallocator.html

  • > One thing some programmers try is to derive from std::allocator.  I recommend against this; it's more trouble than it's worth.

    Why do you recommend against it? I've just written a stateful allocator and I've found it more appealing to derive from the standard allocator because it saved me from writing boiler-plate code and for my allocator I do like to rely on the standard, especially when it comes to the typedefs and construction/destruction (only allocate() and deallocate() need to be overwritten)

  • [klaus triendl]

    > Why do you recommend against it?

    People usually screw up, especially when it comes to rebinding. When they screw up, they usually find it hard to understand what went wrong. And when they succeed (or, when their bugs are subtle enough to go unnoticed for the time being), people usually find it hard to understand the finished product, since they can't see half of the code. This adds to the undeserved aura of mystery.

    If you wrote a stateful allocator, you would have had to reimplement not only allocate() and deallocate(), but also the ctor, copy ctor (unless the implicitly defined copy ctor did the right thing), dtor (unless the implicitly defined dtor did the right thing), rebinding ctor, rebind struct (see above), and equality test. Did you remember to do all of that? It's a lot harder to forget when you write an allocator from scratch to follow the requirements (or equivalently, look at each member of the Mallocator).

  • Small, clean and explained. Thanks!

    Posted on http://rcaloca.blogspot.com/2008/08/stl-allocators.html

  • [Stephan T. Lavavej]

    > If you wrote a stateful allocator, you would have had to reimplement not only allocate() and deallocate(), but also the ctor, copy ctor (unless the implicitly defined copy ctor did the right thing), dtor (unless the implicitly defined dtor did the right thing), rebinding ctor, rebind struct (see above), and equality test. Did you remember to do all of that? It's a lot harder to forget when you write an allocator from scratch to follow the requirements (or equivalently, look at each member of the Mallocator).

    Yes, you're right, it's easier to screw up. But as my allocator affects all our projects I tried to be really careful, tried to write it as generic as possible, read a lot about allocators and what the standard requires, not to mention my discovery that stateful allocators are not trivial at all, and tuple-checked my code whether I reimplemented all what you mentioned.

  • > constructed lists allocate a sentinel node

    Not too related to allocator I guess, but I was wondering what the VC implementation does for associative containers regarding end iterator implementation/or where it points to.

    Can it change after elements have been inserted/erased etc? Reason I ask is for some code that requires concurrent access and whether lock scoping should watch out for end iterator comparison?

  • I've had occasion to conduct many security assessments.  Blackbox/binary, whitebox/source, all languages managed or otherwise, for major financial, pure software development teams, or anybody else.  

    One thing is clear, the fundamentals in life is what matters, oddly, or fundamentally ;), the problem is the how deceptively simple an activity may be.  Something we take for granted all the time, like, using memory, or  adding and subtracting. How can this be so hard?

    I do have a particular fondness for this class of flaw, memory allocator flaws, quite an effective means for exploitation, as they would typically perform immediate consumption supplied payload's.

    This particular allocator, though quite capable in many cases, has some failure cases.  Aside from the high performance toll, where each list item is atomically requested (perhaps this is my C upbringing expecting sizeof(foo)*elements request to be executed, so this perf characteristic may well be by design for this class).  

    Having been bitten by the bad news bug, I'm not going to post the flaw in its entirety, even though this is not a shipping product, it's just too hard to explain the IT industry semantics to anybody who does not live it.  But this should demonstrate that it is real, though I'm sure post-regression, the team here will have a number of caveat's to re-assert() this blog's thesis, however from right _NOW_ until that point, were in the spooky, the famous and just plain nebulous, window of opportunity.

    Here's the repro; compiled by MSVC9,SP1, default settings for a console project, no matter debug, release, CPU (d)word size character encoding whatever, even large memory aware I tested all combinations no effect, other than the AV. My locallydefinedtype is totally simple, fixed length static array field.

    int _tmain(int argc, _TCHAR* argv[])

    {

    list<locallydefinedtype, Mallocator<locallydefinedtype> >::const_iterator it;

    try {

    list<locallydefinedtype, Mallocator<locallydefinedtype>> l(1);

    it = l.begin();

    memset((void*) it->a, 0x42, sizeof(it->a));

    } catch(...) { printf("catched\n"); }

    printf("value = %x", it->n);

    _getch();

    return(0);

    I'll say it again, this list allocator is quite capiable and I would say easially a 1 %'er of upper quality in what is written. I'll also restate, my perticular fondness for this catagory of flaw, personally I think we should move 1 farther from 0 to prevent this sort of issue ;)   These guy's just do not play well together.

    Here are the last few instructions at the time of this failure case;

    00ad94ce 8bc8            mov     ecx,eax

    00ad94d0 c1e010          shl     eax,10h

    00ad94d3 03c1            add     eax,ecx

    00ad94d5 8bca            mov     ecx,edx

    00ad94d7 83e203          and     edx,3

    00ad94da c1e902          shr     ecx,2

    00ad94dd 7406            je      functor!memset+0x65 (00ad94e5)

    00ad94df f3ab            rep stos dword ptr es:[edi]

    functor!memset+0x5f:

    00ad94dff3ab            rep stos dword ptr es:[edi]

    Here's the memory;

    00ca6fee 42 42 42 42 42 42 42 42 42 42  BBBBBBBBBB

    00ca6ff8 42 42 42 42 42 42 42 42 ?? ??  BBBBBBBB??

    That's all for now, I'll post the team the repro and I suppose there may be some changes. Even with /analyze, the only additional warning is " 1>c:\fs\test\functor\functor.cpp(227) : warning C6031: Return value ignored: '_getch'", that aint it...

    Anyhow,take care,

    Shane Macaulay

  • Just incase anybody's wondering, for sure that is the exact code, the reason why it's running out of the "functor.cpp" project at line 227 is I came to this blog post by pure chance while working on some TR1 implmented proxy classes that are thunking (and saving my butt) with a side-by-side hosted CLR app and some activation context issues I had punted for a while.. so for sure that is the exact code. I really should of editied that down more, but, you know, hindsight and all...

  • [e]

    > Not too related to allocator I guess, but I was

    > wondering what the VC implementation does for

    > associative containers regarding end iterator

    > implementation/or where it points to.

    > Can it change after elements have been

    > inserted/erased etc?

    That's a good question - you're asking about invalidation.

    C++03 23.1.2 [lib.associative.reqmts]/8: "The insert members shall not affect the validity of iterators and references to the container, and the erase members shall invalidate only iterators and references to the erased elements."

    std::list behaves identically, as it's also a node-based container (23.2.2.3 [lib.list.modifiers]/1,3).

    [ShaneK2]

    > Aside from the high performance toll, where each

    > list item is atomically requested (perhaps this is

    > my C upbringing expecting sizeof(foo)*elements

    > request to be executed, so this perf characteristic

    > may well be by design for this class).

    This doesn't make sense. By design, std::list is a node-based container, just like any linked list in C and C++. If you want a block-based container, you know where to find std::vector and std::deque.

    I chose std::list for my demonstration precisely because it performs multiple dynamic memory allocations.

    > Having been bitten by the bad news bug, I'm not

    > going to post the flaw in its entirety

    This isn't helpful. If my Mallocator contains a bug, which I don't believe is the case (*especially* given that it just wraps malloc() and free() and does nothing tricky), you need to demonstrate it with a self-contained repro.

    In fact, the bug almost certainly lies within your own code:

    > memset((void*) it->a, 0x42, sizeof(it->a));

    It appears that you intend to scribble over it->a with 0x42 bytes.  However, you've C casted it->a to void *.  This is almost certainly wrong.  The first argument to memset() should be &it->a, making a cast unnecessary.

    There is also a const-correctness violation here (as "it" is a const_iterator).

    Thanks,

    Stephan T. Lavavej, Visual C++ Libraries Developer

  • Stephan, perhaps it's best to take this offline.  Feel free to email me, or get in touch shane at theurldoman.com.  I don't like full disclosure of flaws of any kind until I'm sure the effect is limited.  

    Also the forum appears to be limiting my submission.

    In the middle of providing you a couple more examples and explore the edge cases where these failures occur,  I did start to get errors back from the build process.  The error, in your code however, was more a failure in the compiler than a syntactical lapse, in the age of dynamically/runtime emitted applications however,  it would be interesting to apply this condition in various forms in a CLR app. I eventually was able to get this error, and can send you the various versions, the silently failing ones and the one that generated this other failure.

    Please choose the Technical Support command on the Visual C++

    Help menu, or open the Technical Support help file for more information

    LINK : fatal error LNK1000

    ExceptionAddress         = 5E38FBBE (5E300000) "D:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin\c2.dll"

    I'll likely send the bug onto VC tech support if I do not hear back.  Taking your critique into account, I factored this a bit, should ensure less focus on unimportant details.  I'll leave any additional comments/responses I have for a direct email / designated support communiqué.

    All fields of the local are "unsigned char MEMBER[STATIC-SIZE];"

    struct locallydefinedtype {

    unsigned char a[2];

    ...

    unsigned char n[0xf];

    };

    int _tmain(int argc, _TCHAR* argv[])

    {

    try {

    int i=0;

    list<locallydefinedtype, Mallocator<locallydefinedtype> >::iterator it;

    locallydefinedtype *p=NULL,ValuOnStak;

    list<locallydefinedtype, Mallocator<locallydefinedtype>> l;

    // to get some heap allocations around the overwrite to ensure we see our overwrite  

    memset(&ValuOnStak, 0, sizeof(ValuOnStak));

    l.push_front(ValuOnStak);

    l.push_back(ValuOnStak);

    l.push_back(ValuOnStak);

    l.pop_back();

    l.push_front(ValuOnStak);

    it = l.begin();

    p = static_cast<locallydefinedtype *>(&*it);

    printf("p is @ %p\n", p);

    for(i=0; i < sizeof(p->a)-1; p->a[i++] = 'A') p->a[i]='\0';

    for(i=0; i < sizeof(p->b)-1; p->b[i++] = 'B') p->b[i]='\0';

    for(i=0; i < sizeof(p->c)-1; p->c[i++] = 'C') p->c[i]='\0';

    for(i=0; i < sizeof(p->d)-1; p->d[i++] = 'D') p->d[i]='\0';

    for(i=0; i < sizeof(p->e)-1; p->e[i++] = 'E') p->e[i]='\0';

    printf("%s %s %s %s %s", p->a, p->b, p->c, p->d, p->e);

    } catch(...) { printf("catched\n"); }

    printf("\nwere out\n");

    return(_getch());

    }

  • @Shane

    In your first example attempting to show a bug, the problem is that in the call to printf() just before the _getch() call you are dereferencing an iterator which has been invalidated (the list<> the iterator was pointing into was local to the try{} block).  The memset() call looks OK.

    I'm not sure what problem your second example is supposed to show - it compiles and runs OK on my set up (which has not been updated to SP1 yet).  From the LNK1000 error message, it looks like there's maybe a defect in the linker causing it to crash - that may or may not be because he linker is getting confused over some problem in the Mallocator<>, but it would require a bit more work to verify that it's not simply a linker bug unrelated to Mallocator<>.  But I wonder - is the code you posted the code that crashes the linker?  It seems too simple to show a linker defect by itself.

    I've tried your tests on several compilers (including non-Microsoft ones) and I haven't noticed a problem with the Mallocator<>.  However, so far only VC 9.0 crashes due to dereferencing the invalid iterator (which is what I consider the preferred behavior - at least for a non-optimized build).

  • [mikeb]

    > the list<> the iterator was pointing into was local to the try{} block

    Excellent catch! I totally missed that massive bug (which _HAS_ITERATOR_DEBUGGING would have detected).

    > The memset() call looks OK.

    Only because it->a happened to refer to an array, and the name of an array decays into a pointer to its first element. If it->a had referred to a non-array, then terrible badness would have ensued as I explained. Either way, casting to void * the first argument to memset() is exceedingly dangerous and buys you nothing.

Page 1 of 2 (23 items) 12