Allocating Uninitialized Numeric Arrays

    Motivation

    • very large contiguous memory block of fundamentals (ints / doubles / …)
    • for direct overwriting, e.g. as a storage target for computation results or (file) input
    • preferably with the safety & convenience of std::vector
    std::vector<int> v (1'000'000'000);  // ≈4GB
    • vector value-initializes its underlying memory block
    • for fundamental types that means initialization with value 0
    • which can take many seconds for multi-gigabyte arrays!
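The zero-fill described above can be observed directly. The following is a minimal sketch; the function name `sized_vector_is_zeroed` is hypothetical and only serves as an illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if every element of a freshly sized vector<int> is zero.
// (hypothetical helper, not part of any library)
bool sized_vector_is_zeroed (std::size_t n) {
  std::vector<int> v (n);   // value-initializes n ints => writes n zeros
  assert(v.size() == n);
  return std::all_of(v.begin(), v.end(),
                     [](int x) { return x == 0; });
}
```

The pass over all n elements that produces these zeros is exactly the work the solutions below try to avoid.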

    Historical Note  C++98

    Before C++11, vector did not value-initialize its elements. Instead it copied a prototype value into each element, which was at least as slow as value-initialization.

    Solutions

    vector<T,default_init_allocator<T>>  (vector with allocator)

    • all the convenience of std::vector
    • allocator prevents value-initialization
    • vector can be accessed normally
    • no overhead (at least on higher optimization levels -O2 / -O3)
    • can be passed to functions that take span<T> parameters
    • .data() pointer can be passed to C-style functions that take T* parameters
    • can't be passed to functions that take vector<T> parameters

    This makes forced default initialization a pure allocation issue and decouples it from the numeric datatype. If you want to couple this property to the data type instead, consider vector<no_init<T>>.

    #include <memory>       // std::allocator, std::allocator_traits
    #include <new>          // placement new
    #include <type_traits>  // std::is_nothrow_default_constructible
    #include <utility>      // std::forward
    #include <vector>
    // allocator adaptor that interposes 'construct' calls
    // to convert value initialization into default initialization
    // by Casey Carter (@codercasey)
    template< typename T,
              typename Alloc = std::allocator<T> >
    class default_init_allocator : public Alloc
    {
      using a_t = std::allocator_traits<Alloc>;
    public:
      // obtain alloc<U> where U ≠ T
      template<typename U>
      struct rebind {
        using other = default_init_allocator<U,
          typename a_t::template rebind_alloc<U> >;
      };
      // make inherited ctors visible
      using Alloc::Alloc;
      // default-construct objects
      template<typename U>
      void construct (U* ptr)
        noexcept(std::is_nothrow_default_constructible<U>::value)
      { // 'placement new':
        ::new(static_cast<void*>(ptr)) U;
      }
      // construct with ctor arguments
      template<typename U, typename... Args>
      void construct (U* ptr, Args&&... args) {
        a_t::construct(
          static_cast<Alloc&>(*this),
          ptr, std::forward<Args>(args)...);
      }
    };

    void demo () {
      std::vector<int,default_init_allocator<int>> v;
      v.resize(1'000'000'000);  // fast - no init!
    }
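To illustrate the point that the `.data()` pointer can be handed to C-style functions taking `T*`, here is a small sketch. The alias `uninit_vector` and the functions `fill_squares` and `make_squares` are hypothetical names introduced only for this example; the allocator is a condensed copy of the adaptor shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// condensed copy of the allocator adaptor from the text
template<typename T, typename Alloc = std::allocator<T>>
class default_init_allocator : public Alloc {
  using a_t = std::allocator_traits<Alloc>;
public:
  template<typename U>
  struct rebind {
    using other = default_init_allocator<U,
      typename a_t::template rebind_alloc<U>>;
  };
  using Alloc::Alloc;
  template<typename U>
  void construct (U* ptr)
    noexcept(std::is_nothrow_default_constructible<U>::value) {
    ::new(static_cast<void*>(ptr)) U;   // default-init, no zeroing
  }
  template<typename U, typename... Args>
  void construct (U* ptr, Args&&... args) {
    a_t::construct(static_cast<Alloc&>(*this),
                   ptr, std::forward<Args>(args)...);
  }
};

// hypothetical convenience alias
template<typename T>
using uninit_vector = std::vector<T, default_init_allocator<T>>;

// hypothetical C-style function: writes n results through a raw pointer
void fill_squares (int* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = static_cast<int>(i * i);
}

uninit_vector<int> make_squares (std::size_t n) {
  uninit_vector<int> v;
  v.resize(n);                        // elements are not zero-filled
  fill_squares(v.data(), v.size());   // every element written before any read
  assert(v.size() == n);
  return v;
}
```

Note that the elements hold indeterminate values after `resize`, so each one should be written before it is read.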

    vector<no_init<T>>  C++11

    • all the convenience of std::vector
    • wrapper prevents value-initialization
    • vector can be accessed just like without the wrapper
    • no overhead (at least on higher optimization levels -O2 / -O3)
    • additional indirection potentially annoying for debugging
    • can't be passed to functions that take vector<T> or span<T> parameters
    • .data() pointer can't be passed to C-style functions that take T*

    This couples the initialization behavior to the numeric datatype and thus propagates it through interfaces (by mentioning the type no_init). If you don't want this, consider a vector with default_init_allocator.

    #include <type_traits>  // std::is_fundamental
    #include <vector>
    template<typename T>
    class no_init {
      static_assert(
        std::is_fundamental<T>::value, 
        "should be a fundamental type");
    public: 
      // constructor without initialization
      no_init () noexcept {}
      // implicit conversion T → no_init<T>
      constexpr  no_init (T value) noexcept: v_{value} {}
      // implicit conversion no_init<T> → T
      constexpr  operator T () const noexcept { return v_; }
    private:
      T v_;
    };
    
    void demo () {
      std::vector<no_init<int>> v;
      v.resize(1'000'000'000);  // fast - no init!
      v[1024] = 47;
      int j = v[1024];
      v.push_back(23);
    }
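The implicit conversions in both directions are what let the wrapped vector be used like a plain one. A minimal sketch follows; `sum_first_n` is a hypothetical function, and the `static_assert` states an assumption that holds on mainstream ABIs rather than a guarantee of the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// wrapper as shown in the text
template<typename T>
class no_init {
  static_assert(std::is_fundamental<T>::value,
                "should be a fundamental type");
public:
  no_init () noexcept {}                               // leaves v_ uninitialized
  constexpr no_init (T value) noexcept : v_{value} {}  // T -> no_init<T>
  constexpr operator T () const noexcept { return v_; } // no_init<T> -> T
private:
  T v_;
};

// assumption: the wrapper adds no storage on common platforms
static_assert(sizeof(no_init<int>) == sizeof(int),
              "wrapper is expected to be the same size as int");

// hypothetical example: fill with 1..n, then sum via implicit conversion
long long sum_first_n (std::size_t n) {
  std::vector<no_init<int>> v;
  v.resize(n);                         // no zero-fill
  for (std::size_t i = 0; i < n; ++i)
    v[i] = static_cast<int>(i + 1);    // int -> no_init<int>
  long long sum = 0;
  for (int x : v) sum += x;            // no_init<int> -> int
  return sum;
}
```

Even with identical size, `v.data()` has type `no_init<int>*`, so it still cannot be passed where an `int*` is expected.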

    make_unique_for_overwrite<T[]>(n)  C++20

    #include <memory>
    auto buf = std::make_unique_for_overwrite<T[]>(n);

    We can't use make_unique<T[]>(n), because that would value-initialize the allocated array.

    • returns a unique_ptr that does automatic cleanup
    • can be used with span<T> parameters and passed to C functions
    • need to track array size separately (e.g., with a span)
    • less safe & less convenient than vector<no_init<T>>

    Use a span to access the array and pass it around:

    #include <memory>
    #include <span>
    SampleStats statistics (Samples const& in) {
      // make uninitialized array
      auto buf = std::make_unique_for_overwrite<int[]>(in.size());
      // obtain view to it:
      std::span<int> results {buf.get(), in.size()};
      // do something with it
      gpu_statistics(in, results);
      prefix_sum(results);
      
    }  // memory automatically deallocated
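If the same code has to build with and without C++20 library support, one possible approach is to select the allocation call via the standard feature-test macro. This is only a sketch: `make_uninit_ints`, `write_squares` and `sum_of_squares` are hypothetical names, and a raw pointer plus size is used instead of a span so that the example does not depend on C++20.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// hypothetical helper: uninitialized int array, owned by a unique_ptr
std::unique_ptr<int[]> make_uninit_ints (std::size_t n) {
#ifdef __cpp_lib_smart_ptr_for_overwrite
  return std::make_unique_for_overwrite<int[]>(n);  // C++20 library
#else
  return std::unique_ptr<int[]>(new int[n]);        // same effect before C++20
#endif
}

// hypothetical computation that overwrites the whole buffer
void write_squares (int* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = static_cast<int>(i * i);
}

long long sum_of_squares (std::size_t n) {
  auto buf = make_uninit_ints(n);
  write_squares(buf.get(), n);       // write first ...
  long long sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += buf[i];                   // ... then read
  return sum;
}   // memory automatically deallocated
```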

    unique_ptr<T[]>(new T[n])  C++11

    #include <memory>
    auto buf = std::unique_ptr<T[]>{new T[n]};

    We can't use make_unique<T[]>(n), because that would value-initialize the allocated array.

    • unique_ptr does automatic cleanup
    • works as of C++11
    • need to track array size separately (e.g., with a span)
    • less safe & less convenient than vector<no_init<T>>

    Use a span to access the array and pass it around (std::span requires C++20; before that, a similar view type such as gsl::span can be used):

    #include <memory>
    #include <span>
    SampleStats statistics (Samples const& in) {
      // make uninitialized array
      auto buf = std::unique_ptr<int[]>{new int[in.size()]};
      // obtain view to it:
      std::span<int> results {buf.get(), in.size()};
      // do something with it
      gpu_statistics(in, results);
      prefix_sum(results);
      
    }  // memory automatically deallocated
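Where no span type is available, the separate size tracking mentioned above can be bundled with the pointer in a small struct. The type `uninit_buffer` and the function `fill_and_sum` below are hypothetical and meant as a C++11-compatible sketch, not as a replacement for a proper container.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// hypothetical helper: owns an uninitialized array and remembers its size
template<typename T>
struct uninit_buffer {
  std::unique_ptr<T[]> data;
  std::size_t size;

  explicit uninit_buffer (std::size_t n)
    : data(new T[n]), size(n) {}     // new T[n] default-initializes

  T& operator [] (std::size_t i) { assert(i < size); return data[i]; }
  T* begin () { return data.get(); }
  T* end ()   { return data.get() + size; }
};

// fills the buffer with 1, 2, ..., n and returns the sum
int fill_and_sum (std::size_t n) {
  uninit_buffer<int> buf (n);
  int next = 1;
  for (int& x : buf) x = next++;     // overwrite every element
  int sum = 0;
  for (int x : buf) sum += x;
  return sum;
}
```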

    new T[n]  (legacy compilers)  C++98

    T* buf = new T[n];
    
    // important! delete if not needed any more
    delete[] buf;

    • easy to forget to delete memory ⇒ leak-prone
    • error-prone and cumbersome separate tracking of array size
    • less safe & less convenient than vector<no_init<T>>
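The difference between the two initialization forms is easy to miss in array-new expressions: `new T[n]` default-initializes (no zeroing for fundamentals), whereas `new T[n]()` value-initializes. A small sketch with hypothetical function names, using raw `new`/`delete[]` only to illustrate the legacy form:

```cpp
#include <cassert>
#include <cstddef>

// value-initialized: the trailing () zero-fills the array
bool value_initialized_is_zero (std::size_t n) {
  int* zeroed = new int[n]();
  bool all_zero = true;
  for (std::size_t i = 0; i < n; ++i)
    if (zeroed[i] != 0) all_zero = false;
  delete[] zeroed;
  return all_zero;
}

// default-initialized: contents are indeterminate until written
int default_initialized_roundtrip (std::size_t n) {
  assert(n > 0);
  int* raw = new int[n];             // no zero-fill
  for (std::size_t i = 0; i < n; ++i)
    raw[i] = static_cast<int>(i);    // write before any read
  int last = raw[n - 1];
  delete[] raw;                      // must not be forgotten
  return last;
}
```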

    It's 2021: avoid raw operators new and delete in modern code bases! Only use them in implementations of memory managers like allocators.

    cudaMallocHost  CUDA

    • creates page-locked memory ⇒ faster copy to/from device
    • easy to forget to free memory ⇒ leak-prone
    • error-prone and cumbersome separate tracking of array size
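    #include <cuda_runtime.h>
    // (2^30 x 4B) = 4GiB
    const int n = 1 << 30;
    auto const size = n * sizeof(int);
    int* aHost = nullptr;
    // returns cudaSuccess if the page-locked block could be allocated
    cudaError_t err = cudaMallocHost((void**)&aHost, size);
    if (err != cudaSuccess) { /* handle allocation failure */ }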
    
    cudaFreeHost(aHost);