CHAI Redesign
- Avoid complicated sets of configuration variables
- Modularize code for supporting various memory models (host, device, pinned, pageable, copy-hiding)
- Make it possible to use multiple memory models within the same build
- Make it easy for a project to specify allocators to be used
Possible approaches include:
- Use global state.
Advantages: Some projects already do this. Simple from CHAI's standpoint.
Disadvantages: Projects have to save the old global state at every entry point and then restore it, which may be non-trivial and error-prone. Thread safety is a concern. If an allocator is switched out at the wrong time, the program will crash.
Because this approach is so error-prone, I think global state should only be considered as a fallback option.
- Provide an allocator at construction and/or allocation calls.
Advantages: Some projects already do this. It is explicit which allocator is being used. clang-query and possibly other tools could be used to detect if some call sites are missing the allocator argument. Flexible.
Disadvantages: Error prone, though checkable. Verbose. The array class has to store which allocator was used.
Implementation concerns: Should the array fall back to some default if no allocator is provided?
- Build an allocator into the type (like std::vector).
Advantages: The project can create an alias for the array type with the allocator and then the allocator need not ever be mentioned again.
Disadvantages: Invasive change if the project is not already using an alias for the array type. More templates.
Implementation concerns: Should it be a template argument or just a class member (allocators would inherit from an interface)?
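As a rough illustration of the third approach (allocator built into the type), here is a minimal sketch using the template-argument variant. `HostAllocator`, `Array`, and `ProjectArray` are hypothetical names invented for this example and are not existing CHAI types. The array keeps the shallow-copy semantics discussed on this page, so memory is released with an explicit `free()`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical allocator used only for illustration; not a CHAI type.
struct HostAllocator {
  void* allocate(std::size_t bytes) { return std::malloc(bytes); }
  void deallocate(void* ptr) { std::free(ptr); }
};

// Hypothetical array with the allocator as a template argument (like std::vector).
// Copies are shallow, so memory is released explicitly with free().
template <class T, class Allocator = HostAllocator>
class Array {
public:
  Array() = default;

  explicit Array(std::size_t size, Allocator allocator = Allocator())
    : m_allocator(allocator),
      m_size(size),
      m_data(static_cast<T*>(m_allocator.allocate(size * sizeof(T))))
  {}

  void free() {
    m_allocator.deallocate(m_data);
    m_data = nullptr;
    m_size = 0;
  }

  std::size_t size() const { return m_size; }

  T& operator[](std::size_t i) const {
    assert(i < m_size);
    return m_data[i];
  }

private:
  // Declared in initialization order: the allocator must exist before m_data.
  Allocator m_allocator{};
  std::size_t m_size = 0;
  T* m_data = nullptr;
};

// The project names its allocator once, in a single alias.
template <class T>
using ProjectArray = Array<T, HostAllocator>;
```

With the alias in place, call sites such as `ProjectArray<int> a(100);` never mention the allocator, which is the main advantage listed above. The cost is that a project not already using an alias has to touch every declaration once.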
- Keep the same interface regardless of which memory model is used.
- Memory models need to be usable interchangeably (e.g. a sort algorithm should work whether we use pinned or unified memory); see the Liskov substitution principle. Ideally this would be done in a way that is easy to use and understand and won't result in template explosion.
Implementation concerns:
- Avoid inheritance in classes that will be captured by value into device lambdas (unless perhaps CRTP is being used)
- How should various memory models interact? Should you be able to create a new memory model from a different one?
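To make the CRTP remark concrete, here is a small sketch of static polymorphism without virtual functions. All names (`MemoryModel`, `PageableModel`, `PinnedModel`, `needsCopyBeforeKernel`) are hypothetical, and the device-accessibility values are illustrative assumptions, not statements about how CHAI treats these memory types. The point is only that the derived types have no vtable and stay trivially copyable, which is what matters for by-value capture.

```cpp
#include <cassert>
#include <type_traits>

// CRTP base: calls are forwarded to Derived at compile time, so no vtable is needed.
template <class Derived>
struct MemoryModel {
  const char* name() const {
    return static_cast<const Derived&>(*this).nameImpl();
  }
  bool deviceAccessible() const {
    return static_cast<const Derived&>(*this).deviceAccessibleImpl();
  }
};

// Hypothetical models; the returned values are assumptions for illustration.
struct PageableModel : MemoryModel<PageableModel> {
  const char* nameImpl() const { return "pageable"; }
  bool deviceAccessibleImpl() const { return false; }
};

struct PinnedModel : MemoryModel<PinnedModel> {
  const char* nameImpl() const { return "pinned"; }
  bool deviceAccessibleImpl() const { return true; }
};

// Generic code is written once against the CRTP base.
template <class Derived>
bool needsCopyBeforeKernel(const MemoryModel<Derived>& model) {
  return !model.deviceAccessible();
}

// No virtual functions, so the models remain trivially copyable.
static_assert(!std::is_polymorphic<PinnedModel>::value, "no vtable expected");
static_assert(std::is_trivially_copyable<PinnedModel>::value,
              "safe to copy by value");

inline void exampleUsage() {
  assert(needsCopyBeforeKernel(PageableModel{}));
  assert(!needsCopyBeforeKernel(PinnedModel{}));
}
```

The trade-off is that the model becomes part of every type that uses it, which pushes toward the template-heavy designs this page is trying to limit.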
Approaches
- Have a run time flag and basically a switch statement.
Advantages: Closest to the current approach.
Disadvantages: We'll need ifdefs depending on the build type and what features are available. The array class has to know about all possible memory models. Experimenting with new memory models is invasive. Does not follow the open-closed principle (https://en.wikipedia.org/wiki/Open%E2%80%93closed_principle).
- Have a compile time flag and basically a constexpr switch statement.
Advantages: Maybe slightly better performance and compile times?
Disadvantages: Same disadvantages as run time flag, but also makes it harder to use (more templates)
- Make a memory model interface class and have all the memory models inherit from it. The array class would contain a pointer to a memory model.
Advantages: Easy to add a new memory model. Can even share behavior with other memory models. Decoupled during compilation.
Disadvantages: Have to decide how to create the memory model. Could possibly end up with a complicated inheritance hierarchy. Possibly more includes. Performance slowdown from vtable lookups?
- Make an array class with the memory model as a template argument.
Advantages: The memory model is customizable as long as you meet the minimum interface requirements.
Disadvantages: Harder to write a new memory model unless you can easily find the interface requirements.
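One possible way to soften the disadvantage of the last approach is to write the interface requirements down as a trait and check it with a `static_assert`. The sketch below assumes a made-up minimum interface of `data()`, `size()`, and `release()`. The names `is_memory_model`, `HostMemoryModel`, and `Array` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>
#include <utility>

// Trait documenting the assumed minimum interface of a memory model:
//   void* data();  std::size_t size() const;  void release();
template <class M, class = void>
struct is_memory_model : std::false_type {};

template <class M>
struct is_memory_model<M, decltype(
    static_cast<void*>(std::declval<M&>().data()),
    static_cast<std::size_t>(std::declval<const M&>().size()),
    std::declval<M&>().release(),
    void())> : std::true_type {};

// Hypothetical host-only memory model; size() is in bytes.
class HostMemoryModel {
public:
  explicit HostMemoryModel(std::size_t bytes)
    : m_bytes(bytes), m_ptr(std::malloc(bytes)) {}
  void* data() { return m_ptr; }
  std::size_t size() const { return m_bytes; }
  void release() { std::free(m_ptr); m_ptr = nullptr; m_bytes = 0; }
private:
  std::size_t m_bytes = 0;
  void* m_ptr = nullptr;
};

// Array with the memory model as a template argument.
template <class T, class MemoryModel>
class Array {
  static_assert(is_memory_model<MemoryModel>::value,
                "MemoryModel must provide data(), size() const, and release()");
public:
  explicit Array(std::size_t count) : m_model(count * sizeof(T)) {}
  std::size_t size() const { return m_model.size() / sizeof(T); }
  T& operator[](std::size_t i) {
    assert(i < size());
    return static_cast<T*>(m_model.data())[i];
  }
  void free() { m_model.release(); }
private:
  MemoryModel m_model;
};
```

A type that misses part of the interface then fails with the `static_assert` message rather than a deep template error, and the trait doubles as documentation of the requirements.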
enum ExecutionSpace { CPU, GPU };
enum MemoryType { HOST, DEVICE, PINNED, PAGEABLE };
class MemoryManager {
};
///
/// Array class with pointer/reference semantics (shallow copies)
///
template <class T>
class PArray {
public:
PArray() = default;
PArray(size_t size); // Should it also take a memory manager?
PArray(const PArray& other);
void resize(size_t newSize); // Should it also take a memory manager?
void free();
T& operator[](size_t i);
private:
size_t m_size = 0;
T* m_data = nullptr;
MemoryManager* m_manager = nullptr;
};
How to use?
Using default memory manager:
chai::PArray<int> myarray(size);
Need to specify size, allocator, and memory type.
size_t size = 100000;
// Option 1
chai::PArray<int> myarray = chai::PArray<int><CopyHidingMemoryManager>(size);
// Option 2
chai::PArray<int> myarray = chai::makePArray<int, CopyHidingMemoryManager>(size, allocator);
// Option 3
chai::PArray<int, CopyHidingMemoryManager> myarray(size);
// Option 4
chai::PArray<int> myarray(new CopyHidingMemoryManager(size));
// Option 5
chai::PArray<int> myarray(size); // Uses default allocator and memory manager
chai::PArray<int> myarray(size, allocator); // Uses default memory manager
chai::PArray<int> myarray(size, allocator, manager);
chai::PArray<int> myarray(size, allocator, chai::getPinnedMemoryManager());
///
/// Array class with pointer/reference semantics (shallow copies)
///
template <class T>
class PArray {
public:
PArray() = default;
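explicit PArray(MemoryStrategy* memory_strategy) :
m_memory_strategy{memory_strategy}
{
if (m_memory_strategy) {
m_size = m_memory_strategy->size();
// Assumes data() returns an untyped pointer; the flag tells the strategy whether access is read-only
m_data = static_cast<T*>(m_memory_strategy->data(std::is_const<T>::value));
}
}
// Shallow copy: shares the strategy and re-queries it for the current size and data
PArray(const PArray& other) :
m_memory_strategy{other.m_memory_strategy}
{
if (m_memory_strategy) {
m_size = m_memory_strategy->size();
m_data = static_cast<T*>(m_memory_strategy->data(std::is_const<T>::value));
}
}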
void free() {
m_size = 0;
m_data = nullptr;
delete m_memory_strategy;
m_memory_strategy = nullptr;
}
std::size_t size() const {
return m_size;
}
T* data() const {
return m_data;
}
T& operator[](std::size_t i) const {
return m_data[i];
}
private:
std::size_t m_size = 0;
T* m_data = nullptr;
MemoryStrategy* m_memory_strategy = nullptr;
};
/// Usage
chai::PArray<int> myArray(new CopyHidingMemoryStrategy(10));
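The `PArray` above relies on a `MemoryStrategy` type that this page does not define. A minimal sketch of what it might look like follows, assuming only the two calls that `PArray` makes (`size()` and `data(bool)`). `HostMemoryStrategy` is a hypothetical host-only stand-in, not the copy-hiding strategy itself, and its two-argument constructor (element count and element size) is an assumption made so the untyped strategy knows how many bytes to allocate. The modified flag is a guess at why `data()` takes a const-ness argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Assumed interface: only size() and data(bool) appear in the PArray code.
class MemoryStrategy {
public:
  // Virtual destructor so that deleting through a MemoryStrategy* is well defined.
  virtual ~MemoryStrategy() = default;
  virtual std::size_t size() const = 0;
  virtual void* data(bool isConst) = 0;
};

// Hypothetical host-only strategy used as a stand-in for illustration.
class HostMemoryStrategy : public MemoryStrategy {
public:
  HostMemoryStrategy(std::size_t count, std::size_t elementSize)
    : m_count(count), m_data(std::malloc(count * elementSize)) {}

  ~HostMemoryStrategy() override { std::free(m_data); }

  // The strategy owns its buffer, so copying is disabled to avoid a double free.
  HostMemoryStrategy(const HostMemoryStrategy&) = delete;
  HostMemoryStrategy& operator=(const HostMemoryStrategy&) = delete;

  std::size_t size() const override { return m_count; }

  // Non-const access marks the data as modified (assumed purpose of the flag).
  void* data(bool isConst) override {
    if (!isConst) {
      m_modified = true;
    }
    return m_data;
  }

  bool modified() const { return m_modified; }

private:
  std::size_t m_count = 0;
  void* m_data = nullptr;
  bool m_modified = false;
};

inline void exampleUsage() {
  MemoryStrategy* strategy = new HostMemoryStrategy(10, sizeof(int));
  int* values = static_cast<int*>(strategy->data(false));
  values[0] = 42;
  assert(strategy->size() == 10);
  assert(values[0] == 42);
  delete strategy;  // Same cleanup path as PArray::free()
}
```

Because `PArray::free()` deletes the strategy through a base pointer, the virtual destructor is required in any design along these lines.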