Tensor Storage

Introduction

A data structure is a particular way of organizing data in a computer so that it can be used effectively and stored efficiently. Tensor Storage predefined data structure for storing data in a efficient way but there is a trade off for storing data efficiently and time to access data. As complexity for storing data increases, accessing time increases as both of them directly proportional.

There are three types of storage :-

Dense Storage
Band Storage
Sparse Storage

Dense Storage

It store values in a contiguous sequential block of memory where all values are represented because of which they heavy on memory side. If we have large amount of non-zeroes or non-zero elements are much greater than number of zero elements, it is prefered to store in dense as there is no gain in using other storage type. As they are contiguous memory which can be cached and making operations faster than other containers.

Band Storage

It is a sparse whose non-zero entries are confined to a diagonal band, comprising the main diagonal and zero or more diagonals on either side. It is also stored similarly as sparse tensor.

Sparse Storage

It is a large data which has large amount of zeroes or zero elements are much greater than number of non-zero elements, it's faster to perform computation by iterating through the non-zero elements. They are compressed using various algorithms or data structure such as CSR, maps, etc.

Current Implementation

tensor_storage

It is an empty base class which is used to know if it is a tensor storage or not.

dense_storage

It is an empty class which is used to know if it is a dense tensor storage or not, which inherits from tensor_storage.

sparse_storage

It is an empty class which is used to know if it is a sparse tensor storage or not, which inherits from tensor_storage.

band_storage

It is an empty class which is used to know if it is a band tensor storage or not, which inherits from tensor_storage.

`compressed_map`

compressed_map is a sparse storage which uses std::unordered_map<const size_t, T> to store data. It stores data by mapping stride or index of data element with data element but it will ignore zeroes and only stores non-zero elements.

Prototype

template <typename T, typename A = std::allocator<std::pair<const ptrdiff_t, T>>>
struct compressed_map;

Example

lets say A = [1,1,0,0,0,0,0,0,0,2,3,4] // container size = 11
Now with compressed_map the storage looks like
C = [(0,1),(1,1),(9,2),(10,3),(11,4)] => ( idx, el ) // container size = 5

auto t1 = tensor<float,dynamic_extents<>,first_order, compressed_map<float> >(dynamic_extents<>{1,2,3,4},0.f);
auto t2 = tensor<float,dynamic_extents<>,first_order, compressed_map<float> >(dynamic_extents<>{1,2,3,4},4.f);
auto t3 = tensor<float,dynamic_extents<>,first_order, compressed_map<float> >(dynamic_extents<>{1,2,3,4});

t2[0] = 0;
t2[1] = 0;

t2.prune(); // prunes zeroes from the `std::unordered_map` after modifying the data

t3[0] = 1;

Conclusion from above code

t1 has the container size of 0
t2 had the container size of 24 but after replacing elements to zero and calling prune has the container size of 22
t3 previously had the container size of 0 but after replacing to 1 it has the container size of 1

Problems with compressed_map and Sparse Storage

std::unordered_map does not contain data member function which returns the pointer to the memory where it data is stored but it's not the with only this implementation but with Sparse Storage and Band Storage as they are in compressed form and can have noncontiguous memory which undermines or limitation of current tensor implementation. tensor heavily relies on pointer to the contiguous memory and uncompressed form of data for prod functions, algorithms, etc. There is also decrease in performance of tensor contractions or tensor operation which is a trade off between access time and memory.

`default_storage` (Dense Storage)

It is used to decide to which dense structure to use between std::array and std::vector structure according to the extents type for example static extents or dynamic extents. tensor has default storage type default_storage but if you other storage or user-defined storage you can simply pass your storage inheriting from three types dense_storage, sparse_storage or band_storage for better support.

Prototype

template <typename T, typename E, typename A, typename = void>
struct default_storage;

Custom Storage Types

First you have to decide in which category you storage belongs dense_storage, sparse_storage or band_storage.
Inherit the storage type for better support specially sparse and band storage
Implement your storage containing interface similar to std::vector
Just pass the struct or class to the tensor

Example

using namespace boost::numeric::ublas;
template<typename T>
struct custom_storage : storage::sparse_storage{/* Implementation */}

auto t = tensor< float, dynamic_extents<>, first_order, custom_storage<float> >{1,2,3,4};

Missing

To mitigate problems which is mentioned above in compressed_map and sparse storage we yet to come up with a interface similar to std::iterator which allow the access of data elements as sparse storage are compressed form of data which cannot be iterated directly by just getting pointer which current tensor exploits to do tensor operations but for sparse storage we have to employ some other interface which itself is a huge and time consuming and cannot be completed in GSoc period because we all have to agree and design a new interface for sparse storage or band storage which is time consuming and a huge task itself.

Acknowledgement

We both would like to thank our mentor Cem for his constant support and help in achieving our goals. We always find him helpful and he was always easy to reach for help or discussion regarding the work. We would also like to thank Google for the Google Summer of Code Programme, without which all these wouldn't be possible. Lastly, we express our gratitude to our parents for helping and providing us with all the indirect or direct needs for carrying out our work nicely in our homes.

Boost

Quick Links

Ashar's wiki

Home
Project Proposal
Milestones and Tasks
Implementation
- Commit History
- Diff Comparison
Documentation
Discussions
Examples
- Expression
- ublas::apply
Experimental features
- std::variant based Runtime Optimizer

Amit's wiki

Project Proposal
Milestones and Tasks
Implementation
- Commit History
- Diff Comparison
Documentation
Discussions
- Adding Design Policy and Design Improvements
Example Code
- Static Extents and Static Stride
- Slice and Subtensor

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tensor Storage

Introduction

There are three types of storage :-

Dense Storage

Band Storage

Sparse Storage

Current Implementation

`compressed_map`

`default_storage` (Dense Storage)

Custom Storage Types

Missing

Acknowledgement

Quick Links

Ashar's wiki

Amit's wiki

Clone this wiki locally

Tensor Storage

Introduction

There are three types of storage :-

Dense Storage

Band Storage

Sparse Storage

Current Implementation

compressed_map

default_storage (Dense Storage)

Custom Storage Types

Missing

Acknowledgement

Quick Links

Ashar's wiki

Amit's wiki

Clone this wiki locally

`compressed_map`

`default_storage` (Dense Storage)