You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This is an implementation that use O(T) space for data, where T is number of transitions within DenseBlock, for the simple model where totalLayer L = 40 and growthRate k = 12, number of transition in each DenseBlock is 12.
In comparison, the original implementation will take O(T^2) amount of space for data.
This version currently runs (without dropout) 6 iters/second, and use space less than 2 GB on GPU for L=40,k=12 model.
The way the linear space is implemented is:
(i) setting TensorDescriptor for cudnn explicitly, allowing stride between different image. (That means initially data deployed was not continuous, our process will fill in the blank data in the middle.)
(ii) In the backward phase, we first use BN forward and ReLU forward to reconstruct the corresponding data for a transition. Then we apply the normal backward procedure.
How to use it:
clone the source
mkdir build
cmake .. (if you are using atlas) OR cmake .. -DBLAS=open (if you are using openblas)