How do I calculate the number of parameters of an LSTM network?
Is there a way to calculate the total number of parameters in an LSTM network?
I found an example, but I'm not sure how correct it is or whether I understood it correctly.
For instance, consider the following:
    from keras.models import Sequential
    from keras.layers import Dense, Dropout, Activation
    from keras.layers import Embedding
    from keras.layers import LSTM

    model = Sequential()
    model.add(LSTM(256, input_dim=4096, input_length=16))
    model.summary()
Output

    ____________________________________________________________________________________________________
    Layer (type)                     Output Shape          Param #     Connected to
    ====================================================================================================
    lstm_1 (LSTM)                    (None, 256)           4457472     lstm_input_1[0][0]
    ====================================================================================================
    Total params: 4457472
    ____________________________________________________________________________________________________
As per my understanding, n is the input vector length and m is the number of time steps, and in this example they consider the number of hidden layers to be 1. Hence, according to the formula in the post, 4(nm + n^2), in my example m=16; n=4096; num_of_units=256:
4*((4096*16)+(4096*4096))*256 = 17246978048
Why is there such a difference? Did I misunderstand the example, or is the formula wrong?
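To cross-check the two numbers above, here is a minimal sketch in plain Python (no Keras needed). It assumes the standard LSTM layout: four gate blocks, each with an input kernel of shape (input_dim, units), a recurrent kernel of shape (units, units), and a bias of length units. The helper names `lstm_param_count` and `formula_from_post` are hypothetical and exist only for this illustration. Under that assumption, the count does not depend on the number of time steps at all.

```python
def lstm_param_count(input_dim, units):
    """Parameter count of one LSTM layer (hypothetical helper).

    Assumes 4 gate blocks, each with an input kernel, a recurrent
    kernel, and a bias vector.
    """
    per_gate = units * input_dim + units * units + units
    return 4 * per_gate


def formula_from_post(n, m, units):
    """The calculation exactly as I applied it in the question."""
    return 4 * (n * m + n ** 2) * units


print(lstm_param_count(input_dim=4096, units=256))   # 4457472
print(formula_from_post(n=4096, m=16, units=256))    # 17246978048
```

The first value matches what `model.summary()` reports, which suggests that the m and n in the formula may have to be read as the input dimension and the number of units rather than the number of time steps and the input length, and that a bias term has to be added for each gate.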