How to calculate the number of parameters of an LSTM network?
Is there a way to calculate the total number of parameters in an LSTM network?
I have found an example, but I am not sure how correct it is or whether I have understood it properly.
For example, consider the following:
    from keras.models import Sequential
    from keras.layers import Dense, Dropout, Activation
    from keras.layers import Embedding
    from keras.layers import LSTM

    model = Sequential()
    # 256 LSTM units; each input sequence has 16 time steps of 4096 features
    model.add(LSTM(256, input_dim=4096, input_length=16))
    model.summary()
Output:

    ____________________________________________________________________________________________________
    Layer (type)                     Output Shape          Param #     Connected to
    ====================================================================================================
    lstm_1 (LSTM)                    (None, 256)           4457472     lstm_input_1[0][0]
    ====================================================================================================
    Total params: 4457472
    ____________________________________________________________________________________________________
As per my understanding, n is the input vector length and m is the number of time steps, and in this example they consider the number of hidden layers to be 1. Hence, according to the formula in the post,

    4(nm + n^2)

in my example m = 16, n = 4096, and num_of_units = 256:
4*((4096*16)+(4096*4096))*256 = 17246978048
Why is there such a difference? Did I misunderstand the example, or was the formula wrong?
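
To make the gap concrete, here is a minimal sketch that puts the two counts side by side. It assumes a standard LSTM layer with four gates, where each gate has an input weight matrix, a recurrent weight matrix, and a bias vector, and where the weights are shared across time steps (so the sequence length of 16 does not enter the count). The function names `lstm_param_count` and `formula_from_post` are hypothetical helpers written for this illustration, not part of Keras.

```python
def lstm_param_count(input_dim: int, units: int, use_bias: bool = True) -> int:
    """Count the weights of one standard LSTM layer (assumption: 4 gates)."""
    # Per gate: input weights + recurrent weights + optional bias
    per_gate = input_dim * units + units * units + (units if use_bias else 0)
    return 4 * per_gate


def formula_from_post(n: int, m: int, units: int) -> int:
    """Reproduce the calculation attempted in the question."""
    return 4 * (n * m + n * n) * units


keras_reported = 4457472  # value shown by model.summary() in the question

standard = lstm_param_count(input_dim=4096, units=256)
attempt = formula_from_post(n=4096, m=16, units=256)

print("standard 4-gate count:", standard)
print("matches model.summary():", standard == keras_reported)
print("count from the question:", attempt)
```

Under these assumptions the four-gate count is 4 * ((4096 + 256 + 1) * 256) = 4457472, which agrees with the summary output, while the calculation from the question gives 17246978048.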