Optimizer
- class optimizer.Adadelta(rho=0.9, epsilon=1e-06, *args, **kwargs)[source]
Bases:
optimizer.Optimizer
AdaDelta optimization algorithm
Update the parameters according to the rule
c = rho * c + (1. - rho) * gradient * gradient
update = gradient * sqrt(d + epsilon) / (sqrt(c) + epsilon)
parameter -= learning_rate * update
d = rho * d + (1. - rho) * update * update
- Parameters
rho (float (default=0.9)) – Decay factor
epsilon (float (default=1e-6)) – Small constant added for numerical stability (avoids division by zero)
*args (list) – Class specialization variables.
**kwargs (dict) – Class specialization variables.
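For illustration only, the rule above written as a standalone NumPy step; the function name adadelta_step and its explicit-state signature are hypothetical and not part of the optimizer module API.

import numpy as np

def adadelta_step(parameter, gradient, c, d, learning_rate, rho=0.9, epsilon=1e-6):
    # Hypothetical sketch of the AdaDelta rule documented above.
    c = rho * c + (1. - rho) * gradient * gradient                      # running average of squared gradients
    update = gradient * np.sqrt(d + epsilon) / (np.sqrt(c) + epsilon)   # rescaled gradient
    parameter = parameter - learning_rate * update
    d = rho * d + (1. - rho) * update * update                          # running average of squared updates
    return parameter, c, d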
- class optimizer.Adagrad(epsilon=1e-06, *args, **kwargs)[source]
Bases:
optimizer.Optimizer
Adagrad optimizer specialization
Update the parameters according to the rule
c += gradient * gradient
parameter -= learning_rate * gradient / (sqrt(c) + epsilon)
- Parameters
epsilon (float (default=1e-6)) – Small constant added for numerical stability (avoids division by zero)
*args (list) – Class specialization variables.
**kwargs (dict) – Class specialization variables.
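A minimal NumPy sketch of the same rule; adagrad_step is a hypothetical helper, not a method of the class.

import numpy as np

def adagrad_step(parameter, gradient, c, learning_rate, epsilon=1e-6):
    # Hypothetical sketch of the Adagrad rule documented above.
    c = c + gradient * gradient                                         # accumulated squared gradients
    parameter = parameter - learning_rate * gradient / (np.sqrt(c) + epsilon)
    return parameter, c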
- class optimizer.Adam(beta1=0.9, beta2=0.999, epsilon=1e-08, *args, **kwargs)[source]
Bases:
optimizer.Optimizer
Adam optimization algorithm
Update the parameters according to the rule
at = learning_rate * sqrt(1 - B2**iterations) / (1 - B1**iterations)
m = B1 * m + (1 - B1) * gradient
v = B2 * v + (1 - B2) * gradient * gradient
parameter -= at * m / (sqrt(v) + epsilon)
- Parameters
beta1 (float (default=0.9)) – B1 factor
beta2 (float (default=0.999)) – B2 factor
epsilon (float (default=1e-8)) – Small constant added for numerical stability (avoids division by zero)
*args (list) – Class specialization variables.
**kwargs (dict) – Class specialization variables.
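The bias-corrected step above, sketched in NumPy for clarity; adam_step and its signature are hypothetical, and iterations is assumed to start from 1.

import numpy as np

def adam_step(parameter, gradient, m, v, iterations, learning_rate,
              beta1=0.9, beta2=0.999, epsilon=1e-8):
    # Hypothetical sketch of the Adam rule documented above; iterations >= 1.
    at = learning_rate * np.sqrt(1. - beta2**iterations) / (1. - beta1**iterations)
    m = beta1 * m + (1. - beta1) * gradient                             # first moment estimate
    v = beta2 * v + (1. - beta2) * gradient * gradient                  # second moment estimate
    parameter = parameter - at * m / (np.sqrt(v) + epsilon)
    return parameter, m, v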
- class optimizer.Adamax(beta1=0.9, beta2=0.999, epsilon=1e-08, *args, **kwargs)[source]
Bases:
optimizer.Optimizer
Adamax optimization algorithm
Update the parameters according to the rule
at = learning_rate / (1 - B1**iterations)
m = B1 * m + (1 - B1) * gradient
v = max(B2 * v, abs(gradient))
parameter -= at * m / (v + epsilon)
- Parameters
beta1 (float (default=0.9)) – B1 factor
beta2 (float (default=0.999)) – B2 factor
epsilon (float (default=1e-8)) – Small constant added for numerical stability (avoids division by zero)
*args (list) – Class specialization variables.
**kwargs (dict) – Class specialization variables.
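Sketched in NumPy for comparison with Adam; adamax_step is hypothetical, and iterations is assumed to start from 1.

import numpy as np

def adamax_step(parameter, gradient, m, v, iterations, learning_rate,
                beta1=0.9, beta2=0.999, epsilon=1e-8):
    # Hypothetical sketch of the Adamax rule documented above; iterations >= 1.
    at = learning_rate / (1. - beta1**iterations)
    m = beta1 * m + (1. - beta1) * gradient                             # first moment estimate
    v = np.maximum(beta2 * v, np.abs(gradient))                         # infinity-norm accumulator
    parameter = parameter - at * m / (v + epsilon)
    return parameter, m, v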
- class optimizer.Momentum(momentum=0.9, *args, **kwargs)[source]
Bases:
optimizer.Optimizer
Stochastic Gradient Descent with Momentum specialization
Update the parameters according to the rule
v = momentum * v - learning_rate * gradient
parameter += v
- Parameters
momentum (float (default=0.9)) – Momentum value
*args (list) – Class specialization variables.
**kwargs (dict) – Class specialization variables.
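As a sketch, the velocity update written out explicitly; momentum_step is a hypothetical helper, not the class method, and works elementwise on NumPy arrays as well as plain floats.

def momentum_step(parameter, gradient, velocity, learning_rate, momentum=0.9):
    # Hypothetical sketch of the momentum rule documented above.
    velocity = momentum * velocity - learning_rate * gradient
    parameter = parameter + velocity
    return parameter, velocity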
- class optimizer.NesterovMomentum(momentum=0.9, *args, **kwargs)[source]
Bases:
optimizer.Optimizer
Stochastic Gradient Descent with Nesterov Momentum specialization
Update the parameters according to the rule
v = momentum * v - learning_rate * gradient
parameter += momentum * v - learning_rate * gradient
- Parameters
momentum (float (default=0.9)) – Momentum value
*args (list) – Class specialization variables.
**kwargs (dict) – Class specialization variables.
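The look-ahead correction is the only difference from the plain momentum sketch; nesterov_step is hypothetical.

def nesterov_step(parameter, gradient, velocity, learning_rate, momentum=0.9):
    # Hypothetical sketch of the Nesterov momentum rule documented above.
    velocity = momentum * velocity - learning_rate * gradient
    parameter = parameter + momentum * velocity - learning_rate * gradient  # look-ahead step
    return parameter, velocity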
- class optimizer.Optimizer(lr=0.001, decay=0.0, lr_min=0.0, lr_max=inf, *args, **kwargs)[source]
Bases:
object
Abstract base class for the optimizers
- Parameters
lr (float (default=1e-3)) – Learning rate value
decay (float (default=0.)) – Learning rate decay
lr_min (float (default=0.)) – Minimum of learning rate domain
lr_max (float (default=np.inf)) – Maximum of learning rate domain
*args (list) – Class specialization variables.
**kwargs (dict) – Class specialization variables.
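The base class only stores the learning-rate schedule parameters; how the decay is applied is not spelled out here. One common convention (an assumption for illustration, not the documented behaviour of this class) is an inverse-time decay clipped to the [lr_min, lr_max] domain:

import numpy as np

def decayed_lr(lr, iterations, decay=0., lr_min=0., lr_max=np.inf):
    # Assumption: inverse-time decay, then clipping to [lr_min, lr_max].
    # The actual schedule used by optimizer.Optimizer may differ.
    lr = lr / (1. + decay * iterations)
    return float(np.clip(lr, lr_min, lr_max))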
- class optimizer.RMSprop(rho=0.9, epsilon=1e-06, *args, **kwargs)[source]
Bases:
optimizer.Optimizer
RMSprop optimization algorithm
Update the parameters according to the rule
c = rho * c + (1. - rho) * gradient * gradient
parameter -= learning_rate * gradient / (sqrt(c) + epsilon)
- Parameters
rho (float (default=0.9)) – Decay factor
epsilon (float (default=1e-6)) – Small constant added for numerical stability (avoids division by zero)
*args (list) – Class specialization variables.
**kwargs (dict) – Class specialization variables.
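Sketched in NumPy: the same structure as Adagrad, except that the squared-gradient cache decays with rho; rmsprop_step is hypothetical.

import numpy as np

def rmsprop_step(parameter, gradient, c, learning_rate, rho=0.9, epsilon=1e-6):
    # Hypothetical sketch of the RMSprop rule documented above.
    c = rho * c + (1. - rho) * gradient * gradient                      # decaying average of squared gradients
    parameter = parameter - learning_rate * gradient / (np.sqrt(c) + epsilon)
    return parameter, c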
- class optimizer.SGD(*args, **kwargs)[source]
Bases:
optimizer.Optimizer
Stochastic Gradient Descent specialization
Update the parameters according to the rule
parameter -= learning_rate * gradient
- Parameters
*args (list) – Class specialization variables.
**kwargs (dict) – Class specialization variables.
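The plain rule needs no internal state; a one-line sketch (sgd_step is a hypothetical helper), used e.g. as w = sgd_step(w, grad_w, 1e-2).

def sgd_step(parameter, gradient, learning_rate):
    # Hypothetical sketch of the plain SGD rule documented above;
    # works elementwise on NumPy arrays as well as on plain floats.
    return parameter - learning_rate * gradient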