class SgdOptimizer(learning_rate=None)[source]

Stochastic gradient descent optimizer. With parameter x and gradient grad, the i-th update is

\[x_{i+1} = x_{i} - \eta \cdot grad\]
Parameters:

learning_rate (float) – learning rate
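
A minimal NumPy sketch of this update rule (illustration only, not the library's internal implementation; the input values are made up):

    import numpy as np

    def sgd_step(x, grad, learning_rate=0.01):
        # x_{i+1} = x_i - eta * grad
        return x - learning_rate * grad

    x = np.array([0.5, -0.3])
    x = sgd_step(x, np.array([0.1, 0.2]))  # -> [0.499, -0.302]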

class AdagradOptimizer(learning_rate=None, initial_accumulator_value=None, hessian_compression_times=1, warmup_steps=0, weight_decay_factor=0.0)[source]

Adagrad optimizer; see the paper at http://jmlr.org/papers/v12/duchi11a.html. With parameter x and gradient grad, the i-th update is

\[g_{i+1} = g_{i} + grad^2\]
\[x_{i+1} = x_{i} - \frac{\eta}{\sqrt{g_{i+1} + \epsilon}} \cdot grad\]
Parameters:
  • learning_rate (float) – learning rate

  • initial_accumulator_value (float) – initial value of the accumulator

  • hessian_compression_times (float) – during training, the accumulator is compressed with a Hessian sketching algorithm; 1 means no compression, and larger values give stronger compression

  • warmup_steps (int) – deprecated
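
A minimal NumPy sketch of the accumulator and parameter updates above (illustration only; the Hessian sketching compression controlled by hessian_compression_times is omitted):

    import numpy as np

    def adagrad_step(x, g_acc, grad, learning_rate=0.01, epsilon=1e-8):
        # g_{i+1} = g_i + grad^2
        g_acc = g_acc + grad ** 2
        # x_{i+1} = x_i - eta / sqrt(g_{i+1} + eps) * grad
        x = x - learning_rate * grad / np.sqrt(g_acc + epsilon)
        return x, g_acc

    x = np.zeros(2)
    g_acc = np.full(2, 0.1)  # initial_accumulator_value = 0.1
    x, g_acc = adagrad_step(x, g_acc, np.array([0.1, 0.2]))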

class AdamOptimizer(learning_rate=None, beta1=0.9, beta2=0.99, use_beta1_warmup=False, weight_decay_factor=0.0, use_nesterov=False, epsilon=0.01, warmup_steps=0)[source]

Adam optimizer; see the paper at https://arxiv.org/abs/1412.6980

With parameter x and gradient grad, the i-th update is

\[m_{i+1} = \beta_1 \cdot m_i + (1 - \beta_1) \cdot grad\]
\[v_{i+1} = \beta_2 \cdot v_i + (1 - \beta_2) \cdot grad^2\]
\[x_{i+1} = x_{i} - \eta \cdot \frac{m_{i+1}}{\sqrt{v_{i+1} + \epsilon}}\]
Parameters:
  • learning_rate (float) – learning rate

  • beta1 (float) – exponential decay rate for the first-moment estimate

  • beta2 (float) – exponential decay rate for the second-moment estimate

  • epsilon (float) – small offset that keeps the denominator from being zero

  • warmup_steps (int) – deprecated
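
A minimal NumPy sketch of the moment and parameter updates above (illustration only; the defaults mirror the class signature, and the paper's bias correction is omitted to match the displayed formulas):

    import numpy as np

    def adam_step(x, m, v, grad, learning_rate=0.01,
                  beta1=0.9, beta2=0.99, epsilon=0.01):
        # m_{i+1} = beta1 * m_i + (1 - beta1) * grad
        m = beta1 * m + (1 - beta1) * grad
        # v_{i+1} = beta2 * v_i + (1 - beta2) * grad^2
        v = beta2 * v + (1 - beta2) * grad ** 2
        # x_{i+1} = x_i - eta * m_{i+1} / sqrt(v_{i+1} + eps)
        x = x - learning_rate * m / np.sqrt(v + epsilon)
        return x, m, v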

class BatchSoftmaxOptimizer(learning_rate=None)[source]

Batch softmax optimizer; see the paper at https://research.google/pubs/pub48840/

Parameters:

learning_rate (float) – learning rate

class FtrlOptimizer(learning_rate=None, initial_accumulator_value=None, beta=None, warmup_steps=0, l1_regularization=None, l2_regularization=None)[source]

FTRL optimizer; see the paper at https://dl.acm.org/citation.cfm?id=2488200

Parameters:
  • initial_accumulator_value (float) – initial value of the accumulator

  • beta (float) – the beta value from the paper
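
For reference, a NumPy sketch of the per-coordinate FTRL-Proximal update from the paper (illustration only, not the library's implementation; here alpha plays the role of learning_rate, and l1/l2 correspond to l1_regularization/l2_regularization):

    import numpy as np

    def ftrl_step(w, z, n, grad, alpha=0.05, beta=1.0, l1=0.0, l2=0.0):
        n_new = n + grad ** 2
        sigma = (np.sqrt(n_new) - np.sqrt(n)) / alpha
        z = z + grad - sigma * w
        # closed-form per-coordinate solution; weights with |z| <= l1 stay at 0
        w = np.where(
            np.abs(z) <= l1,
            0.0,
            -(z - np.sign(z) * l1) / ((beta + np.sqrt(n_new)) / alpha + l2),
        )
        return w, z, n_new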

class ZerosInitializer[source]

Zeros initializer: sets the initial value of the embedding to all zeros

class ConstantsInitializer(constant)[source]

Constant initializer: sets the initial value of the embedding to a constant
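
The effect of the two initializers on a freshly created embedding vector, sketched with NumPy (the dimension and constant are made-up example values):

    import numpy as np

    dim = 8
    zeros_init = np.zeros(dim, dtype=np.float32)      # ZerosInitializer
    const_init = np.full(dim, 0.5, dtype=np.float32)  # ConstantsInitializer(0.5)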

class Fp16Compressor[source]

At model-serving time, embeddings are encoded in FP16 to reduce serving memory usage

class Fp32Compressor[source]

At model-serving time, embeddings are encoded in FP32
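
A NumPy sketch of the trade-off between the two compressors (illustration only): FP16 halves the serving memory of an FP32 table at the cost of a small rounding error, while FP32 keeps full precision.

    import numpy as np

    emb = np.random.randn(1024, 64).astype(np.float32)   # FP32 table: 262144 bytes
    fp16 = emb.astype(np.float16)                         # FP16 encoding: 131072 bytes
    err = np.max(np.abs(emb - fp16.astype(np.float32)))  # small rounding error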