IEEJ Electronic Library
No Extended Summary is available.

■ Pages: 10



A Method of Learning Rate Range Control for Adam to Suppress Sudden Changes of Parameters in Early Learning Stage

■ Authors: Daiki Nameki (Graduate School of Information and Computer Science, Chiba Institute of Technology), Satoshi Yamaguchi (Department of Computer Science, Chiba Institute of Technology)
■ Price: Members ¥550 / Non-members ¥770
■ Publication type: Transactions (per-paper)
■ Group: [C] Electronics, Information and Systems Society
■ Journal: IEEJ Transactions on Electronics, Information and Systems, Vol.142, No.10 (2022), Special issue: Recent Advances in Technologies Related to Electronic Materials
■ Pages in journal: 1156-1165
■ Manuscript type: Paper / Japanese
■ Link to electronic version: https://www.jstage.jst.go.jp/article/ieejeiss/142/10/142_1156/_article/-char/ja/
■ Keywords: neural networks, optimization algorithms, Adam, AdaBound, RAdam, WarmUp
■ Abstract (English): Adam is one of the most widely used optimization algorithms for neural networks and accelerates convergence during training. It has, however, two problems. First, in applications to large-scale networks, the final performance of a network trained with Adam, such as its generalization ability, is worse than that of one trained with SGD. Second, the learning rate tends to be large at the early learning stage; as a result, network parameters such as weights and biases become too large within the first few iterations. In recent years, research has been conducted to solve these problems. AdaBound, a method that dynamically switches from Adam to SGD, has been proposed for the first problem. RAdam has been proposed for the second problem; it applies to Adam a method called WarmUp, which sets a small learning rate at the early learning stage and gradually increases it. In this study, we propose applying WarmUp to the upper limit of AdaBound's learning rate. The proposed algorithm prevents parameter updates with extremely large learning rates in the early learning stage, so more efficient learning can be expected than with the conventional methods. The proposed method has been applied to training several types of networks, including CNN, ResNet, DenseNet, and BERT. The results show that our method improves performance compared with the conventional methods, and in an image classification task it tends to be more effective on larger networks.
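The abstract's core idea can be sketched in a few lines: bound Adam's element-wise step size between AdaBound-style lower and upper limits, and apply a WarmUp factor to the upper limit so that early updates cannot be extremely large. A minimal NumPy sketch, assuming AdaBound's published bound schedules and a simple linear warmup; the paper's exact schedules and hyperparameters may differ, and all names here are illustrative:

```python
import numpy as np

def clipped_lr(step, final_lr=0.1, gamma=1e-3, warmup_steps=1000):
    """Lower/upper bounds on the element-wise step size, AdaBound-style,
    with a linear WarmUp factor applied to the upper bound only.
    The bound schedules follow the published AdaBound formulas; the
    concrete warmup shape is an illustrative assumption."""
    t = step + 1
    lower = final_lr * (1.0 - 1.0 / (gamma * t + 1.0))  # rises toward final_lr
    upper = final_lr * (1.0 + 1.0 / (gamma * t))        # falls toward final_lr
    warm = min(1.0, t / warmup_steps)                   # linear WarmUp in (0, 1]
    return lower, max(lower, upper * warm)              # keep the range valid

def adabound_warmup_step(param, grad, m, v, step,
                         base_lr=1e-3, betas=(0.9, 0.999), eps=1e-8,
                         **sched_kw):
    """One update: Adam's bias-corrected moments, then the element-wise
    learning rate clipped into the warmed-up AdaBound range."""
    b1, b2 = betas
    t = step + 1
    m = b1 * m + (1.0 - b1) * grad
    v = b2 * v + (1.0 - b2) * grad ** 2
    m_hat = m / (1.0 - b1 ** t)
    v_hat = v / (1.0 - b2 ** t)
    lower, upper = clipped_lr(step, **sched_kw)
    lr = np.clip(base_lr / (np.sqrt(v_hat) + eps), lower, upper)
    return param - lr * m_hat, m, v
```

Without the warmup factor, the upper bound is huge at the first step (roughly `final_lr / gamma`), which is exactly the early-stage blow-up the abstract describes; with it, early updates are capped near `final_lr * t / warmup_steps`, and both bounds converge to `final_lr` (the SGD-like regime) as training proceeds.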
■ Format: A4
©Contents Works Inc.