Theoretical Learning Goal Selection for Non-Communicative Multi-Agent Cooperation





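■Paper No.
■Number of pages	10 pages
■Publication date	2020/01/01
■Title (Japanese)	目的制限に基づく通信なしマルチエージェント協調行動学習とその効果の証明
■Title (English)	Theoretical Learning Goal Selection for Non-Communicative Multi-Agent Cooperation
■Authors (Japanese)	上野　史 (The University of Electro-Communications), 髙玉　圭樹 (The University of Electro-Communications)
■Authors (English)	Fumito Uwano (The University of Electro-Communications), Keiki Takadama (The University of Electro-Communications)
■Price	Member ¥550, Non-member ¥770
■Publication type	Transactions (single paper)
■Group	[C] Electronics, Information and Systems Society
■Journal	IEEJ Transactions on Electronics, Information and Systems, Vol.140 No.1 (2020), Special issue: Electronic Circuit-Related Technology
■Pages in journal	pp. 75-84
■Manuscript type	Paper / Japanese
■Link to electronic version	https://www.jstage.jst.go.jp/article/ieejeiss/140/1/140_75/_article/-char/ja/
■Keywords	reinforcement learning (強化学習), multi-agent system (マルチエージェントシステム), reward management (報酬設計)
■Abstract (Japanese)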
■Abstract (English)	This paper extends PMRL, a non-communicative and theoretically grounded method for two agents, and proposes PLA, a method that forces agents to learn cooperative behavior for any number of agents. In addition, the paper gives a theoretical explanation that, under PLA, all agents achieve all purposes without spending the maximum time. Concretely, PLA forces each agent to avoid the more difficult purposes, which take a long time to reach, by limiting the purposes it can achieve, and forces the agents to learn a cooperative policy by achieving the appropriate purpose among the limited purposes. The experimental results show that (1) PLA enables the agents to learn a cooperative policy in two grid world problems with three and five agents, and (2) PLA can force all agents to achieve all purposes in these problems in the minimum time.
■Page format	A4
