[AI Paper Review - 09] Unraveling Meta-Learning: Understanding Feature Representations for Few-Shot Tasks
[Unraveling Meta-Learning: Understanding Feature Representations for Few-Shot Tasks]
(M. Goldblum et al., ICML-2020)
This paper delves into the differences between the features learned by meta-learning and those learned by classical training.
The authors make a few observations to better understand why meta-learned representations improve few-shot classification. Based on these, they cast Reptile as a form of consensus optimization and improve its performance by adding a simple consensus penalty to each task's inner-loop objective.
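To make the consensus-penalty idea concrete, here is a minimal NumPy sketch on toy quadratic tasks. This is my own illustration, not the authors' implementation: the quadratic task loss, the hyperparameter values, and the variable names (theta_meta, rho, etc.) are all hypothetical stand-ins. Each inner-loop step adds the gradient of rho/2 * ||theta - theta_meta||^2, pulling the task parameters toward the meta-parameters, and the outer Reptile update then moves the meta-parameters toward the adapted ones.

import numpy as np

# Hypothetical illustration: Reptile with a consensus penalty
#   rho/2 * ||theta - theta_meta||^2
# added to each task's inner-loop objective. The quadratic task loss
# 1/2 * ||theta - target||^2 is a toy stand-in for a real few-shot loss.

rng = np.random.default_rng(0)
dim, inner_steps = 5, 10
inner_lr, meta_lr, rho = 0.1, 0.5, 0.1

theta_meta = np.zeros(dim)
for step in range(1000):
    target = rng.normal(size=dim)                   # sample a toy "task"
    theta = theta_meta.copy()
    for _ in range(inner_steps):
        task_grad = theta - target                  # grad of 1/2*||theta - target||^2
        penalty_grad = rho * (theta - theta_meta)   # consensus-penalty gradient
        theta -= inner_lr * (task_grad + penalty_grad)
    # Reptile outer update: move meta-parameters toward the adapted parameters
    theta_meta += meta_lr * (theta - theta_meta)

Note that the penalty is itself a proximal term anchored at the meta-parameters, which is what makes the connection to the next point natural.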
In fact, this reminds me of the implicit gradient method [Rajeswaran et al., 2019], where a proximal regularizer is introduced into the inner-loop optimization so that the outer-loop optimization does not require backpropagation through the inner loop.
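For the implicit gradient method itself, here is a small self-contained sketch on a quadratic inner loss (again my own toy setup, not the code from Rajeswaran et al.): because the inner problem carries the proximal term (lam/2) * ||theta - phi||^2, the implicit function theorem turns the meta-gradient into a single linear solve at the inner solution, with no unrolled backpropagation.

import numpy as np

# Hypothetical toy illustration of iMAML-style implicit gradients.
# Inner problem:  theta* = argmin_theta  L(theta) + (lam/2)*||theta - phi||^2
# Implicit function theorem:  d theta*/d phi = (I + (1/lam) * H)^{-1},
# where H is the Hessian of L at theta*, so the meta-gradient needs only
# theta* and one linear solve, not backprop through inner-loop steps.

dim, lam = 4, 1.0
A = 2.0 * np.eye(dim)          # inner loss L(theta) = 1/2 theta^T A theta - b^T theta
b = np.ones(dim)
phi = np.zeros(dim)            # meta-parameters

# closed-form inner solution from stationarity: (A + lam*I) theta = b + lam*phi
theta_star = np.linalg.solve(A + lam * np.eye(dim), b + lam * phi)

# toy outer loss 1/2 * ||theta* - t||^2 with an arbitrary target t
t = np.full(dim, 0.5)
g_outer = theta_star - t       # outer-loss gradient at theta*

# implicit meta-gradient: solve (I + H/lam) x = g_outer  (here H = A)
meta_grad = np.linalg.solve(np.eye(dim) + A / lam, g_outer)
phi = phi - 0.1 * meta_grad    # one outer-loop gradient step on phi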
Link to the paper ↓
https://arxiv.org/pdf/2002.06753.pdf