A distributed conjugate gradient method for solving quadratic loss function optimization problems
于洁;孟文辉;
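Abstract:
We propose an algorithm that applies the conjugate gradient method to optimize a quadratic loss function in a distributed setting. The algorithm updates the iterate using first-order derivative information of the local loss functions held on the worker machines and performs two rounds of communication per iteration; through this cooperation, the sum of the loss functions is minimized on the master machine. Theoretical analysis proves that the algorithm converges linearly. In a comparison with the distributed alternating direction method of multipliers (ADMM) on simulated datasets, the distributed conjugate gradient algorithm more closely matches centralized performance. The experiments also show that increasing the sample size on each worker machine both speeds up convergence and reduces the computational error.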
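The scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm as published; it is a textbook linear conjugate gradient iteration arranged so that each iteration uses two communication rounds and only local gradients. The local loss form f_i(x) = ||A_i x - b_i||^2 / (2n), the helper names (`make_workers`, `local_gradient`, `distributed_cg`), and the simulated data are all assumptions introduced for illustration. Workers are simulated in a single process, and "communication" is just averaging their replies.

```python
import numpy as np

def make_workers(m, n, d, seed=0):
    """Hypothetical simulated data: m workers, each holding n samples in d dimensions."""
    rng = np.random.default_rng(seed)
    x_true = rng.normal(size=d)
    workers = []
    for _ in range(m):
        A = rng.normal(size=(n, d))
        b = A @ x_true + 0.1 * rng.normal(size=n)
        workers.append((A, b))
    return workers

def local_gradient(A, b, x):
    """Gradient of the assumed local loss f_i(x) = ||A x - b||^2 / (2 n)."""
    return A.T @ (A @ x - b) / len(b)

def distributed_cg(workers, d, tol=1e-10, max_iter=None):
    """Conjugate gradient on the average of the local quadratic losses."""
    x = np.zeros(d)
    p = None
    g_sq_old = None
    for _ in range(max_iter or 2 * d):
        # Round 1: master broadcasts x, workers return local gradients.
        local_grads = [local_gradient(A, b, x) for A, b in workers]
        g = np.mean(local_grads, axis=0)
        g_sq = g @ g
        if np.sqrt(g_sq) < tol:
            break
        # Fletcher-Reeves update of the search direction on the master.
        p = -g if p is None else -g + (g_sq / g_sq_old) * p
        # Round 2: master broadcasts p; each worker returns a gradient
        # difference, which for a quadratic loss equals its Hessian times p.
        Hp = np.mean(
            [local_gradient(A, b, x + p) - gi
             for (A, b), gi in zip(workers, local_grads)],
            axis=0,
        )
        alpha = g_sq / (p @ Hp)  # exact line search for a quadratic
        x = x + alpha * p
        g_sq_old = g_sq
    return x

workers = make_workers(m=4, n=50, d=5)
x_dist = distributed_cg(workers, d=5)

# Centralized reference: least squares on the pooled data.
A_all = np.vstack([A for A, _ in workers])
b_all = np.concatenate([b for _, b in workers])
x_central = np.linalg.lstsq(A_all, b_all, rcond=None)[0]
```

Because every worker holds the same number of samples in this sketch, the average of the local losses has the same minimizer as the pooled least-squares problem, so `x_dist` can be compared directly with `x_central`.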
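Keywords: big data; distributed optimization; conjugate gradient method; quadratic loss function; linear convergence
Foundation: National Natural Science Foundation of China (11201373); Natural Science Foundation of the Education Department of Shaanxi Province (14JK747)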