Nov. 19 Talk: Distributed Inference for Linear Support Vector Machine

Posted: 2019-11-18

Title: Distributed Inference for Linear Support Vector Machine

Venue and time: C207, Tuesday, November 19, 9:00 a.m.


Abstract:

The growing size of modern data brings many new challenges to existing statistical inference methodologies and theories, and calls for the development of distributed inferential approaches. This paper studies distributed inference for the linear support vector machine (SVM) in the binary classification task. Despite a vast literature on SVM, much less is known about its inferential properties, especially in a distributed setting. In this paper, we propose a multi-round distributed linear-type (MDL) estimator for conducting inference for linear SVM. The proposed estimator is computationally efficient: it requires only an initial SVM estimator, which it then successively refines by solving simple weighted least squares problems. Theoretically, we establish the Bahadur representation of the estimator. Based on this representation, we further derive its asymptotic normality, which shows that the MDL estimator achieves the optimal statistical efficiency, i.e., the same efficiency as the classical linear SVM applied to the entire dataset in a single-machine setup. Moreover, our asymptotic result avoids the condition on the number of machines or data batches that is commonly assumed in the distributed estimation literature, and it allows the case of diverging dimension. We provide simulation studies to demonstrate the performance of the proposed MDL estimator.
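The abstract does not spell out the weights or the update rule, so the following is only a minimal sketch of why a weighted least squares refinement suits a distributed setting; it is not the MDL estimator itself. Assuming each machine holds a batch of covariates `X`, working responses `z`, and weights `w` (all hypothetical names with synthetic values), every machine only needs to send a p-by-p matrix and a length-p vector, and the summed quantities give exactly the solution that the pooled data would give.

```python
import numpy as np

def local_summaries(X, z, w):
    """Per-machine summaries for weighted least squares (hypothetical helper)."""
    Xw = X * w[:, None]          # scale each row by its weight
    return Xw.T @ X, Xw.T @ z    # (p x p) weighted Gram matrix, (p,) moment vector

def aggregate_wls(batches):
    """Sum the local summaries and solve the pooled normal equations."""
    p = batches[0][0].shape[1]
    A = np.zeros((p, p))
    b = np.zeros(p)
    for X, z, w in batches:
        A_k, b_k = local_summaries(X, z, w)
        A += A_k
        b += b_k
    return np.linalg.solve(A, b)

# Synthetic data; the weights here are arbitrary, not those of the MDL estimator.
rng = np.random.default_rng(0)
n, p, K = 600, 4, 6
X = rng.normal(size=(n, p))
z = X @ np.arange(1.0, p + 1) + rng.normal(size=n)
w = rng.uniform(0.5, 2.0, size=n)

# Split the rows across K simulated machines.
batches = [(X[idx], z[idx], w[idx]) for idx in np.array_split(np.arange(n), K)]
beta_distributed = aggregate_wls(batches)

# Reference: the same weighted least squares solved on the full data at once.
Xw = X * w[:, None]
beta_pooled = np.linalg.solve(Xw.T @ X, Xw.T @ z)
```

In this sketch `beta_distributed` and `beta_pooled` agree up to floating-point error, regardless of how many batches the rows are split into. A multi-round scheme would repeat such a step with weights recomputed from the current estimate, which is the part the abstract leaves to the paper.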



About the speaker:

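Xiaozhou Wang (王小舟) is a Ph.D. student in statistics at the School of Mathematical Sciences, Shanghai Jiao Tong University. From 2011 to 2015, Wang studied Mathematics and Applied Mathematics at Zhiyuan College, Shanghai Jiao Tong University, and in 2015 entered the doctoral program directly under the supervision of Professor Weidong Liu (刘卫东). In the spring semester of the 2018-2019 academic year, Wang was a visiting student at the Stern School of Business, New York University. Wang's research areas include statistical inference, machine learning, and data analysis, with several papers published in the top machine learning journal Journal of Machine Learning Research and in leading statistics journals.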