Talk title:

A Systematic View of Information-Based Optimal Subdata Selection: Algorithm Development, Performance Evaluation, and Application in Financial

Speaker:

Prof. Min Yang (杨敏), University of Illinois at Chicago

Time:

Venue:

Tencent Meeting ID: 486 210 464

Abstract:

With the urgent need to analyze extraordinarily large amounts of data, the information-based optimal subdata selection (IBOSS) approach has gained considerable attention in the recent literature because of its ability to retain much of the information in the full data. However, the framework has not been explored systematically, in particular the characterization of the optimal subset when the model is more complex than a first-order linear model. Motivated by a real finance case study concerning the impact of corporate attributes on firm value, we systematically explore the framework, which consists of the exact steps one can follow when employing the idea of IBOSS for data reduction. In the context of second-order models, we develop a novel algorithm for selecting informative subdata. We also provide a thorough evaluation of the performance of the proposed algorithm from the standpoints of both prediction and variable selection, the latter of which is important for complex models but has not been given enough attention in the IBOSS field. Empirical studies, including a real example, demonstrate that the new algorithm adequately addresses the trade-off between computational complexity and statistical efficiency, one of six core research directions for theoretical data science research proposed by the US National Science Foundation (NSF, 2016). The real case study demonstrates the potential impact of the IBOSS strategy in scientific fields beyond statistics. In particular, we note that finance, where speed is critically important, is a promising area for applications of IBOSS.