Stochastic gradient descent (SGD) has become the workhorse for solving many machine learning problems. Optimization error and sampling error are two competing factors that govern the statistical behavior of SGD. In this talk, we report a generalization analysis of SGD that accounts for the optimization and sampling errors simultaneously. We remove some restrictive assumptions in the literature and significantly improve the existing generalization bounds. Our results help to understand how to stop SGD early to achieve the best statistical behavior.
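
As a rough illustration of the trade-off between optimization and sampling errors (not the analysis or stopping rule from the talk), the sketch below runs SGD on a synthetic least-squares problem and stops when a held-out validation risk stops improving. The data, step size, and patience-based stopping rule are all hypothetical choices made only for this example.

```python
import numpy as np

# Minimal sketch: SGD for least-squares regression with validation-based
# early stopping. Everything here (data, step size, patience) is an
# illustrative assumption, not the criterion derived in the talk.
rng = np.random.default_rng(0)

# Synthetic data: y = X @ w_true + noise (all names are hypothetical).
n, d = 500, 20
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.5 * rng.normal(size=n)

# Hold out part of the sample to estimate the out-of-sample risk.
n_train = 400
X_tr, y_tr = X[:n_train], y[:n_train]
X_val, y_val = X[n_train:], y[n_train:]

def risk(w, A, b):
    """Average squared loss of parameter w on the sample (A, b)."""
    return 0.5 * np.mean((A @ w - b) ** 2)

w = np.zeros(d)
step = 0.01
best_val, best_w, best_t = np.inf, w.copy(), 0
patience, since_best = 50, 0

for t in range(1, 5001):
    # One SGD step on a single randomly drawn training example.
    i = rng.integers(n_train)
    grad = (X_tr[i] @ w - y_tr[i]) * X_tr[i]
    w -= step * grad

    # Track validation risk: it shrinks as the optimization error decreases
    # and may stop improving (or rise) once the sampling error dominates,
    # which is when we stop.
    val = risk(w, X_val, y_val)
    if val < best_val:
        best_val, best_w, best_t, since_best = val, w.copy(), t, 0
    else:
        since_best += 1
    if since_best >= patience:
        break

print(f"stopped at iteration {t}; best validation risk {best_val:.4f} at t={best_t}")
```

In this toy setup the stopping time is chosen from a validation set; the talk instead studies how the number of SGD iterations balances the two error sources directly in the generalization bounds.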