Jerome Friedman’s paper, “On bias, variance, 0/1-loss, and the curse-of-dimensionality”, offers great insight into the way classification error behaves.

The paper throws light on the way bias and variance conspire to make some highly biased methods perform well on test data. Naive Bayes works, KNN works, and so do many other classifiers that carry substantial bias. The paper works out the actual math behind classification error and shows that the additive decomposition of bias and variance that holds for estimation error cannot be carried over to classification error. Instead there is a multiplicative effect, which the author calls ``boundary bias'', that lets a biased method perform well: as long as the bias does not push the probability estimate across the decision boundary, it does not increase the 0/1 loss. The paper also provides the right amount of background to explore Domingos's framework, which gives a clean decomposition of misclassification loss consistent with the usual notions of bias and variance.
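To make the boundary-bias idea concrete, here is a minimal simulation sketch (not from the paper; the estimator below is a made-up, deliberately miscalibrated one). The biased estimate is far from the true conditional probability in squared error, yet it crosses 0.5 at exactly the same point, so its 0/1 loss matches the Bayes rule's:

```python
import numpy as np

rng = np.random.default_rng(0)

# True class-1 probability along a 1-D feature: a smooth sigmoid.
x = rng.uniform(-3, 3, size=100_000)
p_true = 1 / (1 + np.exp(-2 * x))          # P(y=1 | x)
y = rng.uniform(size=x.size) < p_true      # labels sampled from p_true

# A deliberately biased estimate: squashed toward 0.5 (badly
# miscalibrated), but it crosses 0.5 at the same x as p_true does.
p_biased = 0.5 + 0.2 * np.tanh(x)          # hypothetical biased estimator

bayes_pred = p_true > 0.5
biased_pred = p_biased > 0.5

bayes_err = np.mean(bayes_pred != y)
biased_err = np.mean(biased_pred != y)

# Large estimation error, identical classification error: the bias
# never flips which side of the decision boundary the estimate is on.
print(f"mean squared estimation error: {np.mean((p_biased - p_true) ** 2):.4f}")
print(f"Bayes  0/1 error: {bayes_err:.4f}")
print(f"biased 0/1 error: {biased_err:.4f}")
```

Because `p_biased > 0.5` exactly when `p_true > 0.5` (both reduce to `x > 0`), the two classifiers make identical predictions, illustrating why squared-error bias need not hurt 0/1 loss.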

Links to some of the points:

Summary of Friedman’s paper