Specification and Simulation of Statistical Query
Algorithms for Efficiency and Noise Tolerance

Javed A. Aslam 
Department of Computer Science
Dartmouth College
Hanover, NH 03755
Scott E. Decatur y
Laboratory for Computer Science
Massachusetts Institute of Technology
Cambridge, MA 02139

Abstract
A recent innovation in computational learning theory is the statistical query (SQ)
model. The advantage of specifying learning algorithms in this model is that SQ
algorithms can be simulated in the PAC model, both in the absence and in the presence
of noise. However, simulations of SQ algorithms in the PAC model have nonoptimal
time and sample complexities. In this paper, we introduce a new method for specifying
statistical query algorithms based on a type of relative error and provide simulations
in the noisefree and noisetolerant PAC models which yield more efficient algorithms.
Requests for estimates of statistics in this new model take the form: ``Return an
estimate of the statistic within a 1 \Sigma  factor, or return `?', promising that the statistic
is less than `.'' In addition to showing that this is a very natural language for specifying
learning algorithms, we also show that this new specification is polynomially equivalent
to standard SQ, and thus, known learnability and hardness results for statistical query
learning are preserved.
We then give highly efficient PAC simulations of relative error SQ algorithms. We
show that the learning algorithms obtained by simulating efficient relative error SQ
algorithms in both the absence of noise and in the presence of malicious noise have
roughly optimal sample complexity. We also show that the simulation of efficient relative 
error SQ algorithms in the presence of classification noise yield learning algorithms
at least as efficient as those obtained through standard methods, and in some cases improved, 
roughly optimal results are achieved. The sample complexities for all of these
simulations are based on the d  metric which is a type of relative error metric useful
for quantities which are small or even zero. We show that uniform convergence with
respect to the d  metric yields ``uniform convergence'' with respect to (; `) accuracy.
Finally, while we show that many specific learning algorithms can be written as
highly efficient relative error SQ algorithms, we also show, in fact, that all SQ algorithms 
can be written efficiently by proving general upper bounds on the complexity
of (; `) queries as a function of the accuracy parameter ''. As a consequence of this
result, we give general upper bounds on the complexity of learning algorithms achieved
through the use of relative error SQ algorithms and the simulations described above.
References
[1] Dana Angluin and Leslie G. Valiant. Fast probabilistic algorithms for Hamiltonian
circuits and matchings. Journal of Computer and System Sciences, 18(2):155--193, April
1979.
[2] Javed Aslam and Scott Decatur. General bounds on statistical query learning and
PAC learning with noise via hypothesis boosting. In Proceedings of the 34 th Annual
Symposium on Foundations of Computer Science, pages 282--291, November 1993. To
appear in Information and Computation.
[3] Javed Aslam and Scott Decatur. On the sample complexity of noisetolerant learning.
Information Processing Letters, 57:189--195, 1996.
[4] Avrim Blum, Merrick Furst, Jeffery Jackson, Michael Kearns, Yishay Mansour, and
Steven Rudich. Weakly learning DNF and characterizing statistical query learning using
Fourier analysis. In Proceedings of the 26 th Annual ACM Symposium on the Theory of
Computing, May 1994.
[5] Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, and Manfred K. Warmuth.
Learnability and the VapnikChervonenkis dimension. Journal of the ACM, 36(4):929--
865, 1989.
[6] Scott Decatur. Statistical queries and faulty PAC oracles. In Proceedings of the Sixth An
nual ACM Workshop on Computational Learning Theory, pages 262--268. ACM Press,
July 1993.
[7] Scott Decatur. Learning in hybrid noise environments using statistical queries. In
D. Fisher and H. J. Lenz, editors, Learning from Data: Artificial Intelligence and Statistics 
V. Springer Verlag, 1996. Preliminary version appeared in Proceedings of the Fifth
International Workshop on Artificial Intelligence and Statistics, pages 175--185, January
1995.
[8] Scott Decatur and Rosario Gennaro. On learning from noisy and incomplete examples.
In Proceedings of the Eigth Annual ACM Workshop on Computational Learning Theory.
ACM Press, July 1995.
[9] Andrzej Ehrenfeucht, David Haussler, Michael Kearns, and Leslie Valiant. A general
lower bound on the number of examples needed for learning. Information and Computation, 
82(3):247--251, September 1989.
[10] Yoav Freund. Boosting a weak learning algorithm by majority. In Proceedings of the
Third Annual Workshop on Computational Learning Theory, pages 202--216. Morgan
Kaufmann, 1990.
[11] Yoav Freund. An improved boosting algorithm and its implications on learning complexity. 
In Proceedings of the Fifth Annual ACM Workshop on Computational Learning
Theory, pages 391--398. ACM Press, 1992.
[12] David Haussler. Quantifying inductive bias: AI learning algorithms and Valiant's learning 
framework. Artificial Intelligence, 36:177--221, 1988.
[13] David Haussler. Decision theoretic generalizations of the PAC model for neural net and
other learning applications. Information and Computation, 100:78--150, 1992.
[14] Michael Kearns. Efficient noisetolerant learning from statistical queries. In Proceedings
of the 25 th Annual ACM Symposium on the Theory of Computing, pages 392--401, San
Diego, 1993.
[15] Michael Kearns and Ming Li. Learning in the presence of malicious errors. SIAM
Journal on Computing, 22(4):807--837, 1993.
[16] Philip D. Laird. Learning from Good and Bad Data. Kluwer international series in
engineering and computer science. Kluwer Academic Publishers, Boston, 1988.
[17] D. Pollard. Rates of uniform almostsure convergence for empirical processes indexed
by unbounded classes of functions. Manuscript, 1986.
[18] Robert Schapire. The strength of weak learnability. Machine Learning, 5(2):197--226,
1990.
[19] Leslie Valiant. A theory of the learnable. Communications of the ACM, 27(11):1134--
1142, November 1984.