# Practical Algorithms for On-line Sampling

@inproceedings{Domingo1998PracticalAF, title={Practical Algorithms for On-line Sampling}, author={Carlos Domingo and Ricard Gavald{\`a} and Osamu Watanabe}, booktitle={Discovery Science}, year={1998} }

One of the core applications of machine learning to knowledge discovery is building a hypothesis (such as a decision tree or a neural network) from a given amount of data, so that we can later use it to predict new instances of the data. In this paper, we focus on a particular situation where we assume that the hypothesis we want to use for prediction is a very simple one, so the hypothesis class is of feasible size. We study the problem of how to determine which of the hypotheses in the class is…
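The selection problem described above, picking a near-best hypothesis from a small class using randomly drawn examples, can be illustrated with a basic Hoeffding-bound sample size. This is a minimal sketch of the general technique, not the paper's actual algorithm; the function names and the example source are assumptions for illustration:

```python
import math
import random

def hoeffding_sample_size(epsilon, delta, num_hypotheses):
    # Number of samples so that, by Hoeffding's inequality plus a union
    # bound over the class, every empirical accuracy is within epsilon
    # of its true value with probability at least 1 - delta.
    return math.ceil(math.log(2 * num_hypotheses / delta) / (2 * epsilon ** 2))

def select_best(hypotheses, draw_example, epsilon=0.05, delta=0.05):
    # hypotheses: list of predictors h(x) -> label
    # draw_example: () -> (x, label), an on-line source of examples
    n = hoeffding_sample_size(epsilon, delta, len(hypotheses))
    correct = [0] * len(hypotheses)
    for _ in range(n):
        x, y = draw_example()
        for i, h in enumerate(hypotheses):
            if h(x) == y:
                correct[i] += 1
    # The empirically best hypothesis has true accuracy within
    # 2 * epsilon of the optimum, with probability at least 1 - delta.
    return max(range(len(hypotheses)), key=lambda i: correct[i])
```

For example, with three threshold classifiers and examples labeled by the middle one, `select_best` returns the index of the matching threshold with high probability. Note that this fixes the sample size in advance from a worst-case bound; the paper's on-line algorithms aim to do better by deciding adaptively when to stop.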

#### 17 Citations

On-Line Sampling Methods for Discovering Association Rules

- 1998

Association rule discovery is one of the prototypical problems in data mining. In this problem, the input database is assumed to be very large and most of the algorithms are designed to minimize the…

Adaptive Sampling Methods for Scaling Up Knowledge Discovery Algorithms

- Computer Science
- Discovery Science
- 1999

This paper proposes an adaptive sampling algorithm that solves a general problem covering many problems arising in applications of discovery science, and describes how different instantiations of it can be applied to scale up knowledge discovery problems that appear in several areas.

Adaptive Sampling Methods for Scaling Up Knowledge Discovery Algorithms

- Computer Science
- Data Mining and Knowledge Discovery
- 2004

This paper proposes an adaptive sampling method that solves a general problem covering many actual problems arising in applications of discovery science, and proves the correctness of the method and estimates its efficiency theoretically.

Scaling Up a Boosting-Based Learner via Adaptive Sampling

- Computer Science
- PAKDD
- 2000

This paper presents an experimental evaluation of a boosting-based learning system that can be run efficiently over a large dataset, and provides experimental evidence that the method is as accurate as the equivalent algorithm that uses the entire dataset, but much faster.

From Computational Learning Theory to Discovery Science

- Computer Science
- ICALP
- 1999

This paper points out that "adaptivity" is one of the important issues when considering applications of learning techniques, and proposes a learning algorithm with this feature.

Sequential sampling techniques for algorithmic learning theory

- Computer Science, Mathematics
- Theor. Comput. Sci.
- 2005

This paper presents two typical sequential sampling algorithms and explains when and how to use such sampling algorithms for designing adaptive learning algorithms.

Faster Near-Optimal Reinforcement Learning: Adding Adaptiveness to the E3 Algorithm

- Computer Science
- ALT
- 1999

This paper shows how the algorithm can be improved by replacing the exploration phase, which builds a model of the underlying Markov decision process by estimating the transition probabilities, with an adaptive sampling method more suitable for the problem.

How Can Computer Science Contribute to Knowledge Discovery?

- Computer Science
- SOFSEM
- 2001

This paper explains some random sampling techniques for speeding up learning algorithms and making them applicable to large data sets, and shows some algorithms obtained by using these techniques.

A sequential sampling algorithm for a general class of utility criteria

- Computer Science
- KDD '00
- 2000

A sampling algorithm that solves this problem by issuing a small number of database queries while guaranteeing precise bounds on confidence and quality of solutions, and shows that the algorithm works for all utilities that can be estimated with bounded error.

Finding the Critical Sampling Size of Big Datasets

- 2017

Big Data allied to the Internet of Things nowadays provides a powerful resource that various organizations are increasingly exploiting for applications ranging from decision support, predictive and…
