Our work on adaptive sampling for classification problems has been accepted and will appear at IJCAI 2017 (International Joint Conference on Artificial Intelligence) in Melbourne. Read the pre-print version at http://www.ijcai.org/proceedings/2017/0457.pdf
Abstract
Learning from positive and unlabeled data frequently occurs in applications where only a subset of positive instances is available while the rest of the data are unlabeled. In such scenarios, often the goal is to create a discriminant model that can accurately classify both positive and negative data by modelling from labeled and unlabeled instances. In this study, we propose an adaptive sampling (AdaSampling) approach that utilises prediction probabilities from a model to iteratively update the training data. Starting with equal prior probabilities for all unlabeled data, our method “wraps” around a predictive model to iteratively update these probabilities to distinguish positive and negative instances in unlabeled data. Subsequently, one or more robust negative set(s) can be drawn from unlabeled data, according to the likelihood of each instance being negative, to train a single classification model or ensemble of models.
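
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the implementation from the paper: the nearest-centroid base model, the function names (`fit_centroids`, `prob_positive`, `ada_sampling`), the fixed iteration count, and sampling with replacement are all assumptions made to keep the example self-contained.

```python
import math
import random


def fit_centroids(positives, negatives):
    """Toy base model (an assumption): one mean vector per class."""
    def mean(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    return mean(positives), mean(negatives)


def prob_positive(model, x):
    """Score between 0 and 1: higher when x is closer to the positive centroid."""
    c_pos, c_neg = model
    d_pos = sum((a - b) ** 2 for a, b in zip(x, c_pos))
    d_neg = sum((a - b) ** 2 for a, b in zip(x, c_neg))
    z = max(-50.0, min(50.0, d_pos - d_neg))  # clamp to keep exp() finite
    return 1.0 / (1.0 + math.exp(z))


def ada_sampling(positives, unlabeled, n_iter=5, seed=0):
    """Return an estimated probability of being negative for each unlabeled row."""
    rng = random.Random(seed)
    # Equal prior probabilities for all unlabeled instances.
    p_neg = [0.5] * len(unlabeled)
    for _ in range(n_iter):
        # Draw a negative set from the unlabeled data, weighted by p_neg.
        # Sampling with replacement is an assumption of this sketch.
        weights = [max(p, 1e-9) for p in p_neg]
        negatives = rng.choices(unlabeled, weights=weights, k=len(unlabeled))
        # Refit the wrapped model and update the probabilities.
        model = fit_centroids(positives, negatives)
        p_neg = [1.0 - prob_positive(model, x) for x in unlabeled]
    return p_neg
```

Calling `ada_sampling(positives, unlabeled)` with two lists of feature vectors returns one probability per unlabeled row. A final negative set could then be drawn with these probabilities as weights and used to train a single classifier, or several sets could be drawn to train an ensemble.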