PhD Defense
Jun Du
Active Learning with Generalized Queries
Time: 2:30 p.m.
Place: Middlesex College, Room 320
Supervisor: Dr. Charles Ling
Thesis Examiners: Dr. Sylvia Osborn, Dr. Nazim Madhavji
Extra-Departmental Examiner: Dr. Xiaoming Liu (Statistics)
External Examiner: Dr. Huajie Zhang (University of New Brunswick)
Abstract:
In contrast to supervised learning, active learning can usually
achieve the same predictive accuracy with far fewer labeled examples,
thus significantly reducing the labeling cost. However, previous
studies of active learning mostly assume that the learner can only ask
specific queries (i.e., request labels for specific examples with all
feature values given). For instance, to predict osteoarthritis from a
patient data set with 30 features, specific queries always contain
values for all 30 features, many of which may be irrelevant. A
more natural way is to ask generalized queries, such as "Are people
over age 50 with knee pain likely to have osteoarthritis?" (with only
two relevant features, age and type of pain). As one such generalized
query can often represent a set of specific ones, the corresponding
answer also applies to this whole set of specific queries.
Therefore, the active learner can obtain more information from each
generalized query, and consequently learn more effectively
and efficiently.
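
The idea that one generalized query stands for a set of specific ones can be illustrated with a minimal sketch. The toy feature space, the feature names, and the function covered_specific_queries below are hypothetical and are not taken from the thesis; the sketch only enumerates the fully specified examples that a partially specified query matches.

```python
from itertools import product

# Hypothetical toy feature space (an assumption, not from the thesis):
# three binary features instead of the 30 mentioned in the abstract.
FEATURES = {
    "age_over_50": [False, True],
    "knee_pain": [False, True],
    "smoker": [False, True],
}


def covered_specific_queries(generalized_query):
    """Return every fully specified example matching a generalized query.

    A generalized query fixes only some features; all unmentioned
    features are treated as "don't care".
    """
    names = list(FEATURES)
    matches = []
    for values in product(*(FEATURES[name] for name in names)):
        example = dict(zip(names, values))
        # Keep the example only if it agrees with every fixed feature.
        if all(example[k] == v for k, v in generalized_query.items()):
            matches.append(example)
    return matches


# "Are people over age 50 with knee pain likely to have osteoarthritis?"
query = {"age_over_50": True, "knee_pain": True}
covered = covered_specific_queries(query)
print(len(covered))  # number of specific queries this one query represents
```

In this toy space the query fixes two of three binary features, so a single answer from the oracle would label every matching specific example at once; with more unconstrained features the covered set grows accordingly.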
In this thesis, we assume that the oracle is capable of answering such generalized queries, and develop different algorithms to implement active learning with generalized queries for different real-world scenarios. The theoretical study proves that the query complexity of active learning with generalized queries is significantly lower than that of active learning with specific ones. The empirical study across a variety of scenarios also demonstrates that, to achieve a given predictive accuracy, active learning with generalized queries requires significantly fewer queries (or incurs significantly lower labeling cost) than active learning with specific ones.

