Abstract
The present teaching relates to predictive targeting. Training data comprising a plurality of data pairs are obtained. Each pair includes an ad opportunity context, corresponding to an ad served to a plurality of audiences, and a label vector having a plurality of labels, each of which indicates the reaction of a corresponding one of the audiences to the ad served in that ad opportunity context. Based on the training data, model parameters of a joint predictive model are learned via machine learning, starting from an initialized model with initial model parameters and minimizing a loss in an iterative process. The learned joint predictive model is to be used to map an input context of an ad opportunity to an output label vector having a plurality of probabilities, each of which predicts a likelihood of a reaction of a corresponding one of the audiences to the input context of the ad opportunity.
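The learning procedure summarized above can be illustrated with a minimal sketch. The abstract does not specify a model architecture, a loss function, or an optimizer, so everything concrete here is an assumption made for illustration: a linear model with one sigmoid output per audience, a binary cross-entropy loss summed over audiences, full-batch gradient descent, and a toy data set. The names `predict`, `mean_loss`, `train`, and `PAIRS` are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, biases, context):
    """Map one context feature vector to one probability per audience."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, context)) + b)
        for row, b in zip(weights, biases)
    ]


def mean_loss(weights, biases, pairs):
    """Binary cross-entropy summed over audiences, averaged over pairs (assumed loss)."""
    eps = 1e-12
    total = 0.0
    for context, labels in pairs:
        for p, y in zip(predict(weights, biases, context), labels):
            total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(pairs)


def train(pairs, num_features, num_audiences, lr=0.5, epochs=200, seed=0):
    """Learn parameters from an initialized model by iteratively reducing the loss."""
    rng = random.Random(seed)
    # Initialized model: small random weights, zero biases.
    weights = [
        [rng.uniform(-0.01, 0.01) for _ in range(num_features)]
        for _ in range(num_audiences)
    ]
    biases = [0.0] * num_audiences
    n = len(pairs)
    for _ in range(epochs):
        grad_w = [[0.0] * num_features for _ in range(num_audiences)]
        grad_b = [0.0] * num_audiences
        for context, labels in pairs:
            probs = predict(weights, biases, context)
            for a, (p, y) in enumerate(zip(probs, labels)):
                err = p - y  # gradient of cross-entropy w.r.t. the pre-sigmoid score
                grad_b[a] += err
                for j, x in enumerate(context):
                    grad_w[a][j] += err * x
        # One gradient-descent step on the averaged gradient.
        for a in range(num_audiences):
            biases[a] -= lr * grad_b[a] / n
            for j in range(num_features):
                weights[a][j] -= lr * grad_w[a][j] / n
    return weights, biases


# Toy training pairs: (ad opportunity context features, label vector with one
# label per audience). The values are made up for illustration only.
PAIRS = [
    ([1.0, 0.0], [1, 0]),
    ([0.0, 1.0], [0, 1]),
    ([1.0, 0.0], [1, 0]),
    ([0.0, 1.0], [0, 1]),
]

if __name__ == "__main__":
    w, b = train(PAIRS, num_features=2, num_audiences=2)
    print(predict(w, b, [1.0, 0.0]))  # one probability per audience
```

In this sketch each audience has its own weight row, and the model is joint only in the sense that all audiences are scored from the same input context under a single summed loss. A model with parameters shared across audiences would be an equally valid reading of the abstract.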