Code for "Linking losses for density ratio and class-probability estimation" (ICML 2016)

This MATLAB code aims to replicate the tables of results and figures from the paper "Linking losses for density ratio and class-probability estimation", appearing in ICML 2016.

Unzipping the code should reveal four subfolders: weight_function, covariate_shift, rtb, and helper. We describe below how to run the experiments for each of Sections 8.1 -- 8.3.

Weight function analysis

For the weight function analysis, in the weight_function folder, simply run:

>> loss_regret_script;

You should see output such as:

>> loss_regret_script;
reg = 0.3614 [lambda = 10^-8, gamma = 0; 1.8 secs]
max regret = 0.3614 [gamma = 0, lambda = 10^-8]

reg = 0.3821 [lambda = 10^-8, gamma = 0; 1.5 secs]
max regret = 0.3821 [gamma = 0, lambda = 10^-8]

reg = 0.5460 [lambda = 10^-8, gamma = 0; 1.1 secs]
max regret = 0.5460 [gamma = 0, lambda = 10^-8]

A plot mimicking Figure 1 of the paper should also be displayed.
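The output above follows the pattern of a simple grid search: sweep a grid of (lambda, gamma) pairs, record the regret at each, and report the maximiser. A minimal sketch of that pattern, where loss_regret_fn and the grid values are illustrative placeholders rather than names taken from the actual script:

% A sketch only: loss_regret_fn is a hypothetical helper, and the
% grids below are illustrative, not those used by loss_regret_script.
lambdas = 10.^(-8:2:0);
gammas  = 0:0.25:1;
maxReg  = -Inf;
for lam = lambdas
    for gam = gammas
        reg = loss_regret_fn(lam, gam);   % evaluate regret at this pair
        if reg > maxReg
            maxReg = reg;
            best   = [gam, lam];          % track the maximiser
        end
    end
end
fprintf('max regret = %.4f [gamma = %g, lambda = 10^%g]\n', ...
    maxReg, best(1), log10(best(2)));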

Covariate shift adaptation

For the covariate shift experiments on the poly dataset, in the covariate_shift folder, simply run:

>> poly_script;

The script will go through each of the losses considered in Section 8.2, and train a kernel model to estimate the density ratio. The NMSE on the test sample is reported; a sketch of this computation appears after the sample output below. You should see output that mimics Table 2(a), such as:

Uniform & 1.2723 $\pm$ 0.0302 \\
KLIEP & 0.6916 $\pm$ 0.0136 \\
LSIF & 0.7742 $\pm$ 0.0217 \\
uLSIF & 0.7038 $\pm$ 0.0102 \\
...
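The scripts compute the NMSE internally; for reference, below is a minimal sketch assuming the standard definition from the importance-estimation literature, in which the estimated and true ratios are each rescaled to sum to one. The function name and the clipping of negative estimates are our assumptions, not taken from the scripts.

function v = nmse(rhat, rtrue)
%NMSE Normalised mean squared error between estimated and true density
% ratios, each rescaled to sum to one. A sketch of the standard
% definition; the scripts' exact convention may differ.
rhat = max(rhat, 0);   % clip negative ratio estimates (an assumption)
v = mean((rhat ./ sum(rhat) - rtrue ./ sum(rtrue)).^2);
end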


For the experiments on the amazon dataset, in the covariate_shift folder, simply run:

>> amazon_script;

The script will go through each of the losses considered in Section 8.2, and train a kernel model to estimate the density ratio. The pairwise disagreement on the test sample is reported; a sketch of one common definition appears after the sample output below. After the feature mappings are generated (TF-IDF followed by SVD projection), you should see output that mimics Table 2(b), such as:

generating data trial #doing svd...done
...
Uniform & 0.1582 $\pm$ 0.0018
KLIEP & 0.1500 $\pm$ 0.0018
LSIF & 0.1500 $\pm$ 0.0019
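As with the NMSE, the scripts compute the pairwise disagreement internally. The sketch below shows one common definition: the fraction of label-ordered test pairs whose scores are mis-ranked, with ties counted as half an error. The function name and the tie-handling are assumptions; the script's exact convention may differ.

function d = pairwise_disagreement(scores, y)
%PAIRWISE_DISAGREEMENT Fraction of pairs with y(i) > y(j) whose scores
% are mis-ordered, counting tied scores as half an error. A sketch of
% one common definition only.
[yi, yj] = meshgrid(y, y);            % all pairs of labels
[si, sj] = meshgrid(scores, scores);  % all pairs of scores
ordered  = yi > yj;                   % pairs the labels strictly order
wrong    = (si < sj) + 0.5 * (si == sj);
d = sum(wrong(ordered)) / nnz(ordered);
end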

Note that the file amazon.mat contains the processed Amazon data as provided here.

Ranking the best

For the ranking the best experiments, in the rtb folder, simply run:

>> rtb_script;

The command window will then fill with the results of cross-validating and training each of the methods on each of the datasets, taking each dataset and then each method in turn. For each train-test split, the script outputs the performance of the method according to all the performance criteria listed in Appendix H. Sample output:

= Dataset german [n = 1000, d = 24] =
unknown proper_logistic
unknown proper_p-classification
unknown proper_lsif
fold 1 2 3 4 5
Proper_Logistic 0.7845 0.0346 0.1827 0.5188 0.0000 0.6000 (0.0 secs; lambda 1.953125e-03, pPush 4, lPush 4)
fold 1 2 3 4 5
Proper_Logistic 0.7936 0.0342 0.1815 0.5876 0.0100 0.6000 (0.0 secs; lambda 2.441406e-04, pPush 4, lPush 4)
fold 1 2 3 4 5
Proper_Logistic 0.8011 0.0436 0.1911 0.6632 0.0490 0.8000 (0.0 secs; lambda 1.220703e-04, pPush 4, lPush 4)
...

Once the script completes, it will output the LaTeX source for Table 5 in the appendix. Be warned that this script is likely to take a long time.

As it runs, the script saves, for each trial, the results of cross-validation as well as the final predictions. These can be used subsequently either to skip cross-validation and just perform learning, or to skip both and just produce formatted tables of results. To just print the results of a previous run, set

PRINT_ALL = 1;

on Line 42 of rtb_script.m.
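In outline, the save-and-reload pattern described above looks something like the following; the file and variable names here are hypothetical, not those used by rtb_script.m.

cacheFile = 'results_german_trial1.mat';    % hypothetical cache file
if exist(cacheFile, 'file')
    load(cacheFile, 'cvResults', 'preds');  % reuse the saved run
else
    % ... cross-validate and train as usual, then cache the results:
    save(cacheFile, 'cvResults', 'preds');
end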

Third-party libraries

The code relies on third-party MATLAB code for various operations. For convenience, these libraries are included in the ZIP file in the helper folder. They are: