Code for "A log-linear model with latent features for dyadic prediction", ICDM '10

A stochastic gradient implementation of the LFL model may be found here. The code assumes that there are structures Tr and Te for the train and test set, each comprising three vectors u, m, and r for the "user", "movie", and "rating" respectively. (As noted in the paper, these may be replaced with more general dyadic entities.) The code handles both nominal and ordinal "ratings".

Example usage

The following constructs a sample rating matrix for 5 users and 10 movies, where the possible ratings are { 1, ..., 5 } and 0 denotes a missing value. The observed (nonzero) entries are then split at random into a train and a test set in an 80-20% proportion.

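% Random U-by-M matrix with entries in { 0, ..., R }; 0 marks a missing rating
U = 5; M = 10; R = 5; X = floor((R + 1) * rand(U, M));
% Keep only the observed (nonzero) entries as (user, movie, rating) triples
Data = [ ]; [Data.u, Data.m, Data.r] = find(X);

% Shuffle the observed entries and hold out 20% of them for testing
I = randperm(length(Data.u)); nTe = ceil(0.2*length(I));

Te = [ ]; Te.u = Data.u(I(1:nTe)); Te.m = Data.m(I(1:nTe)); Te.r = Data.r(I(1:nTe));
Tr = [ ]; Tr.u = Data.u(I(1+nTe:end)); Tr.m = Data.m(I(1+nTe:end)); Tr.r = Data.r(I(1+nTe:end));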
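
Before training, it can be worth checking that the split behaves as intended. The following is a minimal sketch, not part of the released code: it rebuilds the same toy data, then verifies that train and test are disjoint, that together they cover every observed entry, and that all ratings lie in { 1, ..., R }. The names teIdx, trIdx, Xrec and coldUsers are introduced here for illustration only.

```matlab
% Sketch only: rebuild the toy data and the 80-20 split, then sanity-check it.
U = 5; M = 10; R = 5;
X = floor((R + 1) * rand(U, M));            % entries in {0, ..., R}; 0 = missing
[u, m, r] = find(X);                        % observed (nonzero) entries
Data = struct('u', u, 'm', m, 'r', r);

n = numel(Data.u);
I = randperm(n);
nTe = ceil(0.2 * n);
teIdx = I(1:nTe);                           % hypothetical helper names
trIdx = I(nTe+1:end);
Te = struct('u', Data.u(teIdx), 'm', Data.m(teIdx), 'r', Data.r(teIdx));
Tr = struct('u', Data.u(trIdx), 'm', Data.m(trIdx), 'r', Data.r(trIdx));

% Train and test share no entries and together cover all observed entries.
assert(isempty(intersect(trIdx, teIdx)));
assert(numel(Tr.r) + numel(Te.r) == n);

% No missing-value marker (0) leaked into either set.
assert(all(Tr.r >= 1 & Tr.r <= R) && all(Te.r >= 1 & Te.r <= R));

% Scattering both sets back into a U-by-M matrix recovers X exactly.
Xrec = accumarray([[Tr.u; Te.u], [Tr.m; Te.m]], [Tr.r; Te.r], [U M]);
assert(isequal(Xrec, X));

% Users that occur in the test set but never in the train set.
coldUsers = setdiff(unique(Te.u), unique(Tr.u));
fprintf('%d test user(s) have no training ratings\n', numel(coldUsers));
```

With a matrix this small, a random split can leave some users or movies with no training ratings at all; the description above does not say how the optimiser treats such cases, so the final count is worth a glance.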

With Tr and Te in place, the SGD optimiser can be run as follows:

k = 10; % # of latent features
eta0 = 0.01; % learning rate
lambda = 1e-6; % regularization parameter
epochs = 10; % # of sweeps over training set
loss = 'mse'; % loss function on training set

[w, trainErrors, testErrors] = lflSGDOptimizer(Tr, Te, k, eta0, lambda, epochs, loss);