Inspired by Bernard Chazelle's wonderful idea of "liner notes" for his papers, I've included some of my own for a few papers. (These are not peer- or even coauthor-reviewed.)
Preprints
- Universal model routing for efficient LLM inference.
Wittawat Jitkrittum, Harikrishna Narasimhan, Ankit Singh Rawat, Jeevesh Juneja, Zifeng Wang, Chen-Yu Lee, Pradeep Shenoy, Rina Panigrahy, Aditya Krishna Menon, and Sanjiv Kumar.
Manuscript, 2025.
pdf
Publications
- Bipartite ranking from multiple labels: on loss versus label aggregation.
Michal Lukasik, Lin Chen, Harikrishna Narasimhan, Aditya Krishna Menon, Wittawat Jitkrittum, Felix X. Yu, Sashank J. Reddi, Gang Fu, Mohammadhossein Bateni, and Sanjiv Kumar.
In International Conference on Machine Learning (ICML), 2025.
pdf
- Faster cascades via speculative decoding.
Harikrishna Narasimhan, Wittawat Jitkrittum, Ankit Singh Rawat, Seungyeon Kim, Neha Gupta, Aditya Krishna Menon, and Sanjiv Kumar.
In International Conference on Learning Representations (ICLR), 2025.
pdf
- Better autoregressive regression with LLMs via regression-aware fine-tuning.
Michal Lukasik, Zhao Meng, Harikrishna Narasimhan, Yin-Wen Chang, Aditya Krishna Menon, Felix X. Yu, and Sanjiv Kumar.
In International Conference on Learning Representations (ICLR), 2025.
pdf
- Regression-aware inference with LLMs.
Michal Lukasik, Harikrishna Narasimhan, Aditya Krishna Menon, Felix Yu, and Sanjiv Kumar.
In Empirical Methods in Natural Language Processing Findings (EMNLP Findings), 2024.
pdf
- Unified single-model training achieving diverse scores for information retrieval.
Seungyeon Kim, Ankit Singh Rawat, Manzil Zaheer, Wittawat Jitkrittum, Veeranjaneyulu Sadhanala, Sadeep Jayasumana, Aditya Krishna Menon, Rob Fergus, and Sanjiv Kumar.
In International Conference on Machine Learning (ICML), 2024.
pdf
- Language model cascades: token-level uncertainty and beyond.
Neha Gupta, Harikrishna Narasimhan, Wittawat Jitkrittum, Ankit Singh Rawat, Aditya Krishna Menon, and Sanjiv Kumar.
In International Conference on Learning Representations (ICLR), 2024.
pdf
- Learning to reject meets long-tail learning.
Harikrishna Narasimhan, Aditya Krishna Menon, Wittawat Jitkrittum, Neha Gupta, and Sanjiv Kumar.
In International Conference on Learning Representations (ICLR), 2024.
pdf
- Plugin estimators for selective classification with out-of-distribution detection.
Harikrishna Narasimhan, Aditya Krishna Menon, Wittawat Jitkrittum, and Sanjiv Kumar.
In International Conference on Learning Representations (ICLR), 2024.
pdf
- DistillSpec: improving speculative decoding via knowledge distillation.
Yongchao Zhou, Kaifeng Lyu, Ankit Singh Rawat, Aditya Krishna Menon, Afshin Rostamizadeh, Sanjiv Kumar, Jean-François Kagy, and Rishabh Agarwal.
In International Conference on Learning Representations (ICLR), 2024.
pdf
- Think before you speak: training language models with pause tokens.
Sachin Goyal, Ziwei Ji, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar, and Vaishnavh Nagarajan.
In International Conference on Learning Representations (ICLR), 2024.
pdf
- The importance of feature preprocessing for differentially private linear optimization.
Ziteng Sun, Ananda Theertha Suresh, and Aditya Krishna Menon.
In International Conference on Learning Representations (ICLR), 2024.
pdf
- What do larger image classifiers memorise?
Michal Lukasik, Vaishnavh Nagarajan, Ankit Singh Rawat, Aditya Krishna Menon, and Sanjiv Kumar.
In Transactions on Machine Learning Research (TMLR), 2024.
pdf
- When does confidence-based cascade deferral suffice?
Wittawat Jitkrittum, Neha Gupta, Aditya Krishna Menon, Harikrishna Narasimhan, Ankit Singh Rawat, and Sanjiv Kumar.
In Advances in Neural Information Processing Systems (NeurIPS), 2023.
pdf
- On student-teacher deviations in distillation: does it pay to disobey?
Vaishnavh Nagarajan, Aditya Krishna Menon, Srinadh Bhojanapalli, Hossein Mobahi, and Sanjiv Kumar.
In Advances in Neural Information Processing Systems (NeurIPS), 2023.
pdf
- ResMem: Learn what you can and memorize the rest.
Zitong Yang, Michal Lukasik, Vaishnavh Nagarajan, Zonglin Li, Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, and Sanjiv Kumar.
In Advances in Neural Information Processing Systems (NeurIPS), 2023.
pdf
- Robust distillation for worst-class performance: on the interplay between teacher and student objectives.
Serena Wang, Harikrishna Narasimhan, Yichen Zhou, Sara Hooker, Michal Lukasik, and Aditya Krishna Menon.
In Uncertainty in Artificial Intelligence (UAI), 2023.
pdf
- Supervision complexity and its role in knowledge distillation.
Hrayr Harutyunyan, Ankit Singh Rawat, Aditya Krishna Menon, Seungyeon Kim, and Sanjiv Kumar.
In International Conference on Learning Representations (ICLR), 2023.
pdf
- Post-hoc estimators for learning to defer to an expert.
Harikrishna Narasimhan, Wittawat Jitkrittum, Aditya Krishna Menon, Ankit Singh Rawat, and Sanjiv Kumar.
In Advances in Neural Information Processing Systems (NeurIPS), 2022.
pdf
- Teacher's pet: understanding and mitigating biases in distillation.
Michal Lukasik, Srinadh Bhojanapalli, Aditya Krishna Menon, and Sanjiv Kumar.
In Transactions on Machine Learning Research (TMLR), 2022.
pdf
- Interval-censored Hawkes processes.
Marian-Andrei Rizoiu, Alexander Soen, Shidi Li, Pio Calderon, Leanne J. Dong, Aditya Krishna Menon, and Lexing Xie.
In Journal of Machine Learning Research (JMLR), 2022.
pdf
- In defense of dual-encoders for neural ranking.
Aditya Krishna Menon, Sadeep Jayasumana, Ankit Singh Rawat, Seungyeon Kim, Sashank J. Reddi, and Sanjiv Kumar.
In International Conference on Machine Learning (ICML), 2022.
pdf
- Training over-parameterized models with non-decomposable objectives.
Harikrishna Narasimhan and Aditya Krishna Menon.
In Advances in Neural Information Processing Systems (NeurIPS), 2021.
pdf
- Disentangling sampling and labeling bias for learning in large-output spaces.
Ankit Singh Rawat, Aditya Krishna Menon, Wittawat Jitkrittum, Sadeep Jayasumana, Felix X. Yu, Sashank Reddi, and Sanjiv Kumar.
In International Conference on Machine Learning (ICML), 2021.
pdf
- A statistical perspective on distillation.
Aditya Krishna Menon, Ankit Singh Rawat, Sashank J. Reddi, Seungyeon Kim, and Sanjiv Kumar.
In International Conference on Machine Learning (ICML), 2021.
pdf
- RankDistil: knowledge distillation for ranking.
Sashank Reddi, Rama Kumar Pasumarthi, Aditya Krishna Menon, Ankit Singh Rawat, Felix Yu, Seungyeon Kim, Andreas Veit, and Sanjiv Kumar.
In Artificial Intelligence and Statistics (AISTATS), 2021.
pdf
- Coping with label shift via distributionally robust optimisation.
Jingzhao Zhang, Aditya Krishna Menon, Andreas Veit, Srinadh Bhojanapalli, Sanjiv Kumar, and Suvrit Sra.
In International Conference on Learning Representations (ICLR), 2021.
pdf
- Overparameterisation and worst-case generalisation: friend or foe?
Aditya Krishna Menon, Ankit Singh Rawat, and Sanjiv Kumar.
In International Conference on Learning Representations (ICLR), 2021.
pdf
- Long-tail learning via logit adjustment.
Aditya Krishna Menon, Sadeep Jayasumana, Ankit Singh Rawat, Himanshu Jain, Andreas Veit, and Sanjiv Kumar.
In International Conference on Learning Representations (ICLR), 2021.
pdf
- Robust large-margin learning in hyperbolic space.
Melanie Weber, Manzil Zaheer, Ankit Singh Rawat, Aditya Krishna Menon, and Sanjiv Kumar.
In Advances in Neural Information Processing Systems (NeurIPS), 2020.
pdf
- Semantic label smoothing for sequence to sequence problems.
Michal Lukasik, Himanshu Jain, Aditya Krishna Menon, Seungyeon Kim, Srinadh Bhojanapalli, Felix Yu and Sanjiv Kumar.
In Empirical Methods in Natural Language Processing (EMNLP), 2020.
pdf
- SupMMD: a sentence importance model for extractive summarization using maximum mean discrepancy.
Umanga Bista, Alexander Patrick Mathews, Aditya Krishna Menon, and Lexing Xie.
In Empirical Methods in Natural Language Processing Findings (EMNLP Findings), 2020.
pdf
- Does label smoothing mitigate label noise?
Michal Lukasik, Srinadh Bhojanapalli, Aditya Krishna Menon, and Sanjiv Kumar.
In International Conference on Machine Learning (ICML), 2020.
pdf
- Supervised learning: no loss no cry.
Richard Nock and Aditya Krishna Menon.
In International Conference on Machine Learning (ICML), 2020.
pdf
- Federated learning with only positive labels.
Felix X. Yu, Ankit Singh Rawat, Aditya Krishna Menon, and Sanjiv Kumar.
In International Conference on Machine Learning (ICML), 2020.
pdf
- Can gradient clipping mitigate label noise?
Aditya Krishna Menon, Ankit Singh Rawat, Sashank J. Reddi, and Sanjiv Kumar.
In International Conference on Learning Representations (ICLR), 2020.
pdf
- Noise-tolerant fair classification.
Alexandre Louis Lamy, Ziyuan Zhong, Aditya Krishna Menon, and Nakul Verma.
In Advances in Neural Information Processing Systems (NeurIPS), Vancouver, 2019.
pdf
- Multilabel reductions: what is my loss optimising?
Aditya Krishna Menon, Ankit Singh Rawat, Sashank J. Reddi, and Sanjiv Kumar.
In Advances in Neural Information Processing Systems (NeurIPS), Vancouver, 2019.
pdf slides poster
- Fairness risk measures.
Robert C. Williamson and Aditya Krishna Menon.
In International Conference on Machine Learning (ICML), Long Beach, 2019.
pdf
- Complementary-label learning for arbitrary losses and models.
Takashi Ishida, Gang Niu, Aditya Krishna Menon, and Masashi Sugiyama.
In International Conference on Machine Learning (ICML), Long Beach, 2019.
pdf
- Monge blunts Bayes: hardness results for adversarial training.
Zac Cranko, Aditya Krishna Menon, Richard Nock, Cheng Soon Ong, Zhan Shi, and Christian Walder.
In International Conference on Machine Learning (ICML), Long Beach, 2019.
pdf
- On the minimal supervision for training any binary classifier from only unlabeled data.
Nan Lu, Gang Niu, Aditya Krishna Menon and Masashi Sugiyama.
In International Conference on Learning Representations (ICLR), New Orleans, 2019.
pdf
- Comparative document summarisation via classification.
Umanga Bista, Alexander Mathews, Minjeong Shin, Aditya Krishna Menon and Lexing Xie.
In AAAI Conference on Artificial Intelligence (AAAI), Honolulu, 2019.
pdf
- The risk of trivial solutions in bipartite top ranking.
Aditya Krishna Menon.
In Machine Learning, Volume 108, Issue 4, 2019.
pdf
- A loss framework for calibrated anomaly detection.
Aditya Krishna Menon and Robert C. Williamson.
In Advances in Neural Information Processing Systems (NeurIPS), Montreal, 2018.
pdf
- Learning from binary labels with instance-dependent noise.
Aditya Krishna Menon, Brendan van Rooyen, and Nagarajan Natarajan.
In Machine Learning, Volume 107, Issue 8-10 (Special Issue on Papers from ECML-PKDD), 2018.
pdf
- The cost of fairness in binary classification.
Aditya Krishna Menon and Robert C. Williamson.
In Conference on Fairness, Accountability, and Transparency (FAT), New York City, 2018.
pdf
- Proper losses for nonlinear Hawkes processes.
Aditya Krishna Menon and Young Lee.
In AAAI Conference on Artificial Intelligence (AAAI), New Orleans, 2018.
pdf
- f-GANs in an information geometric nutshell.
Richard Nock, Zac Cranko, Aditya Krishna Menon, Lizhen Qu and Robert C. Williamson.
In Advances in Neural Information Processing Systems (NIPS), Long Beach, 2017.
pdf
- Predicting short-term public transport demand via inhomogeneous Poisson processes.
Aditya Krishna Menon and Young Lee.
In International Conference on Information and Knowledge Management (CIKM), Singapore, 2017.
pdf
- Revisiting revisits in trajectory recommendation.
Aditya Krishna Menon, Dawei Chen, Lexing Xie and Cheng Soon Ong.
In RecSys Workshop on Recommender Systems for Citizens (CitRec), Como, 2017.
pdf
- Robust, deep and inductive anomaly detection.
Raghavendra Chalapathy, Aditya Krishna Menon and Sanjay Chawla.
In European Conference on Machine Learning (ECML/PKDD), Skopje, 2017.
pdf
- Making deep neural networks robust to label noise: a loss correction approach.
Giorgio Patrini, Alessandro Rozza, Aditya Krishna Menon, Richard Nock, and Lizhen Qu.
In Computer Vision and Pattern Recognition (CVPR), Honolulu, 2017.
pdf slides
- Low-rank linear cold-start recommendation from social data.
Suvash Sedhain, Aditya Krishna Menon, Scott Sanner, Lexing Xie, and Darius Braziunas.
In AAAI Conference on Artificial Intelligence (AAAI), San Francisco, 2017.
pdf
- Bipartite ranking: a risk-theoretic perspective.
Aditya Krishna Menon and Robert C. Williamson.
In Journal of Machine Learning Research (JMLR), Volume 17, Issue 195, 2016.
pdf liner notes
- A scaled Bregman theorem with applications.
Richard Nock, Aditya Krishna Menon and Cheng Soon Ong.
In Advances in Neural Information Processing Systems (NIPS), Barcelona, 2016.
pdf
- Linking losses for density ratio and class-probability estimation.
Aditya Krishna Menon and Cheng Soon Ong.
In International Conference on Machine Learning (ICML), New York City, 2016.
pdf slides poster code liner notes
- Practical linear models for large-scale one-class collaborative filtering.
Suvash Sedhain, Hung Bui, Jaya Kawale, Nikos Vlassis, Branislav Kveton, Aditya Krishna Menon, Trung Bui and Scott Sanner.
In International Joint Conference on Artificial Intelligence (IJCAI), New York City, 2016.
pdf
- On the effectiveness of linear models for one-class collaborative filtering.
Suvash Sedhain, Aditya Krishna Menon, Scott Sanner and Darius Braziunas.
In AAAI Conference on Artificial Intelligence (AAAI), Phoenix, 2016.
pdf slides code
- Learning with symmetric label noise: the importance of being unhinged.
Brendan van Rooyen, Aditya Krishna Menon and Robert C. Williamson.
In Advances in Neural Information Processing Systems (NIPS), Montreal, 2015.
pdf poster
- Fine-grained OD estimation with automated zoning and sparsity regularisation.
Aditya Krishna Menon, Chen Cai, Weihong Wang, Tao Wen and Fang Chen.
In Transportation Research Part B: Methodological, Volume 80, October 2015, Pages 150-172.
pdf
- Learning from corrupted binary labels via class-probability estimation.
Aditya Krishna Menon, Brendan van Rooyen, Cheng Soon Ong and Robert C. Williamson.
In International Conference on Machine Learning (ICML), Lille, 2015.
pdf slides poster code liner notes
- AutoRec: autoencoders meet collaborative filtering.
Suvash Sedhain, Aditya Krishna Menon, Scott Sanner and Lexing Xie.
In International World Wide Web Conference (WWW), Florence, 2015.
pdf code poster
- Cross-modal retrieval: a pairwise classification approach.
Aditya Krishna Menon, Didi Surian and Sanjay Chawla.
In SIAM Conference on Data Mining (SDM), Vancouver, 2015.
pdf (with supplementary) slides code
- An approach to sparse, fine-grained OD estimation.
Aditya Krishna Menon, Chen Cai, Weihong Wang, Tao Wen and Fang Chen.
In 94th Annual Meeting of the Transportation Research Board (TRB), Washington DC, 2015.
pdf poster
- Bayes-optimal scorers for bipartite ranking.
Aditya Krishna Menon and Robert C. Williamson.
In Conference on Learning Theory (COLT), Barcelona, 2014.
pdf slides poster
- Inappropriate access detection for electronic health records using collaborative filtering.
Aditya Krishna Menon, Xiaoqian Jiang, Jihoon Kim, Lucila Ohno-Machado, and Jaideep Vaidya.
In Machine Learning, Volume 95, Number 1, Special Issue on Machine Learning for Society, 2014.
pdf
- A colorful approach to text processing by example.
Kuat Yessenov, Shubham Tulsiani, Aditya Krishna Menon, Robert C. Miller, Sumit Gulwani, Butler Lampson, and Adam Kalai.
In ACM Symposium on User Interface Software and Technology (UIST), 2013.
pdf
- Beam search algorithms for multilabel learning.
Abhishek Kumar, Shankar Vembu, Aditya Krishna Menon, and Charles Elkan.
In Machine Learning, 2013.
pdf
- On the statistical consistency of algorithms for binary classification under class imbalance.
Aditya Krishna Menon, Harikrishna Narasimhan, Shivani Agarwal and Sanjay Chawla.
In International Conference on Machine Learning (ICML), Atlanta, 2013.
pdf slides
- A machine learning framework for programming by example.
Aditya Krishna Menon, Omer Tamuz, Sumit Gulwani, Butler Lampson, and Adam Tauman Kalai.
In International Conference on Machine Learning (ICML), Atlanta, 2013.
pdf poster data
- Learning and inference in Probabilistic Classifier Chains with beam search.
Abhishek Kumar, Shankar Vembu, Aditya Krishna Menon, and Charles Elkan.
In Machine Learning and Knowledge Discovery in Databases - European Conference (ECML-PKDD), Proceedings Part I, 2012.
pdf
- Doubly optimized calibrated Support Vector Machine (DOC-SVM): an algorithm for joint optimization of discrimination and calibration.
Xiaoqian Jiang, Aditya Krishna Menon, Shuang Wang, Jihoon Kim, and Lucila Ohno-Machado.
In PLoS ONE, 7(11): e48823, 2012.
pdf
- Predicting accurate probabilities with a ranking loss.
Aditya Krishna Menon, Xiaoqian Jiang, Shankar Vembu, Charles Elkan, and Lucila Ohno-Machado.
In International Conference on Machine Learning (ICML), Edinburgh, 2012.
pdf poster code
- Link prediction via matrix factorization.
Aditya Krishna Menon and Charles Elkan.
In Machine Learning and Knowledge Discovery in Databases - European Conference (ECML-PKDD), Proceedings Part II, 2011.
pdf poster code
- Response prediction using collaborative filtering with hierarchies and side-information.
Aditya Krishna Menon, Krishna-Prasad Chitrapura, Sachin Garg, Deepak Agarwal, and Nagaraj Kota.
In Knowledge Discovery and Data Mining (KDD), San Diego, 2011.
pdf slides poster code
- Fast algorithms for approximating the singular value decomposition.
Aditya Krishna Menon and Charles Elkan.
In ACM Transactions on Knowledge Discovery from Data: Special Issue on Large-Scale Data Mining (TKDD-LDMTA), 2010.
pdf code
- A log-linear model with latent features for dyadic prediction.
Aditya Krishna Menon and Charles Elkan.
In International Conference on Data Mining (ICDM), Sydney, 2010.
pdf slides code
- Predicting labels for dyadic data.
Aditya Krishna Menon and Charles Elkan.
In Data Mining and Knowledge Discovery, Special Issue on Papers from ECML-PKDD, Volume 21, Number 2, 2010.
pdf slides
- An incremental data-stream sketch using sparse random projections.
Aditya Krishna Menon, Gia Vinh Anh Pham, Sanjay Chawla and Anastasios Viglas.
In SIAM Conference on Data Mining (SDM), Minnesota, 2007.
pdf
- A little help goes a long way: efficient LLM training by leveraging small LMs.
Ankit Singh Rawat, Veeranjaneyulu Sadhanala, Afshin Rostamizadeh, Ayan Chakrabarti, Wittawat Jitkrittum, Vladimir Feinberg, Seungyeon Kim, Hrayr Harutyunyan, Nikunj Saunshi, Zachary Nado, Rakesh Shivanna, Sashank J. Reddi, Aditya Krishna Menon, Rohan Anil, and Sanjiv Kumar.
Manuscript, 2024.
pdf
- Cascade-aware training of language models.
Congchao Wang, Sean Augenstein, Keith Rush, Wittawat Jitkrittum, Harikrishna Narasimhan, Ankit Singh Rawat, Aditya Krishna Menon, and Alec Go.
Manuscript, 2024.
pdf
- Efficient document ranking with learnable late interactions.
Ziwei Ji, Himanshu Jain, Andreas Veit, Sashank J. Reddi, Sadeep Jayasumana, Ankit Singh Rawat, Aditya Krishna Menon, Felix Yu, and Sanjiv Kumar.
Manuscript, 2024.
pdf
- EmbedDistill: a geometric knowledge distillation for information retrieval.
Seungyeon Kim, Ankit Singh Rawat, Manzil Zaheer, Sadeep Jayasumana, Veeranjaneyulu Sadhanala, Wittawat Jitkrittum, Aditya Krishna Menon, Rob Fergus, and Sanjiv Kumar.
Manuscript, 2023.
pdf
- When in doubt, summon the titans: efficient inference with large models.
Ankit Singh Rawat, Manzil Zaheer, Aditya Krishna Menon, Amr Ahmed, and Sanjiv Kumar.
Manuscript, 2021.
pdf
- ELM: embedding and logit margins for long-tail learning.
Wittawat Jitkrittum, Aditya Krishna Menon, Ankit Singh Rawat, and Sanjiv Kumar.
Manuscript, 2022.
pdf
- Distilling double descent.
Andrew Cotter, Aditya Krishna Menon, Harikrishna Narasimhan, Ankit Singh Rawat, Sashank J. Reddi, and Yichen Zhou.
Manuscript, 2021.
pdf
- On the reproducibility of neural network predictions.
Srinadh Bhojanapalli, Kimberly Wilber, Andreas Veit, Ankit Singh Rawat, Seungyeon Kim, Aditya Krishna Menon, and Sanjiv Kumar.
Manuscript, 2021.
pdf
- Doubly-stochastic mining for heterogeneous retrieval.
Ankit Singh Rawat, Aditya Krishna Menon, Andreas Veit, Felix Yu, Sashank J. Reddi, and Sanjiv Kumar.
Manuscript, 2020.
pdf
- Cold-start playlist recommendation with multitask learning.
Dawei Chen, Cheng Soon Ong, and Aditya Krishna Menon.
Manuscript, 2019.
pdf
- Structured recommendation.
Dawei Chen, Lexing Xie, Aditya Krishna Menon, and Cheng Soon Ong.
Manuscript, 2019.
pdf
- Large-scale Support Vector Machines: algorithms and theory.
Aditya Krishna Menon.
Research Exam, University of California, San Diego. 2009.
pdf slides
- An incremental data-stream sketch using sparse random projections.
Aditya Krishna Menon, Gia Vinh Anh Pham, Sanjay Chawla and Anastasios Viglas.
Technical Report 609, University of Sydney. 2007.
pdf
- Random projections and applications to dimensionality reduction.
Aditya Krishna Menon.
Honours thesis, University of Sydney. 2006.
pdf slides