My research interests in Machine Learning lie at the intersection of the following (highly related) topics:
How to scale learning algorithms to large-scale data?
I have been working mainly on speeding up the training of Support Vector Machines for different kinds of data, proposing new algorithms based on stochastic gradient descent, online and active learning. One of our methods won the “Wild track” of the 1st PASCAL Large Scale Challenge in 2008.
Related papers: JMLR05, ICML07, ECML08, JMLR09 & AISTATS10.
How to learn properly with imperfect, partial, noisy or ambiguous supervision?
My main research has focused on unsupervised and transfer learning with deep architectures, as well as on ambiguous supervision for semantic parsing. I am also studying indirect supervision problems (no labeling is available but some related information exists).
Related papers: ICML09, AISTATS11, AISTATS12 & ICML11.
How to automatically discover the semantics among data to better represent, search or visualize it?
My work mostly considers natural language (text) and knowledge bases. I am exploring ways to represent both properly in order to connect them (for summarization, information extraction, semantic parsing, etc.), with statistical relational learning approaches. This research direction heavily relies on the two previous ones because learning semantics of data can only be made possible with large sources of imprecisely labeled data. We organized the 1st workshop on Learning Semantics at NIPS 2011 and we are also editing a related special issue in Machine Learning.
Related papers: AISTATS10, AAAI11, AISTATS12 & NIPS12.
EVEREST (2013 - 2016): PI. ANR grant - Young researcher program.
Learning High-level Representations of Large-scale Sparse Tensors.
PRETIV (2012 - 2014): member. ANR grant - International program (with NSFC-China).
Multimodal Perception and Reasoning for Transnational Intelligent Vehicles.
Deep Learning (2010 - 2012): member. DARPA grant, via Université de Montréal (Y. Bengio's lab).
Multimodal training and design of Deep Learning methods.
Alberto Garcia-Duran (PhD student - 2013-2016): New models for modeling highly multi-relational data.
Xiao Liu (PhD student - 2011-2014): Weakly supervised learning for biomedical information extraction.