My primary research interests lie in learning robust, task-specific representations and building interpretable models with multimodal interactions in the broad field of natural language processing (NLP). To build intelligent machines that can tackle challenging reasoning problems, it is necessary to resolve the deep structural and subtle semantic (meaning-related) patterns that occur across a variety of communication modalities. My research addresses these problems by learning robust cross-modal representations with accurate and well-formulated models. In particular, throughout my work, I have broadly asked the following questions:

Multimodal Machine Learning and Grounding:

How do we build models that interact with structural information from multiple modalities?

Selected papers:

Representation Learning:

How do we learn robust, generalizable representations that are transferable?

Selected papers:

Interpretable and Explainable Models:

How do we interpret and explain the decisions of complex models?

Selected papers: