Babak Saleh

Ph.D. Candidate
Department of Computer Science
Rutgers, The State University of New Jersey
email: babak*-at-c*-dot-rutger*-dot-edu {replace * with s}
address: 110 Frelinghuysen Rd, Piscataway NJ 08854
office: Hill Center 257
phone: (732) 743-5154
Google Scholar / LinkedIn / Twitter / GitHub

Short Bio: I am a Ph.D. candidate in the Department of Computer Science at Rutgers University. I am a member of the computer vision group at CBIM and the Art and AI lab, where I am advised by Prof. Ahmed Elgammal. My research revolves around "Computer Vision", "Machine Learning", "Human Perception" and "Human-Computer Interaction".

I have the privilege of collaborating with Prof. Ali Farhadi and Prof. Jacob Feldman. I spent two wonderful summers interning at "Disney Research" and "Adobe Research". I completed my M.Sc. in Computer Science at Rutgers University and hold a B.Sc. in Computer Science from Sharif University of Technology.

  • I am thrilled to receive the NSF I-Corps award. [March 2016]
  • I am the recipient of the "Outstanding Student Paper" award from AAAI 2016, Phoenix, AZ. [Feb 2016]
  • I will give an invited talk at the workshop on "Culture Analytics Beyond Text" at IPAM. [Feb 2016]
  • I was selected as the intern to work with the Office of Research Commercialization (ORC) at Rutgers. [Jan 2016]


The Role of Typicality in Object Classification: Improving The Generalization Capacity of Convolutional Neural Networks
Babak Saleh, Ahmed Elgammal, Jacob Feldman
IJCAI 2016
[PDF] [Project Page] [BibTeX]

Abstract: Deep artificial neural networks have made remarkable progress in different tasks in the field of computer vision. However, the empirical analysis of these models and the investigation of their failure cases have received attention only recently. In this work, we show that deep learning models cannot generalize to atypical images that are substantially different from training images. This is in contrast to the superior generalization ability of the visual system in the human brain. We focus on Convolutional Neural Networks (CNNs) as the state-of-the-art models in object recognition and classification, investigate this problem in more detail, and hypothesize that training CNN models suffers from unstructured loss minimization. We propose computational models to improve the generalization capacity of CNNs by considering how typical a training image looks. By conducting an extensive set of experiments we show that involving a typicality measure can improve the classification results on a new set of images by a large margin. More importantly, this significant improvement is achieved without fine-tuning the CNN model on the target image set.


Toward a Taxonomy and Computational Models of Abnormalities in Images
Babak Saleh, Ahmed Elgammal, Jacob Feldman, Ali Farhadi
AAAI-16 Outstanding Student Paper Award
Thirtieth AAAI Conference on Artificial Intelligence (AAAI) 2016 Oral Presentation
[PDF] [Project Page] [BibTeX]

Abstract: The human visual system can spot an abnormal image and reason about what makes it strange. This task has not received enough attention in computer vision. In this paper we study various types of atypicalities in images in a more comprehensive way than has been done before. We propose a new dataset of abnormal images showing a wide range of atypicalities. We design human subject experiments to discover a coarse taxonomy of the reasons for abnormality. Our experiments reveal three major categories of abnormality: object-centric, scene-centric, and contextual. Based on this taxonomy, we propose a comprehensive computational model that can predict all different types of abnormality in images and outperforms prior art in abnormality recognition.


Quantifying Creativity in Art Networks
Ahmed Elgammal, Babak Saleh
International Conference on Computational Creativity (ICCC) 2015.
[PDF] [BibTeX]
Major media coverage:
NBC News   New York Times   The Washington Post   Fast Company   WIRED
MIT Technology Review   Business Insider   Popular Mechanics   MSN News

Abstract: Can we develop a computer algorithm that assesses the creativity of a painting given its context within art history? This paper proposes a novel computational framework for assessing the creativity of creative products, such as paintings, sculptures, poetry, etc. We use the most common definition of creativity, which emphasizes the originality of the product and its influential value. The proposed computational framework is based on constructing a network between creative products and using this network to infer the originality and influence of its nodes. Through a series of transformations, we construct a Creativity Implication Network. We show that inference about creativity in this network reduces to a variant of network centrality problems, which can be solved efficiently. We apply the proposed framework to the task of quantifying the creativity of paintings (and sculptures). We experimented on two datasets with over 62K paintings to illustrate the behavior of the proposed framework. We also propose a methodology for quantitatively validating the results of the proposed algorithm, which we call the "time machine experiment".


Large-scale Classification of Fine-Art Paintings:
Learning The Right Metric on The Right Feature
Babak Saleh, Ahmed Elgammal
This project has been covered in:
MIT Technology Review   IEEE Multimedia   Smithsonian   Panarmenian

Abstract: In the past few years, the number of fine-art collections that are digitized and publicly available has been growing rapidly. With the availability of such large collections of digitized artworks comes the need to develop multimedia systems to archive and retrieve this pool of data. Measuring the visual similarity between artistic items is an essential step for such multimedia systems, which can benefit more high-level multimedia tasks. In order to model this similarity between paintings, we should extract the appropriate visual features for paintings and find out the best approach to learn the similarity metric based on these features. We investigate a comprehensive list of visual features and metric learning approaches to learn an optimized similarity measure between paintings. We develop a machine that is able to make aesthetic-related semantic-level judgments, such as predicting a painting's style, genre, and artist, as well as providing similarity measures optimized based on the knowledge available in the domain of art historical interpretation. Our experiments show the value of using this similarity measure for the aforementioned prediction tasks.


Learning Style Similarity for Searching Infographics
Babak Saleh, Mira Dontcheva, Aaron Hertzmann, Zhicheng Liu
In proceedings of the 41st Annual Conference on Graphics Interface (GI) 2015.
[PDF] [BibTeX] [TURK experiment data] [TURK experiment images] [Notes on using data]
Please contact me if you would like to obtain all the infographic images in our dataset.

Abstract: Infographics are complex graphic designs integrating text, images, charts and sketches. Despite the increasing popularity of infographics and the rapid growth of online design portfolios, little research investigates how we can take advantage of these design resources. In this paper we present a method for measuring the style similarity between infographics. Based on human perception data collected from crowdsourced experiments, we use computer vision and machine learning algorithms to learn a style similarity metric for infographic designs. We evaluate different visual features and learning algorithms and find that a combination of color histograms and Histograms-of-Gradients (HoG) features is most effective in characterizing the style of infographics. We demonstrate our similarity metric on a preliminary image retrieval test.


Toward Automated Discovery of Artistic Influence
Babak Saleh, Kanako Abe, Ravneet Singh Arora, Ahmed Elgammal
Multimedia Tools and Applications, Springer, August 2014. [The Journal Website]
[PDF] [Project Page] [BibTeX]
This project received notable recognition in the media and press. Here are some of the posts, interviews, and articles:
The Washington Post   Science News   Apollo   Hyperallergic
The Telegraph   ARTnews   Creators Project   Folha   Medium
RoboHub   The Conversation   danish   Barnebys   notImpossible

Abstract: Considering the huge amount of art pieces that exist, there is valuable information to be discovered. Examining a painting, an expert can determine its style, genre, and the time period to which the painting belongs. One important task for art historians is to find influences and connections between artists. Is influence a task that a computer can measure? The contribution of this paper is in exploring the problem of computer-automated suggestion of influences between artists, a problem that was not addressed before in a general setting. We first present a comparative study of different classification methodologies for the task of fine-art style classification. A two-level comparative study is performed for this classification problem. The first level reviews the performance of discriminative vs. generative models, while the second level touches the features aspect of the paintings and compares semantic-level features vs. low-level and intermediate-level features present in the painting. Then, we investigate the question "Who influenced this artist?" by looking at his masterpieces and comparing them to others. We pose this interesting question as a knowledge discovery problem. For this purpose, we investigated several painting similarity and artist similarity measures. As a result, we provide a visualization of artists (Map of Artists) based on the similarity between their work.


Knowledge Discovery of Artistic Influences: A Metric Learning Approach (Invited paper)
Babak Saleh, Kanako Abe, Ahmed Elgammal
International Conference on Computational Creativity (ICCC) 2014. [Oral Presentation]
[PDF] [Video] [BibTeX]

Abstract: We approach the challenging problem of discovering influences between painters based on their fine-art paintings. In this work, we focus on comparing paintings of two painters in terms of visual similarity. This comparison is fully automatic and based on computer vision approaches and machine learning. We investigated different visual features and similarity measurements based on two different metric learning algorithms to find the most appropriate ones that follow artistic motifs. We evaluated our approach by comparing its result with ground truth annotation for a large collection of fine-art paintings.


Detecting Strange Objects via Visual Attributes
Babak Saleh, Ahmed Elgammal, Ali Farhadi
In Third International Workshop on Parts and Attributes, in conjunction with ECCV 2014
[PDF] [Project Page] [BibTeX] [Poster] [Workshop Page]

Abstract: We are capable of developing algorithms for detecting objects by learning a comprehensive object model based on their parts and visual attributes. These models usually fail to perform well on atypical images; however, people are still able to recognize strange objects that do not completely follow our model. In this work we focus on the question of what makes an object look strange. We would like to investigate what the main characteristics of an object category are in terms of its visual attributes. Later we detect meaningful deviations from these expectations as atypical cases. We present interesting findings on the novel dataset of abnormal objects and show how we can improve object detectors by making assumptions on typicality of the objects.


Object-Centric Anomaly Detection by Attribute-Based Reasoning
Babak Saleh, Ali Farhadi, Ahmed Elgammal
In proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2013.
[PDF] [Project Page] [BibTeX] [Poster]

Abstract: When describing images, humans tend not to talk about the obvious, but rather mention what they find interesting. We argue that abnormalities and deviations from typicalities are among the most important components that form what is worth mentioning. In this paper we introduce abnormality detection as a recognition problem and show how to model typicalities and, consequently, meaningful deviations from prototypical properties of categories. Our model can recognize abnormalities and report the main reasons for any recognized abnormality. We also show that abnormality predictions can help image categorization. We introduce the abnormality detection dataset and show interesting results on how to reason about abnormalities.


Write a Classifier: Zero-Shot Learning Using Purely Textual Descriptions
Mohamed Elhoseiny, Babak Saleh, Ahmed Elgammal
In proceedings of International Conference on Computer Vision (ICCV) 2013
[PDF] [BibTeX] [Poster]

Abstract: The main question we address in this paper is how to use purely textual descriptions of categories with no training images to learn visual classifiers for these categories. We propose an approach for zero-shot learning of object categories where the description of unseen categories comes in the form of typical text such as an encyclopedia entry, without the need for explicitly defined attributes. We propose and investigate two baseline formulations, based on regression and domain adaptation. Then, we propose a new constrained optimization formulation that combines a regression function and a knowledge transfer function with additional constraints to predict the classifier parameters for new classes. We applied the proposed approach on two fine-grained categorization datasets, and the results indicate successful classifier prediction.


Heterogeneous Domain Adaptation: Learning Visual Classifiers from Textual Description
Mohamed Elhoseiny, Babak Saleh, Ahmed Elgammal
In Proceedings of the Workshop on Visual Domain Adaptation and Dataset Bias, in conjunction with ICCV'13.
[PDF] [BibTeX] [Slides]

Abstract: One of the main challenges for scaling up object recognition systems is the lack of annotated images for real-world categories. It is estimated that humans can recognize and discriminate among about 30,000 categories. Typically there are few images available for training classifiers for most of these categories. This is reflected in the number of images per category available for training in most object categorization datasets, which shows a Zipf distribution. The problem of lack of training images becomes even more severe when we target recognition problems within a general category, i.e., subordinate categorization, for example building classifiers for different bird species or flower types (estimated over 10,000 living bird species, similar for flowers). In this work we present additional experiments that extend our ICCV paper.


An Early Framework for Determining Artistic Influence (Invited paper)
Kanako Abe, Babak Saleh, Ahmed Elgammal
In 2nd International Workshop on Multimedia for Cultural Heritage (MM4CH) 2013 (Oral presentation).
[PDF] [BibTeX]

Abstract: Considering the huge amount of art pieces that exist, there is valuable information to be discovered. Focusing on paintings as one kind of artistic creation that is printed on a surface, artists can determine its genre and the time period that paintings can belong to. In this work we are proposing the interesting problem of automatic influence determination between painters, which has not been explored well. We answer the question "Who influenced this artist?" by looking at his masterpieces and comparing them to others. We pose this interesting question as a knowledge discovery problem. We presented a novel dataset of paintings for the interdisciplinary field of computer science and art and showed interesting results for the task of influence finding.


Object Detection using Pictorial Structure of Gabor Template
Babak Saleh, Mohammad Rastegari
In proceedings of the International Conference on Computer Vision Theory and Applications (VISAPP) 2010.

Abstract: Object detection methods are divided into two main branches: In the global approach one extracts low-level features and uses machine learning techniques. In the part-based approach one uses deformable templates. We present a hybrid approach for constructing a deformable template for modeling and detection. Initially one applies Gabor wavelet filters to extract low-level features and constructs graphs which resemble shock graphs. A minimum spanning tree (MST) is extracted and is called the pictorial graph. It is used for matching. The pictorial graph is suitable for preserving the visual appearance of the shape of the object and for accommodating shape variances. In this hybrid approach we maintain the generality of the global and the efficiency of part-based approaches. Our algorithm has been applied to a set of test cases and the result shows improved performance as compared to standard object detection methods that do not rely on human intervention.