Isma Hadji
PhD candidate at York University
e-mail: hadjisma [@] cse.yorku.ca
Research Interests:

I have a strong interest in computer vision and machine learning in general. My PhD research focuses on analytically defined representations with application to video analysis. In particular, I aim at injecting more domain priors into the design of convolutional networks to gain a better understanding of the resulting representations and, ultimately, to design more efficient architectures.
Some of the applications I am interested in include recognition and detection from video, as well as video content synthesis.


Education


Publications

What Do We Understand About Convolutional Networks?
Isma Hadji
arXiv e-print 2018

This document will review the most prominent proposals using multilayer convolutional architectures. Importantly, the various components of a typical convolutional network will be discussed through a review of different approaches that base their design decisions on biological findings and/or sound theoretical bases. In addition, the different attempts at understanding ConvNets via visualizations and empirical studies will be reviewed. The ultimate goal is to shed light on the role of each layer of processing involved in a ConvNet architecture, distill what we currently understand about ConvNets and highlight critical open problems.

| PDF | Slides | Bibtex
@article{Hadji2018a,
author = {I. Hadji},
title = {What Do We Understand About Convolutional Networks?},
journal = {arXiv},
volume = {1803.08834},
year = {2018},
}

A Spatiotemporal Oriented Energy Network for Dynamic Texture Recognition
Isma Hadji and Richard P. Wildes
IEEE International Conference on Computer Vision (ICCV) 2017 (spotlight paper, 2.61% acceptance rate)

This paper presents a novel hierarchical spatiotemporal orientation representation for spacetime image analysis. It is designed to combine the benefits of the multilayer architecture of ConvNets with a more controlled approach to spacetime analysis. A distinguishing aspect of the approach is that, unlike most contemporary convolutional networks, no learning is involved; rather, all design decisions are specified analytically with theoretical motivations. This approach makes it possible to understand what information is being extracted at each stage and layer of processing, as well as to minimize heuristic choices in design. Another key aspect of the network is its recurrent nature, whereby the output of each layer of processing feeds back to the input. To keep the network size manageable across layers, a novel cross-channel feature pooling is proposed. The resulting multilayer architecture systematically reveals hierarchical image structure in terms of multiscale, multiorientation properties of visual spacetime. To illustrate its utility, the network has been applied to the task of dynamic texture recognition. Empirical evaluation on multiple standard datasets shows that it sets a new state of the art.
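To give a flavor of the first stage of such an analytically defined network, the sketch below computes spacetime oriented energies from 3D Gaussian-derivative filtering, followed by divisive normalization across orientation channels. It is a minimal illustration, not the paper's network: only the three axis-aligned derivatives stand in for the full bank of steered spacetime orientations, and the function name and parameters are my own.

```python
import numpy as np
from scipy import ndimage

def oriented_energies(volume, sigma=1.5):
    """Illustrative sketch: axis-aligned 3D Gaussian-derivative energies.

    The actual network steers filters to many spacetime orientations;
    here only the three axis-aligned derivatives stand in for them to
    keep the example short.
    """
    energies = []
    for axis in range(3):  # derivatives along t, y, x
        order = [0, 0, 0]
        order[axis] = 1  # first derivative along this axis only
        resp = ndimage.gaussian_filter(volume, sigma=sigma, order=order)
        energies.append(resp ** 2)  # pointwise energy (phase-insensitive)
    energies = np.stack(energies)
    # divisive normalization across orientation channels keeps responses
    # comparable across layers and bounds each channel in [0, 1]
    return energies / (energies.sum(axis=0, keepdims=True) + 1e-8)

rng = np.random.default_rng(0)
vol = rng.standard_normal((16, 32, 32))  # a small (t, y, x) clip
E = oriented_energies(vol)
print(E.shape)  # (3, 16, 32, 32): one energy channel per orientation
```

Because the normalized channels sum to (nearly) one at each voxel, stacking further layers of the same operation does not blow up the dynamic range, which is one motivation for divisive normalization in such cascades.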

| PDF | Project website | Poster | Code | Talk | Bibtex
@inproceedings{Hadji2017,
author = {I. Hadji and R. P. Wildes},
title = {A Spatiotemporal Oriented Energy Network for Dynamic Texture Recognition},
booktitle = {ICCV},
year = {2017},
}

Local-to-Global Signature Descriptor for 3D Object Recognition
Isma Hadji and Guilherme N. DeSouza
Workshop on Robust Local Descriptors, IEEE Asian Conference on Computer Vision (ACCV) 2014

In this paper, we present a novel 3D descriptor that bridges the gap between global and local approaches. While local descriptors have proved to be a more attractive choice for object recognition within cluttered scenes, they remain less discriminating precisely because of the limited scope of the local neighborhood. On the other hand, global descriptors can better capture relationships between distant points, but are generally affected by occlusions and clutter. We therefore propose the Local-to-Global Signature (LGS) descriptor, which relies on surface point classification together with signature-based features to overcome the drawbacks of both local and global approaches. As our tests demonstrate, the proposed LGS can more robustly capture the structure of objects while remaining robust to clutter and occlusion and avoiding sensitive, low-level features such as point normals. Tests performed on four different datasets demonstrate the robustness of the proposed LGS descriptor compared to three state-of-the-art descriptors: SHOT, Spin Images, and FPFH. In general, LGS outperformed all three descriptors, on some datasets with a 50-70% increase in recall.
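The local-to-global idea can be sketched very roughly as follows: given a class label for every surface point, a keypoint's signature histograms its distances to each class, so both nearby and distant structure contribute. This is a simplified illustration under my own naming; the real LGS descriptor's point classification and signature construction differ in the details.

```python
import numpy as np

def lgs_signature(points, classes, keypoint, n_classes=3, n_bins=8):
    """Simplified local-to-global signature sketch (illustrative only).

    For one keypoint, build a per-class histogram of distances to every
    point of that surface class, then concatenate the histograms.
    """
    d = np.linalg.norm(points - keypoint, axis=1)  # keypoint-to-point distances
    bins = np.linspace(0.0, d.max() + 1e-9, n_bins + 1)
    sig = []
    for c in range(n_classes):
        h, _ = np.histogram(d[classes == c], bins=bins)
        sig.append(h / max(h.sum(), 1))  # normalize each class histogram
    return np.concatenate(sig)  # length n_classes * n_bins

rng = np.random.default_rng(0)
pts = rng.random((200, 3))            # stand-in point cloud
cls = rng.integers(0, 3, 200)         # stand-in surface-class labels
sig = lgs_signature(pts, cls, pts[0]) # signature of the first point
```

Because distances to all points of each class enter the histogram, the signature encodes global layout, while the per-class split preserves some of the discriminative power of local surface typing.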

| PDF | Slides | Code | Bibtex
@inproceedings{Hadji2014,
author = {I. Hadji and G. N. DeSouza},
title = {Local-to-Global Signature Descriptor for 3D Object Recognition},
booktitle = {ACCV Workshops},
year = {2014},
}

Least Expected Features for 3D Keypoint Detection
Isma Hadji and Guilherme N. DeSouza
Technical Report, EECS, University of Missouri, 2014

Most object recognition algorithms rely on the detection of a subset of important or discriminative visual stimuli (keypoints) as a first step towards the description of those objects. Independently of the type of 3D feature used, all 3D detectors rely on a local criterion for keypoint selection. This criterion is usually a point-wise saliency measure based on experimentally learnt thresholds. In this research, we question both the threshold-based approach and the local character of traditional 3D keypoint detection schemes. First, we decouple keypoint selection from experimentally learnt thresholds that depend on low-level features for saliency detection. To this end, we propose the Least Expected Feature criterion (LEFT) for saliency detection. Second, we introduce the concept of finding keypoints with a global approach, as opposed to the more traditional local, neighborhood-based approaches. It turns out that adopting the proposed global LEFT criterion allows for the selection of very distinctive keypoints across the entire object, while avoiding sensitive and noisy regions. Our LEFT criterion selects only outstanding points, as opposed to traditional detectors that select points across the entire object, even in smooth, non-salient regions.
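The idea of a "least expected" global criterion can be sketched in a few lines: rank points by how rare their feature value is over the whole object, and keep the rarest, with no hand-tuned saliency threshold. This is a hedged toy version with a scalar feature and my own function name; the actual LEFT criterion operates on richer 3D features.

```python
import numpy as np

def left_keypoints(features, n_keypoints=10, n_bins=16):
    """Toy 'least expected feature' selection (illustrative sketch).

    Estimate the global density of each point's feature value with a
    histogram, then keep the points whose features are least expected,
    i.e. fall in the lowest-density bins.
    """
    hist, edges = np.histogram(features, bins=n_bins, density=True)
    # map each feature value back to its histogram bin (0 .. n_bins-1)
    bin_idx = np.clip(np.digitize(features, edges[1:-1]), 0, n_bins - 1)
    expectedness = hist[bin_idx]       # estimated density at each point
    order = np.argsort(expectedness)   # least expected first
    return order[:n_keypoints]

rng = np.random.default_rng(1)
# 990 "ordinary" points plus a small cluster of unusual feature values
f = np.concatenate([rng.normal(0, 1, 990), rng.normal(8, 0.1, 10)])
idx = left_keypoints(f, n_keypoints=10)
```

Because the density estimate is computed over the entire object, the selection is inherently global: a point is kept for being unusual relative to everything else, not for standing out within a small neighborhood.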

| PDF | Code| Bibtex
@techreport{Hadji2014b,
author = {I. Hadji and G. N. DeSouza},
title = {Least Expected Features for 3D Keypoint Detection},
institution = {University of Missouri-Columbia},
year = {2014},
}