General Information

My name is Erfan Noury (written as “عرفان نوری” in Persian). If pronouncing my name seems difficult, here's a hint: err-fun knew-ree.
I was born in Urmia, West Azerbaijan, Iran, a city with warm summers and cold falls and winters. Lake Urmia is a saltwater lake to the east of the city; unfortunately, its water level has been decreasing steadily over the past few years.

Lake Urmia from space in 1984


Five years ago I moved to Tehran to start my academic career as a bachelor's student of Software Engineering in the Computer Engineering Department of Sharif University of Technology, Tehran, Iran.

Official logo of the Sharif University of Technology, Tehran, Iran

Before attending university, I was a student at Shahid Beheshti High School, Urmia, Iran, which is affiliated with NODET. I owe my interest in science, and in learning in general, to the wonderful years I spent as a member of the NODET family during both middle school and high school. Great teachers and even greater students helped ignite my ambition in science and technology. I still remember the days we spent after hours getting ready for the Physics or Astronomy exhibition. That joy is still unmatched. A great education is essential for creating thriving societies.

Official logo of the NODET

Research Experience

I joined the Image Processing Lab (IPL) in June 2013 and have been an undergraduate member of the lab since then. IPL is where I was introduced to research, learned how to tackle problems and read academic papers, and got to be around people who are passionate about their research. During this time I have worked with many people; I like collaborating and helping others whenever I can. Reza Afrouzian introduced me to the lab, and we set out to create a synthetic dataset for human pose estimation. Rendering time diverged to infinity, so we halted the project. Then I started working with Mehran Fotouhi. I was told to read every publication about Integral Images, and that was where I learned how to read papers en masse (:D). I've written a technical paper about Integral Images, though it's in Persian. At about the same time, I worked with Yaser Souri on supervised edge detection. I still have the chance to work with Yaser, although we are now working on more exciting problems. Afterwards, I started working with Sara Ershadi-Nasab on Human Pose Estimation, which has resulted in two journal submissions so far. Currently, I'm concentrating my research efforts on my bachelor thesis (it's awesome!). Throughout these years I've had the honor of having Professor Shohreh Kasaei as my supervisor.
To further pursue my research in deep learning, and machine learning in general, I joined the Machine Learning Lab in December 2014 to work under the supervision of Dr. Mahdiyeh Soleymani Baghshah. For my bachelor thesis, I'm studying the joint embedding space for image and text. Raw images and text aren't directly usable for many downstream tasks, so a better representation has to be obtained. That's why they are projected into another space, one that usually has fewer dimensions than the raw image or text. This space also exhibits some semantic properties, so it can be used in many applications, such as image/caption retrieval, image caption generation, and visual question answering.
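
The idea of a shared space can be illustrated with a minimal sketch. It assumes plain linear projections and cosine similarity; the matrices `W_IMAGE` and `W_TEXT`, the feature vectors, and the function names are made-up illustrative values, not the model studied in the thesis, where such projections would be learned by deep networks.

```python
import math

def project(matrix, vector):
    """Apply a linear map: one output coordinate per matrix row."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical, hand-picked projections into a shared 2-D space.
# In a real system these would be learned from paired image/caption data.
W_IMAGE = [[1.0, 0.0, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0]]   # 4-D image features -> 2-D
W_TEXT = [[0.0, 1.0, 0.0],
          [1.0, 0.0, 0.0]]         # 3-D text features -> 2-D

def rank_captions(image_feat, caption_feats):
    """Return caption indices, most similar to the image first."""
    img = project(W_IMAGE, image_feat)
    scores = [cosine(img, project(W_TEXT, c)) for c in caption_feats]
    return sorted(range(len(caption_feats)), key=lambda i: scores[i], reverse=True)

# Toy features (assumed values, for illustration only).
image = [0.9, 0.1, 0.3, 0.5]
captions = [[0.9, 0.1, 0.7], [0.1, 0.9, 0.2]]
ranking = rank_captions(image, captions)
```

Once both modalities live in the same space, caption retrieval reduces to a nearest-neighbor search, and the same comparison works in the other direction for image retrieval.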

Problems I'm working on

There are a number of problems that I'm interested in and currently working on.

  1. Human Pose Estimation

    The current state of the art in 2D human pose estimation (arXiv:1602.00134) still has some shortcomings that need to be addressed. Besides, a new large-scale dataset for Human Pose Estimation has recently been released (MS COCO Keypoint Challenge). Therefore, I'm working on how to improve the current model and achieve better results that also generalize to other datasets, especially small-scale ones. We are also working on extending 2D Human Pose Estimation to multi-view and 3D scenarios. We have some preliminary results in the "K-Pose" paper, but they need to be refined further.
  2. Joint Image and Text Embedding

    For my bachelor thesis, I studied joint embedding spaces of image and text modalities. They have applications in image/caption retrieval, image caption generation, and visual question answering. These tasks are quite new and many challenges remain unresolved. I found that joint embedding alone won't yield the best results for many of these tasks. Therefore, I'm currently working on a new way to tackle visual question answering and image caption generation. Besides, I'm interested in finding out how a representation can be evaluated without measuring its performance on a downstream task. Is it even possible? I don't know, but I'm trying hard to find out.
  3. Sequence to Sequence Learning

    Many problems in AI can be formulated in the Sequence to Sequence learning framework. I want to study how memory networks can be incorporated into the Seq2Seq framework. So far memory networks have almost always been used for toy tasks, but their expressivity and complexity are better suited to real, challenging tasks. This is what I'm concentrating on the most right now.


Publications

  1. Deep Neural Networks for Joint Image and Text Embedding

    Erfan Noury
    Bachelor thesis, Sharif University of Technology, Tehran, Iran, 2016. (PDF, in Persian)
  2. Deep Relative Attributes

    Yaser Souri, Erfan Noury, Ehsan Adeli-Mosabbeb.
    Asian Conference on Computer Vision (ACCV), Taipei, Taiwan, 2016.
  3. 3D Multiple Human Pose Estimation from Multi-view Images

    Sara Ershadi-Nasab, Erfan Noury, Shohreh Kasaei, Esmaeil Sanaei.
    To be submitted.
  4. BodyField: Structured Mean Field with Human Body Skeleton Models and Shifted Gaussian Edge Potentials

    Sara Ershadi-Nasab, Erfan Noury, Hassan Hafez-Kolahi, Shohreh Kasaei, Esmaeil Sanaei.
    To be submitted.