Assignment 2, Computer Vision, 2017
Due: 11am, 22/5/17
Anton van den Hengel
The University of Adelaide
South Australia
[email protected]
Abstract
The second assignment is to implement, describe, and
evaluate the Viola and Jones boosted cascade approach to
face detection presented in [4]. The submission will take
the form of a conference paper.
1. Introduction
Face detection is one of the key problems in Computer
Vision, partly because it is something that humans do instinctively. It is critical to human-computer interaction,
driverless cars, and video surveillance. The fact that it is
a detection problem means that we are not given the location of the potential face in the image. It also means that we
must return a bounding box for every face we detect. The
primary challenge in all detection problems is the number of
possible locations within an image at which an object may
appear. Every megapixel image, for instance, has at least a
million locations at which a face might be detected. This
means that a detector with a per-location false-positive rate of $10^{-3}$ will
generate about 1000 false positives per image.
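The arithmetic behind this figure can be checked directly. The sketch below assumes, purely for illustration, one candidate location per pixel at a single scale and an independent test at each location; the function name is hypothetical.

```python
def expected_false_positives(num_locations: int, false_positive_rate: float) -> float:
    """Expected number of false detections when every location is tested once."""
    return num_locations * false_positive_rate


# A one-megapixel image with one candidate window per pixel (single scale only).
print(expected_false_positives(1_000_000, 1e-3))
```

Under the same assumptions, the per-location false-positive rate would need to fall to roughly $10^{-6}$ before the detector averaged one false positive per image.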
Computer Vision algorithms struggled for many years to
achieve acceptable accuracy within reasonable computational cost. The Viola and Jones boosted cascade approach
addressed both of these problems, and has become one of
the landmark papers in Computer Vision as a result. The
approach it proposes is applicable to all manner of problems within Computer Vision and more broadly.
2. The method
The approach presented in [4] has become one of the
baseline methods in the area. You will need to understand
the method in order to be able to write about it.
You do not need to extend the method of Viola and Jones,
but it will improve your marks if you do. If you decide to
extend their method then you need to describe in your paper why you chose your particular extension in addition to
describing the details of it. The extension you have devised
should thus be described in terms of the shortcoming in the
original method that it addresses. You should then document your assessment of whether it has succeeded in doing
so. Whether or not your improvement actually works better
than the original method is thus a secondary consideration;
the main issue is for you to be able to draw a sensible conclusion about it.
In analysing the performance of the method it is not acceptable to simply present a set of examples and hope that
the reader will do the analysis for you. I would suggest
devising a metric test of performance, but if this is not possible then you need to devise another method for illustrating
the effect that you wish to demonstrate. Sometimes, for instance, it is possible to devise a particularly informative set
of test cases such that the deficiency in the method becomes
obvious.
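One possible metric, offered as a sketch rather than a requirement, is to match detections against hand-labelled ground-truth boxes using intersection over union (IoU) and then report precision and recall. The (x, y, w, h) box layout, the 0.5 threshold, the greedy matching strategy, the convention of returning 1.0 for empty inputs, and all function names are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def precision_recall(detections, ground_truth, threshold=0.5):
    """Greedily match each detection to its best unmatched ground-truth box."""
    unmatched = list(ground_truth)
    true_positives = 0
    for det in detections:
        best = max(unmatched, key=lambda gt: iou(det, gt), default=None)
        if best is not None and iou(det, best) >= threshold:
            unmatched.remove(best)  # each ground-truth face is matched at most once
            true_positives += 1
    # Convention (an assumption): an empty list scores 1.0 rather than dividing by zero.
    precision = true_positives / len(detections) if detections else 1.0
    recall = true_positives / len(ground_truth) if ground_truth else 1.0
    return precision, recall
```

Sweeping a detector parameter and plotting the resulting precision and recall pairs is one way to summarise behaviour over a whole test set instead of a handful of example images.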
The goal of the assignments is that you develop your
ability to analyse methods, and to report on the outcome
of your analysis. Every CVPR paper presents a hypothesis
(typically that the proposed method is better than any of its
competitors) and sets out to prove it. I would recommend
reading a few to get an impression of the techniques used.
2.1. Implementation
In order to be able to test the method you will need to implement it. Fortunately this is made significantly easier by
the fact there are many existing implementations available.
The OpenCV[2] version CascadeClassifier (see [1], for example) is a good place to start, but there are many (many)
others. If you use the OpenCV version it will simplify the
process of making any changes you wish to implement, as
OpenCV provides a variety of other tools and implementations that will help. The choice of language, platform,
compiler, IDE, and similar is up to you. You are welcome
to discuss these decisions on the forum.
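As a starting point, a minimal sketch of running the OpenCV CascadeClassifier from Python might look like the following. It assumes the pip `opencv-python` package, which bundles the stock frontal-face cascade under `cv2.data.haarcascades`; the function names, the default cascade file, and the `scaleFactor` and `minNeighbors` values are illustrative choices, not part of the assignment.

```python
def to_corners(boxes):
    """Convert (x, y, w, h) boxes to (x1, y1, x2, y2) corner coordinates."""
    return [(int(x), int(y), int(x + w), int(y + h)) for (x, y, w, h) in boxes]


def detect_faces(image_path, cascade_path=None):
    """Return face boxes as corner coordinates using a pre-trained cascade."""
    import cv2  # imported lazily so to_corners() works without OpenCV installed

    if cascade_path is None:
        # Assumption: the stock cascade shipped with the opencv-python package.
        cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"

    detector = cv2.CascadeClassifier(cascade_path)
    if detector.empty():
        raise IOError("could not load cascade: " + cascade_path)

    image = cv2.imread(image_path)
    if image is None:
        raise IOError("could not read image: " + image_path)

    # The detector operates on single-channel images.
    grey = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    boxes = detector.detectMultiScale(grey, scaleFactor=1.1, minNeighbors=5)
    return to_corners(boxes)
```

Keeping detection behind a single function like this makes it easier to swap in a modified detector later and compare the two on the same test images.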
Figure 1. A demonstration from the Viola and Jones paper[4]
3. Submission
The submission takes the form of a conference paper,
and your code.
3.1. Paper submission
You need to write a paper such as might be submitted to
a conference, and specifically a paper such as might be submitted to CVPR[3], which is one of the best conferences in
Computer Vision. The paper must be in the CVPR format,
and submitted as a pdf document. All the information about
the CVPR paper format is available on their web site[3].
The paper must be all your own work, with no text copied
from any other document. I do, however, suggest that you
study Viola and Jones’ paper[4], and some of the papers
that have followed it, in order to see how such papers are
written, and some of the methods used to assess the performance of these methods. You can find the papers that have
cited Viola and Jones’ paper using Google Scholar.
The purpose of the paper is to demonstrate that you understand the problem, and the solution. This means that
your submission should have sections which broadly cover
the following:
• An introduction, which describes the problem
• A background section which describes competing approaches to the problem. Achieving this requires that
you understand what the competing approaches do,
how they do it, their advantages and shortcomings, and
how they compare to the current approach. The methods you compare against here may well perform better
than the method you are describing. The idea of this
section is not that you show that yours is necessarily
the best method available, but rather that you show that
you understand enough about the literature in the area
to be able to put it in context.
• A description of your hypothesis. This will typically
require explaining some part of the algorithm in detail,
and providing examples illustrating its effects and deficiencies. If you propose an improvement then you
should describe how your method works, in enough
detail that a reasonably skilled person would be able to
implement it.
• Experimental Analysis. Describe the tests you have
run, and your motivation for having run them. Report the results of the tests and the conclusions that
you have drawn. Again, the goal is not to show that
your method outperforms all comparators, but rather
that you understand what your method aims to achieve,
and can devise, execute, and report upon, a set of tests
which demonstrate whether it does so. If you have improved upon the base method then you have an opportunity here to show that your improvement is well motivated, and possibly even that it works.
• Conclusion. Demonstrate that you have learned something worthwhile from the process, possibly including
ideas about what you might do to improve the method
you are reporting on.
If you can come up with an interesting application, or
evaluation of the method, then so much the better.
The paper you submit must be in the format specified
for CVPR 2017 in the author
instructions[3]. The easiest way to achieve this is to download the LaTeX template and use that. You can use some other
means if you really want to, but your paper needs to conform to the CVPR style specification. The only exception
is that I don’t mind if you use A4 paper rather than their
preference for letter paper (it’s a US conference).
3.2. Code
You will also need to submit your code through the svn
server, but I will not be marking the quality of your code,
only checking that it shows enough evidence that you wrote
it yourself.
4. Assessment
Your submission will be assessed primarily in terms of
how well it demonstrates that you
• understand the problem,
• understand Viola and Jones’ solution, and where it sits
with respect to the relevant literature,
• have devised a reasonable, interesting, and testable hypothesis,
• can design and implement a suitable set of tests which
will uncover whether the solution is successful,
• can interpret the results of these tests, and draw a sensible conclusion.
5. Conclusion
If you have questions, please ask them on the forum, as
they are likely to be of interest to others also. It is important to check the forum for the latest information about the
assignment too.
References
[1] OpenCV face detection using python. http://docs.opencv.org/trunk/doc/py_tutorials/py_objdetect/py_face_detection/py_face_detection.html.
[2] G. Bradski. The OpenCV library. Dr. Dobb’s Journal of Software Tools, 2000.
[3] CVPR. IEEE computer society conference on computer vision
and pattern recognition. See http://www.pamitc.org/cvpr16/.
[4] P. Viola and M. Jones. Rapid object detection using a boosted
cascade of simple features. In Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001
IEEE Computer Society Conference on, volume 1, pages I–511. IEEE, 2001.