In this exercise we will make an image classifier, using a Multiple Instance
Learning approach. To keep it simple, we will use a relatively tiny dataset,
with simple features and a simple classifier.
We will try to distinguish images of apples from bananas. Below are two
examples of images, the first containing an apple, the second containing a
banana. Notice that the backgrounds in the images are quite similar.
To make an image classifier based on MIL, we have to make several steps:
1. Define the instances that constitute a bag. Here we will make use of a
mean-shift image segmentation to segment an image into subparts.
2. Define the features that characterize each instance. Here we will just
use the (average) red, green and blue color.
3. Define the MIL classifier. Here we will use a naive approach, that is
using a standard classifier trained on the individual instances.
4. Define the combination rule that combines the predicted labels on the
individual instances to a predicted label for a bag.
In the coming exercise we will go though it step by step.
Note: for the mean-shift algorithm and the Liknon classifier (also called,
the L1 support vector classifier), I supplied Matlab functions. If you want to
use another programming language, feel free, but you have to find (or make)
your own implementations.
1 The Naive MIL classifier
(a) Get the data sival apple banana.zip and some additional Matlab
functions additionalcode.zip from Blackboard. The data should
contain two folders, one with apple, the other with banana. The additionalcode contains a function to segment an image im meanshift,
and a function to convert a cell-array of bags to a Prtools dataset
(b) Implement a script that reads all images from a given directory. You
can use the Matlab functions dir and imread.
(c) Next implement a function extractinstances that segments an image
using the Mean Shift algorithm (using im meanshift), computes the
average red, green and blue color per segment, and returns the resulting
features in a small data matrix.
Notice that the number of segments that you obtain depends on a
width-parameter that you have to supply to im meanshift. Set this
parameter such that the background and the foreground are segmented
well in the first few images. What value of the width parameter did
(d) Create a function gendatmilsival that creates a MIL dataset, by going through all apple and banana-images, extracting the instances per
image, and storing them in a Prtools dataset with bags2dataset. Make
sure you give the apple and banana objects a different class label (for
instance label 1 and 2). The resulting dataset should contain all instances of all images, and the label of each of the instances is copied
from the bag label. Additionally, the bag identifiers are stored in the
dataset. If you are interested, you can retrieve them using bagid =
How many bags did you obtain? How many features do the instances
have? How many instances are there per bag? Make a scatterplot to
see if the instances from the two classes are a bit separable.
(e) Create a function combineinstlabels that accepts a list of labels, and
outputs a single label obtained by majority voting.
(f) Now we are almost ready to classify images… First we have to train
a classifier; let’s use a Fisher classifier for this. Now apply the trained
classifier to each instance in a bag, classify the instances (using labeld),
and combine the label outputs (using your combineinstlabels) to get
a bag label.
How many apple images are misclassified to be banana? And vice
versa? Why is this error estimate not trustworthy? Estimate the classification error in a trustworthy way!
(g) Invent at least two ways in which you may improve the performance of
this classifier (think of how you obtained your MIL dataset).
The classifier that we used in the previous section was very simple. In this
section we implement one of the most successful classifiers, called MILES.
Also have a look at the article ”MILES: Multiple-instance learning via embedded instance selection.” by Chen, Yixin, Jinbo Bi, and James Ze Wang,
IEEE Transactions on Pattern Analysis and Machine Intelligence, (2006):
(a) Implement a function bagembed that represents a bag of instances Bi
by a feature vector m(Bi), using equation (7) from the article.
How large will this feature vector m(Bi) become for our apple-banana
(b) Make a Prtools dataset with the vectors m(Bi) and their corresponding
. Choose a sensible value for σ such that not all numbers
become 0 or 11
. Train on this large dataset a L1-support vector classifier
(or, more correctly called, LIKNON): liknonc.
(c) Test the LIKNON classifier on this dataset: how many errors is this
classifier making? Is this classifier better than the naive MIL classifier
trained in the previous section? What can you do to make this MILES
classifier perform better?
3 Another MIL classifier
(a) Finally, implement your own MIL classifier. Any classifier may do,
except for the Naive approach and the MILES classifier (obviously).
It may be something you invented yourself, or some classifier from
Explain what MIL classifier you are implementing, give the code, and
compare its performance with that of the Naive classifier (i.e. the Fisher
classifier with a majority vote), and of the MILES classifier.
1For me, σ = 25 appeared to work reasonably well.