Interview with Professor Irving Biederman
Professor Irving Biederman is a Professor of Neuroscience at the
University of Southern California. His work deals with determining how
the human brain interprets the information it receives, or how we "get
mind from brain," as he likes to put it. Further information on Professor
Biederman and his work can be found at the website for the
Image Understanding Lab.
E: To start out with, what are your day-to-day responsibilities?
Do you spend most of your time teaching, or do you spend
most of your time doing actual research?
B: Probably more of my time is spent doing research, or thinking
about research -- and a distressingly large percentage of
my time is involved in administrative activities.
E: Your work deals with object recognition?
B: Shape recognition in general, but objects, seeing, and
face recognition in particular.
E: When did you first get started in doing work with images?
B: When [my brother] was eleven, he got me a birthday present
-- and I was eight, he's three years older than me, my brother
-- a subscription to Scientific American, which he really
wanted for himself. So we would fight when it would come,
as to who would read it first. But I got interested in the
articles -- some of it I couldn't understand -- in the articles
on perception where you could try out some of the illusions
and some of the effects on yourself, and so it was an interest
I just retained. I majored in psychology as an undergraduate,
and it was just the thought of trying to understand how
you got mind from brain seemed like such a delicious problem
that it was easy to follow that. I've been fortunate that things
have gone well.
E: I was reading through some of the stuff on your website,
and it was talking about geons, and talking about edge perception
to recognize objects...
B: Yeah, so if you think about what you could do with a good
line drawing of an object, which is, in fact, you could
recognize it very well. This line drawing simply captures
the edges in an image that correspond to discontinuity --
the sharp differences in depth -- so that if you look at
my arm here, there's a jump at this -- if you're drawing
it you draw an edge over here -- there's a sharp jump in
depth from where your viewpoint grazes this tangent surface
of my arm to the background. You also have one other type
of discontinuity, it's a discontinuity in surface orientation,
so here we're at one point over here, and it's not like
there's a big jump in depth but a big jump in orientation.
So those are the only two types of edges that are really
important. We have other types of edges that could be coloring,
or texture on the surface, or luminance hotspots -- so if
you shine a flashlight or a spotlight on the object you
get a hotspot -- but the brain seems to know what is the
actual structure of the object in terms of the orientation
and depth discontinuities and what is just surface variation.
Anyway, from those two types of edges you can activate these
geons which are shape primitives, much the way phonemes
are speech primitives for speech recognition. So we have
these shape primitives like cylinders and bricks and curved
cylinders and wedges and so on -- about fifty of those can
account for virtually all aspects of object recognition
other than the surface differences like color and texture.
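[Ed. - The two edge types described above can be illustrated with a small
sketch. This is an editorial illustration only, not Professor Biederman's
model or code. It assumes a single scan line of depth values, and the
function name classify_edges and both thresholds are hypothetical. A depth
discontinuity is flagged where neighboring depths jump sharply, and an
orientation discontinuity where the slope changes sharply while the depth
itself stays continuous.]

```python
def classify_edges(depth, jump_thresh=3.0, crease_thresh=1.5):
    """Label edges along one scan line of depth values.

    Returns (depth_edges, orientation_edges):
      depth_edges       -- indices i where depth jumps between i and i + 1
      orientation_edges -- indices i where the slope changes sharply at i
                           although the surface is continuous there
    The thresholds are illustrative assumptions, not measured values.
    """
    # Change in depth between each pair of neighboring samples.
    steps = [b - a for a, b in zip(depth, depth[1:])]

    # Depth discontinuity: a large jump between neighbors.
    depth_edges = [i for i, s in enumerate(steps) if abs(s) > jump_thresh]
    jumps = set(depth_edges)

    # Orientation discontinuity: a large change in slope at sample i,
    # ignoring samples that sit next to a depth jump.
    orientation_edges = []
    for i in range(1, len(depth) - 1):
        if (i - 1) in jumps or i in jumps:
            continue
        if abs(steps[i] - steps[i - 1]) > crease_thresh:
            orientation_edges.append(i)

    return depth_edges, orientation_edges


# Background at depth 10 with a nearer, V-shaped surface in front of it.
scan_line = [10, 10, 10, 5, 4, 3, 2, 3, 4, 5, 10, 10]
depth_edges, orientation_edges = classify_edges(scan_line)
print("depth discontinuities after indices:", depth_edges)
print("orientation discontinuities at indices:", orientation_edges)
```

[Ed. - In this toy scan line the jumps at either side of the near surface
play the role of the outline of the arm against the background, and the
crease in the middle plays the role of a change in surface orientation.
Color, texture, and lighting do not enter the computation at all.]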
E: Now, this work, are you approaching it in sort of a theoretical
science approach, or are you looking to it for a specific
technological end? Are you looking for an implementation
or just pursuing it for the science?
B: To understand how we get mind from brain is so interesting
that, yes, that's my passion. We're pursuing it on many
different levels, that is: a behavioral level so we do experiments
on psychophysics, we do what are called single neuron recordings
in monkeys so as they are looking at different types of
images designed to test aspects of the theory, we record
from these neurons, and have now a pretty good idea for
how the brain codes images. We also do some work with patients,
individuals who have specific types of lesions -- so for
example we're studying an individual now who is a prosopagnosic,
that means he cannot recognize faces though he has no trouble
recognizing objects -- and then we're starting some experiments
in functional imaging, so using fMRI we get differences
in the signal depending on the type of perceptual or cognitive
task someone is doing, and from that we can infer how the
brain is doing these various types of tasks or processing
the information in different ways. So we have essentially
a multi-faceted attack on this problem, but the great motivation
comes from simply understanding it, though there are enormous
applications to it if we could solve the problems.
E: Right. There are a ton of applications that could benefit
from computers actually being able to recognize what it
is they're seeing.
B: Exactly. For face recognition, actually, there's a system
developed here at USC that won a national competition for
best face recognizer. It's done by a colleague, Christoph
von der Malsburg, and we've been using it to see whether the
theory upon which that was based is a good theory of human
face recognition, and it is, at least to our best accounting.
And so an application of that, which is already
being used, would be to detect potential undesirables at
points of entry to the US.
E: I know there was a system like that used in Tampa Bay
at the Super Bowl, do you know if that was...
B: I think that might have been another system, a competing
one but somewhat similar.
E: Also, I was wondering -- and this is sort of irrelevant
to the interview, but I was curious -- once you know how
the brain stores an image, is there a chance that will be
a good way to not only recognize images, but reconstruct
them? For instance, instead of, say, bitmap image formats,
could we store edges and such and be able to
recreate an image from that?
B: That's an interesting and very sophisticated question.
It turns out that our memory is not nearly as detailed as
we think subjectively. When people imagine something, it
turns out there's much less detail there than they think.
Let me give you an example, imagine the last party you were
at. Can you get a picture of the scene there and some people
there? Pick out a single person and get an image of that
person. What color shirt are they wearing?
E: I have no idea.
B: It's interesting, until that time the person probably
wasn't bare-chested, nor was... It's not like if you don't
get the shirt it's in tatters, or all ripped up the way
you might think of a photograph being randomly mutilated
-- it's gone completely. Now you might have remembered the
shirt or blouse the person was wearing, but if you don't
it's gone completely. That indicates -- that and of course
there are a lot of experiments, but just a subjective experience
indicates -- that to a large extent our memories are categorized.
This is quite unlike a bitmap. If we had a bitmap we could
expect, yeah, we'll lose some of the pixels, but we'll see
a lot of it, so it'll be like we took some of the pixels
off the shirt, but we'll be able to see some of them and
know it's red, for example, but instead we lose it all.
So that suggests that we're not really coding the information
in enough detail to reconstruct the image, even though subjectively
we might think we have a complete image of the event or
the object. So we code only some of it and not all.
E: So it's like the brain is storing a framework of the image,
and then selectively filling in detail.
B: Or maybe when you image it, it may fill in detail in terms
of what it knows about the world, so it assumes certain
things. So I can't see your feet now, but I assume that
you have them, and that they're resting on the floor. It's
the same way that you can't see the bottom of my chair,
but you assume it's resting on the floor and not floating
in midair or something like that. There are some assumptions
we make, and often they're accurate, but we don't seem to
have detailed images, so if we played back what we do remember,
we wouldn't be able to reconstruct the scene. Instead we
seem to remember what's important, and the rest we don't
code. And we could fill in... Imagine, for example, that
person was wearing a green shirt that's at the party that
you were imaging, or a red shirt, and you could do that.
Knowing now that you're just kind of adding to what you
know, you could do it, but it's not really in your memory.
E: Switching gears a little, how would you define science?
B: Well actually I don't. There are enormous numbers of possible
definitions, and we don't have one kind of formal one, but
the ones that are reasonable, I think, extend from "science
is what scientists do" to "the
process by which you try to make inferences about the world
under controlled observations using rational analysis and
inferential machinery." So that's
probably the best one to describe what it is that scientists
are doing.
E: What do you consider to be the difference between science
and technology?
B: Well, actually, often they blur. Technology, though, is
often concerned with meeting some practical end and not
so concerned with the principles on which, let's say,
that product, or whatever the device is, has been made.
Science is devoted to understanding just what it is that
could have led to the phenomenon in question. Often science
is experimental in a sense that you're comparing under controlled
conditions, and technology for the most part takes what's
known and tries to apply it without necessarily getting
new knowledge. Though, there is the kind of practical knowledge
you get when you're trying to build something -- what works,
what doesn't work -- so these become kind of informal experiments
that engineers at companies are doing as they get experience.
One of the things that is often done if a design is unsuccessful
-- like a bridge collapse -- is there's a failure analysis
and engineers try to understand why the bridge failed, what
happened, and that way they're learning. So they have an
experiment, maybe, that wasn't meant to be an experiment,
but doing it there's often intense scrutiny and analysis
about what it was, just trying to understand why it didn't
work, and then you make it better. You saw that when the
Challenger blew up, that there was this intense scrutiny
of the O-Rings, and everything else, and they learned from
that.
E: What is the process you use in doing science? How does
science as you do it relate to, for instance, the scientific
process they teach in high schools -- the sort of problem,
research, hypothesis?
B: Well, let me start with the second as you specified it
in more detail. Actually, the way it goes is -- who is it,
that science fiction writer, I'm blanking on his name. He
said, "Most often, great scientific discoveries are
preceded not by 'Eureka!', but by 'Gee, that's funny.'"
[Ed. - Quote is by Isaac Asimov.] You get something, just
a "what about that?", a kind of pique
of curiosity. Or it may be you're listening to something,
or thinking about something, and you wonder, "What
happens if you did this and this,"
and it starts with that, an intuition. And then you set
up the machinery they describe in junior high schools, so
you then set it up to test in a rigorous way, but the real
action comes in many respects from that first insight, that
creativity. And often then you know when you're right, and
you use the rest of it to confirm it. Often
the ideas are too elegant not to be right; Einstein said
something to that effect, that the universe is going for
beauty essentially. So that's often how it proceeds. It's
not that you have the whole field at your hand and say,
"Oh, this needs to be proved here. Let me do it this way,"
as if you had some machine that could generate hypotheses.
There are an infinite number of things it could generate,
only some of which are interesting. There is this highly
subjective creative component that often precedes what you
think of as the inferential machinery, though mastery of
the field is really vital, so you don't get people doing
great science on the get-go. You get people who are immersed
in the field and have the expertise, and then you see what
needs to be done.
