Here are some pointers to relevant resources regarding Seam Carving, in addition to the class lecture.
The original paper (http://doi.acm.org/10.1145/1276377.1276390) gives
the description of the dynamic programming approach and the particular
energies used.
The paper on video seam carving
(http://doi.acm.org/10.1145/1360612.1360615) has the derivation of the
graph cut version of finding optimal seams as well as the description
of the forward energy.
The project pages by Ariel Shamir at
http://www.faculty.idc.ac.il/arik/SCWeb/imret/index.html and
http://www.faculty.idc.ac.il/arik/SCWeb/vidret/index.html are also
worth checking out (they include nice image comparisons, for example).
In the first homework assignment (due Jan. 14 at 8pm PST) we are
asking you to implement seam carving. Specifically, your program
should:
- read an image file (either use some library that does this in some
transparent way or limit yourself to some easy [standard] file
format)
- use the dynamic programming approach to implement vertical seams
only.
- use the e1 energy in both its backward and forward versions (have
some way of choosing which one so that you can compare). To deal with
color, treat RGB as a 3-vector and interpret |\partial_x I| as the l_2
norm of a 3-vector.
- in a preprocessing step, compute all seams needed to support a reduction by 0-50% of
image width. Use smart numbering of pixels (as described in the
paper [or maybe something better you come up with yourself?]) to
support fast interactive window resizing in the x direction after
preprocessing is finished. Try to be efficient about the
preprocessing, but don't waste time getting super tight in your
code. This is not a speed competition, though having to wait 4
minutes each time gets to be a drag...
- also support expansion by up to 50% using the "run backwards and
interpolate" idea described in the paper.
- this rescaling should be accomplished interactively by resizing the
window in the x direction.
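
As a starting point for discussion, the backward e1 energy, the dynamic
programming search for a vertical seam, and one possible "smart numbering"
of pixels can be sketched as follows. This is a minimal sketch, not a
reference solution. Everything named here is hypothetical: the function
names, the image representation (a list of rows of (r, g, b) tuples), the
unscaled central differences with clamped borders, and the convention that
index 0 means "never removed". Energies are recomputed from scratch after
every seam, which is simple but slow; the forward energy is not covered.

```python
import math

def e1_energy(img):
    """Backward e1 energy |dI/dx| + |dI/dy|, each term an l_2 norm over RGB."""
    h, w = len(img), len(img[0])

    def diff_norm(p, q):
        # l_2 norm of the difference of two RGB 3-vectors
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    energy = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Assumption: unscaled central differences, clamped at the borders.
            xl, xr = max(x - 1, 0), min(x + 1, w - 1)
            yu, yd = max(y - 1, 0), min(y + 1, h - 1)
            energy[y][x] = (diff_norm(img[y][xr], img[y][xl])
                            + diff_norm(img[yd][x], img[yu][x]))
    return energy

def find_vertical_seam(energy):
    """Return, for each row, the column of a minimum-cost 8-connected seam."""
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest entry in the last row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [0] * h
    seam[h - 1] = x
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam[y - 1] = x
    return seam

def seam_index_map(img, num_seams):
    """Preprocess: order[y][x] = k if pixel (x, y) is removed by the k-th seam."""
    h, w = len(img), len(img[0])
    order = [[0] * w for _ in range(h)]        # 0 = never removed
    cols = [list(range(w)) for _ in range(h)]  # original x of surviving pixels
    cur = [row[:] for row in img]              # working copy that shrinks
    for k in range(1, num_seams + 1):
        seam = find_vertical_seam(e1_energy(cur))
        for y, x in enumerate(seam):
            order[y][cols[y][x]] = k
            del cols[y][x]
            del cur[y][x]
    return order

def render(img, order, removed):
    """Image with the first `removed` seams taken out, using only the index map."""
    return [[p for p, k in zip(row, krow) if k == 0 or k > removed]
            for row, krow in zip(img, order)]
```

With the index map computed once, a window resize to a width of w - k only
needs render(img, order, k), which is a single pass over the pixels and
involves no further seam searches.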
Extra bonus points:
- allow the user to "draw" onto the image to indicate that some region
should have very low energy (to support something like the object
removal trick shown in the paper).
The bonus is 30%. Think carefully whether it's worth your time (if you
already have bits of code to get mouse input to draw some region, say
with a big disk, then do it; if you don't, you may well be spending
too much time just dealing with user interface issues to make this
worth your time...). There'll be other bonus assignments, so no need to
grab this one...
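
For those who do attempt the bonus, the energy side of it is small compared
to the user interface work. A minimal sketch, assuming the user's strokes
are collected into a boolean mask of the same size as the image: the names
paint_disk and apply_mask, the disk brush, and the value -1e6 are all
hypothetical choices, not something prescribed by the paper.

```python
def paint_disk(mask, cx, cy, radius):
    """Mark all pixels within `radius` of (cx, cy) in a 2D boolean mask."""
    h, w = len(mask), len(mask[0])
    for y in range(max(cy - radius, 0), min(cy + radius + 1, h)):
        for x in range(max(cx - radius, 0), min(cx + radius + 1, w)):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                mask[y][x] = True

def apply_mask(energy, mask, low=-1e6):
    """Return a copy of `energy` with masked pixels forced to a very low value."""
    return [[low if m else e for e, m in zip(erow, mrow)]
            for erow, mrow in zip(energy, mask)]
```

Seams found on the masked energy are pulled through the marked region. Note
that the mask has to shrink together with the image: each removed seam must
also be removed from the mask, or the marks drift away from the pixels they
were drawn on.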
Collaboration policy:
- when trying to understand the content of the papers, please work
together; help each other out!
- stop short of writing code on the whiteboard; otherwise it is too
easy for two people to end up handing in what amounts to the same
code. That would not be good.
In summary:
- discussion (and generation!) of ideas together is GREAT!
Implementation details everyone has to sweat on their own.
For this particular assignment only:
- if you have a small snippet of code that uses some image library to
open an image file (say, jpeg or tiff) and display it in a window,
share this with the rest of the class. Nobody should waste their
time banging their head against the wall on that part.
What you need to hand in (to Julian at "panetta at caltech.edu" by 8pm PST on Jan. 14):
- a pointer to a simple web page showing the results of running your
program on 3 different images. Consider using images from the paper,
which you can download from the above web page pointers. Show a
comparison of forward and backward energy. Show at least one example
where the image content is such that things don't work very well.
- have a link on that page to your code and executable (separate
links). If you use an OS other than Windows XP to
make your executable, tell me where in the building I can run your
code. If it's something exotic, talk to me ahead of time so we can
address the "where do I run this program" issue.
Needless to say, I will also need to know how to invoke your
program (command line options? user interface? mouse click?
keyboard?). This should be documented on the above web page.
For this first assignment, any questions should be directed to Julian.