Thursday, April 20, 2006

Full SubHistogram and Overlay Data Files

I've posted the full data files (45-88MB each) here: ftp://ftp.cs.umd.edu/pub/farrell/ .

The INDEO_BW21_crouch video was too large to do all at once and I have not yet run it in two parts. I'm not certain that there will be a way to ever get it into one array, but I think we can automate the loading of one file (first half of frames), clearing that and then loading the second half.

Thursday, April 13, 2006

The data that we currently have

* Local(1) - 721MB INDEO (8:15+2:30 sticks) -
* Local(3) - 154MB INDEO (1:45+1min sticks) - SubRegionHists: INDEO_BW40_crouch.mat
* Local(4) - 133MB INDEO (1:55+20sec sticks) - SubRegionHists: INDEO_BW48_crouch.mat
* Local(5) - ??? INDEO
* HD(3) - 845MB DV (3:11+shift) - SubRegionHists: CC_04223.mat
* HD(5) - 950MB DV (4:17) - SubRegionHists: CC_02650.mat

Wednesday, April 12, 2006

Subregion Histograms


Here are a few subregion histograms to look at from the video I'd been working with. I was trying to show some representative cases (top no bird, bottom with bird), but the top right one actually has some bird around frame 400 (oops). The key thing though is the left ones are dark background, the right ones are the straw background.

I need to look at a few more and look again at these more closely, but I figured I'd post them so you could look too. The code to generate them is easy if you've already gotten the sub-region hist structure, in my case simply:

mesh(reshape(subHists(SUBREGROW,SUBREGCOL,:,:),[NIMGS NBINS]));

Saturday, April 08, 2006

Cluster Distance Metrics


Here are a few different cluster distance functions:

The first one is the Euclidean distance between the two centers (in 64-dims).

The second and third columns each take a mean subregion image over all samples from each of the two clusters. The second column determines distance as the absolute value of the summed difference over all 1600 pixels between these means. The lower image in the column is the upper one with each pixel raised to the 0.25 power to see what's in the dark better (perhaps log would've been better, but anyway). The third column is the SSD between these two mean cluster images, and the image below it is the sqrt of each pixel in the upper one, again to see the low distances better.
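For reference, here is a minimal sketch of the three distances as described above. It is not the code that produced the plots. The variable names and the stand-in data are assumptions: two 64-dim cluster centers and two 40x40 mean subregion images (1600 pixels each).

```matlab
% Stand-in data (assumptions): two 64-dim cluster centers and two
% 40x40 mean subregion images (1600 pixels each).
c1 = linspace(0, 1, 64);
c2 = linspace(0, 1, 64) + 0.5;
m1 = zeros(40, 40);
m2 = 0.1 * ones(40, 40);

% Column 1: Euclidean distance between the two cluster centers.
dEuclid = sqrt(sum((c1 - c2) .^ 2));

% Column 2: absolute value of the summed pixel difference between the means.
dAbsSum = abs(sum(m1(:) - m2(:)));

% Column 3: sum of squared differences (SSD) between the mean images.
dSSD = sum((m1(:) - m2(:)) .^ 2);
```

One thing to keep in mind when comparing the columns: in the summed-difference version, positive and negative pixel differences can cancel each other before the absolute value is taken, while in the SSD they cannot.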

Compare these to the variance plot posted previously.

Two videos of the Euclidean distance (threshold at 0.5 of greatest dist) and ImageDistSDD (threshold of 0.05 of greatest dist) can be found at segment.avi and segment2.avi respectively.

Thursday, April 06, 2006

Variance and Sub Regions

Here's a quick snapshot of what the variance looks like across all frames. This sequence has constant illumination (the image is actually 10x the variance, just to increase detail). It gives a good visualization of where the male spends most of his time in this particular courtship.
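A variance image like this can be computed roughly as follows. This is only a sketch: the `frames` array is random stand-in data, and clipping the scaled result to [0, 1] for display is an assumption rather than what was done for the image above.

```matlab
% Stand-in data (assumption): a small stack of grayscale frames, H x W x N.
frames = rand(12, 16, 50);

% Per-pixel mean and variance across the frame (third) dimension.
meanImg = mean(frames, 3);
varImg  = var(frames, 0, 3);

% Scale by 10 to bring out detail, then clip to [0, 1] for display
% (the clipping is an assumption).
dispImg = min(10 * varImg, 1);
```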


I also created a MATLAB routine that will calculate SubRegion Histograms. I tried to document it clearly, so hopefully it's easy to use/follow. You can find it here: SubRegionHist.m . I also generated 16-bin histograms for 96 sub-regions over my test sequence of 5942 frames. The mat file can be found here: SubHist96_16Bins.mat (~24MB, this file also contains the original mean and variance images). I don't have kmeans here at home (though I'm increasingly tempted to purchase the Stats Toolbox for home, but the $59 price tag knocks me back to reality) but one of us can try some clustering at school.
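To give a rough idea of what a sub-region histogram computation involves, here is a minimal sketch. It is not SubRegionHist.m itself. The 8 x 12 grid, the frame size, the [0, 1] intensity range, and the normalization of each histogram are all assumptions made for illustration.

```matlab
% Stand-in data (assumption): N grayscale frames with values in [0, 1].
frames = rand(80, 120, 5);          % H x W x N
nRows = 8; nCols = 12; nBins = 16;  % 8 x 12 = 96 sub-regions, 16 bins

[H, W, N] = size(frames);
rh = H / nRows;                     % sub-region height (assumes even division)
rw = W / nCols;                     % sub-region width  (assumes even division)

% Layout: (sub-region row, sub-region col, frame, bin).
subHists = zeros(nRows, nCols, N, nBins);
for k = 1:N
    for r = 1:nRows
        for c = 1:nCols
            block = frames((r-1)*rh + (1:rh), (c-1)*rw + (1:rw), k);
            % Map each intensity to a bin index in 1..nBins.
            idx = min(floor(block(:) * nBins) + 1, nBins);
            counts = accumarray(idx, 1, [nBins 1]);
            % Normalize so each histogram sums to 1 (an assumption).
            subHists(r, c, k, :) = counts / numel(block);
        end
    end
end
```

The result is indexed by sub-region row and column first, then frame, then bin, so the history of one sub-region over all frames can be pulled out as an N x nBins slice.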

Tuesday, April 04, 2006

SIFT?

A somewhat random idea, but I wonder if there might be a way to incorporate the ideas of SIFT keys in our tracking. For example if we have a "sift characterization" of each frame, the keys should change very little for frames in which there's no movement, even if illumination changes. When the bird moves, might the keys change dramatically in that region?

Thursday, March 30, 2006

The first Overall Picture

We talked about a lot of great ideas today. Below is an attempt to summarize them so that we'll be able to come back to them next week and continue where we left off.

When editing, use a different color...
  • Illumination Clustering - use kmeans to perform whole frame histogram based clustering as we already planned on. The purpose is to identify the cluster centers well. Those are going to be the bases for our background models.
  • For a full courtship sequence, there will be 2+ illumination models/clusters. For each of these models, we'll try to find a subset of frames where the bird is not stationary and use these to generate the background model for the state-of-the-art background subtraction. It is important that these representative frames have 2 properties:

    1. Bird not stationary

    2. Should be closer to the cluster centers so as to be representative.

    Note that these frames will very probably be highly non-contiguous (non-consecutive and from different subsequences in time from the same cluster).



  • Divide frame into regions and examine histogram for each given region over all of the frames. It will probably give us a cluster which is very tight indicating NO bird at all, along with a stream of outliers indicating varying proportions of the bird being in the sub-frame. This may allow us to determine when the bird is stationary and when it's moving. It may even at a coarse level tell us where the bird is.
  • When processing a given frame, use the background model for both (all) illumination clusters and weight the results based on the frame's similarity to each cluster center. This is part of the strategy for dealing with the transitions between illumination conditions.


  • Use LDA to discard irrelevant bins in histogram?

  • An important issue is that we also wish to use the temporal information for this purpose. We feel that we might be able to locate (and verify heuristically) a set of frames of constant illumination where our background subtraction has worked well. Then we know where the bird is. So in the next / previous frames (bidirectional) we can have a bounding box within which we can look for the bird. This will reduce false alarms and increase the speed a lot. Identification of VERY GOOD results can have a user interactive scheme but we must remember: Any kind of USER INTERVENTION = Stable results but lesser chances of good publication!! Something to consider later, but user intervention in this stage seems a good way to start.

  • Another idea to keep in the back of our minds is multiple model voting.

  • New idea: use more color information about the bird, as the bird never changes! So why not use its appearance from GOOD frames to refine bad ones? What if we do the whole thing .... then go back and retain portions (based on color) that are stable over larger video segments. That way we may get rid of shadows that are created on leaves and ground, as they will have tinges of green and brown respectively in some frames, and maybe we can remove those pixels.
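As a starting point for the weighting idea in the list above (run the background model for every illumination cluster and weight the results by the frame's similarity to each cluster center), here is a minimal sketch. Everything in it is an assumption: the variable names, the toy 3-bin histograms, the per-model foreground score maps, and in particular the choice of inverse Euclidean distance as the similarity weight.

```matlab
% Stand-in data (assumptions): 2 illumination cluster centers (whole-frame
% histograms), one frame histogram, and one foreground score map per model.
centers   = [0.7 0.2 0.1; 0.1 0.2 0.7];      % K x B, rows are cluster centers
frameHist = [0.6 0.3 0.1];                    % 1 x B histogram of this frame
fgScores  = cat(3, ones(4, 4), zeros(4, 4));  % H x W x K

% Distance from the frame to each center, turned into weights summing to 1.
K = size(centers, 1);
d = sqrt(sum((centers - repmat(frameHist, K, 1)) .^ 2, 2));  % K x 1
w = 1 ./ (d + eps);   % inverse-distance weighting (an assumption)
w = w / sum(w);

% Weighted vote: blend the K score maps using the similarity weights.
blended = zeros(size(fgScores, 1), size(fgScores, 2));
for k = 1:K
    blended = blended + w(k) * fgScores(:, :, k);
end
```

During a transition between illumination conditions the frame sits between the centers, so the weights move gradually from one model to the other instead of switching abruptly.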

More thoughts on Kmeans...

I think that we should try initializing with both the kmeans clusterings and also with the chi-squared cutoffs you were looking at and see how it works.

An idea for visualizing the clusters is to run PCA (or some form of LDA, not sure if it's already in MATLAB though) so that projecting them will spread out the clusters as much as possible. It's easy to visualize the 2 primary components (gscatter). Three would be more useful, but I haven't looked into it and don't know of a simple way to do it.
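One possible way to get a 3-component view without any toolbox is PCA via an SVD of the mean-centered data, followed by scatter3. This is only a sketch: `X` and `labels` are random stand-in data (two well-separated groups of 16-bin histograms), not our actual clusters.

```matlab
% Stand-in data (assumptions): N histograms with B bins, one label each.
X = [randn(30, 16); randn(30, 16) + 3];   % N x B
labels = [ones(30, 1); 2 * ones(30, 1)];  % N x 1 cluster labels

% PCA via SVD of the mean-centered data (no toolbox needed).
Xc = X - repmat(mean(X, 1), size(X, 1), 1);
[~, ~, V] = svd(Xc, 'econ');
proj = Xc * V(:, 1:3);                    % N x 3 scores on the top 3 components

% 3-D scatter colored by cluster label; rotate the axes to inspect.
scatter3(proj(:, 1), proj(:, 2), proj(:, 3), 20, labels, 'filled');
```

Note that PCA picks the directions of greatest overall variance, which is not necessarily the projection that best separates the clusters. That would be the job of LDA.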

I think that it's probably best to re-use the code you have, as it's probably far more sophisticated in terms of its theoretical basis and it performs very well on (relatively) constant illumination. At the same time, however, I'm still interested in pursuing some form of voting amongst background models, perhaps weighted by the similarity of each model to the given frame. Illumination aside, this approach was very effective for dealing with the bird standing still for long periods of time (so long as not too many backgrounds were drawn from that stationary period). It is very different from the robust approach in the code you have, and it has both advantages and disadvantages.

A final thought for the moment is that it may be effective to do multiple passes. Try to figure out the various illuminations, then given this try to figure out if there's a "bird size object" moving in the range of frames (i.e. figure out if it's not stationary) and use this to refine our selection of what to use as background.