I think that we should try initializing with both the kmeans clusterings and also with the chi-squared cutoffs you were looking at and see how it works.
An idea for visualizing the clusters is to run PCA (or some form of LDA, not sure if it's already in MATLAB though) so that projecting them will spread out the clusters as much as possible. It's easy to visualize the 2 primary components (gscatter), but 3 would be more useful, but I haven't looked into know of a simple way to do it.
I think that it's probably best to re-use the code you have as it's probably far more sophisticated in terms of its theoretical basis and it performs very well on (relatively) constant illumination. At the same time however I'm still interested in pursuing some form of voting amongst background models, perhaps weighted by the similarity of each model to the given frame. Illumination aside, this approach was very effective for dealing with the bird standing still for long periods of time (so long as not to many backgrounds were drawn from that stationary period). It is a very differenent from the robust approach in the code you have, it has both advantages and disadvantages.
A final thought for the moment is that it may be effective to do multiple passes. Try to figure out the various illuminations, then given this try to figure out if there's a "bird size object" moving in the range of frames (i.e. figure out if it's
not stationary) and use this to refine our selection of what to use as background.