Lots of artistic stuff
Video playback requires a wholly different underlying framework inside a player. Not only do you have to redesign the UI to accommodate video, you also have to either support a decoder framework (DirectShow or WMF, usually the former or both) or bundle a decoder like ffdshow with the application itself. The first option means writing a whole new code component, with its own painful debugging period. The second option would mean Steven would have to pay royalties to groups like MPEG-LA in order to legally incorporate decoders into MusicBee. It's the same reason the AAC decoder is not bundled with the application.
Taking into account that MusicBee is written in VB.NET, and that it uses a premade decoder library to decode and render audio to the sound system, it would seem that Steven does not have the ability or the time to get down and dirty doing low-level stuff. And I don't blame him. Incorporating a decoder framework like DirectShow would mean 10x the work needed for the sound functions alone: reliably loading filters, building filter graphs and connecting their pins properly, and rendering video, all while doing the extra optimization needed for fluid video playback.
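To give a rough idea of what "building a graph and connecting pins" involves, here is a toy sketch in Python. It is not the DirectShow API; the `Filter` class, `connect`, `build_graph`, `describe` and the media type strings are all made up for illustration. The only point is that every filter has to agree with its neighbour on a media type, and a real player has to handle every way that negotiation can fail.

```python
# Toy model of a filter graph. NOT the DirectShow API: every name here is
# hypothetical and only illustrates the idea of pin/media-type matching.

class Filter:
    def __init__(self, name, accepts, produces):
        self.name = name
        self.accepts = accepts    # media type on the input pin (None for a source)
        self.produces = produces  # media type on the output pin (None for a renderer)
        self.downstream = None    # filter connected to our output pin, if any


def connect(upstream, downstream):
    """Connect an output pin to an input pin; fail if the media types differ."""
    if upstream.produces != downstream.accepts:
        raise ValueError(
            f"cannot connect {upstream.name} -> {downstream.name}: "
            f"{upstream.produces!r} != {downstream.accepts!r}"
        )
    upstream.downstream = downstream


def build_graph(filters):
    """Connect the filters in order and return the first one (the source)."""
    for up, down in zip(filters, filters[1:]):
        connect(up, down)
    return filters[0]


def describe(source):
    """Walk the chain from the source and list the filter names in order."""
    names = []
    node = source
    while node is not None:
        names.append(node.name)
        node = node.downstream
    return " -> ".join(names)


if __name__ == "__main__":
    chain = [
        Filter("file source", None, "h264"),
        Filter("video decoder", "h264", "raw frames"),
        Filter("video renderer", "raw frames", None),
    ]
    print(describe(build_graph(chain)))
```

Leave out the decoder and `connect` raises, which is the toy version of "no suitable filter installed". The real thing also has to deal with audio/video sync, seeking, multiple streams and whatever third-party filters the user happens to have installed.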
And since the application is mostly a "UI" around the BASS libraries and other linked tools (mp3gain, decoder etc.), written in a managed language (where, no matter how you optimize your code, it will never be that fast), what you're basically asking for is a total from-the-ground-up rewrite in a different language. You're basically asking for another program.
Sorry, but I don't see it happening.
/Lots of technical stuff