Online Learning and Robust Visual Tracking using Local Features and Global Appearances of Video Objects