Submitted by Niels Rogge. VidEoMT: Your ViT is Secretly Also a Video Segmentation Model. Mobile Perception Systems Lab.