Seeing the Pose in the Pixels: Learning Pose-Aware Representations in Vision Transformers

Two approaches for training Vision Transformers (ViTs) to learn pose-aware representations from Activities of Daily Living (ADL) videos, enabling fine-grained and viewpoint-agnostic visual perception.

Link: https://github.com/dominickrei/PoseAwareVT