New technologies for manipulating and recording the nervous system allow us to perform unprecedented experiments. However, the influence of our experimental manipulations on psychological processes must be inferred from their effects on behavior, and quantifying behavior has become the bottleneck for large-scale, high-throughput experiments. The method presented here addresses this issue by using deep learning algorithms for video-based animal tracking. We describe a reliable, automatic method for tracking head position and orientation from simple video recordings of the common marmoset (Callithrix jacchus). This method for measuring marmoset behavior allows for the indirect estimation of gaze within foveal error and can easily be adapted to a wide variety of similar tasks in biomedical research. In particular, it has great potential for the simultaneous tracking of multiple marmosets to quantify social behaviors.