Michael Stettler, M.Sc.
Department of Cognitive Neurology
Hertie Institute for Clinical Brain Research
Centre for Integrative Neuroscience
University Clinic Tübingen
Otfried-Müller-Str. 25
72076 Tübingen, Germany

AIMS: Humans recognize social interactions and intentions from videos of moving abstract stimuli, including simple geometric figures (Heider & Simmel, 1944). The neural machinery supporting this perception of social interactions remains largely unknown. Here, we present a physiologically plausible neural model of social interaction recognition that identifies social interactions in videos of simple geometric figures and of fully articulating animal avatars moving in naturalistic environments.

METHODS: We generated the trajectories of both the geometric and the animal avatars with an algorithm based on a dynamical model of human navigation (Hovaidi-Ardestani et al., 2018; Warren, 2006); a minimal sketch of such steering dynamics is given below. Our neural recognition model combines a deep neural network realizing a shape-recognition pathway (VGG16) with a top-level neural network that integrates radial basis function (RBF) units, motion energy detectors, and dynamic neural fields (see the second sketch below). The model implements robust tracking of the interacting agents based on interaction-specific visual features (relative position, speed, acceleration, and orientation).

RESULTS: A simple neural classifier, trained to predict social interaction categories from the features extracted by our recognition model, makes predictions that resemble those observed in previous psychophysical experiments on social interaction recognition from abstract (Salatiello et al., 2021) and naturalistic videos.

CONCLUSION: The model demonstrates that the recognition of social interactions can be achieved by simple, physiologically plausible neural mechanisms, and it makes testable predictions about single-cell and population activity patterns in the relevant brain areas.

Acknowledgments: ERC 2019-SyG-RELEVANCE-856495, HFSP RGP0036/2016, BMBF FKZ 01GQ1704, SSTeP-KiZ BMG: ZMWI1-2520DAT700, and NVIDIA Corporation.
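The following is a minimal sketch of goal-directed steering dynamics in the spirit of Warren (2006), of the kind used above to generate agent trajectories. The function name, parameter values, and the sine wrapping of the angle difference are illustrative assumptions, not the published algorithm of Hovaidi-Ardestani et al. (2018): the agent's heading is attracted to the egocentric goal direction while angular velocity is damped.

```python
import numpy as np

def simulate_agent(goal, pos, heading, speed=1.0, b=3.25, k_g=7.5,
                   dt=0.01, steps=1000):
    """Integrate second-order heading dynamics toward a goal.

    phi_ddot = -b * phi_dot - k_g * sin(phi - psi_g), where psi_g is the
    egocentric direction of the goal (sin keeps the angle difference wrapped).
    """
    phi, phi_dot = heading, 0.0
    trajectory = []
    for _ in range(steps):
        psi_g = np.arctan2(goal[1] - pos[1], goal[0] - pos[0])
        phi_ddot = -b * phi_dot - k_g * np.sin(phi - psi_g)  # damping + goal attraction
        phi_dot += dt * phi_ddot
        phi += dt * phi_dot
        pos = pos + dt * speed * np.array([np.cos(phi), np.sin(phi)])
        trajectory.append(pos.copy())
    return np.array(trajectory)

# example: an agent starting at the origin heading east steers toward (5, 5)
traj = simulate_agent(goal=np.array([5.0, 5.0]),
                      pos=np.array([0.0, 0.0]), heading=0.0)
```

Interaction trajectories can then be produced by coupling several such agents, e.g. by letting one agent's position serve as another agent's goal.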
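The second sketch shows how a dynamic neural field can track a moving agent, as in the model's top-level network. This is a generic one-dimensional Amari-type field with local excitation and global inhibition; all parameter values are assumptions for illustration, not the model's actual settings. A localized input bump, here drifting at constant speed, induces a stable activation peak whose position tracks the agent.

```python
import numpy as np

N, tau, h, dt = 128, 10.0, -2.0, 1.0          # field size, time constant, resting level, step
x = np.arange(N)
dist = np.abs(x[:, None] - x[None, :])
dist = np.minimum(dist, N - dist)             # circular distance on the field
w = 4.0 * np.exp(-dist**2 / (2 * 4.0**2)) - 1.0  # local excitation, global inhibition

def f(u):
    """Sigmoidal rate function."""
    return 1.0 / (1.0 + np.exp(-u))

u = np.full(N, h, dtype=float)                # field activation starts at resting level
for t in range(200):
    target = 40.0 + 0.1 * t                   # input bump drifts, mimicking a moving agent
    s = np.exp(-(x - target)**2 / (2 * 3.0**2))
    u += (dt / tau) * (-u + h + (w @ f(u)) / N + 8.0 * s)

peak = int(np.argmax(u))                      # decoded agent position (field maximum)
```

Features such as relative position or speed of the tracked agents can then be read out from the peak positions of several such fields.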
Dynamic facial expressions are crucial for communication in primates. Because it is difficult to control the shape and dynamics of facial expressions across species, it is unknown how species-specific facial expressions are perceptually encoded and how they interact with the representation of facial shape. While popular neural network models predict a joint encoding of facial shape and dynamics, the neuromuscular control of faces evolved more slowly than facial shape, suggesting a separate encoding. To investigate these alternative hypotheses, we developed photo-realistic human and monkey heads that were animated with motion capture data from monkeys and humans. Exact control of the expression dynamics was achieved with a Bayesian machine-learning technique (sketched below). Consistent with our hypothesis, we found that human observers learned cross-species expressions very quickly and that facial dynamics were represented largely independently of facial shape. This result supports the co-evolution of the visual processing and motor control of facial expressions, while it challenges appearance-based neural network theories of dynamic expression recognition.
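The abstract does not specify the Bayesian technique; the sketch below assumes Gaussian process regression as one plausible choice. Fitting a GP per species to time-aligned motion-capture trajectories yields smooth posterior-mean dynamics, and blending the posterior means with a morph level alpha gives parametric control over the expression dynamics. The trajectories and dimensionalities here are toy stand-ins.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# toy stand-ins for two time-aligned facial control trajectories (T frames x D dims)
T, D = 100, 5
t = np.linspace(0.0, 1.0, T)[:, None]
traj_human = np.sin(2 * np.pi * t) * np.ones((1, D))         # hypothetical human dynamics
traj_monkey = np.sin(2 * np.pi * t + 0.5) * np.ones((1, D))  # hypothetical monkey dynamics

# one GP per species; the posterior mean gives a smooth, denoised trajectory
gp_h = GaussianProcessRegressor(kernel=RBF(length_scale=0.1)).fit(t, traj_human)
gp_m = GaussianProcessRegressor(kernel=RBF(length_scale=0.1)).fit(t, traj_monkey)

# a morph level alpha in [0, 1] blends the posterior means frame by frame,
# giving parametric control over the expression dynamics
alpha = 0.5
blended = (1.0 - alpha) * gp_h.predict(t) + alpha * gp_m.predict(t)
```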