2017 CCN Workshop: Action understanding: from kinematics to mind

Organizers: Ida Gobbini, Jim Haxby, Angelika Lingnau, Hervé Abdi, Lorenzo Torresani, Sam Nastase, Matteo Visconti di Oleggio Castello, and Nick Oosterhof

Sponsored by the Center for Cognitive Neuroscience

Dates: August 24 and August 25

Location: The Hanover Inn, Hanover, NH

*Registration is now closed.*



Renée Baillargeon, University of Illinois

Early Reasoning about Fairness and Ingroup Support


It has been proposed that the basic structure of human moral cognition includes a small set of abstract principles. One way to test this proposal is to examine whether expectations about the principles are already present in early childhood. In my talk, I will focus on two candidate principles, fairness and ingroup support. With respect to fairness, I will present evidence that even young infants possess an abstract notion of equity. Next, I will report a series of experiments that used the minimal-group method to examine early expectations about ingroup support and its two corollaries, ingroup care and ingroup loyalty. With respect to ingroup care, our findings suggest that infants and toddlers expect individuals in a group to act prosocially, to minimize harm, and to punish harm to ingroup members. With respect to ingroup loyalty, infants and toddlers expect individuals to prefer ingroup over non-ingroup members, to align their choices with those of ingroup members, and to favor ingroup members when allocating limited resources. Together, these results provide robust evidence that the initial draft of moral cognition includes abstract principles of fairness and ingroup support that make possible, from an early age, rich and nuanced expectations about how individuals will act toward others.

Leon Bottou, Facebook AI Research

Looking for a missing signal


We know how to spot objects in images, but we must learn on more images than a human can see in a lifetime. We know how to translate text (somehow), but we must learn it on more text than a human can read in a lifetime. We know how to learn to play Atari games, but we must learn it by playing more games than any teenager can endure. The list is long. We can of course try to pin this inefficiency on some properties of our algorithms. However, we can also take the point of view that there is possibly a lot of signal in natural data that we simply do not exploit. I will report on two works in this direction. The first one establishes that something as simple as a collection of static images contains nontrivial information about the causal relations between the objects they represent. The second one, time permitting, shows how an attempt to discover structure in observational data led to a clear improvement of Generative Adversarial Networks.

Jody Culham, University of Western Ontario

The influence of solo and joint action goals on brain activation patterns and kinematics


Growing evidence suggests that the goals of an action modulate the performance and neural coding of the action. I will present work from my lab showing that brain activation patterns during grasp planning and execution are modulated by the realness of the stimuli, the effector employed, and the goals of the task. I will also describe recent research by postdoctoral fellow Kaitlin Laidlaw, who has found that movement kinematics are affected by the goals of a joint action performed with a partner. When participants were asked to perform the same basic action (grasping, lifting, and placing a block away from themselves), their kinematics were modulated by the expectation of whether a partner would subsequently take the block or leave it. Moreover, the degree to which action kinematics differed based on social intentionality was correlated with participants' self-reported communication skills. Taken together, these results highlight the importance of considering actions not just in terms of low-level movements but also in the context of the overarching goals held by the actor and other individuals.

Martin Giese, University Clinic Tübingen

Neural models of visual action recognition and its interaction with action execution


Action perception and execution are intrinsically linked in the human brain. Consequently, visual action processing involves a whole spectrum of cortical functions, ranging from the visual processing of shape and motion, spatio-temporal relationships and the semantic aspects of actions, to the interaction of visual representations with motor programs. The talk presents a neural theory that has been developed in close connection to neural and behavioral data. It provides a unifying account for a variety of experimental observations on visual action recognition and its interaction with motor execution. The core of the model is a physiologically-inspired neural hierarchy ('deep architecture') that mimics properties of neurons in the visual pathway and associated motor areas. The framework embeds neural field models for the representation of temporal sequences, response selection, and the implementation of flexible couplings between different representations. For the processing of goal-directed actions, the basic hierarchical model has to be extended by special mechanisms for the processing of spatial relationships between effectors and objects. In order to account for the interaction between action perception and motor execution, the theory also includes neural representations for motor programs, which dynamically interact with visual representations. It is shown that such models account in a unifying manner for experimental results obtained with a variety of different methods, including single-cell physiology, behavioral studies and fMRI experiments. In addition, the theory helped to predict neural mechanisms at the single-cell level for the visual perception of causality.

Acknowledgements: Supported by EC FP7-PEOPLE-2011-ITN PITN-GA-011-290011 (ABC), FP7-ICT-2013-FET F/604102 (HBP), BMBF, FKZ: 01GQ1002A, DFG GI 305/4-1 + KA 1258/15-1, and HFSP RG0036/2016.

James Kilner, University College London

Inferring cognitive states of another and yourself from observed and executed actions


Despite the discovery of mirror neurons over 25 years ago, there is little consensus as to their functional role or, more generally, the role of motor system activations during action observation. In the talk I will briefly describe a simple theoretical account of motor system activation during action observation and test the prediction from this model that motor system activity during action observation is most likely involved in the prediction of the kinematics of the observed action. Further to this, I will present data showing how observing your own actions influences your perceptions of yourself, and I will show data demonstrating that when we observe actions we modulate our motor system activity as well as our bodily signals in tune with the observed action.

Angelika Lingnau, Royal Holloway, University of London

The representation of actions in the human brain


Being able to understand other people's actions is fundamental for social interactions, and for the selection and preparation of our own actions. We are able to perform this task despite the fact that actions can be performed in many different ways. How does our brain achieve the ability to distinguish between actions while generalizing across the way these actions are performed? What are the specific contributions of temporal, parietal and frontal areas typically recruited during the observation of actions? In this talk I will present a number of recent studies using multivariate pattern analysis (MVPA) and representational similarity analysis (RSA) of functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG) data that addressed these questions. I will discuss the results in light of the ongoing debate on the neural basis of action recognition and point out possible future directions.

Sam Nastase, Dartmouth College

Primacy of observed action representation during natural vision


The received understanding of cortical information processing during, e.g., object recognition, is based on experiments using highly controlled, often static stimuli. However, dynamic naturalistic stimuli convey considerably richer perceptual and semantic information, and can provide complementary insights. The current line of work uses functional MRI to investigate how the brain extracts behaviorally relevant semantic information during naturalistic vision to support action recognition. Participants viewed brief video clips of animals behaving in their natural environments. Stimuli were organized in a factorial design with four behavioral categories (eating, fighting, running, and swimming) and five taxonomic categories (birds, insects, primates, reptiles, and ungulates). Replicating existing work, we found that animal taxonomy was represented in ventral temporal cortex, while animal behavior was represented in lateral occipitotemporal, anterior parietal, and premotor cortices. Task demands enhanced representational discriminability along behaviorally relevant dimensions in late-stage sensorimotor cortices. Interestingly, throughout much of cortex, animal behavior accounted for markedly more variance in representational geometry than animal taxonomy, even in ventral temporal cortex. Furthermore, behavioral category classification generalized across stimuli and taxonomic categories. Ongoing research efforts using a broader range of stimuli depicting social and nonsocial human actions are intended to provide insights into the representational geometry supporting action understanding at various stages of cortical processing. Our findings suggest that dynamic stimulus features, particularly those conveying behaviorally relevant action information, dominate cortical processing during natural vision.
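For readers unfamiliar with how a factorial design like the one above can be related to representational geometry, the following is a minimal sketch on synthetic data. It is not the analysis used in the study. The response patterns, the gains `behavior_gain` and `taxon_gain`, the feature count, and all function names are assumptions made for illustration, and Pearson correlation is used in place of whatever comparison the authors applied.

```python
import numpy as np

# Factorial design from the abstract: 4 behaviors x 5 taxa = 20 conditions.
BEHAVIORS = ["eating", "fighting", "running", "swimming"]
TAXA = ["birds", "insects", "primates", "reptiles", "ungulates"]
CONDITIONS = [(b, t) for b in BEHAVIORS for t in TAXA]


def model_rdm(labels):
    """Categorical model RDM: 0 if two conditions share a label, else 1."""
    labels = np.asarray(labels)
    return (labels[:, None] != labels[None, :]).astype(float)


def pattern_rdm(patterns):
    """Correlation-distance RDM (1 - Pearson r) between condition patterns."""
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(rdm):
    """Off-diagonal upper-triangle entries, one per condition pair."""
    rows, cols = np.triu_indices(rdm.shape[0], k=1)
    return rdm[rows, cols]


def simulate_patterns(n_features=200, behavior_gain=2.0, taxon_gain=1.0, seed=0):
    """Synthetic patterns (hypothetical): behavior prototype + taxon prototype + noise."""
    rng = np.random.default_rng(seed)
    behavior_proto = {b: rng.normal(size=n_features) for b in BEHAVIORS}
    taxon_proto = {t: rng.normal(size=n_features) for t in TAXA}
    return np.array([
        behavior_gain * behavior_proto[b]
        + taxon_gain * taxon_proto[t]
        + rng.normal(size=n_features)
        for b, t in CONDITIONS
    ])


def model_fit(patterns):
    """Pearson correlation of the pattern RDM with each categorical model RDM."""
    observed = upper_triangle(pattern_rdm(patterns))
    fits = {}
    for name, index in (("behavior", 0), ("taxonomy", 1)):
        model = upper_triangle(model_rdm([c[index] for c in CONDITIONS]))
        fits[name] = float(np.corrcoef(observed, model)[0, 1])
    return fits


fits = model_fit(simulate_patterns())
print(fits)
```

Because the simulated behavior signal is deliberately stronger than the taxonomy signal, the behavior model fits the synthetic geometry better. This mirrors the direction of the reported result only by construction and says nothing about the real data.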

Andrew Schwartz, University of Pittsburgh

Recent progress toward high-performance neural prosthetics


A better understanding of neural population function would be an important advance in systems neuroscience. The change in emphasis from the single neuron to the neural ensemble has made it possible to extract high-fidelity information about movements that will occur in the near future. Information processing in the brain is distributed. Neurons in different anatomical structures process similar information and each neuron encodes many parameters simultaneously. Although the fidelity of information represented by individual neurons is weak, because encoding is redundant and consistent across the population, extraction methods based on multiple neurons are capable of generating a faithful representation of intended movement. A new generation of investigation is centered on population-based analyses, focusing on operational characteristics of the motor system. The realization that useful information is embedded in the population has spawned the current success of brain-controlled interfaces. Since multiple movement parameters are encoded simultaneously in the same population of neurons, we have been gradually increasing the degrees of freedom (DOF) that a subject can control through the interface. Our early work showed that three dimensions could be controlled in a virtual reality task. We then demonstrated control of an anthropomorphic physical device with 4 DOF in a self-feeding task. Currently, monkeys in our laboratory are using this interface to control a very realistic prosthetic arm with a wrist and hand to grasp objects in different locations and orientations. Our recent data show that we can extract 10 DOF to add hand shape and dexterity to our control set. This technology has now been extended to a paralyzed patient who cannot move any part of her body below her neck. Based on our laboratory work and using a high-performance “modular prosthetic limb”, she was able to control 10 DOF simultaneously. The control of this artificial limb was intuitive, and the movements were coordinated and graceful and closely resembled natural arm and hand movement. This subject has been able to perform tasks of daily living: reaching to, grasping and manipulating objects, as well as performing spontaneous acts such as self-feeding. Current work with a second subject is progressing toward making this technology more robust and extending the control with tactile feedback to sensory cortex.

Georgopoulos, A.P., Schwartz, A.B., Kettner, R.E.: Neuronal population coding of movement direction. Science, 233:1416-1419, 1986.

Schwartz, A.B.: Direct cortical representation of drawing movements. Science, 265:540-543, 1994.

Taylor, D.M., Helms Tillery, S.I., Schwartz, A.B.: Direct cortical control of 3D neuroprosthetic devices. Science, 296:1829-1832, 2002.

Collinger, J.L., Wodlinger, B., Downey, J.E., Wang, W., Tyler-Kabara, E.C., Weber, D.J., McMorland, A.J.C., Velliste, M., Boninger, M.L., Schwartz, A.B.: High-performance neuroprosthetic control by an individual with tetraplegia. Lancet, 381:557-564, 2013.

Wodlinger, B., Downey, J.E., Tyler-Kabara, E.C., Schwartz, A.B., Boninger, M.L., Collinger, J.L.: Ten-dimensional anthropomorphic arm control in a human brain-machine interface: difficulties, solutions, and limitations. J. Neural Eng., 2014.
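The abstract's point that weak single-neuron signals combine into a faithful population estimate can be illustrated with a population-vector decoder in the spirit of Georgopoulos et al. (1986). This is a minimal sketch on simulated data, not the decoder used in the prosthetic work. The cosine tuning model, the baseline, gain, and noise values, and all function names are assumptions chosen for illustration.

```python
import numpy as np


def random_unit_vectors(n, dim, rng):
    """Draw n preferred directions uniformly on the unit sphere in `dim` dimensions."""
    vectors = rng.normal(size=(n, dim))
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)


def simulate_rates(direction, preferred, rng, baseline=10.0, gain=8.0, noise_sd=4.0):
    """Cosine-tuned firing rates plus Gaussian noise (all parameter values hypothetical)."""
    tuned = baseline + gain * preferred @ direction
    return tuned + rng.normal(0.0, noise_sd, size=tuned.shape)


def population_vector(rates, preferred, baseline=10.0):
    """Sum preferred directions weighted by each neuron's rate change from baseline."""
    vector = (rates - baseline) @ preferred
    return vector / np.linalg.norm(vector)


def angular_error_deg(estimate, target):
    """Angle in degrees between two unit vectors."""
    cosine = np.clip(estimate @ target, -1.0, 1.0)
    return float(np.degrees(np.arccos(cosine)))


rng = np.random.default_rng(0)
target = np.array([1.0, 0.0, 0.0])  # intended movement direction
for n_neurons in (5, 50, 1000):
    preferred = random_unit_vectors(n_neurons, 3, rng)
    rates = simulate_rates(target, preferred, rng)
    error = angular_error_deg(population_vector(rates, preferred), target)
    print(f"{n_neurons:5d} neurons: decoding error {error:.1f} degrees")
```

In this toy model the decoding error tends to shrink as the population grows, because independent noise averages out across neurons while the tuned signal adds up. The sketch allows negative simulated rates for simplicity, which real firing rates cannot have.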

Lorenzo Torresani, Dartmouth College

Deep Spatiotemporal Models for Computational Video Understanding


Over the last few years, deep learning has revolutionized the field of still-image analysis by delivering breakthrough results on many hard computer vision tasks, including object recognition, detection, scene classification, and semantic pixel-level prediction. While there has been widespread expectation that these performance improvements should naturally extend to the video domain, results so far have lagged behind those in the image setting.

In this talk I will discuss the unique challenges posed by the video domain and provide a survey of recent efforts in designing effective deep computational models for video understanding. I will conclude by presenting a deep model for spatiotemporal visual attention explicitly trained to mimic where humans look in a video, and I will describe how it can be leveraged to improve the performance of existing algorithms for action recognition.