#Emergentcapabilities in #largelanguagemodels, such as in-context learning, can also appear in #visionlanguageaction (#VLA) models. Scaling up #roboticfoundationmodels enables emergent human-to-robot transfer, improving performance on tasks demonstrated in human videos by approximately 2x.…