[Figure: two-system architecture. Inputs: robot images, a text command (“Pick up the butter and hand it over to the robot on your left.”), and robot state (joint angles, finger positions). System 2 (GPU 2): infrequent vision-language semantic reasoning, 7B pretrained VLM, producing a latent vector. System 1 (GPU 1): fast, reactive control, 80M transformer, producing whole upper body control. Rates labeled in the diagram: 7-9 Hz, 20 Hz, 200 Hz.]
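The two-rate arrangement named in the figure (an infrequent System 2 feeding a latent vector to a fast System 1) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function names, the `Latent` type, and the choice of 8 Hz as a value within the figure's 7-9 Hz range are hypothetical, and the loop runs sequentially in one process rather than on two GPUs. It shows only the scheduling idea, namely that the fast loop reuses the most recent latent until the slow loop produces a new one.

```python
from dataclasses import dataclass
from typing import List, Tuple

S1_HZ = 200  # fast control rate labeled in the figure
S2_HZ = 8    # assumed value inside the figure's 7-9 Hz range


@dataclass
class Latent:
    # Stand-in for the latent vector; only a version counter is kept here.
    version: int


def system2_step(version: int) -> Latent:
    """Hypothetical placeholder for the slow vision-language model."""
    return Latent(version=version)


def system1_step(latent: Latent, tick: int) -> Tuple[int, int]:
    """Hypothetical placeholder for the fast control policy.

    Returns (tick, latent version used) instead of a real action.
    """
    return (tick, latent.version)


def run(duration_s: float = 1.0) -> List[Tuple[int, int]]:
    ticks = int(duration_s * S1_HZ)
    period = S1_HZ // S2_HZ  # System 1 ticks per System 2 update
    latent = system2_step(0)
    actions = []
    for tick in range(ticks):
        # Refresh the latent only on System 2's slower schedule.
        if tick > 0 and tick % period == 0:
            latent = system2_step(tick // period)
        # System 1 acts on every tick with the latest available latent.
        actions.append(system1_step(latent, tick))
    return actions
```

Calling `run(1.0)` simulates one second: System 1 produces an output on every tick, while the latent it conditions on changes only once every 25 ticks under these assumed rates.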