TESS is a general-purpose computer-use agent, inspired by humanoid Vision-Language-Action (VLA) models, that controls computers the way humans do: via mouse and keyboard.
The goal is to build a general agent that can perform all economically valuable tasks at human level, with the same reaction time as a human.
TESS should eventually run inference with an average latency of 150 ms and achieve human-level performance on all computer-use tasks.
Imagine TESS carrying out long-horizon, complex tasks such as designing rocket engines, running simulations, writing code, and using GUIs at superhuman speed.
Embodied AI: The Next Intelligence Paradigm
The Current State of Robot Foundation Models
Embodied AI: The Data Bottleneck
π-0.5: A Foundation Model for Robot Manipulation
Robotic Foundation Models in 2050