Projects

TESS: My Personal Social Media Manager

CUA for Personalized Feeds.

CuaPilot

Control your Mac with natural language. Type what you want, watch it happen.

Flow Matching Diffusion for Drawing Circles

Training diffusion models with flow matching to draw circles using QuickDraw data.

Diffusion Clicking Head with VLM

A System I diffusion head for precise click predictions on top of frozen VLMs.

CUA VLA: CSO and Atari

World models for computer use evaluation.

MinecraftVLA: Vision-Language-Action Model for Playing Minecraft

Minecraft computer use agent that perceives at 20 Hz and acts at 30 Hz.

Trying to Solve General Computer Use

My journey into computer use models and what I've learned about training VLAs for digital agents.

Bipedal Walking Using RL: Deep Dive with Code

Moravec's Paradox: Spelled Out with Code

The Sim2Real Gap for Robot Locomotion Explained

LLMs on Maps

Use natural language to search maps for something you like!

Talent Radar

24/7 agent that scrapes the internet for potential candidates for your organization.

OnboardGPT

LLM for enterprise codebase assistance.

Deep Learning for Multi-Robot Motion Planning

My master's thesis with the Rainbow team at Inria, France.

Operational Research Solvers for Crew Scheduling

Infra for large-scale web applications.

Self Driving Robot

Vision-based navigation.

Change Your Singer: A Transfer Learning Generative Adversarial Framework for Song to Song Conversion

Song-to-song conversion powered by CycleGAN.