Matt Deitke

AI Researcher at AI2 and UW CSE

mattd@allenai.org

Seattle, WA

Hi! I'm an AI researcher with the PRIOR team at the Allen Institute for AI, and a Ph.D. student at the Allen School in the University of Washington, Seattle. I am fortunate to be advised by Ali Farhadi in the RAIVN lab, and have had the pleasure of collaborating with Ani Kembhavi, Roozbeh Mottaghi, Ludwig Schmidt, and Rick Szeliski. My research interests are in computer vision, deep learning, and embodied AI. I am interested in building AI systems that are broadly useful and robust. Recently, I have led the development of ProcTHOR, Objaverse, and Phone2Proc.

Since 2019, I have been conducting research at the Allen Institute for AI while completing my undergrad at the University of Washington, Seattle. Before that, I grew up near Chicago and spent several teenage years working on computer graphics, interface design, and visualization in the Department of Athletics at The Ohio State University, the University of Cincinnati, and a variety of other organizations. Later in high school, I studied machine learning and deep learning at Georgia Tech.

News

ProcTHOR won the Outstanding Paper Award at NeurIPS 2022!

Nov, 2022

Excited to announce Objaverse, a massive dataset of 3D objects with broad applications across AI.

Dec, 2022

Thrilled to continue at UW for my Ph.D. working with Ali Farhadi and at AI2.

Apr, 2023

Received an Outstanding Reviewer Award from CVPR 2023!

Jun, 2023

Giving invited talks at RSS 2023, ICCV 2023, and Shanghai AI Lab!

Jun, 2023

The 2nd edition of Rick Szeliski's Computer Vision textbook was published! Ecstatic to have contributed!

Jan, 2022

Grateful to have received Ph.D. offers at CMU, MIT, Oxford, Stanford, UC Berkeley, and UW.

Mar, 2023

Giving an invited talk at UW's Vision Lunch titled "Scaling Embodied AI with ProcTHOR: Where We Are and What's Next."

Jun, 2022

Extremely excited to release ProcTHOR! Using procedural generation to scale up the diversity of data leads to remarkable generalization.

Jun, 2022

Excited to release a retrospective on the Embodied AI workshops! We discuss the field's scope, common approaches, and future directions.

Oct, 2022

Co-Organizing the Embodied AI Workshop and the AI2-THOR Rearrangement Challenge at CVPR 2022 in New Orleans.

Jun, 2022

We released an updated revision of the AI2-THOR paper covering its impact and new features!

Aug, 2022

Reviewing at
- NeurIPS 2023
- CVPR 2023
- ICLR 2023

Aug, 2022

Publications

Objaverse-XL: A Universe of 10M+ 3D Objects

Matt Deitke, Ruoshi Liu, Matthew Wallingford, Huong Ngo, Oscar Michel, Aditya Kusupati, Alan Fan, Christian Laforte, Vikram Voleti, Samir Yitzhak Gadre, Eli VanderBilt, Aniruddha Kembhavi, Carl Vondrick, Georgia Gkioxari, Kiana Ehsani, Ludwig Schmidt*, Ali Farhadi*

NeurIPS 2023

TLDR

We introduce Objaverse-XL, an open dataset of over 10 million 3D objects. With it, we train Zero123-XL, a foundation model for 3D, observing incredible 3D generalization abilities. With the Zero123-XL base model, we can then perform image-to-3D and text-to-3D.

Objaverse: A Universe of Annotated 3D Objects

Matt Deitke, Dustin Schwenk, Jordi Salvador, Luca Weihs, Oscar Michel, Eli VanderBilt, Ludwig Schmidt, Kiana Ehsani, Aniruddha Kembhavi, Ali Farhadi

CVPR 2023

TLDR

Objaverse is a massive dataset of 800K+ (and growing) 3D models with descriptive captions, tags, and animations. We demonstrate its potential by training generative models, improving 2D instance segmentation, training open-vocabulary object navigation models, and creating a benchmark for testing the robustness of vision models.

Phone2Proc: Bringing Robust Robots Into Our Chaotic World

Matt Deitke*, Rose Hendrix*, Luca Weihs, Ali Farhadi, Kiana Ehsani, Aniruddha Kembhavi

CVPR 2023

TLDR

From a 10-minute iPhone scan of any environment, we generate simulated training scenes that semantically match that environment. Training a robot to perform ObjectNav in these scenes dramatically improves sim-to-real performance from 35% to 71% and results in an agent that is remarkably robust to human movement, lighting variations, added clutter, and rearranged objects.

🏘️ ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

Matt Deitke, Eli VanderBilt, Alvaro Herrasti, Luca Weihs, Jordi Salvador, Kiana Ehsani, Winson Han, Eric Kolve, Ali Farhadi, Aniruddha Kembhavi, Roozbeh Mottaghi

NeurIPS 2022 Outstanding Paper Award

TLDR

We built a platform to procedurally generate realistic, interactive, simulated 3D environments to dramatically scale up the diversity and size of training data in Embodied AI. We find that it helps significantly with performance on many tasks.

Retrospectives on the Embodied AI Workshop

Matt Deitke, Dhruv Batra, Yonatan Bisk, Tommaso Campari, Angel X. Chang, Devendra Singh Chaplot, Changan Chen, Claudia Pérez D'Arpino, Kiana Ehsani, Ali Farhadi, Li Fei-Fei, Anthony Francis, Chuang Gan, Kristen Grauman, David Hall, Winson Han, Unnat Jain, Aniruddha Kembhavi, Jacob Krantz, Stefan Lee, Chengshu Li, Sagnik Majumder, Oleksandr Maksymets, Roberto Martín-Martín, Roozbeh Mottaghi, Sonia Raychaudhuri, Mike Roberts, Silvio Savarese, Manolis Savva, Mohit Shridhar, Niko Sünderhauf, Andrew Szot, Ben Talbot, Joshua B. Tenenbaum, Jesse Thomason, Alexander Toshev, Joanne Truong, Luca Weihs, Jiajun Wu

ArXiv 2022

TLDR

We present a retrospective on the state of Embodied AI research. Our analysis focuses on 13 challenges in visual navigation, rearrangement, and embodied vision-and-language. We discuss the scope of embodied AI research, performance of state-of-the-art models, common modeling approaches, and future directions.

AI2-THOR: An Interactive 3D Environment for Visual AI

Eric Kolve, Roozbeh Mottaghi, Winson Han, Eli VanderBilt, Luca Weihs, Alvaro Herrasti, Matt Deitke, Kiana Ehsani, Daniel Gordon, Yuke Zhu, Aniruddha Kembhavi, Abhinav Gupta, Ali Farhadi

ArXiv 2022

TLDR

We introduce The House Of inteRactions (THOR), a framework for visual AI research. AI2-THOR consists of near photo-realistic 3D indoor scenes, where AI agents can navigate in the scenes and interact with objects to perform tasks. It has enabled research in many areas of AI.

Visual Room Rearrangement

Luca Weihs, Matt Deitke, Aniruddha Kembhavi, Roozbeh Mottaghi

CVPR 2021 Oral Presentation

TLDR

We built a pre-training task where the agent's goal is to interactively rearrange objects in a room from one state to another. For instance, the agent may have to open the Fridge and move the Lettuce to the CounterTop. Modern deep-RL methods struggle on this task.

RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

Matt Deitke*, Winson Han*, Alvaro Herrasti*, Aniruddha Kembhavi*, Eric Kolve*, Roozbeh Mottaghi*, Jordi Salvador*, Dustin Schwenk*, Eli VanderBilt*, Matthew Wallingford*, Luca Weihs*, Mark Yatskar*, Ali Farhadi

CVPR 2020

TLDR

We rent office buildings in Seattle and turn them into studio apartments with many possible furniture and wall layouts. Each apartment layout is then remodeled by hand in simulation so that a simulated robot can interact with it in a video-game-like context. We study how well a robot trained purely in the simulated environments can transfer to reality.

Software

AI2-THOR

AI2-THOR consists of real and simulated environments for interactive robot learning.

allenai/objaverse-xl

Framework

Scripts for Downloading and Processing Objaverse-XL

Python

allenai/ai2thor

Framework

Interactive Simulated Environments for Embodied AI

Unity

allenai/procthor

Framework

Procedurally Generate Houses for Embodied AI Training

Python

allenai/objaverse-rendering

Framework

Scripts for rendering Objaverse

Python

allenai/allenact

Framework

A Framework for Training Embodied-AI Agents

PyTorch

allenai/ai2thor-rearrangement

Template

Code for Running the Visual Room Rearrangement task

PyTorch

allenai/prior

Tool

Python Package for Distributing Datasets and Models

Python

allenai/ai2thor-colab

Tool

Run AI2-THOR with Google Colab

Colab

allenai/procthor-10k

Dataset

The ProcTHOR-10K Houses Dataset

Python

allenai/object-nav-eval

Dataset

Evaluation tasks for ObjectNav models

Python

embodied-ai-workshop/embodied-ai.org

Website

The Website for the Embodied AI Workshop at CVPR

React

RAIVNLab/RAIVNLab.github.io

Website

Website for the UW RAIVN Lab

React

mattdeitke/cvpr-buzz

Website

Explore Trending Papers at CVPR 2021

React

mattdeitke/CVPR2019

Website

Explore and Parse Papers at CVPR 2019

Python

Workshops

Embodied AI

I've co-organized the Embodied AI workshops at CVPR. Our goal is to bring together researchers to share and discuss the current state of intelligent agents that can see, talk, act, and reason.