
Georgia Gkioxari

I am an assistant professor in the Computing & Mathematical Sciences department at Caltech. Previously, I was a research scientist at Meta's FAIR team. I completed my PhD at UC Berkeley with Jitendra Malik and my undergraduate studies at NTUA, Greece, where I worked with Petros Maragos.


I am a Packard Fellow (2025) and the recipient of the PAMI Young Researcher Award (2021), a Google Faculty Award (2024), the Okawa Research Award (2024), and the Amazon Research Award (2024). My teammates and I received the PAMI Mark Everingham Award (2021) for the Detectron Library Suite. I was named one of 30 influential women advancing AI in 2019 by ReWork and was nominated for the Women in AI Awards in 2020 by VentureBeat. Read more about me and my work in this Q&A.


email | cv | twitter/x | linkedin | google scholar | github

Research

The goal of our work is to design advanced visual perception models that extend the boundaries of current visual capabilities. My group currently focuses on four directions: 2D Perception, 3D Perception, Spatial Reasoning, and Tools.


2D perception



Recognition and segmentation in images.

3D perception



3D scene reconstruction and understanding.

spatial reasoning



Agents that reason about space and time.

tools



Tools for 3D deep learning, such as PyTorch3D.

Glab Members

Ziqi Ma
Damiano Marsili
Aadarsh Sahoo
Ilona Demler

(main advisor: Perona)
Raphi Kang

(main advisor: Perona)

Highlights

3D perception

Steer3D: Feedforward 3D Editing via Text-Steerable Image-to-3D

Ziqi Ma, Hongqiao Chen, Yisong Yue, Georgia Gkioxari
2D perception

Conversational Image Segmentation: Grounding Abstract Concepts with Scalable Supervision

Aadarsh Sahoo, Georgia Gkioxari
spatial reasoning

Same or Not? Enhancing Visual Perception in Vision-Language Models

Damiano Marsili, Aditya Mehta, Ryan Y. Lin, Georgia Gkioxari
spatial reasoning

No Labels, No Problem: Training Visual Reasoners with Multimodal Verifiers

Damiano Marsili, Georgia Gkioxari
3D perception

Aligning Text, Images, and 3D Structure Token-by-Token

Aadarsh Sahoo, Vansh Tibrewal, Georgia Gkioxari
3D perception

Find3D: Find Any Part in 3D

Ziqi Ma, Yisong Yue, Georgia Gkioxari
ICCV 2025, Highlight ✨
spatial reasoning

VADAR: Visual Agentic AI for Spatial Reasoning with a Dynamic API

Damiano Marsili, Rohun Agrawal, Yisong Yue, Georgia Gkioxari
CVPR 2025
3D perception

Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild

Garrick Brazil, Abhinav Kumar, Julian Straub, Nikhila Ravi, Justin Johnson, Georgia Gkioxari
CVPR 2023
tools

PyTorch3D

Nikhila Ravi, Jeremy Reizenstein, David Novotny, Taylor Gordon, Wan-Yen Lo, Justin Johnson, Georgia Gkioxari
Open-source library for 3D deep learning in PyTorch
2D perception

Mask R-CNN

Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick
ICCV 2017, Marr Prize

Teaching at Caltech

EE/CS 148

EE/CS 148 — Spring 2023: Large Language & Vision Models

EE/CS 148 — Spring 2024: Large Language & Vision Models

CS 101

CS 101 — Winter 2024: Learning & 3D

Join Us

Caltech students (undergrads & grads): If you wish to work with me, please read this information.

Prospective post-docs: Interested in computer vision, 3D, representation learning, or perception? Email me your CV and a short research statement.

Prospective PhD students: Apply directly to the CMS department and mention my name in your statement of purpose. No separate email needed.