news | FarSight Lab

2026	We released SVI-Bench, a new benchmark for video understanding developed jointly with Prof. Gedas Bertasius’s lab at UNC Chapel Hill. The benchmark exposes a striking capability gap in frontier models: 73% accuracy on perception, with progressively lower scores on causal reasoning and simulation, and just 5% on the most demanding task. Media coverage: Northeastern Global News article by Cyrus Moulton; Northeastern Global News video interview on TikTok.
2026	New paper: How You Move Tells What You’ll Do: Trajectory-Conditioned Egocentric Prediction. [project page]
2026	New paper: RECIPE: Procedural Planning via Grounding in Instructional Video. [project page]
2026	New paper: EvoGround: Self-Evolving Video Agents for Video Temporal Grounding, with Minjoon Jung and Byoung-Tak Zhang. [arXiv] [project page]
2026	We have several openings for postdocs, visiting researchers, and PhD students to work on embodied AI, video understanding, and multimodal learning. Prospective applicants should contact Lorenzo Torresani with a CV and a one-page research statement.
Aug 2025	Lorenzo appointed President Joseph E. Aoun Chair at Northeastern University.
June 2025	Our state-space video model BIMBA won first place in the EgoSchema Challenge at CVPR 2025. [project page]
June 2025	Three of our papers received Distinguished Paper Awards at the CVPR 2025 EgoVis Workshop: Video ReCap, Ego4D Goal-Step, and HierVL. [awards page]
2025	PerceptionLM: Open-Access Data and Models for Detailed Visual Understanding accepted to NeurIPS 2025 as a spotlight (<3.5%). [arXiv]
2025	Enrich and Detect: Video Temporal Grounding with Multimodal LLMs accepted to ICCV 2025 as a highlight (<2.5%). [project page]
2025	Two papers accepted at CVPR 2025: BIMBA and ViTED.