The LSVOS workshop will be held on October 2nd in conjunction with ICCV 2023 in Paris, France. This edition of the workshop and challenge combines the classic YouTube-VOS benchmark with the newly introduced VOST dataset. VOST focuses on complex object transformations, such as egg cracking or clay molding, which break the assumptions behind existing methods and require rethinking their basic design principles. The combined challenge will be held alongside the YouTube-VOS video instance segmentation and referring video object segmentation competitions. In addition, we will hold a series of talks by leading experts in video understanding. The workshop will culminate in a round-table discussion in which the speakers will debate the future of video object representations.

Time	Speaker	Topic
08:30 AM - 08:40 AM	Organizers	Opening Remarks
08:40 AM - 09:10 AM	Tim Meinhardt	VIS-à-MOT: A face-to-face of video tracking benchmarks
09:10 AM - 09:20 AM	Organizers	Video Object Segmentation track introduction
09:20 AM - 09:30 AM	Organizers	Video Object Segmentation under Transformations problem introduction
09:30 AM - 10:00 AM	Challenge participants	VOS winning teams talks
10:00 AM - 10:30 AM	Prof. Carl Vondrick	All the Ways to Track Occluded Objects
10:30 AM - 10:50 AM	Coffee Break
10:50 AM - 11:20 AM	Dr. Benjamin Peters	Dynamic Object Vision in Humans and Machines: Bridging human cognitive science and computer vision
11:20 AM - 11:30 AM	Organizers	Video Instance Segmentation track introduction
11:30 AM - 12:00 PM	Challenge participants	VIS winning teams talks
12:00 PM - 12:30 PM	Prof. Kristen Grauman	Objects in First-person Video: Queries and representation
12:30 PM - 01:30 PM	Lunch
01:30 PM - 02:00 PM	Dr. Adam Harley	Large-Scale Fine-Grained Tracking
02:00 PM - 02:10 PM	Organizers	Referring VOS track introduction
02:10 PM - 02:40 PM	Challenge participants	RVOS winning teams talks
02:40 PM - 03:10 PM	Dr. Cordelia Schmid	Dense Video Object Captioning
03:10 PM - 03:30 PM	Coffee Break
03:30 PM - 04:00 PM	Dr. Thomas Kipf	Object-centric Video Models: End-to-end learning with object priors
04:00 PM - 05:00 PM	Round Table Discussion
05:00 PM - 05:20 PM	Organizers	Closing Remarks

Important Dates

May 10th: Launch of the CodaLab challenge server for the validation set
August 1st: Release test sets and allow submissions to the test server
August 10th: Deadline for challenge submission
August 17th: Results are announced and winning teams are invited for a presentation

Schedule

Room S02. Zoom link (code: 720009)
