
Chop & Learn: Recognizing and Generating Object-State Compositions

Nirat Saini*, Hanyu Wang*, Archana Swaminathan, Vinoj Jayasundara, Bo He, Kamal Gupta, Abhinav Shrivastava

University of Maryland, College Park

(ICCV 2023)


Abstract

Recognizing and generating object-state compositions is a challenging task, especially when generalizing to unseen compositions. In this paper, we study the task of cutting objects in different styles and the resulting object state changes. We propose a new benchmark suite, Chop & Learn, to support learning objects and different cut styles from multiple viewpoints. We also propose a new task, Compositional Image Generation, which transfers learned cut styles to different objects by generating novel object-state images. In addition, we use the videos for Compositional Action Recognition and show valuable uses of this dataset for multiple video tasks.

Dataset Collection and Statistics

ChopNLearn was collected with:

3 participants
4 cameras from different viewpoints
20 objects and 7 cutting styles (plus 1 "whole" state)
112 object-state compositions
1338 images of object-state compositions
1260 videos (ranging from 16 seconds to ~12 minutes)
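
As a quick check on these numbers, the sketch below compares the 112 collected compositions with the full object-state grid. It assumes that "whole" counts as an eighth state alongside the 7 cutting styles and that every object could in principle be paired with every state; neither assumption is stated explicitly above.

```python
# Counts taken from the list above.
num_objects = 20
num_cut_styles = 7
collected_pairs = 112

# Assumption: the uncut "whole" state is counted as an eighth state.
num_states = num_cut_styles + 1

# Size of the full grid if every object were paired with every state.
possible_pairs = num_objects * num_states

coverage = collected_pairs / possible_pairs
print(f"{collected_pairs} of {possible_pairs} possible pairs ({coverage:.0%})")
```

Under these assumptions the dataset covers part of the grid rather than all of it, which is consistent with reporting a per-composition sample count.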

We show the number of samples for each object-style composition in a color-coded manner:

orange represents 12 samples
green represents 8 samples
blue represents 4 samples

Compositional Image Generation Results

Compositional Image Generation: Given training images of various objects in different states, generate new images of unseen pairs of objects and states.
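
To make the "unseen pairs" setup concrete, here is a minimal sketch of a compositional train/test split. It is an illustration only, not the benchmark's actual split procedure: the function name `compositional_split`, the object and state labels, and the number of held-out pairs are all hypothetical. The one property it enforces is that every held-out pair's object and state each still appear in some training pair, so only the combination is new at test time.

```python
import random
from itertools import product


def compositional_split(pairs, num_unseen, seed=0):
    """Hold out up to `num_unseen` (object, state) pairs as unseen compositions.

    A pair is held out only if its object and its state each still occur in
    at least one remaining training pair.
    """
    rng = random.Random(seed)
    candidates = list(pairs)
    rng.shuffle(candidates)

    train = set(pairs)
    unseen = []
    for obj, state in candidates:
        if len(unseen) == num_unseen:
            break
        rest = train - {(obj, state)}
        obj_seen = any(o == obj for o, _ in rest)
        state_seen = any(s == state for _, s in rest)
        if obj_seen and state_seen:
            train = rest
            unseen.append((obj, state))
    return sorted(train), sorted(unseen)


# Hypothetical toy vocabulary, not the dataset's actual labels.
objects = ["apple", "carrot", "potato"]
states = ["whole", "round slices", "sticks"]
all_pairs = list(product(objects, states))

train_pairs, unseen_pairs = compositional_split(all_pairs, num_unseen=3)
print("train:", train_pairs)
print("unseen:", unseen_pairs)
```

A generation model would then be trained on images of `train_pairs` and asked to produce images for `unseen_pairs`.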

We consider these methods: