5th Year Master of Computer Science Thesis Presentation

Monday, May 2, 2022 - 10:00am to 11:00am

Location:

In Person: Traffic21 Classroom, Gates Hillman 6501

Speaker:

XIAOYU ZHANG, Master's Student, Computer Science Department, Carnegie Mellon University
https://erinzhang1998.github.io/

The Language of Sketches

Creative AI has seen much progress in recent years. Systems like DALL-E 2 can generate inspiring art pieces from text descriptions. Instead of synthesizing realistic artworks from language, we approach creativity from a different angle and investigate the composition of semantic parts and visual concepts in sketches. For example, people can draw a circle to represent the moon, a scoop of ice cream, or the face of a cat. In a similar way, language descriptors can be composed to create new concepts: people can draw a large round cat face or a narrow oval cat face.
To study this reuse of abstract concepts, we construct a dataset of language-annotated sketches. We examined current sketch datasets and found that they lack either language annotations or semantic part annotations. Therefore, we collected a dataset of 11,150 (sketch part, text) pairs for 572 face sketches and 787 angel sketches.
To understand the limits of current vision-language models, we fine-tuned CLIP, a model that was pre-trained with a contrastive objective on 400 million (image, text) pairs and that maps images and text into a joint vision-language embedding space. We observed that (1) CLIP cannot easily generalize to an unseen category on the task of pairing sketches with their descriptions, even though similar shapes and descriptions occurred in training; and (2) after fine-tuning, the average cosine distance increased between pairs of descriptors that annotators used to differentiate two sketches. With the insights gained about how language and sketches interact in the CLIP embedding space, our aim is to facilitate research into models that can generate sketches in a part-based manner, satisfying users' descriptions of the pictures they have in mind.
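As a rough illustration of the two quantities mentioned in the abstract, the sketch below computes cosine distance between embedding vectors and pairs each sketch embedding with its most similar text embedding. It is a minimal sketch under stated assumptions, not the thesis's evaluation code: the function names and the three-dimensional toy vectors are hypothetical stand-ins for the embeddings a model like CLIP would produce.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_distance(u, v):
    """Cosine distance, defined here as 1 minus cosine similarity."""
    return 1.0 - cosine_similarity(u, v)

def match_sketches_to_texts(sketch_embs, text_embs):
    """For each sketch embedding, return the index of the closest text embedding."""
    return [
        max(range(len(text_embs)),
            key=lambda j: cosine_similarity(sketch, text_embs[j]))
        for sketch in sketch_embs
    ]

# Hypothetical toy embeddings (made up for illustration, not real CLIP outputs).
sketch_embs = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2]]
text_embs = [[0.1, 0.9, 0.1], [0.9, 0.0, 0.1]]

matches = match_sketches_to_texts(sketch_embs, text_embs)
print(matches)
```

In this toy setup, a larger cosine distance between two descriptor embeddings means the descriptors are easier to tell apart in the embedding space, which is the sense in which observation (2) reports an increase after fine-tuning.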
Thesis Committee:
Oliver Kroemer (Co-Chair)
Yonatan Bisk (Co-Chair)
Jean Oh

Keywords:

5th Year Master's Thesis Presentation