FS-COCO comprises 10,000 freehand scene vector sketches with per-point space-time information, drawn by 100 non-expert individuals, offering both object- and scene-level abstraction.
Pinaki Nath Chowdhury, Aneeshan Sain, Ayan Kumar Bhunia, Tao Xiang, Yulia Gryaditskaya, Yi-Zhe Song
SketchX, Centre for Vision, Speech and Signal Processing
University of Surrey, United Kingdom
Published at ECCV 2022
We advance sketch research to scenes with the first dataset of freehand scene sketches, FS-COCO. With practical applications in mind, we collect sketches that convey scene content well yet can be sketched within a few minutes by a person with any level of sketching skill. Our dataset comprises 10,000 freehand scene vector sketches with per-point space-time information, drawn by 100 non-expert individuals, offering both object- and scene-level abstraction. Each sketch is augmented with its text description. Using our dataset, we study for the first time the problem of fine-grained image retrieval from freehand scene sketches and sketch captions. We draw insights on: (i) the scene salience encoded in sketches through the temporal order of strokes; (ii) how image retrieval from a scene sketch compares with retrieval from an image caption; (iii) the complementarity of the information in sketches and image captions, as well as the potential benefit of combining the two modalities. In addition, we extend a popular LSTM-based vector sketch encoder to handle sketches of greater complexity than supported by previous work. Namely, we propose a hierarchical sketch decoder, which we leverage in a sketch-specific "pre-text" task. Our dataset enables for the first time research on freehand scene sketch understanding and its practical applications.
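Since each sketch carries per-point space-time information, it can be handled directly as a point sequence. The snippet below is a minimal, hypothetical illustration that assumes a sketch is an N×4 array of (x, y, t, pen_state) rows; the field layout of the released files may differ.

```python
import numpy as np

# Toy stand-in for one vector sketch: rows of (x, y, t, pen_state).
# pen_state == 1 marks the last point of a stroke (assumed convention).
sketch = np.array([
    [0.10, 0.20, 0.00, 0],
    [0.15, 0.25, 0.03, 0],
    [0.20, 0.30, 0.07, 1],   # pen lifted: end of first stroke
    [0.50, 0.50, 0.40, 0],
    [0.55, 0.52, 0.45, 1],   # end of second stroke
])

t = sketch[:, 2]                       # per-point timestamps
num_strokes = int(sketch[:, 3].sum())  # count pen-up events
duration = t[-1] - t[0]
print(f"{len(sketch)} points, {num_strokes} strokes, drawn over {duration:.2f} s")
```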
For our dataset, we compute two estimates of the category distribution across our data: (1) Upper Bound: based on semantic segmentation labels in images, and (2) Lower Bound: based on the occurrence of category names in sketch captions.
| Total Sketches | # Categories | Mean Categories/Sketch | Std | Min | Max | Mean Sketches/Category | Std | Min | Max |
|---|---|---|---|---|---|---|---|---|---|
| 10,000 | 92 / 150 | 1.37 / 7.17 | 0.57 / 3.27 | 1 / 1 | 5 / 25 | 99.42 / 413.18 | 172.88 / 973.59 | 1 / 1 | 866 / 6789 |

Cells with two values report both estimates described above.
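To make the estimates concrete, the lower bound simply counts category names occurring in a sketch caption. The snippet below only illustrates that counting idea with a made-up category list and captions; the paper matches against the MS-COCO category vocabulary.

```python
# Illustrative only: lower-bound category count via word occurrence in captions.
# The category list and captions below are made up.
categories = {"giraffe", "tree", "fence", "car", "person"}

def categories_in_caption(caption: str) -> set:
    words = set(caption.lower().replace(".", "").split())
    return categories & words

captions = ["a giraffe standing next to a tree", "a car parked by a fence"]
per_sketch = [categories_in_caption(c) for c in captions]
print([len(s) for s in per_sketch])   # categories per sketch (lower bound)
```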
https://github.com/pinakinathc/SketchX-SST
You will need to install `npm`, `nodejs`, `mongodb`, `pymongo`, and `numpy`. Once you have done the setup, you are ready to run the code. I used a Linode server to host this service; it costs £5 per month.
```bash
npm install          # install Node.js dependencies
python init_db.py    # initialise the database
sudo node server.js  # start the data collection server
```
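For orientation, `init_db.py` prepares the database backend. A purely hypothetical stand-in using `pymongo` might look like the following; the collection and field names are invented and are not the tool's actual schema.

```python
# Hypothetical stand-in for init_db.py: set up a MongoDB database for collected sketches.
# Collection and field names are illustrative only.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
db = client["sketch_collection"]

# One document per assigned image / collected sketch.
db.sketches.create_index("image_id", unique=True)
db.sketches.insert_one({
    "image_id": "000000000139",   # example MS-COCO image id
    "worker": None,               # filled in once a participant is assigned
    "strokes": [],                # per-point (x, y, t) data goes here
})
print("Initialised", db.sketches.count_documents({}), "document(s)")
```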
Once you are done collecting data, you can visualise your results at scale using `python visualise_sketch.py`. You will need to install `bresenham` and `cv2` (OpenCV) for this. Also modify lines 60 and 61 to set the paths of the MS-COCO and SketchyCOCO data directories.
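For context, visualisation boils down to rasterising the vector strokes onto a canvas. A minimal sketch of that idea with `bresenham` and OpenCV is shown below; the stroke format and canvas size are assumptions, not the script's actual code.

```python
# Minimal illustration of rasterising a vector stroke with bresenham + OpenCV.
# Assumes `points` is a list of (x, y) integer pixel coordinates for one stroke.
import cv2
import numpy as np
from bresenham import bresenham

canvas = np.full((512, 512), 255, dtype=np.uint8)    # white canvas
points = [(50, 60), (120, 80), (200, 200)]           # one example stroke

for (x0, y0), (x1, y1) in zip(points[:-1], points[1:]):
    for x, y in bresenham(x0, y0, x1, y1):
        canvas[y, x] = 0                             # draw black pixel

cv2.imwrite("stroke_preview.png", canvas)
```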
https://github.com/pinakinathc/fscoco
I use `PyTorch` and `PyTorch Lightning` for the experiments. If you face issues with dependencies, please contact me. I also added some code to run the experiments on an HPC cluster (i.e., Condor).
Example for running:

```bash
git clone https://github.com/pinakinathc/scene-sketch-dataset.git
cd scene-sketch-dataset/src/sbir_baseline
python main.py
```
Before you run `main.py`, ensure that the code is set up in training mode and that the data paths in `options.py` are correct.
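As a purely hypothetical illustration, the kind of settings to double-check usually look like the following; the option names below are invented, so refer to `options.py` for the real ones.

```python
# Purely illustrative: the kind of options to verify before running main.py.
# Option names here are invented; check options.py for the actual ones.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--mode", default="train", choices=["train", "test"],
                    help="make sure this is set to training mode")
parser.add_argument("--data_dir", default="/path/to/fscoco",
                    help="root directory of the FS-COCO data")
parser.add_argument("--coco_dir", default="/path/to/mscoco",
                    help="root directory of the MS-COCO images")
opts = parser.parse_args([])
print(opts)
```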
Downloading this dataset means you agree to the following License / Terms of Use:
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
@inproceedings{fscoco,
title={FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context},
author={Chowdhury, Pinaki Nath and Sain, Aneeshan and Bhunia, Ayan Kumar and Xiang, Tao and Gryaditskaya, Yulia and Song, Yi-Zhe},
booktitle={ECCV},
year={2022}
}
This dataset would not be possible without the support of the following wonderful people:
Anran Qi, Yue Zhong, Lan Yang, Dongliang Chang, Ling Luo, Ayan Das, Zhiyu Qu, Yixiao Zheng, Ruolin Yang, Ranit