Sketch Quality Prediction Using Transformers


Virginia Tech


The quality of an input sketch can affect the performance of the computational algorithms that operate on it. However, sketch quality is rarely considered in sketch tasks, and automated sketch quality prediction has not been previously studied. This thesis presents quality prediction on the "Sketchy" dataset. Rather than producing a computer-generated zero-to-one quality metric with no human input, the method predicts a human-understandable quality label. Previous work on tasks such as sketch classification has used a transformer architecture, called Sketchformer, to leverage the vector format of sketches. Here, the Sketchformer architecture is adopted and trained to predict quality labels of hand-drawn sketches. It achieves 66% accuracy when predicting five labels. In a separate experiment that combines the labels into good versus bad (two labels), the same transformer achieves up to 97% accuracy. The Sketchformer significantly outperforms the SVM baseline. The results show that the transformer embedding space facilitates separation of 'good' from 'bad' sketch quality with high accuracy.
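The relationship between the five-label and two-label experiments can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not say how the five labels are encoded or which of them count as "good", so the integer labels 1 to 5, the `threshold` parameter, and the toy predictions are all hypothetical, and the numbers below are not results from the thesis.

```python
from typing import Sequence


def to_binary(label: int, threshold: int = 3) -> str:
    """Collapse a 1..5 quality label into 'good' or 'bad'.

    The threshold is a hypothetical choice; the thesis may group labels differently.
    """
    if not 1 <= label <= 5:
        raise ValueError(f"label out of range: {label}")
    return "good" if label >= threshold else "bad"


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (hypothetical): true and predicted five-level quality labels.
y_true = [1, 2, 3, 4, 5, 5]
y_pred = [2, 2, 4, 4, 5, 3]

# Accuracy on the original five labels.
five_label_acc = accuracy(y_true, y_pred)

# Accuracy after merging both label lists into good versus bad.
two_label_acc = accuracy(
    [to_binary(t) for t in y_true],
    [to_binary(p) for p in y_pred],
)

print(f"5-label accuracy: {five_label_acc:.2f}")
print(f"2-label accuracy: {two_label_acc:.2f}")
```

In this toy data, every five-label error lands on the same side of the threshold as the true label, so those errors vanish once the labels are merged. That is one plausible reason a two-label experiment can score well above a five-label one.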



computer vision, sketches, transformers, machine learning