Learning visual similarity for product design with convolutional neural networks

Popular sites like Houzz, Pinterest, and LikeThatDecor have communities of users who help each other answer questions about products in images. In this paper, we learn an embedding for visual search in interior design. Our embedding contains two different domains of product images: products cropped from internet scenes and products in their iconic form. With such a multi-domain embedding, we demonstrate several applications of visual search, including identifying products in scenes and finding stylistically similar products. To obtain the embedding, we train a convolutional neural network on pairs of images. We explore several training architectures, including re-purposing object classifiers, using siamese networks, and using multitask learning. We evaluate our search quantitatively and qualitatively and demonstrate high-quality results for search across multiple visual domains, enabling new applications in interior design.
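As a rough illustration of training on image pairs and searching the resulting embedding, the sketch below uses one common pairwise objective, the contrastive loss, together with a brute-force nearest-neighbor lookup. The abstract does not state which loss or search method the paper uses, so the loss form, the margin value, the function names, and the toy 2-D embeddings are all assumptions made for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def contrastive_loss(a, b, same, margin=1.0):
    """Contrastive loss for a single pair of embeddings.

    same=True penalizes any distance between the pair; same=False
    penalizes the pair only while its distance is below the margin.
    The margin of 1.0 is an assumed value, not taken from the paper.
    """
    d = euclidean(a, b)
    if same:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2

def nearest(query, catalog):
    """Return the catalog key whose embedding is closest to the query."""
    return min(catalog, key=lambda k: euclidean(query, catalog[k]))

# Hypothetical 2-D embeddings; a real network would output many more dimensions.
catalog = {"chair_iconic": [0.0, 0.0], "lamp_iconic": [3.0, 4.0]}
scene_crop = [0.3, 0.4]  # embedding of a product cropped from a scene

print(nearest(scene_crop, catalog))
print(contrastive_loss(scene_crop, catalog["chair_iconic"], same=True))
print(contrastive_loss(scene_crop, catalog["lamp_iconic"], same=False))
```

In this toy setup the scene crop sits closest to the iconic chair image, so a matching pair incurs a small loss while the already distant lamp contributes none. In a siamese arrangement, both images of a pair would pass through the same network weights before a loss of this kind is applied.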

SIGGRAPH 2015: Technical Papers

Session/Category Title: Image Similarity & Search
