Deep Siamese CNN for Learning Visual Similarity
Developed a multi-task Siamese CNN that maps apparel images to a latent space in which visually similar items, and the same item in different poses, lie close together.
Dataset: Apparel images from the e-commerce fashion website Abof, covering 25 different labels, along with different poses of each apparel item.
- Used GoogLeNet (pre-trained on the ImageNet dataset) and fine-tuned it via transfer learning for the classification task on the apparel dataset.
- Multi-task learning: Used the above network in a Siamese architecture and trained it on pairs of images using a contrastive loss function. The loss combines two terms: a penalty for similar images that are far apart in the embedding space and a penalty for dissimilar images that are close together.
- A pair is positive if the second image shows the same item in a different pose; this lets the network learn a pose-invariant embedding space.
- Built a visual search algorithm that uses nearest-neighbor matching in the pose-invariant embedding space to find stylistically similar products.
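
The pair loss and the nearest-neighbor search described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the standard margin-based contrastive loss on Euclidean distance, and the function names, the `margin` parameter, and the toy two-dimensional embeddings are all hypothetical. In the real system the embeddings would come from the fine-tuned network.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb1, emb2, is_similar, margin=1.0):
    """Contrastive loss for a single pair of embeddings.

    Assumed standard form (not confirmed by the source):
      similar pair    -> 0.5 * d^2
      dissimilar pair -> 0.5 * max(0, margin - d)^2
    """
    d = euclidean(emb1, emb2)
    if is_similar:
        # Similar pair: the penalty grows with the distance between them.
        return 0.5 * d ** 2
    # Dissimilar pair: penalized only when closer than the margin.
    return 0.5 * max(0.0, margin - d) ** 2


def nearest_neighbors(query, catalog, k=3):
    """Return the ids of the k catalog items closest to the query embedding.

    `catalog` is a hypothetical dict mapping item id -> embedding vector.
    """
    ranked = sorted(catalog, key=lambda item_id: euclidean(query, catalog[item_id]))
    return ranked[:k]


# Toy catalog: two poses of the same shirt sit close together, the dress far away.
catalog = {
    "shirt_front": [0.0, 0.0],
    "shirt_side": [0.1, 0.0],
    "dress": [2.0, 2.0],
}
matches = nearest_neighbors([0.0, 0.05], catalog, k=2)
```

A brute-force sort like this is only reasonable for small catalogs; a production search over many products would more likely use an approximate nearest-neighbor index.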