Design and train a deep neural network that automatically generates a caption for an input image. The network was built with the PyTorch library and trained on the MS COCO dataset.
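
The description above does not spell out the architecture, so the following is only a minimal sketch of one common way to build a captioning network in PyTorch: a convolutional encoder that turns the image into a feature vector, and an LSTM decoder that predicts the caption one token at a time. The class names `EncoderCNN` and `DecoderRNN`, the tiny convolutional stack, the layer sizes, the `end_idx` token id, and the random tensors standing in for MS COCO images and captions are all assumptions made for illustration, not details of this project.

```python
import torch
import torch.nn as nn


class EncoderCNN(nn.Module):
    """Hypothetical small encoder; a real project would likely use a pretrained backbone."""

    def __init__(self, embed_size):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # (batch, 32, 1, 1)
        )
        self.fc = nn.Linear(32, embed_size)

    def forward(self, images):
        x = self.features(images).flatten(1)  # (batch, 32)
        return self.fc(x)                     # (batch, embed_size)


class DecoderRNN(nn.Module):
    """Hypothetical LSTM decoder that maps image features to token logits."""

    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # Teacher forcing: the image feature is the first input, followed by
        # every caption token except the last one.
        tokens = self.embed(captions[:, :-1])                       # (batch, T-1, embed)
        inputs = torch.cat([features.unsqueeze(1), tokens], dim=1)  # (batch, T, embed)
        hidden, _ = self.lstm(inputs)
        return self.fc(hidden)                                      # (batch, T, vocab)

    @torch.no_grad()
    def sample(self, features, max_len=20, end_idx=1):
        # Greedy decoding for a single image: pick the most likely token at
        # each step and feed it back in until end_idx or max_len is reached.
        inputs, state, words = features.unsqueeze(1), None, []
        for _ in range(max_len):
            hidden, state = self.lstm(inputs, state)
            word = self.fc(hidden.squeeze(1)).argmax(dim=1)  # shape (1,)
            words.append(word.item())
            if word.item() == end_idx:
                break
            inputs = self.embed(word).unsqueeze(1)
        return words


torch.manual_seed(0)
vocab_size, embed_size, hidden_size = 50, 32, 64  # assumed toy sizes
encoder = EncoderCNN(embed_size)
decoder = DecoderRNN(embed_size, hidden_size, vocab_size)

# Random stand-ins for a batch of images and tokenized captions.
images = torch.randn(4, 3, 64, 64)
captions = torch.randint(0, vocab_size, (4, 12))

# One training step: cross-entropy between predicted and reference tokens.
params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)
logits = decoder(encoder(images), captions)
loss = nn.CrossEntropyLoss()(logits.reshape(-1, vocab_size), captions.reshape(-1))
optimizer.zero_grad()
loss.backward()
optimizer.step()

# Greedy caption (as token ids) for the first image.
token_ids = decoder.sample(encoder(images[:1]))
```

In a real run the random tensors would be replaced by batches from an MS COCO data loader, and the token ids returned by `sample` would be mapped back to words through the vocabulary built from the training captions.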