The document describes a study on attention-based image captioning using deep learning. The study generates image captions with an encoder-decoder model augmented by an attention mechanism: the encoder is Google's InceptionV3 network, which extracts image features, and the decoder is a GRU that generates captions token by token. The model is trained on the MS COCO dataset and evaluated with the BLEU score. The results show that the attention mechanism helps the decoder focus on relevant image regions, producing more descriptive captions.
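As a minimal sketch of how such an attention mechanism can work, the snippet below implements additive (Bahdanau-style) attention in NumPy. The dimensions are assumptions chosen to match InceptionV3's final 8x8x2048 convolutional map flattened to 64 spatial regions; the weight matrices are randomly initialized stand-ins for parameters that would be learned during training, and the function names are hypothetical, not the study's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions: InceptionV3's last conv map (8x8x2048) flattened
# into 64 spatial regions of 2048 features each; 512-unit GRU state.
num_regions, feat_dim, hidden_dim, attn_units = 64, 2048, 512, 512

features = rng.standard_normal((num_regions, feat_dim))  # encoder output
hidden = rng.standard_normal(hidden_dim)                 # GRU decoder state

# Randomly initialized weights stand in for learned parameters.
W1 = rng.standard_normal((feat_dim, attn_units)) * 0.01
W2 = rng.standard_normal((hidden_dim, attn_units)) * 0.01
v = rng.standard_normal(attn_units) * 0.01

def bahdanau_attention(features, hidden):
    """Score each image region against the decoder state, then pool."""
    score = np.tanh(features @ W1 + hidden @ W2) @ v  # (num_regions,)
    weights = np.exp(score - score.max())
    weights /= weights.sum()                          # softmax over regions
    context = weights @ features                      # weighted sum of features
    return context, weights

context, weights = bahdanau_attention(features, hidden)
print(weights.shape, context.shape)    # (64,) (2048,)
print(round(float(weights.sum()), 6))  # 1.0 — a distribution over regions
```

At each decoding step the GRU would consume this context vector alongside the previous token's embedding, so the attention weights shift to different image regions as the caption unfolds.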