ISSN No: 2349-2287 (P) | E-ISSN: 2349-2279 (O) | E-mail: editor@ijiiet.com

Title : AN AUDIO CAPTIONING METHOD BASED ON REINFORCEMENT LEARNING WITH CRNN AND GRU

Author : Dr. A. Venkateswarlu, Mr. Gandham Srinivasa Rao, Mrs. K. Yojana

Abstract :

Audio captioning aims to generate a natural-language sentence that describes the content of an audio clip. This paper proposes a powerful CRNN encoder combined with a GRU decoder to tackle this multi-modal task. In addition to standard cross-entropy training, reinforcement learning is investigated as a way to generate richer and more accurate captions. Our approach significantly improves on the baseline model on all reported metrics, with a relative improvement of at least 34%. Results indicate that the proposed CRNN-GRU model with reinforcement learning achieves a SPIDEr of 0.190 on the Clotho evaluation set. With data augmentation, performance is further boosted to 0.223. In DCASE challenge Task 6 we ranked fourth on SPIDEr and second on five metrics, including BLEU, ROUGE-L and METEOR, without ensembling or data augmentation, while maintaining a small model size (only 5 million parameters). Index Terms: audio captioning, reinforcement learning, convolutional recurrent neural ne
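The abstract contrasts cross-entropy training with reinforcement learning but does not say which reinforcement learning formulation is used. As an illustration only, the sketch below assumes a REINFORCE-style objective with a baseline reward, a common choice for caption models; the function names, the baseline, and the use of a caption-level score such as SPIDEr as the reward are assumptions, not details taken from the paper.

```python
import math


def cross_entropy_loss(reference_log_probs):
    """Mean negative log-probability the decoder assigns to the reference tokens."""
    return -sum(reference_log_probs) / len(reference_log_probs)


def policy_gradient_loss(sampled_log_probs, sampled_reward, baseline_reward):
    """REINFORCE-style loss with a baseline (assumed formulation, not the paper's).

    sampled_log_probs: log-probabilities of the tokens in a sampled caption.
    sampled_reward:    caption-level score of the sampled caption.
    baseline_reward:   score of a reference point, e.g. a greedy-decoded caption.
    """
    advantage = sampled_reward - baseline_reward
    # Minimizing this raises the likelihood of captions that beat the baseline
    # and lowers the likelihood of captions that fall below it.
    return -advantage * sum(sampled_log_probs)


# Hypothetical numbers, chosen only to show the sign behaviour.
ce = cross_entropy_loss([math.log(0.5), math.log(0.5)])
better = policy_gradient_loss([-1.0, -2.0], sampled_reward=0.5, baseline_reward=0.3)
worse = policy_gradient_loss([-1.0, -2.0], sampled_reward=0.1, baseline_reward=0.3)
```

In this sketch the cross-entropy term scores each token against the reference caption, whereas the policy-gradient term weights a whole sampled caption by how much its reward exceeds the baseline, which is what lets a sentence-level metric act directly as the training signal.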

Email: editor@ijarcsa.org

www.ijarcsa.org