video caption

Introduction

This is a Python project for video captioning. It uses the hLSTMat model on the MSVD or MSR-VTT dataset.

How to use the code?

Data

Download the MSVD dataset.
Download the MSR-VTT dataset.
Extract video features using https://github.com/Cppowboy/video_feature_extractor

Requirements

Python 2.7
tensorflow
tensorboard
numpy
pandas
pickle (included in the Python standard library)
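
Before training, it can help to confirm that the modules listed above are importable in your environment. The following is a minimal sketch, not part of this repository: the function name find_missing and the constant REQUIRED_MODULES are illustrative assumptions, and the check only tests importability, not versions.

```python
import importlib

# Module names copied from the Requirements list above.
REQUIRED_MODULES = ["tensorflow", "tensorboard", "numpy", "pandas", "pickle"]


def find_missing(modules):
    """Return the names in `modules` that cannot be imported."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is not installed (or one of its own imports failed).
            missing.append(name)
    return missing
```

Calling find_missing(REQUIRED_MODULES) returns a list of the module names that still need to be installed; an empty list means every listed module could be imported.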

Run

First, change the data paths in data_engine.py to your own paths.
Then run python train.py to start training. To visualize the training procedure and view the scores, run tensorboard --logdir your_log_dir.
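
Since training depends on the paths set in data_engine.py, a quick check that they exist can catch typos before a run starts. This is a sketch under assumptions: the dictionary DATA_PATHS, its keys, and the placeholder paths are hypothetical and do not reflect the actual variable names in data_engine.py, so substitute the values you configured there.

```python
import os

# Hypothetical placeholders: replace with the paths you set in data_engine.py.
DATA_PATHS = {
    "features": "/path/to/video_features",
    "captions": "/path/to/captions",
}


def missing_paths(paths):
    """Return the sorted labels whose path does not exist on disk."""
    return sorted(
        label for label, path in paths.items() if not os.path.exists(path)
    )
```

If missing_paths(DATA_PATHS) returns a non-empty list, the named entries point to locations that do not exist and should be corrected before running train.py.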

Reference

https://github.com/zhaoluffy/hLSTMat
https://github.com/yunjey/show-attend-and-tell
Song, Jingkuan, et al. "Hierarchical LSTM with Adjusted Temporal Attention for Video Captioning." arXiv preprint arXiv:1706.01231 (2017).