
Convolutional LSTM for Next Frame Prediction / submitted by Thomas Adler
Author: Adler, Thomas
Reviewer: Hochreiter, Sepp
Published: Linz, 2017
Extent: 59 leaves : illustrations
Thesis: Universität Linz, Master's thesis, 2017
Keywords (DE): Convolutional LSTM / Frame Vorhersage
Keywords (EN): convolutional LSTM / next frame prediction
URN: urn:nbn:at:at-ubl:1-14085 (persistent identifier)
The work is available in accordance with the "Notes for Users" ("Hinweise für BenützerInnen")
Abstract (English)

Convolutional LSTM is an increasingly popular and promising algorithm for machine learning tasks on video data. Next frame prediction, i.e. predicting how a video will continue, is the machine learning task addressed in this thesis. The data used stems from car camera data sets, i.e. the scenes to be predicted show ordinary traffic situations recorded by a roof-mounted camera of a car. Convolutional LSTM has already proven feasible for such tasks. This thesis explores new ways of presenting the data to a known model architecture, the Predictive Coding Network, which is based on convolutional LSTM. One approach is to present delta information, i.e. the change between two consecutive frames, to the model in order to ease the learning task. The second approach examined tries to bundle the model's capacity so as to concentrate on the prediction of a single frame instead of a sequence. In both cases, the model architecture is unaffected or only marginally affected by the necessary changes, which ensures that the models remain comparable. Prior to these experiments, a feature of convolutional LSTMs called peepholes is studied with respect to its effectiveness in the context of next frame prediction. This is done on a toy data set named Moving MNIST, while the prediction techniques are evaluated on the real-world car camera data sets KITTI and Caltech Pedestrian.
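The delta representation mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is only a plausible reading of "the change between two consecutive frames", not code from the thesis; the helper names `to_deltas` and `from_deltas` and the float-valued toy frames are assumptions made here for illustration:

```python
import numpy as np

def to_deltas(frames):
    """Convert a frame sequence of shape (T, H, W) into delta frames.
    The first frame is kept as-is; each later entry holds the
    pixel-wise change from the previous frame."""
    deltas = np.empty_like(frames)
    deltas[0] = frames[0]
    deltas[1:] = frames[1:] - frames[:-1]
    return deltas

def from_deltas(deltas):
    """Invert to_deltas by cumulative summation along the time axis."""
    return np.cumsum(deltas, axis=0)

# Toy sequence: a single bright pixel moving one step right per frame.
frames = np.zeros((3, 4, 4), dtype=np.float32)
for t in range(3):
    frames[t, 1, t] = 1.0

deltas = to_deltas(frames)
reconstructed = from_deltas(deltas)
assert np.allclose(reconstructed, frames)  # the transform is lossless
```

Because the mapping is invertible, a model can be trained on delta frames while its outputs remain convertible back to full frames for evaluation, which matches the abstract's claim that the architecture itself need not change.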