Many computer vision applications reuse pre-learned features to analyse video frame by frame. These features are typically learned by Convolutional Neural Networks (CNNs) trained on high-quality images. However, available video content is almost always compressed, and this is rarely considered during analysis. In this paper, we present an empirical study that measures how the visual discrepancy of compressed data limits the learning performance of a CNN model. Learning performance is evaluated on a benchmark of synthetic datasets compressed at various levels using H.264/AVC. We measure image quality quantitatively using classical evaluation metrics such as Peak Signal-to-Noise Ratio (PSNR) and Structural SIMilarity (SSIM). A cross-evaluation is performed to measure the robustness of the CNN model when processing visual data across a wide range of quality levels. Our experimental results show that the performance of the CNN depends on the compression rate: in general, higher compression results in lower performance. However, performance on lower-quality test data can be improved by using lower-quality data for CNN training. Finally, our work demonstrates that conditioning the CNN on the compression properties could potentially lead to better learning.
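To illustrate the Peak Signal-to-Noise Ratio metric named in the abstract, the following is a minimal sketch of the standard definition, PSNR = 10 · log10(MAX² / MSE). The function name `psnr`, the nested-list frame representation, and the 8-bit peak value of 255 are assumptions made for illustration; this is not the authors' evaluation code.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Return the PSNR in dB between two equally sized grayscale frames.

    Frames are nested lists of pixel intensities (an assumption made
    for this sketch); max_value is the peak intensity, 255 for 8-bit data.
    """
    # Flatten both frames so pixels can be compared pairwise.
    ref = [p for row in reference for p in row]
    dist = [p for row in distorted for p in row]
    if not ref or len(ref) != len(dist):
        raise ValueError("frames must be non-empty and the same size")

    # Mean squared error between the reference and the distorted frame.
    mse = sum((a - b) ** 2 for a, b in zip(ref, dist)) / len(ref)
    if mse == 0:
        return math.inf  # identical frames: no distortion to measure

    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR indicates that the compressed frame is closer to the reference, so heavier H.264/AVC compression would generally be expected to lower the value.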
|Title of host publication||2018 International Joint Conference on Neural Networks, IJCNN 2018 - Proceedings|
|Publisher||Institute of Electrical and Electronics Engineers Inc.|
|Publication status||Published - 10 Oct 2018|
|Event||2018 International Joint Conference on Neural Networks, IJCNN 2018 - Rio de Janeiro, Brazil (Duration: 8 Jul 2018 → 13 Jul 2018)|
|Name||Proceedings of the International Joint Conference on Neural Networks|
|Conference||2018 International Joint Conference on Neural Networks, IJCNN 2018|
|City||Rio de Janeiro|
|Period||8/07/18 → 13/07/18|
Bibliographical note
Publisher Copyright: © 2018 IEEE.