Deep convolutional neural networks (CNNs) have achieved strong performance in semantic segmentation. However, these models are complex and slow to train, and their segmentation accuracy and real-time performance are often insufficient, which makes them difficult to apply directly to the semantic segmentation of road environment images for autonomous vehicles. As one of the basic models of deep learning, the auto-encoder (AE) has powerful capabilities for learning from data and extracting features directly from the raw data. In this study, the network architectures of the auto-encoder and the convolutional auto-encoder (CAE) are improved: a supervised learning auto-encoder and an improved convolutional auto-encoder are proposed, and the two are combined into a hybrid convolutional auto-encoder model. The model first extracts low-dimensional abstract features of road environment images with convolution and pooling layers at the front of the network, then uses the supervised learning auto-encoder to enhance and represent the semantic segmentation features, and finally generates the semantic segmentation result with de-convolution and un-pooling layers. The proposed hybrid convolutional auto-encoder therefore contains not only the encoding and decoding parts found in common semantic segmentation models, but also a semantic feature enhancement and representation part, so that a network with fewer convolutional and pooling layers can still achieve better semantic segmentation results. Compared with semantic segmentation models based on convolutional neural networks, the hybrid convolutional auto-encoder has fewer network layers, fewer network parameters, and simpler network training. We evaluated the proposed method on CamVid and Cityscapes, two standard benchmarks for semantic segmentation, and the results indicate better segmentation quality and good real-time performance.
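The pairing of pooling layers in the encoder with un-pooling layers in the decoder can be illustrated with a minimal sketch. It assumes 2x2 max pooling with stride 2 and index-based un-pooling, a common choice in encoder-decoder segmentation networks; the abstract does not state which un-pooling variant the model uses, and the function names `max_pool_2x2` and `max_unpool_2x2` are hypothetical, written in plain Python for illustration only.

```python
def max_pool_2x2(x):
    """2x2 max pooling, stride 2, on a 2-D list.

    Returns the pooled map and, for every pooled cell, the (row, col)
    position of the maximum in the input (assumed index-based scheme).
    """
    h, w = len(x), len(x[0])
    pooled, indices = [], []
    for i in range(0, h - 1, 2):
        pooled_row, index_row = [], []
        for j in range(0, w - 1, 2):
            # Collect the four values of the window with their positions.
            window = [(x[i + di][j + dj], (i + di, j + dj))
                      for di in (0, 1) for dj in (0, 1)]
            value, pos = max(window, key=lambda t: t[0])
            pooled_row.append(value)
            index_row.append(pos)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool_2x2(pooled, indices, out_h, out_w):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    out = [[0] * out_w for _ in range(out_h)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (i, j) in zip(pooled_row, index_row):
            out[i][j] = value
    return out


# Toy 4x4 feature map (illustrative values, not data from the paper).
feature = [[1, 2, 0, 3],
           [3, 4, 1, 0],
           [0, 1, 5, 6],
           [2, 0, 7, 8]]
pooled, indices = max_pool_2x2(feature)
restored = max_unpool_2x2(pooled, indices, 4, 4)
```

Here the encoder halves the spatial resolution, and the decoder restores the original resolution while keeping each maximum at the location it came from. This is the property that lets a decoder recover spatial detail for per-pixel labels; in a full network, de-convolution layers would then fill in the sparse un-pooled map.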