Describe a simple SIMD architecture for “Image processing” application.
Image sensors are used in numerous types of image acquisition devices such as digital cameras, camcorders, and CCTV cameras. Recently, their range of applications has broadened to include smart devices, and the acquired images are no longer merely stored but also used for interaction between a human and a computer. To satisfy the many goals of image sensors, the role of image enhancement is more important than ever before.
An image signal processor (ISP) is one of the non-optical devices that enhance the image quality of captured raw images; it consists of several image processing algorithms, including demosaicing, denoising, and white balancing, as well as other image enhancement algorithms. The latest ISP algorithms, which include iterations with adaptive selections according to the image characteristics, produce excellent image quality. This high image quality, however, costs a vast amount of computation and also requires complicated adaptive routines that cannot be executed in parallel.
An ISP can be implemented on dedicated hardware, a general-purpose processor, or a parallel-computing processor. A dedicated hardware implementation offers high image quality and processing performance at the expense of scalability and flexibility. An implementation on a general-purpose processor suits the high image quality of complicated algorithms and offers sound scalability and flexibility; however, its implementation cost is high because of the large amount of computation, and a high-performance platform such as a desktop PC is necessary. A parallel-computing processor combines high processing performance and low power consumption with the scalability and flexibility of a software implementation. Implementing an ISP algorithm on a parallel-computing processor, however, requires further optimization to use multiple processing elements in parallel. Because of the adaptivity of the ISP algorithm, the conventional parallel-ISP optimization methodology first divides the algorithm into data-processing parts and control-processing parts and then runs them in parallel. A Very Long Instruction Word (VLIW) architecture can therefore be an easy choice for ISP implementation, even though a Single Instruction Multiple Data (SIMD) architecture can exploit a greater extent of parallelism.
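To make the SIMD idea concrete, here is a minimal sketch of a SIMD array as a toy model in plain Python: one instruction is broadcast to several processing elements (PEs), and each PE applies it to its own pixel. The lane count `NUM_LANES` and the function names are illustrative assumptions, not details of any particular processor.

```python
# Toy model of a SIMD array: one instruction, many processing elements (PEs).
# NUM_LANES and all function names are assumptions made for illustration.

NUM_LANES = 8  # assumed number of PEs; real hardware fixes this width


def simd_add_saturate(lanes, constant):
    """Apply the same 'add with 8-bit saturation' instruction in every lane."""
    return [min(255, max(0, pixel + constant)) for pixel in lanes]


def brighten_row(row, constant):
    """Process one image row in chunks of NUM_LANES pixels."""
    out = []
    for start in range(0, len(row), NUM_LANES):
        lanes = row[start:start + NUM_LANES]            # load: one pixel per PE
        out.extend(simd_add_saturate(lanes, constant))  # one broadcast instruction
    return out


row = [0, 10, 100, 200, 250, 255, 30, 60, 90, 120]
print(brighten_row(row, 20))
# prints [20, 30, 120, 220, 255, 255, 50, 80, 110, 140]
```

The control unit issues one instruction per chunk regardless of how many pixels the chunk holds, which is why uniform per-pixel operations map well onto this kind of architecture.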
An ISP full chain that is suitable for parallel processing is proposed in this paper, and the chain is implemented through an optimization process for a SIMD processor architecture to meet both image quality and performance goals. The proposed ISP full chain is shown in Fig. 1.
Fig. 1: Proposed ISP full chain
In Fig. 1, GWA is Gray World Assumption, AHD is Adaptive Homogeneity-Directed Demosaicing, BF is Bilateral Filter, AC is Auto Contrast, and LTI is Luminance Transient Improvement.
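As an example of one stage, the textbook gray-world white balance can be sketched as follows. This is a minimal sketch that assumes RGB triples as input for simplicity; it is not necessarily the variant or the data layout used in the proposed chain, and the function names are hypothetical. Once the three gains are known, the same multiply is applied to every pixel, which is the kind of uniform work a SIMD array handles well.

```python
# Textbook gray-world white balance (illustrative; input format is an assumption).

def gray_world_gains(pixels):
    """Return per-channel gains that pull each channel mean to the gray mean."""
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3
    # Guard against an all-zero channel to avoid division by zero.
    return [gray / m if m > 0 else 1.0 for m in means]


def apply_gains(pixels, gains):
    """Same multiply-and-clamp for every pixel: a data-parallel step."""
    return [
        tuple(min(255, round(p[c] * gains[c])) for c in range(3))
        for p in pixels
    ]


pixels = [(100, 50, 150), (100, 50, 150)]
gains = gray_world_gains(pixels)
print(apply_gains(pixels, gains))
# prints [(100, 100, 100), (100, 100, 100)]
```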
All of the algorithms in the proposed ISP chain produce high-quality images without iterations, which keeps the execution time within the real-time budget [1]. While the basic idea of each algorithm is maintained, its operations have been simplified for easy parallelization on the SIMD architecture; in addition, heavy memory accesses and excessive computational overheads are reduced by limiting the operational ranges. Each complicated special operation is replaced by a simple operation that performs a similar function, and the result was verified by experiments.
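One common way to make an adaptive, per-pixel decision SIMD-friendly is to replace the branch with a mask and a select, so every lane executes the same instruction sequence. The sketch below shows this generic predication technique; it is an illustration only, not a claim about the specific simplifications the authors made, and all names are hypothetical.

```python
# Branchy scalar code versus a uniform mask-and-select version (illustrative).

def scalar_clip_branchy(pixel, center, limit):
    """Per-pixel branch: control flow depends on the data."""
    if pixel - center > limit:
        return center + limit
    if pixel - center < -limit:
        return center - limit
    return pixel


def simd_select(mask, if_true, if_false):
    """Lane-wise select: every lane runs the same operation."""
    return [t if m else f for m, t, f in zip(mask, if_true, if_false)]


def lanes_clip_uniform(pixels, centers, limit):
    """Same result as the branchy version, with no data-dependent branch."""
    upper = [c + limit for c in centers]
    lower = [c - limit for c in centers]
    above = [p > u for p, u in zip(pixels, upper)]  # compare produces a mask
    below = [p < l for p, l in zip(pixels, lower)]
    out = simd_select(above, upper, pixels)
    return simd_select(below, lower, out)


pixels = [10, 50, 90, 52]
centers = [50, 50, 50, 50]
print(lanes_clip_uniform(pixels, centers, 5))
# prints [45, 50, 55, 52]
```

Both paths are computed in every lane and the mask picks the result, trading a few extra operations for identical instruction flow across all processing elements.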
The proposed parallel ISP algorithm is targeted to run on the Samsung Reconfigurable Processor (SRP) [2–7], which can be configured as a SIMD processor. Numerous high-quality image processing algorithms form the basis of each of the functional components of the proposed ISP full chain [8–30]. By increasing the homogeneity of the parallel operations in the ISP algorithms, the proposed ISP algorithm can take advantage of the parallel performance of a SIMD processor while maintaining an image quality that can pass the commercial image quality test of Skype [31]. The proposed ISP can handle full HD video (1920 × 1080, 30 frames per second) on a 600-MHz SRP that is suitable for smart devices.
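The figures above imply a tight per-pixel cycle budget, which a short calculation makes visible. This is a rough sketch that assumes every clock cycle is available for pixel work and ignores memory stalls and control overhead; the function name is hypothetical.

```python
# Rough real-time budget: clock cycles available per pixel (idealized).

def cycles_per_pixel(clock_hz, width, height, fps):
    """Clock cycles available for each pixel at the given frame rate."""
    return clock_hz / (width * height * fps)


budget = cycles_per_pixel(600e6, 1920, 1080, 30)
print(f"{budget:.2f} clock cycles per pixel")
# prints 9.65 clock cycles per pixel
```

With fewer than ten cycles per pixel for the whole chain, a processor that handles one pixel at a time cannot meet the budget, which is why the work has to be spread across many lanes.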
This paper comprises the following: Section 2 describes the existing research; Section 3 describes the implementation of the proposed ISP full chain; Section 4 describes the performance verification process and the results of the proposed ISP full chain; and the conclusion is presented in Section 5.
Note: This image and content were collected from Google.