ISSN : 2349-3917
Pankaj Chaturvedhi*
Department of Computer Science, Bipin Tripathi Kumaon Institute of Technology, Uttarakhand, India
Received date: December 27, 2022, Manuscript No. IPACSIT-23-15954; Editor assigned date: December 29, 2022, PreQC No. IPACSIT-23-15954(PQ); Reviewed date: January 12, 2023, QC No. IPACSIT-23-15954; Revised date: January 19, 2023, Manuscript No. IPACSIT-23-15954(R); Published date: January 27, 2023, DOI: 10.36648/2349-3917.11.1.3
Citation: Chaturvedhi P (2023) Method to Distinguish CG Images from PG Images Using a Two-Stream Convolutional Neural Network (CNN). Am J Compt Sci Inform Technol Vol: 11 No: 1: 003.
Multimedia tools have developed rapidly over the past ten years. With these tools, anyone can create computer-generated images of the art, objects, and scenes they want. A skilled user can produce images so photorealistic that the human visual system is nearly unable to distinguish Computer-Generated (CG) images from Photographic (PG) images, which are typically captured with digital cameras. Today, the film industry, virtual reality, digital arts, education, and video games all benefit greatly from computer-generated videos and images. Synthesizing Obama and Face2Face are two examples of real-time video spoofing projects. Another interesting example is a film in which a replica of actress Carrie Fisher was digitally recreated to look exactly as she did in her 1970s films. High-quality, photorealistic CG images can also be produced using Generative Adversarial Networks (GANs).
However, CG images can be misused in a variety of ways. For instance, a computer-generated fake video or image of a celebrity can be circulated on the Internet or social media to defame that person. Similarly, CG images containing false information may cause social disorder if they are presented to the judiciary. These developments have created an alarming situation for experts in image forensics. As a result, a method that can accurately differentiate CG images from PG images must be developed.
Existing detection methods generally follow three steps: preprocessing, feature extraction, and classification. Although preprocessing the image prior to training is optional, numerous methods in the literature employ a variety of preprocessing strategies, such as high-pass filtering, RGB to YCbCr conversion, and RGB to grayscale conversion. The fundamental steps are the second and third: the feature extractor and the classifier. Over the past few years, researchers have investigated various combinations of feature extractors and classifiers.
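The three preprocessing strategies named above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the source does not say which color-conversion coefficients or which high-pass kernel the surveyed methods use, so the sketch assumes the ITU-R BT.601 luma weights, the full-range JFIF YCbCr conversion, and a 3x3 Laplacian kernel. The function names are hypothetical.

```python
# Illustrative preprocessing helpers. The coefficients and the kernel are
# assumptions (BT.601 / JFIF conventions, Laplacian), not taken from the paper.

def rgb_to_gray(pixel):
    """Map one (R, G, B) pixel in [0, 255] to a BT.601 luma value."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def rgb_to_ycbcr(pixel):
    """Map one (R, G, B) pixel to full-range (Y, Cb, Cr), JFIF convention."""
    r, g, b = pixel
    y = rgb_to_gray(pixel)
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (y, cb, cr)


# 3x3 Laplacian kernel: its weights sum to zero, so flat regions map to 0
# and only local intensity changes (edges, noise) survive.
LAPLACIAN = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]


def high_pass(gray, kernel=LAPLACIAN):
    """Filter a 2-D grayscale image (list of rows) with a 3x3 kernel.

    Border pixels are handled by replicating the nearest edge pixel.
    """
    height, width = len(gray), len(gray[0])
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    # Clamp indices so out-of-range neighbours reuse the edge.
                    ii = min(max(i + di, 0), height - 1)
                    jj = min(max(j + dj, 0), width - 1)
                    acc += kernel[di + 1][dj + 1] * gray[ii][jj]
            out[i][j] = acc
    return out
```

A high-pass output of this kind suppresses image content and keeps fine-grained residuals, which is presumably why such filters are a common preprocessing choice in image forensics.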
The feature extractors used in current approaches can be broadly divided into hand-crafted feature extractors, which are designed to extract features of a particular type, and CNN-based feature extractors. Well-designed hand-crafted feature extractors include the Discrete Wavelet Transform (DWT), Contourlet Wavelet Transform (CWT), Quaternion Wavelet Transform (QWT), Discrete Fourier Transform (DFT), Local Binary Pattern (LBP), and Local Ternary Count (LTC). Hand-crafted features have produced excellent results in a variety of image forensic tools, but their generalizability is limited because each extractor concentrates on specific aspects of an image. In addition, these features depend on the designer's prior knowledge, so their performance can be limited, particularly on complex datasets. Feature extractors based on convolutional neural networks or other deep learning models, on the other hand, are more diverse, because they learn a large number of features from an image. Prior to 2016, the majority of techniques for detecting computer graphics relied solely on hand-crafted features, and several popular methods belong to this category. In light of the state-of-the-art performance of deep learning in image forensics, numerous authors have proposed deep learning-based solutions to this problem. Four deep learning-based approaches have recently been developed to address it. In one such study, the authors created four image datasets. An Attention-Based Dual-Branch Convolutional Neural Network (AD-CNN) has been proposed to obtain reliable representations from fused color components. A self-coding module is added to the approach, which uses color images as input to explicitly extract the correlation between color channels.
However, the method extracts features using a DenseNet-201 network. A number of deep learning-based approaches to this issue of CG and PG classification were thoroughly examined in a recent study.
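To make the idea of a hand-crafted feature concrete, the following minimal sketch computes the basic 8-neighbour Local Binary Pattern (LBP) mentioned above. It assumes the simplest 3x3 variant with a "greater than or equal" comparison and a clockwise bit order starting at the top-left neighbour; the methods surveyed here may use other variants, and the function names are hypothetical.

```python
def lbp_code(gray, i, j):
    """8-bit LBP code of interior pixel (i, j) in a 2-D grayscale image."""
    center = gray[i][j]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (di, dj) in enumerate(offsets):
        # Set the bit when the neighbour is at least as bright as the centre.
        if gray[i + di][j + dj] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(gray):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for i in range(1, len(gray) - 1):
        for j in range(1, len(gray[0]) - 1):
            hist[lbp_code(gray, i, j)] += 1
    return hist
```

The resulting histogram is a fixed-length texture descriptor that could be passed to a conventional classifier. It also shows the limitation noted above: the descriptor captures only one specific kind of local structure, chosen in advance by the designer.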
This paper makes the following contributions. First, it analyzes and summarizes the shortcomings of previously proposed approaches to distinguishing CG images from PG images. Second, we demonstrate how two-stream classification using CNNs can address the problem of CG and PG classification; we created two models, an RGB stream and a noise stream. Third, we demonstrate that although the image features learned by these two streams are completely distinct, they complement each other and together yield superior discrimination between CG and PG images. As a result, we are able to outperform current methods on the most widely used image dataset. Fourth, ensemble learning is used to combine the results of the two streams. This is, as far as we are aware, the first work on this problem to employ an ensemble learning model to produce more diverse results. Rather than relying on a single model's decision, it may be preferable to combine decisions from two distinct models. Finally, a series of experiments on readily available image datasets demonstrates the two-stream network's robustness.
In summary, this paper uses a two-stream convolutional neural network to distinguish photographic images from computer-generated images. The first stream focuses on learning image features based on RGB, while the second stream focuses on learning image features based on noise. An ensemble learning technique is used to combine the results of the two streams.
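The combination step can be illustrated with a short sketch. The text does not specify which ensemble technique is used, so the example below assumes simple weighted soft voting: each stream outputs a class-probability vector, and the two vectors are averaged. The class order in `LABELS`, the weight `w_rgb`, and the function names are all hypothetical.

```python
LABELS = ("CG", "PG")  # hypothetical class order, assumed for illustration


def ensemble_probs(p_rgb, p_noise, w_rgb=0.5):
    """Weighted average of the two streams' class-probability vectors."""
    if len(p_rgb) != len(p_noise):
        raise ValueError("both streams must score the same classes")
    if not 0.0 <= w_rgb <= 1.0:
        raise ValueError("w_rgb must lie in [0, 1]")
    return [w_rgb * a + (1.0 - w_rgb) * b for a, b in zip(p_rgb, p_noise)]


def ensemble_label(p_rgb, p_noise, w_rgb=0.5):
    """Label with the highest fused probability (soft voting)."""
    fused = ensemble_probs(p_rgb, p_noise, w_rgb)
    best = max(range(len(fused)), key=fused.__getitem__)
    return LABELS[best]
```

For example, if the RGB stream outputs `[0.9, 0.1]` and the noise stream outputs `[0.4, 0.6]`, equal weighting gives fused probabilities of about `[0.65, 0.35]`, so the ensemble predicts the first class even though the noise stream alone would not. Other ensemble schemes, such as majority voting or a learned meta-classifier, would fit the same interface.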