In: Computer Science
Please discuss the answer in detail (machine learning using a CNN).
I need these questions answered for a fruit detection system, for a project report. Please discuss broadly.
3. Why is a fruit detection system using a convolutional neural network (CNN) needed?
4. What methods are used in a fruit detection system using a neural network (CNN)?
How can TensorFlow be used to make a fruit detection system using a neural network (CNN)?
3. Answer
A convolutional neural network (CNN) is a neural network with one or more convolutional layers, used mainly for image processing, classification, segmentation, and other autocorrelated data.
CNNs are hugely popular because of their architecture: there is no need for manual feature extraction, as the network learns to extract features itself. The core concept of a CNN is that it convolves the image with learned filters to generate invariant features, which are passed on to the next layer.
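To make the "convolution of image and filters" idea concrete, here is a minimal NumPy sketch (not from the original answer) of the sliding-window operation a convolutional layer performs, with a hand-made vertical-edge filter standing in for a learned one:

```python
import numpy as np

def conv2d_valid(img, kernel):
    """'Valid' 2-D convolution (strictly, cross-correlation, as used in CNN layers).
    Slides the kernel over every interior position and sums the element-wise products."""
    kh, kw = kernel.shape
    h = img.shape[0] - kh + 1
    w = img.shape[1] - kw + 1
    out = np.empty((h, w), dtype=np.float32)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge filter: responds where intensity changes from left to right.
edge_kernel = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]], dtype=np.float32)
```

In a real CNN the kernel values are learned during training; the mechanics of producing a feature map are exactly this.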
CNNs have been applied to several automatic processing tasks on fruit images: classification, quality control, and detection. In the last two years (2019–2020), the use of CNNs for fruit recognition has greatly increased, obtaining excellent results either with new models or with pre-trained networks used for transfer learning.
Recent work in deep neural networks has led to a state-of-the-art object detector termed Faster Region-based CNN (Faster R-CNN). We adapt this model, through transfer learning, to the task of fruit detection using imagery from two modalities: colour (RGB) and Near-Infrared (NIR). Early and late fusion methods are explored for combining the multi-modal (RGB and NIR) information. This leads to a novel multi-modal Faster R-CNN model, which achieves state-of-the-art F1 scores compared to prior work. In addition to improved accuracy, this approach is also much quicker to deploy for new fruits, as it requires bounding-box annotation rather than pixel-level annotation (annotating bounding boxes is approximately an order of magnitude faster).
4. Answer
1. Fruit Detection Using a Conditional Random Field
We use a conditional random field (CRF) to model colour and visual texture features from multi-spectral images, which leads to impressive performance for fruit segmentation. The multi-spectral images contain both colour (RGB) and Near-Infrared (NIR) information, and the CRF uses both colour and texture features. The colour features are constructed by converting the RGB values directly to the HSV colour space. Visual texture features are extracted from the NIR channel, as NIR images were found to be more consistent than the colour imagery for computing texture. Three sets of visual texture features are used:
(i) Sparse Autoencoder (SAE) features
(ii) Local Binary Pattern (LBP)
(iii) Histogram of Oriented Gradients (HOG) features.
Each feature captures a different property: the distribution of local gradients, edges, and texture, respectively. The LBP feature in particular can capture information such as the smooth surface of fruits, and provides an efficient way to encode visual texture.
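As an illustration of how LBP encodes texture, here is a minimal NumPy sketch (an assumption-level example, not code from the paper) of the basic 8-neighbour LBP: each pixel gets an 8-bit code with one bit per neighbour that is at least as bright as the centre:

```python
import numpy as np

def lbp_3x3(img):
    """Basic 8-neighbour Local Binary Pattern over a 2-D grayscale array.
    Each interior pixel receives an 8-bit code; border pixels are left as zero."""
    img = np.asarray(img, dtype=np.float32)
    out = np.zeros(img.shape, dtype=np.uint8)
    centre = img[1:-1, 1:-1]
    code = np.zeros(centre.shape, dtype=np.uint8)
    # neighbour offsets, clockwise from the top-left corner
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= (nb >= centre).astype(np.uint8) << bit
    out[1:-1, 1:-1] = code
    return out
```

A histogram of these codes over an image patch is the texture descriptor; on the smooth surface of a fruit the codes are highly uniform, which is what makes LBP discriminative here. In practice a library implementation such as `skimage.feature.local_binary_pattern` would typically be used instead.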
2. Fruit Detection Using Faster R-CNN
Despite the recent progress made by deep convolutional neural networks on large-scale image classification and detection, accurate object (fruit) detection remains a challenging problem in computer vision and machine learning. The task requires not only detecting which objects are in a scene, but also where they are located. Accurate region-proposal algorithms therefore play a significant role in object detection.
There are earlier approaches, such as Selective Search, which merges superpixels based on low-level features, and EdgeBoxes, which uses edge information to generate region proposals. However, these methods require as much running time as the detection itself to hypothesise object locations. Faster R-CNN overcomes this by introducing the Region Proposal Network (RPN), which shares convolutional features with the classification network; the two networks are concatenated into a single network that can be trained and tested end-to-end. As a result, region-proposal generation takes around 10 ms, and the framework can maintain a 5 fps detection rate while achieving state-of-the-art object detection accuracy using very deep models.
The Faster R-CNN framework uses colour (RGB) images to perform general object detection. It consists of two parts:
(i) a region-proposal step
(ii) a region classifier.
The region-proposal step produces a set of N_P proposed regions (bounding boxes) where the object(s) of interest could reside within the image. The region-classifier step then determines whether each region belongs to an object class of interest; the classifier can be a 2-class or an N-class classifier. To train Faster R-CNN for our task, we perform fine-tuning. This requires labelled (annotated) bounding-box information for each of the classes to be trained.
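One practical route for this fine-tuning (an assumption on my part; the original answer names no toolchain) is the TensorFlow Object Detection API, where training is driven by a `pipeline.config` file. A heavily abbreviated sketch, with all paths and values as placeholders:

```
model {
  faster_rcnn {
    num_classes: 1   # e.g. a single "fruit" class
    ...
  }
}
train_config {
  fine_tune_checkpoint: "path/to/pretrained/checkpoint/ckpt-0"
  fine_tune_checkpoint_type: "detection"
  ...
}
```

The pre-trained checkpoint provides the transfer-learning starting point, and the bounding-box annotations are supplied as TFRecord datasets referenced elsewhere in the same config.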
Multi-Modal Fusion
Here we present the two methods, late and early fusion, used to combine the multi-modal (RGB and NIR) imagery. Late fusion combines the classification decisions from the two modalities; early fusion alters the structure of the input layer of the VGG network.
Late Fusion
Late fusion combines the classification information from the two modalities, colour and NIR imagery, using the independently trained models for each modality. Each modality m produces N_{m,P} region proposals. To combine the two modalities, these proposals are pooled to form a single set of N_P* = Σ_m N_{m,P} region proposals. A score s_{m,p} is then computed for the p-th proposed region by the m-th modality, and a single score for the p-th region is produced by averaging the responses across the modalities:
s_p = (1/M) · Σ_{m=1}^{M} s_{m,p}
The score is a C-dimensional variable, where C is the number of classes to be classified.
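The averaging step above can be sketched in a few lines of NumPy (an illustrative example, not code from the paper), assuming each modality's classifier has already scored the same pooled set of region proposals:

```python
import numpy as np

def late_fusion_scores(scores_per_modality):
    """Average per-region class scores across modalities.

    scores_per_modality: list of M arrays, each of shape (N_regions, C),
    one per modality (e.g. RGB and NIR), aligned on the same pooled
    region-proposal set. Returns the fused (N_regions, C) scores s_p."""
    stacked = np.stack(scores_per_modality, axis=0)  # shape (M, N_regions, C)
    return stacked.mean(axis=0)                      # average over modalities
```

The fused score for each region is then thresholded or argmax-ed exactly as a single-modality score would be.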
Using TensorFlow to make a fruit detection system using a neural network (CNN)
These are the steps used to train the CNN (Convolutional Neural Network).
Steps:
Step 1: Upload Dataset
Step 2: The Input layer
Step 3: Convolutional layer
Step 4: Pooling layer
Step 5: Convolutional layer and Pooling Layer
Step 6: Dense layer
Step 7: Logit layer
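The steps above can be sketched as a small Keras model (a minimal example under assumed settings: 100×100 RGB inputs and 10 fruit classes are placeholders, not values from the original answer):

```python
import tensorflow as tf

NUM_CLASSES = 10       # assumption: number of fruit classes in your dataset
IMG_SIZE = (100, 100)  # assumption: input image size

def build_fruit_cnn():
    """CNN following Steps 2-7: input, conv, pool, conv, pool, dense, logits."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=IMG_SIZE + (3,)),       # Step 2: input layer
        tf.keras.layers.Conv2D(32, 3, activation="relu"),   # Step 3: convolutional layer
        tf.keras.layers.MaxPooling2D(2),                    # Step 4: pooling layer
        tf.keras.layers.Conv2D(64, 3, activation="relu"),   # Step 5: second conv +
        tf.keras.layers.MaxPooling2D(2),                    #         pooling pair
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),      # Step 6: dense layer
        tf.keras.layers.Dense(NUM_CLASSES),                 # Step 7: logit layer
    ])
    model.compile(
        optimizer="adam",
        # from_logits=True because the last layer emits raw logits, not softmax
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    return model
```

For Step 1 (uploading the dataset), one common option is `tf.keras.utils.image_dataset_from_directory` pointed at a folder of per-class image subdirectories, after which `model.fit(train_ds, epochs=...)` runs the training loop.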