
Why deep neural networks are receiving much attention

Author: QINSUN | Released: 2024-03

Deep neural networks are a technology in the field of machine learning (ML).

In supervised learning, the problem with earlier multi-layer neural networks was that they were prone to getting stuck in local minima. If the training samples sufficiently cover future samples, the learned multi-layer weights can predict new test samples well. However, many tasks struggle to obtain enough labeled samples. In such cases, simple models such as linear regression or decision trees often achieve better results than multi-layer neural networks: better generalization, despite worse training error.

In unsupervised learning, there used to be no effective method for building multi-layer networks. The top layer of a multi-layer neural network is a high-level representation of the underlying features: the bottom layer may represent pixels, nodes in the next layer up may represent horizontal lines or triangles, and a node at the top layer may represent a face. A successful algorithm should generate top-level features that represent the low-level examples as faithfully as possible. If all layers are trained simultaneously, the time complexity is too high; if layers are trained one at a time, the error is propagated layer by layer, and we face the opposite problem of supervised learning: severe underfitting.

In 2006, Hinton proposed an effective method for building multi-layer neural networks on unsupervised data. Simply put, it consists of two steps: first, train one layer of the network at a time; second, tune the weights so that the high-level representation r generated bottom-up from the original representation x, and the low-level representation generated back down from r, are as consistent as possible. The method is as follows.

1. Construct the network layer by layer, so that only a single-layer network is trained at a time.

2. After all layers are trained, Hinton uses the wake-sleep algorithm for fine-tuning. The weights between all layers except the top one are made bidirectional, so that the top layer remains a single-layer neural network while the other layers become a graphical model. Upward weights are used for cognition (recognition), and downward weights are used for generation. The wake-sleep algorithm then adjusts all the weights so that cognition and generation reach a consensus, i.e. so that the generated top-level representation can restore the underlying nodes as accurately as possible. For example, if a top-level node represents a face, then all facial images should activate that node, and the image generated downward from it should resemble a rough face. The wake-sleep algorithm is divided into two phases: wake and sleep.

2.1 In the wake phase, the cognitive process generates an abstract representation (node states) at each layer from external inputs and the upward (recognition) weights, and uses gradient descent to modify the downward (generative) weights between layers. In other words: "If reality differs from what I imagined, change my generative weights so that what I imagine looks like reality."

2.2 In the sleep phase, the generative process produces low-level states from the top-level representation (the concepts learned while awake) and the downward weights, and modifies the upward (recognition) weights between layers. In other words: "If the scene in my dream does not correspond to the concept it came from in my mind, change my recognition weights so that this scene maps to that concept."
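The two steps above can be sketched in NumPy. This is a deterministic toy sketch, not Hinton's actual algorithm: real wake-sleep uses stochastic binary units and the delta rule, whereas here every layer is a deterministic sigmoid layer trained with squared-error gradients, and all layer sizes, learning rates, and iteration counts are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: 100 binary vectors of dimension 16 (invented for illustration).
X = (rng.random((100, 16)) > 0.5).astype(float)

# Step 1: greedy layer-wise pretraining. Each layer is trained as a
# single-layer autoencoder on the representation produced by the layer below.
sizes = [16, 8, 4]          # layer widths, bottom to top
W_up, W_down = [], []       # recognition (upward) and generative (downward) weights
lr = 0.1
h = X
for n_in, n_out in zip(sizes[:-1], sizes[1:]):
    Wu = rng.normal(0, 0.1, (n_in, n_out))  # upward weights
    Wd = rng.normal(0, 0.1, (n_out, n_in))  # downward weights
    for _ in range(200):
        r = sigmoid(h @ Wu)                  # encode: low-level -> high-level
        x_hat = sigmoid(r @ Wd)              # decode: high-level -> low-level
        d_out = (x_hat - h) * x_hat * (1 - x_hat)   # squared-error output delta
        d_hid = (d_out @ Wd.T) * r * (1 - r)        # backpropagated hidden delta
        Wd -= lr * r.T @ d_out / len(h)
        Wu -= lr * h.T @ d_hid / len(h)
    W_up.append(Wu)
    W_down.append(Wd)
    h = sigmoid(h @ Wu)      # feed this layer's representation to the next layer

# Step 2: one simplified wake pass and one sleep pass.
# Wake: drive states bottom-up with the recognition weights, then adjust the
# generative weights so that each layer can regenerate the layer below it.
states = [X]
for Wu in W_up:
    states.append(sigmoid(states[-1] @ Wu))
for i, Wd in enumerate(W_down):
    below, above = states[i], states[i + 1]
    recon = sigmoid(above @ Wd)
    Wd -= lr * above.T @ ((recon - below) * recon * (1 - recon)) / len(X)

# Sleep: drive states top-down with the generative weights, then adjust the
# recognition weights so that each layer maps back to the layer above it.
dream = [states[-1]]
for Wd in reversed(W_down):
    dream.append(sigmoid(dream[-1] @ Wd))
dream = dream[::-1]          # reorder bottom-to-top
for i, Wu in enumerate(W_up):
    below, above = dream[i], dream[i + 1]
    recog = sigmoid(below @ Wu)
    Wu -= lr * below.T @ ((recog - above) * recog * (1 - recog)) / len(X)

# Reconstruct the data through the full stack: up with W_up, down with W_down.
recon = X
for Wu in W_up:
    recon = sigmoid(recon @ Wu)
for Wd in reversed(W_down):
    recon = sigmoid(recon @ Wd)
print(recon.shape)  # (100, 16)
```

The per-layer loop is the "train one layer at a time" step; the wake and sleep passes then pull the recognition and generative weights toward agreement, which is the consensus the text describes.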

Because autoencoders (a class of neural networks) can perform association, the algorithm above can also be used for supervised learning. Broadly, "autoencoder" refers to any structure that obtains a high-level representation from a low-level one and can generate an approximate low-level representation back from it; narrowly, it refers to one particular such structure, which is used in Google's facial recognition. Association means that an input with missing parts can still be encoded correctly. For supervised learning, the label y is therefore appended to the input of the top-level network during training, and at prediction time the top-level network generates an estimate y'.
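The association idea can be illustrated with a toy sketch (a hedged illustration, not the architecture actually used in Google's system): train a small autoencoder to reconstruct the concatenated vector [x, y], then at prediction time present x with the y slots blanked out and read the reconstructed y slots as y'. The task, layer sizes, and learning rate below are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: x is 8 random bits; y one-hot-encodes a simple property of x
# (here, the value of its first bit). Purely illustrative.
X = (rng.random((200, 8)) > 0.5).astype(float)
Y = np.eye(2)[X[:, 0].astype(int)]

# Train a one-hidden-layer autoencoder to reconstruct the joint vector [x, y].
V = np.hstack([X, Y])                        # (200, 10)
W1 = rng.normal(0, 0.1, (10, 16))            # encoder weights
W2 = rng.normal(0, 0.1, (16, 10))            # decoder weights
lr = 0.5
for _ in range(2000):
    h = sigmoid(V @ W1)                      # encode [x, y]
    out = sigmoid(h @ W2)                    # reconstruct [x, y]
    d2 = (out - V) * out * (1 - out)         # squared-error output delta
    d1 = (d2 @ W2.T) * h * (1 - h)           # backpropagated hidden delta
    W2 -= lr * h.T @ d2 / len(V)
    W1 -= lr * V.T @ d1 / len(V)

# Prediction: blank out the y slots (0.5 = "unknown") and let the network
# complete them; the reconstructed slots serve as y'.
probe = np.hstack([X, np.full_like(Y, 0.5)])
y_prime = sigmoid(sigmoid(probe @ W1) @ W2)[:, 8:]
print(y_prime.shape)  # (200, 2)
```

Because the network was trained to reconstruct y alongside x, feeding an input with the y portion missing makes it fill that portion in, which is exactly the "missing inputs can still be encoded" behavior the paragraph describes.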