Friday, March 11, 2016

Emergence of Consciousness and Attention.

There are only two kinds of things that we need to pay attention to. The first kind is the thing you want to do but can't. The second kind is the thing you can do but don’t want to. Apart from these, things just pass by like fog and smoke, and we don’t see exactly how and when they were done. They are things that we do every day unconsciously. They are routines and trifles that we’ve done so many thousands of times that some sort of muscle autonomy just takes over and frees us from the tedious labor: eating while watching TV, talking on a cellphone while driving, reading papers in the toilet, or listening to music while doing homework. We seem to avoid paying any attention to those tedious and trivial jobs as much as we can, even when we have nothing better to do. These tasks are duties that we have no choice but to do; otherwise our bodies will stop operating. As long as a task agrees with our nature and nothing goes against it, our bodies just know how and when to do it right by themselves. To put it philosophically, our nature conforms to nature itself. It follows the laws of thermodynamics and tries to maximize entropy if we don’t spend energy to stop it.

Attention is inhibition in the cognitive sense. We have to inhibit our desires when we can’t act on them. We can’t spend freely, wake up late, or smoke as much as we want, even though that’s what life is about. We must inhibit these desires when our knowledge tells us they are detrimental. Therefore, we painfully pay attention to keep our nature from being unleashed. On the other hand, we are forced to do things we hate and never get used to on a daily basis. We could choose the easy way, but we don’t, because our knowledge tells us these things are a necessary evil; otherwise something worse may happen. We have to inhibit our disinclination every second, fighting our natural aversion for the sake of the longer-term good.
Inhibition deforms our mental structure, builds up mental tension, stress, and strain, and transforms entropy into potential energy stored in the mental deformation. Attention is inhibition in the phenomenological sense: it is the heat and light emitted when the potential energy stored in the deformation exceeds the energy threshold of the structure. The deformation becomes a fracture and permanently changes the default mental structure. Mental energy discharge is attention; attention leads to mental suffering, mental suffering leads to consciousness, and consciousness leads to selfness. Selfness is the crystallization of entropy minimization, the maximization of inhibition, and the singularity of the universe.
Inhibition is the essential element of the nervous system. It is impossible for a nervous system to generate meaningful patterns without incorporating inhibitory signals. Without inhibitory neurons, excitatory neurons would fire unbridled, and we would be led to delirious madness and eventual self-destruction. Inhibitory neurons are the miniature self. They define the character of nervous systems. They are the source of animation, and the starting point of the betrayal of God.

Training Deep NN and Theory of Sleep

It is just an idea, and probably somebody has already been working on it. Still, I feel the urge to share how Deep NNs have inspired my thinking on the evolutionary purpose of dreaming, and how a theory of dreaming can help us train Deep NNs better. However, I must disclaim here that the following discourse is purely speculative.
Deep NNs are known to be capable of unsupervised learning, i.e., they can learn the regularity of data without the help of a human expert to label them. The NN can then be trained via ‘backpropagation’, during which the weights of the links and the values (activities) of the nodes are updated by optimizing an ‘objective function’. Typically, cross entropy is used as the objective function. The typical problem one confronts when training the network is that gradient descent on the landscape of the objective function may fall into a local minimum. Several techniques are widely used to help escape from suboptimal solutions, such as annealing, adding a momentum term, or adding a stochastic factor to the training. Here I propose an idea inspired by the damped oscillation pattern of sleep. The damped oscillation of sleep consists of four stages, each of which lasts from 10 to 20 minutes, with slight variance depending on the individual and various physiological factors. The oscillation is illustrated in the following graph.
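As an aside on the escape techniques listed above, here is a minimal sketch of how a momentum term can carry gradient descent out of a local minimum. The one-dimensional objective, the starting point, and the names `loss`, `grad`, and `descend` are all assumptions made up for illustration; a real network has a far more complicated landscape, and momentum does not guarantee an escape in general.

```python
def loss(x):
    # Hypothetical toy objective with a local minimum near x = 1.13
    # and a deeper (global) minimum near x = -1.30.
    return x**4 - 3 * x**2 + x


def grad(x):
    # Derivative of the toy objective above.
    return 4 * x**3 - 6 * x + 1


def descend(x0, lr=0.01, momentum=0.0, steps=500):
    """Gradient descent with an optional momentum (heavy-ball) term."""
    x, velocity = x0, 0.0
    for _ in range(steps):
        # The velocity keeps a decaying memory of the previous updates.
        velocity = momentum * velocity - lr * grad(x)
        x += velocity
    return x


x_plain = descend(2.0)                    # no momentum
x_momentum = descend(2.0, momentum=0.9)   # with momentum
print("plain:   ", round(x_plain, 3), "loss", round(loss(x_plain), 3))
print("momentum:", round(x_momentum, 3), "loss", round(loss(x_momentum), 3))
```

Starting from the same point, the plain run settles in the shallow basin, while the run with momentum rolls over the barrier and ends in the deeper one. With other starting points or step sizes the outcome can differ.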

Each stage is reported to serve a different purpose. The REM -> Stage 4 section can be thought of as backward propagation, while the Stage 4 -> REM section represents forward propagation. The shallow-deep sleep cycle helps strengthen memory and also helps internalize the vast amount of transient episodes into deep, invariant knowledge. I will elaborate in the next paragraph. The picture below shows the famous LeNet proposed by Yann LeCun. The net resembles brain structure not only in terms of topology but also in how information is processed. Each layer ‘pools’ (samples) small patches of its input and forms a feature space that is in turn sampled by the layer above it. You can see the trained middle layers as filters of information. Finally, every single node in the end-layer can distinguish objects exclusively. In a broader sense, a deep NN can condense knowledge from a huge batch of unsorted data, be it images, sounds, or movements. The success of Deep NNs sheds light on the long-coveted universal theory that explains how the brain might learn, predict, interpret, and create things. Still, we are very far from finding such a theory, for the anatomical counterparts and physiological mechanisms that would justify the NN have not yet been established. Moreover, I am saying neither that Deep NNs can explain EVERYTHING about brain mechanisms nor that a universal theory exists.
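To make the ‘pooling’ idea a bit more concrete, here is a small sketch of how a layer can sample small patches of its input so that the layer above sees a condensed feature map. Taking the maximum of each patch is an assumption for illustration only (this post does not say which pooling rule is meant), and the function name `max_pool` is made up here.

```python
def max_pool(image, size=2):
    """Condense a 2-D grid by keeping the largest value of each size x size patch."""
    rows, cols = len(image), len(image[0])
    pooled = []
    for r in range(0, rows - size + 1, size):
        pooled_row = []
        for c in range(0, cols - size + 1, size):
            # Collect the values of one non-overlapping patch.
            patch = [image[r + dr][c + dc]
                     for dr in range(size) for dc in range(size)]
            pooled_row.append(max(patch))
        pooled.append(pooled_row)
    return pooled


layer0 = [[0, 1, 2, 3],
          [4, 5, 6, 7],
          [8, 9, 10, 11],
          [12, 13, 14, 15]]
layer1 = max_pool(layer0)   # [[5, 7], [13, 15]]
layer2 = max_pool(layer1)   # [[15]]
print(layer1, layer2)
```

Each application halves the grid in both directions, which is one way to picture the pyramid shape of the hierarchy: the higher the layer, the fewer units it has.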

In general, it is agreed that sleep is essential for long-term memory consolidation and can enhance cognitive processing. REM sleep, during which we usually dream and can easily be awakened by external disturbance, seems to play a critical role in learning. It is instructive that during dreaming our mental activity is actually quite similar to that of the waking state; the major difference is that the mental process is isolated from body movement. By analyzing the texture of dreams we will find, unsurprisingly in retrospect, that function and form conform to each other. The structure of dreams reflects exactly how we store invariances in the deep, hierarchical structure of filters at different functional levels.
In the first cycle of sleep, we dive from the waking state into deep sleep (stage 1 to stage 4). This process corresponds to the backpropagation phase of LeNet (input layer to end layer), in which each stage of sleep corresponds to the optimization of the weights of one layer of the NN. This doesn’t mean that the physical counterpart of a Deep NN must have four layers, because the definition of the stages is artificial and only serves certain purposes. There are no real, distinct boundaries between them; it is more like a gradual process. During this half-cycle the optimization of the weights cannot be too good, since it can easily fall into a local minimum.
In deep sleep, the end-layer, which interfaces with hormone signals produced by glands[1] and with other subcortical bodies[2], receives instructions and starts to send signals back to the sub-layers. These end-layers represent emotional components; they invert the previous process and start to predict the possible outcomes in the sub-feature space given such emotions. Since the inverse problem does not have a unique solution, we won’t see a playback of our input, but a physically and causally plausible theatre in which an unpredictable, bizarre, but animated drama is on show. When we see such a show during sleep, other parts of our brain are aroused, resulting in an awake-like brain state. The vivid experience is endogenous overall, but its constituents can be novel to the individual perceptrons, as the hierarchical brain structure always branches out going downward, and the genesis of artifacts is nonlinear due to convolution operations or quasi-randomized internal driving sources.

During the first REM stage we are exposed to fabricated scenarios that appear arbitrary and illogical, but are strongly emotional and animated. These scenarios are oftentimes consistent in terms of causality and physics, but their motifs and development are totally unpredictable. Take visual recognition for example. We can generate a face by synthesizing features according to the structure of the NN, but no single line of the synthesized image will have the same shape, shade, and orientation as in the original input. In a sense, the dream provides artificial data that allows us to train the Deep NN again. The subsequent descent through the sleep stages allows optimization for the second time. This helps the NN escape from a local optimum of the objective function.
This cycle of forward modeling and backpropagation is repeated a few times, but not too many, because of 1) the novelty of the data diminishing after each cycle, 2) over-fitting, and 3) the time to wake up. Due to the pyramid-shaped hierarchical structure, deeper layers have progressively fewer perceptrons/filters and are therefore more prone to over-fitting. In terms of learning and memory, this accounts for why the depth of each sleep cycle decreases. The lower the level in the hierarchy, the more perceptrons there are to be trained, and therefore the shallower stages (REM, stage 1, stage 2) occupy the larger proportion of the later part of the whole sleep.
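The shrinking cycles described above can be written down as a toy schedule. This is only a sketch of the speculation in this post, not an established training method: it assumes four layers standing in for the four sleep stages, assumes each cycle reaches exactly one layer less deep than the previous one, and uses made-up names (`sleep_schedule`, `passes_per_layer`). The descent is taken to be the weight-optimization half of a cycle and the ascent the dream-generating half.

```python
def sleep_schedule(n_layers=4, n_cycles=3):
    """List, per cycle, the layers visited on the way down and back up."""
    schedule = []
    for cycle in range(n_cycles):
        # Assumption: every cycle is one layer shallower than the last.
        depth = max(1, n_layers - cycle)
        descent = list(range(1, depth + 1))   # weights optimized layer by layer
        ascent = descent[::-1]                # top-down generation of dream data
        schedule.append({"descent": descent, "ascent": ascent})
    return schedule


def passes_per_layer(schedule, n_layers=4):
    """Count how many optimization passes each layer receives overall."""
    counts = [0] * n_layers
    for cycle in schedule:
        for layer in cycle["descent"]:
            counts[layer - 1] += 1
    return counts


night = sleep_schedule()
for number, cycle in enumerate(night, start=1):
    print("cycle", number, "down", cycle["descent"], "up", cycle["ascent"])
print("passes per layer:", passes_per_layer(night))   # [3, 3, 2, 1]
```

Under these assumptions the shallow layers, which have the most units to train, receive the most passes, and the deepest layer, which has the fewest units and is the most prone to over-fitting, receives the fewest.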
What is the benefit of interfacing the emotional layer with emotion-related hormone sources? It is postulated that the hormones are supervising signals that instruct our brain to learn scenarios that embed negative emotions (stress, anxiety, fear, and anger) so that we can handle social relations better. Learning to endure and even utilize our negative emotions during sleep prepares us better to handle frustration and unpredictable difficulties in the real world, where most of the time we live under a certain pressure to survive. It equips us with an inherent sense of crisis, which helped our ancestors survive disasters that those with different mindsets did not.

[1] Such as the pineal gland and the pituitary gland.
[2] Such as the amygdala (emotional memory), the mammillary body (recollective, or episodic, memory), and the hippocampus (spatial memory, STM, LTM).
[3] Other theories of sleep: activation-synthesis theory; continual-activation theory; reverse learning; dreams as excitations of long-term memory.