This article minimizes the use of mathematical formulas and relies on visualizations to help readers get started. This reduces the need for deep background knowledge and makes it easier to grasp the basics of neural networks. In fact, understanding most machine learning material requires only basic mathematics, along with some analogies and abstractions. With this foundation, anyone eager to explore machine learning can begin right away, rather than giving up on entering the field.
In this article, we will take a closer look at how the network performs and explore its internal state to build an intuitive understanding of how it works. In the second half, we will try training a neural network on a more complex dataset, images of dogs, cars, and ships, to see what improvements are needed to take our model to the next level.
Visualizing the Weights
Let’s train a network that classifies MNIST handwritten digits. Unlike the previous article, we’ll connect the input layer directly to the output layer without using a hidden layer. So our network looks like this:
[Image: Single-layer neural network for MNIST]
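As a rough sketch of what this architecture computes, the forward pass can be written in a few lines of NumPy. The sizes (784 inputs, 10 outputs) come from the text; the random weights and the softmax output are illustrative stand-ins, not trained values:

```python
import numpy as np

def softmax(z):
    # Subtract the max before exponentiating for numerical stability
    e = np.exp(z - z.max())
    return e / e.sum()

# 784 input pixels (28x28 flattened), 10 output classes
rng = np.random.default_rng(0)
W = rng.normal(0, 0.01, size=(10, 784))  # one weight row per output neuron
b = np.zeros(10)

x = rng.random(784)     # a stand-in for one flattened MNIST image
z = W @ x + b           # pre-activations, one per class
probs = softmax(z)

print(probs.shape)      # (10,)
print(probs.sum())      # sums to 1 up to floating-point error
```

Each of the 10 rows of W is what we visualize below: the weights connecting all 784 pixels to a single output neuron.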
When we input an image into the neural network, we visualize the network by “unrolling” the pixels into a list of neurons, as shown on the left side of the figure below. Let’s focus on the connections into the first output neuron, which we’ll call z. We’ll label each input neuron and its corresponding weight xi and wi.
[Image: Visualizing weights in a single-layer network]
Instead of expanding the pixels, we can think of the weights as a 28x28 grid, where the weights are arranged exactly like the corresponding pixels. The right half of the image and the following figures all represent the same equation: z = b + ∑wx.
[Image: Alternative visualization of pixel-weight products]
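To make the equivalence concrete, here is a small NumPy sketch (with made-up weights and pixels) showing that the flat sum z = b + ∑wx and the 28x28 grid view compute exactly the same number:

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(size=784)   # weights of one output neuron, flattened
x = rng.random(784)        # one flattened 28x28 image
b = 0.5

# Flat view: z = b + sum_i w_i * x_i
z_flat = b + np.dot(w, x)

# Grid view: arrange weights and pixels as 28x28 grids and multiply
# elementwise; the sum is identical, only the layout has changed
z_grid = b + (w.reshape(28, 28) * x.reshape(28, 28)).sum()

assert np.isclose(z_flat, z_grid)
```

The reshape changes nothing about the computation; it only lets us draw the weights in the same spatial arrangement as the pixels they multiply.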
Now, let’s take a network with this architecture that has been trained on MNIST and visualize the learned weights feeding into the first output neuron, the one responsible for classifying the digit 0. We’ll color-code them, with black representing the lowest weights and white the highest.
[Image: Visualizing the weights of the MNIST classifier’s 0-neuron]
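One possible way to produce such a picture (the normalization scheme here is an assumption for illustration, not necessarily the one used for the figure) is to rescale a neuron’s 784 weights so the lowest maps to black and the highest to white:

```python
import numpy as np

def weights_to_grayscale(w):
    """Map a length-784 weight vector to a 28x28 grayscale image:
    the lowest weight becomes 0 (black), the highest 255 (white)."""
    grid = w.reshape(28, 28)
    lo, hi = grid.min(), grid.max()
    scaled = (grid - lo) / (hi - lo)     # now in [0, 1]
    return (scaled * 255).astype(np.uint8)

rng = np.random.default_rng(2)
w = rng.normal(size=784)                 # stand-in for learned weights
img = weights_to_grayscale(w)
print(img.min(), img.max())              # 0 255
```

The resulting array can be displayed with any image library; the interesting part is only the relative ordering of the weights, which the rescaling preserves.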
Looking at this image, doesn’t the right side resemble a fuzzy “0”? Thinking about what this neuron is doing helps us understand why this shape appears. This neuron is “responsible” for the digit 0: its job is to output a high value when the input is a 0 and a low value otherwise. It assigns higher weights to pixels that typically have high values in images of 0s and lower weights to pixels that tend to be high in other digits. The dark center of the weight image comes from the fact that pixels there tend to be off in images of 0s (the hole of the 0), while other digits tend to have ink in that region.
Let’s examine the weights learned by all 10 output neurons. As expected, they all resemble slightly blurred versions of the 10 digits, as if they were averaged across many examples in each category.
[Image: Visualizing weights of all output neurons]
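The “averaging” intuition can be sketched directly: given labeled images, the per-class mean image plays the role of the blurred template. The toy data below is random, just to show the computation:

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical toy dataset: 100 "images" of 784 pixels with labels 0-9
images = rng.random((100, 784))
labels = rng.integers(0, 10, size=100)

# For each class, average all of its images; the result plays the role of
# the blurred "template" that the weight visualizations resemble
templates = np.stack([images[labels == c].mean(axis=0) for c in range(10)])

print(templates.shape)   # (10, 784)
```

The learned weights are not literally these averages, but they correlate with them, which is why the weight images look like blurry prototypes of each digit.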
Assuming the input is an image of the number 2, we can expect the neuron responsible for class 2 to have a higher value because its weights are set to emphasize the pixels that are commonly found in 2s. Some of the weights of other neurons may also align with high-value pixels, increasing their scores. However, these alignments are less consistent, and many high-value pixels in those images are offset by low weights in the 2-neuron. Since the activation function is monotonic, higher inputs result in higher outputs.
We can interpret these weights as templates for each classification. This is fascinating because we never told the network what these numbers are or what they mean, yet they end up resembling the objects in those categories. This suggests that neural networks form representations of training data that go beyond simple classification or prediction. When we study convolutional neural networks, we’ll take this representation to a new level, but for now, let’s keep things simple.
This raises more questions than answers. For example, what happens when we add a hidden layer? As we’ll see shortly, the answer relates to what we observed intuitively in the previous section. But before we dive into that, let’s examine the performance of our neural network, especially the types of errors it often makes.
Sometimes our network makes mistakes that are almost understandable. For instance, the first digit below is meant to be a 9, but it’s not very clear; someone could easily mistake it for a 4, just as our network does. Similarly, we can see why the second digit, a 3, is misclassified as an 8. The errors on the third and fourth digits are harder to excuse: almost anyone would immediately recognize them as a 3 and a 2, yet our machine misclassifies the first as a 5 and fails to recognize the second as anything plausible.
[Image: Errors in the single-layer MNIST network]
Let’s take a closer look at the performance of the last neural network from the previous article, which achieved 90% accuracy on the MNIST dataset. One way to analyze this is through a confusion matrix that breaks down our predictions into a table. In the confusion matrix below, the rows represent the actual labels, and the columns represent the predicted labels. For example, the cell in the 4th row (actual 3) and 6th column (predicted 5) shows that 71 instances of 3 were mislabeled as 5. The green diagonal represents correct predictions, while other cells indicate errors.
[Image: Confusion matrix for MNIST]
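A confusion matrix with this layout (rows = actual labels, columns = predicted labels) can be built with a few lines of NumPy; the tiny label lists below are made up for illustration:

```python
import numpy as np

def confusion_matrix(actual, predicted, n_classes=10):
    """Rows are actual labels, columns are predicted labels,
    matching the layout described in the text."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for a, p in zip(actual, predicted):
        cm[a, p] += 1
    return cm

actual    = [3, 3, 5, 1, 3]
predicted = [5, 3, 5, 1, 5]   # two of the 3s are mislabeled as 5
cm = confusion_matrix(actual, predicted)

print(cm[3, 5])                      # 2: actual 3, predicted 5
print(cm.trace(), "/", len(actual))  # correct predictions lie on the diagonal
```

Dividing the diagonal sum by the total count gives the overall accuracy, while the off-diagonal cells show exactly which pairs of classes the model confuses.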
Drawing the top-scoring sample in each cell of the confusion matrix gives us further insight.
[Image: Top of each cell in the confusion matrix]
This gives us a sense of how the network learns to make predictions. Looking at the first two columns, we see that the network seems to look for a large loop to predict 0 and a thin stroke to predict 1, and when other digits share these features, the network can confuse them.
Making Our Neural Network Struggle
So far, we’ve discussed neural networks trained to recognize handwritten digits. We’ve gained many insights, but we chose a very simple dataset that gave us several advantages: only 10 categories, well-defined classes, and relatively little internal variation. In most real-world scenarios, we must classify images under far more challenging conditions. Let’s look at the performance of the same neural network on another dataset, CIFAR-10, a collection of 60,000 32x32 color images in 10 categories: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks. Below are some random samples from CIFAR-10.
[Image: Random samples from CIFAR-10]
It’s clear that we now face a much harder challenge. The variation within each image category is far greater than in handwritten digits. For example, cats can face different directions, have varying colors and fur textures, and stretch out or curl up, variations we never encountered in MNIST. Additionally, cat images may include other objects, further complicating the problem.
Sure enough, if we train a two-layer neural network on these images, our accuracy drops to 37%. While this is still better than random guessing (which would be around 10%), it’s far worse than the 90% achieved by our MNIST classifier. With convolutional neural networks, we can significantly improve accuracy on both MNIST and CIFAR-10. For now, we can review the weights to better understand the limitations of standard neural networks.
Let’s repeat the previous experiment to visualize the weights of a single-layer neural network trained on CIFAR-10 images.
[Image: Visualizing weights of a single-layer CIFAR-10 classifier]
Compared to the MNIST weights, these show fewer recognizable features and much lower apparent resolution. Some details do have intuitive meaning: the outer edges of the airplane and ship weights are mostly blue, reflecting that such images tend to be surrounded by blue sky or water. Since the weight image for a category correlates with the average of the images in that category, we expect blurred, blob-like averages here as well. But because the internal consistency of CIFAR-10 categories is much lower, the “templates” we see are far less distinct than in MNIST.
Let’s look at the confusion matrix associated with this CIFAR-10 classifier.
[Image: Confusion matrix for CIFAR-10]
Unsurprisingly, the performance is poor, with an accuracy of only 37%. Clearly, our simple network can’t handle this complex dataset. We can introduce a hidden layer and see how much that improves performance; the next section analyzes the impact of this change.
Add a Hidden Layer
So far, we've focused on single-layer neural networks, where the input connects directly to the output. How does adding a hidden layer affect our network? Let’s insert a middle layer of 10 neurons into our MNIST network. Our handwritten digit classifier now looks like this:
[Image: Double-layer neural network for MNIST]
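A forward pass through this two-layer network can be sketched as follows. The 784-10-10 sizes match the text; the sigmoid activation and random weights are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(4)
# 784 pixels -> 10 hidden neurons -> 10 output classes
W1 = rng.normal(0, 0.01, size=(10, 784))
b1 = np.zeros(10)
W2 = rng.normal(0, 0.01, size=(10, 10))
b2 = np.zeros(10)

x = rng.random(784)            # one flattened MNIST image
hidden = sigmoid(W1 @ x + b1)  # 10 hidden activations ("filter" responses)
scores = W2 @ hidden + b2      # each class score mixes all 10 hidden values

print(hidden.shape, scores.shape)   # (10,) (10,)
```

Note that every row of W1 now feeds all 10 output neurons through W2, which is exactly why the template interpretation breaks down.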
The simple template interpretation we used for the single-layer network no longer applies here, because the 784 input pixels are no longer directly connected to the output classes. In a sense, we forced our original single-layer network to learn those templates, since each weight connected directly to one category label and thus affected only that category. In the more complex network we're introducing now, each weight in the hidden layer influences all 10 neurons in the output layer. What should we expect these weights to look like?
To understand what's happening, we'll visualize the weights in the first layer as we did before, but we'll also take a closer look at how their activations combine in the second layer to produce the category scores. As mentioned earlier, if the image matches a filter, it will produce high activation in a specific neuron in the first layer. Therefore, the 10 neurons in the hidden layer reflect the presence of these 10 features in the original image. In the output layer, a single neuron corresponding to a category is a weighted combination of the 10 hidden activations. The figure below shows this.
[Image: Visualizing the relationship between layers]
Let’s first look at the first-layer weights, shown at the top of the image above. They look strange and no longer resemble image templates. Some look like pseudo-digits, others like components of digits: half-loops, diagonal strokes, holes, and so on.
The rows of bars below the filter images correspond to our output neurons, one row per image category. The bars represent the weights the output layer assigns to the activations of the 10 filters. For example, the 0 class favors first-layer filters that are high along the outer ring (where a 0 tends to have ink) and disfavors filters that are high in the middle (which usually corresponds to the hole at the center of a 0). The 1 class is nearly the opposite, favoring the middle filter, which you might associate with the vertical stroke of a 1.
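The bars correspond to the terms of a dot product: each class score is the sum, over the 10 filters, of (second-layer weight) times (hidden activation). With made-up numbers:

```python
import numpy as np

rng = np.random.default_rng(5)
hidden = rng.random(10)          # activations of the 10 hidden "filters"
w_class0 = rng.normal(size=10)   # second-layer weights into the class-0 neuron

# Each bar in the figure corresponds to one term of this sum: the
# contribution of a single filter's activation to the class score
contributions = w_class0 * hidden
score = contributions.sum()

assert np.isclose(score, np.dot(w_class0, hidden))
```

Positive bars push the class score up when their filter fires; negative bars pull it down, which is how a class can "dislike" a filter.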
The advantage of this approach is flexibility: for each category, a wider range of input patterns can trigger the corresponding output neuron. Each category can be activated by multiple abstract features from the hidden layer, or by combinations of them. Essentially, the network can learn several different ways that each digit may appear. For most tasks, this usually improves performance (although not always).
Features and Representations
Let’s summarize what we’ve learned in this article. In both single-layer and multi-layer neural networks, each layer performs a similar function: it transforms data from the previous layer into a “higher-level” representation of the data. “Higher-level” means a more compact and salient version of the data, similar to how a summary is a “high-level” representation of a book. For example, in the two-layer network we discussed, the first layer maps the “low-level” pixel data to “higher-level” features like strokes and loops, and the second maps those to the final output (the actual digit). This idea of transforming data into smaller, more meaningful information lies at the heart of machine learning and is a primary function of neural networks.
By adding a hidden layer to the neural network, we give it the opportunity to learn features at multiple levels of abstraction. This leads to a richer representation of the data, where the earlier layers contain lower-level features and the later layers combine these features into higher-level representations.
As we've seen, hidden layers can improve accuracy, but only up to a point. Adding ever more layers eventually stops improving accuracy while increasing computational cost; we can't simply ask the network to memorize every variant of each image category in its hidden layers. It turns out that convolutional neural networks are a more effective approach.