Neural network evaluation and internal state analysis

This article minimizes the use of complex mathematical formulas, focusing instead on visual explanations to help readers get started with neural networks. By reducing the need for advanced mathematics, it makes the topic more accessible and easier to grasp. In fact, most machine learning concepts can be understood with a basic level of math, combined with some analogy and abstraction; without such simplification, many people might give up on the field before they begin. In this article, we'll take a closer look at how neural networks perform by examining their internal states to build intuition about how they work. In the second half, we will train a neural network on a more complex dataset, images of dogs, cars, and ships, to see what improvements are needed to enhance our network's performance.

Visualizing the Weights

Let's start by training a network that classifies MNIST handwritten digits. Unlike previous examples, we'll map the input layer directly to the output layer without adding a hidden layer, so our network looks like this:

[Image: Single-layer neural network for MNIST]

When we input an image into the network, we visualize it by "expanding" the pixels into a list of neurons, as shown on the left side of the figure below. Let's focus on the first output neuron, which we'll call z, and label each input neuron and its corresponding weight xᵢ and wᵢ.

[Image: Visualizing weights for the first output neuron]

Instead of expanding the pixels, we can also arrange the weights in a 28x28 grid, where each weight aligns exactly with its corresponding pixel. The right side of the image and the following figures all express the same equation: z = b + ∑ᵢ wᵢxᵢ.
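As a minimal sketch of this equation (not the article's actual code), the weighted sum over flattened pixels can be written with NumPy. The image, weights, and bias below are random stand-ins, not trained values:

```python
import numpy as np

# A minimal sketch of the single output neuron z = b + sum_i(w_i * x_i).
# The image, weights, and bias are random stand-ins, not trained values.
rng = np.random.default_rng(0)

image = rng.random((28, 28))   # stand-in for a 28x28 MNIST digit
x = image.reshape(-1)          # "expand" the pixels into 784 inputs
w = rng.normal(size=784)       # one weight per input pixel
b = 0.1                        # bias term

z = b + np.dot(w, x)           # the neuron's pre-activation score
```

Whether we draw the weights as a list or as a 28x28 grid, the computation is the same dot product.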
[Image: Alternative visualization of pixel-weight products]

Now let's look at a well-trained network with this architecture and visualize the learned weights of the first output neuron, the one responsible for classifying the digit 0. We color the weights, with black representing the lowest values and white the highest.

[Image: Visualizing the weights of the 0-neuron in MNIST classifier]

Does the right side resemble a fuzzy 0? Thinking about what this neuron does explains why the image has that shape. The neuron is "responsible" for detecting the digit 0: it aims to produce a high value when the input is a 0 and a low value otherwise. It therefore assigns high weights to pixels that are typically bright in images of 0, and low weights to pixels that are usually bright in other digits. The dark center of the weight image arises because pixels in that region tend to be dark in images of 0, while other digits often have bright strokes there.

Looking at the weights learned by all 10 output neurons, we see that they resemble slightly blurred versions of the digits 0 through 9, as if the network had averaged many images belonging to each category.

[Image: Visualizing the weights of all output neurons in MNIST classifier]

If the input is an image of the digit 2, we expect the neuron responsible for class 2 to activate most strongly, because its weights emphasize pixels that are typically bright in images of 2. Some weights of other neurons may also align with these bright pixels, raising their scores, but the overlap is limited, and the bright pixels of the 2 that do not match another neuron's template are counteracted by that neuron's low weights. The activation function doesn't change this ranking, since it is monotonic: higher inputs lead to higher outputs. We can therefore interpret these weights as templates for the output classes.
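The "weights as templates" view can be sketched in a few lines: each class score is the dot product of the input image with that class's weight grid, plus a bias. All values below are random placeholders for trained parameters:

```python
import numpy as np

# Sketch of the "weights as templates" view: each class score is the dot
# product of the input image with that class's 28x28 weight grid, plus a
# bias. All values here are random placeholders for trained parameters.
rng = np.random.default_rng(1)

templates = rng.normal(size=(10, 28, 28))  # one weight grid per digit
biases = rng.normal(size=10)
image = rng.random((28, 28))

scores = templates.reshape(10, -1) @ image.reshape(-1) + biases
predicted = int(np.argmax(scores))         # best-matching template wins
```

Because the activation function is monotonic, taking the argmax of the pre-activation scores gives the same prediction as taking it after the activation.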
It's fascinating that the network never received explicit instructions about what the digits mean, yet its weights end up resembling the actual objects. This suggests that neural networks form representations of the training data that go beyond simple classification or prediction. We'll explore these representations further when we move on to convolutional neural networks, but for now we'll keep things simple.

This raises more questions than answers. For example, what happens when we add a hidden layer? As we'll see, the answer relates to what we observed intuitively above. But before diving deeper, let's examine the performance of our network, especially the types of errors it tends to make.

Sometimes our network makes mistakes that seem almost understandable. For instance, the first digit below appears to be a 9, but it's not very clear; someone could easily mistake it for a 4, just as our network does. Similarly, the second digit, a 3, is misclassified as an 8. The errors in the third and fourth digits are more glaring: almost anyone would recognize them as a 3 and a 2, yet our network misclassifies the first as a 5 and gets the second wrong as well.

[Image: Examples of errors in the single-layer MNIST network]

Let's take a closer look at the performance of the last network discussed in the previous article, which achieved 90% accuracy on MNIST. One way to analyze this is with a confusion matrix, which breaks our predictions down into a table. In the confusion matrix below, rows represent the actual labels and columns represent the predicted labels. For example, the cell in the 4th row (actual label 3) and 6th column (predicted label 5) indicates that 71 instances of 3 were mislabeled as 5. The green diagonal shows correct predictions, while the other cells show errors.

[Image: Confusion matrix for MNIST network]

We gain further insight by replacing the count in each cell with the sample the network classified most confidently for that actual/predicted pair.
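A confusion matrix of this kind takes only a few lines to build. Here is a hedged sketch with toy labels, not real MNIST predictions:

```python
import numpy as np

# Building a 10x10 confusion matrix: rows are actual labels, columns are
# predicted labels, as in the table above. The labels are toy examples.
actual    = np.array([3, 3, 5, 0, 1, 3])
predicted = np.array([3, 5, 5, 0, 1, 5])

matrix = np.zeros((10, 10), dtype=int)
for a, p in zip(actual, predicted):
    matrix[a, p] += 1          # one tally per (actual, predicted) pair

accuracy = np.trace(matrix) / matrix.sum()  # diagonal cells are correct
```

The diagonal sums to the number of correct predictions, so overall accuracy falls out of the same table.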
[Image: Top confidence samples in the confusion matrix]

This gives us a sense of how the network learns to make predictions. Looking at the first two columns, the network seems to look for a large ring to predict 0 and thin lines to predict 1; when other digits share those features, the network may misclassify them.

Breaking Our Neural Network

So far, we've only looked at neural networks trained to recognize handwritten digits. While this has yielded many insights, we used a very simple dataset with clear categories and small internal variation. In real-world scenarios, we often face much more complex image classification tasks. Let's see how the same neural network performs on another dataset, CIFAR-10. CIFAR-10 consists of 60,000 32x32 color images in 10 categories: airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks. Here are some random samples from CIFAR-10.

[Image: Random samples from CIFAR-10]

It's clear that the differences between these image categories are much more complex than in MNIST. For example, cats can face different directions, have various colors and fur textures, stretch or curl up, and so on, features we never encountered in handwritten digits. Cat images may also include other objects, further increasing the complexity.

Sure enough, when we train a two-layer neural network on these images, our accuracy drops to 37%. That's better than random guessing (10%), but far below the 90% achieved on MNIST. With convolutional neural networks, we will be able to improve accuracy significantly on both datasets. For now, we'll continue analyzing the weights to understand the limitations of standard neural networks.

Let's repeat the previous experiment, this time training a single-layer network on CIFAR-10 images. The resulting weights are shown below.

[Image: Visualizing the weights of the single-layer CIFAR-10 classifier]

Compared to the MNIST weights, these show fewer clear features and lower apparent resolution.
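To display a CIFAR-10 class's weights as a color image like the ones above, each class's 32*32*3 = 3072 weights must be reshaped and rescaled into a displayable range. A sketch with a random stand-in weight vector:

```python
import numpy as np

# Sketch of turning one CIFAR-10 class's weight vector (32*32*3 = 3072
# values) back into a color image for display. The weights are a random
# stand-in; trained weights would show the blurry "templates" above.
rng = np.random.default_rng(3)
w = rng.normal(size=32 * 32 * 3)  # weights for one output class

grid = w.reshape(32, 32, 3)       # align each weight with its pixel/channel
lo, hi = grid.min(), grid.max()
display = (grid - lo) / (hi - lo) # rescale to [0, 1] for plotting
```

Rescaling to [0, 1] maps the most negative weight to black and the most positive to white, matching the coloring convention used for the MNIST weight images.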
Some details do have intuitive meaning, such as the blue edges in the airplane and ship templates, reflecting the tendency of these images to be surrounded by sky or water. Since a weight image correlates with an average of the images in its category, we expect to see blotchy average colors, and because the internal consistency of CIFAR-10 classes is much lower, the resulting "templates" are far less distinct than in MNIST.

Let's look at the confusion matrix associated with this CIFAR-10 classifier.

[Image: Confusion matrix for CIFAR-10 classifier]

Unsurprisingly, the performance is poor, with an accuracy of only 37%. Our simple single-layer network clearly struggles with this complex dataset. We'll now introduce a hidden layer and see how much performance improves.

Adding a Hidden Layer

So far, we've focused on single-layer neural networks, where the inputs connect directly to the outputs. How does adding a hidden layer affect our network? Let's insert a middle layer of 10 neurons into our MNIST network. Our handwritten digit classifier now looks like this:

[Image: Double-layer neural network for MNIST]

The simple template interpretation from the single-layer network no longer applies, since the 784 input pixels are not directly connected to the outputs. In a sense, we forced our original single-layer network to learn those templates, because each weight connected directly to a single category label and affected only that category. In the more complex network we're introducing now, each weight in the hidden layer influences all 10 output neurons. What should we expect these weights to look like?

To understand what's happening, we'll visualize the first-layer weights as before, but we'll also examine how their activations combine in the second layer to produce category scores. Recall that an image activates a particular neuron in the first layer when it closely matches that neuron's filter.
The 10 neurons in the hidden layer therefore reflect the presence of these 10 features in the original image. In the output layer, each neuron corresponds to a category and is a weighted combination of the 10 hidden activations. The figure below illustrates this.

[Image: Visualization of hidden layer and output layer connections]

Looking at the first-layer weights at the top of the image, they appear strange and no longer resemble image templates. Some look like pseudo-digits, others like components of digits: half-rings, diagonals, holes, and so on. The rows below the filter images correspond to our output neurons, one row per image category. The bars show the weight each category assigns to the activation of each of the 10 filters. For example, class 0 favors the outer-ring filter, since zeros tend to have that shape, and dislikes the middle filter, which usually corresponds to the hole in the center of a zero. Class 1 is the opposite, favoring the middle filter, which likely represents the vertical stroke of a 1.

The advantage of this approach is flexibility. For each category, a wider range of input patterns can trigger the corresponding output neuron: each category can now be activated by several abstract features from the hidden layer, or combinations of them. Essentially, the network can learn different ways of writing each digit. For most tasks this improves performance, although not always.

Features and Representations

Let's summarize what we've learned. In both single-layer and multi-layer neural networks, each layer performs a similar function: it transforms data from the previous layer into a "higher-level" representation. "Higher-level" means the representation is a more compact and meaningful version of the data, in the same way that a summary is a high-level representation of a book.
For example, in the two-layer network above, we map the “lower” pixel data to “high-level” features like strokes and circles in the first layer, and then map these features to the final output (the actual number). This concept of transforming data into smaller but more meaningful information is central to machine learning and a key function of neural networks. By adding a hidden layer to the network, we allow it to learn features at multiple levels of abstraction. This results in a richer data representation, where earlier layers contain lower-level features and later layers combine these features to form higher-level representations. As we’ve seen, hidden layers can improve accuracy, but only to a certain extent. As more layers are added, accuracy may plateau, and computational costs increase—we can’t simply ask the network to memorize every image category in the hidden layer. Instead, using a convolutional neural network proves to be a more effective approach.
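The two-layer computation described in this article can be sketched as a forward pass: 784 pixels feed 10 hidden feature detectors, whose activations are combined into 10 class scores. The weights below are small random placeholders, not trained filters, and the sigmoid activation is an assumption for illustration:

```python
import numpy as np

# Sketch of the two-layer forward pass described in this article:
# 784 pixels -> 10 hidden feature detectors -> 10 class scores.
# Weights are small random placeholders, not trained filters.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
W1 = rng.normal(size=(10, 784)) * 0.01  # pixel -> feature filters
b1 = np.zeros(10)
W2 = rng.normal(size=(10, 10)) * 0.01   # feature -> class weights
b2 = np.zeros(10)

x = rng.random(784)               # a flattened input image
hidden = sigmoid(W1 @ x + b1)     # how strongly each filter fires
scores = W2 @ hidden + b2         # each class combines the 10 activations
```

Each row of W1 is one of the filters visualized earlier, and each row of W2 holds the bar weights a category assigns to those 10 filter activations.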
