When the Computer Starts Guessing Land Cover: A Practical Look at Unsupervised Classification - Packet Flow: Journey to Network & Cybersecurity Expertise

Using ENVI Classic, Landsat bands, K-Means, and ISODATA to turn a satellite image into a land-cover map

In this lab, we used unsupervised image classification to separate land-cover types around Cañon City. The process started with a false-color composite, moved through spectral signatures, and then compared K-Means and ISODATA classification results. The short version: the computer can group pixels, but it still needs human judgment. Otherwise, you get a very colorful map that looks impressive and may still be lying to your face.

Introduction

Remote sensing sounds very sophisticated until you realize that, at some point, you are staring at a satellite image and asking the computer, “Can you please sort these pixels into groups that make sense?”

That is basically the heart of unsupervised classification.

For this lab, we worked with imagery of the Cañon City area using ENVI Classic. The goal was to examine how different surfaces, such as healthy vegetation, water, bare soil, rock, urban areas, and native vegetation, appear across different spectral bands. From there, we tested unsupervised classification using K-Means and ISODATA.

The work showed an important lesson: classification is not just pressing a button and accepting the colorful output. The software can group pixels based on spectral similarity, but the analyst still has to interpret what those groups actually mean.

*The standard false-color composite displays healthy vegetation in bright red, while urban areas appear more blue-gray. This view provides the first visual clue for separating land-cover types.*

Starting With a False-Color Composite

The first step was creating a standard false-color composite. In this display, near-infrared information is emphasized, which makes healthy vegetation stand out in bright red. Urban areas, bare ground, and other non-vegetated surfaces appear in more muted tones such as gray, blue-gray, white, or brown.

This matters because a normal-looking image does not always reveal the differences between land-cover types clearly. A false-color composite makes vegetation easier to identify because healthy plants strongly reflect near-infrared energy. That is why vegetation pops out in red, while built-up areas and exposed surfaces look very different.

In practical terms, the false-color image becomes the visual starting point. Before running any classification algorithm, it helps the analyst understand what the image is showing.
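For readers who want to see the idea outside ENVI, here is a minimal sketch of a standard false-color composite: near-infrared goes to the red display channel, red to green, and green to blue. The tiny arrays and the `stretch` helper are made up for illustration; they are not the lab data, and the simple min-max stretch is an assumption, not the stretch ENVI applies.

```python
import numpy as np

def stretch(band):
    """Linearly rescale one band to the 0-1 range for display."""
    band = band.astype(float)
    lo, hi = band.min(), band.max()
    if hi == lo:
        return np.zeros_like(band)
    return (band - lo) / (hi - lo)

def false_color(nir, red, green):
    """Map NIR -> red channel, red -> green channel, green -> blue channel."""
    return np.dstack([stretch(nir), stretch(red), stretch(green)])

# Hypothetical 2x2 scene: left column is vegetation, right column is pavement.
nir = np.array([[200, 60], [190, 70]])
red = np.array([[30, 80], [35, 90]])
green = np.array([[50, 85], [55, 95]])

rgb = false_color(nir, red, green)  # shape (rows, cols, 3)
```

In this toy scene the vegetation pixels end up with a strong red channel and weak green and blue channels, while the pavement pixels lean toward the green and blue channels, which is the same contrast the composite showed in the lab.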

Reading the Landscape Through Spectral Signatures

Before jumping into classification, I examined several spectral signatures. A spectral signature shows how a selected surface responds across different wavelengths. Different materials reflect and absorb energy differently, which makes it possible to distinguish one land-cover type from another.
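As a rough sketch of what a spectral signature is in array terms, the example below averages each band over a set of selected pixels. The four-band cube, the band order (blue, green, red, near-infrared), and every value in it are invented for illustration; `mean_signature` is a hypothetical helper, not an ENVI function.

```python
import numpy as np

def mean_signature(cube, mask):
    """Average each band over the pixels selected by a boolean mask.

    cube has shape (bands, rows, cols); mask has shape (rows, cols).
    """
    return cube[:, mask].mean(axis=1)

# Made-up spectra in the assumed band order: blue, green, red, near-infrared.
veg = np.array([40.0, 60.0, 35.0, 200.0])
water = np.array([30.0, 25.0, 15.0, 5.0])

# Hypothetical 2x2 scene: left column vegetation, right column water.
cube = np.zeros((4, 2, 2))
cube[:, :, 0] = veg[:, None]
cube[:, :, 1] = water[:, None]

veg_mask = np.array([[True, False], [True, False]])
water_mask = ~veg_mask

veg_sig = mean_signature(cube, veg_mask)
water_sig = mean_signature(cube, water_mask)
```

Plotting `veg_sig` and `water_sig` against band number would give two curves with very different shapes, which is the kind of separation the signatures in this lab were used to check.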

Healthy Vegetation

Healthy vegetation showed the expected strong response in the near-infrared portion of the spectrum. This is why it appeared bright red in the false-color composite.

Water

Water behaved very differently. It generally had a lower spectral response compared with vegetation and exposed land surfaces. This helps explain why rivers and water features often appear dark or visually distinct in classified imagery.

Barren Areas and Bare Soil/Rock

Barren areas and bare soil or rock showed more moderate spectral patterns. These surfaces can be tricky because they may overlap spectrally with urban materials or dry vegetation. That is one reason classification in complex terrain is not always clean.

*Barren areas display a more moderate spectral response, which can overlap with other dry or exposed land-cover types.*

*Bare soil and rock have a distinct but sometimes confusing spectral pattern, especially in mountainous or rugged terrain.*

Urban Areas

Urban areas appeared more blue-gray in the false-color image. Spectrally, they were less dramatic than healthy vegetation and could resemble other built or exposed surfaces.

*Urban surfaces show a relatively stable spectral pattern and can overlap with bare ground or other non-vegetated surfaces.*

Native Vegetation

Native vegetation had its own spectral behavior, but it was not necessarily as visually obvious as the bright red healthy vegetation areas. This is where classification becomes more interpretive.

Running K-Means: The First Attempt Was Too Strict

After reviewing the image and spectral signatures, I used K-Means classification. K-Means is an unsupervised classification method that groups pixels into a fixed number of classes based on spectral similarity. The analyst chooses the number of classes, and the algorithm tries to organize the image into those groups.
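The grouping step can be sketched in a few lines of NumPy. This is a bare-bones K-Means under stated assumptions, not ENVI's implementation: the starting means are simply spread evenly between the minimum and maximum of each band, the loop runs a fixed number of iterations, and the four sample pixels are invented.

```python
import numpy as np

def nearest_class(pixels, means):
    """Index of the closest class mean for every pixel (Euclidean distance)."""
    dist = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return dist.argmin(axis=1)

def kmeans(pixels, n_classes, n_iter=10):
    """pixels has shape (n_pixels, n_bands). Returns (labels, means)."""
    pixels = np.asarray(pixels, dtype=float)
    lo, hi = pixels.min(axis=0), pixels.max(axis=0)
    # Assumed initialization: means evenly spaced from the band minima to maxima.
    steps = np.linspace(0.0, 1.0, n_classes)[:, None]
    means = lo + steps * (hi - lo)
    for _ in range(n_iter):
        labels = nearest_class(pixels, means)
        for k in range(n_classes):
            members = pixels[labels == k]
            if len(members) > 0:
                means[k] = members.mean(axis=0)  # move mean to its members
    return nearest_class(pixels, means), means

# Hypothetical two-band pixels: two dark ones and two bright ones.
pixels = np.array([[10, 10], [12, 11], [200, 190], [205, 195]])
labels, means = kmeans(pixels, n_classes=2)
```

Note that nothing in the code knows what the two groups are. It only reports that two pixels sit near one mean and two sit near the other, which is exactly why the analyst still has to name the classes.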

The first K-Means run used strict parameters. The maximum standard deviation was set to 5, and the maximum distance error was set to 3. In plain English, the classifier was being picky. Very picky.

The result was incomplete. Many pixels were not assigned to any class and appeared as black, unclassified areas. The software was only willing to classify pixels that closely matched the statistical average of each group. Anything that was slightly different got left behind.

*The initial K-Means classification used strict parameters, including 20 classes, maximum standard deviation of 5, and maximum distance error of 3.*

*The first K-Means result left many pixels black because the parameters were too restrictive.*
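The effect of a strict cutoff can be illustrated with a small sketch. It models only a distance threshold: a pixel is assigned to its nearest class mean unless that mean is farther away than `max_distance`, in which case it is left unclassified. How ENVI combines its standard deviation and distance error settings internally is not reproduced here, and the means, pixels, and the `assign_with_threshold` helper are all hypothetical.

```python
import numpy as np

UNCLASSIFIED = 0  # label reserved for rejected pixels in this sketch

def assign_with_threshold(pixels, means, max_distance):
    """Label each pixel with its nearest class (numbered from 1), or 0 if too far."""
    pixels = np.asarray(pixels, dtype=float)
    means = np.asarray(means, dtype=float)
    dist = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    labels = dist.argmin(axis=1) + 1
    labels[dist.min(axis=1) > max_distance] = UNCLASSIFIED
    return labels

# Made-up class means and pixels in two bands.
means = [[10, 10], [200, 200]]
pixels = [[11, 10], [14, 13], [120, 110], [198, 203]]

strict = assign_with_threshold(pixels, means, max_distance=3)
loose = assign_with_threshold(pixels, means, max_distance=8)
```

With the strict cutoff only the first pixel is close enough to be labeled. The looser cutoff picks up two more, and the mixed pixel in the middle of the two means stays unclassified either way.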

This is a useful reminder: cleaner rules do not always produce better maps. Sometimes strict settings sound good in theory but fail in the real world because landscapes are messy. Mountains, shadows, mixed pixels, roads, vegetation, and exposed rock do not politely arrange themselves for the algorithm.

Loosening the Parameters: A Better Classification

To improve the result, I relaxed the classification parameters. The maximum standard deviation was increased to 10, and the maximum distance error was increased to 8. This made the classifier more inclusive. Instead of rejecting pixels that were not near-perfect matches, the algorithm assigned more pixels to the closest available class.

The result was a major improvement. More of the image was classified, and the map began to look more complete.

*Relaxing the K-Means parameters allowed more pixels to be assigned to classes, improving the classification output.*

*The improved K-Means result filled in much more of the image compared with the earlier attempt.*

However, the statistics still showed that about 33.94% of the image remained unclassified. That meant roughly 66% of the scene had been classified. Better, yes. Done, no.

*The statistics table showed that the classification had improved, but the unclassified percentage was still too high.*

This is the part of remote sensing where the software says, “Technically, I did what you asked,” and the analyst says, “Yes, but not enough to be useful.”

Pushing K-Means Further

The next step was to relax the parameters again. The maximum standard deviation was increased to 20, and the distance error was increased to 15. This widened the statistical net even more.

With these settings, more difficult pixels, including shadows, mixed terrain, and varied rock surfaces, were assigned to one of the eight classes. The output became a more continuous land-cover map.

*Increasing the K-Means tolerances allowed the classifier to include more difficult pixels and produce a more complete thematic map.*

*Display #4 shows that most of the black unclassified areas were filled in after relaxing the parameters.*

The statistics for Display #4 showed that unclassified pixels made up 11.45% of the image. That means approximately 88.55% of the scene was classified. This was close to the 90% target, but still just short.

*The classification reached 88.55% coverage, which was close to the target but still slightly below the 90% classification threshold.*
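The coverage figure is simple arithmetic on the class statistics, sketched below. The 10,000-pixel label array is hypothetical and was built only to reproduce the 11.45% unclassified share reported above; it does not reflect the real pixel count of the scene. Label 0 standing for "unclassified" is also an assumption of this sketch.

```python
import numpy as np

def coverage_percent(labels, unclassified=0):
    """Percentage of pixels that received a class other than `unclassified`."""
    labels = np.asarray(labels)
    return 100.0 * np.count_nonzero(labels != unclassified) / labels.size

# Hypothetical label array: 1,145 of 10,000 pixels left unclassified (11.45%).
labels = np.ones(10_000, dtype=int)
labels[:1145] = 0

coverage = coverage_percent(labels)
meets_target = coverage >= 90.0  # the 90% target used in this lab
```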

Finally, the parameters were relaxed again to a maximum standard deviation of 25 and a distance error of 20. This produced an even more complete classification.

*The final K-Means adjustment used a maximum standard deviation of 25 and a distance error of 20 to reduce the unclassified areas further.*

*The final K-Means output shows a more continuous classification, with the statistical net widened enough to classify more of the scene.*

At this point, the classification became more usable. But there is a tradeoff. Relaxing the parameters fills in more pixels, but it may also force uncertain pixels into classes where they do not perfectly belong. In other words, fewer black holes, but possibly more questionable assignments.

That is the classic remote sensing bargain: completeness versus accuracy.

Comparing K-Means and ISODATA

After working through K-Means, I compared the results with ISODATA. ISODATA stands for Iterative Self-Organizing Data Analysis Technique. It is also an unsupervised classification method, but it is more flexible than K-Means.

K-Means requires the analyst to set a fixed number of classes. The algorithm then forces pixels into those groups based on distance from class means. ISODATA, by contrast, can adjust the number of classes as it iterates. It can split classes that are too diverse and merge classes that are too similar.

That makes ISODATA better suited for landscapes where the surface types are complicated, mixed, or difficult to separate.

*ISODATA allows more flexible classification by adjusting clusters during iteration through splitting and merging.*
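The split and merge idea can be sketched as a single adjustment pass. This is a heavily simplified illustration, not ENVI's ISODATA: a class is split in two along its most variable band when that band's standard deviation exceeds `max_std`, and class means closer together than `min_separation` are merged by plain averaging (real implementations typically weight by pixel count). The function name, thresholds, and sample pixels are all assumptions made for the example.

```python
import numpy as np

def split_and_merge(pixels, labels, max_std, min_separation):
    """One simplified ISODATA-style pass. Returns the adjusted class means."""
    pixels = np.asarray(pixels, dtype=float)
    labels = np.asarray(labels)
    means = []
    for k in np.unique(labels):
        members = pixels[labels == k]
        mean = members.mean(axis=0)
        std = members.std(axis=0)
        band = std.argmax()  # most variable band for this class
        if std[band] > max_std:
            # Split: two new means, one on each side along that band.
            offset = np.zeros_like(mean)
            offset[band] = std[band]
            means.extend([mean - offset, mean + offset])
        else:
            means.append(mean)
    merged = []
    for m in means:
        for i, kept in enumerate(merged):
            if np.linalg.norm(m - kept) < min_separation:
                merged[i] = (kept + m) / 2.0  # merge by simple averaging
                break
        else:
            merged.append(m)
    return np.array(merged)

# Hypothetical two-band pixels: class 0 is too diverse, classes 1 and 2 overlap.
pixels = [[10, 10], [12, 10], [50, 10], [52, 10],
          [100, 100], [102, 100], [104, 100], [106, 100]]
labels = [0, 0, 0, 0, 1, 1, 2, 2]

new_means = split_and_merge(pixels, labels, max_std=5, min_separation=10)
```

In this toy case the class count stays at three, but the classes themselves change: the overly broad class is divided into two, and the two near-duplicate classes collapse into one. A fixed-class K-Means run has no step like this.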

The final comparison showed that the K-Means result looked noisier and more speckled. This makes sense because K-Means forces pixels into a fixed class structure, even when some pixels do not fit neatly.

The ISODATA result appeared cleaner. Because it can split and merge classes, it adapted better to the actual landscape. This helped represent features such as the Arkansas River and nearby urban areas more clearly.

*The K-Means result appears more speckled, while the ISODATA result looks smoother and better organized for the landscape.*

What This Lab Shows

The biggest lesson from this lab is that unsupervised classification is powerful, but it is not magic.

The software can identify patterns in the image, but it does not truly “know” what vegetation, water, urban land, or bare rock are. It only knows that some pixels are mathematically similar. The analyst still has to interpret the output, compare it against the original image, check statistics, and decide whether the result makes sense.

This lab also shows why parameter settings matter. When the K-Means settings were too strict, much of the image remained unclassified. When the settings were relaxed, the classification became more complete. But relaxing the settings too much can also reduce confidence because the software may assign pixels to the nearest class even when the fit is not ideal.

ISODATA helped address some of these limitations by allowing clusters to split and merge during processing. The result was a cleaner-looking classification that better reflected the complexity of the landscape.

Conclusion

Unsupervised classification is a useful way to explore satellite imagery, especially when the analyst does not already have training data. In this exercise, the process began with a false-color composite, moved into spectral signature analysis, and then tested K-Means and ISODATA classification methods.

K-Means showed how sensitive classification can be to parameter settings. Strict settings left too many pixels unclassified. More relaxed settings improved coverage but introduced the usual concern about whether every assigned pixel truly belonged where it was placed.

ISODATA provided a more flexible approach by adjusting clusters during the classification process. For a complex landscape like Cañon City, with urban areas, vegetation, bare soil, rock, shadows, and the Arkansas River, that flexibility produced a cleaner and more realistic result.

The final takeaway is simple: satellite image classification is not just software work. It is interpretation work. The computer can group the pixels. The human still has to decide whether the map tells the truth.