Part of a brief series that started here.
How can we use k-Means to understand and/or manipulate photographic images? As a first example, here’s a classic from the poster-printing world: choosing a very small number of inks to represent a full-tone photo.
In our example, we grab random photos from the web – some work great as posters, some… not so much. But the code will do its best given the narrow constraints: all it knows is grayscale values, and we’ve reduced the calculation to just one dimension: the values along each picture’s grayscale histogram.
As described in previous posts, k-Means is a surprisingly simple method for discovering “clumps” of information from any source – even if the clumps are subtle and hard to spot with the naked eye. Basically you just guess, evaluate your guesses, and keep refining them over and over until it’s Just Right.
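For readers who want to see that guess-evaluate-refine loop concretely, here is a minimal one-dimensional k-Means sketch in Python. This is an illustration, not the post’s actual sample code, and the names (`kmeans_1d`, `iters`) are mine:

```python
# Minimal 1-D k-Means sketch: guess centers, assign each value to its
# nearest center, move each center to its cluster's mean, and repeat.
import random

def kmeans_1d(values, k, iters=20):
    centers = random.sample(values, k)  # initial guesses, picked at random
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            # evaluate: which current guess is closest to this value?
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # refine: move each center to the mean of the values it attracted
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

data = [2, 3, 4, 20, 21, 22, 90, 91]
print(kmeans_1d(data, 3))
```

With a reasonable starting guess the three centers settle into the three obvious clumps; with an unlucky start they can get stuck, which is part of why the sample below keeps re-running and pruning.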
While not mentioned in earlier posts, k-Means is often considered a kind of machine learning: meaning that if you’ve understood a little of the previous posts, you already have an interesting bit of ML under your belt; an ML flavor rather different from what the media typically describes as “AI.”
Most of the big ML successes in recent products like Google’s Pixel Lens or Apple’s Siri come from what’s known as supervised learning: that is, there is a predefined teacher who knows the right answers for some very specific problem, and then supervises the computer until it can solve problems on its own. Computers can learn by repeating lessons over and over again, if the teacher can give them enough quiz questions.
Simple example: If you show a computer 1,000,000 pictures, and every time there are Lego blocks shown in one of those pictures, you label it “Lego,” then after lots of trial and error the computer may learn to label Lego (and non-Lego) pictures all by itself.
The hard part, of course, is that first you need to find a million pictures, carefully and correctly labelled for Legos (and also many that you know are “not Legos,” because the task isn’t “find the Legos in this picture,” it’s “tell me: is there a Lego in this picture?”), and have all those labelled pictures handy somewhere on your hard drive. That’s before you even begin.
ML-savvy readers might point out that if we’re seeking only one category, “Legos,” maybe only several thousand pictures are needed. Fair enough. Do you have them ready? How much time do you think it will take to find even those? Supervised learning always requires a lot of potentially-laborious supervision up front.
A different class of methods – and k-Means is on the fringe of this camp – is called UNsupervised learning, where we don’t need a teacher. Instead, the computer can learn something by itself, based only on the information at hand.
In this case, we want it to figure out what inks to use for printing a poster of a specific picture if our print process only has, say, three or four inks, as in silk-screen or hand-press printing.
The Example Code
As before, you can stop/start the sample by clicking on the sample area.
Also as before, we let the guessed means move around until they’re stable, then randomly take one mean away.
One of the color means doesn’t move: it’s always pinned to K=100%, aka black.
A nice characteristic of this method is that it doesn’t actually do its calculations directly on the pixels: that would be slow. Instead, we just look at the image histogram, which only has 256 levels to compare, rather than the 16384 pixels of even these little sample pictures.
No matter how large your B&W input image might be, if it’s an 8-bit picture it still only has 256 gray levels, and so the ink-finder is just as fast regardless of picture dimensions. The only “slow” parts need to be done just once: the histogram collection, and then the final remapping of output display pixels. The “learning” passes are very fast.
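The histogram trick can be sketched as a weighted k-Means over the 256 bins, with one center pinned at solid black (K=100%) as described above. The details here are assumptions of mine – in particular the helper name `ink_means` and pinning black at gray level 0 – not the post’s own code:

```python
# Weighted 1-D k-Means over a 256-bin histogram: each gray level counts
# as many times as it has pixels, so the cost never depends on image size.
# One center is pinned at gray level 0 (solid black, K=100%).

def ink_means(hist, k, iters=30):
    """hist: list of 256 pixel counts. Returns k sorted ink gray levels."""
    # spread the movable centers evenly to start; index 0 is pinned black
    centers = [0] + [round((i + 1) * 255 / k) for i in range(k - 1)]
    for _ in range(iters):
        sums = [0] * k
        weights = [0] * k
        for level in range(256):
            if hist[level] == 0:
                continue
            # assign this bin, weighted by its pixel count
            nearest = min(range(k), key=lambda i: abs(level - centers[i]))
            sums[nearest] += level * hist[level]
            weights[nearest] += hist[level]
        # recompute the means, but keep the black ink pinned at 0
        centers = [0] + [sums[i] / weights[i] if weights[i] else centers[i]
                         for i in range(1, k)]
    return sorted(round(c) for c in centers)
```

Note that a center which attracts no pixels simply stays put – that is the kind of “idle ink” the pruning step described above exists to remove.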
This example figures out inks for each single, specific picture. The four inks for a light-toned picture may be inappropriate for a darker picture. But the ones k-Means selects for any individual picture will tend to be exactly the ones that give you great separation of the active tones of the shot.
It’s also possible that for very narrow-toned photos, k-Means can even select fewer inks than you might have originally thought you needed. Rare but possible.
To experiment on your own, consider making the criteria for ink-reduction “looser” – currently an ink might persist in the list even if it’s only printed for a single pixel. Is there a better criterion?
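One possible “looser” criterion to experiment with: drop any ink whose cluster covers less than some small fraction of the image’s pixels, rather than keeping every ink that prints even one pixel. This helper is hypothetical – the function name, the `min_share` threshold, and the assumption that black sits at gray level 0 are all mine:

```python
# Hypothetical pruning rule: keep an ink only if its cluster holds at
# least min_share of all pixels. The pinned black ink is always kept.

def prune_inks(centers, hist, min_share=0.02):
    total = sum(hist)
    weights = [0.0] * len(centers)
    for level, count in enumerate(hist):
        if count:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(level - centers[i]))
            weights[nearest] += count
    return [c for c, w in zip(centers, weights)
            if c == 0 or w / total >= min_share]
```

Tuning `min_share` trades faithfulness for economy: higher values give fewer inks but may flatten subtle mid-tones.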