Part of a brief series that started here.
The world is full of patterns: art, physics, weather, music. In some camps, it’s been proposed that human ability to grasp at patterns is the core of conscious experience. We understand our surroundings, and one another, as a complex carpet of overlapping and expected patterns. The patterns might be hard to fathom at first, then… like the sudden twist in a mystery story, Aha! the experience of understanding. It all makes sense now.
With that possibility for epiphany in mind, here’s a paradoxical pattern to consider: when is a point a line, and when is line a point, especially when that line is.. a curve? And how, may I ask, does this relate to beer?
May we present: The Hough Transform…
The story of the Hough Transform, like many great ideas, begins with our last question above: the beer.
In 1952 physicist Donald A. Glaser invented the “bubble chamber,” a device for tracing the motions of charged particles moving through a liquid. Some of the initial experiments used beer inside the chamber. Of course they did, this was at a university. But the only reported results from the high-energy beer physics experiments, alas, were some bad smells (seriously). Switching from beer to liquid hydrogen led to Glaser winning the Nobel Prize only eight years later, in 1960.
The year before that prize, 1959, another researcher, Paul Hough, published “Machine Analysis of Bubble Chamber Pictures,” in the popular Proceedings of the International Conference on High Energy Accelerators and Instrumentation. If you missed that, he also filed a 1962 patent.
The problem Hough was trying to get at was this: how to find straight lines in very low-quality, sketchy, incomplete pictures? Including ones where there’s a lot of extra noise from spurious bubbles, low-quality signals (this is 1959, using a 1950’s-era TV camera), and how to do it quickly (at TV camera speed)?
Breaking down our perceptions into patterns is a big part of what has been termed “artificial intelligence” at least as far back as those days in the 1950’s. Hough, or you or I, could look at a picture with broken lines and instantly understand how the pieces connect into a single continuous stroke. But how might this perception be achieved automatically?
Lines Can Be Points
Hough had the genius idea to think of Lines as Points.
Pictures like those on a TV camera or a computer display are made from points in X and Y dimensions: pixels, arranged across the width and height of the picture. Just like Mr. Evan’s Anaylytic Geometry class back in Junior High. As Mr. Evans probably told you, in algebra any perfect, infinite line on an XY picture can be described by a little equation that you might recall from that classroom: y = m⋅x + b where m is the slope (how steep is the line?), and b is the “y-intercept” where the line crosses the x=0 axis (that is: changing b moves the line up and down on the chart).
Since a perfect infinite line doesn’t care about the specific x or y, we can just express it as m and b — which in turn could just be a single point (m,b) on its own 2D graph, where, say, the horizontal axis is slope m and the vertical one is b. Which is exactly what Hough did.
A problem with this description is that vertical lines, where the slope is infinite, can’t be drawn. This worked out okay for Hough’s specific case. In fact it worked great for him. But it’s kind of bad if we want to use this to look at most everyday pictures, which have lots of vertical lines relating to buildings, trees, and so forth.
A decade later, two researchers at Stanford, Duda & Hart, realized you could just use a different equation for straight lines to fix that problem.
The equation they prefered uses an angle between zero and 180 degrees to define the line’s tilt (called theta or θ),and the radial distance of the line from the (0,0) origin (called rho or ρ). The ρ distance for lines that appear in any real picture can never be farther than the distance between a picture’s opposite corners.
The illustration below shows the relationships. Any line that can appear in the dark image box on the right is also represented by the single highlighted point in the white rectangle of (θ, ρ) space on the left. You should be able to see how changing values of tilt and distance affect where the picture line is drawn.
The specific line equation for theta and rho, which you’re not at all required to recall later, is: x⋅cosθ + y⋅sinθ = ρ
Points and Lines to Curves
Okay, we have some interesting ways to write about straight lines, but how do we go from a picture, made up of many single points, to finding meaningful straight lines that join those points? Especially since we don’t know which points are “good” important points, and which ones are just other random parts of the picture?
First, Hough knew that the lines he cared about were along edges in the picture. In the bubble chamber, a picture was contrasty black-and-white, like in the photo above (but lower quality). So he just watched for sudden changes in brightness along each tiny part of the TV signal, and only paid attention to places where the values jumped between light and dark or back. In our sample code, we use a pretty similar edge finder called the Sobel filter, likewise looks for changes in neighboring pixel brightnesses and mixes edges found in both vertical and horizontal directions (we can talk about the specifics of Sobel in another post).
If a pixel isn’t on an edge, we can ignore it – a big time saver for some pictures.
So now that we’ve picked some points (pixels) as best candidates to be part of a line, let’s consider just one pixel’s worth: one isolated point. There’s an infinite number of possible lines that could pass through that point, with tilts ranging around a full 180 degrees. Lines tilting up, tilting sideways, anywhere between.
If we look at it in our θ/ρ space, we can plot a graph of all the possible values of the tilts and corresponding distances for the possible lines that pass through that point, for every angle from -90 degrees to + 90 degrees. The resulting graph shape will show a smooth, sinuous curve of different possible distances, one for each angle we consider.
Time for another Hough insight.
Curves to Lines
Let’s now consider a second point on the same picture. It too has an infinite number of possible lines in the picture that could cross it, but only one of those lines will cross both the first and second points.
If we add to our graph the same sort of curve described above for that new point, the two curves of possibilities will always cross one another, exactly once, somewhere on our graph of θ/ρ space. And where the curves cross is exactly the point that describes the line between those two original picture points.
Mind blown. (ymmv)
Curves to Lines: Taking It to the Voters
You might be asking: Couldn’t we have found a line between two points in a much, much, easier way?
Sure, for just two points. But what about for lots of points?
Let’s add a third point. It might not be along the same connecting line as those first two, but if it is, we have that much more reason to think that our line is a good one. Its curve will intersect in just the same spot. The more points that line up, the stronger our conviction that this particular combination of theta and rho is one we like. We can keep doing this for every candidate point.
So let’s let every pixel vote!
To allow voting, we can just accumulate all the curves on one gray-scale graph of pixels, adding up the intersections for all possible tilt angles and for all the edge points in the picture. The brightest spot(s) in the graph will correspond to the strongest lines in the original picture.
In our sample, we’ve slowed the process down a bit for the benefit of human viewing – in practice it’s very quick, and even quicker if you can exploit a GPU (as is done, say, in the cameras of a self-driving car, but not done here).
There you have it! Hough’s brilliant idea.
- Lines can be points
- An infinite collection of possible lines can be a line (of lines)
- …or a curve (of lines)
- When drawn in a collection of pixel points, curves made from points can vote for the best lines that might connect those points
Since 1959 Hough’s gymnastic transform been extended for many other uses and different sorts of shapes. His paper is one of the most-cited in all of computer vision.
I have no idea if it’s been used to trace actual beer bubbles. Maybe someone should try, the effort could at least be entertaining while the test supplies last.
The Example Code
As usual, all code here is contained within the HTML — no calling-out to external resources like d3 etc. Selecting “view source” or looking at this github page will reveal all. The single
<script> tag contains the main demo object as
You may have noticed we skipped past the details of the Sobel filter edge extraction. Extracting edges can be straightforward using image convolutions. Convolutions are cool but not the direct topic of this post, so I’ve borrowed and tweaked the convolution code here from “html5rocks.com.” To make that clearer all such code is organized in a sub-object
A nice attribute of using a raster image to collect the possibility curves is that it lets us precalculate all the potentially-slow sin and cos functions exactly once as the page loads. A big time saver.
This code loads random images from the internet. Every now and then, the image doesn’t actually exist, and the code will halt. Just press the button near the top figure to skip forward to the next picture, and things will carry on again.