Point Inside a Polygon in Ruby
Recently my team needed to find out if a point was inside a given polygon and, as usual, we found some code on the internet that did what we wanted. And, as usual, there wasn't much explanation as to what it was doing. Here's the code (after we converted it to Ruby):
So, you know, it works and I'm grateful to somebody for posting it but we had no idea how it works. So we wrapped 2 tests around it (one where the point is not in the polygon and another where it is). But that one method had a Flog score of 80 -- it was killing me. Also 2 tests where nowhere near exercising all the possible paths through this code and since it was completely un-obvious as to what was going on this was just a bug waiting to happen.
I decided to refactor this method so that it made a wee bit more sense. I did a little bit of reading on the internet about the ray casting algorithm, got out a pen and paper, and figured out what the hell was going on. Before I go on to explain how I changed the code, I should probably explain that we are using GeoRuby to give us a bunch of geometric classes and we added the contains_point? method to GeoRuby::SimpleFeatures::LinearRing. Inside a LinearRing instance self[0] is the first point in the polygon, self[1] is the second, and so on. Also, point.x gives you the x value of the point. Oh, and self.size returns the number of points in the polygon
Here's the refactored code:
And here's what I think this algorithm is doing: If you take any point and draw a ray from it, in any direction, and that ray crosses through the sides of a polygon 0 or and even number of times then the point is outside the polygon. If the ray crosses the sides of a polygon an odd number of times then it is inside said polygon. If you don't believe me get out a pen and paper and draw it or look at this wikipedia article.
So the contains_point? method starts out by assuming the point is outside the polygon since its ray has yet to cross any of the borders. Then it needs to look at each line segment of the polygon (otherwise known as a border or edge) so, if you look at the i and j iterators, they are working their way around the polygon and are used to define a point on the polygon and the point immediately preceding it. Now the algorithm looks to see if the point in question is between the max and min y values for a given line segment using the appropriately named point_is_between_the_ys_of_the_line_segment? method. If that's true then a horizontal ray could travel through the current line segment. But it might not, depending on which way it is traveling. So the next method 'ray_crosses_through_line_segment?,' checks to make sure. Now if the ray has crossed through the line segment, or edge, of the polygon you set contains_point to the opposite of whatever it was. Why? Well if it had crossed 0 times contains_point would be false and one more crossing would bring it to a total of 1 (an odd number) and so contains_point should now be true. Every time you find another crossing you flip from odd to even or vice-versa and so you must flip the variable also.
Not too hard, huh? But if there are many sides to your polygon (and we sometimes have lots of sides) then this can be quite a slow operation to do en masse (and we are planning to do it very en masse) so while doing some research for the refactor I found an optimization:
A bounding box is the smallest possible box that can completely contain a polygon. If you draw a bounding box around the polygon, check to see if the point is outside the bounding box, and it is then you're done. You fail fast because if the point is outside the bounding box there is no way it is inside the polygon. GeoRuby gives you a bounding_box method for free (but it's not too hard to find the max and min x's and y's of a polygon) so I was able to create the above method and insert it into the original pretty easily:
Now contains_point? will give up early if there's no chance.
1 | def contains_point?(point) |
So, you know, it works and I'm grateful to somebody for posting it but we had no idea how it works. So we wrapped 2 tests around it (one where the point is not in the polygon and another where it is). But that one method had a Flog score of 80 -- it was killing me. Also 2 tests where nowhere near exercising all the possible paths through this code and since it was completely un-obvious as to what was going on this was just a bug waiting to happen.
I decided to refactor this method so that it made a wee bit more sense. I did a little bit of reading on the internet about the ray casting algorithm, got out a pen and paper, and figured out what the hell was going on. Before I go on to explain how I changed the code, I should probably explain that we are using GeoRuby to give us a bunch of geometric classes and we added the contains_point? method to GeoRuby::SimpleFeatures::LinearRing. Inside a LinearRing instance self[0] is the first point in the polygon, self[1] is the second, and so on. Also, point.x gives you the x value of the point. Oh, and self.size returns the number of points in the polygon
Here's the refactored code:
1 | def contains_point?(point) |
And here's what I think this algorithm is doing: If you take any point and draw a ray from it, in any direction, and that ray crosses through the sides of a polygon 0 or and even number of times then the point is outside the polygon. If the ray crosses the sides of a polygon an odd number of times then it is inside said polygon. If you don't believe me get out a pen and paper and draw it or look at this wikipedia article.
So the contains_point? method starts out by assuming the point is outside the polygon since its ray has yet to cross any of the borders. Then it needs to look at each line segment of the polygon (otherwise known as a border or edge) so, if you look at the i and j iterators, they are working their way around the polygon and are used to define a point on the polygon and the point immediately preceding it. Now the algorithm looks to see if the point in question is between the max and min y values for a given line segment using the appropriately named point_is_between_the_ys_of_the_line_segment? method. If that's true then a horizontal ray could travel through the current line segment. But it might not, depending on which way it is traveling. So the next method 'ray_crosses_through_line_segment?,' checks to make sure. Now if the ray has crossed through the line segment, or edge, of the polygon you set contains_point to the opposite of whatever it was. Why? Well if it had crossed 0 times contains_point would be false and one more crossing would bring it to a total of 1 (an odd number) and so contains_point should now be true. Every time you find another crossing you flip from odd to even or vice-versa and so you must flip the variable also.
Not too hard, huh? But if there are many sides to your polygon (and we sometimes have lots of sides) then this can be quite a slow operation to do en masse (and we are planning to do it very en masse) so while doing some research for the refactor I found an optimization:
1 | def outside_bounding_box?(point) |
A bounding box is the smallest possible box that can completely contain a polygon. If you draw a bounding box around the polygon, check to see if the point is outside the bounding box, and it is then you're done. You fail fast because if the point is outside the bounding box there is no way it is inside the polygon. GeoRuby gives you a bounding_box method for free (but it's not too hard to find the max and min x's and y's of a polygon) so I was able to create the above method and insert it into the original pretty easily:
1 | def contains_point?(point) |
Now contains_point? will give up early if there's no chance.
Comments
97.3: flog total
19.5: flog/method average
30.8: GeoRuby::SimpleFeatures::LinearRing#outside_bounding_box?
27.6: GeoRuby::SimpleFeatures::LinearRing#ray_crosses_through_line_segment?
19.0: GeoRuby::SimpleFeatures::LinearRing#contains_point?
18.8: GeoRuby::SimpleFeatures::LinearRing#point_is_between_the_ys_of_the_line_segment?
1.1: GeoRuby::SimpleFeatures::LinearRing#none
Consider creating a C extension for this. The ability to do pointer math makes this a very optimizable problem in C.
Toby
If it didn't cut the curve at all, the point in question is definitely outside. If it cuts it even number of times, it passed through a pair of (outside, inside), (inside, outside) transition even number of times, and therefore should be outside. If it cuts the curve odd number of times, e.g. once, the line went from outside to inside, and therefore the point in question is inside.
Took your contains_point?, called it include?, added it to a Polygon class (which is an array of Points) and now I can ask if polygon.include?(point), nice.
Sorry to revive an old post, but you have a typo in your code. In the original code, you should drop the "return c" another line. Now, in your case, it never iterates the while loop, it will just do it once.
the ending should be
j = 1
end
end
return c
end
Thank you,
Michael