I often work with geospatial data, primarily points on the Earth’s surface represented by longitude-latitude pairs. Typically, a table will have columns for longitude and latitude stored as one of the number data types, usually decimal or numeric.
Occasionally, though, I see longitude and latitude values stored in columns defined with the float data type or even the real data type. Storing longitude and latitude values as either of these approximate-number data types sets a trap for the unwary when using the SQL Server geography methods, which I’ll illustrate with an example. (The demonstration code is available at the link at the end of this post.)
First, let’s declare some variables to hold longitude and latitude values for two points: (-118.123499999 35.123450000) and (-118.123000000 35.123400000) as float and instances of the SQL Server geography data type representing those points:
We will create geography instances from longitude-latitude pairs with the STGeomFromText() method, which takes a well-known text (WKT) definition and an SRID as its arguments. We will construct the WKT by converting the longitude and latitude values into varchar() values and concatenating them with string literals:
With the geography instances created, we will apply the STDistance() function to calculate the distance between our two points:
Wait – what? Our longitude-latitude pairs are clearly different, so there should be at least some distance between those points. Why did the STDistance() function return a zero?
We have fallen directly into the trap I alluded to at the beginning of this post. The problem arises when CONVERTing or CASTing float values to character data types (the examples shown here all use CONVERT, but the demonstration code includes examples using CAST to the same effect). By default, the output of such a conversion will be limited to six digits. We can see this by SELECTING the concatenated strings we fed into the STDistance() function:
Note carefully that the float values were rounded to six digits when converted to varchar(). Since this rounding resulted in identical WKT for both points, we only calculated the distance between a point and itself, which, by definition, is zero.
Wait a minute, you say, shouldn’t 35.123450000 have rounded up to 35.1235? Well, if we were applying the ROUND() function to round this value to four decimal places, it will indeed round up to 35.1235. In the course of the conversion from float to varchar(), however, SQL Server rounds 35.123450000 down to return ‘35.1234’ but would round 35.123450001 up to return ‘35.1235’:
The real data type causes an even bigger problem:
I call this a bigger problem because 11.094 meters is wrong – plausible, but wrong (trust me on this – I’ll show you the correct result in a minute). This wrong but plausible result may make it much harder to spot the issue.
This apparently happens because the real data type seems to follow different rounding rules when converting to varchar(), yielding different WKT for each point:
So what to do about this problem? Ideally, we would change the data type of the longitude and latitude columns to a fixed numeric type like decimal or numeric (which don’t suffer from the conversion issue, as shown in the demonstration code), but we sometimes lack the ability to make such fundamental changes to the database. We could also convert the float values to a fixed numeric type before converting them to varchar() values, but that requires two conversions.
The STR() function provides the solution we need. STR() allows the user to set the target precision and scale when converting a float value into a varchar() value. We can use the STR() function to control the conversion of our longitude and latitude values so that we do not lose precision, whether converting from float or real (the demonstration code includes an example with the real data type, too):
As we can see, this technique yields the correct distance between our two points, approximately 45.9 meters, because the STR() function converted the float values to varchar() values with no loss of precision. Real values will be implicitly converted to float values when applying the STR() function, resulting in some loss of accuracy in the final result, as shown in the demonstration code.
We’ve seen how longitudes and latitudes stored as float or real values can trip up unsuspecting developers. If that foible alone is not enough to convince you that longitudes and latitudes should not be stored as approximate numeric types, my next post will show that these data types provide no appreciable advantage over fixed numeric types for this purpose.
References and further reading: