My last post described a trap for the unwary when using latitude and longitude values stored as the float data type in SQL Server, which left me wondering why anyone would use the float data type for this purpose. I questioned whether the float data type offers any advantages over the exact numeric types (decimal and numeric) for storing latitude and longitude values. I could think of only two factors that would affect the choice of data type here: (1) accuracy and precision and (2) storage requirements. After some research and testing, I concluded that an exact numeric data type, decimal(9,6), will be a better choice for storing latitude and longitude values in most cases because it provides a reasonable degree of accuracy and precision and requires less storage space than the float data type.
Accuracy and precision of latitude and longitude values
Accuracy and precision, as I use those terms here, define how well a measured value represents the actual or true value. Accuracy means the degree of closeness of a measurement to the actual or true value. Precision refers to the number of significant digits used to represent a measured value. The precision of a measured value limits the degree of accuracy that can be ascribed to a measurement, and presenting a measured value with greater precision than warranted can lead to overconfidence in the accuracy of the measurement. For latitude and longitude values (essentially measured values) to appropriately represent the actual values, then, the data type used to store them must accommodate a reasonable degree of accuracy and precision.
The accuracy of latitude and longitude values depends on the accuracy of the system used to determine them. The well-known Global Positioning System (GPS) uses the transit time of radio signals transmitted by orbiting satellites to determine latitude and longitude. In an evaluation of the accuracy of the GPS system for the quarter ending June 30, 2013, the U.S. Federal Aviation Administration determined that GPS coordinates were accurate to about 10 meters or less 99.99 percent of the time (PDF). GPS augmentation systems can boost this accuracy to a few centimeters.
We can use this accuracy figure to determine the precision with which we can represent latitude and longitude values determined using GPS. Generally, the number of significant digits used to represent a measurement should be determined by the limit of accuracy of the measuring device. At the geographic center of the 48 contiguous United States, 0.000001 (one one-millionth) degree of latitude represents about 11 centimeters and 0.000001 degree of longitude represents about 8.5 centimeters (online calculator for distance per degree) , or roughly the accuracy of GPS with augmentation. Accordingly, six decimal places would be appropriate for representing GPS measurements of latitude and longitude.
We now know that we need a data type that can represent values between -180 and 180 (the range of longitude values; latitude values range from -90 to 90 ) to six decimal places. We have two choices of data type to accommodate this range: decimal(9,6), which can store up to nine digits total and up to six digits to the right of the decimal point, and float(53), which can store up to 15 significant digits (float(24) will only store seven significant digits, so it’s not a candidate). As we’ll see in the next section, though, decimal(9,6) has another big advantage that makes it the clear choice.
All other things being equal, storing the same data in less storage space is a good thing. Storing more bytes than necessary causes needless I/O (PDF), and consumes disk space that may be relatively inexpensive but is definitely not free. If we can accomplish our purpose with fewer bytes, generally we should.
That’s why decimal(9,6) wins out over float(53) for storing latitude and longitude values. A float(53) value requires eight bytes of storage space while a decimal(9,6) value only requires five bytes. While saving three bytes of storage per row (or more likely, six bytes per row, since latitude and longitude values usually come in pairs) may not seem like much, using those extra bytes to store latitude and longitude values as float(53) buys us almost nothing and costs us at least something compared to storing those values as decimal(9,6).
A quick example
Just to verify that decimal(9,6) will suffice to store latitude and longitude values for almost all purposes, I put together a quick example: calculating the distance from John F. Kennedy International Airport in New York to Los Angeles International Airport to the centimeter (download example code). First, I calculated the distance using the float data type with nine decimal places (I obtained the latitude and longitude coordinates of each airport to six decimal places and added three pseudorandom digits to each):
I then calculated the distance using the decimal(9,6) data type:
The calculated distance varies by only 27 cm between the float data type calculation with nine decimal places and the decimal(9,6) calculation. That’s a difference of only 0.00000678%. I suspect that there are very few measurement systems in the world that are precise enough at distances of almost 4,000 km that this difference could be considered significant, so the more storage-efficient decimal(9,6) data type will be preferable to the float(53) data type for storing latitude and longitude coordinates in most cases.