Storing postal addresses along with a shop order, or a customer, is a pretty common place thing and often a really important piece of data. So it’s a bit weird that people seem to put so little thought into it, and the same amount of time devoted to it.
The standard way of doing it for me has been storing the lines of the address along side the item it’s associated with. There’d be two tables like this:
customer (customer_id, name, address1, address2, town, county, postcode, ..)
order (order_id, customer_id, address1, address2, town, county, postcode, ..)
Although you could get the order’s address via the customer_id, I wouldn’t consider this to be redundant data. A customer might change their address later on, but we would want the old address to stay with the order.
It is redundant in the way that the data is still duplicated, when it could be stored somewhere else and then referred to with a foreign key. So the above pseudo-table type could be changed like this:
customer (customer_id, name, address_id, ..)
order (order_id, customer_id, address_id, ..)
address (address_id, address1, address2, town, county, postcode)
Now each property in the country will have its own row, its own ID.
This means you need to change the flow of your application though. Whereas before you accessors and setters could just change the address attributes easily, you now you need to think about if you’re editing, or adding a new address. Though you can handle this by making an Address object (which you could have done before, but would have felt more hacked together).
Having an address table like this means you can hold more data about the property too; the longitude and latitude data, or image or the property. You’d have no place to put this data before, and in the case of long/lat you’d have to have looked it up each time from an external source.
The table still isn’t optimised though – you’ll notice that you’re recording “Birmingham” thousands of times, using around 10 bytes each time. Not awful, but if you’re looking to safe space then turning 20 megabytes of “Birmingham” into 10bytes might look attractive. To overcome this you could have the following:
address (address_id, line1, line2, lin3, a_town_id, a_county_id)
address_town (a_town_id, name)
address_count (a_county_id, name);
If you’re that into saving storage data though, you might consider putting the tables into their own database, and using that database across all of your websites and services. No use in collecting it hundreds of times.
Another problem which crops up is that users never enter helpful data. Instead of gathering the data from users though, use a service like PostCoder. Only ask the user for their postcode and let them select an address, rather than entering it. That way you get a somewhat standardised address format which every user will be using.
The problem with PostCoder is that they don’t have incredibly helpful data sometimes. A search on my home road returns “Janes, Hair Salon” over the first two address lines. Sometimes the business name is apart of the returned address, and other times not. I’m really not sure where to get reliable data.