Data Quality Example: Where two streets have one name
Two examples of address data problems: English and Afrikaans street names such as Church Street vs Kerkstraat, and the inconsistent use of abbreviations such as 4th Avenue vs Fourth Avenue.
A related problem is when two streets have one name. It is of course quite possible for two streets to have the same name. Nelson Mandela Drive, for example, has become a very common name for major roads across the country. However, in reality these streets are typically far enough apart that other address elements can differentiate them.
When we factor in human error, however, address data becomes a lot more complex, and simplistic matching strategies fail more often than is acceptable.
One example: people remember street names but may not remember the street type. When asked for the address of a place they are not really familiar with, they may say “Osbourne Road” when in fact they mean “Osbourne Street”. When this kind of error is captured into corporate data it can cause real challenges.
Another common example is a “corner of” type address: “Our offices are on the corner of 1st and Main Street.” Is that the corner of “1st Avenue”, “1st Street” or “1st Crescent”? In geographies where each of these options may exist, we have now created one name for multiple streets.
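The ambiguity can be made concrete with a small lookup. The following is a minimal sketch, not a production approach: the street register, the list of street types and the function names are all hypothetical, invented for illustration.

```python
# Hypothetical street register for a single suburb (illustrative data only).
STREET_REGISTER = [
    "1st Avenue", "1st Street", "1st Crescent",
    "Main Street", "Osbourne Street",
]

# Assumed list of street types; a real list would be much longer.
STREET_TYPES = {"avenue", "street", "crescent", "road", "drive"}


def base_name(street):
    """Lower-case the street and drop a trailing street type, if present."""
    words = street.lower().split()
    if len(words) > 1 and words[-1] in STREET_TYPES:
        words = words[:-1]
    return " ".join(words)


def candidates(captured, register=STREET_REGISTER):
    """Return every registered street that shares the captured base name."""
    key = base_name(captured)
    return [street for street in register if base_name(street) == key]


# "1st" on its own maps to three different streets: one name, many streets.
print(candidates("1st"))
# A wrong street type can still be resolved when the base name is unique.
print(candidates("Osbourne Road"))
```

In this sketch, a captured value that returns more than one candidate would be flagged for review rather than matched automatically, while a single candidate could be accepted even if the street type was captured incorrectly.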
As mentioned previously, if simplistic, statistical matches are now applied to these data sets, we may have a scenario where “1st Street” is a better match to “1st Crescent” than “Church Street” is to “Kerkstraat”. Similar examples could easily be found for Name and other data elements.
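This effect is easy to reproduce. The sketch below uses Python's standard-library difflib as a stand-in for a simplistic character-based matcher; it is an assumption for illustration, not the scoring method of any particular matching product.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity between 0 and 1, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Two different streets that happen to share a name.
different_streets = similarity("1st Street", "1st Crescent")

# The same street, captured in English and in Afrikaans.
same_street = similarity("Church Street", "Kerkstraat")

print(f"1st Street vs 1st Crescent:  {different_streets:.2f}")
print(f"Church Street vs Kerkstraat: {same_street:.2f}")
```

The pair of different streets scores higher than the pair that refers to the same street, so any single similarity threshold would either link “1st Street” to “1st Crescent” or miss the link between “Church Street” and “Kerkstraat”.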
An automated match solution is a critical component of any master data management technology stack – but if inappropriately applied can cause more problems than it is worth. Some solutions simply do not offer the granularity to deal with these kinds of complexities – meaning that your operational staff will be overwhelmed by exceptions that must be manually verified.