Who owns geography reference data at Facebook or master data at LinkedIn? The first questions that need an answer when introducing data governance are typically: what data do we have, where is it, and who owns it in terms of risk and compliance as well as cost and value? A data owner is typically one person who is accountable for putting everything in place to have acceptable data in terms of definition, quality and protection. In this blog I’ll give two real-life use case examples.
“I see dead people” at LinkedIn!
Last week I received a note from LinkedIn inviting me to like or comment on the work anniversary of a former BI colleague. I had to swallow a couple of times, as this person had died about a year ago from a sudden heart attack.
The lifecycle of master user data at social media platforms is typically owned by the users themselves. But as companies cannot manage the outside world, they typically assign ownership at the point where all master user data comes together: a central database. To keep it up to date, they need to introduce controls and procedures at the entry points so that data is entered correctly. In the case of a sudden death, a person is no longer capable of updating their own master user data and depends on other people to do it. I looked it up and did indeed find a submission form to log this information in LinkedIn. I could enter most of the data, but I really couldn’t declare a reference to a cause of death. It also felt a bit creepy filling in the form.
Who owns Referential Geography Data at Facebook?
It seems it also took Facebook several weeks to correct a data quality issue in its referential geography data. A couple of weeks ago, the residents of the sleepy island of Gotland, in the Baltic Sea off Sweden, started to notice that many of their Facebook entries listed them as living in Norway, not Sweden. For a link to this article please click here.
For referential data, however, such as hierarchies of geographical data (country codes – zip codes – cities), ownership should not reside with users, because this data is a given. Geographies are what they are. Even though most of them barely ever change, there are small movements, such as conflicts that redraw some country boundaries here and there.
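To make the idea of centrally owned reference data a bit more concrete, here is a minimal sketch of a control that checks user profiles against a geography reference table. Everything in it is an assumption for illustration: the table `REGION_TO_COUNTRY`, the profile fields and the function name are hypothetical and say nothing about how Facebook actually stores or validates locations.

```python
# Hypothetical reference table, owned by a central data steward, not by users.
# Keys are regions, values are ISO 3166-1 alpha-2 country codes.
REGION_TO_COUNTRY = {
    "Gotland": "SE",
    "Skåne": "SE",
    "Rogaland": "NO",
}


def find_country_mismatches(profiles):
    """Return the profiles whose country contradicts the reference table."""
    mismatches = []
    for profile in profiles:
        expected = REGION_TO_COUNTRY.get(profile["region"])
        # Regions missing from the reference table are skipped, not flagged.
        if expected is not None and profile["country"] != expected:
            mismatches.append({**profile, "expected_country": expected})
    return mismatches


# Made-up sample profiles, purely for illustration.
profiles = [
    {"user": "anna", "region": "Gotland", "country": "NO"},
    {"user": "erik", "region": "Gotland", "country": "SE"},
    {"user": "ola", "region": "Rogaland", "country": "NO"},
]

for mismatch in find_country_mismatches(profiles):
    print(mismatch["user"], "should be in", mismatch["expected_country"])
```

The point of the sketch is the ownership model rather than the code: the reference table has a single accountable owner, and user-entered data is checked against it instead of the other way around.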
The social community applying the four-eyes principle to my data quality?
Social and big media data is updated so frequently that it is a lot harder to put data governance controls on it. Besides managing the volume, variety and velocity of data, veracity seems to remain the hardest thing to get under control. Looking at the above examples from Facebook and LinkedIn over the last few days, what’s interesting to see is that the mandate on what data quality is acceptable seems to rest entirely with the user community. As such, I wonder how many declarations it would take to pronounce someone dead. How many votes or remarks does it take to claim a city? What percentage of dead people’s data is actually being monetized for advertising? It’s clear that even social media companies have pressing data governance concerns to tackle. May I suggest focusing on Veracity first ;-)!