ReadWriteWeb
Is Google's Social Graph API a Creeping Privacy Violation?
I love me some screenscraping and mashups and data portability, but when it comes to personal information things get a little more complicated.
I'm in San Francisco today at Dappercamp, an event about a tool that always has to keep in mind the rights of those it interfaces with. Keynote speaker Mitch Kapor just told the group that the foundations of the web are sharing and openness, and that intellectual property rights online should be constructed around, and respect, those qualities.
It was a refreshing way to frame the often contentious relationship between corporate content publishers and those of us on the margins seeking to mash things up, but similar issues are beginning to arise in terms of personal and interpersonal information about users.
What Google is Doing
The new Google Social Graph API, released last week, was a big move in this direction. The Social Graph API lets developers draw connections between your friends on one service and your friends on another. It indexes XFN (XHTML Friends Network) and FOAF (Friend of a Friend) data, standard markup formats that publishers like Twitter or Facebook can attach to your friend relationships inside their services.
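To make the markup concrete: XFN expresses a relationship as values in the `rel` attribute of an ordinary link, such as `rel="friend met"`, with `rel="me"` marking another page as belonging to the same person. What follows is a minimal sketch of how any crawler could read those declarations, not a description of how Google does it. The sample page and its URLs are made up for illustration, and the function and class names are my own.

```python
from html.parser import HTMLParser

# Relationship values defined by the XFN 1.1 profile.
XFN_VALUES = {
    "contact", "acquaintance", "friend", "met",
    "co-worker", "colleague", "co-resident", "neighbor",
    "child", "parent", "sibling", "spouse", "kin",
    "muse", "crush", "date", "sweetheart", "me",
}


class XFNExtractor(HTMLParser):
    """Collects (href, [xfn values]) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        rel = attr.get("rel") or ""
        # Keep only rel tokens that are XFN values (this drops e.g. "nofollow").
        values = sorted(set(rel.lower().split()) & XFN_VALUES)
        if href and values:
            self.links.append((href, values))


def extract_xfn(html):
    """Return the XFN-annotated links found in an HTML string."""
    parser = XFNExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical profile page; the URLs are placeholders, not real accounts.
SAMPLE_PAGE = """
<ul>
  <li><a href="https://example.com/alice" rel="friend met">Alice</a></li>
  <li><a href="https://photos.example.org/me" rel="me">My photos</a></li>
  <li><a href="https://example.com/about" rel="nofollow">About</a></li>
</ul>
"""

for href, values in extract_xfn(SAMPLE_PAGE):
    print(href, values)
```

The point of the sketch is how little is involved: once a relationship is written into public markup, reading it back out takes a few lines of standard-library code.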
The Case of the Googled MySpace
Though in most cases the API pulls in publicly available information explicitly marked up as XFN or FOAF, there is no standard yet developed for user opt-in or opt-out. Google's Social Graph API is also not limited to XFN and FOAF data. MySpace CTO Aber Whitcomb told me this afternoon that the API includes a custom mechanism to extract social connections between friends on MySpace, though that social network does not yet publish XFN/FOAF.
Whitcomb pointed out that while there's interesting information you can learn by looking outward at who a person's friends are, there may be even more information of interest discoverable by looking inward, from a social circle to the one person its members are all connected to. That's not information that the individual has very explicit control over, though it's fascinating to think about from an application's perspective. Nonetheless, the absence of XFN/FOAF data in your social network is apparently no assurance that it won't be pulled into the new Google API, either. The Google API page says "we currently index the public Web for XHTML Friends Network (XFN), Friend of a Friend (FOAF) markup and other publicly declared connections." In other words, it's not opt-in even for publishers: they aren't required to make their information available in marked-up code.
Where's the user control? MySpace, for example, says on one hand that user privacy is of the utmost importance, and on the other that it will deal with the Google Social Graph situation if problems arise. In practice that actually sounds OK to me, but on principle I think there's a cautionary tale here.
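Whitcomb's "looking inward" idea can be sketched in a few lines. Assume, hypothetically, that a crawler has collected a list of declared links, each a pair of (the person making the claim, the person being claimed). Inverting that list shows what a social circle says about someone who never published anything themselves. The names and edges below are invented for illustration.

```python
from collections import defaultdict


def inbound_claims(edges):
    """Map each person to the set of people who declared a link to them."""
    inbound = defaultdict(set)
    for source, target in edges:
        inbound[target].add(source)
    return dict(inbound)


# Hypothetical declared relationships: (who made the claim, who it points at).
# "dana" never appears as a source, so she has declared nothing herself.
EDGES = [
    ("alice", "dana"),
    ("bob", "dana"),
    ("carol", "dana"),
    ("alice", "bob"),
]

claims = inbound_claims(EDGES)
print(sorted(claims["dana"]))
```

In this toy data, everything known about dana comes from other people's pages, which is exactly the part she has no explicit control over.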
Some have said that those not wanting to connect their profiles in a machine readable way simply shouldn't link them at all; others argue that privacy is an illusion and that we need to get over it. Both of those are vastly insufficient responses to the situation.
Issues and Objections
When Robert Scoble was kicked out of Facebook late last year, some users agreed with the company and said "the fact that I've friended you in Facebook doesn't mean I give you permission to take my info outside of Facebook."
Aside from the fundamental absence of user control here (and that is not going to be an easy problem to solve), there are other issues that other bloggers have brought up.
Joshua Porter ties together much of the conversation, first in a post with good comments about why he's excited about the Social Graph API and then in a follow up post summarizing some objections.
Thomas Vander Wal contends that this new API is exactly the kind of thing that "social engineering hackers have waited for." An extended view of phishing, essentially, says that people with malevolent intentions are often able to fool victims by leveraging weak social ties to gain unearned trust. Exposing a whole network of weak social ties makes that easier to do.
Tim O'Reilly is balls-to-the-wall about the post-privacy future. "It’s a lot like the evolutionary value of pain," he says. "Search (searching the social graph) creates feedback loops that allow us to learn from and modify our behavior. A false sense of security helps bad actors more than tools that make information more visible."
danah boyd brings a cold splash of reality to the discussion in her post on privacy and privilege. She points out that teenagers in particular expose themselves very selectively. There are, boyd points out, many people in this world who are less privileged than Silicon Valley power-nerds and to whom privacy is very important. I would ask you whether someone escaping from domestic violence, for example, ought to be expected to know how or how not to get their various profiles around the web tied together. Should they not have profiles on the web? Should they not use the web? I think there are ways that these questions can be answered more subtly than they are today.
How it Ought to Be
The long and short of the situation is this. The ability to determine social connections across multiple websites is a powerful thing. All of us should be asked to opt-in to allowing our social connections to be indexed.
As much as I want the data to be flowing and free, what's at issue with social connections is not an abstract loss of potential profit from no-longer-falsely-scarce digital content, as it is with other types of published web content. It's a matter of free will and sometimes personal safety. Web users should not be asked to give these things up in exchange for participation in all that the internet is making possible. It doesn't have to be that way and so it shouldn't be.

Comments
-
Yes, I'm concerned about the lack of control over which of my personal relationships will be revealed, and to what extent. I'm also concerned that once again Google has popped a balloon, reducing to zero the value in mapping social relationships for other internet businesses. I've written some more about it here.
Posted by: alan jones | February 4, 2008 2:15 PM -
google is creepy. check out how they track your browsing habits.
Posted by: bboing | February 4, 2008 2:26 PM
-
Excellent points, but at what point should that user control come in? The Google API is mining publicly declared relationships. Those standards, or services using those standards (e.g. Wordpress, Twitter, etc.), should have mechanisms in place to either mask those declared relationships, make them private, or just allow users to opt out. For example, Twitter relationships could be set up not to use XFN if the user opts out.
Here's another thought. How about using OpenID as the central hub where you declare your relationships, which are private and which are public. I kinda like that idea.
Posted by: Deepak | February 4, 2008 2:39 PM -
Completely right on the money, and thank you for this roundup. When people start getting found on social networks that they didn't want to be found on, Google will have some major backlash to deal with.
Yes, users shouldn't have to re-enter their social graph on every site they join, but they shouldn't have to worry about being exposed where they don't want to be either.
We're dealing with a graph, and as such, a connection between two once-obscure points becomes so much more important when a tool to comprehensively traverse that graph becomes available.
Posted by: Luigi Montanez | February 4, 2008 4:43 PM -
This reminds me of the original controversy over the Facebook News Feed. People complained about privacy, but the issue really wasn't privacy, it was transparency. Facebook published nothing in the news feed that wasn't already accessible to a user.
Same deal here - nothing that a Google API request yields is not already public information. The API simply makes it easier to find and analyze.
Personally, I've come to the place of thinking that if you want something to be private, don't put it on the Web in a publicly accessible way. I understand that people may not want certain data indexed, but if data can be accessed without authentication (e.g. Facebook data), then it's possible to use it with technologies like this API.
Which is why I've said for a while that Facebook being a walled garden isn't necessarily as bad as some people make it out to be, by the way...
Posted by: theharmonyguy | February 4, 2008 6:10 PM -
in real life you cannot prevent people from gossiping about you, all you can do is limit their access to your life ... so, yes, opt in, as opposed to opt out is the online equivalent
much of the privacy we have thought we have is the privacy an ostrich has with its head in the sand, and online realities are making this clear
history shows governments will abuse privacy if it helps the acquisition of power, that is a given
if you cannot keep a secret, just like real life, it will be available for all. personal reticence is important, but limited
privacy is over, get used to it, live accordingly. in the us of a, in fear ridden times, the stasi, i.e your neighbors, will report you anyway
Posted by: gregory | February 4, 2008 6:55 PM -
I see your point on this Marshall and I think there is a challenge with opt in:
The fundamental way in which Google does it, like PageRank, is completely generic. Just as companies did not opt in to be indexed, it's tough to make this work for individuals.
The root of the problem here is that Google should not make its algorithm scrape information which does not follow the "rel" tags.
Imagine for a second if only blogs were indexed. Then the problem goes away because you have control. Now add back MySpace. What needs to happen is a check box in MySpace saying "declare my relationships." That's the opt-in, but it is not under Google's control.
Posted by: Alex Iskold | February 4, 2008 8:12 PM -
Great post, Marshall.
One thing that people seem to be forgetting, though, is that the only way for GOOG to combine your Flickr and Twitter connections is if you explicitly link the two (i.e. link to your Flickr acct from your Twitter profile).
If users did not have this control, I might also fall in the "creeped out" camp. But really, if I link to my Flickr, Delicious and MyBlogLog accounts from my MySpace account, can I really presume that these accounts will (or even should) remain distinct?
To belabor the point, if someone escaping domestic abuse links their Flickr account to their profile on AbuseVictim.net.org, can they really claim ignorance later on if Google merges the two social circles?
https://marcoullier.com/blog/2008/02/04/googles-social-graph-api-only-as-powerful-as-you-let-it-be/
Posted by: Eric Marcoullier | February 4, 2008 8:44 PM
-
@Eric, I might surely link my Flickr account to my Twitter account, but I may not want Google to know that. The linking of these accounts is a convenience for me, not an invitation for someone else to draw conclusions and/or make assumptions.
Posted by: Ravi | February 4, 2008 9:10 PM -
So how did this get to be a Google privacy thing?
Look, you sign up for these services, and some of them put your information out on the open web. Any privacy and control issues thus reside with the services themselves, not the companies that are indexing the open web.
So if you want an opt-in for your social connections to be indexed, you'd better demand that from the services themselves, ask them to ensure material can't be crawled.
But skip the entire tagging, Marshall. Got a blog? Do you link from your blog to your various social profiles? Even if you do NOT tag those up with social markers, it's not hard to create a profile for you. People search engines have been doing this already. And the blame? Well, ultimately that resides with the people who put the info out in the first place.
Privacy issues are serious, but all the Google Social API is demonstrating is how much information is already out there and accessible to anyone who wants to mine it. And people are mining it, no doubt. Google just raises the awareness of that for the ordinary individual.
Posted by: Danny Sullivan| February 5, 2008 3:51 AM
-
Danny,
Personally, I do not have a great issue around what information about me is connecting dots to other things I do personally. My customers have an interest in not having people connect what I am doing to them as many see the work I do as competitive advantage (enterprise and smaller organizations in competitive markets). Often I am researching and pulling together background information that is pulling together solutions to their problems and providing a solution and/or strategy that is based on their needs, business structure, internal DNA, and internal psychology. This leads me to keep some accounts and usernames private.
On a personal level I keep some accounts private on services so I can be more open and share with people I know, or people whom I have a fair understanding of to get the information I share. Much of this is based on safety. But when the Google SocialGraph API exposes my Twitter connections for an account that I flagged as private, there is a failure in that system. Not a D grade and needs improvement, but failure, "F". I am not sure if it is Twitter not flagging it properly for propagation in their system or an overzealous Google trying to show how smart they are (social network mapping inside the enterprise has been around for many years; I first ran into it in the late 1990s and it was rather impressive). Google bumped up the scale from a few tens and hundreds of thousands of people to millions on the web.
The really big issue that I see is social engineering hacks. The Google SocialGraph API exposes the connections between people (a much more valuable commodity for evil than connecting one's own information). As Google seems to regularly do, they built this seemingly without good planning or forethought about privacy issues (let alone issues around personal wishes of sharing and privacy). I greatly understand that tools that are not simple are less likely to be used so having a tool that is mindful of wishes (particularly stated values like Twitter privacy).
The connecting of the links between people in an open and easy-to-access manner is very problematic. I have sat through more than my fair share of after-the-fact meetings about hackings. More often than not the problem is not security holes in the software or systems but social engineering hacks (software and system holes do make up a large share, but even many of those hacks came from information derived from a social engineering hack). The social engineering hack is based on a hacker/thief convincing somebody to trust them (weak ties work best) and then having them try and get their social mark (someone else's great term) to divulge information or to be their proxy. Having a SocialGraph mapped provides easy targeting and scenario building. If the hacker sees that Jane is your friend and Jeff is connected to her in many places and you are connected in one place, the hacker can easily use Jeff or one of his connections to build a brilliantly believable SPAM or other communication.
It is not myself, so much, that I worry about, as years of sitting through hack meetings have left me lacking trust and I am quick to flag things that are slightly odd. Most of my XFN statements are behind a login or other authentication system that tracks who has seen that page. I am worried about who can see my connections and use that against people I am friends with; it is really easy to do. Until you have been through a few hack discovery meetings the awareness is not up. In a sense Google's SocialGraph API just showed all the houses that do not lock their doors.
Posted by: vanderwal | February 5, 2008 7:03 AM -
I chatted with Kevin Marks of Google, and he pointed out the issue with Twitter showing up in the SocialGraph API. The Twitter account is private, but its connections are openly available, which would explain why they were pulled in.
Posted by: vanderwal | February 5, 2008 2:31 PM -
People need to keep in mind that information they put on the web is accessible to anybody. Goodies and Baddies.
Posted by: Pat Hawks | February 5, 2008 3:35 PM
This responsibility falls on you.
If I link to your Flickr and use XFN to tell Google that you're my friend, Google (or anybody else) knows nothing more about you than they otherwise would, except that I claim to know you. That's it. That's all this is about.
You "opt-in" when you start publishing on the web in the first place.
TheHarmonyGuy hit the nail right on the head when he mentioned the Facebook news feed.
You don't want people to know something, don't put it on the web. That is your responsibility. Not mine. -
I think the core mistake here is to confuse "privacy by obscurity is dead" with "privacy is dead". Other easy slips include confusing Google with the Web, and assuming that FOAF/XFN are radically different from normal, well-structured markup. Yes there's a big privacy issue here, but it'd still be there if Google weren't indexing.
Longer version: https://danbri.org/words/2008/02/05/267
Posted by: Dan Brickley | February 6, 2008 10:35 AM













