The Farcaster Content Graph

... and the historical underbelly of "public by default" design choices

Let me start by saying that I am a huge fan of Farcaster - the ecosystem, the ethos, but most importantly, the community. I want FC and everyone involved to win.

That being said, a long-term concern I have for Farcaster is how our individual content graphs (our casts/behavior) are public by default rather than private by design.

The wild, wild web2

Historically, we have seen social media companies start out far too open in order to build adoption, only to rein things in once they reached a scale where they could adequately monetize their user data (or once they became a target of regulation).

The FB API, for example, used to give app developers such as Zynga (and me) direct access to friend feeds and behavioral data, which allowed developers to better promote their apps/games to users' friends as well as target in-app purchases.

Eventually, Zynga overdid it and FB had to shut down access to parts of the API in order to stop widespread abuse.

But in the grand scheme of things, the case of Zynga and other consumer businesses using your data for marketing products is a relatively trivial problem.

Being influenced to purchase something can be a minor annoyance, but being influenced to think something is where things become very problematic.

Not all influence is created equal

For example, from 2012 until sometime around 2015/16, third-party companies such as Cambridge Analytica were collecting data such as the public profile/timeline/news feed, page likes, birthday, and current city on around 87 million FB users, 70 million of whom were within the United States.

They analyzed this data to create detailed psychographic profiles on each individual, which they would then use for political influence campaigns via hyper-targeted ads for clients such as Ted Cruz and Donald Trump.

If you are unfamiliar with the Cambridge Analytica scandal, you should definitely take a moment to read up on it, as it shows just how powerful psychographic analysis and targeting can be at scale.

And remember, this was all done before the era of ubiquitous AI agents that can be trained and deployed extremely quickly.

So, what does this have to do with Farcaster?

At this early stage, it's still relatively innocuous. But if the vision is to onboard millions of users over time, the idea that our individual content graph is public by default rather than private by design will eventually lead to massive "Cambridge Analytica" style exploits.

For example, a simple POC I explored recently was to set up my own hub, pull all of the casts for my FID, and feed them into ChatGPT to analyze and create a personality profile based on the Big Five personality traits (crude but effective for now).

Within seconds, I had a fairly accurate picture of my communication style and personality traits, which could then be fed into other models to train AI agents to impersonate (or influence) me.
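
A minimal sketch of what such a proof of concept might look like, in Python. It assumes a hub exposing the HTTP API locally on port 2281, a castsByFid endpoint that returns a "messages" list plus a "nextPageToken", and cast text stored under data.castAddBody.text. Those details, along with the function names and the prompt wording, are illustrative assumptions rather than the exact script behind the experiment described above. The sketch stops at building the prompt and leaves the call to a chat model out.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a hub running locally with its HTTP API on port 2281.
HUB_URL = "http://localhost:2281"


def fetch_cast_messages(fid, hub_url=HUB_URL, page_size=100):
    """Page through the hub's castsByFid endpoint and return raw messages."""
    messages, page_token = [], ""
    while True:
        query = {"fid": fid, "pageSize": page_size}
        if page_token:
            query["pageToken"] = page_token
        url = f"{hub_url}/v1/castsByFid?{urllib.parse.urlencode(query)}"
        with urllib.request.urlopen(url) as resp:
            payload = json.load(resp)
        messages.extend(payload.get("messages", []))
        page_token = payload.get("nextPageToken") or ""
        if not page_token:
            return messages


def extract_cast_texts(messages):
    """Keep the non-empty text of cast-add messages; skip everything else."""
    texts = []
    for msg in messages:
        body = (msg.get("data") or {}).get("castAddBody") or {}
        text = (body.get("text") or "").strip()
        if text:
            texts.append(text)
    return texts


def build_big_five_prompt(texts, max_chars=8000):
    """Assemble a profiling prompt, capping the amount of cast text included."""
    header = (
        "Based only on the posts below, estimate the author's Big Five "
        "personality traits (openness, conscientiousness, extraversion, "
        "agreeableness, neuroticism) and describe their communication style.\n\n"
    )
    lines, used = [], 0
    for text in texts:
        line = f"- {text}"
        if used + len(line) + 1 > max_chars:
            break
        lines.append(line)
        used += len(line) + 1
    return header + "\n".join(lines)
```

Calling build_big_five_prompt(extract_cast_texts(fetch_cast_messages(fid))) for any FID yields text that can be pasted into a chat model. Nothing in the sketch requires the consent, or even the awareness, of the person being profiled, which is the point of the concern.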

The ways this data can be exploited are nearly unlimited, and the consequences become extremely dangerous once the network reaches a large enough scale and the data ends up in the hands of those with bad motives.

Who is responsible for, or even capable of, privacy protection in an open graph?

For all its faults, a closed content graph such as FB or X at least has a single gatekeeper to your data who can rein in abuse.

In an open content graph, who is responsible for - or even has the ability to do - that?

As a user on Farcaster, how do I protect my content graph from those who would use it to analyze my data and manipulate or target me personally?

What is the overwhelming justification for the "public by default" design choice? Is it to accelerate builder adoption until we run into the same problems we've seen occur on other social graphs?

Just because the graph is decentralized does not mean that it is immune from becoming a target of regulators if/when a privacy issue inevitably occurs.

Privacy by design, disclosure by choice

I would propose that we need to solve this problem sooner rather than later so that the default option is privacy by design, disclosure by choice.

My signature should be required any time I choose to expose my cast or behavior data to any 3rd party clients/apps, and I should be able to revoke that decision at any time.
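
One way to picture this is a signed, revocable grant that a client must present before it can read my data. The sketch below is purely hypothetical: Farcaster has no such consent mechanism today, the names sign_grant and ConsentRegistry are invented for illustration, and HMAC from the Python standard library stands in for a real public-key signature (such as Ed25519), which would let anyone verify a grant without holding the user's secret.

```python
import hashlib
import hmac
import json
import time


def _payload(fid, app_id, scope, issued_at):
    """Canonical bytes that the user signs for a single grant."""
    fields = {"fid": fid, "app_id": app_id, "scope": scope, "issued_at": issued_at}
    return json.dumps(fields, sort_keys=True).encode()


def sign_grant(user_key, fid, app_id, scope, issued_at=None):
    """User side: approve one app for one scope.

    HMAC is a stand-in here, not a true digital signature.
    """
    issued_at = int(time.time()) if issued_at is None else issued_at
    payload = _payload(fid, app_id, scope, issued_at)
    signature = hmac.new(user_key, payload, hashlib.sha256).hexdigest()
    return {"fid": fid, "app_id": app_id, "scope": scope,
            "issued_at": issued_at, "signature": signature}


class ConsentRegistry:
    """Hypothetical gate a data provider would consult before serving data."""

    def __init__(self):
        self._revoked = set()

    def revoke(self, fid, app_id):
        """The user withdraws access for an app at any time."""
        self._revoked.add((fid, app_id))

    def is_allowed(self, grant, user_key, scope):
        """Allow access only for an unrevoked, untampered grant of this scope."""
        if (grant["fid"], grant["app_id"]) in self._revoked:
            return False
        if grant["scope"] != scope:
            return False
        payload = _payload(grant["fid"], grant["app_id"],
                           grant["scope"], grant["issued_at"])
        expected = hmac.new(user_key, payload, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, grant["signature"])
```

The design choice worth noting is that access is denied by default: a client holding no grant, a tampered grant, or a revoked grant gets nothing. Enforcing that on an open network where hubs replicate data freely is the hard, unsolved part, and this sketch does not address it.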

Will this mean that builders in the ecosystem will now need to endure some friction in order to get access to user data?

Yes, it will - but that is what real graph ownership looks and feels like to me.

And, ultimately, I think it will lead to a much healthier, more secure, and more sustainable network.

Some additional reading if you're interested:

Finally, we suggest that regulations of psychological targeting should be accompanied by a mindset that fosters (1) privacy by design to make it easy for individuals to act in line with their privacy goals, as well as (2) disclosure by choice, to allow individuals to freely decide whether and when they might be willing to forsake their privacy for better service.
