The Farcaster Content Graph

Let me start by saying that I am a huge fan of Farcaster - the ecosystem, the ethos, but most importantly, the community. I want FC and everyone involved to win.

That being said, a long-term concern I have for Farcaster is how our individual content graphs (our casts/behavior) are public by default rather than private by design.

The wild, wild web2

Historically, we have seen social media companies start out far too open in order to build adoption, only to rein things in once they reached a scale where they could adequately monetize their user data (or once they became a target of regulation).

The FB API, for example, used to give app developers such as Zynga (and me) direct access to friend feeds and behavioral data, which let developers promote their apps/games to users' friends and better target in-app purchases.

Eventually, Zynga overdid it and FB had to shut down access to parts of the API in order to stop widespread abuse.

But in the grand scheme of things, the case of Zynga and other consumer businesses using your data for marketing products is a relatively trivial problem.

Being influenced to purchase something can be a minor annoyance, but being influenced to think something is where things become very problematic.

Not all influence is created equal

For example, from 2012 until sometime around 2015/16, 3rd party companies such as Cambridge Analytica were collecting data such as the public profile/timeline/news feed, page likes, birthday, and current city on around 87 million FB users, 70 million of whom were within the United States.

They analyzed this data to create detailed psychographic profiles of each individual, which they would then use for political influence campaigns via hyper-targeted ads for clients such as Ted Cruz and Donald Trump.

Cambridge Analytica: how did it turn clicks into votes?

Whistleblower Christopher Wylie explains the science behind Cambridge Analytica's mission to transform surveys and Facebook data into a political messaging weapon

https://www.theguardian.com

If you are unfamiliar with the Cambridge Analytica scandal, you should definitely take a moment to read up on it, as it shows just how powerful psychographic analysis and targeting can be at scale.

And remember, this was all done before the era of ubiquitous AI agents that can be trained and deployed extremely quickly.

So, what does this have to do with Farcaster?

At this early stage, it's still relatively innocuous. But if the vision is to onboard millions of users over time, the idea that our individual content graph is public by default rather than private by design will eventually lead to massive "Cambridge Analytica" style exploits.

For example, a simple POC I explored recently was to set up my own hub, pull all of the casts for my FID, and feed them into ChatGPT to analyze and create a personality profile based on the Big Five personality traits (crude but effective for now).

Within seconds, I had a fairly accurate picture of my communication style and personality traits, which could then be fed into other models to train AI agents to impersonate (or influence) me.
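To make the POC above concrete, here is a minimal sketch of its data-preparation step. It is not the code I actually ran. It assumes the hub returns cast messages as JSON objects with the text under data.castAddBody.text (an assumption about the message shape), and the function names and prompt wording are hypothetical. It makes no network or model calls; it only shows how little glue is needed to turn public casts into a profiling prompt.

```python
from typing import Any

# The five traits of the Big Five model referenced in the post.
BIG_FIVE = ["openness", "conscientiousness", "extraversion",
            "agreeableness", "neuroticism"]


def extract_cast_texts(messages: list[dict[str, Any]]) -> list[str]:
    """Pull the text out of cast messages, skipping anything that is not a cast.

    ASSUMPTION: each message looks like {"data": {"castAddBody": {"text": ...}}}.
    """
    texts = []
    for message in messages:
        body = (message.get("data") or {}).get("castAddBody") or {}
        text = (body.get("text") or "").strip()
        if text:
            texts.append(text)
    return texts


def build_profile_prompt(fid: int, texts: list[str], max_casts: int = 200) -> str:
    """Assemble a plain-text prompt asking a model for a Big Five profile."""
    sample = texts[:max_casts]  # cap the prompt size
    lines = "\n".join(f"- {t}" for t in sample)
    traits = ", ".join(BIG_FIVE)
    return (
        f"Below are {len(sample)} public posts by user {fid}.\n"
        f"Rate the author on each Big Five trait ({traits}) "
        f"and describe their communication style.\n\n{lines}"
    )


if __name__ == "__main__":
    # Hypothetical sample data standing in for a hub response.
    sample_messages = [
        {"data": {"castAddBody": {"text": "gm, shipping a new frame today"}}},
        {"data": {"reactionBody": {"type": "like"}}},  # not a cast, skipped
    ]
    print(build_profile_prompt(1234, extract_cast_texts(sample_messages)))
```

Nothing here needs permission from the person being profiled, which is the point of the example.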

The ways this data can be exploited are nearly unlimited, and the consequences become extremely dangerous once it reaches a large enough scale and falls into the hands of those with bad motives.

Who is responsible for, or even capable of, privacy protection in an open graph?

For all its faults, a closed content graph such as FB or X at least has a single gatekeeper to your data to rein in abuse.

In an open content graph, who is responsible for - or even has the ability to do - that?

As a user on Farcaster, how do I protect my content graph from those who would use it to analyze my data and manipulate or target me personally?

What is the overwhelming justification for the "public by default" design choice? Is it to accelerate builder adoption until we run into the same problems we've seen occur on other social graphs?

Just because the graph is decentralized does not mean that it is precluded from becoming the target of regulators if/when a privacy issue inevitably occurs.

Privacy by design, disclosure by choice

I would propose that we need to solve this problem sooner rather than later so that the default option is privacy by design, disclosure by choice.

My signature should be required any time I choose to expose my cast or behavior data to any 3rd party clients/apps, and I should be able to revoke that decision at any time.
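As a rough illustration of what signed, revocable disclosure could look like, here is a minimal sketch. It is not a protocol proposal and does not describe how Farcaster works today. The ConsentRegistry class, the message formats, and the app IDs are all hypothetical. It also uses an HMAC over a shared secret as a stand-in for a real signature only so the example runs with the standard library; an actual design would use public-key signatures so the verifier never holds the user's key, plus nonces or timestamps to prevent replay.

```python
import hashlib
import hmac
from dataclasses import dataclass, field


def sign(user_key: bytes, message: str) -> str:
    """Stand-in for a user signature (ASSUMPTION: HMAC-SHA256, not a real signer)."""
    return hmac.new(user_key, message.encode(), hashlib.sha256).hexdigest()


@dataclass
class ConsentRegistry:
    """Hypothetical registry of which apps a user has allowed to read their data."""
    user_keys: dict[int, bytes]                       # fid -> verification key
    grants: set[tuple[int, str]] = field(default_factory=set)

    def _verify(self, fid: int, message: str, signature: str) -> bool:
        key = self.user_keys.get(fid)
        if key is None:
            return False
        return hmac.compare_digest(sign(key, message), signature)

    def grant(self, fid: int, app_id: str, signature: str) -> bool:
        """Record a grant only if the user signed this exact grant message."""
        if not self._verify(fid, f"grant:{fid}:{app_id}", signature):
            return False
        self.grants.add((fid, app_id))
        return True

    def revoke(self, fid: int, app_id: str, signature: str) -> bool:
        """Remove a grant only if the user signed this exact revoke message."""
        if not self._verify(fid, f"revoke:{fid}:{app_id}", signature):
            return False
        self.grants.discard((fid, app_id))
        return True

    def can_read(self, fid: int, app_id: str) -> bool:
        """Private by default: access exists only while a grant is on record."""
        return (fid, app_id) in self.grants


if __name__ == "__main__":
    key = b"demo-key"
    registry = ConsentRegistry(user_keys={1234: key})
    registry.grant(1234, "some-client", sign(key, "grant:1234:some-client"))
    print(registry.can_read(1234, "some-client"))
    registry.revoke(1234, "some-client", sign(key, "revoke:1234:some-client"))
    print(registry.can_read(1234, "some-client"))
```

The design choice worth noticing is the default: with no grant on record, can_read is false, so disclosure is something the user does rather than something they have to undo.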

Will this mean that builders in the ecosystem will now need to endure some friction in order to get access to your data?

Yes, it will - but that is what real graph ownership looks and feels like to me.

And, ultimately, I think it will lead to a much healthier, more secure, and more sustainable network.

Some additional reading if you're interested:

https://www.sciencedirect.com/science/article/abs/pii/S2352250X19301332

Finally, we suggest that regulations of psychological targeting should be accompanied by a mindset that fosters (1) privacy by design to make it easy for individuals to act in line with their privacy goals, as well as (2) disclosure by choice, to allow individuals to freely decide whether and when they might be willing to forsake their privacy for better service.

Mosio 🎩

Commented 1 year ago

I just subscribed to @baz.eth on /paragraph! Check it out:

Fateme

Commented 1 year ago

🙄🙄🙄

Mosio 🎩

Commented 1 year ago

✨ 💓

Fateme

Commented 1 year ago

could you lemme know what is it 🙄

Barry

Commented 1 year ago

Hey all - just wanted to share some of my concerns about the public nature of our casts/behavior on FC. Hoping for an open dialogue on where we think this is headed, and in the long run, how to provide privacy protections from a user content perspective. https://paragraph.xyz/@barrycollier/farcaster-content-graph

Cameron Armstrong

Commented 1 year ago

appreciate you taking the time to write and share, yet it feels like the best way to not expose yourself to the privacy concerns around a public, permissionless content network is to not post on one?

Taylor

Commented 1 year ago

Thank you for raising this important topic and for your thoughtful treatment of the question in your article.

Barry

Commented 1 year ago

Thank you, Taylor! 💜

GIGAMΞSH

Commented 1 year ago

Good read! Hoping there are encrypted channels in the future, but I think the default open data will foster a collective immune system against the abuse that happens on centralized networks. Ex: Strong digital identity/reputation + a norm of only trusting media signed by a reputable source.

Barry

Commented 1 year ago

Thanks for reading! Channel-level privacy was an interesting idea, and was something I was going to bring up to @dwr.eth once I formulated my thoughts around it. I'm definitely not a proponent of centralized networks. I'm less concerned about the individual one-off abuse and more concerned about the masses at scale

Barry

Commented 1 year ago

Just thinking long-term, who are the users on FC? How big does FC get? If it's in the 10's/100's of millions, it will include mainstream users w/o skills or understanding, and it exposes them to bad actors (e.g., mis/disinformation, etc.) If we don't expect FC to reach that scale, it's only an isolated nuisance.

Ivy

Commented 1 year ago

this is at odds with the permissionless north star

Barry

Commented 1 year ago

Yeah, I understand. Do you think the concern about how it will be abused is valid, or do you believe that all of your personal content being publicly available will never be abused? Because I’ve pointed to some historical examples of how lack of privacy ends badly. I think there are models to satisfy both goals.

Ivy

Commented 1 year ago

it's a valid concern, it's going to be abused and i'm not concerned about it being abused in current year you have to assume that AI or something is going to scrape publicly available data, farcaster is far past the 'private clubhouse' era

Ivy

Commented 1 year ago

further to that is that we are still in the benevolent dictator era which means using farcaster implies you are on board with merkle / @dwr.eth and @v north stars, one of the biggest of which is permissionless but even if that wasn't true i don't think a proposal for privacy by design would win on votes rn

Frank

Commented 1 year ago

“all of your personal content” doesn’t sound accurate; people share what they want but likely it’s a small percentage of the totality of their content/data

rileybeans

Commented 1 year ago

glad we're seeing more conversations about privacy/AI like these. hopefully teams can have healthy discussions with users and experts on these topics. the answer is not always either "don't use it" or "build your own client," there's more to this imo

chicago

Commented 1 year ago

Would love to hear @aeluteia thoughts about this. Literally the MOST privacy conscious person I've ever met and impressively so! 😎😁

Barry

Commented 1 year ago

Would love to hear @aeluteia's thoughts as well! And just to be clear, while I'm for a consumer's right to choose how their data is accessed, my larger concern is the societal impact of a (near) zero privacy protocol at scale. If FC wants to grow to 1B+ DAU without any privacy controls, it feels very problematic.

Trish🫧

Commented 1 year ago

Same

Commented 1 year ago

I really like this idea of 'privacy by design, disclosure by choice'. I'm sure there are tradeoffs, and I would be curious to read an analysis of what those are from the application's perspective. From the user's perspective discovery might be one of them, which is why optional disclosure being built in is cool.

Commented 1 year ago

I wish I could say I still was! I've capitulated a fair bit lately. Largely for the reason that Moxie Marlinspike articulated years ago, which is that opting out is hard. It's not just particular tools or applications that you end up opting out of, but also the networks and communities that use those tools.

Semui

Commented 1 year ago

thanks for sharing, this is an interesting take. I take the view that everything is public in a sense, even on proprietary platforms given data breaches or how they may sell my data to third parties. With Farcaster being public by default, at least I don't have a false sense of security. I think awareness is key.

Barry

Commented 1 year ago

Thanks, Semui. I have a similar view on that, personally. I don't have any illusion of privacy online. I was just writing a similar thought as well. For me, the concern is what happens with a (near) zero privacy protocol at scale if/when mainstream users adopt the network: https://warpcast.com/baz.eth/0x39cf24ac

Semui

Commented 1 year ago

Thanks for the additional context. I think a useful example are those cases we see on Twitter where an account has an unexpected tweet that goes viral, and then they go into “protected” mode. No protected mode on Farcaster 🙈 Mindset shift needed there.

Liam 🎩🔮⛓️💊

Commented 1 year ago

I like this take

Trish🫧

Commented 1 year ago

All the things I’ve been thinking. Thank you. That personality profile frame really unnerved me. I knew it was coming but still I’d like to have a little more control. At least the right to know who is accessing what, why and how often. It also makes me want a community owned client even more than I did

jp 🎩🚢

Commented 1 year ago

1/ Public spaces, digital and IRL, will continue to exist and serve an important purpose at one end of the privacy spectrum. Public networks are designed to amplify information distribution, so if I want my media to be seen by as many people (and agents) as possible, that’s where I will go.

jp 🎩🚢

Commented 1 year ago

2/ In the pre-AI era, public content was consumed by crawlers like Google, because we wanted it to be found. IMO Cambridge Analytica was abuse by a bad actor using FB’s data, violating FB’s ToS, and then FB taking the PR fall because it had a target on its back for commoditizing the news industry

Barry

Commented 1 year ago

All of that feels accurate re: CA/FB. The question I posed: if FC gets to scale w/ mainstream adoption, how do we prevent another CA from occurring here? FB was able to shut off access to CA, but anyone can spend $10 like I did, spin up an EC2 instance, and download the entire FC graph to build psych models on every FID.

jp 🎩🚢

Commented 1 year ago

3/ private digital spaces will coexist with public digital spaces. Note that privacy is a cultural phenomenon, and means different things to different cultures. Most extreme example is Germany vs Brazil. Individual granularity of privacy is more often a stated preference vs a revealed one.

0xdesigner

Commented 1 year ago

cambridge analytica is the perfect reference point. influence and manipulation is way easier than most people realize, and leaves us pretty vulnerable in ways we're not conscious of. on the bright side, we'll get better targeted ads

Barry

Commented 1 year ago

💯 I'd love to believe that society en masse has the critical thinking skills to be immune to this, but history and basic human psychology have proven otherwise. I am looking forward to more relevant ads, though 😬 or maybe a personal AI shopper trained to know what I want at all times. Ok, maybe it's all worth it.

The Farcaster Content Graph

baz.eth

... and the historical underbelly of "public by default" design choices
