How can we determine if someone is underfollowed?

Understanding Underfollowedness

I read this cast by Dan Romero on Farcaster and one word in that cast intrigued me: UnderFollowed. What does this word mean? Is it a qualitative thing or can we quantize it with data? First, we know that we can query a few things about an account on farcaster: Following count, follower count, and number of engagements received or dished out. I gave these metrics to CHatGPT as a starting point for a good UnderFollowedness metric and chatGPT came up with this formula:

$$\frac{Engagements(recasts + likes)}{Followers} * \frac{Following}{Followers} $$

This is a good formula for these reasons:

  1. we can easily obtain the data

  2. It measures the following/follower ratio

  3. It considers received engagements to followers count

  4. Multiplying 2 by 3 may reveal highly engaged users in the community but with a relatively low follower count.

The problem with this formula is that users with huge follower counts and engagement numbers can get Higher scores (in this formula, a higher score means more underfollowed), So You may have to choose a followers count threshold and penalize accounts that have more followers than that number.

How can we implement this with farcaster data?

The cast wanted the accounts interacted with and followed in the past week, we start with followed in the last 7 days:

SELECT target_fid, fname, display_name, s.created_at  
FROM dune.neynar.dataset_farcaster_links as s
join dune.neynar.dataset_farcaster_profile_with_addresses as a on a.fid = s.target_fid
WHERE type = 'follow'
  AND s.deleted_at IS NULL
  AND s.fid = {{your fid}}
  AND s.created_at >= current_date - interval '7' day

This is a simple SQL join. we join the FIDs of accounts followed by an FID in the last 7 days (put FID in {{your fid}} ) and connect those FIDs with farcaster names and handles.

Now we put that query in a CTE (to just query it once and save processing time) and with three little CTEs we get following, followers, and engagements (likes + recasts) of those accounts:

with df as
(SELECT target_fid, fname, display_name, s.created_at  
FROM dune.neynar.dataset_farcaster_links as s
join dune.neynar.dataset_farcaster_profile_with_addresses as a on a.fid = s.target_fid
WHERE type = 'follow'
  AND s.deleted_at IS NULL
  AND s.fid = {{your fid}}
  AND s.created_at >= current_date - interval '7' day),
  
follower_count as (
SELECT target_fid ,count(*) as follower_count   FROM  dune.neynar.dataset_farcaster_links WHERE
      type = 'follow'
      AND deleted_at IS NULL
      and target_fid in (select target_fid from df ) group by target_fid ),
      
 follow_count as (
SELECT fid ,count(*) as follow_count FROM  dune.neynar.dataset_farcaster_links WHERE
      type = 'follow'
      AND deleted_at IS NULL
      and fid in (select target_fid from df ) group by fid),
      
engagement_count as (select  target_fid,   count(*) as engagement_count from dune.neynar.dataset_farcaster_reactions where target_fid in
(select target_fid from df ) group by target_fid)

Now that we have everything together, we join them in a neat little table:

with df as
(SELECT target_fid, fname, display_name, s.created_at  
FROM dune.neynar.dataset_farcaster_links as s
join dune.neynar.dataset_farcaster_profile_with_addresses as a on a.fid = s.target_fid
WHERE type = 'follow'
  AND s.deleted_at IS NULL
  AND s.fid = {{your fid}}
  AND s.created_at >= current_date - interval '7' day),
  
follower_count as (
SELECT target_fid ,count(*) as follower_count   FROM  dune.neynar.dataset_farcaster_links WHERE
      type = 'follow'
      AND deleted_at IS NULL
      and target_fid in (select target_fid from df ) group by target_fid ),
      
 follow_count as (
SELECT fid ,count(*) as follow_count FROM  dune.neynar.dataset_farcaster_links WHERE
      type = 'follow'
      AND deleted_at IS NULL
      and fid in (select target_fid from df ) group by fid),
      
engagement_count as (select  target_fid,   count(*) as engagement_count from dune.neynar.dataset_farcaster_reactions where target_fid in
(select target_fid from df ) group by target_fid)
    
select df.target_fid , df.fname, df.display_name, df.created_at,
follow_count.follow_count ,follower_count.follower_count,
(cast( follow_count.follow_count as Decimal(18,2) ) / cast( follower_count.follower_count as Decimal(18,2) )) *
(cast(engagement_count as Decimal(18,2)) / cast( follower_count.follower_count as Decimal(18,2) )) as metric
from df
join   follower_count on df.target_fid = follower_count.target_fid
join   follow_count   on df.target_fid = follow_count.fid
join engagement_count on df.target_fid = engagement_count.target_fid 
where follower_count.follower_count < 666 order by metric desc limit 10

I chose 666 as a follower count threshold, everyone with more than that is not underfollowed in my opinion. this underfollowedness metric is a work in progress but it kinda works for now. I want to incorporate other indicators like account age or cast quality into the formula. Another thing I think I can do is favor accounts that have been active recently. This still has a long way to go but you can try my formula in this DUNE dashboard.

Loading...
highlight
Collect this post to permanently own it.
Subscribe to Tinkering Onchain and never miss a post.