Feedback requested: June 30 release candidate for "Steem Follower Checker" browser extension

remlaps - 7 months ago

A couple of weeks ago, I posted Curating for Value: How "Follower Network Strength" Improves Steem Post Ranking, where I described planned changes to the Steem Follower Checker browser extension. As noted in that post (and previously), my target for the next version is June 30. These are the changes that I had planned for that target:

- Change "follower count" to "time-averaged follower count" as one factor of the "Follower Network Strength" calculation.

- Tune parameters to award top scores to a smaller proportion of accounts.

And now, I believe that I have accomplished both of those. I also put together a script so that I could visualize changes between versions with real world data. In the new version, the X parameter is "new followers per month" instead of a raw follower count, and the Y parameter is still the median follower reputation.
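To make the switch from a raw follower count to a time-averaged one concrete, here is a minimal sketch. It assumes the average is simply total followers divided by account age in months, which matches the per-month arithmetic in the demonstration script later in this post. The function name and the sample numbers are illustrative and are not taken from the extension's code.

```python
DAYS_PER_MONTH = 365.25 / 12

def followers_per_month(follower_count: float, account_age_days: float) -> float:
    """Average number of followers gained per month over the account's lifetime."""
    if account_age_days <= 0:
        raise ValueError("account_age_days must be positive")
    return follower_count * DAYS_PER_MONTH / account_age_days

# Hypothetical accounts with the same raw follower count but different ages.
young = followers_per_month(2000, 365.25)      # 1 year old
old = followers_per_month(2000, 8 * 365.25)    # 8 years old
print(f"1-year-old account: {young:.1f} followers/month")
print(f"8-year-old account: {old:.1f} followers/month")
```

Two accounts with the same 2,000 followers land at very different X values, and an account that stops gaining followers drifts lower each month as its age grows.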

So, now I want to request feedback on the new version. Additional changes (if any) between now and June 30 will be limited to cosmetics and to addressing issues that are revealed by this discussion. Please look over the visualizations below and let me know if you have any questions or concerns.

If you want to look at the actual code, you can review the 2024q2dev branch. If you want to try it out, you can even switch to that branch and install it by following the same instructions from the README file.

Here's what the theoretical heatmap looks like for the tentative June 30 version:

up to 100 followers per month

this covers most of the real world data

up to 500 followers per month

New in this version, I tailored the algorithm so that it's possible to score above 1 at lower median reputation values for ultra-high follower counts.

Theory is nice, but what does it look like in practice?

This time around, I created a script and a PowerBI report that let me visualize the scoring with the author accounts from 24 hours' worth of real-world posts and comments. Here's what that looks like:

(Anywhere it says "old", that means the May 18 version. If it says "current" or "new", or doesn't specify, that's what I plan to merge on June 30.)

Numbers of accounts by score (May 18 method on the left, June 30 method on the right)

- Coloring is scaled to the median in each bin for followers (May 18) or "new followers per month" (June 30, tentative)

- Grey is low, yellow is medium, and orange is high
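The coloring rule described above can be sketched as follows. This is only an illustration: the bin width, the sample data, and the function name are assumptions, and the actual PowerBI report may compute its bins differently.

```python
from collections import defaultdict
from statistics import median

def median_by_score_bin(accounts, bin_width=0.25):
    """Group (score, followers_per_month) pairs into score bins and
    return the median followers_per_month for each bin."""
    bins = defaultdict(list)
    for score, fpm in accounts:
        bin_start = (score // bin_width) * bin_width
        bins[bin_start].append(fpm)
    return {start: median(values) for start, values in sorted(bins.items())}

# Made-up (score, new followers per month) pairs, for illustration only.
sample = [(0.1, 2), (0.2, 4), (0.3, 10), (0.4, 14), (0.9, 60), (1.1, 200)]
medians = median_by_score_bin(sample)
for start, value in medians.items():
    print(f"score bin starting at {start:.2f}: median {value}")
```

Each bin's median could then be mapped onto the grey, yellow, and orange scale.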

Heat map from the May 18 version

Bubble chart from the May 18 version

- Bigger bubbles have more accounts during the 24-hour period.

- Low scores are grey, mid-range scores are yellow, high scores are orange

Heat map from the tentative June 30 version

Bubble chart from the tentative June 30 version

- Bigger bubbles have more accounts during the 24-hour period.

- Low scores are grey, mid-range scores are yellow, high scores are orange

Looking ahead

I mentioned before that after June 30, my target will be to update the scoring method once per quarter. The main problem that I plan to focus on for September 30 is old accounts that were once high-powered but have since gone dormant.

The changes here already start to address that because the score declines a bit for every month that an author doesn't add followers. However, in looking at the real-world data, I can see that it doesn't decay anywhere near fast enough. I'm considering one or both of the following options to address this:

- As suggested by @moecki, I may make use of the SDS from @steemchiller in order to focus just on new followers within a certain time window.

- I may implement something like a half-life in the calculation so that an increasing portion of follower accounts get scored as inactive after some period of time.

Here's some code that demonstrates the concept:

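#!/usr/bin/python3

# Per-year values
# Note: "halflife" is applied below as the fraction of active followers
# that go inactive per year (spread evenly across days). At 0.5 that
# works out to an effective half-life of roughly ln(2)/0.5, or about
# 1.4 years, rather than exactly half a year.
halflife = 0.5
growthrate = 1200  # New followers per year

N = 60  # Number of months to simulate

followers = 0
followersPerMonth = 0
activeFollowers = 0
activeFollowersPerMonth = 0
inactiveFollowers = 0

days = N / 12 * 365.25
currentDay = 1
halflifePerDay = halflife / 365.25
growthratePerDay = growthrate / 365.25

while currentDay < days:
    # Add one day of new followers; they all start out active.
    followers = followers + growthratePerDay
    activeFollowers = followers - inactiveFollowers
    # Move a fixed daily fraction of the active followers to inactive.
    activeFollowers = activeFollowers - activeFollowers * halflifePerDay
    inactiveFollowers = followers - activeFollowers
    # Average monthly rates over the simulated account's lifetime so far.
    followersPerMonth = followers * (365.25 / (currentDay * 12))
    activeFollowersPerMonth = activeFollowers * (365.25 / (currentDay * 12))

    print(f"Day: {currentDay}, Followers: {followers}, Active Followers: {activeFollowers}, Inactive followers: {inactiveFollowers}, Followers Per Month: {followersPerMonth}, Active Followers Per Month: {activeFollowersPerMonth}")
    currentDay = currentDay + 1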

And here are some LibreOffice charts for a hypothetical 8-year-old account with 2000 followers and an assumed half-life of 1 year.

[Charts: Monthly | Total]
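As a rough cross-check on the half-life idea, the sketch below treats each follower as having a chance of still being active that halves with every half-life elapsed since the follow. It assumes that followers arrived at a steady rate over the account's lifetime, which is a simplification. The function name and the even-arrival assumption are for illustration only, and this is not the method used in the extension or in the charts.

```python
def expected_active_followers(total_followers, account_age_years, half_life_years):
    """Expected active followers if follows arrived evenly over the account's
    life and each follower's chance of being active halves every half-life."""
    months = int(round(account_age_years * 12))
    per_month = total_followers / months
    active = 0.0
    for months_ago in range(months):
        age_years = (months_ago + 0.5) / 12  # mid-month age of this cohort
        active += per_month * 0.5 ** (age_years / half_life_years)
    return active

# The hypothetical account described above: 8 years old, 2000 followers,
# half-life of 1 year.
active = expected_active_followers(2000, 8, 1)
print(f"Expected active followers: {active:.0f} of 2000")
```

Under these assumptions, well under a quarter of the 2,000 followers would still count as active, with the most recent follows contributing the most.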

Conclusion

Overall, there is still a lot of room for improvement, but I think this version is clearly improved over the May 18 version. I also think that it is already able to give curators some insight into the potential reach of posts by any particular author.

Additionally, I think that the new capability to visualize the real-world impact of methodology changes will be a big help with future development.

Looking forward to receiving your feedback.

Thank you for your time and attention.

As a general rule, I up-vote comments that demonstrate "proof of reading".

Steve Palmer is an IT professional with three decades of professional experience in data communications and information systems. He holds a bachelor's degree in mathematics, a master's degree in computer science, and a master's degree in information systems and technology management. He has been awarded 3 US patents.

Pixabay license, source

Reminder

Visit the /promoted page and #burnsteem25 to support the inflation-fighters who are helping to enable decentralized regulation of Steem token supply growth.