I’ll never forget launching my first open-source project and sharing it on Reddit…
I had spent a couple of days at my parents’ place over Christmas that year and decided to use some of my spare time to work on a Python library I christened Schedule.
The idea behind Schedule was very simple and had a narrow focus (I find that's always a good idea for libraries, by the way):
Developers would use it like a timer to periodically call a function inside their Python programs.
The kicker was that Schedule used a funky “natural sounding” syntax to specify the timer interval. For example, if you wanted to run a function every 10 minutes, you’d do this: schedule.every(10).minutes.do(
Or, if you wanted to run a particular task every day at 10:30 in the morning, you’d do this: schedule.every().day.at('10:
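To make the idea concrete, here is a minimal sketch of how a chained, natural-sounding interval syntax could be built in plain Python. It is not Schedule's actual implementation: the `Scheduler` and `Job` classes, the injectable `clock` parameter, and the small set of supported units are all assumptions made for illustration.

```python
import time


class Job:
    """One registered call: run `func` every `interval` units of time."""

    SECONDS_PER_UNIT = {"seconds": 1, "minutes": 60, "hours": 3600}

    def __init__(self, scheduler, interval):
        self.scheduler = scheduler
        self.interval = interval
        self.period = None    # seconds between runs; set by a unit property
        self.func = None
        self.next_run = None

    def _set_unit(self, unit):
        self.period = self.interval * self.SECONDS_PER_UNIT[unit]
        return self           # returning self is what allows the chaining

    @property
    def seconds(self):
        return self._set_unit("seconds")

    @property
    def minutes(self):
        return self._set_unit("minutes")

    @property
    def hours(self):
        return self._set_unit("hours")

    def do(self, func):
        """Register `func` and schedule its first run one period from now."""
        if self.period is None:
            raise ValueError("pick a unit first, e.g. every(10).minutes")
        self.func = func
        self.next_run = self.scheduler.clock() + self.period
        self.scheduler.jobs.append(self)
        return self

    def run_if_due(self):
        """Call `func` if its next run time has been reached."""
        if self.scheduler.clock() < self.next_run:
            return False
        self.func()
        self.next_run = self.scheduler.clock() + self.period
        return True


class Scheduler:
    """Holds jobs and runs the ones that are due (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock    # injectable so the sketch can be tested
        self.jobs = []

    def every(self, interval=1):
        return Job(self, interval)

    def run_pending(self):
        """Run all due jobs and return how many were executed."""
        return sum(job.run_if_due() for job in self.jobs)
```

With this sketch, `scheduler.every(10).minutes.do(task)` registers `task` (a placeholder name) to run every 600 seconds, and calling `scheduler.run_pending()` in a loop executes whatever is due.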
Because I was so frustrated with Cron’s syntax I thought this approach was really cool. And so I decided this would be the first Python module I’d release as open-source.
I cleaned up the code and spent some time coming up with a nice README file—because that’s really the first thing that your potential users will see when they check out your library.
Once I had my module available on PyPI and the source code on GitHub I decided to call some attention to the project. The same night I posted a link to the repository to Reddit and a couple of other sites.
I still remember that I had shaky hands when I clicked the “submit” button…
It’s scary to put your work out there for the whole world to judge! Also, I didn’t know what to expect.
Would people call me stupid for writing a “simple” library like that?
Would they think my code wasn’t good enough?
Would they find all kinds of bugs and publicly shame me for them? I felt almost a physical sense of dread about pushing the “submit” button on Reddit that night!
The next morning I woke up and immediately checked my email. Were there any comments? Yes, about twenty or so!
I started reading through all of them, faster and faster—
And of course my still frightened mind immediately zoomed in on the negative ones, like:
“Cool idea, but not particularly useful”,
“The documentation is not enough”,
“Not a big fan of the pseudo-english syntax. Way too clever and gimmicky.”
At this point I was starting to feel a *little* discouraged… This was just something I wrote in a couple of hours and gave away for free!
The comment that really made my stomach churn was one from a particularly well known member of the Python community:
“And another library with global state 🙁 … Such an API should not even exist. It sets a bad example.”
Ouch, that hurt. I really looked up to that person and had used some of their libraries in other projects…
It was almost like my worst fears were now playing out in front of me!
I’d never be able to get another job as a Python developer after this…
At the time I didn’t see the positive and supportive comments in that discussion thread. I didn’t see the almost 70 upvotes. I didn’t see the valuable lessons hidden in the seemingly rude comments. I dwelled on the negative and felt terrible and depressed that whole day.
So how do you think this story ends?
Did I delete the Schedule repo, switch careers and never look at Reddit again?
Schedule now has almost 3,000 stars on GitHub and is among the top 70 Python repositories (out of more than 215,000). When PyPI’s download statistics were still working I saw that it got several thousand downloads per month. I get emails every week from people asking questions about it or thanking me for writing it…
Isn’t that crazy!? How’s that possible after all of these disheartening comments?
My answer is “I don’t know”—and I also don’t think that Schedule is a particularly great library that deserves all this attention, by the way.
But, it seems to solve a problem for some people. It also seems to have a polarizing effect on developers who see it—some love it, some hate it.
Today I’m glad I shipped Schedule that night.
Glad because it was helpful to so many people over the years and glad because it helped me develop a thicker skin when it comes to sharing and launching things publicly.
I’m writing you this meandering email because not very long ago I found this comment buried in my Reddit message history:
As someone who has posted a number of projects and blog posts in r/Python, just wanted to drop you a line and encourage that you don’t let the comments in your thread get you down. You see all those upvotes? Those are people that like your library, but don’t really have a comment to make in the thread proper. My biggest issue with /r/Python is that it tends towards cynicism and sometimes cruelty rather than encouragement and constructive criticism.
Keep up the great work,
Wow! What a positive and encouraging comment!
Back when I felt discouraged by all of these negative comments I must’ve missed it. But reading it a few years later made me re-live that whole situation and it showed me how much I’d grown as a developer and as a person in the meantime.
If you find yourself in a similar situation, maybe feeling bogged down by the developer community, which can be unfiltered and pretty rude sometimes, don’t get discouraged.
Even if some people don’t like what you did there can be thousands who love your work.
It’s a big pond, and sometimes the best ideas are polarizing.
The only way to find out is to ship, ship, ship.
— Dan Bader (RealPython.com)
PRAW, an acronym for “Python Reddit API Wrapper”, is a Python package that allows for simple access to reddit’s API.
Some of the lessons that stood out most for me:
- Think of SSDs as cheap RAM, not expensive disk. When reddit moved from spinning disks to SSDs for the database the number of servers was reduced from 12 to 1 with a ton of headroom. SSDs are 4x more expensive but you get 16x the performance. Worth the cost.
- Give users a little bit of power, see what they do with it, and turn the good stuff into features. One of the biggest revelations for me was how much reddit learns from its users and how much it relies on users to make the site run smoothly. Users are going to tell you a lot of things you don’t know. For example, reddit gold started as a joke in the community. They made it a product and users love it.
- It’s not necessary to build a scalable architecture from the start. You don’t know what your feature set will be when you start out, so you won’t know what your scaling problems will be. Wait until your site grows so you can learn where your scaling problems are going to be.
- Treat non-logged-in users as second-class citizens. By always giving logged-out users cached content, Akamai bears the brunt of reddit’s traffic. Huge performance improvement.
- As of 2012 they had 240 servers supporting 2 billion pageviews a month and 2TB of data in Postgres. All high-traffic data was moved off of EBS and onto local ephemeral disks.
- Did not account for increased latency after moving to EC2. In the datacenter they had submillisecond access between machines, so it was possible to make 1,000 calls to memcache for one page load. Not so on EC2. Memcache access times increased 10x to a millisecond, which made their old approach unusable. The fix was to batch calls to memcache so that a large number of gets go out in one request.
- Did not expire data. At reddit you can see comments all the way back to the beginning of time. This causes data to grow and grow over time, which makes it more and more difficult to keep the hot data in the database. They’ve started to put limits in place so you can’t vote on old comments or add comments to old threads.
- Put a limit on everything. For everything that can happen repeatedly, set a high limit and raise or lower it as needed. Block users if the limit is passed. This protects the service. One example is uploading logo files for subreddits. Users figured out they could upload really big files and harm the system. Don’t accept huge text blobs either. Someone will figure out how to send you 5GB of text.
- Put everything into a queue. Votes, comments, thumbnail creation, precomputed queries, spam processing and corrections. Queues allow you to know when there’s a problem by monitoring queue lengths. A side benefit is that queues hide problems from users: things like vote requests sit in the queue, and if they aren’t applied immediately nobody notices.
- Be active in your own community. Reddit users love that reddit admins are actually on the site and interacting with them.
- Let users do the work for you. On a site with user input one problem is always cheating, spam, and fraud. Most of the work of moderation is done by thousands of volunteers who take care of most of the spam problem. It works amazingly well and is one of the reasons the reddit team can remain small.
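The memcache batching fix mentioned in the list above is easy to picture with a toy model. The following is a rough sketch, not reddit's code: `FakeCache` is a hypothetical in-memory stand-in that only counts round trips, and the function names and the batch size of 250 are assumptions chosen for illustration. Many real memcached clients offer a comparable multi-get operation.

```python
class FakeCache:
    """In-memory stand-in for a remote cache that counts network round trips."""

    def __init__(self, data, latency_ms=1.0):
        self.data = dict(data)
        self.latency_ms = latency_ms   # assumed cost of one round trip
        self.round_trips = 0

    def get(self, key):
        """Fetch a single key: one round trip per call."""
        self.round_trips += 1
        return self.data.get(key)

    def get_many(self, keys):
        """Fetch several keys at once: one round trip for the whole batch."""
        self.round_trips += 1
        return {k: self.data[k] for k in keys if k in self.data}

    def simulated_wait_ms(self):
        """Total time spent waiting on the network under this toy model."""
        return self.round_trips * self.latency_ms


def load_one_by_one(cache, keys):
    """The old approach: one cache call per key."""
    result = {}
    for key in keys:
        value = cache.get(key)
        if value is not None:
            result[key] = value
    return result


def load_batched(cache, keys, batch_size=250):
    """The fix: group keys so each request carries many gets."""
    result = {}
    for start in range(0, len(keys), batch_size):
        result.update(cache.get_many(keys[start:start + batch_size]))
    return result
```

Under this model, loading 1,000 keys one at a time with 1 ms per round trip costs about a second of waiting, while four batches of 250 cost about 4 ms and return the same data.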