Welcome Wagon: Classifying Comments on Stack Overflow

I (Jason) wrote The Stack Overflow Comment Evaluator 5000™, a simple application that presents you with a comment thread from a post on Stack Overflow and asks you to rate each comment in the thread as Fine, Unwelcoming, or Abusive.

Prevalence of comment categories

If we take a majority vote on the rating of each comment (with ties going to the worse rating) comments on Stack Overflow break down like so…

Rating        % of comments
Fine          92.3%
Unwelcoming   7.4%
Abusive       0.3%
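For anyone who wants to try the same tally on their own ratings, here is a minimal sketch of majority voting with ties going to the worse rating. The names `SEVERITY`, `majority_rating`, and `breakdown`, and the shape of the input data, are assumptions made for illustration; this is not the code behind the Comment Evaluator.

```python
from collections import Counter

# Ratings ordered from best to worst; a higher number means a worse rating.
# (Hypothetical mapping, chosen for this sketch.)
SEVERITY = {"Fine": 0, "Unwelcoming": 1, "Abusive": 2}


def majority_rating(ratings):
    """Return the most common rating, breaking ties toward the worse one."""
    if not ratings:
        raise ValueError("need at least one rating")
    counts = Counter(ratings)
    # Compare by vote count first, then by severity, so a tie picks the worse rating.
    return max(counts, key=lambda r: (counts[r], SEVERITY[r]))


def breakdown(ratings_by_comment):
    """Map each rating to its share (in percent) of all comments."""
    winners = Counter(majority_rating(r) for r in ratings_by_comment.values())
    total = sum(winners.values())
    return {rating: 100 * winners.get(rating, 0) / total for rating in SEVERITY}
```

A comment rated once as Fine and once as Unwelcoming counts as Unwelcoming under this rule, which is why tie-breaking toward the worse rating nudges the Unwelcoming and Abusive shares upward compared with breaking ties the other way.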

 

According to those of us deeply involved here and familiar with Stack Overflow, about 7% of comments on Stack Overflow are unwelcoming. What did some unwelcoming comments look like? These combine elements of real comments to show typical examples.

  • “This is becoming a waste of my time and you won’t listen to my advice. What are the supposed benefits of making it so much more complex?”
  • “Step 1. Do not clutter the namespace. Then get back to us.”
  • “The code you posted cannot yield this result. Please post the real code if you hope to get any help.”
  • “This error is self explanatory. You need to check…”
  • “I have already told how you can… If you can’t make it work, you are doing something wrong.”

This stuff isn’t profanity, hate speech, or outright abuse, but it’s certainly unwelcoming. Looking at majority voting is one approach, but the experience of not being welcomed is not a majority-vote kind of thing; it’s deeply personal. What if we looked at the distributions of the ratings by individual?

 

 

.. Firstly the “unwelcoming” comments aren’t unwelcoming. They’re all valid criticisms, I imagine. You haven’t given us the context in which they’re said, which’d be extremely helpful here.

The abusive comments make up 1 in 250. Which is tiny. That’s such a small amount that I find it hard to believe you’re even worrying about it. It’s -genuinely- impressive that it’s so much lower than a great deal of other websites that seek to achieve the same thing as this one. You should be extremely proud of this. It’s never going to be perfect.

I love this site and I love what it does. It’s genuinely a great platform for doing what it does. But you should work on improving the parts of it that are lacking (like chat!) before you try to handle such a minutely small problem.

 

.. The unwelcoming comments are valid criticisms, that can be expressed in a much more welcoming manner. The problem is large enough that StackOverflow is stereotyped and memed as being unwelcoming. That warrants attention, in my book.

.. I think you’re coming at this from the wrong angle. You’re thinking “let’s calculate a metric, and if that metric is below X, we don’t have a problem. Everyone who thinks there is a significant problem is wrong.”

Whereas I think the reality is: a huge number of people think there is a problem. Women in particular, who don’t contribute because they think it’s an unwelcoming place. So I’d reframe it as “let’s calculate a metric X, and now we know that at level X, we have a problem. We don’t yet know at what level we don’t have a problem anymore, but it must be less than X.”

 

.. SO claims to be a Q/A platform for professional and enthusiast programmers, and for questions about programming that are tightly focused on a specific problem. So questions must state that specific problem exactly. This is often not something a beginner can do. So in my opinion SO is not a Q/A platform for beginners in programming. So this user group will always feel unwelcome here, simply because it is. Maybe SO should additionally provide a special Q/A portal for beginners?

 

.. Maybe my personal bias is showing, but these 5 example “unwelcoming” comments don’t look like that to me at all. I’ve definitely gotten way harsher ones on my posts in the past and at no point did I feel them to be unwelcoming whatsoever.

And leaving these up so readers can identify the users leaving them, or even completely citing them so they can be easily found in SEDE, is a privacy violation in my opinion, and nobody deserves to be put in a hall of shame like that, especially not on stackexchange, where the standing policy has been to allow people to correct their misgivings in private. Disappointing.

 

.. As was stated above, those examples of unwelcoming comments weren’t actual comments pulled from SO; they were pieced together from bits and pieces of unwelcoming comments. You can’t identify the writers of the original comments, as the examples weren’t written by a user. There is no privacy violation. No one has been put in a hall of shame.

Moderate GitHub Pages comments

Comm(ent|it) uses the GitHub API and Jekyll to help you store visitors’ comments directly in your repository.

Moderate with Git

 

By default, each comment generates a commit on a feature branch. A pull request is sent to merge the branch. To moderate comments before merging, just use $ git rebase -i master and remove the unwanted comm(ent|it)s.