Welcome Wagon: Classifying Comments on Stack Overflow

I (Jason) wrote The Stack Overflow Comment Evaluator 5000™, a simple application that presents you with a comment thread from a post on Stack Overflow and asks you to rate each comment in the thread as Fine, Unwelcoming, or Abusive.

Prevalence of comment categories

If we take a majority vote on the rating of each comment (with ties going to the worse rating), comments on Stack Overflow break down like so…

Rating         % of comments
Fine           92.3%
Unwelcoming    7.4%
Abusive        0.3%
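The majority-vote rule described above, with ties going to the worse rating, could be sketched roughly as follows. This is only an illustration, not the code behind the application: the function name `majority_rating` and the `SEVERITY` ordering (Fine < Unwelcoming < Abusive) are assumptions made for the example.

```python
from collections import Counter

# Assumed severity order for the three labels in the post: higher is worse.
SEVERITY = {"Fine": 0, "Unwelcoming": 1, "Abusive": 2}


def majority_rating(ratings):
    """Return the most common rating; ties go to the worse (more severe) label."""
    if not ratings:
        raise ValueError("need at least one rating")
    counts = Counter(ratings)
    # Compare by vote count first, then by severity, so a tie picks the worse label.
    return max(counts, key=lambda label: (counts[label], SEVERITY[label]))


# Hypothetical ratings from three raters for a single comment.
print(majority_rating(["Fine", "Fine", "Unwelcoming"]))
# A one-to-one split is a tie, so the worse rating is used.
print(majority_rating(["Fine", "Unwelcoming"]))
```

Under this sketch, a comment rated Fine by two raters and Unwelcoming by one counts as Fine, while an even split between Fine and Unwelcoming counts as Unwelcoming.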

 

According to those of us deeply involved here and familiar with Stack Overflow, about 7% of comments on Stack Overflow are unwelcoming. What did some unwelcoming comments look like? These combine elements of real comments to show typical examples.

  • “This is becoming a waste of my time and you won’t listen to my advice. What are the supposed benefits of making it so much more complex?”
  • “Step 1. Do not clutter the namespace. Then get back to us.”
  • “The code you posted cannot yield this result. Please post the real code if you hope to get any help.”
  • “This error is self explanatory. You need to check…”
  • “I have already told how you can… If you can’t make it work, you are doing something wrong.”

This stuff isn’t profanity, hate speech, or outright abuse, but it’s certainly unwelcoming. Looking at majority voting is one approach, but the experience of feeling unwelcome is not a majority vote kind of thing; it’s deeply personal. What if we looked at the distributions of the ratings by individual?

 

 

.. Firstly the “unwelcoming” comments aren’t unwelcoming. They’re all valid criticisms, I imagine. You haven’t given us the context in which they’re said, which’d be extremely helpful here.

The abusive comments make up 1 in 250. Which is tiny. That’s such a small amount that I find it hard to believe you’re even worrying about it. It’s -genuinely- impressive that it’s so much lower than a great deal of other websites that seek to achieve the same thing as this one. You should be extremely proud of this. It’s never going to be perfect.

I love this site and I love what it does. It’s genuinely a great platform for doing what it does. But you should work on improving the parts of it that are lacking (like chat!) before you try to handle such a minutely small problem.

 

.. The unwelcoming comments are valid criticisms that can be expressed in a much more welcoming manner. The problem is large enough that Stack Overflow is stereotyped and memed as being unwelcoming. That warrants attention, in my book.

.. I think you’re coming at this from the wrong angle. You’re thinking “let’s calculate a metric, and if that metric is below X, we don’t have a problem. Everyone who thinks there is a significant problem is wrong.”

Whereas I think the reality is: a huge number of people think there is a problem, women in particular, who don’t contribute because they think it’s an unwelcoming place. So I’d reframe it as “let’s calculate a metric X, and now we know that at level X, we have a problem. We don’t yet know at what level we don’t have a problem anymore, but it must be less than X.”

 

.. SO claims to be a Q/A platform for professional and enthusiast programmers and for questions about programming that are tightly focused on a specific problem. So the questions must state that specific problem exactly. This is often something a beginner cannot do. So in my opinion SO is not a Q/A platform for beginners in programming. So this user group will always feel that SO is not welcoming, simply because it is not. Maybe SO should additionally provide a special Q/A portal for beginners?

 

.. Maybe my personal bias is showing, but these 5 example “unwelcoming” comments don’t look like that to me at all. I’ve definitely gotten way harsher ones on my posts in the past and at no point did I feel them to be unwelcoming whatsoever.

And leaving these up so readers can identify the users leaving them, or even completely citing them so they can be easily found in SEDE, is a privacy violation in my opinion, and nobody deserves to be put in a hall of shame like that, especially not on Stack Exchange, where the standing policy has been to allow people to correct their misgivings in private. Disappointing.

 

.. As was stated above, those examples of unwelcoming comments weren’t actual comments pulled from SO; they were pieced together from little bits and pieces of unwelcoming comments. You can’t identify the writers of the original comments, as the examples weren’t written by any one user. There is no privacy violation. No one has been put in a hall of shame.

Stack Overflow: Code of Conduct

This Code of Conduct helps us build a community that is rooted in kindness, collaboration, and mutual respect.

Unacceptable Behavior

No subtle put-downs or unfriendly language. Even if you don’t intend it, this can have a negative impact on others.

Unfriendly: “You could Google this in 5 seconds.”
Friendly: “This is called Invariance and Covariance. If you Google it, you’ll find tutorials that can explain it much better than we can in an answer here.”

Unfriendly: “If you bothered to read my question, you’d know it’s not a duplicate.”
Friendly: “I don’t think this is a duplicate. My question is about cement board, while the question you linked is about drywall.”

Unfriendly: “Are you speaking English? If so, I can’t tell.”
Friendly: “I’m having trouble understanding your question. I think you’re asking how to add a swap after system installation. Is that correct?”

Unfriendly: “I came to get help, not to get my question edited.”
Friendly: “Thanks for trying to help, but your edit isn’t what I meant. I’ve removed your edit, and have updated my question so it’s clearer.”

No name-calling or personal attacks. Focus on the content, not the person. This includes terms that feel personal even when they’re applied to content (e.g. “lazy”).

Stack Overflow Culture

I was in the beta of SO. I almost never interact with it anymore.

Asking a question on SO is a last resort to me, and I get a horrid sinking feeling in my gut when I feel forced to do so. The people[1] who are still active on it seem to be people who thrive on pedantry and whose goal is to find any potential flaw in your question and feel smart for pointing it out.

You begin to realise no one is actually reading your question in good faith, so you start getting defensive: filling your questions with disclaimers about how your example code is just an example[2], how you know there are other ways you could do it but you’re constrained toward this direction for various reasons[3], and so on and so forth, until you feel like you spend more time defensively shoring up your question from attacks than actually constructing the question in the first place[4].

I still read SO, but as someone who was around before it existed I don’t really feel like the quality of answers is any higher than the random forum posts of yore, it’s just that they’re all under the same URL now, and the same user interface.

Which I suppose is something.

[1] Not all people™, but definitely the general feeling tends this direction

[2] classic situation: you simplify your code to Foo and Bar levels to show the problem cleanly, so people chastise you for having a complex data structure / worrying about performance / whatever for such simple code

[3] e.g., “How do I achieve X” gets turned into people saying “Why would you want to achieve X, that’s stupid”

[4] This is not the same as researching the issue and trying as many things as you can think of, which is definitely helpful in any context of question asking