Goodbye, Clean Code

It’s a Phase

Obsessing over “clean code” and removing duplication is a phase many of us go through. When we don’t feel confident in our code, it is tempting to attach our sense of self-worth and professional pride to something that can be measured: a set of strict lint rules, a naming schema, a file structure, a lack of duplication.

You can’t automate removing duplication, but it does get easier with practice. You can usually tell whether there’s less or more of it after every change. As a result, removing duplication feels like improving some objective metric about the code. Worse, it messes with people’s sense of identity: “I’m the kind of person who writes clean code”. It’s as powerful as any sort of self-deception.

Once we learn how to create abstractions, it is tempting to get high on that ability, and pull abstractions out of thin air whenever we see repetitive code. After a few years of coding, we see repetition everywhere — and abstracting is our new superpower. If someone tells us that abstraction is a virtue, we’ll eat it up. And we’ll start judging other people for not worshipping “cleanliness”.

I see now that my “refactoring” was a disaster in two ways:

Firstly, I didn’t talk to the person who wrote it. I rewrote the code and checked it in without their input. Even if it was an improvement (which I don’t believe anymore), this is a terrible way to go about it. A healthy engineering team is constantly building trust. Rewriting your teammate’s code without a discussion is a huge blow to your ability to effectively collaborate on a codebase together.

Secondly, nothing is free. My code traded the ability to respond to changing requirements for reduced duplication, and it was not a good trade. For example, we later needed many special cases and behaviors for different handles on different shapes. My abstraction would have to become several times more convoluted to afford that, whereas with the original “messy” version such changes remained a piece of cake.

Am I saying that you should write “dirty” code? No. I suggest thinking deeply about what you mean when you say “clean” or “dirty”. Do you get a feeling of revulsion? Righteousness? Beauty? Elegance? How sure are you that you can name the concrete engineering outcomes corresponding to those qualities? How exactly do they affect the way the code is written and modified?

I sure didn’t think deeply about any of those things. I thought a lot about how the code looked — but not about how it evolved with a team of squishy humans.

Coding is a journey. Think how far you’ve come from your first line of code to where you are now. I reckon it was a joy to see for the first time how extracting a function or refactoring a class can make convoluted code simple. If you find pride in your craft, it is tempting to pursue cleanliness in code. Do it for a while.

But don’t stop there. Don’t be a clean code zealot. Clean code is not a goal. It’s an attempt to make some sense out of the immense complexity of systems we’re dealing with. It’s a defense mechanism for when you’re not yet sure how a change would affect the codebase but need guidance in a sea of unknowns.

Let clean code guide you. Then let it go.

pyronos: Simple and sweet load testing module.

Creates a “results” folder in the current directory to store all of the load test results.

# Simple usage (http://example.com stands in for the target URL;
# "simple" is the figure type documented under -f below).
pyronos http://example.com get 25 -f simple

# Send a HEAD request.
pyronos http://example.com head 25 -f simple

# Dump logs.
pyronos http://example.com get 25 -f simple -d

# Send requests sequentially.
pyronos http://example.com get 25 -f simple -s

# Print progress of sequential requests.
pyronos http://example.com get 25 -f simple -s -p

$ pyronos -h
usage: pyronos [-h] [-f {simple,stem,step}] [-o {csv,json,yml}] [-s] [-p] [-d]
               [-v]
               url {get,head,options,delete,post,put} num_of_reqs

Simple and sweet load testing module.

positional arguments:
  url                   url of website
  {get,head,options,delete,post,put}
                        http method
  num_of_reqs           number of requests

optional arguments:
  -h, --help            show this help message and exit
  -f {simple,stem,step}, --figure {simple,stem,step}
                        type of figure
  -o {csv,json,yml}, --output {csv,json,yml}
                        type of output
  -s, --sequential      sequential requests
  -p, --print-progress  print progress
  -d, --dump-logs       dump logs
  -v, --version         show program's version number and exit
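
The figure and output flags from the help text combine with the examples above; for instance (same placeholder URL, values picked purely for illustration):

# Sequential requests with a step figure and JSON output.
pyronos http://example.com get 100 -f step -o json -s -p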

Envelopes: Mailing for human beings.

Envelopes is a wrapper for Python’s email and smtplib modules. It aims to make working with outgoing e-mail in Python simple and fun.


from envelopes import Envelope, GMailSMTP

envelope = Envelope(
    from_addr=(u'from@example.com', u'From Example'),
    to_addr=(u'to@example.com', u'To Example'),
    subject=u'Envelopes demo',
    text_body=u"I'm a helicopter!"
)
envelope.add_attachment('/Users/bilbo/Pictures/helicopter.jpg')

# Send the envelope using an ad-hoc connection...
envelope.send('smtp.googlemail.com', login='from@example.com',
              password='password', tls=True)

# Or send the envelope using a shared GMail connection...
gmail = GMailSMTP('from@example.com', 'password')
gmail.send(envelope)

Make Scrapy follow links and collect data

# -*- coding: utf-8 -*-
import scrapy


class DmozItem(scrapy.Item):
    # Define the fields collected for each listing.
    link = scrapy.Field()
    attr = scrapy.Field()


class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["craigslist.org"]
    start_urls = [
        "http://chicago.craigslist.org/search/emd?",
    ]

    def parse(self, response):
        # Extract the listing links and follow each one.
        links = response.xpath('//a[@class="hdrlnk"]/@href').extract()
        for link in links:
            # response.urljoin resolves relative hrefs, avoiding the
            # double slashes a hand-concatenated base URL can produce.
            yield scrapy.Request(response.urljoin(link), callback=self.parse_attr)

    def parse_attr(self, response):
        # Collect the data from each followed page.
        item = DmozItem()
        item["link"] = response.url
        item["attr"] = "".join(response.xpath("//p[@class='attrgroup']//text()").extract())
        return item
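
To try the spider without a full Scrapy project, the runspider command works on a single file (the file name is just an assumption about where the code above was saved):

scrapy runspider dmoz_spider.py -o items.json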


Use Google’s Cache to crawl sites

Some websites implement certain measures to prevent bots from crawling them, with varying degrees of sophistication. Getting around those measures can be difficult and tricky, and may sometimes require special infrastructure. Please consider contacting commercial support if in doubt.

Here are some tips to keep in mind when dealing with these kinds of sites (a settings sketch follows the list):

  • rotate your user agent from a pool of well-known ones from browsers (google around to get a list of them)
  • disable cookies (see COOKIES_ENABLED) as some sites may use cookies to spot bot behaviour
  • use download delays (2 or higher). See DOWNLOAD_DELAY setting.
  • if possible, use Google cache to fetch pages, instead of hitting the sites directly
  • use a pool of rotating IPs. For example, the free Tor project or paid services like ProxyMesh. An open source alternative is scrapoxy, a super proxy that you can attach your own proxies to.
  • use a highly distributed downloader that circumvents bans internally, so you can just focus on parsing clean pages. One example of such downloaders is Crawlera
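
A minimal sketch of how the cookie, delay, and user-agent tips translate into a Scrapy project; the middleware class, the user-agent pool, and the "myproject" module path are illustrative assumptions, not part of Scrapy itself:

# settings.py
COOKIES_ENABLED = False   # some sites use cookies to spot bot behaviour
DOWNLOAD_DELAY = 2        # wait at least 2 seconds between requests
DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.RotateUserAgentMiddleware": 400,
}

# middlewares.py
import random

# Illustrative pool; substitute a list of real, current browser user agents.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
]

class RotateUserAgentMiddleware:
    # Pick a random user agent for every outgoing request.
    def process_request(self, request, spider):
        request.headers["User-Agent"] = random.choice(USER_AGENTS)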

Get URL query params with jQuery & JS

// Assuming "?order=1&total=30"

While URLSearchParams is ideal, not all browsers support that API. There’s a polyfill available, but if you want a tiny function for basic query-string parsing, the following one, borrowed from the A-Frame VR toolkit, parses the query string to get the value of the key you’d like:

function getUrlParameter(name) {
    // Escape [ and ] so the name is safe to embed in a RegExp.
    name = name.replace(/[\[]/, '\\[').replace(/[\]]/, '\\]');
    var regex = new RegExp('[\\?&]' + name + '=([^&#]*)');
    var results = regex.exec(location.search);
    // Decode the value, turning '+' back into spaces; '' if absent.
    return results === null ? '' : decodeURIComponent(results[1].replace(/\+/g, ' '));
}

With the function above, you can get individual parameter values:

getUrlParameter('order'); // "1"
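
Where URLSearchParams is available, the same lookup needs no helper at all:

// Native parsing; assumes the same "?order=1&total=30" query string.
var params = new URLSearchParams(location.search);
params.get('order'); // "1"
params.get('total'); // "30"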

Converting Unicode in Python 3: from Character Code to Decimal

Given the Code column in the Wikipedia List of Unicode Characters:


Example 1: The Latin capital letter A

Code Glyph Decimal Description
U+0041 A 65 Latin Capital Letter A

Python prompt:

>>> code = '0041'
>>> decimal = int(code, 16)
>>> decimal
65
>>> chr(decimal)
'A'

Example 2: The Cent character

Code Glyph Decimal Html Description
U+00A2 ¢ 162 &cent; Cent sign

Python prompt:

>>> code = '00A2'
>>> decimal = int(code, 16)
>>> decimal
162
>>> chr(decimal)
'¢'

Example 3: The Greek Sigma character

Code Glyph Decimal Description
U+03A3 Σ 931 Greek Capital Letter Sigma

Python prompt:

>>> code = '03A3'
>>> decimal = int(code, 16)
>>> decimal
931
>>> chr(decimal)
'Σ'

Example 4: Soccer Ball

Code Glyph Decimal Description
U+26BD ⚽ 9917 Soccer ball

Python prompt:

>>> code = '26BD'
>>> decimal = int(code, 16)
>>> decimal
9917
>>> chr(decimal)
'⚽'

Note: The Soccer ball did not display correctly in my Windows Shell, but rendered properly when I copied it into a Chrome WordPress textarea.


Example 5: Emoticons

Code Glyph Decimal Description
U+1F60E 😎 128526 Smiling face with sunglasses

>>> code = '1F60E'
>>> decimal = int(code, 16)
>>> decimal
128526
>>> chr(decimal)
'😎'
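
All five examples follow the same recipe, so a small helper captures it in both directions; the function names below are mine, not from any library:

def from_codepoint(code):
    # "U+26BD" or "26BD" -> the corresponding character.
    code = code.upper()
    if code.startswith("U+"):
        code = code[2:]
    return chr(int(code, 16))

def to_codepoint(char):
    # The reverse direction: a character -> "U+XXXX" notation.
    return "U+{:04X}".format(ord(char))

print(from_codepoint("U+0041"))       # A
print(to_codepoint("\N{CENT SIGN}"))  # U+00A2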