How do I gzip compress a string in Python?

Python3 version of Sven Marnach’s 2011 answer:

import gzip
exampleString = 'abcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijmortenpunnerudengelstadrocksklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrs
tuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuvabcdefghijklmnopqrstuv123'
compressed_value = gzip.compress(bytes(exampleString, 'utf-8'))
# gzip.decompress() returns bytes, so decode to get a str back
plain_string_again = gzip.decompress(compressed_value).decode('utf-8')
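
A minimal round-trip sketch of the same idea, using only the standard library. The helper names `gzip_string` and `gunzip_string` are made up for illustration and are not part of the `gzip` module; UTF-8 is assumed as the text encoding.

```python
import gzip


def gzip_string(text, encoding="utf-8"):
    """Encode text to bytes, then gzip-compress the bytes."""
    return gzip.compress(text.encode(encoding))


def gunzip_string(data, encoding="utf-8"):
    """Decompress gzip bytes, then decode them back to text."""
    return gzip.decompress(data).decode(encoding)


sample = "abcdefghijklmnopqrstuv" * 100  # repetitive input compresses well
compressed = gzip_string(sample)
restored = gunzip_string(compressed)

print(len(sample), len(compressed))  # original size vs. compressed size
print(restored == sample)            # the round trip should be lossless
```

The compressed value is `bytes`, not `str`, so it has to be decoded after decompression before it can be compared with the original string.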

pyronos: Simple and sweet load testing module.

Creates a “results” folder in the current directory to store the output of every load test.

# Simple usage.
pyronos  get 25 simple

# Send head request.
pyronos  head 25 simple

# Dump logs.
pyronos  get 25 simple -d

# Send requests sequentially.
pyronos  get 25 simple -s

# Print progress of sequential requests.
pyronos  get 25 simple -s -p

$ pyronos -h
usage: pyronos [-h] [-f {simple,stem,step}] [-o {csv,json,yml}] [-s] [-p] [-d]
               url {get,head,options,delete,post,put} num_of_reqs

Simple and sweet load testing module.

positional arguments:
  url                   url of website
  {get,head,options,delete,post,put}
                        http method
  num_of_reqs           number of requests

optional arguments:
  -h, --help            show this help message and exit
  -f {simple,stem,step}, --figure {simple,stem,step}
                        type of figure
  -o {csv,json,yml}, --output {csv,json,yml}
                        type of output
  -s, --sequential      sequential requests
  -p, --print-progress  print progress
  -d, --dump-logs       dump logs
  -v, --version         show program's version number and exit

Python Envelopes: Mailing for human beings.

Envelopes is a wrapper for Python’s email and smtplib modules. It aims to make working with outgoing e-mail in Python simple and fun.


from envelopes import Envelope, GMailSMTP

envelope = Envelope(
    from_addr=(u'', u'From Example'),
    to_addr=(u'', u'To Example'),
    subject=u'Envelopes demo',
    text_body=u"I'm a helicopter!"
)

# Send the envelope using an ad-hoc connection...
envelope.send('', login='',
              password='password', tls=True)

# Or send the envelope using a shared GMail connection...
gmail = GMailSMTP('', 'password')
gmail.send(envelope)

Converting Unicode in Python 3: from Character Code to Decimal

Given the Control Code column in the Wikipedia List of Unicode Characters:


Example 1: The Latin capital letter A

Code Glyph Decimal Description
U+0041 A 65 Latin Capital letter A

Python Prompt:

>>> code = '0041'
>>> decimal = int(code, 16)
>>> decimal
65
>>> chr(decimal)
'A'

Example 2: The Cent character

Code Glyph Decimal Html Description
U+00A2 ¢ 0162 &cent; Cent sign

Python Prompt:

>>> code = '00A2'
>>> decimal = int(code, 16)
>>> decimal
162
>>> chr(decimal)
'¢'

Example 3: The Greek Sigma character

Code Glyph Decimal Description
U+03A3 Σ 931 Greek Capital Letter Sigma

Python Prompt:

>>> code = '03A3'
>>> decimal = int(code, 16)
>>> decimal
931
>>> chr(decimal)
'Σ'

Example 4: Soccer Ball

Code Glyph Decimal Description
U+26BD ⚽ 9917 Soccer Ball

Python Prompt:

>>> code = '26BD'
>>> decimal = int(code, 16)
>>> decimal
9917
>>> chr(decimal)
'⚽'

Note: The Soccer ball did not display correctly in my Windows Shell, but rendered properly when I copied it into a Chrome WordPress textarea.


Example 5: Emoticons

U+1F60E 😎 smiling face with sunglasses

>>> code = '1F60E'
>>> decimal = int(code, 16)
>>> decimal
128526
>>> chr(decimal)
'😎'
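
The five examples above all follow the same two steps, which can be wrapped in a small helper. This is only a sketch: the function name `describe_code_point` is invented here, and it assumes the input is a hexadecimal code with an optional `U+` prefix.

```python
def describe_code_point(hex_code):
    """Return (decimal value, character) for a code such as '00A2' or 'U+00A2'."""
    digits = hex_code.upper().replace("U+", "")  # tolerate the optional prefix
    decimal = int(digits, 16)                    # hexadecimal string -> integer
    return decimal, chr(decimal)                 # integer -> one-character string


for code in ("0041", "00A2", "03A3", "26BD", "1F60E"):
    decimal, character = describe_code_point(code)
    print(code, decimal, character)
```

As the note on the soccer ball suggests, whether the printed glyph displays correctly depends on the terminal and its font, not on Python.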

Unicode & Character Encodings in Python: A Painless Guide


import unicodedata

>>> print(u"Test\u2014It")


>>> s = u"Test\u2014It"

>>> ord(s[4])
8212


>>> chr(732)
'˜'
>>> c = chr(732)
>>> ord(c)
732

escape_characters = set()

if ord(c) in escape_characters:




>> c = String.fromCharCode(732);
>> c.charCodeAt(0);
>> String.fromCharCode(0904)
>> c = String.fromCharCode(parseInt('2014', 16)); // 2014 is hexadecimal
>> c.charCodeAt(0);
>> c = String.fromCharCode(39);
>> c.charCodeAt(0);


var str = String.fromCharCode(e.which);
$('#charCodeAt')[0].value = str.charCodeAt(0);
$('#fromCharCode')[0].value = encodeURIComponent(str);

jQuery String Functions

  • charAt(n): Returns the character at the specified index in a string. The index starts from 0.
    var str = "JQUERY By Example";
    var n = str.charAt(2);
    // Output will be "U"
  • charCodeAt(n): Returns the UTF-16 code unit value of the character at the specified index in a string. The index starts from 0.
    var str = "HELLO WORLD";
    var n = str.charCodeAt(0);
    // Output will be 72

Mathias Bynens: JavaScript Has a Unicode Problem:

As my JavaScript escapes tool would tell you, the reason is the following:

>> 'ma\xF1ana' == 'man\u0303ana'
false

>> 'ma\xF1ana'.length
6

>> 'man\u0303ana'.length
7

The first string contains U+00F1 LATIN SMALL LETTER N WITH TILDE, while the second string uses two separate code points (U+006E LATIN SMALL LETTER N and U+0303 COMBINING TILDE) to create the same glyph. That explains why they’re not equal, and why they have a different length.

However, if we want to count the number of symbols in these strings the same way a human being would, we’d expect the answer 6 for both strings, since that’s the number of visually distinguishable glyphs in each string. How can we make this happen?

In ECMAScript 6, the solution is fairly simple:

function countSymbolsPedantically(string) {
	// Unicode Normalization, NFC form, to account for lookalikes:
	var normalized = string.normalize('NFC');
	// Account for astral symbols / surrogates, just like we did before:
	return punycode.ucs2.decode(normalized).length;
}

The normalize method on String.prototype performs Unicode normalization, which accounts for these differences. If there is a single code point that represents the same glyph as another code point followed by a combining mark, it will normalize it to the single code point form.

>> countSymbolsPedantically('ma\xF1ana') // U+00F1
6
>> countSymbolsPedantically('man\u0303ana') // U+006E + U+0303
6

For backwards compatibility with ECMAScript 5 and older environments, a String.prototype.normalize polyfill can be used.
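
For comparison, the same normalization step can be sketched in Python with the standard `unicodedata` module. The function name `count_symbols` is invented for this sketch. Python's `len()` already counts code points rather than UTF-16 code units, so no surrogate handling is needed here; like the JavaScript version, it only helps when a precomposed form exists.

```python
import unicodedata

composed = "ma\u00f1ana"     # U+00F1 LATIN SMALL LETTER N WITH TILDE
decomposed = "man\u0303ana"  # U+006E followed by U+0303 COMBINING TILDE


def count_symbols(text):
    """Count code points after NFC normalization."""
    return len(unicodedata.normalize("NFC", text))


print(composed == decomposed)                              # False: different code points
print(len(composed), len(decomposed))                      # 6 7
print(count_symbols(composed), count_symbols(decomposed))  # 6 6
```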

Turning a code point into a symbol

String.fromCharCode allows you to create a string based on a Unicode code point. But it only works correctly for code points in the BMP range (i.e. from U+0000 to U+FFFF). If you use it with an astral code point, you’ll get an unexpected result.

>> String.fromCharCode(0x0041) // U+0041
'A' // U+0041

>> String.fromCharCode(0x1F4A9) // U+1F4A9
'\uF4A9' // U+F4A9, not U+1F4A9

The only workaround is to calculate the code points for the surrogate halves yourself, and pass them as separate arguments.

>> String.fromCharCode(0xD83D, 0xDCA9)
'💩' // U+1F4A9
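
The surrogate calculation mentioned above follows the standard UTF-16 rule: subtract 0x10000, then split the remaining 20 bits into two 10-bit halves. A short Python sketch of that arithmetic (the function name `to_surrogate_pair` is made up for illustration):

```python
def to_surrogate_pair(code_point):
    """Split an astral code point (U+10000..U+10FFFF) into UTF-16 surrogate halves."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not an astral code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)   # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits go into the low surrogate
    return high, low


high, low = to_surrogate_pair(0x1F4A9)
print(hex(high), hex(low))  # 0xd83d 0xdca9
```

These are the two values passed to String.fromCharCode in the example above.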

If you don’t want to go through the trouble of calculating the surrogate halves, you could resort to Punycode.js’s utility methods once again:

>> punycode.ucs2.encode([ 0x1F4A9 ])
'💩' // U+1F4A9

Luckily, ECMAScript 6 introduces String.fromCodePoint(codePoint) which does handle astral symbols correctly. It can be used for any Unicode code point, i.e. from U+000000 to U+10FFFF.

>> String.fromCodePoint(0x1F4A9)
'💩' // U+1F4A9

For backwards compatibility with ECMAScript 5 and older environments, use a String.fromCodePoint() polyfill.


Getting a code point out of a string

Similarly, if you use String.prototype.charCodeAt(position) to retrieve the code point of the first symbol in the string, you’ll get the code point of the first surrogate half instead of the code point of the pile of poo character.

>> '💩'.charCodeAt(0)
55357 // 0xD83D, the high surrogate

Luckily, ECMAScript 6 introduces String.prototype.codePointAt(position), which is like charCodeAt except it deals with full symbols instead of surrogate halves whenever possible.

>> '💩'.codePointAt(0)
128169 // 0x1F4A9

For backwards compatibility with ECMAScript 5 and older environments, use a String.prototype.codePointAt() polyfill.


Real-world bugs and how to avoid them

This behavior leads to many issues. Twitter, for example, allows 140 characters per tweet, and their back-end doesn’t mind what kind of symbol it is — astral or not. But because the JavaScript counter on their website at some point simply read out the string’s length without accounting for surrogate pairs, it wasn’t possible to enter more than 70 astral symbols. (The bug has since been fixed.)
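
The 70-versus-140 discrepancy can be reproduced in Python by counting UTF-16 code units, which is what a JavaScript string's length reports. This is a sketch only; `utf16_length` is a hypothetical helper name and has nothing to do with Twitter's actual code.

```python
def utf16_length(text):
    """Number of UTF-16 code units, which is what JavaScript's .length reports."""
    return len(text.encode("utf-16-le")) // 2  # 2 bytes per code unit, no BOM


tweet = "\U0001F4A9" * 70   # 70 astral symbols

print(len(tweet))           # 70 code points
print(utf16_length(tweet))  # 140 UTF-16 code units
```

A counter based on code units therefore hits the 140 limit after only 70 astral symbols.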

Many JavaScript libraries that deal with strings fail to account for astral symbols properly.


Introducing… The Pile of Poo Test™

Whenever you’re working on a piece of JavaScript code that deals with strings or regular expressions in some way, just add a unit test that contains a pile of poo (💩) in a string, and see if anything breaks. It’s a quick, fun, and easy way to see if your code supports astral symbols. Once you’ve found a Unicode-related bug in your code, all you need to do is apply the techniques discussed in this post to fix it.



Stack Overflow on String.fromCharCode():

inArray returns the index of the element in the array, not a boolean indicating if the item exists in the array. If the element was not found, -1 will be returned.

So, to check if an item is in the array, use:

if(jQuery.inArray("test", myarray) !== -1)
  • String.fromCodePoint(): not supported by Internet Explorer; supported in Safari from version 10.
  • String.fromCharCode(): supported since forever, and about twice as fast.
  • The difference:

    Although most common Unicode values can be represented with one 16-bit number (as expected early on during JavaScript standardization) and fromCharCode() can be used to return a single character for the most common values (i.e., UCS-2 values which are the subset of UTF-16 with the most common characters), in order to deal with ALL legal Unicode values (up to 21 bits), fromCharCode() alone is inadequate. Since the higher code point characters use two (lower value) “surrogate” numbers to form a single character, String.fromCodePoint() (part of the ES6 draft) can be used to return such a pair and thus adequately represent these higher valued characters.

Rate Limiting with Python and Redis (GitHub)


import time

from client import get_redis_client
from exceptions import RateLimitExceeded


def rate_per_second(count):
    def _rate_per_second(function):
        def __rate_per_second(*args, **kwargs):
            client = get_redis_client()
            # One counter key per one-second window
            key = f"rate-limit:{int(time.time())}"
            if int(client.incr(key)) > count:
                raise RateLimitExceeded
            if client.ttl(key) == -1:  # timeout is not set
                client.expire(key, 1)  # expire in 1 second
            return function(*args, **kwargs)
        return __rate_per_second
    return _rate_per_second


@rate_per_second(100)  # example: 100 requests per second
def my_function():
    pass  # do something


if __name__ == '__main__':
    success = fail = 0
    for i in range(2000):
        try:
            my_function()
            success += 1
        except RateLimitExceeded:
            fail += 1
        time.sleep(5/1000)  # sleep every 5 milliseconds
    print(f"Success count = {success}")
    print(f"Fail count = {fail}")
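
To try the same fixed-window idea without a Redis server, the counter can be kept in a plain dictionary. The sketch below is an assumption-laden stand-in, not the code from the GitHub repository: the `clock` parameter, the `fake_now` list, and the local `RateLimitExceeded` class are all invented here so the example runs on its own.

```python
import time
from collections import defaultdict


class RateLimitExceeded(Exception):
    """Raised when the per-second budget is used up."""


def rate_per_second(count, clock=time.time):
    # Stands in for the Redis keys; unlike Redis, old windows are never expired.
    counters = defaultdict(int)

    def decorator(function):
        def wrapper(*args, **kwargs):
            window = int(clock())   # current one-second window
            counters[window] += 1   # plays the role of INCR
            if counters[window] > count:
                raise RateLimitExceeded
            return function(*args, **kwargs)
        return wrapper
    return decorator


fake_now = [1000.0]  # hypothetical controllable clock for the demo


@rate_per_second(3, clock=lambda: fake_now[0])
def my_function():
    return "ok"


results = []
for _ in range(5):
    try:
        results.append(my_function())
    except RateLimitExceeded:
        results.append("limited")

fake_now[0] += 1  # move to the next window; the budget resets
results.append(my_function())
print(results)
```

The dictionary only works within a single process. Sharing the counter across processes or machines is the reason the original uses Redis.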