w3lib.url.canonicalize_url

w3lib.url.canonicalize_url(urlkeep_blank_values=Truekeep_fragments=Falseencoding=None)[source]

Canonicalize the given url by applying the following procedures:

  • sort query arguments, first by key, then by value
  • percent encode paths ; non-ASCII characters are percent-encoded using UTF-8 (RFC-3986)
  • percent encode query arguments ; non-ASCII characters are percent-encoded using passed encoding (UTF-8 by default)
  • normalize all spaces (in query arguments) ‘+’ (plus symbol)
  • normalize percent encodings case (%2f -> %2F)
  • remove query arguments with blank values (unless keep_blank_values is True)
  • remove fragments (unless keep_fragments is True)

The url passed can be bytes or unicode, while the url returned is always a native str (bytes in Python 2, unicode in Python 3).

>>> import w3lib.url
>>>
>>> # sorting query arguments
>>> w3lib.url.canonicalize_url('http://www.example.com/do?c=3&b=5&b=2&a=50')
'http://www.example.com/do?a=50&b=2&b=5&c=3'
>>>
>>> # UTF-8 conversion + percent-encoding of non-ASCII characters
>>> w3lib.url.canonicalize_url(u'http://www.example.com/r\u00e9sum\u00e9')
'http://www.example.com/r%C3%A9sum%C3%A9'
>>>

 

software experts Django vs Flask vs Pyramid: Choosing a Python Web Framework

Pyramid is the most flexible of the three. It can be used for small apps as we’ve seen here, but it also powers big-name sites like Dropbox. Open Source communities like Fedora choose it for applications like their community badges system, which receives information about events from many of the project’s tools to award achievement-style badges to users. One of the most common complaints about Pyramid is that it presents so many options it can be intimidating to start a new project.