Thursday, 28 August


Montreal Python User Group: Montréal-Python 48: Incorrect jiujitsu - Call for speakers

 ∗ Planet Python

Pythonistas of Montreal, it's time for our back-to-school special. We are back from our summer vacation and are hosting our next meetup at the offices of our friends from Shopify on St-Laurent street on Tuesday, September 23rd at 6:30 pm.

We especially love to hear from new speakers. If you haven't given a talk at Montréal-Python before, a 5 or 10 minute lightning talk would be a great start, but we also have slots for 10- to 40-minute talks!

It's a perfect opportunity if you would like to show us what you've discovered and created, especially if you are planning to present your talk at PyCon.

Don't forget: the call for speakers for PyCon 2015 ends on September 15th.

Some topic suggestions:

  • Give a beginner's introduction to a Python library you've been using!
  • Talk about a project you're working on!
  • Show us unit testing, continuous integration or Python documentation tools!
  • Tell us about a Python performance problem you've run into and how you solved it!
  • The standard Python library is full of amazing things. Have you learned how multiprocessing or threading or GUI programming works recently? Tell us about it!
  • Explain how to get started with Django in 5 minutes!

We're always looking for 10- to 40-minute talks, or a quick 5-minute flash presentation. If you discovered or learned something that you find interesting, we'd love to help you let others learn about it! Send your proposals to




Wednesday, 27 August



Mike Driscoll: wxPython: Converting wx.DateTime to Python datetime

 ∗ Planet Python

The wxPython GUI toolkit includes its own date / time capabilities. Most of the time, you can just use Python’s datetime and time modules and you’ll be fine. But occasionally you’ll find yourself needing to convert from wxPython’s wx.DateTime objects to Python’s datetime objects. You may encounter this when you use the wx.DatePickerCtrl widget.

Fortunately, wxPython’s calendar module has some helper functions that can help you convert datetime objects back and forth between wxPython and Python. Let’s take a look:

def _pydate2wxdate(date):
     import datetime
     assert isinstance(date, (datetime.datetime, datetime.date))
     tt = date.timetuple()
     dmy = (tt[2], tt[1]-1, tt[0])
     return wx.DateTimeFromDMY(*dmy)

def _wxdate2pydate(date):
     import datetime
     assert isinstance(date, wx.DateTime)
     if date.IsValid():
          ymd = map(int, date.FormatISODate().split('-'))
          return datetime.date(*ymd)
     else:
          return None

You can use these handy functions in your own code to help with your conversions. I would probably put these into a controller or utilities script. I would also rewrite it slightly so I wouldn’t import Python’s datetime module inside the functions. Here’s an example:

import datetime
import wx

def pydate2wxdate(date):
     assert isinstance(date, (datetime.datetime, datetime.date))
     tt = date.timetuple()
     dmy = (tt[2], tt[1]-1, tt[0])
     return wx.DateTimeFromDMY(*dmy)

def wxdate2pydate(date):
     assert isinstance(date, wx.DateTime)
     if date.IsValid():
          ymd = map(int, date.FormatISODate().split('-'))
          return datetime.date(*ymd)
     else:
          return None
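The wx-specific calls aside, the parsing step inside wxdate2pydate can be exercised on its own: wx.DateTime.FormatISODate() returns a "YYYY-MM-DD" string, and that is all datetime.date needs. A minimal wx-free sketch (the ISO string here is a hard-coded stand-in for a real wx.DateTime):

```python
import datetime

# Stand-in for wx.DateTime.FormatISODate(), which returns "YYYY-MM-DD":
iso = "2014-08-27"

ymd = map(int, iso.split('-'))
d = datetime.date(*ymd)
print(d)  # 2014-08-27
```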

You can read more about this topic on this old wxPython mailing thread. Have fun and happy coding!



Nick Coghlan: The transition to multilingual programming

 ∗ Planet Python

A recent thread on python-dev prompted me to summarise the current state of the ongoing industry wide transition from bilingual to multilingual programming as it relates to Python's cross-platform support. It also relates to the reasons why Python 3 turned out to be more disruptive than the core development team initially expected.

A good starting point for anyone interested in exploring this topic further is the "Origin and development" section of the Wikipedia article on Unicode, but I'll hit the key points below.

Monolingual computing

At their core, computers only understand single bits. Everything above that is based on conventions that ascribe higher level meanings to particular sequences of bits. One particularly important set of conventions for communicating between humans and computers is "text encodings": conventions that map particular sequences of bits to text in the actual languages humans read and write.

One of the oldest encodings still in common use is ASCII (which stands for "American Standard Code for Information Interchange"), developed during the 1960's (it just had its 50th birthday in 2013). This encoding maps the letters of the English alphabet (in both upper and lower case), the decimal digits, various punctuation characters and some additional "control codes" to the 128 numbers that can be encoded as a 7-bit sequence.

Many computer systems today still only work correctly with English - when you encounter such a system, it's a fairly good bet that either the system itself, or something it depends on, is limited to working with ASCII text. (If you're really unlucky, you might even get to work with modal 5-bit encodings like ITA-2, as I have. The legacy of the telegraph lives on!)

Working with local languages

The first attempts at dealing with this limitation of ASCII simply assigned meanings to the full range of 8-bit sequences. Known collectively as "Extended ASCII", each of these systems allowed for an additional 128 characters, which was enough to handle many European and Cyrillic scripts. Even 256 characters was nowhere near sufficient to deal with Indic or East Asian languages, however, so this time also saw a proliferation of ASCII incompatible encodings like ShiftJIS, ISO-2022 and Big5. This is why Python ships with support for dozens of codecs from around the world.

This proliferation of encodings required a way to tell software which encoding should be used to read the data. For protocols that were originally designed for communication between computers, agreeing on a common text encoding is usually handled as part of the protocol. In cases where no encoding information is supplied (or to handle cases where there is a mismatch between the claimed encoding and the actual encoding), then applications may make use of "encoding detection" algorithms, like those provided by the chardet package for Python. These algorithms aren't perfect, but can give good answers when given a sufficient amount of data to work with.

Local operating system interfaces, however, are a different story. Not only do they not inherently convey encoding information, but the nature of the problem is such that trying to use encoding detection isn't practical. Two key systems arose in an attempt to deal with this problem:

  • Windows code pages
  • POSIX locale encodings

With both of these systems, a program would pick a code page or locale, and use the corresponding text encoding to decide how to interpret text for display to the user or combination with other text. This may include deciding how to display information about the contents of the computer itself (like listing the files in a directory).

The fundamental premise of these two systems is that the computer only needs to speak the language of its immediate users. So, while the computer is theoretically capable of communicating in any language, it can effectively only communicate with humans in one language at a time. All of the data a given application was working with would need to be in a consistent encoding, or the result would be uninterpretable nonsense, something the Japanese (and eventually everyone else) came to call mojibake.

It isn't a coincidence that the name for this concept came from an Asian country: the encoding problems encountered there make the issues encountered with European and Cyrillic languages look trivial by comparison.
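The mismatch is easy to reproduce with nothing but the standard library: decode Shift-JIS bytes under a wrong encoding assumption and you get mojibake rather than an error. A small sketch:

```python
text = "日本語"                    # "Japanese language"
data = text.encode("shift_jis")   # 6 bytes under Shift-JIS

# Decoding with the wrong assumed encoding fails silently, not loudly:
garbled = data.decode("latin-1")
print(garbled)                    # mojibake, no exception raised

# The data is only interpretable with the right encoding:
print(data.decode("shift_jis"))   # 日本語
```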

Unfortunately, this "bilingual computing" approach (so called because the computer could generally handle English in addition to the local language) causes some serious problems once you consider communicating between computers. While some of those problems were specific to network protocols, there are some more serious ones that arise when dealing with nominally "local" interfaces:

  • networked computing meant one username might be used across multiple systems, including different operating systems
  • network drives allow a single file server to be accessed from multiple clients, including different operating systems
  • portable media (like DVDs and USB keys) allow the same filesystem to be accessed from multiple devices at different points in time
  • data synchronisation services like Dropbox need to faithfully replicate a filesystem hierarchy not only across different desktop environments, but also to mobile devices

For these protocols that were originally designed only for local interoperability, communicating encoding information is generally difficult, and the encoding of the data doesn't necessarily match the claimed encoding of the platform you're running on.

Unicode and the rise of multilingual computing

The path to addressing the fundamental limitations of bilingual computing actually started more than 25 years ago, back in the late 1980's. An initial draft proposal for a 16-bit "universal encoding" was released in 1988, the Unicode Consortium was formed in early 1991 and the first volume of the first version of Unicode was published later that same year.

Microsoft added new text handling and operating system APIs to Windows based on the 16-bit C level wchar_t type, and Sun also adopted Unicode as part of the core design of Java's approach to handling text.

However, there was a problem. The original Unicode design had decided that "16 bits ought to be enough for anybody" by restricting their target to only modern scripts, and only frequently used characters within those scripts. However, when you look at the "rarely used" Kanji and Han characters for Japanese and Chinese, you find that they include many characters that are regularly used for the names of people and places - they're just largely restricted to proper nouns, and so won't show up in a normal vocabulary search. So Unicode 2.0 was defined in 1996, expanding the system out to a maximum of 21 bits per code point (using up to 32 bits per code point for storage).

As a result, Windows (including the CLR) and Java now use the little-endian variant of UTF-16 to allow their text APIs to handle arbitrary Unicode code points. The original 16-bit code space is now referred to as the Basic Multilingual Plane.

While all that was going on, the POSIX world ended up adopting a different strategy for migrating to full Unicode support: attempting to standardise on the ASCII compatible UTF-8 text encoding.

The choice between using UTF-8 and UTF-16-LE as the preferred local text encoding involves some complicated trade-offs, and that's reflected in the fact that they have ended up being at the heart of two competing approaches to multilingual computing.

Choosing UTF-8 aims to treat formatting text for communication with the user as "just a display issue". It's a low impact design that will "just work" for a lot of software, but it comes at a price:

  • because encoding consistency checks are mostly avoided, data in different encodings may be freely concatenated and passed on to other applications. Such data is typically not usable by the receiving application.
  • for interfaces without encoding information available, it is often necessary to assume an appropriate encoding in order to display information to the user, or to transform it to a different encoding for communication with another system that may not share the local system's encoding assumptions. These assumptions may not be correct, but won't necessarily cause an error - the data may just be silently misinterpreted as something other than what was originally intended.
  • because data is generally decoded far from where it was introduced, it can be difficult to discover the origin of encoding errors.
  • as a variable width encoding, it is more difficult to develop efficient string manipulation algorithms for UTF-8. Algorithms originally designed for fixed width encodings will no longer work.
  • as a specific instance of the previous point, it isn't possible to split UTF-8 encoded text at arbitrary locations. Care needs to be taken to ensure splits only occur at code point boundaries.
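The arbitrary-split problem from that last point takes only a few lines to demonstrate: slicing a UTF-8 byte string in the middle of a multi-byte character produces invalid data.

```python
data = "café".encode("utf-8")   # 5 bytes: "é" encodes to two (0xC3 0xA9)
print(data)                     # b'caf\xc3\xa9'

# Slicing between the two bytes of "é" produces invalid UTF-8:
try:
    data[:4].decode("utf-8")
except UnicodeDecodeError as exc:
    print("split mid-code-point:", exc)

# Splits are only safe on code point boundaries:
print(data[:3].decode("utf-8"))  # caf
```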

UTF-16-LE shares the last two problems, but to a lesser degree (simply due to the fact that most commonly used code points are in the 16-bit Basic Multilingual Plane). However, because it isn't generally suitable for use in network protocols and file formats (without significant additional encoding markers), the explicit decoding and encoding required encourages designs with a clear separation between binary data (including encoded text) and decoded text data.

Through the lens of Python

Python and Unicode were born on opposite sides of the Atlantic ocean at roughly the same time (1991). The growing adoption of Unicode within the computing industry has had a profound impact on the evolution of the language.

Python 1.x was purely a product of the bilingual computing era - it had no support for Unicode based text handling at all, and was hence largely limited to 8-bit ASCII compatible encodings for text processing.

Python 2.x was still primarily a product of the bilingual era, but added multilingual support as an optional addon, in the form of the unicode type and support for a wide variety of text encodings. PEP 100 goes into the many technical details that needed to be covered in order to incorporate that feature. With Python 2, you can make multilingual programming work, but it requires an active decision on the part of the application developer, or at least that they follow the guidelines of a framework that handles the problem on their behalf.

By contrast, Python 3.x is designed to be a native denizen of the multilingual computing world. Support for multiple languages extends as far as the variable naming system, such that languages other than English become almost as well supported as English already was in Python 2. While the English inspired keywords and the English naming in the standard library and on the Python Package Index mean that Python's "native" language and the preferred language for global collaboration will always be English, the new design allows a lot more flexibility when working with data in other languages.

Consider processing a data table where the headings are names of Japanese individuals, and we'd like to use collections.namedtuple to process each row. Python 2 simply can't handle this task:

>>> from collections import namedtuple
>>> People = namedtuple("People", u"陽斗 慶子 七海")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/", line 310, in namedtuple
    field_names = map(str, field_names)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)

Users need to either restrict themselves to dictionary style lookups rather than attribute access, or else use romanised versions of their names (Haruto, Keiko, Nanami for the example). However, the case of "Haruto" is an interesting one, as there are at least 3 different ways of writing it as Kanji (陽斗, 陽翔, 大翔), but they are all romanised as the same string (Haruto). If you try to use romaji to handle a data set that contains more than one variant of that name, you're going to get spurious collisions.
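The collision is concrete: once 陽斗 and 陽翔 are both romanised to "Haruto", namedtuple rejects the duplicated field name outright (the field list here is illustrative):

```python
from collections import namedtuple

# Two different Kanji spellings, one romanisation - the field names collide:
try:
    namedtuple("People", "Haruto Haruto Keiko")
except ValueError as exc:
    print(exc)  # complains about the duplicate field name
```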

Python 3 takes a very different perspective on this problem. It says it should just work, and it makes sure it does:

>>> from collections import namedtuple
>>> People = namedtuple("People", u"陽斗 慶子 七海")
>>> d = People(1, 2, 3)
>>> d.陽斗
1
>>> d.慶子
2
>>> d.七海
3

This change greatly expands the kinds of "data driven" use cases Python can support in areas where the ASCII based assumptions of Python 2 would cause serious problems.

Python 3 still needs to deal with improperly encoded data however, so it provides a mechanism for arbitrary binary data to be "smuggled" through text strings in the Unicode Private Use Area. This feature was added by PEP 383 and is managed through the surrogateescape error handler, which is used by default on most operating system interfaces. This recreates the old Python 2 behaviour of passing improperly encoded data through unchanged when dealing solely with local operating system interfaces, but complaining when such improperly encoded data is injected into another interface. The codec error handling system provides several tools to deal with these files, and we're looking at adding a few more relevant convenience functions for Python 3.5.
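A minimal sketch of that smuggling behaviour: an invalid byte decodes to a lone surrogate, round-trips back to the original byte exactly, but complains loudly when pushed through a strict encode.

```python
raw = b"name-\xff"                    # \xff is not valid UTF-8

# surrogateescape maps the bad byte to the lone surrogate U+DCFF:
text = raw.decode("utf-8", "surrogateescape")
print(ascii(text))                    # 'name-\udcff'

# Round-tripping through the same handler restores the bytes exactly:
assert text.encode("utf-8", "surrogateescape") == raw

# But injecting the smuggled data into a strict interface fails loudly:
try:
    text.encode("utf-8")
except UnicodeEncodeError as exc:
    print("strict encode rejected it:", exc)
```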

The underlying Unicode changes in Python 3 also made PEP 393 possible, which changed the way the CPython interpreter stores text internally. In Python 2, even pure ASCII strings would consume four bytes per code point on Linux systems. Using the "narrow build" option (as the Python 2 Windows builds do) reduced that to only two bytes per code point when operating within the Basic Multilingual Plane, but at the cost of potentially producing wrong answers when asked to operate on code points outside the Basic Multilingual Plane. By contrast, starting with Python 3.3, CPython now stores text internally using the smallest fixed width data unit possible. That is, latin-1 text uses 8 bits per code point, UCS-2 (Basic Multilingual Plane) text uses 16 bits per code point, and only text containing code points outside the Basic Multilingual Plane will expand to needing the full 32 bits per code point. This can not only significantly reduce the amount of memory needed for multilingual applications, but may also increase their speed as well (as reducing memory usage also reduces the time spent copying data around).
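The flexible storage is directly observable with sys.getsizeof on CPython 3.3+: equal-length strings grow as their widest code point moves from latin-1 through the BMP to the astral planes (exact sizes vary by platform, so only the ordering is shown):

```python
import sys

n = 1000
latin  = "a" * n           # latin-1 range: 1 byte per code point
bmp    = "\u0100" * n      # Latin Extended-A: 2 bytes per code point
astral = "\U0001F40D" * n  # outside the BMP: 4 bytes per code point

# Roughly n, 2n and 4n bytes of payload plus a fixed per-object header:
print(sys.getsizeof(latin), sys.getsizeof(bmp), sys.getsizeof(astral))
```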

Are we there yet?

In a word, no. Not for Python 3.4, and not for the computing industry at large. We're much closer than we ever have been before, though. Most POSIX systems now use UTF-8 as their default encoding, and many systems offer a C.UTF-8 locale as an alternative to the traditional ASCII based C locale. When dealing solely with properly encoded data and metadata, and properly configured systems, Python 3 should "just work", even when exchanging data between different platforms.

For Python 3, the remaining challenges fall into a few areas:

  • helping existing Python 2 users adopt the optional multilingual features that will prepare them for eventual migration to Python 3 (as well as reassuring those users that don't wish to migrate that Python 2 is still fully supported, and will remain so for at least the next several years, and potentially longer for customers of commercial redistributors)
  • adding back some features for working entirely in the binary domain that were removed in the original Python 3 transition due to an initial assessment that they were operations that only made sense on text data (PEP 461 summary: bytes.__mod__ is coming back in Python 3.5 as a valid binary domain operation, bytes.format stays gone as an operation that only makes sense when working with actual text data)
  • better handling of improperly decoded data, including poor encoding recommendations from the operating system (for example, Python 3.5 will be more sceptical when the operating system tells it the preferred encoding is ASCII and will enable the surrogateescape error handler on sys.stdout when it occurs)
  • eliminating most remaining usage of the legacy code page and locale encoding systems in the CPython interpreter (this most notably affects the Windows console interface and argument decoding on POSIX. While these aren't easy problems to solve, it will still hopefully be possible to address them for Python 3.5)

More broadly, each major platform has its own significant challenges to address:

  • for POSIX systems, there are still a lot of systems that don't use UTF-8 as the preferred encoding, and the assumption of ASCII as the preferred encoding in the default C locale is positively archaic. There is also a lot of POSIX software that still believes in the "text is just encoded bytes" assumption, and will happily produce mojibake that makes no sense to other applications or systems.
  • for Windows, keeping the old 8-bit APIs around was deemed necessary for backwards compatibility, but this also means that there is still a lot of Windows software that simply doesn't handle multilingual computing correctly.
  • for both Windows and the JVM, a fair amount of nominally multilingual software actually only works correctly with data in the basic multilingual plane. This is a smaller problem than not supporting multilingual computing at all, but was quite a noticeable problem in Python 2's own Windows support.

Mac OS X is the platform most tightly controlled by any one entity (Apple), and they're actually in the best position out of all of the current major platforms when it comes to handling multilingual computing correctly. They've been one of the major drivers of Unicode since the beginning (two of the authors of the initial Unicode proposal were Apple engineers), and were able to force the necessary configuration changes on all their systems, rather than having to work with an extensive network of OEM partners (Windows, commercial Linux vendors) or relatively loose collaborations of individuals and organisations (community Linux distributions).

Modern mobile platforms are generally in a better position than desktop operating systems, mostly by virtue of being newer, and hence defined after Unicode was better understood. However, the UTF-8 vs UTF-16-LE distinction for text handling exists even there, thanks to the Java inspired Dalvik VM in Android (plus the cloud-backed nature of modern smartphones means you're even more likely to encounter files from multiple machines when working on a mobile device).

09:10 eGenix PyRun - One file Python Runtime 2.0.1 GA

 ∗ Planet Python


eGenix PyRun is our open source, one file, no installation version of Python, making the distribution of a Python interpreter to run Python based scripts and applications on Unix based systems as simple as copying a single file.

eGenix PyRun's executable only needs 11MB for Python 2 and 13MB for Python 3, but still supports most Python applications and scripts - and it can be compressed to just 3-4MB using upx, if needed.

Compared to a regular Python installation of typically 100MB on disk, eGenix PyRun is ideal for applications and scripts that need to be distributed to several target machines, client installations or customers.

It makes "installing" Python on a Unix based system as simple as copying a single file.

eGenix has been using the product internally in the mxODBC Connect Server since 2008 with great success and decided to make it available as a stand-alone open-source product.

We provide both the source archive to build your own eGenix PyRun, as well as pre-compiled binaries for Linux, FreeBSD and Mac OS X, as 32- and 64-bit versions. The binaries can be downloaded manually, or you can let our automatic install script install-pyrun take care of the installation: ./install-pyrun dir and you're done.

Please see the product page for more details:

    >>> eGenix PyRun - One file Python Runtime


This is a patch level release of eGenix PyRun 2.0. The major new feature in 2.0 is the added Python 3.4 support.

New Features

  • Upgraded eGenix PyRun to work with and use Python 2.7.8 by default.

Enhancements / Changes

  • Fixed a bug in the license printer to show the correct license URL.

install-pyrun Quick Install Enhancements

eGenix PyRun includes a shell script called install-pyrun, which greatly simplifies installation of PyRun. It works much like the virtualenv shell script used for creating new virtual environments (except that there's nothing virtual about PyRun environments).

With the script, an eGenix PyRun installation is as simple as running:

./install-pyrun targetdir

This will automatically detect the platform, download and install the right pyrun version into targetdir.

We have updated this script since the last release:

  • Updated install-pyrun to default to eGenix PyRun 2.0.1 and its feature set.

For a complete list of changes, please see the eGenix PyRun Changelog.

Please see the eGenix PyRun 2.0.0 announcement for more details about eGenix PyRun 2.0.


Please visit the eGenix PyRun product page for downloads, instructions on installation and documentation of the product.

More Information

For more information on eGenix PyRun, licensing and download instructions, please write to

Enjoy !

Marc-Andre Lemburg,

Anatoly Techtonik: How to make RAM disk in Linux

 ∗ Planet Python

UPDATE (2014-08-27): Exactly three years later I discovered that Linux already comes with RAM disk enabled by default, mounted as `/dev/shm` (which points to `/run/shm` on Debian/Ubuntu):
$ df -h /dev/shm
Filesystem Size Used Avail Use% Mounted on
tmpfs 75M 4.0K 75M 1% /run/shm
See detailed info here.

*RAM disk* is a term from the past, when DOS was alive and information was stored on disks instead of the internet. If you created an image of some disk, it was possible to load it into memory. Memory disks were useful for loading software from Live CDs. Usually software needs some space to write data during the boot sequence, and RAM is the fastest way to set one up.

Filesystem space in memory can be extremely useful today too: for example, to run tests without wearing down an SSD. While the idea is not new, there was no incentive to explore it until I ran across the tmpfs reference in the Ubuntu Wiki.

For example, to get 2Gb of space for files in RAM, edit /etc/fstab to add the following line:
tmpfs     /var/ramspace       tmpfs     defaults,size=2048M     0     0
/var/ramspace is now the place to store your files in memory.
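For a one-off RAM disk without editing /etc/fstab, the same tmpfs filesystem can also be mounted directly (a sketch; requires root, and the mount point path and size are arbitrary choices here):

```shell
# Create a mount point and mount a 512 MB tmpfs on it (as root):
mkdir -p /mnt/ramdisk
mount -t tmpfs -o size=512M tmpfs /mnt/ramdisk

# Verify the mount:
df -h /mnt/ramdisk

# Tear down when done - contents are lost on unmount:
umount /mnt/ramdisk
```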



Fabio Zadrozny: PyDev 3.7.0, PyDev/PyCharm Debugger merge, Crowdfunding

 ∗ Planet Python

PyDev 3.7.0 was just released.

There are some interesting things to talk about in this release...

The first is that the PyDev debugger was merged with the fork that was used in PyCharm. The final code for the debugger (and the interactive console) now lives at: This effort was backed by IntelliJ, and from now on, work on the debugger from either front (PyDev or PyCharm) should benefit both -- pull requests are also very welcome :)

With this merge, PyDev users will gain GEvent debugging and breakpoints at Django templates (but note that the breakpoints can only be added through the LiClipse HTML/Django Templates editor), and in the interactive console front (which was also part of this merge), the asynchronous output and console interrupt are new.

This release also changed the default UI for the PyDev editor (and for LiClipse editors too), so, the minimap (which had a bunch of enhancements) is now turned on by default and the scrollbars are hidden by default -- those that prefer the old behavior must change the settings on the minimap preferences to match the old style.

Also noteworthy is that the code-completion for all letter chars is turned on by default (again, users that want the old behavior have to uncheck that setting from the code completion preferences page), and this release also has a bunch of bugfixes.

Now, I haven't talked about the crowdfunding for keeping up support on PyDev and building a new profiler UI since it finished... well, it didn't reach its full goal. In practice that means the profiler UI will still be done, and users who supported it will receive a license to use it, but it won't be open source. All in all, it wasn't that bad either: it got halfway to its target, and many people seemed to like the idea. In the end, I'll only know whether keeping it going without reaching the full target was a good idea once it's commercially available (the idea is that new licenses will cover its development expenses and keep it going afterwards).

A note for profiler contributors is that I still haven't released an early-release version, but I'm working on it :)

As for PyDev, the outcome of the funding also means I have fewer resources to support it than I'd like. But given that LiClipse provides a share of its earnings to support PyDev, and latecomers can still contribute to the crowdfunding, I still hope that I won't need to lower its support (which would mean taking on other projects in the time I currently have for PyDev), and I think it'll still be possible to do the things outlined in the crowdfunding regarding it.


An Event Apart: The Role of Visual Design

 ∗ LukeW | Digital Product Design + Strategy

At An Event Apart in Chicago IL 2014, Jenny Lam talked about the value of visual design in digital products and shared some tips for evaluating aesthetics. Here's my notes from her talk Hit it With a Pretty Stick:

  • User experience designers need to understand a lot of things in addition to visual design. But as the discipline has matured, our understanding and evaluation of aesthetics has not. How do we champion aesthetics in our work and in our organizations?
  • Most of us believe in the value of visual design but in the real world we often have to convince others as well.
  • Visual design's impact on the bottom line is real. For example, Mint licensed a technology from someone else and added a user experience on top. That created $170 million in value. Every dollar spent on aesthetics yielded Gillette $415+ in return vs. only a $7 return from advertising. Design-driven companies outperform the S&P by 228%.
  • Companies that invest in design have better customer satisfaction, increased loyalty, employee retention, and more.
  • Aesthetics also communicate credibility and trust. In Stanford research, look & feel was the primary driver of credibility.

Visual Design & Teams

  • For creative projects, we need creative leaders. If there's a leader at the top with a creative vision, great. If not, the creative leadership can come from the hands-on design team.
  • Interaction designers and visual designers have different skills. Interaction: HCI trained, Product Definition, User Flows. Visual: Graphic Design, Sensory-minded, Brand-centric. Together they're a powerful combination.
  • Give your visual designers accountability. Empower them. Carry through on aesthetics internally to create a design culture.
  • Dotted line relationships to the marketing team can help design teams get the resources they need to create great experiences. Marketing tends to have big budgets and cares about the visual aspect of products.
  • Visual designers can take ownership of in-house creative. Shirts, posters, etc are very visible and can show off visual design quality.

Aesthetic Principles

  • Aesthetics are about three components: integrity (how true & cohesive is the design), harmony (how the parts relate to the whole), and radiance (how we feel when we experience a product).
  • Integrity puts us out there, allows our brand to be memorable. The visual interface has become as important as a brand logo.
  • We remember only really good experiences and really bad ones. Not average experiences.
  • Harmony: all our elements need to support the central story. Use patterns, textures, and color sets to unify designs into a cohesive whole.
  • Look to nature for ideas of color harmony.
  • Radiance: light, shadow, and material allows you to create a sense of environment.
  • Make sure you tweak/edit default settings in your drawing apps. Don't use standard drop shadows, design them. Keep dimensions "human relatable": how would things look in real life?
  • Details matter. When we're delighted, the interface feels like fun and easier to use. Look for opportunities to delight.

Tools & Techniques

  • Methods to create a design language: futurecasting, moodboards, positioning matrices.
  • Futurecasting: imagine the end state & how people will feel. What will the press release be, how can the visual design support that?
  • Start with words: talk to stakeholders to figure out the visual direction that's right for a project. Ask people why they chose specific adjectives.
  • Positioning matrices (where a brand fits on a spectrum), moodboards, and more can help set the right visual direction.
  • Everyone can have an opinion but critique is not art direction.
  • Rules of critique: visual designer is the owner & gets a veto, write down agreed upon goals, focus on feedback not on solutions, don't come up with solutions as a group.
  • Say: “I don’t know what to focus on first.” Not: “It’s too cluttered.”
  • Say: “I’m having a hard time reading the text.” Not: “Make the font bigger!”
  • Remind people you are a professional.

An Event Apart: Icon Design Process

 ∗ LukeW | Digital Product Design + Strategy

At An Event Apart in Chicago IL 2014, Jon Hicks discussed the modern icon design process and shared useful design and development tips for icons. Here's my notes from his talk Icon Design Process:

  • Icons can be used to support navigation, action, and messaging in Web sites and applications. They can also reinforce status by providing more information than just color.
  • We had a visual language before we had written language: symbols, hieroglyphics, etc. So symbols were around for a long time before they made their way to computers in 1974.
  • Today there are lots of royalty-free icon sets available for use in sites and apps, so why make your own? Icon sets might not be the right size or style for your usage. You may need more or fewer icons than exist in a ready-made set. In these cases and more, you may need custom icons.
  • The icon design process: research, drawing, and deployment (which changes frequently).
  • Research: a client brief and icon audit can reveal areas of inconsistency, gaps, or duplicates in the icon design of a site. Compile a list of the icons you'll need and what they represent.
  • How do you go from a word to a finished icon? You have two options: iconic (literal) or symbolic (needs to be learned).
  • When possible, follow conventions for your icon designs. The Noun Project is a great resource for common visual symbols. But be aware of local considerations. Symbols like piggy banks, owls, and thumbs up may have inappropriate meanings in other cultures.
  • Truly symbolic icons are more easily understood. The difference between outlined and solid icons is not the determining factor for comprehension.
  • Don't get too fancy with your icons just to make them different. Make your icon as simple as possible but no simpler.
  • Drawing: use whatever tools you are comfortable with. Start with a pixel grid to align gaps and weights within an icon. Your grid does not need to be even.
  • Automatic resizing of icons to create larger images might not provide the right ratio/balance between elements. You may need to tweak line weights or sizes to make things look right at different sizes.
  • With an icon set, you may need to adjust sizing and alignment to make things appear optically the same. Different shapes will appear bigger/smaller when displayed as part of a set.
  • Think about where shadows will fall within an icon to create the right balance of space.
  • Just when people started embracing SVG, Adobe began removing SVG features from its apps because so few people had been using them.
  • Sketch is starting to mature enough to be a viable alternative to Illustrator for drawing icons.
  • Svgo-gui is a simple drag and drop tool for optimizing SVG images. You can further compress SVG by GZIP-ing them on your server.
  • Deployment: icon fonts or SVG? Which one to use? Both can be right for your project.
  • Why use Icon fonts? One small file, accessible & scalable, easily styled with CSS, no sprites needed, supported in IE4+
  • Why use SVG? less hassle, support (3 versions back), avoids sprites, can use multiple colours, are still style-able with CSS animations
  • Grumpicon is a tool that can help you create SVG art for your sites.

An Event Apart: How To Champion Ideas Back At Work

 ∗ LukeW | Digital Product Design + Strategy

At An Event Apart in Chicago IL 2014, Scott Berkun discussed how to champion the ideas you pick up at events like these once you're back at work. Here's my notes from his talk How To Champion Ideas Back At Work:

  • The real designer is the person with the power to make decisions. It doesn't matter what title they have or their background.
  • In most situations, the final decision maker is not trained in creative disciplines.
  • If you really want to make an impact, you may need to remove the word design or engineering from your title.
  • Today designers can be founders and bring their ideas to life.

Meeting Others

  • You take in information at an event, internalize it, then make use of that information afterward.
  • This requires you to pay attention to the information you're hearing. At first you can take in lots of info but over time, you can retain less information.
  • Staying connected helps you champion ideas. Design requires working with other people.
  • Networking: ask everyone for a business card; saying thank you starts a conversation; post your notes during an event so people will find you; if you use LinkedIn, write a personal message.
  • Start introductions with a simple authentic point: "I met you at ___."
  • Casual professional events allow you to re-connect. Find your local UX happy hour and invite others.

What to Champion

  • Events are abstractions: they need to apply to a variety of people and their needs.
  • Our lives are specific: we need to deal with specific contexts on a regular basis.
  • To remember what you've learned, try min/max note taking. Take 5 bullets per talk, note some links & reflections, post a summary on your blog and tweet it out, post it at work, share it with your boss.
  • Make a chart of lessons learned and map them to a specific problem at work where you'd like to apply the ideas you heard. Include the people that need to be involved.
  • We like to imagine successes were perfect. We romanticize the role of the creator. But in reality there's always lots of frustration, dead ends, and adjustments.
  • When you start working on a project, you don't know what the outcome will be. That's the role of the creator.

How to Champion

  • The real process is not get idea, build, and ship. Instead there's a lot of convincing in between.
  • Being outspoken makes you a target.
  • Language is manipulation. Every bit of writing, design, or code has intent.
  • Charm and persuasion are emotional, not logical; they're designed. There is no abstraction: being charming depends on who you are trying to charm.
  • Instead of "here's what you should be doing", focus on "here's what will solve your problem".
  • The people with power are often the ones most resistant to change. They benefit from the status quo.
  • How to convince your boss: be awesome at your job. The best people on the team are more likely to get heard.
  • Get support from an influential coworker. Plan a trial, including how to evaluate it.
  • Pitch, repeat, your reputation will grow over time.

An Event Apart: Content for Sensitive Situations

 ∗ LukeW | Digital Product Design + Strategy

At An Event Apart in Chicago IL 2014, Kate Kiefer Lee talked about writing content for legal, help, error, and other sensitive situations. Here's my notes from her talk on Touchy Subjects: Creating Content for Sensitive Situations:

  • When face to face, we get immediate feedback from people because we can see them and understand their feelings. That's empathy, and it's often missing in our online content.
  • We need to take the empathy we already use everyday and translate that to our online software.
  • There are a lot of topics that are sensitive by nature: health, medicine, money, religion, politics, fundraising, private information.
  • Urgent messages: we need to tell people bad news quickly. Errors, downtime, warnings, rejections, and apologies are all examples of urgent messages that are time-sensitive.
  • Not all touchy subjects are urgent. Think 311 vs. 911: help documents, customer service emails, forms, contact pages, legal policies, etc.
  • Make a list of all your content types. Pull out any you think are urgent, bad news, or touchy subjects. Then map them to people's emotions.
  • People have all kinds of feelings when interacting with your content. When someone's needs are being met, they may feel very different than when their needs are not being met. How can you meet people's needs?
  • Match your reader's feelings to the tone you use in your content. Examples: error messages map to frustration and need gentle, calm, and serious messages. In help documents, we want to be helpful and friendly.
  • Put yourself in people's shoes to decide how to write your content for them. We're not writing for writing's sake; we're communicators trying to help people do certain things.


  • Be clear: all content needs to be concise and focused.
  • Get to the point: don't try to soften bad news, just get it out.
  • Stay calm: don't use exclamation points or all caps.
  • Be serious: you don't need to be funny all the time.
  • Accept responsibility.
  • Be nice: you don't always have to be interesting or clever, but you can always be nice.
  • When you adopt these principles, you help people become more effective, reduce customer service issues, and improve word-of-mouth marketing.
  • Read all your messages out loud. This helps you catch errors and typos, improves flow and makes you sound human. It also makes you more empathetic and naturally puts you in a conversational frame of mind.

Content Types

  • Errors: we want to be gentle, calm, direct, and serious. Example: "We regret to inform you that we are unable to process your request as your credit card has expired." Instead try: "Your credit card has expired. Please try another card."
  • Say exactly what you mean and say it nicely.
  • Customer Service: make sure you are not being repetitive. Treat people like people. They are stressed out, frustrated, or confused. Don't repeat canned messages.
  • Help documents: people may be there trouble-shooting. Don't let your personality get in the way. Use extremely specific titles. Titles are very important in help documents; they help guide people to what they need.
  • After clarity, consistency is the most important thing in help documents. Keep your interface terms consistent in your service and your help.
  • Feedback/Contact content: there's not a lot of room for personality here. What works on a product page is less likely to work on a contact page. Reduce the amount of information you collect up front. You can always collect more later with follow-on questions.
  • Don't keep your voice and tone the same across different content types. Adapt to different situations appropriately.
  • Unsubscribe pages: people may be annoyed or frustrated. Validate their feelings and offer a solution (less email).
  • Social media: people are interested and curious, but you still need to be courteous, sensitive, and direct. Often, you should listen more than you talk on social media.
  • Don't become delusional about your importance online. There's many times you're better off not saying anything.
  • Legal policies: people may be confused and apprehensive. Be calm, thorough, and clear. You don't want to look like you are hiding something.
  • Terms of service can include summaries to help people understand the big picture. But people are agreeing to the full terms, so work on making all of your text clear.
  • Editorially and Automattic make their legal policies and terms of service freely available for others to reuse and update.
  • Apologies: when you apologize you need to own it. Show you understand the seriousness of your issue. Don't say "our apologies for any inconvenience this may have caused." Take responsibility. Be specific about what you did wrong and say what you'll do to prevent it in the future.
  • When we are in a hurry, sometimes we forget to be nice. Create some templates of possible content types before you need them. Know who needs to sign-off so you can apologize quickly. Create an emergency contact list of who needs to be involved when sensitive situations arise.

Teach These Concepts

  • We're all content people. The more people on your team know about how to manage sensitive content, the more cohesive your messaging will be.
  • A voice and tone guide can be a resource for your team members. Give them the tools they need. Example from Mailchimp: Voice & Tone Guide.
  • Focus on making a communication guide not a style & grammar guide.
  • When we're talking about content we often focus too much on us and what we want to say. Our goal is not for people to compliment our content, it's for people to get things done quickly and easily.


Mike C. Fletcher: Python-dbus needs some non-trivial examples

 ∗ Planet Python

So I got tired of paying work this afternoon and decided I would work on getting a dbus service started for Listener. The idea here is that there will be a DBus service which does all the context management, microphone setup, playback, etc. and which client software (such as the main GUI and apps that want to allow voice coding without going through low-level-grotty simulated typing) can use to interact with it.

But how does one go about exposing objects on DBus in the DBus-ian way? It *seems* that object-paths should produce a REST-like hierarchy where each object I want to expose is presented at /com/vrplumber/listener/context/... but should that be done on-demand? If I have 20 contexts, should I expose them all at start-up, or should the user "request" them one at a time (get_context( key ) -> path?). Should I use an ObjectTree? How do I handle deletion/de-registration in such a way that clients are notified of the removed objects? I can hack these things in, but it would be nice to know the *right* way to do this kind of work. Should I expose functions that process directories (import this directory), or only those which process in-memory data-sets (add these words to the dictionary), can (python) DBus handle many MBs of data? What does a proper "real" DBus service look like?

So, anyone know of some good examples of python-dbus services exposing non-trivial services? Many objects, many methods, object life-cycle operations, many signals, yada, yada?

(BTW, my hacks are up on github if anyone cares to hit me with a clue-stick).

Ian Ozsvald: Why are technical companies not using data science?

 ∗ Planet Python

Here’s a quick question. How come more technical companies aren’t making use of data science? By “technical” I mean any company with data and the smarts to spot that it has value, by “data science” I mean any technical means to exploit this data for financial gain (e.g. visualisation to guide decisions, machine learning, prediction).

I’m guessing that it comes down to an economic question – either it isn’t as valuable as some other activity (making mobile apps? improving UX on the website? paid marketing? expanding sales to new territories?) or it is perceived as being valuable but cannot be exploited (maybe due to lack of skills and training or data problems).

I’m thinking about this for my upcoming keynote at PyConIreland, would you please give me some feedback in the survey below (no sign-up required)?

To be clear – this is an anonymous survey, I’ll have no idea who gives the answers.



If the above is interesting then note that we’ve got a data science training list where we make occasional announcements about our upcoming training and we have two upcoming training courses. We also discuss these topics at our PyDataLondon meetups. I also have a slightly longer survey (it’ll take you 2 minutes, no sign-up required), I’ll be discussing these results at the next PyDataLondon so please share your thoughts.

Ian applies Data Science as an AI/Data Scientist for companies in ModelInsight, sign-up for Data Science tutorials in London. Historically Ian ran Mor Consulting. He also founded the image and text annotation API, co-authored SocialTies, programs Python, authored The Screencasting Handbook, lives in London and is a consumer of fine coffees.

Tuesday, 26 August



Constants

 ∗ The Go Programming Language Blog


Go is a statically typed language that does not permit operations that mix numeric types. You can't add a float64 to an int, or even an int32 to an int. Yet it is legal to write 1e6*time.Second or math.Exp(1) or even 1<<('\t'+2.0). In Go, constants, unlike variables, behave pretty much like regular numbers. This post explains why that is and what it means.

Background: C

In the early days of thinking about Go, we talked about a number of problems caused by the way C and its descendants let you mix and match numeric types. Many mysterious bugs, crashes, and portability problems are caused by expressions that combine integers of different sizes and "signedness". Although to a seasoned C programmer the result of a calculation like

unsigned int u = 1e9;
long signed int i = -1;
... i + u ...

may be familiar, it isn't a priori obvious. How big is the result? What is its value? Is it signed or unsigned?

Nasty bugs lurk here.

C has a set of rules called "the usual arithmetic conversions" and it is an indicator of their subtlety that they have changed over the years (introducing yet more bugs, retroactively).

When designing Go, we decided to avoid this minefield by mandating that there is no mixing of numeric types. If you want to add i and u, you must be explicit about what you want the result to be. Given

var u uint
var i int

you can write either uint(i)+u or i+int(u), with both the meaning and type of the addition clearly expressed, but unlike in C you cannot write i+u. You can't even mix int and int32, even when int is a 32-bit type.
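A runnable sketch (not from the original post) makes the point vivid: the explicit conversion forces the programmer to decide what the result means, and the two choices can agree only by accident of wraparound:

```go
package main

import "fmt"

func main() {
	var u uint = 3
	var i int = -1
	// fmt.Println(i + u) // compile error: mismatched types int and uint
	fmt.Println(uint(i) + u) // uint(-1) is the maximum uint; adding 3 wraps to 2
	fmt.Println(i + int(u))  // plain integer arithmetic: -1 + 3 = 2
}
```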

This strictness eliminates a common cause of bugs and other failures. It is a vital property of Go. But it has a cost: it sometimes requires programmers to decorate their code with clumsy numeric conversions to express their meaning clearly.

And what about constants? Given the declarations above, what would make it legal to write i = 0 or u = 0? What is the type of 0? It would be unreasonable to require constants to have type conversions in simple contexts such as i = int(0).

We soon realized the answer lay in making numeric constants work differently from how they behave in other C-like languages. After much thinking and experimentation, we came up with a design that we believe feels right almost always, freeing the programmer from converting constants all the time yet being able to write things like math.Sqrt(2) without being chided by the compiler.

In short, constants in Go just work, most of the time anyway. Let's see how that happens.


Terminology

First, a quick definition. In Go, const is a keyword introducing a name for a scalar value such as 2 or 3.14159 or "scrumptious". Such values, named or otherwise, are called constants in Go. Constants can also be created by expressions built from constants, such as 2+3 or 2+3i or math.Pi/2 or ("go"+"pher").

Some languages don't have constants, and others have a more general definition of constant or application of the word const. In C and C++, for instance, const is a type qualifier that can codify more intricate properties of more intricate values.

But in Go, a constant is just a simple, unchanging value, and from here on we're talking only about Go.

String constants

There are many kinds of numeric constants—integers, floats, runes, signed, unsigned, imaginary, complex—so let's start with a simpler form of constant: strings. String constants are easy to understand and provide a smaller space in which to explore the type issues of constants in Go.

A string constant encloses some text between double quotes. (Go also has raw string literals, enclosed by backquotes ``, but for the purposes of this discussion they have all the same properties.) Here is a string constant:

"Hello, 世界"

(For much more detail about the representation and interpretation of strings, see this blog post.)

What type does this string constant have? The obvious answer is string, but that is wrong.

This is an untyped string constant, which is to say it is a constant textual value that does not yet have a fixed type. Yes, it's a string, but it's not a Go value of type string. It remains an untyped string constant even when given a name:

const hello = "Hello, 世界"

After this declaration, hello is also an untyped string constant. An untyped constant is just a value, one not yet given a defined type that would force it to obey the strict rules that prevent combining differently typed values.

It is this notion of an untyped constant that makes it possible for us to use constants in Go with great freedom.

So what, then, is a typed string constant? It's one that's been given a type, like this:

const typedHello string = "Hello, 世界"

Notice that the declaration of typedHello has an explicit string type before the equals sign. This means that typedHello has Go type string, and cannot be assigned to a Go variable of a different type. That is to say, this code works:

package main

import "fmt"

const typedHello string = "Hello, 世界"

func main() {
    var s string
    s = typedHello
    fmt.Println(s)
}

but this does not:

package main

import "fmt"

const typedHello string = "Hello, 世界"

func main() {
    type MyString string
    var m MyString
    m = typedHello // Type error
    fmt.Println(m)
}

The variable m has type MyString and cannot be assigned a value of a different type. It can only be assigned values of type MyString, like this:

package main

import "fmt"

const typedHello string = "Hello, 世界"

func main() {
    type MyString string
    var m MyString
    const myStringHello MyString = "Hello, 世界"
    m = myStringHello // OK
    fmt.Println(m)
}

or by forcing the issue with a conversion, like this:

package main

import "fmt"

const typedHello string = "Hello, 世界"

func main() {
    type MyString string
    var m MyString
    m = MyString(typedHello)
    fmt.Println(m)
}

Returning to our untyped string constant, it has the helpful property that, since it has no type, assigning it to a typed variable does not cause a type error. That is, we can write

m = "Hello, 世界"

or

m = hello

because, unlike the typed constants typedHello and myStringHello, the untyped constants "Hello, 世界" and hello have no type. Assigning them to a variable of any type compatible with strings works without error.

These untyped string constants are strings, of course, so they can only be used where a string is allowed, but they do not have type string.
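A runnable version of that claim (a quick sketch, not part of the original post): the untyped constant slots into a variable of a named string type with no conversion at all:

```go
package main

import "fmt"

func main() {
	type MyString string
	var m MyString
	m = "Hello, 世界" // OK: the untyped constant needs no conversion
	fmt.Println(m)
}
```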

Default type

As a Go programmer, you have certainly seen many declarations like

str := "Hello, 世界"

and by now you might be asking, "if the constant is untyped, how does str get a type in this variable declaration?" The answer is that an untyped constant has a default type, an implicit type that it transfers to a value if a type is needed where none is provided. For untyped string constants, that default type is obviously string, so

str := "Hello, 世界"

or

var str = "Hello, 世界"

means exactly the same as

var str string = "Hello, 世界"

One way to think about untyped constants is that they live in a kind of ideal space of values, a space less restrictive than Go's full type system. But to do anything with them, we need to assign them to variables, and when that happens the variable (not the constant itself) needs a type, and the constant can tell the variable what type it should have. In this example, str becomes a value of type string because the untyped string constant gives the declaration its default type, string.

In such a declaration, a variable is declared with a type and initial value. Sometimes when we use a constant, however, the destination of the value is not so clear. For instance consider this statement:

package main

import "fmt"

func main() {
    fmt.Printf("%s", "Hello, 世界")
}

The signature of fmt.Printf is

func Printf(format string, a ...interface{}) (n int, err error)

which is to say its arguments (after the format string) are interface values. What happens when fmt.Printf is called with an untyped constant is that an interface value is created to pass as an argument, and the concrete type stored for that argument is the default type of the constant. This process is analogous to what we saw earlier when declaring an initialized value using an untyped string constant.

You can see the result in this example, which uses the format %v to print the value and %T to print the type of the value being passed to fmt.Printf:

package main

import "fmt"

const hello = "Hello, 世界"

func main() {
    fmt.Printf("%T: %v\n", "Hello, 世界", "Hello, 世界")
    fmt.Printf("%T: %v\n", hello, hello)
}

If the constant has a type, that goes into the interface, as this example shows:

package main

import "fmt"

type MyString string

const myStringHello MyString = "Hello, 世界"

func main() {
    fmt.Printf("%T: %v\n", myStringHello, myStringHello)
}

(For more information about how interface values work, see the first sections of this blog post.)

In summary, a typed constant obeys all the rules of typed values in Go. On the other hand, an untyped constant does not carry a Go type in the same way and can be mixed and matched more freely. It does, however, have a default type that is exposed when, and only when, no other type information is available.

Default type determined by syntax

The default type of an untyped constant is determined by its syntax. For string constants, the only possible implicit type is string. For numeric constants, the implicit type has more variety. Integer constants default to int, floating-point constants to float64, rune constants to rune (an alias for int32), and imaginary constants to complex128. Here's our canonical print statement used repeatedly to show the default types in action:

package main

import "fmt"

func main() {
    fmt.Printf("%T %v\n", 0, 0)
    fmt.Printf("%T %v\n", 0.0, 0.0)
    fmt.Printf("%T %v\n", 'x', 'x')
    fmt.Printf("%T %v\n", 0i, 0i)
}

(Exercise: Explain the result for 'x'.)


Booleans

Everything we said about untyped string constants can be said for untyped boolean constants. The values true and false are untyped boolean constants that can be assigned to any boolean variable, but once given a type, boolean variables cannot be mixed:

package main

import "fmt"

func main() {
    type MyBool bool
    const True = true
    const TypedTrue bool = true
    var mb MyBool
    mb = true      // OK
    mb = True      // OK
    mb = TypedTrue // Bad
    fmt.Println(mb)
}

Run the example and see what happens, then comment out the "Bad" line and run it again. The pattern here follows exactly that of string constants.


Floats

Floating-point constants are just like boolean constants in most respects. Our standard example works as expected in translation:

package main

import "fmt"

func main() {
    type MyFloat64 float64
    const Zero = 0.0
    const TypedZero float64 = 0.0
    var mf MyFloat64
    mf = 0.0       // OK
    mf = Zero      // OK
    mf = TypedZero // Bad
    fmt.Println(mf)
}

One wrinkle is that there are two floating-point types in Go: float32 and float64. The default type for a floating-point constant is float64, although an untyped floating-point constant can be assigned to a float32 value just fine:

package main

import "fmt"

func main() {
    const Zero = 0.0
    const TypedZero float64 = 0.0
    var f32 float32
    f32 = 0.0
    f32 = Zero      // OK: Zero is untyped
    f32 = TypedZero // Bad: TypedZero is float64 not float32.
    fmt.Println(f32)
}

Floating-point values are a good place to introduce the concept of overflow, or the range of values.

Numeric constants live in an arbitrary-precision numeric space; they are just regular numbers. But when they are assigned to a variable the value must be able to fit in the destination. We can declare a constant with a very large value:

    const Huge = 1e1000

—that's just a number, after all—but we can't assign it or even print it. This statement won't even compile:

package main

import "fmt"

func main() {
    const Huge = 1e1000
    fmt.Println(Huge)
}

The error is, "constant 1.00000e+1000 overflows float64", which is true. But Huge might be useful: we can use it in expressions with other constants and use the value of those expressions if the result can be represented in the range of a float64. The statement,

package main

import "fmt"

func main() {
    const Huge = 1e1000
    fmt.Println(Huge / 1e999)
}

prints 10, as one would expect.

In a related way, floating-point constants may have very high precision, so that arithmetic involving them is more accurate. The constants defined in the math package are given with many more digits than are available in a float64. Here is the definition of math.Pi:

Pi    = 3.14159265358979323846264338327950288419716939937510582097494459

When that value is assigned to a variable, some of the precision will be lost; the assignment will create the float64 (or float32) value closest to the high-precision value. This snippet

package main

import (
    "fmt"
    "math"
)

func main() {
    pi := math.Pi
    fmt.Println(pi)
}

prints 3.141592653589793.

Having so many digits available means that calculations like Pi/2 or other more intricate evaluations can carry more precision until the result is assigned, making calculations involving constants easier to write without losing precision. It also means that there is no occasion in which the floating-point corner cases like infinities, soft underflows, and NaNs arise in constant expressions. (Division by a constant zero is a compile-time error, and when everything is a number there's no such thing as "not a number".)
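As a quick illustration (a sketch, not from the original post), the full-precision constant only gets rounded at the moment it lands in a float64 variable; the division by 2 happens in constant space first:

```go
package main

import "fmt"

func main() {
	// This untyped constant carries more digits than a float64 can hold.
	const piLong = 3.14159265358979323846264338327950288419716939937510582097494459
	// The constant expression piLong / 2 is evaluated at full precision;
	// rounding to float64 happens only on assignment to halfPi.
	halfPi := piLong / 2
	fmt.Println(halfPi)
}
```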

Complex numbers

Complex constants behave a lot like floating-point constants. Here's a version of our now-familiar litany translated into complex numbers:

package main

import "fmt"

func main() {
    type MyComplex128 complex128
    const I = (0.0 + 1.0i)
    const TypedI complex128 = (0.0 + 1.0i)
    var mc MyComplex128
    mc = (0.0 + 1.0i) // OK
    mc = I            // OK
    mc = TypedI       // Bad
    fmt.Println(mc)
}

The default type of a complex number is complex128, the larger-precision version composed of two float64 values.

For clarity in our example, we wrote out the full expression (0.0+1.0i), but this value can be shortened to 0.0+1.0i, 1.0i or even 1i.
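A quick check (not in the original post) that the spellings really are interchangeable; since these are all constant expressions, the comparisons are decided at compile time:

```go
package main

import "fmt"

func main() {
	// The same untyped complex constant, written three ways.
	fmt.Println((0.0+1.0i) == 1.0i, (0.0+1.0i) == 1i) // true true
}
```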

Let's play a trick. We know that in Go, a numeric constant is just a number. What if that number is a complex number with no imaginary part, that is, a real? Here's one:

    const Two = 2.0 + 0i

That's an untyped complex constant. Even though it has no imaginary part, the syntax of the expression defines it to have default type complex128. Therefore, if we use it to declare a variable, the default type will be complex128. The snippet

package main

import "fmt"

func main() {
    const Two = 2.0 + 0i
    s := Two
    fmt.Printf("%T: %v\n", s, s)
}

prints complex128: (2+0i). But numerically, Two can be stored in a scalar floating-point number, a float64 or float32, with no loss of information. Thus we can assign Two to a float64, either in an initialization or an assignment, without problems:

package main

import "fmt"

func main() {
    const Two = 2.0 + 0i
    var f float64
    var g float64 = Two
    f = Two
    fmt.Println(f, "and", g)
}

The output is 2 and 2. Even though Two is a complex constant, it can be assigned to scalar floating-point variables. This ability for a constant to "cross" types like this will prove useful.


Integers

At last we come to integers. They have more moving parts—many sizes, signed or unsigned, and more—but they play by the same rules. For the last time, here is our familiar example, using just int this time:

package main

func main() {
    type MyInt int
    const Three = 3
    const TypedThree int = 3
    var mi MyInt
    mi = 3          // OK
    mi = Three      // OK
    mi = TypedThree // Bad
    _ = mi
}

The same example could be built for any of the integer types, which are:

int int8 int16 int32 int64
uint uint8 uint16 uint32 uint64

(plus the aliases byte for uint8 and rune for int32). That's a lot, but the pattern in the way constants work should be familiar enough by now that you can see how things will play out.

As mentioned above, integers come in a couple of forms and each form has its own default type: int for simple constants like 123 or 0xFF or -14 and rune for quoted characters like 'a', '世' or '\r'.
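A quick way to see those defaults is to let short variable declarations pick the type and print it with %T (a sketch; the variable names are ours):

```go
package main

import "fmt"

func main() {
	i := 123 // simple constant: default type int
	r := 'a' // quoted character: default type rune (an alias for int32)
	fmt.Printf("%T %T\n", i, r) // int int32
}
```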

No constant form has as its default type an unsigned integer type. However, the flexibility of untyped constants means we can initialize unsigned integer variables using simple constants as long as we are clear about the type. It's analogous to how we can initialize a float64 using a complex number with zero imaginary part. Here are several different ways to initialize a uint; all are equivalent, but all must mention the type explicitly for the result to be unsigned.

var u uint = 17
var u = uint(17)
u := uint(17)

Similarly to the range issue mentioned in the section on floating-point values, not all integer values can fit in all integer types. There are two problems that might arise: the value might be too large, or it might be a negative value being assigned to an unsigned integer type. For instance, int8 has range -128 through 127, so constants outside of that range can never be assigned to a variable of type int8:

package main

func main() {
    var i8 int8 = 128 // Error: too large.
    _ = i8
}

Similarly, uint8, also known as byte, has range 0 through 255, so a large or negative constant cannot be assigned to a uint8:

package main

func main() {
    var u8 uint8 = -1 // Error: negative value.
    _ = u8
}

This type-checking can catch mistakes like this one:

package main

func main() {
    type Char byte
    var c Char = '世' // Error: '世' has value 0x4e16, too large.
    _ = c
}

If the compiler complains about your use of a constant, it's likely a real bug like this.

An exercise: The largest unsigned int

Here is an informative little exercise. How do we express a constant representing the largest value that fits in a uint? If we were talking about uint32 rather than uint, we could write

const MaxUint32 = 1<<32 - 1

but we want uint, not uint32. The int and uint types have equal unspecified numbers of bits, either 32 or 64. Since the number of bits available depends on the architecture, we can't just write down a single value.
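That architecture dependence can be observed at run time; a sketch (not part of the original exercise) using unsafe.Sizeof:

```go
package main

import (
	"fmt"
	"unsafe"
)

func main() {
	var u uint
	// Sizeof reports the size in bytes; multiply by 8 for bits.
	fmt.Println(unsafe.Sizeof(u) * 8) // 32 or 64, depending on the architecture
}
```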

Fans of two's-complement arithmetic, which Go's integers are defined to use, know that the representation of -1 has all its bits set to 1, so the bit pattern of -1 is internally the same as that of the largest unsigned integer. We therefore might think we could write

package main

func main() {
    const MaxUint uint = -1 // Error: negative value
}

but that is illegal because -1 cannot be represented by an unsigned variable; -1 is not in the range of unsigned values. A conversion won't help either, for the same reason:

package main

func main() {
    const MaxUint uint = uint(-1) // Error: negative value
}

Even though at run-time a value of -1 can be converted to an unsigned integer, the rules for constant conversions forbid this kind of coercion at compile time. That is to say, this works:

package main

func main() {
    var u uint
    var v = -1
    u = uint(v)
    _ = u
}

but only because v is a variable; if we made v a constant, even an untyped constant, we'd be back in forbidden territory:

package main

func main() {
    var u uint
    const v = -1
    u = uint(v) // Error: negative value
    _ = u
}

We return to our previous approach, but instead of -1 we try ^0, the bitwise negation of an arbitrary number of zero bits. But that fails too, for a similar reason: In the space of numeric values, ^0 represents an infinite number of ones, so we lose information if we assign that to any fixed-size integer:

package main

func main() {
    const MaxUint uint = ^0 // Error: overflow
}

How then do we represent the largest unsigned integer as a constant?

The key is to constrain the operation to the number of bits in a uint and to avoid values, such as negative numbers, that are not representable in a uint. The simplest uint value is the typed constant uint(0). If uints have 32 or 64 bits, uint(0) has 32 or 64 zero bits accordingly. If we invert each of those bits, we'll get the correct number of one bits, which is the largest uint value.

Therefore we don't flip the bits of the untyped constant 0, we flip the bits of the typed constant uint(0). Here, then, is our constant:

package main

import "fmt"

func main() {
    const MaxUint = ^uint(0)
    fmt.Printf("%x\n", MaxUint)
}

Whatever the number of bits it takes to represent a uint in the current execution environment (on the playground, it's 32), this constant correctly represents the largest value a variable of type uint can hold.
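A quick sanity check: since every bit of MaxUint is set, adding 1 wraps around to zero. Note that the addition must go through a variable, because constant arithmetic would reject the overflow at compile time:

```go
package main

import "fmt"

func main() {
	const MaxUint = ^uint(0)
	u := MaxUint // copy into a variable so the addition wraps at run time
	fmt.Println(u+1 == 0) // true
}
```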

If you understand the analysis that got us to this result, you understand all the important points about constants in Go.


The concept of untyped constants in Go means that all the numeric constants, whether integer, floating-point, complex, or even character values, live in a kind of unified space. It's when we bring them to the computational world of variables, assignments, and operations that the actual types matter. But as long as we stay in the world of numeric constants, we can mix and match values as we like. All these constants have numeric value 1:

1
1.000
1e3 - 99.0*10.0 - 9
'\x01'
'\u0001'
'b' - 'a'
1.0 + 3i - 3.0i

Therefore, although they have different implicit default types, written as untyped constants they can be assigned to a variable of any integer type:

package main

import "fmt"

func main() {
    var f float32 = 1
    var i int = 1.000
    var u uint32 = 1e3 - 99.0*10.0 - 9
    var c float64 = '\x01'
    var p uintptr = '\u0001'
    var r complex64 = 'b' - 'a'
    var b byte = 1.0 + 3i - 3.0i

    fmt.Println(f, i, u, c, p, r, b)
}

The output from this snippet is: 1 1 1 1 1 (1+0i) 1.

You can even do nutty stuff like

package main

import "fmt"

func main() {
    var f = 'a' * 1.5
    fmt.Println(f)
}

which yields 145.5, pointless except to prove a point.

But the real point of these rules is flexibility. That flexibility means that, despite the fact that in Go it is illegal to mix floating-point and integer variables, or even int and int32 variables, in the same expression, it is fine to write

sqrt2 := math.Sqrt(2)


const millisecond = time.Second/1e3


bigBufferWithHeader := make([]byte, 512+1e6)

and have the results mean what you expect.
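All three snippets compile because the untyped constants 2, 1e3, and 1e6 adapt to float64, time.Duration, and int respectively; combined into one runnable sketch:

```go
package main

import (
	"fmt"
	"math"
	"time"
)

func main() {
	sqrt2 := math.Sqrt(2)                        // untyped 2 becomes float64
	const millisecond = time.Second / 1e3        // untyped 1e3 joins a time.Duration constant
	bigBufferWithHeader := make([]byte, 512+1e6) // untyped 512+1e6 becomes int

	fmt.Println(sqrt2, millisecond, len(bigBufferWithHeader))
}
```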

Because in Go, numeric constants work as you expect: like numbers.


Ian Ozsvald: Python Training courses: Data Science and High Performance Python coming in October

 ∗ Planet Python

I’m pleased to say that via our ModelInsight we’ll be running two Python-focused training courses in October. The goal is to give you strong new research & development skills. They’re aimed at folks in companies but would suit folks in academia too. UPDATE: training courses ready to buy (1 Day Data Science, 2 Day High Performance).

UPDATE we have a <5min anonymous survey which helps us learn your needs for Data Science training in London, please click through and answer the few questions so we know what training you need.

“Highly recommended – I attended in Aalborg in May: … upcoming Python DataSci/HighPerf training courses” @ThomasArildsen

These and future courses will be announced on our London Python Data Science Training mailing list, sign-up for occasional announces about our upcoming courses (no spam, just occasional updates, you can unsubscribe at any time).

Intro to Data science with Python (1 day) on Friday 24th October

Students: Basic to Intermediate Pythonistas (you can already write scripts and you have some basic matrix experience)

Goal: Solve a complete data science problem (building a working and deployable recommendation engine) by working through the entire process – using numpy and pandas, applying test driven development, visualising the problem, deploying a tiny web application that serves the results (great for when you’re back with your team!)

  • learn basic numpy, pandas and data cleaning
  • be confident with Test Driven Development and debugging strategies
  • create a recommender system and understand its strengths and limitations
  • use a Flask API to serve results
  • learn Anaconda and conda environments
  • take home a working recommender system that you can confidently customise to your data
  • £300 including lunch, central London (24th October)
  • additional announces will come via our London Python Data Science Training mailing list
  • Buy your ticket here

High Performance Python (2 day) on Thursday+Friday 30th+31st October

Students: Intermediate Pythonistas (you need higher performance for your Python code)

Goal: learn high performance techniques for performant computing, a mix of background theory and lots of hands-on pragmatic exercises

  • Profiling (CPU, RAM) to understand bottlenecks
  • Compilers and JITs (Cython, Numba, Pythran, PyPy) to pragmatically run code faster
  • Learn r&d and engineering approaches to efficient development
  • Multicore and clusters (multiprocessing, IPython parallel) for scaling
  • Debugging strategies, numpy techniques, lowering memory usage, storage engines
  • Learn Anaconda and conda environments
  • Take home years of hard-won experience so you can develop performant Python code
  • Cost: £600 including lunch, central London (30th & 31st October)
  • additional announces will come via our London Python Data Science Training mailing list
  • Buy your ticket here

The High Performance course is built off of many years teaching and talking at conferences (including PyDataLondon 2013, PyCon 2013, EuroSciPy 2012) and in companies along with my High Performance Python book (O’Reilly). The data science course is built off of techniques we’ve used over the last few years to help clients solve data science problems. Both courses are very pragmatic, hands-on and will leave you with new skills that have been battle-tested by us (we use these approaches to quickly deliver correct and valuable data science solutions for our clients via ModelInsight). At PyCon 2012 my students rated me 4.64/5.0 for overall happiness with my High Performance teaching.

“@ianozsvald [..] Best tutorial of the 4 I attended was yours. Thanks for your time and preparation!” @cgoering

We’d also like to know which other courses you’d like to learn, we can partner with trainers as needed to deliver new courses in London. We’re focused around Python, data science, high performance and pragmatic engineering. Drop me an email (via ModelInsight) and let me know if we can help.

Do please join our London Python Data Science Training mailing list to be kept informed about upcoming training courses.

Ian applies Data Science as an AI/Data Scientist for companies in ModelInsight, sign-up for Data Science tutorials in London. Historically Ian ran Mor Consulting. He also founded the image and text annotation API, co-authored SocialTies, programs Python, authored The Screencasting Handbook, lives in London and is a consumer of fine coffees.

Catherine Devlin: %sql: To Pandas and Back

 ∗ Planet Python

A Pandas DataFrame has a nice to_sql(table_name, sqlalchemy_engine) method that saves itself to a database.

The only trouble is that coming up with the SQLAlchemy Engine object is a little bit of a pain, and if you're using the IPython %sql magic, your %sql session already has an SQLAlchemy engine anyway. So I created a bogus PERSIST pseudo-SQL command that simply calls to_sql with the open database connection:

%sql PERSIST mydataframe

The result is that your data can make a very convenient round-trip from your database, to Pandas and whatever transformations you want to apply there, and back to your database:

In [1]: %load_ext sql

In [2]: %sql postgresql://@localhost/
Out[2]: u'Connected: @'

In [3]: ohio = %sql select * from cities_of_ohio;
246 rows affected.

In [4]: df = ohio.DataFrame()

In [5]: montgomery = df[df['county']=='Montgomery County']

In [6]: %sql PERSIST montgomery
Out[6]: u'Persisted montgomery'

In [7]: %sql SELECT * FROM montgomery
11 rows affected.
[(27L, u'Brookville', u'5,884', u'Montgomery County'),
(54L, u'Dayton', u'141,527', u'Montgomery County'),
(66L, u'Englewood', u'13,465', u'Montgomery County'),
(81L, u'Germantown', u'6,215', u'Montgomery County'),
(130L, u'Miamisburg', u'20,181', u'Montgomery County'),
(136L, u'Moraine', u'6,307', u'Montgomery County'),
(157L, u'Oakwood', u'9,202', u'Montgomery County'),
(180L, u'Riverside', u'25,201', u'Montgomery County'),
(210L, u'Trotwood', u'24,431', u'Montgomery County'),
(220L, u'Vandalia', u'15,246', u'Montgomery County'),
(230L, u'West Carrollton', u'13,143', u'Montgomery County')]


Brendan Scott: Permission problems with Kivy/Android

 ∗ Planet Python

Kivy is not very helpful when you are tracking down permission problems.  If everything else seems to be working fine, make sure that you have the right Android permissions to access the relevant Android services/hardware.  This is the log output for not having the CAMERA permission:

I/python  (11009):    File “[]/android_zbar_qrcode_master/.buildozer/android/app/”, line 170, in start
I/python  (11009):    File “jnius_export_class.pxi”, […]
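For reference, in a Buildozer-packaged Kivy project the fix is to declare the permission in buildozer.spec; a sketch of the relevant excerpt (the CAMERA entry matches the example above):

```ini
# buildozer.spec (excerpt)
[app]
# Comma-separated list of Android permissions the app requests.
android.permissions = CAMERA
```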



Go at OSCON

 ∗ The Go Programming Language Blog


What happens in Portland in July? OSCON! At this year's conference, Go was more present than ever before, with five talks, two workshops, a Birds of a Feather session, and a meetup.

Talks


Matt Stine talked about his experience switching from Java to Go with A recovering Java developer learns Go while Steve Francia presented Painless Data Storage with MongoDB and Go. Steve also presented Go for Object Oriented Programmers, where he explained how some object oriented concepts can be implemented in Go.

Finally, Josh Bleecher Snyder talked about his experience writing tools to work with Go source code in Gophers with hammers, and Francesc Campoy talked about all the things that could have gone wrong and what the Go team did to prevent them in Inside the Go playground.

Workshops


At the beginning of OSCON's workshop day, Steve Francia presented how to build a web application and a CLI tool during Getting started with Go to a big room full of Gophers.

In the afternoon, Chris McEniry gave his Quick introduction to system tools programming with Go where he went over some useful skills to write system tools using Go and its standard library.

Additional events

To take advantage of the increased Gopher population in Portland during OSCON, we organized two extra events: the first PDXGolang meetup and a Birds of a Feather session.

At the meetup Francesc Campoy talked about Go Best Practices and Kelsey Hightower gave a great introduction to Kubernetes, a container management system for clusters written in Go by Google. If you live in Portland, make sure you join the group and come along to the next meeting.

The "Birds of a Feather" (or, more aptly, "Gophers of a Feather") was a lot of fun for everyone involved. We hope to see more of you there next year.

In conclusion

Thanks to all the gophers that participated in OSCON. After the successes of this year we look forward to more Go fun at OSCON 2015.



An Event Apart: Atomic Design

 ∗ LukeW | Digital Product Design + Strategy

In his Atomic Design talk at An Event Apart in Chicago IL, Brad Frost talked about the benefits of design systems on the Web and shared some tools he's created to help people work quickly with responsive design. Here are my notes from his talk:

  • We've focused on designing Web pages for long time. The idea of the printed page doesn't make sense anymore.
  • Pixel perfection & designing the same experience for all devices is not possible. Phones, tablets, and laptops are not the same.
  • What are the building blocks of design? Things that go beyond typography and color choices. What interaction components are important?
  • There are a lot of frameworks available for responsive design but they come with issues: one-size fits all requirements, lookalike issues, potential for bloat and un-needed stuff, they might not do everything you need, you have to subscribe to someone else's code structures.
  • Responsive deliverables should look a lot like fully-functioning Twitter Bootstrap-style systems custom tailored for your clients' needs. -Dave Rupert
  • Instead of page-based designs. We need design systems. Lot of people have been exploring design systems and the processes needed to design like this.
  • Front-end style guides make things easier to test, provide you a better workflow and shared vocabulary. Examples exist from MailChimp, Starbucks, Yelp, and more.
  • More patterns create more problems. You need to dedicate people to create and manage a style library. Over time this becomes an auxiliary project which may be seen only as a designer/developer tool. These libraries are often incomplete and only serve present cases. Avoid developing a "spray of modules".

Atomic Design

  • At the end of the day, we're working with atoms - with atomic units. Atoms combine to form elements, elements create molecules, molecules create organisms, and so on. All matter is comprised of atoms.
  • Atoms on the Web are HTML elements like: labels, inputs, buttons, headers, etc. More abstract atoms include colors and fonts. These can't be broken down any further without losing their meaning. Atoms are often not too useful on their own but they do allow you to see your global styles laid out at once.
  • The real power comes from combining atoms into molecules. An input field and button can be combined into things like a search box. Molecules are groups of atoms bonded together. They are more concrete than atoms and encourage a “do one thing and do it well” approach.
  • Organisms are groups of molecules joined together to form a distinct section. They can consist of similar and/or different molecule types.
  • Templates allow you to place organisms inside of Web pages. They begin life as wireframes and increase in fidelity over time. Templates are client facing and eventually become the deliverable/production code.
  • Pages are a specific instance of a template. They are high fidelity when representational content is replaced with real content. Pages test the effectiveness of a template: how does it scale and stretch to different kinds of content.
  • Atomic design allows us to traverse from abstract to concrete. Creators can focus on the atoms and molecules and Clients can focus on pages and templates.

Pattern Lab

  • Pattern Lab is a comprehensive custom component library, a pattern starter kit, a design system builder, a practical viewport resizer, and a design annotation tool.
  • Pattern Lab is not a UI framework.
  • Pattern Lab allows you to style up individual components as you build pages quickly using includes built with Mustache (logic-less templates). The page can be assembled fast through the pre-built components (organisms) in Pattern Lab. So right away you can test the effectiveness of templates. At the same time, people can be working on refining designs and more.
  • Templates allow you to see the structure of content without filling it with real content (up front). Includes within each template allow you to stitch components together quickly and see how things fit together. A local JSON file can be used to fill in these includes with real data/content.
  • Pattern Lab includes Ish which is a viewport resizer that gives you a different sized screen in small, medium, large sizes so you don't get stuck on specific device sizes like other tools suggest (480, 768, 1024, etc.).
  • Annotations allow you to make specific notes on interface components. This explains some of the design decisions made in an interface.
  • Lineage gives you a list of where components are used in your site.
  • Why use Pattern Lab? It fills the post-PSD void and serves as a hub for the entire design process for everyone: information architects, designers, developers, and clients. It allows you to easily traverse from abstract to concrete. You can write whatever HTML/CSS/JS as you please. Pattern Lab encourages flexibility and document as you go.


  • What's the hardest part of responsive Web design? Most people say people and process.
  • Set the right expectations. Our reality still consists of design review meetings where people look over static mock-ups. But we can't sell Web sites like paintings. They are living things that fill a variety of containers.
  • The waterfall process doesn't allow designers and developers to collaborate closely. This model is broken, we need to work together along the way.
  • Gather information through interviews, analytics, style inventories, and more to collect everything you need.
  • Do an interface inventory: document your interface, promote consistency, establish scope, and lay the groundwork for a future style guide or pattern library.
  • Establish direction: define site-wide patterns and start pulling these components into a Pattern Lab environment. This allows everyone to keep working on their various components: type, IA, colors, etc.
  • Typecast is a tool that allows you to try out a number of different typefaces on common interface elements. This helps isolate decisions.
  • This kind of iterative approach lets you keep iterating and making decisions as you go. Design, build, test, repeat. When you're finished changing, you're finished.
  • Collaboration and communication trumps deliverables.

An Event Apart: Designing in the Space Between Devices

 ∗ LukeW | Digital Product Design + Strategy

In his Mind the Gap: Designing in the Space Between Devices talk at An Event Apart in Chicago IL 2014, Josh Clark talked about the opportunities and challenges of designing interactions between devices. Here are my notes from his talk:

  • We're starting to get a better handle on how to get our services and information on our screens. But we need to also consider the gaps between our devices.
  • 15 years ago, Palm Pilots made it really easy to beam contact information between devices. This has gotten harder today. We have multiple devices and multiple standards to contend with.
  • In the UK, people switch devices 21 times per hour (TV and smartphone). 90% of people move between devices and use them simultaneously.
  • People hack connections between devices: search, email, and text. The systems we have aren't ready to cross the gap between devices.
  • We are still using "remote control" style interactions to get things between devices. We still beam things to screens today.
  • Synching content has gotten much better for static content. But our challenge is to create interactions that allow us to shift behaviors and content across devices.
  • Apple recently announced Handoff, which allows you to move tasks seamlessly between devices: start an email on your phone, continue it instantly on your laptop.
  • For sequenced tasks, we can use inputs like cameras to bring up content on different screens.
  • For simultaneous tasks, we can use technologies like Web sockets and Web RTC to create browser to browser interactions that happen across devices in real time.
  • It's not enough to just sync content across devices. How do we sync verbs, not just nouns?
  • Bluetooth LE provides identification of devices so they can interact with objects.
  • How can we have computers literally "talk" to each other? You can use Web audio to beam a unique ultrasound signal from one device to another. That can trigger actions or even authentication.
  • Screens encumber and constrain us. They create opportunity but anchor and distract us from the real world.
  • Don't design for screens, design for people.
  • Make use of physical real-world interactions. Examples: DrumPants allow you to tap your pants to make sounds. Our goal isn't to make things frivolous or silly; designing toys allows us to explore and learn.
  • The physical nature of interfaces can make them feel more natural. Proximity has a lot of potential as an interaction model.
  • Cross-device interactions are not challenges of technology today. They're challenges of imagination. We have the tools to make this happen today but need to apply ourselves to this kind of design.
  • We have so much technology, we can create new magic on a daily basis.

Digital and Physical

  • Digital interfaces have been becoming more physical over the past few years as we've created more mobile computers.
  • At the same time, physical objects are becoming digital. It is increasingly easy to add connectivity and sensors to everyday objects. Anything can become a source of data and a controller of data.
  • Our job as designers is to help people be as lazy as possible.
  • Anything that can be connected will be connected. Everyday objects are now digital gadgets. How can I usefully sync actions across all of these objects.
  • Physical things have digital representations: user reviews, histories, etc.
  • Social machines: what does it mean when devices are participating in our digital lives? When our objects and places are participating in our digital lives. We don't want noisy interfaces in our lives: don't make everything full of personality.
  • Software makes hardware scale. An endless variety of content can come from devices connected to people. The lifespan and durability of hardware can be extended through software.
  • LG has created a chat interface for their appliances. You can chat with them to collect information and start/stop tasks.
  • Connected devices won't always say nice things. They can be hacked like other devices.
  • Cars, appliances, and more can have security and privacy issues we need to be mindful of as well.
  • Software is ideology, embedded with values. The genie is out of the bottle, the systems are out there already. So it is up to us to do the right thing. Be conscious about the behavior your interfaces shape.
  • Honor intention but don't assume it. Knowing the physical facts isn't enough to assume what people want to do.
  • We're already switching between devices. Plan for it in your designs.
  • Don't just sync content and files, sync status, actions and processes.
  • Peer to peer technologies (even in the browser) can make connections between devices.
  • Think about how you can move interactions off of screens and onto sensors. It's not a challenge of technology.

An Event Apart: UX Strategy Means Business

 ∗ LukeW | Digital Product Design + Strategy

In his presentation at the An Event Apart in Chicago IL, Jared Spool walked through the importance of content and user experience for businesses. Here are my notes from his talk:

Content Matters

  • Everything we make has content in it. Everything we make has design. We can't silo these things. Everything we design is a combination of content and the experience of interacting with that content and service.
  • Content and user experience cannot be separated. The delivery of content is as important as the content itself. Great content plus great design equals great user experience.
  • Content is at the center of many experiences' successes and failures. Apple's iOS 6 maps had great interaction design but poor content, which made it an overall failure. They underestimated the complexity of mapping content.
  • Google had over 10,000 person-years invested in correcting mapping errors. That's quite a head start. Apple did not see the problem in Maps coming. It cost them dearly.
  • Delivering the content is as important as the content itself.
  • We use the word strategy all the time -so what does it really mean? A strategy is a plan to achieve a desired outcome. When things don't work, do we have the right strategy?
  • Too often, a UX strategy can't predict outcomes. If your strategy can't predict outcomes, then the strategy is broken.
  • We have to go back to basics. With strategy, we have to go back to business models.

Business Models & Design

  • Amazon can make money even when they sell products at cost. They turn their inventory every 20 days. Best Buy turns it every 74 days. Standard retail payments are 45 days. Amazon has cash float which earns them interest while others are waiting for payment.
  • Business models are designed. Even non-profits need a business plan and a sustainable business model. They have to make a profit, they just don't distribute it to shareholders.
  • When we don't understand the business model, we can't design the right experience.
  • What are our business model options?
  • Executives care about 5 things: increase revenue, decrease cost, increase new business, increase existing business, increase shareholder value.
  • Take anything you design and evaluate it through these considerations. What levers is it moving and why?
  • Zappos allows you to easily return things you buy on the site. How does this map to business model considerations? Clear instructions and labels on how to return products allowed them to decrease support costs and to increase the amount of products people keep, which increases revenue. You can see the effect of design on these five business principles.
  • Not everything we do can fit into business priorities. Know how to map what you do to what creates value for your organization. This applies to businesses, academic institutions, government, and more.
  • We need to connect the dots between what we do and what executives care about (5 business priorities). Know what the organization is trying to do so you can support the business.


  • "Find the Content": go to an advertising supported Website and try to find the content.
  • Everything in an advertising model is designed to give you the experience you don't want. To distract you from what you do want.
  • Advertising may increase revenue, but they don't move other business priorities. There is constant tension between ads and experience and it is getting worse.
  • Out of 1,707 ads you may click on 1.7 (or about .1%). Banner ads are clicked .04% of the time. 31% of ads are never seen by users (off screen, etc.). 50% of mobile ads are clicked by accident. You are 475x more likely to survive a plane crash than to click on a banner ad.
  • Advertising only increases revenue, it doesn't move any of the other levers in a business model.
  • When we don't pay for the product, we are the product.
  • On Walgreens web site, 58% of the clicks go to elements that take up 3.9% of the space on a screen.
  • Advertising is extortion. We sell ads to advertisers, then charge users to remove ads from their experience.
  • Can ads work? Yes. But in a specific context only.
  • Seducible moments are when you can get users to take action. Advertising in seducible moments can work. But invasive broad advertising doesn't.
  • Advertising should be the business model of last resort.

Business Models Beyond Advertising

  • New York Times allows people to read ten articles per month for free. After that, you need to pay for a subscription plan. The NYT now makes more money from the metered paywall than from ads.
  • Metered paywalls allow newspapers to earn more money than they do from advertising.
  • To make this model work, you need to have excellent content that people will pay for.
  • Re-purposed content, supporting product sales, in-app purchases, alternative channel revenue, content distribution, and more allow you to make money in different ways.
  • Repurposed content is selling content in different formats like blog posts in books.
  • In app sales bring in 3 times the revenue of advertising for mobile apps.
  • Become familiar with different business models, so you know the options your designs can move.

Creating Delightful Experiences

  • Poor content hurts our business dramatically. But does great content help the business?
  • Good content can add value to products.
  • In a study, adding a narrative to products increased their value.
  • Participants in a study spent less of their budget at Walmart (89%) but much more at Crutchfield (237%). The difference was custom, well-written content.
  • Delightful content is not free. We need a business model to support it.
  • We can't just think in terms of screens and UIs, we need to understand the business as well. Map user experience and delightful content to the business you are trying to create.
  • The best UX strategists create the right experience by understanding how business works.

10 Years Ago in ALA: Pocket Sized Design

 ∗ A List Apart: The Full Feed

The web doesn’t do “age” especially well. Any blog post or design article more than a few years old gets a raised eyebrow—heck, most people I meet haven’t read John Allsopp’s “A Dao of Web Design” or Jeffrey Zeldman’s “To Hell With Bad Browsers,” both as relevant to the web today as when they were first written. Meanwhile, I’ve got books on my shelves older than I am; most of my favorite films came out before I was born; and my iTunes library is riddled with music that’s decades, if not centuries, old.

(No, I don’t get invited to many parties. Why do you ask oh I get it)

So! It’s probably easy to look at “Pocket-Sized Design,” a lovely article by Jorunn Newth and Elika Etemad that just turned 10 years old, and immediately notice where it’s beginning to show its age. Written at a time when few sites were standards-compliant, and even fewer still were mobile-friendly, Newth and Etemad were urging us to think about life beyond the desktop. And when I first re-read it, it’s easy to chuckle at the points that feel like they’re from another age: there’s plenty of talk of screens that are “only 120-pixels wide”; of inputs driven by stylus, rather than touch; and of using the now-basically-defunct handheld media type for your CSS. Seems a bit quaint, right?

And yet.

Looking past a few of the details, it’s remarkable how well the article’s aged. Modern users may (or may not) manually “turn off in-line image loading,” but they may choose to use a mobile browser that dramatically compresses your images. We may scoff at the idea of someone browsing with a stylus, but handheld video game consoles are impossibly popular when it comes to browsing the web. And while there’s plenty of excitement in our industry for the latest versions of iOS and Android, running on the latest hardware, most of the web’s growth is happening on cheaper hardware, over slower networks (PDF), and via slim data plans—so yes, 10 years on, it’s still true that “downloading to the device is likely to be [expensive], the processors are slow, and the memory is limited.”

In the face of all of that, what I love about Newth and Etemad’s article is just how sensible their solutions are. Rather than suggesting slimmed-down mobile sites, or investing in some device detection library, they take a decidedly standards-focused approach:

Linearizing the page into one column works best when the underlying document structure has been designed for it. Structuring the document according to this logic ensures that the page organization makes sense not only in Opera for handhelds, but also in non-CSS browsers on both small devices and the desktop, in voice browsers, and in terminal-window browsers like Lynx.

In other words, by thinking about the needs of the small screen first, you can layer on more complexity from there. And if you’re hearing shades of mobile first and progressive enhancement here, you’d be right: they’re treating their markup—their content—as a foundation, and gently layering styles atop it to make it accessible to more devices, more places than ever before.

So, no: we aren’t using @media handheld or display: none for our small screen-friendly styles—but I don’t think that’s really the point of Newth and Etemad’s essay. Instead, they’re putting forward a process, a framework for designing beyond the desktop. What they’re arguing is for a truly device-agnostic approach to designing for the web, one that’s as relevant today as it was a decade ago.

Plus ça change, plus c’est la même chose.


Monday, 25 August



Tomer Filiba: D for the Win

 ∗ Planet Python

I'm a convert! I've seen the light!

By the way, be sure to read part 2 as well.

You see, Python is nice and all and it excels in so many domains, but it was not crafted for the ever growing demands of the industry. Sure, you can build large-scale projects in Python (and I have built some), but once you take it out of the lab and into the real world, the price you pay is just too high. Literally. In terms of work per CPU cycle, you can't do worse.

The C10M problem is a reiteration of the C10K problem. In short, today's commodity hardware can handle millions of packets per second, but in reality you hardly ever reach such numbers. For example, I worked a short while at a company that used AWS and had tens of twisted-based Python servers accepting and logging requests (not doing any actual work). They managed to squeeze ~500 requests/sec out of this setup (per machine), which escalated in cost rather quickly. Moving to PyPy (not without trouble) did triple the numbers or so, but still, the cost simply didn't scale.

Python, I love you, but you help instill Gates' law -- "The speed of software halves every 18 months". In the end, we pay for our CPU cycles and we want to maximize our profit. It's not you, Guido, it's me. I've moved on to the C10M world, and for that I need a programming language designed for system programming with a strong and modern type system (after all, I love duck typing). I need to interface with external systems, so a C ABI is desirable (no foreign function interface), and meta-programming is a huge plus (so I won't need to incorporate cumbersome code generation in my build system). Not to mention that mission-critical code can't allow for the occasional NameError or NoneType has no member __len__ exception. The code must compile.

I've looked into rust (nice, but will require a couple of years to mature enough for a large-scale project) and go (Google must be joking if they actually consider it for system programming), but as strange as it may sound, I've finally found what I've been looking for with D.

Dlang Dlang Über Alles

System programming is a vast ocean of specifics, technicalities and constraints, imposed by your specific needs. Instead of boring you to death with that, I thought it would be much more intriguing to compare D and Python. In other words, I'll try to show how D speaks fluent Python.

But first things first. In (the probable) case you don't know much D -- imagine it's what C++ would have dreamed to be. It offers cleaner syntax, much shorter compilation times, (optional) garbage collection, highly expressive templates and type inference, Pythonic operator overloading (implemented as rewriting), object-oriented and functional capabilities (multi-paradigm like Python), intermingles high-level constructs (like closures) with low-level ones (naked functions in inline assembly) to produce efficient code, has strong compile-time introspection capabilities and some extra cool features in the domain of code generation: mixin -- which evaluates an arbitrary string of D code at compile time, and CTFE -- compile-time function execution. Whoa, that was long.

In general, D follows Python's duck-typed (or protocol-oriented) spirit. If a type provides the necessary interface ("protocol") for an operation, it will just work, but you can also test for compliance at compile time. For example, ranges are a generalization of generators in Python. All you need to do in order to be an InputRange is implement bool empty(), void popFront() and auto front(), and you can use isInputRange!T to test whether T adheres the protocol. By the way, the exclamation point (!), which we'll soon get acquainted with, distinguishes compile-time arguments from runtime ones.
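The Python half of that analogy is worth making concrete. Python's iterator protocol is the runtime cousin of D's InputRange: implement the right methods and any object "just works", and you can even test compliance (at runtime, rather than at compile time) via the abstract base classes. A minimal sketch:

```python
from collections.abc import Iterator

# Python's equivalent of a D "protocol": any object with __iter__ and
# __next__ is an iterator -- no base class or registration required.
class Countdown:
    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= 0:
            raise StopIteration
        self.n -= 1
        return self.n + 1

# The runtime analogue of D's isInputRange!T compile-time check:
print(isinstance(Countdown(3), Iterator))  # True
print(list(Countdown(3)))                  # [3, 2, 1]
```

The difference, of course, is that D's check happens at compile time, so a non-conforming type fails the build rather than the request.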

For brevity's sake, I'm not going to demonstrate all the properties I listed up there. Instead, I'll show why Python programmers ought to love D.

Case Study #1: Generating HTML

In an old blog post I outlined my vision of HTML templating languages: kill them all. I argued they are all but crippled-down forms of Python with an ugly syntax, so just give me Python and an easy way to programmatically manipulate the DOM.

I've later extended the sketch into a library in its own right, named srcgen. You can use it to generate HTML, C-like languages and Python/Cython code. I used it in many of my commercial projects when I needed to generate code.

So here's an excerpt of how it's done in srcgen:

def buildPage():
    doc = HtmlDocument()
    with doc.head():
        doc.title("das title")
        doc.link(rel="foobar", type="text/css")

    with doc.body():
        with doc.div(class_="mainDiv"):
            with doc.ul():
                for i in range(5):
                    with doc.li(id=str(i), class_="listItem"):
                        doc.text("I am bulletpoint #", i)

    return doc.render()

And here's how it's done in D:

auto buildPage() {
    auto doc = new Html();

    with (doc) {
        with (head) {
            title("das title");
            link[$.rel = "foobar", $.type = "text/css"];
        }

        with (body_) {
            with (div[$.class_ = "mainDiv"]) {
                with (ul) {
                    foreach (i; 0 .. 5) {
                        with (li[$.id = i, $.class_ = "listItem"]) {
                            text("I am bulletpoint #");
                            text(i);
                        }
                    }
                }
            }
        }
    }

    return doc.render();
}

You can find the source code on github, just keep in mind it's a sketch I wrote for this blog post, not a feature-complete library.

The funny thing is, Python's with and D's with are not even remotely related! The Python implementation builds a stack of context managers, while with in D merely alters symbol lookup. But lo and behold! The two versions are practically identical, modulo curly braces. You get the same expressive power in both.
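For the curious, the Python side can be sketched in a few lines: each with enters a context manager that pushes an opening tag and pops the closing tag on exit. This is a toy illustration of the idea, not srcgen's actual implementation:

```python
from contextlib import contextmanager

class Doc:
    """A toy HTML builder: each `with` pushes a tag, then pops it on exit."""
    def __init__(self):
        self.parts = []

    @contextmanager
    def tag(self, name):
        self.parts.append(f"<{name}>")
        yield
        self.parts.append(f"</{name}>")

    def text(self, s):
        self.parts.append(s)

    def render(self):
        return "".join(self.parts)

doc = Doc()
with doc.tag("ul"):
    for i in range(2):
        with doc.tag("li"):
            doc.text(f"item {i}")

print(doc.render())  # <ul><li>item 0</li><li>item 1</li></ul>
```

D's with needs none of this machinery, since it only changes symbol lookup -- which is why the two programs look the same while working completely differently.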

Case Study #2: Construct

But the pinnacle is clearly my D version of Construct. You see, I've been struggling for many years to create a compiled version of Construct. Generating efficient, static code from declarative constructs would make the library capable of handling real-world data, like packet sniffing or processing of large files. In other words, you won't have to write a toy parser in Construct and then rewrite it (by hand) in C++.

The issues with my C version of Construct were numerous, but they basically boiled down to the fact that I needed a stronger object model to represent strings, dynamic arrays, etc., and adapters. The real power of Construct comes from adapters, which operate at the representational ("DOM") level of the data, rather than on its binary form. That required lambdas, closures and other higher-level concepts that C lacks. I even tried writing a Haskell version, given that Haskell is so high-level and functional, but my colleague and I gave up hope after a while.

Last week, it struck me that D could be the perfect candidate: it has all the necessary high-level concepts while being able to generate efficient code with meta-programming. I began fiddling with a D version, which proved extremely promising. So without further ado, I present dconstruct -- an initial sketch of the library.

This is the canonical PascalString declaration in Python:

>>> pascal_string = Struct("pascal_string",
...     UBInt8("length"),
...     Array(lambda ctx: ctx.length, Field("data", 1),),
... )
>>> pascal_string.parse("\x05helloXXX")
Container({'length': 5, 'data': ['h', 'e', 'l', 'l', 'o']})
>>> pascal_string.build(Container(length=5, data="hello"))
'\x05hello'

And here's how it's done in D:

struct PascalString {
    Field!ubyte length;
    Array!(Field!ubyte, "length") data;

    // the equivalent of 'Struct' in Python,
    // to avoid confusion of keyword 'struct' and 'Struct'
    mixin Record;
}

PascalString ps;
auto stream = cast(ubyte[])"\x05helloXXXX".dup;
ps.unpack(stream);
// {length: 5, data: [104, 101, 108, 108, 111]}

Through the use of meta-programming (and assuming inlining and optimizations), that code snippet there actually boils down to something like

struct PascalString {
    ubyte length;
    ubyte[] data;

    void unpack(ref ubyte[] stream) {
        length = stream[0];
        stream = stream[1 .. $]; // advance stream
        data = stream[0 .. length];
        stream = stream[length .. $];  // advance stream
    }
}

Which is as efficient as it gets.

But wait, there's more! The real beauty here is how we handle the context. In Python, Construct builds a dictionary that travels along the parsing/building process, allowing constructs to refer to previously seen objects. This is possible in D too, of course, but it's highly inefficient (and not type safe). Instead, dconstruct uses a trick that's commonly found in template-enabled languages -- creating types on demand:

struct Context(T, U) {
    T* _curr;
    U* _;
    alias _curr this;   // see below
}

auto linkContext(T, U)(ref T curr, ref U parent) {
    return Context!(T, U)(&curr, &parent);
}

The strange alias _curr this is a lovely feature of D known as subtyping. It basically means that any property that doesn't exist at the struct's scope will be forwarded to _curr, e.g., when I write and myCtx has no member named foo, the code is rewritten as

As we travel along constructs, we link the current context with its ancestor (_). This means that for each combination of constructs, and at each nesting level, we get a uniquely-typed context. At runtime, this context is nothing more than a pair of pointers, but at compile time it keeps us type-safe. In other words, you can't reference a nonexistent field and expect the program to compile.
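For contrast, here's roughly what the dictionary-based context looks like on the Python side (a stripped-down illustration of the idea, not Construct's real code): fields are looked up by string key at parse time, so a typo only surfaces at runtime:

```python
def parse_pascal_string(stream, ctx=None):
    # A minimal Construct-style parse: the context is a plain dict
    # that travels along with the parsing process.
    ctx = {} if ctx is None else ctx
    ctx["length"] = stream[0]
    # Later fields refer to earlier ones by name. A misspelled key
    # ("lenght") would raise KeyError here -- at runtime, not at
    # compile time, and at the cost of a dict lookup per reference.
    n = ctx["length"]
    ctx["data"] = stream[1:1 + n]
    return ctx

print(parse_pascal_string(b"\x05helloXXX"))
# {'length': 5, 'data': b'hello'}
```

dconstruct's typed contexts eliminate both the lookup cost and the runtime failure mode.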

A more interesting example would thus be

struct MyStruct {
    Field!ubyte length;
    YourStruct child;

    mixin Record;
}

struct YourStruct {
    Field!ubyte whatever;
    Array!(Field!ubyte, "_.length") data;  // one level up, then 'length'

    mixin Record;
}

MyStruct ms;

When we unpack MyStruct (which recursively unpacks YourStruct), a new context ctx will be created with ctx._curr=&ms.child and ctx._=&ms. When YourStruct refers to "_.length", the string is implanted into ctx, yielding ctx._.length. If we referred to the wrong path or misspelled anything, it would simply not compile. That, and you don't need dictionary lookups at runtime -- it's all resolved during compilation.

So again, this is a very preliminary version of Construct, miles away from production grade, but you can already see where it's going.

By the way, you can try out D online at dpaste and even play around with my demo version of dconstruct over there.

In Short

Python will always have a special corner in my heart, but as surprising as it may be (for a guy who's made his career over Python), this rather unknown, rapidly-evolving language, D, has become my new language of choice. It's expressive, concise and powerful, offers short compilation times (as opposed to C++) and makes programming both fun and efficient. It's the language for the C10M age.



Gocept Weblog: September, 18th–20th: DevOps Sprint

 ∗ Planet Python

Since we have a strong history in web development, and were also involved in operating the web applications we developed, the DevOps movement hit a nerve with us.

Under the brand name “Flying Circus” we are establishing a platform respecting the DevOps principles.

A large portion of our day-to-day work is dedicated to DevOps related topics. We like to collaborate by sharing ideas and work on tools we all need to make operations and development of web applications a smooth experience. A guiding question: how can we improve the operability of web applications?

A large field of sprintable topics comes to our mind:


Logging

Enable web application developers to integrate logging mechanisms into their apps easily. By using modern tools like Logstash for collecting and analyzing the data, operators are able to find the causes of performance or other problems efficiently.

Live-Debugging and Monitoring

Monitoring is a must when operating software. At least for some people (including ourselves), Nagios is not the best fit for DevOps teams.


Deployment

We always wanted to have reproducible, automated deployments. Coming from the Zope world, we started with zc.buildout and went on to develop our own deployment tool, batou. More recently there are upcoming projects such as ansible, and tools (more or less) bound to cloud services like heroku.


Backup

After using bacula for a while, we started to work on backy, which aims to work directly on the volume files of virtual machines.

and more…

Join us to work on these things and help to make DevOps better! The sprint will take place at our office, Forsterstraße 29, Halle (Saale), Germany. On September, 20th we will have a great party in the evening.

If you want to attend, please sign up on



For your stay in Halle, we can recommend the following Hotels: “City Hotel am Wasserturm”, “Dorint Hotel Charlottenhof”, “Dormero Hotel Rotes Ross”. For those on budget, there is the youth hostel Halle ( Everything is in walking distance from our office.



Rob Galanakis: GeoCities and the Qt Designer

 ∗ Planet Python

In a review of my book, Practical Maya Programming with Python, reviewer W Boudville suggests my advice of avoiding the Qt Designer is backwards-looking and obsolete, such as writing assembler instead of C for better performance, or using a text file to design a circuit instead of a WYSIWYG editor. I am quite sure he (assuming it is a he) isn’t the only person with such reservations.

Unfortunately, the comparison is not at all fair. Here’s a more relevant allegory:

Hey, did you hear about this awesome thing called geocities? You can build a website by just dragging and dropping stuff, no programming required!

We’ve had WYSIWYG editors for the web for about two decades (or longer?), yet I’ve never run into a professional who works that way. I think WYSIWYG editors are great for people new to GUI programming or a GUI framework, or for mock-ups, but it’s much more effective to do production GUI work through code. Likewise, we’ve had visual programming systems for even longer, but we’ve not seen one that produces a result anyone would consider maintainable. Sure, we’ve had some luck creating state machine tools, but we are nowhere close for the more general purpose logic required in a UI. And even these state machine tools are only really useful when they have custom nodes written in code.

Finally, WYSIWYG editors can be useful in extremely verbose frameworks or languages. I wouldn’t want to use WinForms in C# without the Visual Studio Designer. Fortunately for Pythonistas, PySide and PyQt are not WinForms!

I have no doubt that at some point WYSIWYG editors will become useful for GUI programming. Perhaps it will require 3D displays or massively better libraries. I don’t know. But for today and the foreseeable future, I absolutely discourage the use of the Qt Designer for creating production GUIs with Python.


Running Go Applications in the Background

 ∗ go, web, go

A regular question on the go-nuts mailing list, in the #go-nuts IRC channel and on StackOverflow seems to be: how do I run my Go application in the background? Developers eventually reach the stage where they need to deploy something, keep it running, log it and manage crashes. So where to start?

There's a huge number of options here, but we'll look at a stable, popular and cross-distro approach called Supervisor. Supervisor is a process management tool that handles restarting, recovering and managing logs, without requiring anything from your application (i.e. no PID files!).


We're going to assume a basic understanding of the Linux command line, which in this case is understanding how to use a text-editor like vim, emacs or even nano, and the importance of not running your application as root—which I will re-emphasise throughout this article! We're also going to assume you're on an Ubuntu 14.04/Debian 7 system (or newer), but I've included a section for those on RHEL-based systems.

I should also head off any questions about daemonizing (i.e. the Unix meaning of daemonize) Go applications due to interactions with threaded applications and most systems (aka Issue #227).

Note: I'm well aware of the "built in" options like Upstart (Debian/Ubuntu) and systemd (CentOS/RHEL/Fedora/Arch). I'd originally written this article so that it provided examples for all three options, but it wasn't opinionated enough and was therefore confusing for newcomers (at whom this article is aimed).

For what it's worth, Upstart leans on start-stop-daemon too much for my liking (if you want it to work across versions), and although I really like systemd's configuration language, my primary systems are running Debian/Ubuntu LTS so it's not a viable option (until next year!). Supervisor's cross-platform nature, well documented configuration options and extra features (log rotation, email notification) make it well suited to running production applications (or even just simple side-projects).

Installing Supervisor

I've been using Supervisor for a long while now, and I'm a big fan of its centralised approach: it will monitor your process, restart it when it crashes, redirect stdout to a log file and rotate that log, all within a single configuration.

There's no need to write a separate logrotated config, and there's even a decent web-interface (that you should only expose over authenticated HTTPS!) included. The project itself has been around since 2004 and is well maintained.

Anyway, let's install it. The below will assume Ubuntu 14.04, which has a recent (>= 3.0) version of Supervisor. If you're running an older version of Ubuntu, or an OS that doesn't package a recent version of Supervisor, it may be worth installing it via pip and writing your own Upstart/systemd service file.

$ sudo apt-get install supervisor

Now, we also want our application user to be able to invoke supervisorctl (the management interface) as necessary, so we'll need to create a supervisor group, make our user a member of that group and modify Supervisor's configuration file to give the supervisor group the correct permissions on the socket.

$ sudo addgroup --system supervisor
# i.e. 'sudo adduser deploy supervisor'
$ sudo adduser <yourappuser> supervisor
$ logout
# Log back in and confirm your groups, which should now include 'supervisor':
$ groups

That's the group taken care of. Let's modify the Supervisor configuration file to take this into account:

[unix_http_server]
file=/var/run/supervisor.sock    # the socket Supervisor listens on
chmod=0770                       # ensure our group has read/write privs
chown=root:supervisor            # add our group

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[include]
files = /etc/supervisor/conf.d/*.conf # default location on Ubuntu

And now we'll restart Supervisor:

$ sudo service supervisor restart

If it doesn't restart, check the log with the below:

$ sudo tail /var/log/supervisor/supervisord.log

Typos are the usual culprit here. Otherwise, with the core configuration out of the way, let's create a configuration for our Go app.

Configuring It

Supervisor is infinitely configurable, but we'll aim to keep things simple. Note that you will need to modify the configuration below to suit your application: I've commented the lines you'll need to change.

Create a configuration file at the default (Ubuntu) includes directory:

# where 'mygoapp' is the name of your application
$ sudo vim /etc/supervisor/conf.d/mygoapp.conf 

... and pull in the below:

[program:yourapp]
command=/home/yourappuser/bin/yourapp # the location of your app
autostart=true
autorestart=true
startretries=10
user=yourappuser # the user your app should run as (i.e. *not* root!)
directory=/srv/www/ # where your application runs from
environment=APP_SETTINGS="/srv/www/" # environmental variables
redirect_stderr=true
stdout_logfile=/var/log/supervisor/yourapp.log # the name of the log file.
stdout_logfile_maxbytes=50MB
stdout_logfile_backups=10

Let's step through it:

  • user is who we want the application to run as. I typically create a "deploy" user for this purpose. We should never run an Internet-facing application as root, so this is arguably the most important line of our configuration.
  • logfile_maxbytes and logfile_backups handle log rotation for us. This saves us having to learn another configuration language and keeps our configuration in one place. If your application generates a lot of logs (say, HTTP request logs) then it may be worth pushing maxbytes up a little.
  • autostart runs our program when supervisord starts (on system boot)
  • autorestart=true will restart our application regardless of the exit code.
  • startretries will attempt to restart our application if it crashes.
  • environment defines the environmental variables to pass to the application. In this case, we tell it where the settings file is (a TOML config file, in my case).
  • redirect_stderr will re-direct error output to our log file. You can keep a separate error log if your application generates significant amounts of log data (i.e. HTTP requests) via stdout.

Now, let's reload Supervisor so it picks up our app's config file, and check that it's running as expected:

$ supervisorctl reload
$ supervisorctl status yourapp

We should see a "running/started" message and our application should be ready to go. If not, check the logs in /var/log/supervisor/supervisord.log or run supervisorctl tail yourapp to show our application logs. A quick Google for the error message will go a long way if you get stuck.


A Note for CentOS/RHEL Users

If you're running CentOS 7 or Fedora 20, the directory layout is a little different than Ubuntu's (rather, Ubuntu has a non-standard location), so keep that in mind. Specifically:

  • The default configuration file lives at /etc/supervisord.conf
  • The includes directory lives at /etc/supervisord.d/

Otherwise, Supervisor is much the same: you'll need to install it, create a system group, add your user to the group, and then update the config file and restart the service using sudo systemctl restart supervisord.


Summary

Pretty easy, huh? If you're using a configuration management tool (e.g. Ansible, Salt, et al.) for your production machines, then it's easy to automate this completely, and I definitely recommend doing so. Being able to recreate your production environment like-for-like after a failure (or moving hosts, or just for testing) is a Big Deal and worth the time investment.

It's also easy to see from this guide how easy it is to add more Go applications to Supervisor's stable: add a new configuration file, reload Supervisor, and off you go. You can choose how aggressive restarts need to be, log rotations and environmental variables on a per-application basis, which is always useful.
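As a sketch, a second application is just another file in the includes directory (the names below are placeholders):

```
# /etc/supervisor/conf.d/anotherapp.conf
[program:anotherapp]
command=/home/yourappuser/bin/anotherapp
user=yourappuser
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/supervisor/anotherapp.log
```

A supervisorctl reload later, and both applications are managed side by side.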

HTTP Request Contexts & Go

 ∗ go, web, go

Alternatively titled map[string]interface{}.

Request contexts, for those new to the terminology, are typically a way to pass data alongside a HTTP request as it is processed by composable handlers (or middleware). This data could be a user ID, a CSRF token, whether a user is logged in or not—something typically derived from logic that you don't want to repeat over-and-over again in every handler. If you've ever used Django, the request context is synonymous with the request.META dictionary.

As an example:

func CSRFMiddleware(h http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        maskedToken, err := csrf.GenerateNewToken(r)
        if err != nil {
            http.Error(w, "No good!", http.StatusInternalServerError)
            return
        }

        // How do we pass the maskedToken from here...
        h.ServeHTTP(w, r)
    })
}

func MyHandler(w http.ResponseWriter, r *http.Request) {
    // ... to here, without relying on the overhead of a session store,
    // and without either handler being explicitly tied to the other?
    // What about a CSRF token? Or an auth-key from a request header?
    // We certainly don't want to re-write that logic in every handler!
}

There's three ways that Go's web libraries/frameworks have attacked the problem of request contexts:

  1. A global map, with *http.Request as the key, mutexes to synchronise writes, and middleware to cleanup old requests (gorilla/context)

  2. A strictly per-request map by creating custom handler types (goji)

  3. Structs, and creating middleware as methods with pointer receivers or passing the struct to your handlers (gocraft/web).

So how do these approaches differ, what are the benefits, and what are the downsides?

Global Context Map

gorilla/context's approach is the simplest, and the easiest to plug into an existing architecture.

Gorilla actually uses a map[interface{}]interface{}, which means you need to (and should) create types for your keys. The benefit is that you can use any types that support equality as a key; the downside is that you need to implement your keys in advance if you want to avoid any run-time issues with key types.

You also often want to create setters for the types you store in the context map, to avoid littering your handlers with the same type assertions.

import (
    "errors"
    "fmt"
    "net/http"

    "github.com/gorilla/context"
)

type contextKey int

// Define keys that support equality.
const csrfKey contextKey = 0
const userKey contextKey = 1

var ErrCSRFTokenNotPresent = errors.New("CSRF token not present in the request context.")

// We'll need a helper function like this for every key:type
// combination we store in our context map else we repeat this
// in every middleware/handler that needs to access the value.
func GetCSRFToken(r *http.Request) (string, error) {
    val, ok := context.GetOk(r, csrfKey)
    if !ok {
        return "", ErrCSRFTokenNotPresent
    }

    token, ok := val.(string)
    if !ok {
        return "", ErrCSRFTokenNotPresent
    }

    return token, nil
}

// A bare-bones example
func CSRFMiddleware(h http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        token, err := GetCSRFToken(r)
        if err != nil {
            http.Error(w, "No good!", http.StatusInternalServerError)
            return
        }

        // The map is global, so we just call the Set function
        context.Set(r, csrfKey, token)

        h.ServeHTTP(w, r)
    })
}

func ShowSignupForm(w http.ResponseWriter, r *http.Request) {
    // We'll use our helper function so we don't have to type assert
    // the result in every handler that triggers/handles a POST request.
    csrfToken, err := GetCSRFToken(r)
    if err != nil {
        http.Error(w, "No good!", http.StatusInternalServerError)
        return
    }

    // We can access this token in every handler we wrap with our
    // middleware. No need to set/get from a session multiple times per
    // request (which is slow!)
    fmt.Fprintf(w, "Our token is %v", csrfToken)
}

func main() {
    r := http.NewServeMux()
    r.Handle("/signup", CSRFMiddleware(http.HandlerFunc(ShowSignupForm)))
    // Critical that we call context.ClearHandler here, else
    // we leave old requests in the map.
    http.ListenAndServe("localhost:8000", context.ClearHandler(r))
}

Full Example

The plusses? It's flexible, loosely coupled, and easy for third party packages to use. You can tie it into almost any net/http application since all you need is access to http.Request—the rest relies on the global map.

The downsides? The global map and its mutexes may result in contention at high loads, and you need to call context.Clear() at the end of every request (i.e. on each handler). Forget to do that (or forget to wrap your top-level server handler) and you'll open yourself up to a memory leak where old requests remain in the map. If you're writing middleware that uses gorilla/context, then you need to make sure your package's users call context.ClearHandler on their handlers/router.

Per Request map[string]interface

As another take, Goji provides a request context as part of an (optional) handler type that embeds Go's usual http.Handler. Because it's tied to Goji's (fast) router implementation, it no longer needs to be a global map and avoids the need for mutexes.

Goji provides a web.HandlerFunc type that extends the default http.HandlerFunc with a request context: func(c web.C, w http.ResponseWriter, r *http.Request).

var ErrTypeNotPresent = errors.New("Expected type not present in the request context.")

// A little simpler: we just need this for every *type* we store.
func GetContextString(c web.C, key string) (string, error) {
    val, ok := c.Env[key].(string)
    if !ok {
        return "", ErrTypeNotPresent
    }

    return val, nil
}

// A bare-bones example
func CSRFMiddleware(c *web.C, h http.Handler) http.Handler {
    fn := func(w http.ResponseWriter, r *http.Request) {
        maskedToken, err := GenerateToken(r)
        if err != nil {
            http.Error(w, "No good!", http.StatusInternalServerError)
            return
        }

        // Goji only allocates a map when you ask for it.
        if c.Env == nil {
            c.Env = make(map[string]interface{})
        }

        // Not a global - a reference to the context map
        // is passed to our handlers explicitly.
        c.Env["csrf_token"] = maskedToken

        h.ServeHTTP(w, r)
    }

    return http.HandlerFunc(fn)
}

// Goji's web.HandlerFunc type is an extension of net/http's
// http.HandlerFunc, except it also passes in a request
// context (aka web.C.Env)
func ShowSignupForm(c web.C, w http.ResponseWriter, r *http.Request) {
    // We'll use our helper function so we don't have to type assert
    // the result in every handler.
    csrfToken, err := GetContextString(c, "csrf_token")
    if err != nil {
        http.Error(w, "No good!", http.StatusInternalServerError)
        return
    }

    // We can access this token in every handler we wrap with our
    // middleware. No need to set/get from a session multiple times per
    // request (which is slow!)
    fmt.Fprintf(w, "Our token is %v", csrfToken)
}

Full Example

The biggest immediate gain is the performance improvement, since Goji only allocates a map when you ask it to: there's no global map with locks. Note that for many applications, your database or template rendering will be the bottleneck (by far), so the "real" impact is likely pretty small, but it's a sensible touch.

Most useful is that you retain the ability to write modular middleware that doesn't need further information about your application: if you want to use the request context, you can do so, but for anything else it's just http.Handler. The downside is that you still need to type assert anything you retrieve from the context, although like gorilla/context we can simplify this by writing helper functions. A map[string]interface{} also restricts us to string keys: simpler for most (myself included), but potentially less flexible for some.
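Helpers like GetContextString above are easy to generalise: one small function per type you store. Here's a stdlib-only sketch (the getInt64 helper and env map are hypothetical; env stands in for Goji's web.C.Env):

```go
package main

import (
	"errors"
	"fmt"
)

// ErrTypeNotPresent mirrors the sentinel error used in the helper above.
var ErrTypeNotPresent = errors.New("expected type not present in the request context")

// getInt64 type asserts a value out of the context map, returning an
// error instead of panicking when the key is missing or the wrong type.
func getInt64(env map[string]interface{}, key string) (int64, error) {
	val, ok := env[key].(int64)
	if !ok {
		return 0, ErrTypeNotPresent
	}
	return val, nil
}

func main() {
	env := map[string]interface{}{"user_id": int64(42)}

	id, err := getInt64(env, "user_id")
	fmt.Println(id, err) // 42 <nil>

	_, err = getInt64(env, "missing")
	fmt.Println(err)
}
```

Each helper is a few lines, and handlers get a concrete type back without repeating the comma-ok assertion everywhere.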

Context Structs

A third approach is to initialise a struct per-request and define our middleware/handler as methods on the struct. The big plus here is type-safety: we explicitly define the fields of our request context, and so we know the type (unless we do something naive like setting a field to interface{}).

Of course, what you gain in type safety you lose in flexibility. You can't create "modular" middleware that uses the popular func(http.Handler) http.Handler pattern, because that middleware can't know what your request context struct looks like. It could provide its own struct that you embed into yours, but that still doesn't solve re-use: not ideal. Still, it's a good approach: no need to type assert things out of interface{}.

import (
    "fmt"
    "log"
    "net/http"

    "github.com/gocraft/web"
)

type Context struct {
    CSRFToken string
    User      string
}
// Our middleware *and* handlers must be defined as methods on our context struct,
// or accept the type as their first argument. This ties handlers/middlewares to our
// particular application structure/design.
func (c *Context) CSRFMiddleware(w web.ResponseWriter, r *web.Request, next web.NextMiddlewareFunc) {
    token, err := GenerateToken(r)
    if err != nil {
        http.Error(w, "No good!", http.StatusInternalServerError)
        return
    }

    c.CSRFToken = token
    next(w, r)
}

func (c *Context) ShowSignupForm(w web.ResponseWriter, r *web.Request) {
    // No need to type assert it: we know the type.
    // We can just use the value directly.
    fmt.Fprintf(w, "Our token is %v", c.CSRFToken)
}

func main() {
    router := web.New(Context{}).Middleware((*Context).CSRFMiddleware)
    router.Get("/signup", (*Context).ShowSignupForm)

    err := http.ListenAndServe(":8000", router)
    if err != nil {
        log.Fatal(err)
    }
}
Full Example

The plus here is obvious: no type assertions! We have a struct with concrete types that we initialise on every request and pass to our middleware/handlers. But the downside is that we can no longer "plug and play" middleware from the community, because it's not defined on our own context struct.

We could anonymously embed their type into ours, but that starts to become pretty messy and doesn't help if their fields share the same names as our own. The real solution is to fork and modify the code to accept your struct, at the cost of time/effort. gocraft/web also wraps the ResponseWriter interface/Request struct with its own types, which ties things a little more closely to the framework itself.

How Else?

One suggestion would be to provide a Context field on Go's http.Request struct, but actually implementing it in a "sane" way that suits the common use case is easier said than done.

The field would likely end up being a map[string]interface{} (or one keyed by interface{}). This means we either initialise the map on every request—wasted work for the many requests that never touch the request context—or require package users to check that the map is initialised before using it, which is a big "gotcha" for newbies who will wonder (at first) why their application panics on some requests but not others.
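The nil-map behaviour behind that "gotcha" is easy to demonstrate: reads from a nil map are fine, but writes panic.

```go
package main

import "fmt"

func main() {
	// A nil map is safe to read from: lookups return the zero value.
	var m map[string]interface{}
	fmt.Println(m["key"] == nil) // true

	// ...but writing to it panics, which is exactly the trap a
	// lazily-initialised Context field would set for newcomers.
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered:", r)
		}
	}()
	m["key"] = "value" // panics: assignment to entry in nil map
}
```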

I don't think these are huge barriers unto themselves, but Go's strong preference for being clear and understandable—at the cost of a little verbosity now and then—is potentially at odds with this approach. I also don't believe that having options in the form of third-party packages/frameworks is a Bad Thing either: you choose the approach that best fits your idioms or requirements.


So which approach should you choose for your own projects? It's going to depend on your use-case (as always). Writing a standalone package and want to provide a request context that the package user can easily access? gorilla/context is probably going to be a good fit (just document the need to call ClearHandler!). Writing something from scratch, or have a net/http app you want to extend easily? Goji is easy to drop in. Starting from nothing? gocraft/web's "inclusive" approach might fit.

Personally, I like Goji's approach: I don't mind writing a couple of helpers to type-assert things I commonly store in a request context (CSRF tokens, usernames, etc), and I get to avoid the global map. It's also easy for me to write middleware that others can plug into their applications (and for me to use theirs). But those are my use cases, so do your research first!

Custom Handlers and Avoiding Globals in Go Web Applications

 ∗ go, web, go

Go's net/http package is extremely flexible, thanks to the fact that it centres on the http.Handler interface. Building around an interface gives you the option of both extending the included implementation and keeping it compatible with other packages out in the wild. Given that the default implementation is pretty simple, we'll look at how we can build our own handler type (to remove error handling repetition), and how to extend it so we can explicitly pass a "context" containing our database pool, template map, a custom logger and so on, letting us remove any reliance on global variables.

Creating Our Custom Handler Type

net/http provides a basic HandlerFunc type that is just func(w http.ResponseWriter, r *http.Request). It's easy to understand, pervasive, and covers most simple use-cases. But for anything more than that, there are two immediate "issues": a) we can't pass any additional parameters to http.HandlerFunc, and b) we have to repeat a lot of error handling code in each handler. If you're new to Go it may not seem immediately obvious how to resolve this while still retaining compatibility with other HTTP packages, but thankfully it's an easy problem to solve.

We create our own handler type that satisfies http.Handler (read: it has a ServeHTTP(http.ResponseWriter, *http.Request) method), which allows it to remain compatible with net/http, generic HTTP middleware packages like nosurf, and routers/frameworks like gorilla/mux or Goji.

First, let's highlight the problem:

func myHandler(w http.ResponseWriter, r *http.Request) {
    session, err := store.Get(r, "myapp")
    if err != nil {
        http.Error(w, http.StatusText(http.StatusInternalServerError), http.StatusInternalServerError)
        return // Forget to return, and the handler will continue on
    }

    id := // get id from URL param; strconv.Atoi it; making sure to return on those errors too...
    post := Post{ID: id}
    exists, err := db.GetPost(&post)
    if err != nil {
        http.Error(w, http.StatusText(http.StatusInternalServerError), http.StatusInternalServerError)
        return // Repeating ourselves again
    }

    if !exists {
        http.Error(w, http.StatusText(http.StatusNotFound), http.StatusNotFound)
        return // ... and again.
    }

    err = renderTemplate(w, "post.tmpl", post)
    if err != nil {
        // Yep, here too...
    }
}

Things are not only verbose (we have to do this in every handler), but we're at the risk of a subtle and hard-to-catch bug. If we don't explicitly return when we encounter an error—such as a serious database error or when a password comparison fails—our handler will continue. At best this might mean we render an empty struct to our template and confuse the user. At worst, this might mean we write a HTTP 401 (Not Authorised) response and then continue to do things that (potentially) only a logged in user should see or be able to do.

Thankfully, we can fix this pretty easily by creating a handler type that returns an explicit error:

type appHandler func(http.ResponseWriter, *http.Request) (int, error)

// Our appHandler type will now satisfy http.Handler
func (fn appHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    if status, err := fn(w, r); err != nil {
        // We could also log our errors centrally:
        // i.e. log.Printf("HTTP %d: %v", status, err)
        switch status {
        // We can have cases as granular as we like, if we wanted to
        // return custom errors for specific status codes.
        case http.StatusNotFound:
            notFound(w, r)
        case http.StatusInternalServerError:
            http.Error(w, http.StatusText(http.StatusInternalServerError), http.StatusInternalServerError)
        default:
            // Catch any other errors we haven't explicitly handled
            http.Error(w, http.StatusText(http.StatusInternalServerError), http.StatusInternalServerError)
        }
    }
}

func myHandler(w http.ResponseWriter, r *http.Request) (int, error) {
    session, err := store.Get(r, "myapp")
    if err != nil {
        // Much better!
        return http.StatusInternalServerError, err
    }

    post := Post{ID: id} // id parsed from the URL param, as before
    exists, err := db.GetPost(&post)
    if err != nil {
        return http.StatusInternalServerError, err
    }

    // We can shortcut this: since renderTemplate returns `error`,
    // our ServeHTTP method will return a HTTP 500 instead and won't
    // attempt to write a broken template out with a HTTP 200 status.
    // (see the postscript for how renderTemplate is implemented)
    // If it doesn't return an error, things will go as planned.
    return http.StatusOK, renderTemplate(w, "post.tmpl", post)
}

func main() {
    // Convert myHandler to an appHandler
    http.Handle("/", appHandler(myHandler))
    http.ListenAndServe(":8000", nil)
}

This is, of course, nothing new: Andrew Gerrand highlighted a similar approach on the Go blog back in 2011. Our implementation is just an adaptation with a little extra error handling. I prefer to return (int, error) as I find it more idiomatic than returning a concrete type, but you could certainly create your own error type if you wished (but let's just keep it simple for now).

Extending Our Custom Handler Further

A quick aside: global variables get a lot of hate: you don't control what can modify them, it can be tricky to track their state, and they may not be suitable for concurrent access. Still, used correctly they can be convenient, and plenty of Go docs & projects lean on them (e.g. here & here). database/sql's *sql.DB type can be safely used as a global as it represents a pool and is protected by mutexes, maps (i.e. template maps) can be read from (but not written to, of course) concurrently, and session stores take a similar approach to database/sql.

After being inspired by @benbjohnson's article from last week on structuring Go applications and a debate with a fellow Gopher on Reddit (who takes a similar approach), I figured I'd take a look at my codebase (which has a few globals of the above types) and refactor it to explicitly pass a context struct to my handlers. Most of it was smooth sailing, but there's a couple of potential pitfalls you can run into if you want your context instance to be available in more than just the handlers themselves.

Here's the actual global variables I had before:

var (
    decoder   *schema.Decoder
    bufpool   *bpool.Bufferpool
    templates map[string]*template.Template
    db        *sqlx.DB
    store     *redistore.RediStore
    mandrill  *gochimp.MandrillAPI
    twitter   *anaconda.TwitterApi
    log       *log.Logger
    conf      *config // app-wide configuration: hostname, ports, etc.
)

So, given the custom handler type we created above, how can we turn this list of global variables into a context we can pass to our handlers and our ServeHTTP method—which may want to access our template map to render "pretty" errors or our custom logger—and still keep everything compatible with http.Handler?

package main

import (
    "fmt"
    "html/template"
    "log"
    "net/http"

    "github.com/gorilla/schema"
    "github.com/gorilla/sessions"
    "github.com/jmoiron/sqlx"
    "github.com/zenazn/goji/graceful"
    "github.com/zenazn/goji/web"
)

// appContext contains our local context; our database pool, session store, template
// registry and anything else our handlers need to access. We'll create an instance of it
// in our main() function and then explicitly pass a reference to it for our handlers to access.
type appContext struct {
    db        *sqlx.DB
    store     *sessions.CookieStore
    templates map[string]*template.Template
    decoder   *schema.Decoder
    // ... and the rest of our globals.
}

// We've turned our original appHandler into a struct with two fields:
// - A function type similar to our original handler type (but that now takes an *appContext)
// - An embedded field of type *appContext
type appHandler struct {
    *appContext
    h func(*appContext, http.ResponseWriter, *http.Request) (int, error)
}

// Our ServeHTTP method is mostly the same, and also has the ability to
// access our *appContext's fields (templates, loggers, etc.) as well.
func (ah appHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    // Updated to pass ah.appContext as a parameter to our handler type.
    status, err := ah.h(ah.appContext, w, r)
    if err != nil {
        log.Printf("HTTP %d: %q", status, err)
        switch status {
        case http.StatusNotFound:
            http.NotFound(w, r)
            // And if we wanted a friendlier error page, we can
            // now leverage our context instance - e.g.
            // err := ah.renderTemplate(w, "http_404.tmpl", nil)
        case http.StatusInternalServerError:
            http.Error(w, http.StatusText(status), status)
        default:
            http.Error(w, http.StatusText(status), status)
        }
    }
}

func main() {
    // These are 'nil' for our example, but we'd either assign
    // the values as below or use a constructor function like
    // (NewAppContext(conf config) *appContext) that initialises
    // it for us based on our application's configuration file.
    context := &appContext{db: nil, store: nil} // Simplified for this example

    r := web.New()
    // We pass an instance to our context pointer, and our handler.
    r.Get("/", appHandler{context, IndexHandler})

    graceful.ListenAndServe(":8000", r)
}

func IndexHandler(a *appContext, w http.ResponseWriter, r *http.Request) (int, error) {
    // Our handlers now have access to the members of our context struct.
    // e.g. we can call methods on our DB type via err := a.db.GetPosts()
    fmt.Fprintf(w, "IndexHandler: db is %q and store is %q", a.db, a.store)
    return 200, nil
}

Everything still remains very readable: we lean on the type system and existing interfaces, and if we just want to use a regular http.HandlerFunc, we can do that too. Our handlers are still wrappable by anything that takes (and spits out) a http.Handler, and if we wanted to ditch Goji and use gorilla/mux or even just net/http, we don't have to change our handler at all. Just make sure that your context's fields are safe for concurrent access: a map that handlers write to, for example, would not be, and needs a mutex from the sync package.
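To make that concrete, a mutex-guarded map might be sketched like this (safeCache is a hypothetical example type, not something from the article's codebase):

```go
package main

import (
	"fmt"
	"sync"
)

// safeCache guards a map with a sync.RWMutex so it can safely be a
// field on a shared context struct that handlers write to.
type safeCache struct {
	mu sync.RWMutex
	m  map[string]string
}

func (c *safeCache) Set(k, v string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.m[k] = v
}

func (c *safeCache) Get(k string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.m[k]
	return v, ok
}

func main() {
	c := &safeCache{m: make(map[string]string)}

	// Concurrent writes from multiple goroutines (handlers) are safe.
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			c.Set(fmt.Sprintf("k%d", i), "v")
		}(i)
	}
	wg.Wait()

	v, ok := c.Get("k3")
	fmt.Println(v, ok) // v true
}
```

Without the mutex, the same program would be flagged by the race detector (and may panic with concurrent map writes).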

Other than that, it just works. We've reduced repetition around our error handling, we've removed our reliance on globals and our code is still readable.


  • Worth reading is Justinas' great article on errors in Go: read the section on implementing a custom httpError.
  • Writing some HTTP middleware for your Go application? Align with func(http.Handler) http.Handler and you'll end up with something portable. The only "exception" to this rule is when you need to pass state between handlers (i.e. a CSRF token), which is when you'll need to tie yourself to a request context (like Goji's web.C, or gorilla/context). Plenty of middleware doesn't need to do that however.
  • This is how you would catch errors before rendering templates (in short: use a buffer pool).
  • There's a compilable version of the final example that you can leave comments on.

Approximating html/template Inheritance

 ∗ go, web, go

Go's html/template package is fairly minimal compared to templating packages associated with other languages (Jinja, Mustache, even Django's templates), although it makes up for this with security and great docs.

There are however a few "tricks" to using it: specifically when it comes to approximating template inheritance. Being able to specify a base layout (or layouts), stake out your blocks and then fill those blocks with template snippets isn't immediately clear. So how do we do this?

First, we define base.tmpl:

{{ define "base" }}
    {{ template "title" . }}
    {{ template "scripts" . }}
    {{ template "sidebar" . }}
    {{ template "content" . }}
{{ end }}
// We define empty blocks for optional content so we don't have to define a block in child templates that don't need them
{{ define "scripts" }}{{ end }}
{{ define "sidebar" }}{{ end }}

And index.tmpl, which effectively extends our base template.

{{ define "title"}}<title>Index Page</title>{{ end }}
// Notice the lack of the script block - we don't need it here.
{{ define "sidebar" }}
    // We have a two part sidebar that changes depending on the page
    {{ template "sidebar_index" }} 
    {{ template "sidebar_base" }}
{{ end }}
{{ define "content" }}
    {{ template "listings_table" . }}
{{ end }}

Note that we don't need to define all blocks in the base layout: we've "cheated" a little by defining them alongside our base template. The trick is to ensure that the {{ define }} blocks in the base template are empty. If you define two blocks and both have content, the application will panic when it attempts to parse the template files (on startup, most likely). There's no "default" content we can fall back on. It's not a deal-breaker, but it's worth remembering when writing these out.

In our Go application, we create a map of templates by parsing the base template, any necessary snippets, and the template that extends our base template. This is best done at application start-up (and panics are okay here) so we can fail early. A web application with broken templates is probably not much of a web application.

It's also critical that we ensure any look-ups on map keys (template names) that don't exist are caught (using the comma-ok idiom): otherwise it's a run-time panic.

import (
    "fmt"
    "html/template"
    "log"
    "net/http"
    "path/filepath"
)

var templates map[string]*template.Template

// Load templates on program initialisation
func init() {
    if templates == nil {
        templates = make(map[string]*template.Template)
    }

    templatesDir := config.Templates.Path

    layouts, err := filepath.Glob(templatesDir + "layouts/*.tmpl")
    if err != nil {
        log.Fatal(err)
    }

    includes, err := filepath.Glob(templatesDir + "includes/*.tmpl")
    if err != nil {
        log.Fatal(err)
    }

    // Generate our templates map from our layouts/ and includes/ directories
    for _, layout := range layouts {
        files := append(includes, layout)
        templates[filepath.Base(layout)] = template.Must(template.ParseFiles(files...))
    }
}

// renderTemplate is a wrapper around template.ExecuteTemplate.
func renderTemplate(w http.ResponseWriter, name string, data map[string]interface{}) error {
    // Ensure the template exists in the map.
    tmpl, ok := templates[name]
    if !ok {
        return fmt.Errorf("The template %s does not exist.", name)
    }

    w.Header().Set("Content-Type", "text/html; charset=utf-8")
    return tmpl.ExecuteTemplate(w, "base", data)
}

We create our templates from a set of template snippets and the base layout (just the one, in our case). We can fill in our {{ template "scripts" }} block as needed, and we can mix and match our sidebar content as well. If your pages are alike, you can generate this map with a range clause by using a slice of the template names as the keys.

Slightly tangential to this, there's the common problem of dealing with the error returned from template.ExecuteTemplate. If we pass the writer to an error handler, it's too late: we've already (partially) written to the response, and we'll end up with a mess in the user's browser: part of the rendered page up to the point of the error, followed by the error page's content. The solution is to write to a bytes.Buffer to catch any errors during template rendering, and then write the contents of the buffer out to the http.ResponseWriter.

Although you can create your own buffer per-request, using a pool (such as the bpool package shown below) reduces allocations and garbage. I benchmarked and profiled a bare approach (as above; write out directly), a 10K fixed buffer per-request (big enough for most of my responses), and a pool of buffers. The pooled approach was the fastest, at 32k req/s vs. 26k req/s (fixed buffer) and 29k req/s (no buffer). Latency was no worse than the bare approach either, which is a huge plus.

import (
    "fmt"
    "net/http"

    "github.com/oxtoacart/bpool"
)
var bufpool *bpool.BufferPool

// renderTemplate is a wrapper around template.ExecuteTemplate.
// It writes into a bytes.Buffer before writing to the http.ResponseWriter to catch
// any errors resulting from populating the template.
func renderTemplate(w http.ResponseWriter, name string, data map[string]interface{}) error {
    // Ensure the template exists in the map.
    tmpl, ok := templates[name]
    if !ok {
        return fmt.Errorf("The template %s does not exist.", name)
    }

    // Create a buffer to temporarily write to and check if any errors were encountered.
    buf := bufpool.Get()
    defer bufpool.Put(buf)

    err := tmpl.ExecuteTemplate(buf, "base", data)
    if err != nil {
        return err
    }

    // Set the header and write the buffer to the http.ResponseWriter
    w.Header().Set("Content-Type", "text/html; charset=utf-8")
    buf.WriteTo(w)
    return nil
}

func init() {
    bufpool = bpool.NewBufferPool(64)

We can catch that returned error in our handler and return a HTTP 500 instead. The best part is that it also makes testing our handlers easier. If you let your error handler take over the http.ResponseWriter, you've already sent a HTTP 200 status header, making it much harder to test where things broke. By writing to a temporary buffer first, we ensure that we don't set headers until we're sure the template will render correctly, which makes testing much simpler.
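If you'd rather avoid a third-party dependency, the same pattern can be approximated with the standard library's sync.Pool (a simplified sketch, not the bpool implementation; render is a hypothetical stand-in for the template call):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable *bytes.Buffer values; New is called
// only when the pool is empty.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// render writes into a pooled buffer first, so nothing reaches the
// client until we know the whole body rendered successfully.
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // pooled buffers keep their old contents; reset before use
	defer bufPool.Put(buf)

	fmt.Fprintf(buf, "<h1>%s</h1>", name)
	return buf.String()
}

func main() {
	fmt.Println(render("hello")) // <h1>hello</h1>
}
```

Note that sync.Pool may drop buffers under GC pressure, whereas bpool keeps a bounded number alive; either way, the buffer-then-write flow is the same.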

And that's about it. We have composable templates, we deal with our errors before writing out, and it's still fast.


  • This post was triggered after I asked the question on the /r/golang sub-reddit, which is what prompted me to look at re-using buffers via a pool.
  • Credit goes to this answer on SO for the clever map[string]*template.Template approach, and thanks to @jonathanbingram for the great "optional blocks" trick.
  • I highly suggest reading Jan Newmarch's html/template tutorial, which covers {{ with }}, {{ range . }} and template.Funcs comprehensively.

httpauth - Basic Auth Middleware For Go

 ∗ go, web, go

httpauth is a HTTP Basic Authentication middleware for Go.

I originally designed it for the Goji micro-framework, but it's compatible with vanilla net/http. We can thank Go's http.Handler interface for that, but I'd recommend Alice to minimise the function wrapping if you're particularly framework averse.

package main

import (
    "github.com/goji/httpauth"
    "github.com/zenazn/goji"
)

func main() {
    goji.Use(httpauth.SimpleBasicAuth("dave", "password"))
    // myHandler requires HTTP Basic Auth to access
    goji.Get("/thing", myHandler)

    goji.Serve()
}

As always, note that HTTP Basic Auth credentials are sent over the wire in plain-text, so serve your application over HTTPS (TLS) using Go's built-in ListenAndServeTLS or nginx up front.

Full examples are in the README, and I'm open to any pull requests.

Generating Secure Random Numbers Using crypto/rand

 ∗ go, web, go

You're writing an application and you need to generate some session keys, CSRF tokens, and HMACs. For all of these activities, you need a cryptographically secure pseudo-random number generator (CSPRNG). Or, in other words, a source of random bytes that is unpredictable and without bias. Specifically, you want to mitigate the chance that an attacker can predict future tokens based on previous tokens.

So what does this mean? No math/rand, no time.UnixNano; in fact, it means that you only (ever) use the system CSPRNG that your operating system provides. This means using /dev/urandom on *nix systems and Windows' CryptGenRandom API.

Go's crypto/rand package, thankfully, abstracts these implementation details away to minimise the risk of getting it wrong:


import (
    "crypto/rand"
    "encoding/base64"
)

// GenerateRandomBytes returns securely generated random bytes.
// It will return an error if the system's secure random
// number generator fails to function correctly, in which
// case the caller should not continue.
func GenerateRandomBytes(n int) ([]byte, error) {
    b := make([]byte, n)
    _, err := rand.Read(b)
    // Note that err == nil only if we read len(b) bytes.
    if err != nil {
        return nil, err
    }

    return b, nil
}

// GenerateRandomString returns a URL-safe, base64 encoded
// securely generated random string.
// It will return an error if the system's secure random
// number generator fails to function correctly, in which
// case the caller should not continue.
func GenerateRandomString(s int) (string, error) {
    b, err := GenerateRandomBytes(s)
    return base64.URLEncoding.EncodeToString(b), err
}

// Example: this will give us a 44 character, base64 encoded output
token, err := GenerateRandomString(32)
if err != nil {
    // Serve an appropriately vague error to the
    // user, but log the details internally.
}

We have two functions here:

  • GenerateRandomBytes is useful when we need the raw bytes for another cryptographic function, such as HMAC keys.
  • GenerateRandomString wraps this and generates keys we can use for session IDs, CSRF tokens and the like. We base64 URL encode the output in order to provide secure strings that we can use in file-names, templates, HTTP headers and to minimise encoding overhead compared to hex encoding.

For CSRF tokens and session [cookie] IDs, 32 bytes (256 bits) is more than sufficient and common in a number of large web frameworks. If you're generating HMAC keys, you'll need to size them based on which HMAC algorithm you are using, and the same goes for AES.

Note that, critically, we always check for the extremely rare case that our operating system's CSPRNG fails us with an error. And if it's failing, we want to fail too, because that means there's something seriously wrong with our operating system. Checking and handling errors is also idiomatic Go, and good software practice in general. In this case, however, it's definitely worth emphasising the need to both log the error and stop processing.

If you are interested in the mechanics behind CSPRNGs I'd recommend reading Schneier's Cryptography Engineering, and if you're interested in web security in general, Adam Langley's blog is worth following—noting that Adam contributes to much of Go's cryptographic code.


Logilab: Logilab at Debconf 2014 - Debian annual conference

 ∗ Planet Python

Logilab is proud to contribute to the annual Debian conference which will take place in Portland (USA) from the 23rd to the 31st of August.

Julien Cristau (debian page) will be giving two talks at the conference :

Logilab is also contributing to the conference as a sponsor for the event.

Here is what we previously blogged about Salt and the previous DebConf. Stay tuned for a blog post about what we saw and heard at the conference.


Will Kahn-Greene: Dennis v0.5 released! New lint rules, new template linter, bunch of fixes, and now a service!

 ∗ Planet Python

What is it?

Dennis is a Python command line utility (and library) for working with localization. It includes:

  • a linter for finding problems in strings in .po files like invalid Python variable syntax which leads to exceptions
  • a template linter for finding problems in strings in .pot files that make translator's lives difficult
  • a statuser for seeing the high-level translation/error status of your .po files
  • a translator for strings in your .po files to make development easier

v0.5 released!

Since the last release announcement, there have been a handful of new lint rules added:

  • W301: Translation consists of just white space
  • W302: The translation is the same as the original string
  • W303: There are discrepancies in the HTML between the original string and the translated string

Additionally, there's a new template linter for your .pot files which can catch things like:

  • W500: Strings with variable names like o, O, 0, l, 1 which can be hard to read and are often replaced with a similar looking letter by the translator.
  • W501: One-character variable names which don't give translators enough context about what's being translated.
  • W502: Multiple unnamed variables which can't be reordered because the order the variables are expanded is specified outside of the string.

Dennis in action

Want to see Dennis in action, but don't want to install Dennis? I threw it up as a service, though it's configured for SUMO:


I may change the URL and I might create a SUMO-agnostic version. If you're interested, let me know.

Where to go for more

For more specifics on this release, see here:

Documentation and quickstart here:

Source code and issue tracker here:

Source code and issue tracker for Denise (Dennis-as-a-service):

3 out of 5 summer interns use Dennis to improve their posture while pranking their mentors.


How to Internet

 ∗ One Big Fluke

  1. Read a cool blog post you find interesting
  2. Leave a supportive comment and link to related ideas
  3. Receive an email from the post's author saying your comment is spam
  4. They delete your comment

Ahh... wondrous delight. Just another day.



Tweet storms

 ∗ One Big Fluke

1/ Tweets that are numbered are annoying. Please stop being lazy and use a blog.

2/ If it's not important enough for you to edit, it's not important enough for us to read.

Just stop! You're wasting everyone's time.


Armin Ronacher: Revenge of the Types

 ∗ Planet Python

This is part two about "The Python I Would Like To See" and it explores a bit the type system of Python in light of recent discussions. Some of this references the earlier post about slots. Like the earlier post this is a bit of a diving into the CPython interpreter, the Python language with some food for thoughts for future language designers.

As a Python programmer, types are a bit suspicious to you. They clearly exist and they interact in different ways with each other, but for the most part you really only notice their existence when you fail and an exception tells you that a type does not behave like you think it does.

Python was very proud of its approach to typing. I remember reading the language FAQ many years ago and it had a section about how cool duck typing is. To be fair: in practical terms duck typing is a good solution. Because there is basically no type system that fights against you, you are unrestricted in what you can do, which allows you to implement very nice APIs. Especially the common things are super simple in Python.

Almost all the APIs I designed for Python do not work in other languages. Even something as simple as click's general interface just does not work elsewhere. The largest reason for that is that you constantly fight against types.

Recently there have been discussions about adding static typing to Python, and I wholeheartedly believe that train left the station long ago and will never come back. So for the interested, here are my thoughts on why I hope Python will not adopt explicit typing.

What's a Type System?

A type system is the set of rules that governs how types interact with each other. There is actually a whole branch of computer science that seems to be exclusively concerned with types, which is pretty impressive by itself. But even if you are not particularly interested in theoretical computer science, type systems are hard to ignore.

I don't want to go too much into type systems for two reasons. The first one is that I barely understand them at all myself. The second one is that they are really not all that important to understand in order to "feel" the consequences of them. For me the way types behave is important because it influences how APIs are designed. So consider this basic introduction more influenced by my obsession with nice APIs than with a correct introduction to types.

Type systems can have many properties but the most important one that sets them all apart is the amount of information they provide when you try to reason with them.

As an example you can take Python. Python has types. There is the number 42 and when you ask the number what type it is, it will reply that it is an integer type. That makes a lot of sense and it allows the interpreter to define the rules of how integers interact with other integers.

However there is one thing Python does not have, and that is composite types. All Python types are primitive, which means that you can basically only work with one of them at a time. You do see composite types in Python every once in a while in other contexts.

The most straightforward composite type that most programming languages have is the struct. Python does not have structs directly, but there are many situations where libraries need to define their own. For instance a Django or SQLAlchemy ORM model is essentially a struct: each database column is represented through a Python descriptor, which in this case corresponds directly to a field in a struct. So when you say the primary key is called id and is an IntegerField(), you are defining your model as a composite type.
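As a rough illustration of that descriptor-per-field idea (a minimal sketch in plain Python, not actual Django or SQLAlchemy code):

```python
class Field:
    """Stand-in for an ORM column; each instance plays the role of a struct field."""
    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value

class User:
    # The class body reads like a struct definition: one named field per column.
    id = Field()
    name = Field()

user = User()
user.id = 1
user.name = "alice"
print(user.id, user.name)  # 1 alice
```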

Composite types are not limited to structs. When you want to work with more than one integer for instance you would use a collection like an array. In Python you have lists and each item in the list can be of an arbitrary type. This is in contrast with lists defined to be specific to a type (like list of integer).

"List of integer" always says more than just "list". While you could argue that by iterating over the list you can figure out which type it holds, the empty list causes a problem: when you are given a list in Python without elements in it, you cannot know the type.

The exact same problem is caused by the null reference (None) in Python. When you pass a user to a function and that user might be None, you all of a sudden cannot be sure that you are dealing with a user object at all.

So what's the solution? Not having null references, and having explicitly typed arrays. Haskell is obviously the language everybody points to for this, but there are others which look less hostile. For instance Rust looks much more like C++ and as such more familiar, but brings a very powerful type system to the table.

So how do you express "no user present" if there are no null references? The answer in Rust, for instance, is option types: Option&lt;User&gt; means there is either Some(user) or None. The former is a tagged enum which wraps a value (a specific user). Because the variable can now be either some value or nothing, all code that deals with it needs to explicitly handle the None case or it will not even compile.
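Python's closest analogue is typing's Optional annotation, which documents the possibility of None but, unlike Rust's Option, enforces nothing. A small sketch (the function and names here are hypothetical):

```python
from typing import Optional

def find_user(user_id: int, users: dict) -> Optional[str]:
    # Returns None when the user is absent; nothing forces callers to check.
    return users.get(user_id)

users = {1: "alice"}
name = find_user(2, users)

# Forgetting this None check would raise AttributeError at runtime; in Rust
# the equivalent omission with Option<User> would not compile at all.
greeting = name.upper() if name is not None else "anonymous"
print(greeting)  # anonymous
```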

The Future is Gray

In the past the world was very clearly divided between interpreted languages with dynamic typing and ahead of time compiled languages with static typing. This is changing as new trends emerge.

The first indication that we're moving into some unexplored territory was C#. It's a statically compiled language, and when it started it was very similar to Java in how the language operated. As the language was improved, many new type-system-related features landed. The most important was the introduction of generics, which allowed non-compiler-provided collections like lists and dictionaries to be strongly typed. After that, C# also went in the opposite direction by allowing sections of code to opt out of static typing on a variable-by-variable basis. This is ridiculously useful, especially in the context of working with data provided by webservices (JSON, XML etc.), where you just do some potentially unsafe processing and catch any type-system-related exceptions to inform the user about bad input data.

Today C#'s type system is very powerful, supporting generics with covariance and contravariance specifications. Not only that, it also grew a lot of language-level support to deal with nullable types. For instance the null-coalescing operator (??) was introduced to provide default values for objects represented as null. While C# has gone too far down the null road to get rid of it entirely, the language is at least controlling the damage that can be done.

At the same time, other languages that are traditionally ahead-of-time compiled and statically typed are also exploring new areas. While C++ will always be statically typed, it has started to explore type inference on many levels. The days of MyType&lt;X, Y&gt;::const_iterator iter are gone. Today you can in almost all situations replace the type with a mere auto and the compiler will fill in the type for you.

Rust as a language has also excellent support for type inference which lets you write statically typed programs that are entirely void of any type declarations:

use std::collections::HashMap;

fn main() {
    let mut m = HashMap::new();
    m.insert("foo", vec!["some", "tags", "here"]);
    m.insert("bar", vec!["more", "here"]);

    for (key, values) in m.iter() {
        println!("{} = {}", key, values.connect("; "));
    }
}

I believe we're moving toward a future with powerful type systems. I do not believe that this will be the end of dynamic typing, but there appears to be a noticeable trend of embracing powerful static typing with local type inference.

Python and Explicit Typing

So not long ago, someone apparently convinced someone else at a conference that static typing is awesome and should be a language feature. I'm not exactly sure how that discussion went, but the end result was that mypy's typing module, in combination with Python 3's annotation syntax, was declared to be the gold standard of typing in Python.

In case you have not seen the proposal yet, it advocates something like this:

from typing import List

def print_all_usernames(users: List[User]) -> None:
    for user in users:
        print(user.username)

I honestly believe that this is not a good decision, for many reasons, the largest being that Python already suffers from having a type system that is not exactly good. The language actually has different semantics depending on how you look at it.

For static typing to make sense, the type system needs to be good: one where you can take two types and figure out how they relate to each other. Python doesn't have that.

Python's Type Semantics

If you have read the previous post about the slot system, you might remember that Python has different semantics depending on whether a type is implemented in C or in Python. This is quite a unique feature of the language and is not found in many other places. While it is true that many languages have types implemented on the interpreter level for bootstrapping purposes, those are typically fundamental types and as such special-cased.

In Python there are no real "fundamental" types. There are however a whole bunch of types that are implemented in C. These are not at all limited to primitives and fundamental types, they can appear everywhere and without any logic. For instance collections.OrderedDict is a type implemented in Python whereas collections.defaultdict from the same module is implemented in C.

This is actually causing quite a few problems for PyPy, which has to emulate the original types as well as possible to achieve an API similar enough that these differences are not noticeable. It is very important to understand what this general difference between C-level interpreter code and the rest of the language means.

As an example I want to point out the re module up to Python 2.7. (This behavior has ultimately been changed in the re module, but the general problem of the interpreter working differently from the language is still present.)

The re module provides a function (compile) to compile a regular expression into a regular expression pattern. It takes a string and returns a pattern object. Looks roughly like this:

>>> re.compile('foobar')
<_sre.SRE_Pattern object at 0x1089926b8>

As you can see this pattern object comes from the _sre module which is a bit internal but generally available:

>>> type(re.compile('foobar'))
<type '_sre.SRE_Pattern'>

Unfortunately it's a bit of a lie, because the _sre module does not actually contain that type:

>>> import _sre
>>> _sre.SRE_Pattern
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'SRE_Pattern'

Alright, fair enough: it would not be the first time that a type lied about its location, and it's an internal type anyway. So, moving on. We know the type of the pattern: it's _sre.SRE_Pattern, and as such a subclass of object:

>>> isinstance(re.compile(''), object)
True

And all objects implement some very common methods as we know. For instance all objects implement __repr__:

>>> re.compile('').__repr__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: __repr__

Oh. What happened here? Well, the answer is pretty bizarre. Internally, the SRE pattern object, for reasons unknown to me, had a custom tp_getattr slot until Python 2.7. This slot implemented a custom attribute lookup which provided access to some custom methods and attributes. When you inspect the object with dir() you will notice that lots of stuff is missing:

>>> dir(re.compile(''))
['__copy__', '__deepcopy__', 'findall', 'finditer', 'match',
 'scanner', 'search', 'split', 'sub', 'subn']

In fact, this leads you down to a really bizarre adventure of how this type actually functions. Here is what's happening:

The type claims that it's a subclass of object. This is true from the CPython interpreter's point of view, but not true for Python the language. That these are not the same thing is disappointing, but generally the case. The type does not correspond to the interface of object on the Python layer. Every call that goes through the interpreter works; every call that goes through the Python language fails. So type(x) succeeds, whereas x.__class__ fails.

What's a Subclass

The above example shows that you can have a class in Python that is a subclass of another thing, yet disagrees with the behavior of its base class. This is especially a problem when you talk about static typing. In Python 3, for instance, you cannot implement the interface of the dict type unless you write the type in C. The reason is that the type guarantees a certain behavior of the view objects that simply cannot be implemented in Python. It's impossible.

So when you would statically annotate that a function takes a dictionary with string keys and integer values, it would not be clear at all whether it accepts only a dict, a dict-like object, or a dictionary subclass.

Undefined Behavior

The bizarre behavior of the pattern objects was changed in Python 2.7, but the core issue remains. As with the behavior of dicts mentioned above, the language behaves differently depending on how the code was written, and the exact semantics of the type system are impossible to pin down.

A super bizarre case of these interpreter internals is, for instance, type comparisons in Python 2. This particular case does not exist like that in Python 3 because the interfaces were changed, but the fundamental problem can be found on many levels.

Let's take sorting of sets as an example. Sets in Python are useful types, but they have very bizarre comparison behavior. In Python 2 we have this function called cmp() which, given two values, will return a number that indicates their relative order. A return value smaller than zero means that the first argument is smaller than the second, a return value of zero means that they are equal, and any positive number means that the first argument is larger than the second.
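cmp() was removed in Python 3; for readers following along there, its three-way contract can be reproduced with a small shim:

```python
def cmp(a, b):
    """Python 2 style three-way comparison: -1, 0 or 1."""
    return (a > b) - (a < b)

print(cmp(1, 2))   # -1: first argument is smaller
print(cmp(3, 3))   # 0: arguments are equal
print(cmp(5, 2))   # 1: first argument is larger
```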

Here is what happens if you compare sets:

>>> cmp(set(), set())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot compare sets using cmp()

Why is that? Not exactly sure, to be honest. Probably because sets implement the comparison operators as subset checks, and that could not be made to work with cmp(). However frozensets, for instance, compare just fine:

>>> cmp(frozenset(), frozenset())
0

Except when one of the sets is not empty, it will fail. Why? The answer is that this is not a language feature but an optimization in the CPython interpreter. The empty frozenset is interned: since it is immutable and you cannot add to it, every empty frozenset is the same object. When two objects have the same pointer address, cmp will generally return 0. Why exactly, I could not quickly figure out, given how complex the comparison logic in Python 2 is, but there are multiple code paths in the comparison routines which might produce this result.

The point is not so much that there is a bug, but that Python does not actually have proper semantics for how types interact with each other. Instead the type system's behavior for a really long time has been "whatever CPython does".

You can find countless changesets in PyPy where they tried to reconstruct behavior found in CPython. Given that PyPy is written in Python, this becomes quite an interesting problem for the language. If the Python language were defined purely by its Python-level semantics, PyPy would have far fewer problems.

Instance Level Behavior

Now let's assume there were a hypothetical version of Python that fixed all of the problems mentioned; static types would still not fit into Python well. A big reason is that on the Python language level, types traditionally had very little meaning with regard to how objects interact.

For instance datetime objects are generally comparable with other things, but datetime objects are only comparable to other datetime objects if their timezone awareness is compatible. Similarly, the result of many operations is not clear until you look at the objects at hand. Adding two strings together in Python 2 can construct either a unicode or a bytestring object, and APIs like encoding or decoding in the codecs system can return any object.
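The timezone-awareness rule is easy to demonstrate (shown here with Python 3's datetime.timezone; in Python 2 a third-party tzinfo would play the same role):

```python
from datetime import datetime, timezone

naive = datetime(2014, 8, 27, 12, 0)
aware = datetime(2014, 8, 27, 12, 0, tzinfo=timezone.utc)

try:
    naive < aware
except TypeError as exc:
    # Both operands are datetimes, yet whether the comparison is even valid
    # depends on instance-level state (tzinfo), not on the types alone.
    print("comparison failed:", exc)
```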

Python as a language is too dynamic for annotations to work well. Just consider how important generators are for the language, yet generators could perform different type conversions on every single iteration.

Type annotations would be spotty at best, and they might even have a negative impact on API design. At the very least they will make things slower unless they are stripped at runtime. And they could never enable a language that compiles efficiently and statically without making Python something it is not.

Baggage and Semantics

I think my personal takeaway from Python the language is that it got ridiculously complex. Python suffers from not having a language specification, and the interactions between different types are already so complex that we will probably never end up with one. There are so many quirks and odd little behaviors that the only thing a language specification would ever produce is a textual description of the CPython interpreter.

On this foundation it makes very little sense in my mind to put type annotations.

I think if someone wanted to develop another predominantly dynamically typed language in the future, they should go the extra mile to clearly define how types should work. JavaScript does a pretty good job at that: all semantics of the builtin types are clearly defined, even if they are bizarre. I think this generally is a good thing. Once you have clearly defined how the semantics work, you are free to optimize, or to put optional static typing on top later.

Keeping a language lean and well defined seems to be very much worth the trouble. Future language designers definitely should not make the mistake that PHP, Python and Ruby did, where the language's behavior ends up being "whatever the interpreter does".

I think for Python this is very unlikely to ever change at this point, because the time and work required to clean up the language and interpreter outweigh the benefits.

Sunday, 24 August


Floris Bruynooghe: New pytest-timeout release

 ∗ Planet Python

At long last I have updated my pytest-timeout plugin. pytest-timeout is a plugin for py.test which will interrupt tests that take longer than a set time and dump the stack traces of all threads. It was initially developed in order to debug some tests which would occasionally hang on a CI server, and it can be used in a variety of similar situations where getting some output is more useful than getting a clean test run.

The main new feature of this release is that the plugin now finally plays nicely with py.test's --pdb option. When that option is used, the timeout plugin will no longer interrupt the interactive pdb session after the given timeout.

Secondly, this release fixes an important bug: a timeout in the finaliser of a fixture at the end of the session would not be caught by the plugin. This was mainly because pytest-timeout had not been updated since py.test changed the way fixtures are cached based on their scope, with the introduction of @pytest.fixture(scope='...'), even though that change happened a long time ago.

So if you use py.test and a CI server, I suggest now is as good a time as any to configure it to use pytest-timeout with a fairly large timeout of, say, 300 seconds, and then forget about it. Until maybe one day it suddenly saves you a lot of head scratching and time.
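A minimal configuration along those lines might look like this (the option names below are taken from the pytest-timeout README; verify them against the version you install):

```ini
# pytest.ini
[pytest]
# Fail any test that runs longer than 300 seconds and dump all stack traces.
timeout = 300
```

The same value can be passed ad hoc with --timeout=300 on the command line, and individual slow tests can override it with the @pytest.mark.timeout(...) marker.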


Nigel Babu: Deploying automatically with webhooks

 ∗ Planet Python

Recently, we had a project deliverable to setup a repository that would auto-deploy to the staging server. I spent a bit of time getting this right, so I figured it’d be useful for someone else who had to do this.


I wrote cloaked-spice, a tiny Flask app to do the job here. The Flask documentation has a pretty neat example for verifying request checksums; I just modified it to work with HMAC and SHA1. It sounds simple, but thanks to silly mistakes, it took a few days to get it working :)
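The core of that verification pattern looks roughly like this (a sketch of the general GitHub-style X-Hub-Signature check, not the actual cloaked-spice code):

```python
import hashlib
import hmac

def signature_valid(secret: bytes, body: bytes, header_value: str) -> bool:
    """Check a GitHub-style ``X-Hub-Signature: sha1=<hexdigest>`` header."""
    expected = "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()
    # compare_digest performs a constant-time comparison, avoiding timing leaks.
    return hmac.compare_digest(expected, header_value)

secret = b"webhook-secret"
body = b'{"ref": "refs/heads/master"}'
good = "sha1=" + hmac.new(secret, body, hashlib.sha1).hexdigest()

print(signature_valid(secret, body, good))          # True
print(signature_valid(secret, body, "sha1=bogus"))  # False
```

In the Flask view you would compare the incoming request's raw body and its X-Hub-Signature header this way before touching anything on disk.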

Server Setup

The Flask app is extremely limited in what it can do on the server. It runs as a user called deploy with a group called deploy. All the files that the app can update are owned by the group deploy and editable by that group. Thanks to sudo's flexibility, the app is allowed to run exactly one command via sudo. Here's the line in the sudoers file:

deploy ALL=(ALL) NOPASSWD: /usr/bin/service apache2 *

Serving cloaked-spice

I’m using gunicorn and Nginx to serve the app and supervisor to manage the gunicorn process. I’m sure other wsgi servers would be up-to the job just as easily.

Credit to GitHub for suggesting a name for this that sounds less boring than deploy.

Saturday, 23 August


Frank Wierzbicki: Jython 2.7 beta3 released!

 ∗ Planet Python

On behalf of the Jython development team, I'm pleased to announce that the third beta of Jython 2.7 is available. I'd like to thank Adconion Media Group (now Amobee) for sponsoring my work on Jython. I'd also like to thank the many contributors to Jython.

Jython 2.7b3 brings us up to language-level compatibility with the 2.7 version of CPython. We have focused largely on CPython compatibility, and so this release of Jython can run more pure-Python apps than any previous release. Please see the NEWS file for detailed release notes. This release of Jython requires JDK 7 or above.

Some highlights of the changes that come in beta 3:

  • Reimplementation of socket/select/ssl on top of Netty 4.
  • Requests now works.
  • Pip almost works (it works with a custom branch).
  • Numerous bug fixes
To get a more complete list of changes in beta 3, see Jim Baker's talk.

As a beta release we are concentrating on bug fixing and stabilization for a production release.

This release is being hosted at maven central. The traditional installer can be found here. See the installation instructions for using the installer. Three other versions are available:
To see all of the files available including checksums, go here and navigate to the appropriate distribution and version.

Python Piedmont Triad User Group: PYPTUG Meeting - August 25th

 ∗ Planet Python

PYthon Piedmont Triad User Group meeting

Come join PYPTUG at our next meeting (August 25th 2014) to learn more about the Python programming language, modules and tools. Python is the perfect language to learn if you've never programmed before, and at the other end, it is also the perfect tool that no expert would do without.


Meeting will start at 5:30pm.

We will open with an intro to PYPTUG and how to get started with Python, followed by PYPTUG activities and members' projects, then on to news from the community.

Finally, the main part of the meeting will be a presentation.

Python for C# Developers:

Python Tools for Visual Studio

PTVS 2.1 RC has just been released. Microsoft Visual Studio comes in a variety of flavors, including a scaled-down free version, and PTVS works fine with it, so you can use it at no cost.

But beyond that, you will learn how C# and Python can interact, how they relate, what typical C# patterns look like in Python, etc.

Who would benefit from Python for C# Developers?

Although the title seems to imply that one has to be an actual C# developer to benefit, the truth is that anybody who intends to write a program that will run on a Windows computer would benefit.


Monday, August 25th 2014
Meeting starts at 5:30PM


Wake Forest University, close to Polo Rd and University Parkway:

Wake Forest University, Winston-Salem, NC 27109

 Map this

See also this campus map (PDF) and also the Parking Map (PDF) (Manchester hall is #20A on the parking map)

And speaking of parking: parking after 5pm is on a first-come, first-served basis. The official parking policy is:
"Visitors can park in any general parking lot on campus. Visitors should avoid reserved spaces, faculty/staff lots, fire lanes or other restricted area on campus. Frequent visitors should contact Parking and Transportation to register for a parking permit."

Mailing List

Don't forget to sign up to our user group mailing list:

It is the only step required to become a PYPTUG member.

Meetup Group

In order to get a feel for how much food we'll need, we ask that you register your attendance to this meeting on meetup:



Friday, 22 August






Peter Bengtsson: premailer now with 100% test coverage

 ∗ Planet Python

One of my most popular GitHub open source projects is premailer. It's a Python library that takes HTML and CSS and produces HTML with all the CSS inlined into style attributes. This is a useful and necessary technique when sending HTML emails, because you can't send them with an external CSS file (or, in many cases, even a CSS style tag).

The project has had 23 contributors so far, and as always, people come in, scratch some itch they have, and then leave. I really try to keep good test coverage, and when people contribute code I almost always require that it come with tests too.

But sometimes you miss things. Also, this project was born as a weekend hack that slowly morphed into an actual package and its own repository and I bet there was code from that day that was never fully test covered.

So today I combed through the code and plugged all the holes where there wasn't test coverage.
Also, I set up Coveralls (project page), an awesome service that hooks into Travis CI so that on every build and every pull request, the tests are run with --with-cover under nosetests and the coverage output is reported to Coveralls.

The relevant changes you need to do are:

1) You need to go to (sign in with your GitHub account) and add the repo.
2) Edit your .travis.yml file to contain the following:

    - pip install coverage
    - pip install coveralls
    - coveralls

And you need to execute your tests so that coverage is calculated (the coverage module stores everything in a .coverage file, which coveralls analyzes and sends). So in my case I changed the test command to this:

    - nosetests premailer --with-cover --cover-erase --cover-package=premailer

3) You must also give coveralls some clues so that it reports on only the relevant files. Here's what my coverage configuration looked like:

[run]
source = premailer

[report]
omit = premailer/test*

Now, I get to have a cute "coverage: 100%" badge in the README and when people post pull requests Coveralls will post a comment to reflect how the pull request changes the test coverage.

I am so grateful for all these wonderful tools. And it's all free too!



Grzegorz Śliwiński: My first bump with pypy

 ∗ Planet Python

Until two weeks ago, I had only read about PyPy. My thought about it was somewhat simple: it's an interpreter that speeds up Python code. I expected that some changes to Python code that's supposed to run on PyPy would be required, but I thought they would be easy to spot. I couldn't have been more wrong.

Read more… (1 min remaining to read)

Reach Tim: Pelican Configuration for

 ∗ Planet Python

You are reading an article on a static blog site that is built with the Pelican static site generator.

This article describes how the blog site is configured.

You can get all the code in my GitHub project reachtim. I use snippets of that code in this article.

I looked at several static site generators; Pelican seemed to fit my brain the best and it has a great and active community.

The documentation was another reason: the Pelican Blog and Pelican Docs sites were indispensable, not to mention the various blog sites that describe how they were set up.

The Goal

I wanted a simple blog with no more machinery involved than I actually would use. I seem to lean toward yagni; I like having only ‘just enough’ machinery to get the job done.

So a statically generated site seemed to be the right thing—I don’t need a database for users or ecommerce and the site has limited interactivity.

I wanted a blog that supports the following for authoring and administration:

  • Simple to use (for me to add content)
  • Markdown + LaTeX math rendering
  • Code highlighting
  • Automatic analytics

Of course the reading experience is just as important, and to support that end of things I wanted:

  • Simple to navigate (for readers)
  • Feed subscriptions
  • Commenting capability

The Directory Structure

I wasn’t sure how much customization I would want to do, so I downloaded the plugin and theme zip files from getpelican/pelican-plugins and getpelican/pelican-themes. That turned out to be a good idea. The directory structure looks like this:


With this structure, if I want a new blog site, I can create a new directory in projects and set the new pelican root directory there. In the configuration file, I can then make local calls to the theme and plugins I want.

To add content, I write in MarkDown (*.md) or RestructuredText (*.rst); if the content is a post, it goes in the articles directory. If it is some other type of content, like the about page, it goes in the pages directory.

For images or other special files, I add them to the images or extra directory and they are copied straight through to the static site.

The Configuration File

In the configuration file, I first set up the basics. Well, I actually used pelican-quickstart to generate the initial file, but afterwards I went through it line by line to make sure it was exactly what I wanted and that I understood what was going on.

AUTHOR = u'Tim Arnold'
SITENAME = u'ReachTim'
SITESUBTITLE = 'Python, LaTeX, and XML: coding together.'
TIMEZONE = 'America/New_York'
PATH = 'content'

Then I made a few changes so I can have the articles in their own subdirectory (ARTICLE_PATHS) and set the static paths (the directories and files that are copied over verbatim). So the images and extra directories are copied over with no other processing. The files listed in EXTRA_PATH_METADATA are copied to the root of the OUTPUT_PATH directory.

I’m setting the output directory to be same name as the website, reachtim; it’s my personal preference and fits my deployment scheme. The default name is output.

OUTPUT_PATH = 'reachtim/'
ARTICLE_PATHS = ['articles',]
STATIC_PATHS = ['images', 'extra',]

EXTRA_PATH_METADATA = {
    'extra/404.html': {'path': '404.html'},
    'extra/403.html': {'path': '403.html'},
    'extra/robots.txt': {'path': 'robots.txt'},
    'extra/.htaccess': {'path': '.htaccess'},
    'extra/crossdomain.xml': {'path': 'crossdomain.xml'},
    'extra/favicon.ico': {'path': 'favicon.ico'},
}

It was easy to change the theme or set of plugins since I had all of them in a local directory. I only have to change these lines to try a different theme or add/delete a plugin.

THEME = '../themes/zurb-F5-basic'

I played around with a lot of themes before deciding on the zurb-F5-basic. I like the way it looks and operates. Other than this site, you can see another example here, from the github page: zurb-F5-basic

The plugins:

  • neighbors adds prev_article and next_article variables to the article context, so you can use them in your template.
  • pelican_fontawesome enables you to embed FontAwesome icons in your content. This plugin was not in the getpelican plugins project, so I installed it separately from pelican-fontawesome
  • pelican_gist makes it easy to embed entire GitHub gists into your content.
  • render_math enables the rendering of LaTeX style math by using the MathJax javascript engine.
  • sitemap automatically generates your sitemap which helps search engines know about all of your pages.

The sitemap plugin needs a little more data:

SITEMAP = {
    'format': 'xml',
    'priorities': {
        'articles': 0.5,
        'indexes': 0.5,
        'pages': 0.5
    },
    'changefreqs': {
        'articles': 'monthly',
        'indexes': 'daily',
        'pages': 'monthly'
    }
}

I also set things up so I have Atom feeds, Disqus commenting capability, and Google Analytics. I manually added the Google Analytics code after I registered the site with Google. Of course you need a Google account for that, and a Disqus account for commenting capability.

DISQUS_SITENAME = 'reachtim'
FEED_ALL_ATOM = 'feeds/all.atom.xml'
CATEGORY_FEED_ATOM = 'feeds/%s.atom.xml'

Finally, I added these two settings for a more attractive look. The TYPOGRIFY setting provides a few changes to the typesetting by using the Typogrify library.

There are a lot of choices for MD_EXTENSIONS and you can read about them in their documentation.


Writing Content

You can write using Markdown or reStructuredText, and each page or article you write will have some metadata at the top of the file. This article has the following Markdown metadata:

Title: Pelican Configuration for
Category: Python
Date: 2014-Aug-20
Tags: python, web
Summary: How this site is set up

You can add other metadata to an article—whatever you add will be available to your template in the article’s context.

The content follows the metadata after a blank line and it uses normal MarkDown syntax.

Thursday, 21 August


Jonathan Street: Seizure detection challenge on kaggle

 ∗ Planet Python

Following a hiatus of a couple of years I have rejoined the competitors on kaggle. The UPenn and Mayo Clinic Seizure Detection Challenge had 8 days to run when I decided to participate. Given the time I had available, I'm quite pleased with my final score: I finished in 27th place with 0.93558. The metric used was area under the ROC curve, where 1.0 is perfect and 0.5 is no better than random.

The code is now on github.

Prompted by a post from Zac Stewart I decided to give pipelines in scikit-learn a try. The data from the challenge consisted of electroencephalogram recordings from several patients and dogs. These subjects had different numbers of channels in their recordings, so manually implementing the feature extraction would have been very slow and repetitive. Using pipelines made the process incredibly easy and allowed me to make changes quickly.

The features I used were incredibly simple. All the code is in - I used variance, median, and the FFT, which I pooled into 6 bins. No optimization of hyperparameters was attempted before I ran out of time.
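The per-channel part of this can be sketched with the standard library alone. This is only a simplified illustration of the feature set described above (the function names are my own, and the pooled FFT bins are left out for brevity):

```python
import statistics

def channel_features(channel):
    # Per-channel summary features: variance and median, as in the post.
    # The 6 pooled FFT bins are omitted in this sketch.
    return [statistics.variance(channel), statistics.median(channel)]

def extract_features(recording):
    # A recording is a list of channels; subjects have different channel
    # counts, so we simply concatenate the per-channel features.
    feats = []
    for channel in recording:
        feats.extend(channel_features(channel))
    return feats
```

Wrapping each feature in a scikit-learn transformer and combining them in a pipeline removes the per-subject repetition entirely.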

Next time, I'll be looking for a competition with longer left to run.


12:10 PyDDF Python Sprint 2014

 ∗ Planet Python

The following announcement is for a Python sprint in Düsseldorf, Germany; the original post is in German, translated here.


The first PyDDF Python Sprint 2014 in Düsseldorf:

Saturday, 27.09.2014, 10:00-18:00
Sunday, 28.09.2014, 10:00-18:00
(Building 25.41, ground floor, room 45) of the
ZIM of the HHU Düsseldorf


The Python Meeting Düsseldorf (PyDDF), together with the ZIM of the Heinrich-Heine-Universität Düsseldorf, is organizing a Python sprint weekend in September.

The sprint will take place on the weekend of 27/28.09.2014 in the seminar room (Building 25.41, ground floor, room 45) of the ZIM of the HHU Düsseldorf.
We have the following topic areas in mind as suggestions:
  • Openpyxl
Openpyxl is a Python library for reading and writing Excel 2010 XLSX/XLSM files.

Charlie is a co-maintainer of the package and would like to work on the following topics:

- an ElementTree implementation of the lxml.etree.xmlfile module (context manager)
- coroutines for serialization
- Python code object generation based on the schema
  • HTTP audio streaming for Mopidy
Mopidy is an MPD music server that can subscribe to many internet streaming services, but only plays them back through local audio devices.

It would be nice if internet radios, such as the Squeezebox, could be connected as well. There is already a ticket for this that could probably serve as a starting point:


The goal would be to write a Mopidy extension that implements this feature.
Of course, every participant can propose further topics, e.g.
  • Kivy (Python on Android/iOS)
  • RaspberryPi (we will bring a few of them along)
  • FritzConnection (Python API for the Fritzbox)
  • OpenCV (processing webcam images with Python)
  • and others
Everything else, including registration, can be found on the sprint page:
Participants should also sign up on the PyDDF mailing list, since that is where we coordinate:
We only have limited space in the seminar room, so it would be good to know the approximate number of participants in advance. There is room for at most 30 participants.

About the Python Meeting Düsseldorf

The Python Meeting Düsseldorf is a regular event in Düsseldorf aimed at Python enthusiasts from the region.

Our PyDDF YouTube channel offers a good overview of the talks; we publish videos of the talks there after each meeting.

The meeting is organized by the GmbH, Langenfeld, in cooperation with Clark Consulting & Research, Düsseldorf.

Have fun!

Marc-Andre Lemburg,

Published: 2013-03-20

Leonardo Giordani: Python 3 OOP Part 3 - Delegation: composition and inheritance

 ∗ Planet Python

Previous post

Python 3 OOP Part 2 - Classes and members

The Delegation Run

If classes are objects what is the difference between types and instances?

When I talk about "my cat" I am referring to a concrete instance of the "cat" concept, which is a subtype of "animal". So, despite both being objects, types can be specialized, while instances cannot.

Usually an object B is said to be a specialization of an object A when:

  • B has all the features of A
  • B can provide new features
  • B can perform some or all the tasks performed by A in a different way

Those targets are very general and valid for any system and the key to achieve them with the maximum reuse of already existing components is delegation. Delegation means that an object shall perform only what it knows best, and leave the rest to other objects.

Delegation can be implemented with two different mechanisms: composition and inheritance. Sadly, very often only inheritance is listed among the pillars of OOP techniques, forgetting that it is an implementation of the more generic and fundamental mechanism of delegation; perhaps a better nomenclature for the two techniques could be explicit delegation (composition) and implicit delegation (inheritance).

Please note that, again, when talking about composition and inheritance we are focusing on behavioural or structural delegation. Another way to think about the difference between composition and inheritance is to consider whether the object knows who can satisfy your request, or whether the object is the one that satisfies the request itself.

Please, please, please do not forget composition: in many cases, composition can lead to simpler systems, with benefits on maintainability and changeability.

Usually composition is said to be a very generic technique that needs no special syntax, while inheritance and its rules are strongly dependent on the language of choice. Actually, the strong dynamic nature of Python softens the boundary line between the two techniques.

Inheritance Now

In Python a class can be declared as an extension of one or more different classes, through the class inheritance mechanism. The child class (the one that inherits) has the same internal structure as the parent class (the one that is inherited), and in the case of multiple inheritance the language has very specific rules to manage possible conflicts or redefinitions among the parent classes. A very simple example of inheritance is

``` python
class SecurityDoor(Door):
    pass
```

where we declare a new class SecurityDoor that, at the moment, is a perfect copy of the Door class. Let us investigate what happens when we access attributes and methods. First we instance the class

``` python
>>> sdoor = SecurityDoor(1, 'closed')
```

The first check we can do is that class attributes are still global and shared

``` python
>>> SecurityDoor.colour is Door.colour
True
>>> sdoor.colour is Door.colour
True
```

This shows us that Python tries to resolve instance members not only looking into the class the instance comes from, but also investigating the parent classes. In this case sdoor.colour becomes SecurityDoor.colour, that in turn becomes Door.colour. SecurityDoor is a Door.

If we investigate the content of __dict__ we can catch a glimpse of the inheritance mechanism in action

``` python
>>> sdoor.__dict__
{'number': 1, 'status': 'closed'}
>>> sdoor.__class__.__dict__
mappingproxy({'__doc__': None, '__module__': '__main__'})
>>> Door.__dict__
mappingproxy({'__dict__': <attribute '__dict__' of 'Door' objects>,
    'colour': 'yellow',
    'open': <function Door.open at 0xb687e224>,
    '__init__': <function Door.__init__ at 0xb687e14c>,
    '__doc__': None,
    'close': <function Door.close at 0xb687e1dc>,
    'knock': <classmethod object at 0xb67ff6ac>,
    '__weakref__': <attribute '__weakref__' of 'Door' objects>,
    '__module__': '__main__',
    'paint': <classmethod object at 0xb67ff6ec>})
```


As you can see the content of __dict__ for SecurityDoor is very narrow compared to that of Door. The inheritance mechanism takes care of the missing elements by climbing up the class tree. Where does Python find the parent classes? A class always contains a __bases__ tuple that lists them

``` python
>>> SecurityDoor.__bases__
(<class '__main__.Door'>,)
```

So an example of what Python does to resolve a class method call through the inheritance tree is

``` python
>>> sdoor.__class__.__bases__[0].__dict__['knock'].__get__(sdoor)
<bound method type.knock of <class '__main__.SecurityDoor'>>
>>> sdoor.knock
<bound method type.knock of <class '__main__.SecurityDoor'>>
```

Please note that this is just an example that does not consider multiple inheritance.
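As a self-contained sketch of this lookup order (a minimal Door with just a class attribute is redefined here, since the full class comes from an earlier part of the series), the chain of classes Python climbs is exposed by the __mro__ attribute:

```python
class Door:
    colour = 'yellow'

class SecurityDoor(Door):
    pass

# The lookup for SecurityDoor.colour climbs the classes listed in the MRO:
mro = SecurityDoor.__mro__            # (SecurityDoor, Door, object)

# colour is not in SecurityDoor.__dict__, so it is found on Door instead.
defined_locally = 'colour' in SecurityDoor.__dict__   # False
colour = SecurityDoor.colour                          # 'yellow'
```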

Let us try now to override some methods and attributes. In Python you can override (redefine) a parent class member simply by redefining it in the child class.

``` python
class SecurityDoor(Door):
    colour = 'gray'
    locked = True

    def open(self):
        if not self.locked:
            self.status = 'open'
```


As you can forecast, the overridden members now are present in the __dict__ of the SecurityDoor class

``` python
>>> SecurityDoor.__dict__
mappingproxy({'__doc__': None,
    '__module__': '__main__',
    'open': <function SecurityDoor.open at 0xb6fcf89c>,
    'colour': 'gray',
    'locked': True})
```


So when you override a member, the one you put in the child class is used instead of the one in the parent class simply because the former is found before the latter while climbing the class hierarchy. This also shows you that Python does not implicitly call the parent implementation when you override a method. So, overriding is a way to block implicit delegation.

If we want to call the parent implementation we have to do it explicitly. In the former example we could write

``` python
class SecurityDoor(Door):
    colour = 'gray'
    locked = True

    def open(self):
        if self.locked:
            return
        Door.open(self)
```


You can easily test that this implementation is working correctly.

``` python
>>> sdoor = SecurityDoor(1, 'closed')
>>> sdoor.status
'closed'
>>> sdoor.open()
>>> sdoor.status
'closed'
>>> sdoor.locked = False
>>> sdoor.open()
>>> sdoor.status
'open'
```

This form of explicit parent delegation is heavily discouraged, however.

The first reason is because of the very high coupling that results from explicitly naming the parent class again when calling the method. Coupling, in the computer science lingo, means to link two parts of a system, so that changes in one of them directly affect the other one, and is usually avoided as much as possible. In this case if you decide to use a new parent class you have to manually propagate the change to every method that calls it. Moreover, since in Python the class hierarchy can be dynamically changed (i.e. at runtime), this form of explicit delegation could be not only annoying but also wrong.

The second reason is that in general you need to deal with multiple inheritance, where you do not know a priori which parent class implements the original form of the method you are overriding.

To solve these issues, Python supplies the super() built-in function, that climbs the class hierarchy and returns the correct class that shall be called. The syntax for calling super() is

``` python
class SecurityDoor(Door):
    colour = 'gray'
    locked = True

    def open(self):
        if self.locked:
            return
        super().open()
```


The output of super() is not exactly the Door class. It returns a super object whose representation is <super: <class 'SecurityDoor'>, <SecurityDoor object>>. This object, however, acts like the parent class, so you can safely ignore its custom nature and use it just as you would use the Door class in this case.
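Putting it together as a runnable sketch (with a minimal Door defined here, since the original class comes from an earlier part of the series):

```python
class Door:
    def __init__(self, number, status):
        self.number = number
        self.status = status

    def open(self):
        self.status = 'open'

class SecurityDoor(Door):
    locked = True

    def open(self):
        if self.locked:
            return
        super().open()  # super() finds the parent implementation for us

sdoor = SecurityDoor(1, 'closed')
sdoor.open()            # locked: the call is blocked, status stays 'closed'
sdoor.locked = False
sdoor.open()            # now delegated to Door.open(): status is 'open'
```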

Enter the Composition

Composition means that an object knows another object, and explicitly delegates some tasks to it. While inheritance is implicit, composition is explicit: in Python, however, things are far more interesting than this =).

First of all let us implement classic composition, which simply makes an object part of the other as an attribute

``` python
class SecurityDoor:
    colour = 'gray'
    locked = True

    def __init__(self, number, status):
        self.door = Door(number, status)

    def open(self):
        if self.locked:
            return
        self.door.open()

    def close(self):
        self.door.close()
```


The primary goal of composition is to relax the coupling between objects. This little example shows that now SecurityDoor is an object and no longer a Door, which means that the internal structure of Door is not copied. For this very simple example both Door and SecurityDoor are not big classes, but in a real system objects can be very complex; this means that their allocation consumes a lot of memory, and if a system contains thousands or millions of objects that could be an issue.

The composed SecurityDoor has to redefine the colour attribute since the concept of delegation applies only to methods and not to attributes, doesn't it?

Well, no. Python provides a very high degree of indirection for objects manipulation and attribute access is one of the most useful. As you already discovered, accessing attributes is ruled by a special method called __getattribute__() that is called whenever an attribute of the object is accessed. Overriding __getattribute__(), however, is overkill; it is a very complex method, and, being called on every attribute access, any change makes the whole thing slower.

The method we have to leverage to delegate attribute access is __getattr__(), which is a special method that is called whenever the requested attribute is not found in the object. So basically it is the right place to dispatch all attribute and method access our object cannot handle. The previous example becomes

``` python
class SecurityDoor:
    locked = True

    def __init__(self, number, status):
        self.door = Door(number, status)

    def open(self):
        if self.locked:
            return
        self.door.open()

    def __getattr__(self, attr):
        return getattr(self.door, attr)
```


Using __getattr__() blurs the line between inheritance and composition since, after all, inheritance is a form of automatic delegation of every member access.

``` python
class ComposedDoor:
    def __init__(self, number, status):
        self.door = Door(number, status)

    def __getattr__(self, attr):
        return getattr(self.door, attr)
```


As this last example shows, delegating every member access through __getattr__() is very simple. Pay attention to getattr() which is different from __getattr__(). The former is a built-in that is equivalent to the dotted syntax, i.e. getattr(obj, 'someattr') is the same as obj.someattr, but you have to use it since the name of the attribute is contained in a string.

Composition provides a superior way to manage delegation since it can selectively delegate the access, even mask some attributes or methods, while inheritance cannot. In Python you also avoid the memory problems that might arise when you put many objects inside another; Python handles everything through its reference, i.e. through a pointer to the memory position of the thing, so the size of an attribute is constant and very limited.
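For example, here is a hypothetical MaskedDoor wrapper (not from the original post) that delegates everything to the inner Door except an explicit deny-list of names, something inheritance cannot do:

```python
class Door:
    def __init__(self, number, status):
        self.number = number
        self.status = status

    def open(self):
        self.status = 'open'

    def close(self):
        self.status = 'closed'

class MaskedDoor:
    # Hypothetical wrapper: delegates every attribute to the inner Door,
    # but hides the names listed in _masked from users of the wrapper.
    _masked = ('close',)

    def __init__(self, number, status):
        self.door = Door(number, status)

    def __getattr__(self, attr):
        if attr in self._masked:
            raise AttributeError(attr)
        return getattr(self.door, attr)
```

A MaskedDoor can be opened and its status read through delegation, but accessing close() raises AttributeError as if the method did not exist.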

Movie Trivia

Section titles come from the following movies: The Cannonball Run (1981), Apocalypse Now (1979), Enter the Dragon (1973).


You will find a lot of documentation in this Reddit post. Most of the information contained in this series comes from those sources.


Feel free to use the blog Google+ page to comment on the post. The GitHub issues page is the best place to submit corrections.

Next post

Python 3 OOP Part 4 - Polymorphism

Leonardo Giordani: Python 3 OOP Part 4 - Polymorphism

 ∗ Planet Python

Previous post

Python 3 OOP Part 3 - Delegation: composition and inheritance

Good Morning, Polymorphism

The term polymorphism, in the OOP lingo, refers to the ability of an object to adapt the code to the type of the data it is processing.

Polymorphism has two major applications in an OOP language. The first is that an object may provide different implementations of one of its methods depending on the type of the input parameters. The second is that code written for a given type of data may be used on data with a derived type, i.e. methods understand the class hierarchy of a type.

In Python polymorphism is one of the key concepts, and we can say that it is a built-in feature. Let us deal with it step by step.

First of all, you know that in Python the type of a variable is not explicitly declared. Beware that this does not mean that Python variables are untyped. On the contrary, everything in Python has a type, it just happens that the type is implicitly assigned. If you remember the last paragraph of the previous post, I stated that in Python variables are just pointers (using a C-like nomenclature), in other words they just tell the language where in memory a variable has been stored. What is stored at that address is not a business of the variable.

``` python
>>> a = 5
>>> a
5
>>> type(a)
<class 'int'>
>>> hex(id(a))
'0x83fe540'
>>> a = 'five'
>>> a
'five'
>>> type(a)
<class 'str'>
>>> hex(id(a))
'0xb70d6560'
```

This little example shows a lot about the Python typing system. The variable a is not statically declared, after all it can contain only one type of data: a memory address. When we assign the number 5 to it, Python stores in a the address of the number 5 (0x83fe540 in my case, but your result will be different). The type() built-in function is smart enough to understand that we are not asking about the type of a (which is always a reference), but about the type of the content. When you store another value in a, the string 'five', Python shamelessly replaces the previous content of the variable with the new address.

So, thanks to the reference system, Python type system is both strong and dynamic. The exact definition of those two concepts is not universal, so if you are interested be ready to dive into a broad matter. However, in Python, the meaning of those two words is the following:

  • type system is strong because everything has a well-defined type that you can check with the type() built-in function
  • type system is dynamic since the type of a variable is not explicitly declared, but changes with the content

Onward! We just scratched the surface of the whole thing.

To explore the subject a little more, try to define the simplest function in Python (apart from an empty function)

``` python
def echo(a):
    return a
```


The function works as expected, just echoes the given parameter

``` python
>>> echo(5)
5
>>> echo('five')
'five'
```

Pretty straightforward, isn't it? Well, if you come from a statically compiled language such as C or C++ you should be at least puzzled. What is a? I mean: what type of data does it contain? Moreover, how can Python know what it is returning if there is no type specification?

Again, if you recall the references stuff everything becomes clear: that function accepts a reference and returns a reference. In other words we just defined a sort of universal function, that does the same thing regardless of the input.

This is exactly the problem that polymorphism wants to solve. We want to describe an action regardless of the type of objects, and this is what we do when we talk among humans. When you describe how to move an object by pushing it, you may explain it using a box, but you expect the person you are addressing to be able to repeat the action even if you need to move a pen, or a book, or a bottle.

There are two main strategies you can apply to get code that performs the same operation regardless of the input types.

The first approach is to cover all cases, and this is a typical approach of procedural languages. If you need to sum two numbers that can be integers, float or complex, you just need to write three sum() functions, one bound to the integer type, the second bound to the float type and the third bound to the complex type, and to have some language feature that takes charge of choosing the correct implementation depending on the input type. This logic can be implemented by a compiler (if the language is statically typed) or by a runtime environment (if the language is dynamically typed) and is the approach chosen by C++. The disadvantage of this solution is that it requires the programmer to forecast all the possible situations: what if I need to sum an integer with a float? What if I need to sum two lists? (Please note that C++ is not so poorly designed, and the operator overloading technique allows to manage such cases, but the base polymorphism strategy of that language is the one exposed here).

The second strategy, the one implemented by Python, is simply to require the input objects to solve the problem for you. In other words you ask the data itself to perform the operation, reversing the problem. Instead of writing a bunch of functions that sum all the possible types in every possible combination, you just write one function that requires the input data to sum, trusting that they know how to do it. Does it sound complex? It is not.

Let's look at the Python implementation of the + operator. When we write c = a + b, Python actually executes c = a.__add__(b). As you can see the sum operation is delegated to the first input variable. So if we write

``` python
def sum(a, b):
    return a + b
```


there is no need to specify the type of the two input variables. The object a (the object contained in the variable a) shall be able to sum with the object b. This is a very beautiful and simple implementation of the polymorphism concept. Python functions are polymorphic simply because they accept everything and trust the input data to be able to perform some actions.
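Any class can take part in the + protocol simply by defining __add__(); the Money class below is a hypothetical illustration, not from the original post:

```python
class Money:
    # A hypothetical type: by defining __add__() it "knows how to sum",
    # so it works with the polymorphic sum(a, b) function above.
    def __init__(self, amount):
        self.amount = amount

    def __add__(self, other):
        return Money(self.amount + other.amount)

total = Money(3) + Money(4)   # Python runs Money(3).__add__(Money(4))
```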

Let us consider another simple example before moving on. The built-in len() function returns the length of the input object. For example

``` python
>>> l = [1, 2, 3]
>>> len(l)
3
>>> s = "Just a sentence"
>>> len(s)
15
```

As you can see it is perfectly polymorphic: you can feed either a list or a string to it and it just computes the length. Does it work with any type? Let's check

``` python
>>> d = {'a': 1, 'b': 2}
>>> len(d)
2
>>> i = 5
>>> len(i)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'int' has no len()
```

Ouch! Seems that the len() function is smart enough to deal with dictionaries, but not with integers. Well, after all, the length of an integer is not defined.

Indeed this is exactly the point of Python polymorphism: the integer type does not define a length operation. While you blame the len() function, the int type is at fault. The len() function just calls the __len__() method of the input object, as you can see from this code

``` python
>>> l.__len__()
3
>>> s.__len__()
15
>>> d.__len__()
2
>>> i.__len__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'int' object has no attribute '__len__'
```

Very straightforward: the 'int' object does not define any __len__() method.

So, to sum up what we discovered until here, I would say that Python polymorphism is based on delegation. In the following sections we will talk about the EAFP Python principle, and you will see that the delegation principle is somehow ubiquitous in this language.

Type Hard

Another real-life concept that polymorphism wants to bring into a programming language is the ability to walk the class hierarchy, that is to run code on specialized types. This is a complex sentence to say something we are used to do every day, and an example will clarify the matter.

You know how to open a door, it is something you learned in your early years. Under an OOP point of view you are an object (sorry, no humiliation intended) which is capable of interacting with a wood rectangle rotating on hinges. When you can open a door, however, you can also open a window, which, after all, is a specialized type of wood-rectangle-with-hinges, hopefully with some glass in it too. You are also able to open the car door, which is also a specialized type (this one is a mix between a standard door and a window). This shows that, once you know how to interact with the most generic type (basic door) you can also interact with specialized types (window, car door) as soon as they act like the ancestor type (e.g. as soon as they rotate on hinges).

This directly translates into OOP languages: polymorphism requires that code written for a given type may also be run on derived types. For example, a list (a generic list object, not a Python one) that can contain "numbers" shall be able to accept integers because they are numbers. The list could specify an ordering operation which requires the numbers to be able to compare each other. So, as soon as integers specify a way to compare each other they can be inserted into the list and ordered.
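In Python terms, the same idea can be sketched with sorted(), which only asks the elements to compare themselves via __lt__(); the Card class is a hypothetical example:

```python
class Card:
    # Hypothetical type: sorted() never checks what a Card *is*, it only
    # requires that the elements know how to compare, which __lt__() provides.
    def __init__(self, rank):
        self.rank = rank

    def __lt__(self, other):
        return self.rank < other.rank

hand = [Card(7), Card(2), Card(11)]
ranks = [card.rank for card in sorted(hand)]
```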

Statically compiled languages shall provide specific language features to implement this part of the polymorphism concept. In C++, for example, the language needs to introduce the concept of pointer compatibility between parent and child classes.

In Python there is no need to provide special language features to implement subtype polymorphism. As we already discovered Python functions accept any variable without checking the type and rely on the variable itself to provide the correct methods. But you already know that a subtype must provide the methods of the parent type, either redefining them or through implicit delegation, so as you can see Python implements subtype polymorphism from the very beginning.

I think this is one of the most important things to understand when working with this language. Python is not really interested in the actual type of the variables you are working with. It is interested in how those variables act, that is it just wants the variable to provide the right methods. So, if you come from statically typed languages, you need to make a special effort to think about acting like instead of being. This is what we called "duck typing".

Time to do an example. Let us define a Room class

``` python
class Room:
    def __init__(self, door):
        self.door = door

    def open(self):
        self.door.open()

    def close(self):
        self.door.close()

    def is_open(self):
        return self.door.is_open()
```


A very simple class, as you can see, just enough to exemplify polymorphism. The Room class accepts a door variable, and the type of this variable is not specified. Duck typing in action: the actual type of door is not declared, there is no "acceptance test" built in the language. Indeed, the incoming variable shall export the following methods that are used in the Room class: open(), close(), is_open(). So we can build the following classes

``` python
class Door:
    def __init__(self):
        self.status = "closed"

    def open(self):
        self.status = "open"

    def close(self):
        self.status = "closed"

    def is_open(self):
        return self.status == "open"


class BooleanDoor:
    def __init__(self):
        self.status = True

    def open(self):
        self.status = True

    def close(self):
        self.status = False

    def is_open(self):
        return self.status
```


Both represent a door that can be open or closed, and they implement the concept in two different ways: the first class relies on strings, while the second leverages booleans. Despite being two different types, both act the same way, so both can be used to build a Room object.

``` python
>>> door = Door()
>>> bool_door = BooleanDoor()
>>> room = Room(door)
>>> bool_room = Room(bool_door)
>>> room.open()
>>> room.is_open()
True
>>> room.close()
>>> room.is_open()
False
>>> bool_room.is_open()
True
>>> bool_room.close()
>>> bool_room.is_open()
False
```

File Like Us

File-like objects are a concrete and very useful example of polymorphism in Python. A file-like object is a class (or the instance of a class) that acts like a file, i.e. it provides those methods a file object exposes.

Say for example that you code a class that parses an XML tree, and that you expect the XML code to be contained in a file. So your class accepts a file in its __init__() method, and reads the content from it

``` python
class XMLReader:
    def __init__(self, xmlfile):
        self.content = xmlfile.read()

    [...]
```

The class works well until your application has to be modified to receive XML content from a network stream. To use the class without modifying it you would have to write the stream to a temporary file and load that, which sounds like overkill. So you plan to change the class to accept a string, but this way you would have to change every piece of code that uses the class to read a file, since now you would have to open, read and close the file on your own, outside the class.

Polymorphism offers a better way. Why not store the incoming stream inside an object that acts like a file, even if it is not an actual one? If you check the io module you will find that such an object has already been invented and is provided in the standard Python library.

Other very useful file-like classes are those contained in the gzip, bz2, and zipfile modules (just to name some of the most used), which provide objects that allow you to manage compressed files just like plain files, hiding the decompression/compression machinery.
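For instance, io.StringIO from the standard library wraps an in-memory string in the file interface, so code written for files can read from it unchanged:

```python
import io

# io.StringIO exposes read(), readline(), close(), etc. on a plain string,
# so an XMLReader-style class would accept it exactly like a real file.
stream = io.StringIO('<root><item/></root>')
content = stream.read()
stream.close()
```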


EAFP is a Python acronym that stands for easier to ask for forgiveness than permission. This coding style is highly pushed in the Python community because it completely relies on the duck typing concept, thus fitting well with the language philosophy.

The concept behind EAFP is fairly easy: instead of checking if an object has a given attribute or method before actually accessing or using it, just trust the object to provide what you need and manage the error case. This can be probably better understood by looking at some code. According to EAFP, instead of writing

``` python
if hasattr(someobj, 'open'):
    someobj.open()
else:
    [...]
```

you shall write

``` python
try:
    someobj.open()
except AttributeError:
    [...]
```


As you can see, the second snippet directly uses the method and deals with the possible AttributeError exception (by the way: managing exceptions is one of the top Black Magic Topics in Python, more on it in a future post. A very quick preview: I think we may learn something from Erlang - check this).

Why is this coding style pushed so much in the Python community? I think the main reason is that through EAFP you think polymorphically: you are not interested in knowing if the object has the open attribute, you are interested in knowing if the object can satisfy your request, that is to perform the open() method call.
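A minimal runnable sketch of the EAFP style (the try_open() helper and the minimal Door are hypothetical, written here for self-containment):

```python
class Door:
    def open(self):
        self.status = 'open'

def try_open(obj):
    # EAFP: attempt the call and handle the failure, instead of
    # testing for the attribute up front with hasattr().
    try:
        obj.open()
        return True
    except AttributeError:
        return False
```

try_open() works on any object that satisfies the request, and quietly reports failure for objects, like plain integers, that do not provide open().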

Movie Trivia

Section titles come from the following movies: Good Morning, Vietnam (1987), Die Hard (1988), Spies Like Us (1985), Unforgiven (1992).


You will find a lot of documentation in this Reddit post. Most of the information contained in this series comes from those sources.


Feel free to use the blog Google+ page to comment on the post. The GitHub issues page is the best place to submit corrections.


Python Anywhere: New release - a few new packages, some steps towards postgres, and forum post previews

 ∗ Planet Python

Today's upgrade included a few new packages in the standard server image:

  • OpenSCAD
  • FreeCAD
  • inkscape
  • Pillow for 3.3 and 3.4
  • flask-bootstrap
  • gensim
  • textblob

We also improved the default "Unhandled Exception" page, which is shown when a user's web app allows an exception to bubble up to our part of the stack. We now include a slightly friendlier message, explaining to any of the user's users that there's an error, and explaining to the user where they can find their log files and look for debug info.

And in the background, we've deployed a bunch of infrastructure changes related to postgres support. We're getting there, slowly slowly!

Oh yes, and we've enabled dynamic previews in the forums, so you get an idea of how the markdown syntax will translate. It actually uses the same library as Stack Overflow; it's called pagedown. Hope you find 'em useful!



The Obligatory Current Events Bullet Point Post

 ∗ andlabs's blog

I wanted to write a more colloquial post but found myself walking into dangerous waters. I also should have done this yesterday when the Zoe Quinn fiasco was fresh in my mind. And in the past 48 hours, other people have said everything I needed to. So here’s a shortlist instead.

First, I’m not going to disclose my opinion of Zoe Quinn, the game designer who was recently accused of cheating on her relationships. The reason why I won’t is simple: the correct response to someone violating your moral and ethical standards is to let your feet and wallet do the talking.

However, for whatever reason, this sparked a massive vigilante campaign against Zoe and both her female and transgender friends, possibly larger in scale than the several she has had thrown at her before. It should go without saying that this is nowhere near the appropriate reaction to someone cheating.

This last awkwardly-placed sentence can branch off in several different directions. First, destructive overreaction seems to be the norm in the gaming community; this article popped up before I could remember all the cases, so just read that.

Second, this abuse doesn’t necessarily exonerate Zoe if she did cheat. Let your own personal ethical standards decide if you should scorn her. Or, if you believe that this whole thing was a lie put out by the ex-boyfriend, exonerate her. Your choice. My own choice shall remain undisclosed.

Third, some people have painted this as an issue of misogyny. The first branch argues that this transcends misogyny, but now I’m also starting to wonder if 4chan was just looking for an excuse to continue the abuse that she got when she put Depression Quest on Steam.

Last, and this is something I’m very likely guilty of: a lot of us tend to shrug off 4chan as just being immature kids acting immaturely and that their actions bear little weight on the world. Wrong. In fact, the exact opposite is true: since 4chan is so popular, they define, in part or in whole, the face of the gaming community. When all is said and done, they’re the vocal ones whose opinions and social model ring true from the heavens above. If they want people to stop hating games, they have to stop hating. Deal with it.

On the subject of Ferguson: First, you’re out of your mind if you think this isn’t about race. Second, this tweet and this one. Third, I wanted to say something here about how this is a genuine act of civil disobedience after many non-examples (Occupy Wall Street, anyone?) and how some things should be tweaked to really make it take effect, but to avoid misinterpretation I won’t. What I will say instead is simple: you’re out of your mind if you think racism was ever dead in America, and there’s almost $100,000 sitting in a gofundme account pointed at the murderer in this case to prove it.

But most of all, thinking that to fix Ferguson we have to get rid of police brutality and not racism is wrong. We need to fix both; it’ll be easier to get rid of racism first, as it’ll leave police brutality with no motive. This is the same opinion I had about that man in Georgia who was wrongfully executed last year or two years ago; if you get rid of the death sentence without getting rid of the racism that landed him one, then he’ll just get life imprisonment without the possibility of parole AND with virtually no chance of exoneration, even with all the organizations dedicated to exonerating the wrongly convicted. And then what?

On Israel and Gaza and Palestine and ISIS and whatever: hold all sides responsible. It’s not about whether one is right or wrong; in my opinion, both are wrong.

I said other things but they're all on Twitter, so I guess that's it???? Programming resumes next time; feel free to hate me for these opinions ^^


Ian Ozsvald: Data Science Training Survey

 ∗ Planet Python

I’ve put together a short survey to figure out what’s needed for Python-based Data Science training in the UK. If you want to be trained in strong data science, analysis and engineering skills please complete the survey, it doesn’t need any sign-up and will take just a couple of minutes. I’ll share the results at the next PyDataLondon meetup.

If you want training you probably want to be on our training announce list, this is a low volume list (run by MailChimp) where we announce upcoming dates and suggest topics that you might want training around. You can unsubscribe at any time.

I’ve written about the current two courses that run in October through ModelInsight; one focuses on improving skills around data science using Python (including numpy, scipy and TDD), the second on high performance Python (I’ve now finished writing O’Reilly’s High Performance Python book). Both courses focus on practical skills: you’ll walk away with working systems and a stronger understanding of key Python skills. Your developer skills will be stronger, as will your debugging skills, and in the longer run you’ll develop stronger software with fewer defects.

If you want to talk about this, come have a chat at the next PyData London meetup or in the pub after.

Ian applies Data Science as an AI/Data Scientist for companies in ModelInsight, sign-up for Data Science tutorials in London. Historically Ian ran Mor Consulting. He also founded the image and text annotation API, co-authored SocialTies, programs Python, authored The Screencasting Handbook, lives in London and is a consumer of fine coffees.

Wednesday, 20 August


Go Concurrency Patterns: Context

 ∗ The Go Programming Language Blog


In Go servers, each incoming request is handled in its own goroutine. Request handlers often start additional goroutines to access backends such as databases and RPC services. The set of goroutines working on a request typically needs access to request-specific values such as the identity of the end user, authorization tokens, and the request's deadline. When a request is canceled or times out, all the goroutines working on that request should exit quickly so the system can reclaim any resources they are using.

At Google, we developed a context package that makes it easy to pass request-scoped values, cancelation signals, and deadlines across API boundaries to all the goroutines involved in handling a request. The package is publicly available. This article describes how to use the package and provides a complete working example.


The core of the context package is the Context type:

// A Context carries a deadline, cancelation signal, and request-scoped values
// across API boundaries. Its methods are safe for simultaneous use by multiple
// goroutines.
type Context interface {
    // Done returns a channel that is closed when this Context is canceled
    // or times out.
    Done() <-chan struct{}

    // Err indicates why this context was canceled, after the Done channel
    // is closed.
    Err() error

    // Deadline returns the time when this Context will be canceled, if any.
    Deadline() (deadline time.Time, ok bool)

    // Value returns the value associated with key or nil if none.
    // Value returns the value associated with key or nil if none.
    Value(key interface{}) interface{}
}

(This description is condensed; the godoc is authoritative.)

The Done method returns a channel that acts as a cancelation signal to functions running on behalf of the Context: when the channel is closed, the functions should abandon their work and return. The Err method returns an error indicating why the Context was canceled. The Pipelines and Cancelation article discusses the Done channel idiom in more detail.

A Context does not have a Cancel method for the same reason the Done channel is receive-only: the function receiving a cancelation signal is usually not the one that sends the signal. In particular, when a parent operation starts goroutines for sub-operations, those sub-operations should not be able to cancel the parent. Instead, the WithCancel function (described below) provides a way to cancel a new Context value.

A Context is safe for simultaneous use by multiple goroutines. Code can pass a single Context to any number of goroutines and cancel that Context to signal all of them.

The Deadline method allows functions to determine whether they should start work at all; if too little time is left, it may not be worthwhile. Code may also use a deadline to set timeouts for I/O operations.

Value allows a Context to carry request-scoped data. That data must be safe for simultaneous use by multiple goroutines.

Derived contexts

The context package provides functions to derive new Context values from existing ones. These values form a tree: when a Context is canceled, all Contexts derived from it are also canceled.

Background is the root of any Context tree; it is never canceled:

// Background returns an empty Context. It is never canceled, has no deadline,
// and has no values. Background is typically used in main, init, and tests,
// and as the top-level Context for incoming requests.
func Background() Context

WithCancel and WithTimeout return derived Context values that can be canceled sooner than the parent Context. The Context associated with an incoming request is typically canceled when the request handler returns. WithCancel is also useful for canceling redundant requests when using multiple replicas. WithTimeout is useful for setting a deadline on requests to backend servers:

// WithCancel returns a copy of parent whose Done channel is closed as soon as
// parent.Done is closed or cancel is called.
func WithCancel(parent Context) (ctx Context, cancel CancelFunc)

// A CancelFunc cancels a Context.
type CancelFunc func()

// WithTimeout returns a copy of parent whose Done channel is closed as soon as
// parent.Done is closed, cancel is called, or timeout elapses. The new
// Context's Deadline is the sooner of now+timeout and the parent's deadline, if
// any. If the timer is still running, the cancel function releases its
// resources.
func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc)

WithValue provides a way to associate request-scoped values with a Context:

// WithValue returns a copy of parent whose Value method returns val for key.
func WithValue(parent Context, key interface{}, val interface{}) Context

The best way to see how to use the context package is through a worked example.

Example: Google Web Search

Our example is an HTTP server that handles URLs like /search?q=golang&timeout=1s by forwarding the query "golang" to the Google Web Search API and rendering the results. The timeout parameter tells the server to cancel the request after that duration elapses.

The code is split across three packages:

  • server provides the main function and the handler for /search.
  • userip provides functions for extracting a user IP address from a request and associating it with a Context.
  • google provides the Search function for sending a query to Google.

The server program

The server program handles requests like /search?q=golang by serving the first few Google search results for golang. It registers handleSearch to handle the /search endpoint. The handler creates an initial Context called ctx and arranges for it to be canceled when the handler returns. If the request includes the timeout URL parameter, the Context is canceled automatically when the timeout elapses:

func handleSearch(w http.ResponseWriter, req *http.Request) {
    // ctx is the Context for this handler. Calling cancel closes the
    // ctx.Done channel, which is the cancellation signal for requests
    // started by this handler.
    var (
        ctx    context.Context
        cancel context.CancelFunc
    )
    timeout, err := time.ParseDuration(req.FormValue("timeout"))
    if err == nil {
        // The request has a timeout, so create a context that is
        // canceled automatically when the timeout expires.
        ctx, cancel = context.WithTimeout(context.Background(), timeout)
    } else {
        ctx, cancel = context.WithCancel(context.Background())
    }
    defer cancel() // Cancel ctx as soon as handleSearch returns.

The handler extracts the query from the request and extracts the client's IP address using the userip package. The client's IP address is needed for backend requests, so handleSearch attaches it to ctx:

    // Check the search query.
    query := req.FormValue("q")
    if query == "" {
        http.Error(w, "no query", http.StatusBadRequest)
        return
    }

    // Store the user IP in ctx for use by code in other packages.
    userIP, err := userip.FromRequest(req)
    if err != nil {
        http.Error(w, err.Error(), http.StatusBadRequest)
        return
    }
    ctx = userip.NewContext(ctx, userIP)

The handler calls google.Search with ctx and the query:

    // Run the Google search and print the results.
    start := time.Now()
    results, err := google.Search(ctx, query)
    elapsed := time.Since(start)

If the search succeeds, the handler renders the results:

    if err := resultsTemplate.Execute(w, struct {
        Results          google.Results
        Timeout, Elapsed time.Duration
    }{
        Results: results,
        Timeout: timeout,
        Elapsed: elapsed,
    }); err != nil {
        log.Print(err)
        return
    }

Package userip

The userip package provides functions for extracting a user IP address from a request and associating it with a Context. A Context provides a key-value mapping, where the keys and values are both of type interface{}. Key types must support equality, and values must be safe for simultaneous use by multiple goroutines. Packages like userip hide the details of this mapping and provide strongly-typed access to a specific Context value.

To avoid key collisions, userip defines an unexported type key and uses a value of this type as the context key:

// The key type is unexported to prevent collisions with context keys defined in
// other packages.
type key int

// userIPkey is the context key for the user IP address.  Its value of zero is
// arbitrary.  If this package defined other context keys, they would have
// different integer values.
const userIPKey key = 0

FromRequest extracts a userIP value from an http.Request:

func FromRequest(req *http.Request) (net.IP, error) {
    ip, _, err := net.SplitHostPort(req.RemoteAddr)
    if err != nil {
        return nil, fmt.Errorf("userip: %q is not IP:port", req.RemoteAddr)
    }
    return net.ParseIP(ip), nil
}

NewContext returns a new Context that carries a provided userIP value:

func NewContext(ctx context.Context, userIP net.IP) context.Context {
    return context.WithValue(ctx, userIPKey, userIP)
}

FromContext extracts a userIP from a Context:

func FromContext(ctx context.Context) (net.IP, bool) {
    // ctx.Value returns nil if ctx has no value for the key;
    // the net.IP type assertion returns ok=false for nil.
    userIP, ok := ctx.Value(userIPKey).(net.IP)
    return userIP, ok
}

Package google

The google.Search function makes an HTTP request to the Google Web Search API and parses the JSON-encoded result. It accepts a Context parameter ctx and returns immediately if ctx.Done is closed while the request is in flight.

The Google Web Search API request includes the search query and the user IP as query parameters:

func Search(ctx context.Context, query string) (Results, error) {
    // Prepare the Google Search API request.
    req, err := http.NewRequest("GET", "", nil)
    if err != nil {
        return nil, err
    }
    q := req.URL.Query()
    q.Set("q", query)

    // If ctx is carrying the user IP address, forward it to the server.
    // Google APIs use the user IP to distinguish server-initiated requests
    // from end-user requests.
    if userIP, ok := userip.FromContext(ctx); ok {
        q.Set("userip", userIP.String())
    }
    req.URL.RawQuery = q.Encode()

Search uses a helper function, httpDo, to issue the HTTP request and cancel it if ctx.Done is closed while the request or response is being processed. Search passes a closure to httpDo to handle the HTTP response:

    var results Results
    err = httpDo(ctx, req, func(resp *http.Response, err error) error {
        if err != nil {
            return err
        }
        defer resp.Body.Close()

        // Parse the JSON search result.
        var data struct {
            ResponseData struct {
                Results []struct {
                    TitleNoFormatting string
                    URL               string
                }
            }
        }
        if err := json.NewDecoder(resp.Body).Decode(&data); err != nil {
            return err
        }
        for _, res := range data.ResponseData.Results {
            results = append(results, Result{Title: res.TitleNoFormatting, URL: res.URL})
        }
        return nil
    })
    // httpDo waits for the closure we provided to return, so it's safe to
    // read results here.
    return results, err

The httpDo function runs the HTTP request and processes its response in a new goroutine. It cancels the request if ctx.Done is closed before the goroutine exits:

func httpDo(ctx context.Context, req *http.Request, f func(*http.Response, error) error) error {
    // Run the HTTP request in a goroutine and pass the response to f.
    tr := &http.Transport{}
    client := &http.Client{Transport: tr}
    c := make(chan error, 1)
    go func() { c <- f(client.Do(req)) }()
    select {
    case <-ctx.Done():
        tr.CancelRequest(req)
        <-c // Wait for f to return.
        return ctx.Err()
    case err := <-c:
        return err
    }
}

Adapting code for Contexts

Many server frameworks provide packages and types for carrying request-scoped values. We can define new implementations of the Context interface to bridge between code using existing frameworks and code that expects a Context parameter.

For example, Gorilla's package allows handlers to associate data with incoming requests by providing a mapping from HTTP requests to key-value pairs. In gorilla.go, we provide a Context implementation whose Value method returns the values associated with a specific HTTP request in the Gorilla package.

Other packages have provided cancelation support similar to Context. For example, Tomb provides a Kill method that signals cancelation by closing a Dying channel. Tomb also provides methods to wait for those goroutines to exit, similar to sync.WaitGroup. In tomb.go, we provide a Context implementation that is canceled when either its parent Context is canceled or a provided Tomb is killed.


At Google, we require that Go programmers pass a Context parameter as the first argument to every function on the call path between incoming and outgoing requests. This allows Go code developed by many different teams to interoperate well. It provides simple control over timeouts and cancelation and ensures that critical values like security credentials transit Go programs properly.

Server frameworks that want to build on Context should provide implementations of Context to bridge between their packages and those that expect a Context parameter. Their client libraries would then accept a Context from the calling code. By establishing a common interface for request-scoped data and cancelation, Context makes it easier for package developers to share code for creating scalable services.


Machinalis: Making the case for Jython

 ∗ Planet Python


Jython is an implementation of Python that runs on top of the Java Virtual Machine. Why is it different? Why should I care about it? This blogpost will try to give an answer to those questions by introducing a real life example.

Why Jython?

I had the privilege of working in Java for almost 15 years before I jumped to the Python bandwagon, so for me the value of Jython is pretty obvious. This might not be the case for you if you’ve never worked with either language, so let me tell you (and show you) what makes Jython awesome and useful.

According to the Jython site, these are the features that make Jython stand out over other JVM-based languages:

  • Dynamic compilation to Java bytecodes - leads to highest possible performance without sacrificing interactivity.
  • Ability to extend existing Java classes in Jython - allows effective use of abstract classes.
  • Optional static compilation - allows creation of applets, servlets, beans, ...
  • Bean Properties - make use of Java packages much easier.
  • Python Language - combines remarkable power with very clear syntax. It also supports a full object-oriented programming model which makes it a natural fit for Java’s OO design.

I think the first, second and fifth bullets require special attention.

For some reason, a lot of people believe that the JVM is slow. This might have been true in the early years of the platform, but the JVM's performance has increased a lot since then. A lot has been written on this subject, but the following Wikipedia article summarizes the situation pretty well.

As mentioned above, it is possible to use Java classes in Jython. Although this statement is true, it fails to convey what I think is the most important aspect of Jython: there are A LOT of high-quality mature Java libraries out there. The possibility of mixing all these libraries with the flexibility and richness of Python is invaluable. Let me give you a taste of this power.

Until the introduction of the new Date and Time API of Java 8, the only way to handle time properly in Java was to use Joda-Time. Joda-Time is an incredibly powerful and flexible library for handling date and time on Java (or any JVM language for that matter). Although there are similar libraries in Python, I still haven’t come across one that can give Joda-Time a run for its money. The following shows a Jython shell session using Joda-Time:

Jython 2.7b2 (default:a5bc0032cf79+, Apr 22 2014, 21:20:17)
[Java HotSpot(TM) 64-Bit Server VM (Oracle Corporation)] on java1.8.0_05
Type "help", "copyright", "credits" or "license" for more information.
>>> from java.util import Locale
>>> from org.joda.time import DateTime
>>> date_time = DateTime()
>>> date_time
>>> date_time.getMonthOfYear()
>>> date_time.withYear(2000)
>>> date_time.monthOfYear().getAsText()
>>> date_time.monthOfYear().getAsShortText(Locale.FRENCH);
>>> date_time.dayOfMonth().roundFloorCopy();

This was just a quick example of the simplest features of Joda-Time. Although most of the features of Joda-Time are present in python-dateutil (with the exception of unusual chronologies), this is just an example. There are other popular Java libraries without a Python counterpart (I’ll show you one in the next section).
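For comparison, here is a rough stdlib-only sketch of the same operations using Python's datetime; the mapping to the Joda-Time calls is approximate, and the timestamp is fixed so the results are reproducible:

``` python
from datetime import datetime

dt = datetime(2014, 8, 27, 15, 30, 45)

# Month number, like Joda-Time's getMonthOfYear()
print(dt.month)                 # 8

# A copy with the year replaced, like withYear(2000)
print(dt.replace(year=2000))    # 2000-08-27 15:30:45

# Month as text, like monthOfYear().getAsText()
print(dt.strftime('%B'))        # August

# Floor to the start of the day, like dayOfMonth().roundFloorCopy()
print(dt.replace(hour=0, minute=0, second=0, microsecond=0))
```

Note that, like Joda-Time's with* and round*Copy methods, replace() returns a new object and leaves the original untouched.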

As I mentioned before, I switched to Python recently. There was a lot involved in that decision, but the language itself played a major role. The possibility of combining this fantastic language with the power of the JVM and all the Java libraries and tools readily available is an interesting proposition.

Let me show you a real life example that I think summarizes perfectly why Jython matters.

Redacting names on comments

Not too long ago, we had to redact names from comments coming from social media sites. Our first idea was to use NLTK’s NERTagger. This class depends on the Stanford Named Entity Recognizer (NER), which is a Java library. The integration is done by invoking a java shell command and analyzing its output. Not only is this far from ideal, it might create some problems if your data isn’t just one large piece of text (which is our case).

This limitation is not caused by the NER API but by the way NLTK interacts with it. Wouldn’t it be nice if we could just write Python code that uses this API? Let’s do just that.

We cannot show you the data we had to work with, but I wrote an IPython Notebook to generate fake comments and save them on a CSV file so our script can work with them.

After the comments have been read, all we need to do is have the classifier tag the tokens, so we can redact the person names from the comments:

classifier = CRFClassifier.getClassifierNoExceptions(

for row in dict_reader:
    redacted_text = row['text']
    classify_result = classifier.classify(row['text'])

    for sentence in classify_result:
        for word in sentence:
            token = word.originalText()
            tag = word.get(AnswerAnnotation)

            if tag == 'PERSON':
                redacted_text = redacted_text.replace(token, '****')

    row['redacted_text'] = redacted_text

This is an excerpt from a Python script available on GitHub to redact names from text coming from a CSV file. All we need to run it is a JRE, the Jython 2.7 distribution and the Stanford NER jars. Then we run the following from the command line:

java -Dpython.path=stanford-ner-2014-01-04/stanford-ner.jar -jar jython-standalone-2.7-b2.jar comments_df.csv comments_df_redacted.csv

Although we cannot run the code directly from Python (cPython, that is), we didn’t need to write a single line of Java to get access to the full power of Stanford NER API.


I hope by now you have an idea of just how important Jython is. It has some limitations, like the inability to integrate modules written in C, or that it is only compatible with Python 2.7, but I think its advantages far outweigh the shortcomings.

Although we haven’t had the chance to work with .NET, I think the same rationale can be applied to IronPython when it comes to interacting with Microsoft’s framework.

Leonardo Giordani: Python 3 OOP Part 1 - Objects and types

 ∗ Planet Python

About this series

Object-oriented programming (OOP) has been the leading programming paradigm for several decades now, starting from the initial attempts back in the 60s to some of the most important languages used nowadays. Being a set of programming concepts and design methodologies, OOP can never be said to be "correctly" or "fully" implemented by a language: indeed there are as many implementations as languages.

So one of the most interesting aspects of OOP languages is to understand how they implement those concepts. In this post I am going to try and start analyzing the OOP implementation of the Python language. Due to the richness of the topic, however, I consider this attempt just like a set of thoughts for Python beginners trying to find their way into this beautiful (and sometimes peculiar) language.

This series of posts wants to introduce the reader to the Python 3 implementation of Object Oriented Programming concepts. The content of this and the following posts will not be completely different from that of the previous "OOP Concepts in Python 2.x" series, however. The reason is that while some of the internal structures change a lot, the global philosophy doesn't, since Python 3 is an evolution of Python 2 and not a new language.

So I chose to split the previous series and to adapt the content to Python 3 instead of posting a mere list of corrections. I find this way to be more useful for new readers, who would otherwise be forced to read the previous series.


One of the most noticeable changes introduced by Python 3 is the transformation of the print keyword into the print() function. This is indeed a very small change, compared to other modifications made to the internal structures, but it is the most visually striking one, and it will be the source of 80% of your syntax errors when you start writing Python 3 code.

Remember that print is now a function, so write print(a) and not print a.

Back to the Object

Computer science deals with data and with procedures to manipulate that data. Everything, from the earliest Fortran programs to the latest mobile apps is about data and their manipulation.

So if data are the ingredients and procedures are the recipes, it seems (and can be) reasonable to keep them separate.

Let's do some procedural programming in Python

``` python
# This is some data
data = (13, 63, 5, 378, 58, 40)

# This is a procedure that computes the average
def avg(d):
    return sum(d)/len(d)

print(avg(data))
```

As you can see the code is quite good and general: the procedure (function) operates on a sequence of data, and it returns the average of the sequence items. So far, so good: computing the average of some numbers leaves the numbers untouched and creates new data.

The observation of the everyday world, however, shows that complex data mutate: an electrical device is on or off, a door is open or closed, the content of a bookshelf in your room changes as you buy new books.

You can still manage it keeping data and procedures separate, for example

``` python
# These are two numbered doors, initially closed
door1 = [1, 'closed']
door2 = [2, 'closed']

# This procedure opens a door
def open_door(door):
    door[1] = 'open'

open_door(door1)
print(door1)
```

I described a door as a structure containing a number and the status of the door (as you would do in languages like LISP, for example). The procedure knows how this structure is made and may alter it.

This also works like a charm. Some problems arise, however, when we start building specialized types of data. What happens, for example, when I introduce a "lockable door" data type, which can be opened only when it is not locked? Let's see

``` python
# These are two standard doors, initially closed
door1 = [1, 'closed']
door2 = [2, 'closed']

# This is a lockable door, initially closed and unlocked
ldoor1 = [1, 'closed', 'unlocked']

# This procedure opens a standard door
def open_door(door):
    door[1] = 'open'

# This procedure opens a lockable door
def open_ldoor(door):
    if door[2] == 'unlocked':
        door[1] = 'open'

open_door(door1)
print(door1)

open_ldoor(ldoor1)
print(ldoor1)
```

Everything still works, no surprises in this code. However, as you can see, I had to find a different name for the procedure that opens a locked door since its implementation differs from the procedure that opens a standard door. But, wait... I'm still opening a door, the action is the same, and it just changes the status of the door itself. So why shall I remember that a locked door shall be opened with open_ldoor() instead of open_door() if the verb is the same?

Chances are that this separation between data and procedures doesn't perfectly fit some situations. The key problem is that the "open" action is not actually using the door; rather it is changing its state. So, just like the volume control buttons of your phone, which are on your phone, the "open" procedure should stick to the "door" data.

This is exactly what leads to the concept of object: an object, in the OOP context, is a structure holding data and procedures operating on them.

What About Type?

When you talk about data you immediately need to introduce the concept of type. This concept may have two meanings that are worth being mentioned in computer science: the behavioural and the structural one.

The behavioural meaning represents the fact that you know what something is by describing how it acts. This is the foundation of the so-called "duck typing" (here "typing" means "to give a type" and not "to type on a keyboard"): if it acts like a duck, it is a duck.

The structural meaning identifies the type of something by looking at its internal structure. So two things that act in the same way but are internally different are of different type.

Both points of view can be valid, and different languages may implement and emphasize one meaning of type or the other, and even both.
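A minimal sketch of the behavioural view (duck typing) in Python; the classes and function here are invented for illustration:

``` python
class Duck:
    def quack(self):
        return "Quack!"

class Person:
    def quack(self):
        return "I can quack too!"

def make_it_quack(thing):
    # We never check the type of 'thing'; we only care that it
    # can satisfy the request, i.e. that calling quack() works.
    return thing.quack()

print(make_it_quack(Duck()))    # Quack!
print(make_it_quack(Person()))  # I can quack too!
```

A structurally-typed language would reject Person where a Duck is expected; Python only cares that the quack() call succeeds.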

Class Games

Objects in Python may be built describing their structure through a class. A class is the programming representation of a generic object, such as "a book", "a car", "a door": when I talk about "a door" everyone can understand what I'm saying, without the need of referring to a specific door in the room.

In Python, the type of an object is represented by the class used to build the object: that is, in Python the word type has the same meaning as the word class.

For example, one of the built-in classes of Python is int, which represents an integer number

``` python
>>> a = 6
>>> print(a)
6
>>> print(type(a))
<class 'int'>
>>> print(a.__class__)
<class 'int'>
```

As you can see, the built-in function type() returns the content of the magic attribute __class__ (magic here means that its value is managed by Python itself offstage). The type of the variable a, or its class, is int. (This is a very inaccurate description of this rather complex topic, so remember that at the moment we are just scratching the surface).

Once you have a class you can instantiate it to get a concrete object (an instance) of that type, i.e. an object built according to the structure of that class. The Python syntax to instantiate a class is the same as a function call

``` python
>>> b = int()
>>> type(b)
<class 'int'>
```

When you create an instance, you can pass some values, according to the class definition, to initialize it.

``` python
>>> b = int()
>>> print(b)
0
>>> c = int(7)
>>> print(c)
7
```

In this example, the int class creates an integer with value 0 when called without arguments, otherwise it uses the given argument to initialize the newly created object.

Let us write a class that represents a door to match the procedural examples done in the first section

``` python
class Door:
    def __init__(self, number, status):
        self.number = number
        self.status = status

    def open(self):
        self.status = 'open'

    def close(self):
        self.status = 'closed'
```

The class keyword defines a new class named Door; everything indented under class is part of the class. The functions you write inside the object are called methods and don't differ at all from standard functions; the nomenclature changes only to highlight the fact that those functions now are part of an object.

Methods of a class must accept as first argument a special value called self (the name is a convention but please never break it).

The class can be given a special method called __init__() which is run when the class is instantiated, receiving the arguments passed when calling the class; the general name of such a method, in the OOP context, is constructor, even if the __init__() method is not the only part of this mechanism in Python.

The self.number and self.status variables are called attributes of the object. In Python, methods and attributes are both members of the object and are accessible with the dotted syntax; the difference between attributes and methods is that the latter can be called (in Python lingo you say that a method is a callable).
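The built-in `callable()` function makes this distinction visible. A quick sketch (repeating a minimal `Door` class so the snippet is self-contained):

``` python
class Door:
    def __init__(self, number, status):
        self.number = number
        self.status = status

    def open(self):
        self.status = 'open'

door1 = Door(1, 'closed')

print(callable(door1.open))    # True: a method can be called
print(callable(door1.status))  # False: a plain attribute cannot
```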

As you can see the __init__() method shall create and initialize the attributes since they are not declared elsewhere. This is very important in Python and is strictly linked with the way the language handles the type of variables. I will detail those concepts when dealing with polymorphism in a later post.

The class can be used to create a concrete object

``` python
>>> door1 = Door(1, 'closed')
>>> type(door1)
<class '__main__.Door'>
>>> print(door1.number)
1
>>> print(door1.status)
closed
```

Now door1 is an instance of the Door class; type() returns the class as __main__.Door since the class was defined directly in the interactive shell, that is in the current main module.

To call a method of an object, that is to run one of its internal functions, you just access it as an attribute with the dotted syntax and call it like a standard function.

``` python
>>> door1.open()
>>> print(door1.number)
1
>>> print(door1.status)
open
```

In this case, the open() method of the door1 instance has been called. No arguments have been passed to the open() method, but if you review the class declaration, you see that it was declared to accept an argument (self). When you call a method of an instance, Python automatically passes the instance itself to the method as the first argument.

You can create as many instances as needed, and they are completely unrelated to each other. That is, the changes you make on one instance do not reflect on another instance of the same class.
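To see this independence in action, here is a quick sketch (with a minimal `Door` class repeated so the snippet is self-contained): opening one door leaves the other untouched.

``` python
class Door:
    def __init__(self, number, status):
        self.number = number
        self.status = status

    def open(self):
        self.status = 'open'

    def close(self):
        self.status = 'closed'

door1 = Door(1, 'closed')
door2 = Door(2, 'closed')

door1.open()

print(door1.status)  # 'open'
print(door2.status)  # 'closed' — door2 is unaffected
```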


Objects are described by a class, which can generate one or more instances, unrelated to each other. A class contains methods, which are functions, and they accept at least one argument called self, which is the actual instance on which the method has been called. A special method, __init__(), deals with the initialization of the object, setting the initial value of the attributes.

Movie Trivia

Section titles come from the following movies: Back to the Future (1985), What About Bob? (1991), Wargames (1983).


You will find a lot of documentation in this Reddit post. Most of the information contained in this series comes from those sources.


Feel free to use the blog Google+ page to comment on the post. The GitHub issues page is the best place to submit corrections.

Next post

Python 3 OOP Part 2 - Classes and members

Leonardo Giordani: Python 3 OOP Part 2 - Classes and members

 ∗ Planet Python

Previous post

Python 3 OOP Part 1 - Objects and types

Python Classes Strike Again

The Python implementation of classes has some peculiarities. The bare truth is that in Python the class of an object is an object itself. You can check this by issuing type() on the class

``` python
>>> a = 1
>>> type(a)
<class 'int'>
>>> type(int)
<class 'type'>
```

This shows that the int class is an object, an instance of the type class.

This concept is not as difficult to grasp as it may seem at first sight: in the real world we deal with concepts as if they were things. For example, we can talk about the concept of "door", telling people what a door looks like and how it works. In this case the concept of door is the topic of our discussion, so in our everyday experience the type of an object is an object itself. In Python this can be expressed by saying that everything is an object.

If the class of an object is itself an instance it is a concrete object and is stored somewhere in memory. Let us leverage the inspection capabilities of Python and its id() function to check the status of our objects. The id() built-in function returns the memory position of an object.

In the first post we defined this class

``` python
class Door:
    def __init__(self, number, status):
        self.number = number
        self.status = status

    def open(self):
        self.status = 'open'

    def close(self):
        self.status = 'closed'
```


First of all, let's create two instances of the Door class and check that the two objects are stored at different addresses

``` python
>>> door1 = Door(1, 'closed')
>>> door2 = Door(1, 'closed')
>>> hex(id(door1))
'0xb67e148c'
>>> hex(id(door2))
'0xb67e144c'
```

This confirms that the two instances are separate and unrelated. Please note that your values are very likely to be different from the ones I got: being memory addresses, they change at every execution. The second instance was given the same attributes as the first instance to show that the two are different objects regardless of the values of their attributes.

However if we use id() on the class of the two instances we discover that the class is exactly the same

``` python
>>> hex(id(door1.__class__))
'0xb685f56c'
>>> hex(id(door2.__class__))
'0xb685f56c'
```

Well, this is very important. In Python, a class is not just the schema used to build an object. Rather, the class is a shared living object, whose code is accessed at run time.

As we already tested, attributes are not stored in the class but in every instance, since __init__() works on self when creating them. Classes, however, can be given attributes like any other object; with a terrific effort of imagination, let's call them class attributes.

As you can expect, class attributes are shared among the class instances just like their container

``` python
class Door:
    colour = 'brown'

    def __init__(self, number, status):
        self.number = number
        self.status = status

    def open(self):
        self.status = 'open'

    def close(self):
        self.status = 'closed'
```


Pay attention: the colour attribute here is not created using self, so it is contained in the class and shared among instances

``` python
>>> door1 = Door(1, 'closed')
>>> door2 = Door(2, 'closed')
>>> Door.colour
'brown'
>>> door1.colour
'brown'
>>> door2.colour
'brown'
```

So far, things are no different from the previous case. Let's see if changes to the shared value reflect on all instances

``` python
>>> Door.colour = 'white'
>>> Door.colour
'white'
>>> door1.colour
'white'
>>> door2.colour
'white'
>>> hex(id(Door.colour))
'0xb67e1500'
>>> hex(id(door1.colour))
'0xb67e1500'
>>> hex(id(door2.colour))
'0xb67e1500'
```

Raiders of the Lost Attribute

Any Python object is automatically given a __dict__ attribute, which contains its list of attributes. Let's investigate what this dictionary contains for our example objects:

``` python
>>> Door.__dict__
mappingproxy({'open': <function Door.open at 0xb68604ac>,
    'colour': 'brown',
    '__dict__': <attribute '__dict__' of 'Door' objects>,
    '__weakref__': <attribute '__weakref__' of 'Door' objects>,
    '__init__': <function Door.__init__ at 0xb7062854>,
    '__module__': '__main__',
    '__doc__': None,
    'close': <function Door.close at 0xb686041c>})
>>> door1.__dict__
{'number': 1, 'status': 'closed'}
```

Leaving aside the difference between a dictionary and a mappingproxy object, you can see that the colour attribute is listed among the Door class attributes, while status and number are listed for the instance.
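One practical consequence of that difference is worth a quick sketch: the instance __dict__ is a plain, writable dictionary, while the class mappingproxy rejects direct item assignment, so class attributes must be changed through the class itself (with plain attribute assignment or setattr()).

``` python
class Door:
    colour = 'brown'

    def __init__(self, number, status):
        self.number = number
        self.status = status

door1 = Door(1, 'closed')

door1.__dict__['number'] = 2           # instance __dict__ is a writable dict
print(door1.number)                    # 2

try:
    Door.__dict__['colour'] = 'white'  # mappingproxy forbids item assignment
except TypeError as exc:
    print("TypeError:", exc)

setattr(Door, 'colour', 'white')       # this works instead
print(Door.colour)                     # white
```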

How come we can call door1.colour if that attribute is not listed for that instance? This is a job performed by the magic __getattribute__() method; in Python the dotted syntax automatically invokes this method, so when we write door1.colour, Python executes door1.__getattribute__('colour'). That method performs the attribute lookup, i.e. it finds the value of the attribute by looking in different places.

The standard implementation of __getattribute__() first searches the internal dictionary (__dict__) of the object, then the type of the object itself; in this case door1.__getattribute__('colour') executes first door1.__dict__['colour'] and then, since the latter raises a KeyError exception, door1.__class__.__dict__['colour']

``` python
>>> door1.__dict__['colour']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'colour'
>>> door1.__class__.__dict__['colour']
'brown'
```

Indeed, if we compare the two through the is operator (which tests identity, not mere equality), we can confirm that door1.colour and Door.colour are exactly the same object

``` python
>>> door1.colour is Door.colour
True
```

When we try to assign a value to a class attribute directly on an instance, we just put in the __dict__ of the instance a value with that name, and this value masks the class attribute since it is found first by __getattribute__(). As you can see from the examples of the previous section, this is different from changing the value of the attribute on the class itself.

``` python
>>> door1.colour = 'white'
>>> door1.__dict__['colour']
'white'
>>> door1.__class__.__dict__['colour']
'brown'
>>> Door.colour = 'red'
>>> door1.__dict__['colour']
'white'
>>> door1.__class__.__dict__['colour']
'red'
```

Revenge of the Methods

Let's play the same game with methods. First of all you can see that, just like class attributes, methods are listed only in the class __dict__. Chances are that they behave the same as attributes when we get them

``` python
>>> door1.open is Door.open
False
```

Whoops. Let us further investigate the matter

``` python
>>> Door.__dict__['open']
<function Door.open at 0xb68604ac>
>>> Door.open
<function Door.open at 0xb68604ac>
>>> door1.open
<bound method Door.open of <__main__.Door object at 0xb67e162c>>
```

So, the class method is listed in the members dictionary as a plain function. So far, so good. The same happens when taking it directly from the class (here Python 2 needed to introduce unbound methods, which are no longer present in Python 3). Taking it from the instance, instead, returns a bound method.

Well, a function is a procedure you name and define with the def statement. When you refer to a function as part of a class in Python 3 you get a plain function, with no difference from a function defined outside a class.

When you get the function from an instance, however, it becomes a bound method. The name method simply means "a function inside an object", according to the usual OOP definitions, while bound signals that the method is linked to that instance. Why does Python bother with methods being bound or not? And how does Python transform a function into a bound method?

First of all, if you try to call a class function you get an error

``` python
>>> Door.open()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: open() missing 1 required positional argument: 'self'
```

Yes. Indeed the function was defined to require an argument called 'self', and calling it without an argument raises an exception. This perhaps means that we can give it one instance of the class and make it work

``` python
>>> Door.open(door1)
>>> door1.status
'open'
```

Python does not complain here, and the method works as expected. So Door.open(door1) is the same as door1.open(), and this is the difference between a plain function coming from a class and a bound method: the bound method automatically passes the instance as the first argument to the function.

Again, under the hood, __getattribute__() is working to make everything fit together: when we call door1.open(), Python actually calls Door.open(door1). However, Door.open is a plain function, so there is something more that converts it into a bound method that Python can safely call.

When you access a member of an object, Python calls __getattribute__() to satisfy the request. This magic method, however, conforms to a procedure known as the descriptor protocol. For read access, __getattribute__() checks whether the object has a __get__() method and, if so, calls it. So the conversion of a function into a bound method happens through this mechanism. Let us review it by means of an example.

``` python
>>> door1.__class__.__dict__['open']
<function Door.open at 0xb68604ac>
```

This syntax retrieves the function defined in the class; the function knows nothing about objects, but it is an object (remember "everything is an object"). So we can look inside it with the dir() built-in function

``` python
>>> dir(door1.__class__.__dict__['open'])
['__annotations__', '__call__', '__class__', '__closure__', '__code__',
 '__defaults__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__',
 '__format__', '__ge__', '__get__', '__getattribute__', '__globals__',
 '__gt__', '__hash__', '__init__', '__kwdefaults__', '__le__', '__lt__',
 '__module__', '__name__', '__ne__', '__new__', '__qualname__',
 '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
 '__str__', '__subclasshook__']
>>> door1.__class__.__dict__['open'].__get__
<method-wrapper '__get__' of function object at 0xb68604ac>
```

As you can see, a __get__ method is listed among the members of the function, and Python recognizes it as a method-wrapper. This method connects the open function to the door1 instance, so we can call it passing the instance alone

``` python
>>> door1.__class__.__dict__['open'].__get__(door1)
<bound method Door.open of <__main__.Door object at 0xb67e162c>>
```

and we get exactly what we were looking for. This complex syntax is what happens behind the scenes when we call a method of an instance.
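The same __get__() hook powers user-defined descriptors. As a minimal sketch (the Verbose class and its names are invented for illustration, not part of the original example), we can write an object whose __get__() distinguishes access through the class from access through an instance, exactly as functions do when they produce bound methods:

``` python
class Verbose:
    """A minimal descriptor: __get__() is invoked on attribute access."""
    def __get__(self, instance, owner):
        # instance is None when the attribute is accessed through the class
        if instance is None:
            return "accessed through the class"
        return "accessed through an instance"

class Door:
    greeting = Verbose()

door1 = Door()
print(Door.greeting)   # accessed through the class
print(door1.greeting)  # accessed through an instance
```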

When Methods met Classes

Using type() on functions defined inside classes reveals some other details on their internal representation

``` python
>>> door1.open
<bound method Door.open of <__main__.Door object at 0xb6f9834c>>
>>> type(Door.open)
<class 'function'>
>>> type(door1.open)
<class 'method'>
```

As you can see, Python tells the two apart recognizing the first as a function and the second as a method, where the second is a function bound to an instance.

What if we want to define a function that operates on the class instead of operating on the instance? As we may define class attributes, we may also define class methods in Python, through the classmethod decorator. Class methods are functions that are bound to the class and not to an instance.

``` python
class Door:
    colour = 'brown'

    def __init__(self, number, status):
        self.number = number
        self.status = status

    @classmethod
    def knock(cls):
        print("Knock!")

    def open(self):
        self.status = 'open'

    def close(self):
        self.status = 'closed'
```


Such a definition makes the method callable on both the instance and the class

``` python
>>> door1 = Door(1, 'closed')
>>> door1.knock()
Knock!
>>> Door.knock()
Knock!
```

and Python identifies both as (bound) methods

``` python
>>> door1.__class__.__dict__['knock']
<classmethod object at 0xb67ff6ac>
>>> door1.knock
<bound method type.knock of <class '__main__.Door'>>
>>> Door.knock
<bound method type.knock of <class '__main__.Door'>>
>>> type(Door.knock)
<class 'method'>
>>> type(door1.knock)
<class 'method'>
```

As you can see, the knock() function accepts one argument, called cls as a reminder that it is not an instance but the class itself. This means that inside the function we can operate on the class, and the class is shared among instances.

``` python
class Door:
    colour = 'brown'

    def __init__(self, number, status):
        self.number = number
        self.status = status

    @classmethod
    def knock(cls):
        print("Knock!")

    @classmethod
    def paint(cls, colour):
        cls.colour = colour

    def open(self):
        self.status = 'open'

    def close(self):
        self.status = 'closed'
```


The paint() classmethod now changes the class attribute colour which is shared among instances. Let's check how it works

``` python
>>> door1 = Door(1, 'closed')
>>> door2 = Door(2, 'closed')
>>> Door.colour
'brown'
>>> door1.colour
'brown'
>>> door2.colour
'brown'
>>> Door.paint('white')
>>> Door.colour
'white'
>>> door1.colour
'white'
>>> door2.colour
'white'
```

The class method can be called on the class, but this affects both the class and the instances, since the colour attribute of instances is taken at runtime from the shared class.

``` python
>>> door1.paint('yellow')
>>> Door.colour
'yellow'
>>> door1.colour
'yellow'
>>> door2.colour
'yellow'
```

Class methods can be called on instances too, and their effect is the same as before. The class method is bound to the class, so it works on the class regardless of whether it is called through the class or through an instance.
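Because cls always refers to the class, class methods are a natural place for state shared by all instances. As a sketch (the _created counter and the _bump()/created() helpers are invented for illustration), here is a Door variant that counts how many doors have been built:

``` python
class Door:
    _created = 0  # class attribute, shared by all instances

    def __init__(self, number, status):
        self.number = number
        self.status = status
        Door._bump()

    @classmethod
    def _bump(cls):
        cls._created += 1

    @classmethod
    def created(cls):
        return cls._created

door1 = Door(1, 'closed')
door2 = Door(2, 'closed')

print(Door.created())   # 2
print(door1.created())  # 2 — same class-level counter, via the instance
```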

Movie Trivia

Section titles come from the following movies: The Empire Strikes Back (1980), Raiders of the Lost Ark (1981), Revenge of the Nerds (1984), When Harry Met Sally (1989).


You will find a lot of documentation in this Reddit post. Most of the information contained in this series comes from those sources.


Feel free to use the blog Google+ page to comment on the post. The GitHub issues page is the best place to submit corrections.

Next post

Python 3 OOP Part 3 - Delegation: composition and inheritance

Omaha Python Users Group: August 20 Meeting Details

 ∗ Planet Python

Location - a conference room at Gordmans in Aksarben thanks to Aaron Keck.

Meeting starts at 7pm, Wednesday, 8/20/14

Call 402-651-5215 if you have last minute communications.

Parking and entry details:

The building is the northwest corner of 67th and Frances and the Gordmans entrance is next to the “g” sign, about midway along the building.  There’s parking directly out front, but it sometimes fills up in the evenings.  The garage around back is open to the public after 5:30 or 6 as well.

The building doors lock at 5, so Aaron will be standing by to badge people in starting around 6:45.  If you’re running late, or early, just shoot him an email and he can meet you.

- Interesting Python tips and tricks we have discovered recently
- Bring your questions/problems you need help solving
- Scheduling topics and discussions for the next few meetings.



Kushal Das: 10 years and continuing

 ∗ Planet Python

Ten years ago I started a Linux Users Group in Durgapur, as I thought that was the only way to go forward. Almost no one in the colleges knew much about Linux, apart from a couple of users in each college. "Learn and teach others" — the motto was very much true from day one, and it still holds a central place in the group.

The group started with help from a lot of people who were from different places, mostly the ilug-kolkata chapter. Sankarshan, Runa, Sayamindu, Indranil, Soumyadip they all helped in many different ways. Abhijit Majumder, who is currently working as Assistant Professor in IIT Mumbai, donated the money for the domain name in the first year.

After one year, I moved to Bangalore for my job and gave a talk about the group's first-year journey. The focus of the group also changed from being just a user group to being a group of like-minded contributors.

Then, in 2008, I started the summer training program; the 7th edition is currently going on. This program has helped keep a steady stream of new contributors coming out of the group. People from different countries participated in the sessions and became contributors to many upstream projects.

I have to admit that we are close to the Fedora Project and Python, as many of us work on and use these two projects every day.

We managed to have a couple of meetings before, in 2006 and 2007. We will be meeting again from 29th August to 2nd September in NIT Durgapur; most of the active members are coming down to Durgapur. We will spend the days in talks and workshops, and the evenings in developer sprints.

Suchakra Sharma made the new logo and tshirt design for the event.

dgplug logo

The event page is up and the talk schedule is also up with help from Sanisoft. We are using their beautiful conference scheduler application for the same. Come and meet us in Durgapur.