Class variables in Python
I’ve been coding in Python for a long time. Recently, I learned some nitty-gritty differences between class and instance variables. It’s scary to find things about Python that I probably should have known. But it’s great fodder for blog posts. In the interest of helping others avoid the same mistake, here’s what I got wrong and what I learned.
This post is inspired by, and kinda stolen from, a question on StackOverflow, especially millerdev’s answer and Dubslow’s comment. The topic is authoritatively covered by a chapter in The Python Tutorial. I totally should reread the tutorial.
Code examples were tested in the Python 3.5 and Python 2.7 REPL.
OOP, more like oops
In Python, classes and instances have attributes attached to them. For example, they might have the data total_seen and count.
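A minimal sketch of such a class, assuming a constructor that takes a name. The names Vegetable, total_seen, count, name, and eggplant come from this post; the exact __init__ signature is an assumption for illustration.

```python
class Vegetable:
    # Class variable: defined once, in the class body.
    total_seen = 0

    def __init__(self, name):
        # Instance variables: created on each object when __init__ runs.
        self.name = name
        self.count = 0


eggplant = Vegetable("eggplant")
print(Vegetable.total_seen)  # 0
print(eggplant.count)        # 0
```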
My first mistake was thinking total_seen and count were interchangeable ways of initializing a default value for an instance. And, in a lot of cases, it worked the way I imagined it should.
But I was wrong.
They are different
To warm up, here’s an easy difference between them:
total_seen is only associated with Vegetable, not eggplant. count only exists after we create eggplant, and is only associated with eggplant. Vegetable doesn’t treat self.count special at all.
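One way to see this in the REPL is to peek at each object’s own namespace with vars(). This is a sketch that assumes a Vegetable class whose __init__ takes a name and sets count to 0; the helper variable names are just for illustration.

```python
class Vegetable:
    total_seen = 0

    def __init__(self, name):
        self.name = name
        self.count = 0


# Before any instance exists, the class has total_seen but no count.
class_has_total_seen = "total_seen" in vars(Vegetable)
class_has_count = hasattr(Vegetable, "count")

eggplant = Vegetable("eggplant")

# The instance's own namespace holds count and name, but not total_seen.
instance_namespace = sorted(vars(eggplant))

# Reading eggplant.total_seen still works: lookup falls back to the class.
fallback_value = eggplant.total_seen

print(class_has_total_seen)  # True
print(class_has_count)       # False
print(instance_namespace)    # ['count', 'name']
print(fallback_value)        # 0
```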
Words
In the words of the Python tutorial, I’m talking about the difference between class and instance variables. total_seen is a class variable and is associated with Vegetable. name is an instance variable and is associated with eggplant.
Globals
I like to collect spooky programming examples. My favorite type of example is one where the code doesn’t throw errors and looks correct to the untrained eye, but does the wrong thing.
Now that we know class and instance variables are different, let’s come up with a spooky example.
Let’s say we want all instances of a class to actually share the same variable, like to track the global total number of vegetables seen. (Aside: Globals freak me out, and this seems like a questionable approach. But I’ll use it for a contrived example.)
Here’s the spooky code:
When eggplant is created, it looks for self.total_seen, and finds the class attribute, which has a value of 0. Then it adds one, and assigns the result to total_seen’s evil twin, the instance attribute. The class attribute remains as 0.
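To make that concrete, here’s a minimal sketch of the behavior just described. It assumes a Vegetable class whose __init__ takes a name and does self.total_seen += 1; carrot is a hypothetical second instance added to show that the counter never gets past 1.

```python
class Vegetable:
    total_seen = 0

    def __init__(self, name):
        self.name = name
        # Reads the class attribute (0), adds 1, then assigns the result
        # to a NEW instance attribute that shadows the class attribute.
        self.total_seen += 1


eggplant = Vegetable("eggplant")
carrot = Vegetable("carrot")  # hypothetical second vegetable

print(eggplant.total_seen)   # 1
print(carrot.total_seen)     # 1, not 2
print(Vegetable.total_seen)  # 0, the class attribute never changed
```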
Let’s try that again
Compare that to what happens when we update the class-level attribute directly.
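A sketch of that variant, assuming a Vegetable class whose __init__ takes a name and increments Vegetable.total_seen instead of self.total_seen. carrot is a hypothetical second instance.

```python
class Vegetable:
    total_seen = 0

    def __init__(self, name):
        self.name = name
        # Assign through the class, so the shared attribute is updated.
        Vegetable.total_seen += 1


eggplant = Vegetable("eggplant")
carrot = Vegetable("carrot")  # hypothetical second vegetable

print(Vegetable.total_seen)            # 2
print(eggplant.total_seen)             # 2, read through the class
print("total_seen" in vars(eggplant))  # False, no instance-level twin
```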
Flashbacks of mutability gotchas
Let’s try modifying self’s instance-level attribute again, but mix it up. Instead of using += on an int, let’s append to a list. Since the int was immutable, += had to build a new int, and the assignment bound it to a new attribute on the instance. In the case of a list, which is mutable, there’s no assignment at all: we’re going to mutate the existing object in place.
Surprise! self mutates the global in this case. It doesn’t create anything in the instance.
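Here’s a minimal sketch of that surprise, assuming a Vegetable class with a class-level all_names list and an __init__ that appends through self. carrot is a hypothetical second instance.

```python
class Vegetable:
    all_names = []

    def __init__(self, name):
        self.name = name
        # No assignment here: lookup finds the class-level list and
        # append() mutates that one shared object in place.
        self.all_names.append(name)


eggplant = Vegetable("eggplant")
carrot = Vegetable("carrot")  # hypothetical second vegetable

print(Vegetable.all_names)            # ['eggplant', 'carrot']
print("all_names" in vars(eggplant))  # False, nothing created on the instance
```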
And, as far as I can tell (and I admit I’m still fuzzy in this area), there’s no functional difference here between self.all_names.append() and Vegetable.all_names.append(), as long as nothing has assigned all_names on the instance or on a subclass. From a coding clarity perspective, I’d rather use Vegetable.all_names, so it’s clear we’re modifying a class-level attribute.
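One way to poke at this is an identity check. This sketch assumes the same kind of Vegetable class with a class-level all_names list; the rebinding at the end is a hypothetical twist, not something from the code that prompted this post.

```python
class Vegetable:
    all_names = []

    def __init__(self, name):
        self.name = name
        Vegetable.all_names.append(name)


eggplant = Vegetable("eggplant")

# Both spellings reach the very same list object.
same_list = eggplant.all_names is Vegetable.all_names

# Hypothetical twist: assigning on the instance creates a shadowing
# attribute, and after that the two spellings no longer agree.
eggplant.all_names = ["only eggplant"]
still_same = eggplant.all_names is Vegetable.all_names

print(same_list)            # True
print(still_same)           # False
print(Vegetable.all_names)  # ['eggplant']
```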
We could turn this into another spooky case by flipping the example: what if I wanted all_names to be an instance variable? It’ll fit in nicely with the “def fun(self, some_list=[]) might not do what you think it does” spooky example.
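A sketch of the flipped case. SharedNames and OwnNames are hypothetical class names made up for illustration: the first declares the list in the class body (so every instance shares it), the second creates a fresh list in __init__.

```python
class SharedNames:
    all_names = []  # meant to be per-instance, but it lives on the class

    def add(self, name):
        self.all_names.append(name)


class OwnNames:
    def __init__(self):
        self.all_names = []  # a fresh list for every instance

    def add(self, name):
        self.all_names.append(name)


a, b = SharedNames(), SharedNames()
a.add("eggplant")
print(b.all_names)  # ['eggplant'], b sees a's vegetable

c, d = OwnNames(), OwnNames()
c.add("eggplant")
print(d.all_names)  # []
```

The mutable default argument gotcha has the same flavor: the default list is created once, when the function is defined, and is then shared by every call that relies on it.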
So tell me the story where you broke everything!
Here’s the part where I assure myself that I probably never broke things. I normally wouldn’t use class variables for global things or initializing lists, and if I did, I’d hope I’d catch it in tests.
Actually, code like that last example instigated this post. I found code like that and updated it to use the class variable explicitly. I couldn’t come up with a test that broke before the change and worked after.
See Also
- The original StackOverflow question and discussion
- Python tutorial