This is pretty bizarre, and seems like it would cause a memory leak if you don't know what you're actually doing. You're allocating a my_dict object on the heap, in a field of the function object, so my_dict never goes out of scope and never gets garbage collected until the function fn itself does, right?
Whereas if you did this instead:
def fn(x):
    my_dict = {}
    my_dict[x] = x * 2
    return my_dict
You'd get what you'd expect--a new my_dict object gets created, returned, and then goes out of scope every time fn is called, so it would get garbage collected once there are no more references to that return value. (I think...)
(I don't know that much about how memory allocation and GC works in Python yet, just trying to learn!)
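That intuition is right, and it's easy to check. A quick sketch (the fresh-object-per-call behavior is language-level, not an implementation detail):

```python
def fn(x):
    my_dict = {}
    my_dict[x] = x * 2
    return my_dict

d1 = fn(3)
d2 = fn(3)
assert d1 == d2 == {3: 6}  # equal values...
assert d1 is not d2        # ...but a brand-new dict on every call
```

Once d1 and d2 go out of scope, each dict becomes unreachable and can be collected independently.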
this is not at all bizarre, you just need to know what dynamic means. the best answer i could come up with is "definition is execution". for an enlightening moment, see this piece of code:
def a():
    print "a called"
    return []

def fn(x=a()):
    x.append(1)
    print x

fn()
fn()
fn()
i suggest typing this directly into the interpreter instead of a script for better effect.
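For anyone on Python 3, the same experiment (with print as a function) behaves identically; "a called" appears exactly once, when the def line runs, and the shared list is visible afterward on fn.__defaults__:

```python
def a():
    print("a called")   # runs once, at definition time
    return []

def fn(x=a()):
    x.append(1)
    print(x)

fn()   # [1]
fn()   # [1, 1]
fn()   # [1, 1, 1]

# The single shared default lives on the function object:
print(fn.__defaults__)  # ([1, 1, 1],)
```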
You got me thinking about how python differs from lisp in this respect, and the subtle way that a functional API has helped this specific situation.
While lisp suffers from the same "definition is execution" gotcha, the effects are far rarer in practice because () is immutable and interned while [] is mutable and usually generated afresh each time it's executed.
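The identity difference is easy to check; note that interning the empty tuple is a CPython implementation detail, not a language guarantee:

```python
a, b = (), ()
assert a is b        # CPython interns the empty tuple, like Lisp's ()

x, y = [], []
assert x is not y    # each [] evaluation builds a fresh list
x.append(1)
assert y == []       # so mutating one can't affect the other
```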
$ python
>>> [] is []
False
$ sbcl
* (eq () ())
t
Since [] is mutable, appending to [] can superficially do 'the right thing'.
>>> a = []
>>> a.append(4)
>>> a
[4]
* (setq a ())
* (nconc a '(34))
(34)
* a
()
But there's a reason lisp does seemingly the wrong thing. According to the spec, nconc skips empty arguments (http://www.lispworks.com/documentation/HyperSpec/Body/f_ncon...). Reading between the lines, I'm assuming this makes sense from a perspective in lisp where we communicate even with 'destructive' operations through their return value. This is more apparent when you consider nreverse:
* (setq a '(1 2 3 4))
* (nreverse a)
(4 3 2 1)
* a
(1)
Destructive operations can reuse their input, but they're not required to maintain bindings. There is no precise equivalent of python's .reverse(). Instead, a common idiom is:
* (setq a (nreverse a))
It seems like a weird design decision, but one upshot of it besides encouraging a more functional style is that this optional-arg gotcha loses a lot of its power in lisps. It's very rare to define a default param of a non-empty list, and empty lists can't be modified without assigning to them.
It is perhaps more correct to say "when the function is instantiated". Functions in python aren't exactly "defined". Functions are objects same as everything else. The "def" keyword is how you call the constructor for a function object. The kwarg default values are evaluated once at object construction time and saved.
The "def" statement may construct different distinct function objects from a single definition. Each distinct function object has different default value objects, but each function object reuses its own default value objects each time it's called.
Consider this:
def outer_fn():
    def inner_fn(foo={}):
        # The id() function returns an internal
        # object identifier. Different objects
        # have different ids.
        print id(foo)
    return inner_fn

inner_fn_1 = outer_fn()
inner_fn_2 = outer_fn()
inner_fn_1()
inner_fn_1()
inner_fn_2()
inner_fn_2()
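A variant of that snippet (in Python 3 syntax) makes the sharing checkable rather than eyeballed, using the __defaults__ attribute where a function object stores its evaluated defaults:

```python
def outer_fn():
    def inner_fn(foo={}):
        return id(foo)
    return inner_fn

inner_fn_1 = outer_fn()
inner_fn_2 = outer_fn()

assert inner_fn_1() == inner_fn_1()  # one function object reuses its own default
assert inner_fn_1() != inner_fn_2()  # distinct function objects, distinct defaults
assert inner_fn_1.__defaults__ == ({},)  # the default lives on the function object
```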
"def" in Python is an executable statement, not a declaration- it creates a function object that gets bound to the specified name in the local namespace. The defaults are evaluated when the function is created.
Rather, the dict is bound at function-definition time (when the "def" statement executes), and not at each individual invocation (which would be much better, IMO). And during each invocation, it doesn't get re-bound.
Docstrings are (generally) available at runtime; comments aren't: they are discarded early in parsing. They are completely different beasts (in Python as in other languages).
I understand that they are different technically, but given that the performance implications are going to be nil for most people and docstrings have a similar function as comments, is it really necessary to be on such a high horse over something so minor?
Strings are not intended to act as comments. Docstrings are intended to act as docstrings. If you need inline comments beyond what is in your docstrings, there is no reason not to use Python's comment character as designed.
It's tangential to the original discussion, but that attitude nearly turned me off of Python entirely when I was first starting. I'd pop onto IRC or a discussion community and ask a question about something, explaining that I was new and the most common responses were:
1. Read the docs and figure it out yourself.
2. Why are you doing X? Only an intellectually feeble person would do X -- normal people do Y.
3. That's a waste of time and I'm not going to tell you how to do it because X, Y, and Z.
The section titled "Inconsistent get interface" compares get() to getattr(), which are two unrelated functions. getattr is the same as an attribute lookup on an object (person.name), while get() is a method defined by some mapping types which returns the value stored under a key (the person['name'] kind of lookup, not attribute access).
In the provided example he calls get on an empty dictionary for the key 1, then calls getattr of 'a' on an int. Finally he calls it again with an optional default argument of None.
The difference is made apparent by the example:
In [13]: test = {'values': 1}
In [14]: getattr(test, 'values')
Out[14]: <function values>
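To spell out the contrast (Python 3 here, so the repr differs slightly from the transcript above):

```python
test = {'values': 1}

# get() is a *key* lookup on the mapping:
assert test.get('values') == 1
assert test.get('missing') is None   # returns the optional default, None by default

# getattr() is an *attribute* lookup on the object; for a dict,
# 'values' names the built-in method, not the stored key:
method = getattr(test, 'values')
assert callable(method)
assert list(method()) == [1]
```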
Many of these "quirks" are just odd expectations on the author's part that were not fulfilled, but he does describe a few interesting odds and ends. And I agree completely about modules, importing, eggs, namespaces, and all that mess. One of the most refreshing things about learning Clojure after using Python for a while was the relative absence of confusion surrounding these issues, aided by Leiningen.
One of my favorite tomfooleries in Python is this:
>>> True, False = False, True
It doesn't have much practical effect, since most logical tests don't use the True and False constants directly. But it's a good way to perplex the unwary. (Python 3 closed this hole by making True and False keywords, so the assignment is a SyntaxError there.)
(Some dialects of) Smalltalk had `true become: false`, which didn't just swap the names `true` and `false`: it replaced every reference to the `true` object with a reference to `false`.
I don't consider the sort example to be a quirk at all, though some of the other examples are reasonable. It's more a definition of what methods are supposed to do in the first place: a behavior that is applied on the instance of the object. Unless you're doing functional programming (or implementing certain specialized design patterns), you wouldn't expect dog.move() to return a new dog, so why should list.sort() return a new list?
I always figured it was because "sort" is a verb. Over time I've come to feel that functions named with a verb phrase ought to be procedure-like, in that they have side effects and return either nothing (which in python means None) or some kind of error indicator (and in python you'd generally use exceptions instead).
There are exceptions (e.g., the "get" prefix, used for getters, which tend to return values and have no side-effects), and, as with any rule of thumb, better to break it than create something ugly. But as rules go, I've found this one a pretty useful one to stick to.
It would be, if only it were true. Consider Array#delete_if or Array#pop, for example, although there are many others too.
Matz has written about this - https://www.ruby-forum.com/topic/176830#773946 - and said "The bang (!) does not mean "destructive" nor lack of it mean non destructive either. The bang sign means "the bang version is more dangerous than its non bang counterpart; handle with care".
This is one of the most commonly misunderstood things about Ruby in my experience (enough so that some library developers do apply a ! == destructive naming system) and would certainly make an equivalent "Ruby quirks" list IMHO! :-)
I generally agree with you but I think the recommendation would be that list.sort() should return the original list in its new, sorted state, not a new list.
All in-place operations in python are supposed to return None, as a way of explicitly signaling that the operation was done on the existing object. It's not always followed outside the standard library, but it's very rare to run across an exception.
It's counter-intuitive at first, especially in this case, but it is consistent.
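Concretely, with list.sort() versus the sorted() built-in:

```python
a = [3, 1, 2]
result = a.sort()      # sorts in place
assert result is None  # in-place methods return None by convention
assert a == [1, 2, 3]

b = sorted([3, 1, 2])  # sorted() builds and returns a new list
assert b == [1, 2, 3]
```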
The one that always confuses me is that multiple "for"s in the same comprehension work the opposite way from how you expect:
>>> [[(a, b) for a in [1, 2, 3]] for b in [4, 5, 6]]
[[(1, 4), (2, 4), (3, 4)], [(1, 5), (2, 5), (3, 5)], [(1, 6), (2, 6), (3, 6)]]
>>> [(a, b) for a in [1, 2, 3] for b in [4, 5, 6]]
[(1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)]
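The rule is that multiple for clauses nest left to right, exactly as if written as nested for loops:

```python
pairs = [(a, b) for a in [1, 2, 3] for b in [4, 5, 6]]

# Equivalent nested loops, reading the "for" clauses left to right:
expected = []
for a in [1, 2, 3]:
    for b in [4, 5, 6]:
        expected.append((a, b))

assert pairs == expected
```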
Some of these quirks are explained incorrectly. I find the real explanations interesting, so please allow me to provide them.
> 999+1 is not 1000
The examples given in this section do not actually show what the poster thinks! Python's interning of small integers is not relevant. It only applies to integers in the range -5 through 256.
The real explanation for why "1000 is 1000" evaluates to True has to do with the Python compiler. By evaluating this expression as a single statement in the interactive interpreter, the compiler notices that the constant value 1000 is repeated more than once. Therefore, it is able to re-use the same object.
But if you provide the value in more than one statement, the compiler is unable to do this:
>>> a = 256
>>> b = 256
>>> a is b
True
>>> a = 257
>>> b = 257
>>> a is b
False
Note that this behavior exists because each statement in the interactive interpreter is compiled separately. If you place the code within a function instead, the entire function is compiled at once, and the object is reused once more:
>>> def f():
... a = 257
... b = 257
... return a is b
...
>>> f()
True
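You can see the reuse directly in CPython: within one code object the compiler deduplicates equal constants, so both 257 literals load a single entry from co_consts (an implementation detail, not a language guarantee):

```python
def f():
    a = 257
    b = 257
    return a is b

# CPython stores each distinct constant once per code object,
# so both literals load the very same int object:
print(f.__code__.co_consts)  # typically (None, 257)
print(f())                   # True
```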
> Ellipsis?
> Apparently Ellipsis is always “bigger” than anything, as opposite to None, which is always “smaller” than anything.
Not so. Ellipsis follows Python 2's default rules for the comparison of unrelated types.
>>> Ellipsis < ()
True
The default is documented as comparing objects "consistently but arbitrarily"[1]. The actual rules are:
1) None is the smallest object.
2) Followed by numbers.
3) Followed by all other objects. Objects of distinct types are compared by the lexical ordering of their type names.
This can easily lead to senseless orderings, when two types define an ordered relationship between themselves, but another type happens to have a name that is lexically between them, as with str, tuple, and unicode:
>>> 'b' < (1, 2) < u'a'
True
>>> 'b' < u'a'
False
The Ellipsis constant is an instance of a type named "ellipsis", and so it is smaller than instances of most of the other non-numeric builtin types, except for dict.
The actual use of Ellipsis has nothing to do with recursive containers printing an ellipsis in their repr. It's part of a wacky special syntax that exists for NumPy's benefit:
>>> d = {}
>>> d[...] = None
>>> d
{Ellipsis: None}
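A minimal sketch of what an object on the receiving end of that syntax sees (the Grid class here is a hypothetical stand-in for a NumPy array):

```python
class Grid:
    # __getitem__ simply echoes the subscript it receives,
    # so we can observe what the [...] syntax passes in.
    def __getitem__(self, key):
        return key

g = Grid()
assert g[...] is Ellipsis          # a bare ... arrives as the Ellipsis object
assert g[..., 0] == (Ellipsis, 0)  # mixed subscripts arrive as a tuple
```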