Multiple assignment and tuple unpacking improve Python code readability (2018) (treyhunner.com)
166 points by bjourne on May 8, 2022 | 96 comments


This is an excellent article!

In functional languages this is called "destructuring assignment".

This article could be a great jumping-off point into namedtuple[1], the walrus operator[2] and Pattern Matching[3] (Python 3.10), which help you get back into the symbolic domain from the structural domain, though you will have to come up with names! :)

Remember that namedtuple has the following additional methods

    _make               Class method that makes a new instance from an existing sequence or iterable
    _asdict             Return a new dict which maps field names to their corresponding values
    _replace            Return a new instance of the named tuple replacing specified fields with new values
    _fields             Tuple of strings listing the field names
    _field_defaults     Dictionary mapping field names to default values.
[1] https://docs.python.org/3/library/collections.html#collectio...

[2] https://peps.python.org/pep-0572/

[3] https://peps.python.org/pep-0636/
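
For anyone who hasn't used those namedtuple helpers, here is a minimal sketch of each one. `Point` and its fields are made-up names for illustration, and the `defaults` argument assumes Python 3.7 or later.

```python
from collections import namedtuple

# Point is a hypothetical example type; z defaults to 0.
Point = namedtuple("Point", ["x", "y", "z"], defaults=[0])

p = Point._make([1, 2, 3])        # build an instance from any iterable
q = p._replace(z=10)              # returns a NEW instance; p is unchanged
as_dict = p._asdict()             # maps field names to values
fields = Point._fields            # tuple of field names
defaults = Point._field_defaults  # only fields that have a default appear here

x, y, z = q                       # still unpacks like a plain tuple
```

Note that `_replace` never mutates, so the usual idiom is `p = p._replace(...)`.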


Why that underscore? I thought that’s a signal for “method is for internal use, only”.

Also (but a bit minor, given that Google tells me Python’s named tuples are immutable, so I guess programmers will know that): if I were to read foo._replace(bar=42) I would think that to modify foo.

Because of that, I like scala’s foo.copy(bar=42) more. It makes it clearer that you have to do foo = foo.copy(bar=42) to ‘change’ the value of foo (actually: bind it to a new value).


The underscore is to avoid conflicts with your own attributes named make, asdict, etc. So you could do:

  MyClass = namedtuple("MyClass", ["make", "asdict", "fields"])
  x = MyClass(make=10, asdict=11, fields=12)
  print(x.asdict)
The "underscore-to-avoid-naming-conflicts" approach is used elsewhere in the Python standard library, too, although another example isn't coming to me at the moment.


Perhaps even weirder, the Enum class uses names like `_value_`. At one time, inventing nonstandard "magic method" names like `__value__` was considered off-limits, because in principle those names could later be used by the Python language spec itself. In practice, such additions have been very rare, and library authors have become a lot less squeamish about defining their own.

The core data model of "almost everything is a mutable lookup table" can be easy to work with and reason about at times, but sometimes it's a little too simple, and you end up with these kinds of ad-hoc and inconsistent workarounds even in the standard library.


That is correct, although I agree with the GP that it's a bit unclear. My preferred practice for this is to put the underscore at the end, so it would be make_ and asdict_


> In functional languages this is called "destructuring assignment".

I thought it's just called pattern matching. Most functional languages don't have assignment and have bindings instead.


"Pattern matching", in the context of functional programming languages, most often refers to the complex combination of destructuring bind and conditional control flow based on which binding succeeded.

This article describing the Python feature of multiple assignment demonstrates only the destructuring assignment (assignment, because it is Python and these variables can be reassigned). There is no control flow tied to this operation, except for the exceptions that can be thrown when the LHS and RHS have incompatible structure.
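
To make the "exceptions on incompatible structure" point concrete, here is a small sketch. `try_unpack` is a hypothetical helper written only for this illustration.

```python
def try_unpack(values):
    """Unpack exactly two items, reporting a structural mismatch as text."""
    try:
        a, b = values
    except ValueError as exc:   # iterable, but the wrong number of items
        return f"ValueError: {exc}"
    except TypeError as exc:    # right-hand side is not iterable at all
        return f"TypeError: {exc}"
    return (a, b)

ok = try_unpack([1, 2])
too_few = try_unpack([1])
too_many = try_unpack([1, 2, 3])
not_iterable = try_unpack(42)
```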

Clojure is another example of a language with destructuring binding that is not tied to a control flow construct. Clojure functions and `let` bindings can use a destructuring syntax to bind to elements in a data structure. There is no tightly-linked control flow structure that goes along with this binding mechanism. It is my understanding that there are several libraries available that implement pattern matching, but I do not use these.

Most often, when I have seen pattern matching described to an audience with a background in imperative languages, it is compared to a super-powered switch statement. This gives the impression that the important part is the control flow. Languages such as Python and Clojure show that there is inherent value in the destructuring assignment/bind, even out of the context of control flow.


I would argue that destructuring is a special case of pattern matching.


IOW, the match/case tool can pattern match and destructure at the same time. So, uninterestingly, it is possible to do only one of those things:

Iterator unpacking:

    first, *middle, last = t
is similar to a match/case with a sequence pattern:

    match t:
        case first, *middle, last:
            ...
The former works for any iterable and the latter requires a sequence. The former raises an error for a non-iterable and the latter just skips the case for a non-sequence. Otherwise, they are similar.

However, arguing that one is a special case of the other misses the entire point of structural pattern matching. We would almost never substitute the latter for the former. They mostly live in different worlds because we use them and think about them much differently:

    # Something that would work but that we don't do
    for item in somedict.items():    
        match item:
            case [key, value]:
                 print(key.upper())
                 print(value)
                 print()
            case _:
                 raise RuntimeError('unexpected kind of item')

    # What we actually do
    for key, value in somedict.items(): 
        print(key.upper())
        print(value)
        print()
Sorry for the small rant, but I think this pedantic point isn't a useful line of thinking. The concepts and implementations have some overlap but should occupy different concept maps in our brains, very much like the idea that dicts can do set-like things but that we're better off thinking about dicts as doing lookups and sets as doing uniquification, membership-testing, and set-to-set operations.


one step at a time people will realize mainstream imperative programming and its derivative has been a long waste of time


I don't think that's a valid conclusion here at all.


the amount of code that was

    - point to a structure / object
    - declare a variable
    - access the object / structure field and assign it to var
    - repeat for every information you want to extract from the struct / object
is large and now replaced by something transparent


Memory-managed and dynamically typed programming languages already solve this boilerplate problem mostly. If anything, the primary advantages of functional programming languages are not functional purity, referential transparency, or elegant abstractions; but HM-style type systems, typeclasses, and tidy succinct syntax.


I believe HM was derived from structural and purely functional experience.


In what way is multiple assignment and tuple unpacking not imperative code?


You might never create any actual variable, you point to access/ reference reads passed to other functions. Of course in python it's only syntactic sugar but bit by bit culture will evolve.


Assignment is not necessarily mutation. Every language, including functional ones can assign values to names. You have unlocked +1 reductionist pedantry.


As a python veteran, I'm quite envious of the advanced destructuring you get in some languages.

Unpacking doesn't let you specify default values if cardinality doesn't match, nor can you extract dictionaries by keys.

Since operator.itemgetter and itertools.islice are limited, I end up either adding a dependency or reinventing the wheel for every project.


> Unpacking doesn't let you specify default values if cardinality doesn't match, nor can you extract dictionaries by keys.

I'm a bit confused by this one, but maybe I'm misunderstanding. Does this not accomplish what you're defining?

  def parse_record(first=0, second=1, **all_the_others) -> dict:
      return dict(first=first, second=second, **all_the_others)

  dictionary = dict(first=99, third=2)
  parse_record(**dictionary)

  > {"first": 99, "second": 1, "third": 2}


That's packing. Unpacking would be something like this:

    d = {'a' : 10, 'b' : 20, 'c' : 30}
    {'a' : v1, 'b' : v2, **others} = d
    print(v1, v2, others) # prints 10, 20, {'c' : 30}


Exactly.

The working code in Python, as a function, would be:

    def extract(mapping, *args, **kwargs):
        for a in args:
            yield mapping[a]
        for key, default in kwargs.items():
            yield mapping.get(key, default)
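
A usage sketch for that function, repeated here so the snippet runs on its own. Positional names are required keys and keyword arguments are keys with defaults. The sample dict is made up.

```python
def extract(mapping, *args, **kwargs):
    # Same generator as above: required keys first, then keys with defaults.
    for a in args:
        yield mapping[a]
    for key, default in kwargs.items():
        yield mapping.get(key, default)

d = {"a": 10, "b": 20}
a, b, c = extract(d, "a", "b", c=0)   # "c" is missing, so its default is used

try:
    (missing,) = extract(d, "zzz")    # a required key that is absent
except KeyError:
    missing = None
```

Because it is a generator, the KeyError only surfaces when the result is actually unpacked.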


There is pattern matching in Python 3.10 (match/case), which can match dictionaries. https://peps.python.org/pep-0636/#going-to-the-cloud-mapping...


Yes, but since it's a statement it is very verbose for a simple assignment.

Python really needs something like:

    {foo: "default", mapping} = mapping


You might want:

    foo = mapping.pop("foo", "default")
It removes "foo" key from mapping if any. `foo` is "default" if "foo" is missing and `mapping["foo"]` otherwise.

Dictionary unpacking compatible with the match/case syntax could look like:

   {"foo": foo, **rest} = mapping 
It asserts that there is "foo" key in mapping and puts its value into `foo`.


This is what i liked most working a bit in JS. How you can deconstruct, even in the function definition!!! Using {}. Why did they have to use that for sets in python ahhh.

Also using ? for attributes that might be there and being able to chain that is sooo awesome.

These two things make for example working with nested Json in JS so much better than in Python.


The ? attribute would indeed be a welcome addition, but not to work in the same way as in JS. In JS, it ignores null values. That wouldn't work in Python.

We would need a ? to say "ignore AttributeError, IndexError and KeyError on that access".
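
Until something like that exists, the idea can be approximated with a helper. This is only a sketch: `dig` is a made-up name, and it tries item access before attribute access, which is a design choice rather than anything standard.

```python
def dig(obj, *path, default=None):
    """Follow keys / indexes / attribute names, returning default on a miss."""
    for step in path:
        try:
            obj = obj[step]                  # dict key or sequence index
        except (KeyError, IndexError, TypeError):
            try:
                obj = getattr(obj, step)     # fall back to attribute access
            except (AttributeError, TypeError):
                return default
    return obj

data = {"user": {"emails": ["a@example.com"]}}
first_email = dig(data, "user", "emails", 0)
no_phone = dig(data, "user", "phone", 0)
out_of_range = dig(data, "user", "emails", 5, default="n/a")
```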

However, the use of {} for set is not a problem.

First, remember that Python is much, much older than JS. Hindsight is 20/20. Second, you could always use a different syntax.


> Unpacking doesn't let you specify default values if cardinality doesn't match

Functional languages that have pattern matching have wildcard matches that allow you to explicitly handle things that don't match as expected and also have tail matching for lists.

> nor can you extract dictionaries by keys

I'm not exactly sure what you mean here, but you can definitely do that for records in F# and maps/structs in Elixir. It's done all the time.


I think the parent comment is referring to how you can do e.g., `let {x, y, z} = point` (JS, object/dictionary) or `let Point {x, y, z} = point` (Rust, struct) but in Python your only option is `x = point.x; y = point["y"]` etc


Indeed.

Now you can do:

    import operator
    foo, bar, baz = operator.itemgetter('foo', 'bar', 'baz')(mapping)
But it's verbose, inelegant, and doesn't handle missing values.
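
To show the missing-value problem, here is a small sketch with a made-up mapping. The fallback to `dict.get` is just one possible workaround.

```python
from operator import itemgetter

mapping = {"foo": 1, "bar": 2}

foo, bar = itemgetter("foo", "bar")(mapping)

try:
    foo, baz = itemgetter("foo", "baz")(mapping)   # "baz" is not in the mapping
except KeyError:
    # One workaround: fall back to dict.get for each key (missing -> None).
    foo, baz = (mapping.get(k) for k in ("foo", "baz"))
```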


I misread their first sentence. I see now that they’re lamenting what I mentioned not being in Python. I was thinking they meant in general and not just in Python.


Great article. I use unpacking regularly for a lot of the reasons given, as it helps ensure data is in its expected shape. However, I never knew of "star assignment" eg

  first, second, *others = some_tuple
I've opted for less-readable positional indexing several times for not realizing this. Looks like I have some updates to do.


This reminds me of an idea I wrote about years ago called Pass-Through Lists, where every assignment is implicitly a destructuring assignment: https://jonathanwarden.com/2014/06/19/pass-through-lists/


> However, I never knew of "star assignment" eg

   first, second, *others = some_tuple
For what it's worth, I think that's often called "splat".


You can also do:

    first, second, *_ = some_tuple
If you don't care about the others


A bit sad that this stopped working in lambdas in Python 3, it was nice to have in Python 2:

    (lambda (x, y): x+y)((2, 3),)


As an Erlang devotee, when I switched jobs and started writing Python I was happy to find that destructuring worked for function arguments, only to be disappointed when I realized Python 2 was about dead and Python 3 had dropped it because “no one uses it”.


I also miss this capability. It was super helpful for working with 2-D or 3-D point vectors:

    lambda (x1, y1), (x2, y2): abs(x2 - x1) + abs(y2 - y1)
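
In Python 3 the closest equivalent I know of moves the unpacking into the function body. `manhattan` is just an illustrative name.

```python
def manhattan(p1, p2):
    # Unpack in the body, since Python 3 no longer allows it in the signature.
    (x1, y1), (x2, y2) = p1, p2
    return abs(x2 - x1) + abs(y2 - y1)

distance = manhattan((1, 2), (4, 6))
```

It costs one extra line, but the nested structure of the arguments is still checked at the point of unpacking.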


That's pretty cool, shame it doesn't work in 3 too.

Not as nice, but:

    (lambda x, y: x+y)(*(2, 3))
Seems to work


Now do

  lambda i,(x,y): x[i]+y[i]
I never really understood the rationale here. Pep8 hates lambdas, which is also baffling. It seems as though tptb don't like lambdas, but there wasn't enough political will to remove them entirely, so they kneecapped them instead. Because... more lines = more readable??

But we've got a walrus operator now, because... less lines = more readable?


FYI here's the thread where tuple parameters were removed.

https://mail.python.org/archives/list/python-dev@python.org/...


People who respect the law and love sausage should never watch either of them being made ;-)

The referenced thread isn't Python's finest hour. A: "Hey, this feature is slightly inconvenient to implement in AST, let's kill it." B: "Many users haven't heard of this, so let's kill it." C: "I find it useful and there's no real advantage in killing it". A&B: "Okay, it's gone".

Really, you're better-off just enjoying a language that is mostly nice and mostly well-designed. Don't read the development threads. It will spoil the illusion.


If only it had been used in Django.

Similar thing about adding animated png support in Firefox, https://bugzilla.mozilla.org/show_bug.cgi?id=257197

Petabytes of bandwidth wasted, and probably gigawatts of electricity.

This is like Google-internal behavior leaking out into the real world. Actively user hostile or at least indifferent.


> So the short answer is: yes, we are attached to it, we will miss it. No guarantees about throwing fits, though <wink>.

Observe: me, throwing a fit.


I cursed and swore for twenty full minutes when they removed tuple unpacking from parameters. God that pissed me off. Sure, I was one of only a handful of people who ever used it, and it had its warts, but it was so elegant.


>>> "Hey, we don't like this feature anymore, is anybody using it?"

>> "Yes, we're using it"

> "Okay, it's gone"

[rage]

The one thing I agree with in the thread is the confusion of

  foo = lambda (x,y): x+y
  foo(1,2) #boom
I think a better approach would be to raise warnings when sole parameters are parenthesized, unless written as

  foo = lambda (x,y),: x+y
which is kinda gross, but cogent with the weirdness of singletons:

  a, = [1]
  b = (1,)
  c = 1,
Alas, I've never had Guido's ear, so I shout into the void.


You golf a lambda and then deride walrus in the same comment? Please explain yourself.

And if you really want to make the perfect lambda, you can do that with code generation and frame walking/patching.

http://farmdev.com/src/secrets/framehack/index.html


That wasn't golf. Are you going to tell me that this is more legible?

   lambda i, t: t[0][i] + t[1][i]
edit: and... that frame hack thing is fun, but what does it have to do with anything?

Also, note that I don't "deride" the walrus operator. But it does smack of inconsistency. Are we aiming for clean notation or not?


You want tuple params back into an inlined function declaration? You can do that with a framehack. One of the examples from the link is Ruby style string interpolation, f-strings in Python, before f-string support.

You can fight Python on lambda support, or you can build the language you want. It gives you the tools.

I am not arguing about lambda legibility, but tilting at windmills rather than just walking around them. They can't move.


Do you really use framehacks in production for the sake of syntactic sugar like this? It's fair that Python can do this sort of thing[0], but it seems like begging for trouble to make it load-bearing without a very strong reason.

[0] That level of access was essential when I was working on https://codewords.recurse.com/issues/seven/dragon-taming-wit...


If the local sugar leads to a lower incidence of diabetes, then yes.

I have never had a framehack spontaneously break. Self proclaimed Pythonistas of course make a sour face, like their God was offended, but that isn't a logical argument against.

I really enjoy your compiler art, please keep it up. Are you doing anything with Wasm these days?


Thank you!

I dislike style nazis too, e.g. carping when Peter Norvig's code won't pass PEP 8.

I'm just leery of the expected cost in this kind of case. It can go on working for years until some new complication or some change in the ecosystem makes it suddenly create a really weird problem. Or when you want to try moving to a fancy new Python implementation, you find you have this friction. Matter of judgement where some chance of such messes is paid for by what it can do for you. (Of course when it's less "load bearing" the balance shifts.) With https://coverage.readthedocs.io/en/stable/ for example, it used bytecode hacks to do something you couldn't do otherwise, and that's unlikely to mess you up.

I have had old C programs go crazy years later in a really hard to debug way because newer compilers may interpret your code like your ex-wife's divorce lawyer (as Kragen put it). Back in the day a lot of us thought we had a different kind of relationship with C compilers, and it'd be fine to code to that informal social contract. (Just a loose analogy.)

I'm piddling away at https://github.com/darius/cant these days. (Some of the motivation was feeling too confined by Python, actually.) No Wasm, but I'm happy it exists! I tried to make a system like it 20 years ago (Idel) and gave up too soon.


I love E! Or at least the problems it is trying to solve. As you know Wasm also has a capabilities model. And it is fairly trivial to persist the Wasm heap, it's just an array of bytes. I think Wasm aligns nicely.

Chez is a great Scheme, but it doesn't have a Wasm backend. I find https://github.com/schism-lang/schism very interesting.

As for C programs going crazy, well yeah. I did a thing where I would copy the bodies of functions around in memory, it worked on some version of Linux and GCC, but only by accident. I would be much less comfortable doing this kind of circuit bending than modifying Python stack frames. If I were to achieve a similar goal in the future, I'd use TCC, generate C code and compile directly into memory.

Framehacks aren't going to do the same thing, and one should have tests for it regardless. Framehacks get you tail calls, stack scope and a bunch of other nice properties.

Happy Hacking!


The example lambda isn't "golfing". It's just plain more readable than naming the function.

Introducing side-effects in the middle of expressions with the walrus operator is weird semantics, though admittedly sometimes very convenient.


I like the walrus operator for working with the regex module.

    if match := re.search(r"Score = (\d+)", data):
        print(match[1])
    else:
        print("Not found")


I love tuple unpacking but it annoys me endlessly that the *arguments are unpacked into a list, not a tuple. Whereas if you use * in a function signature, you get a tuple, not a list. My code always ends up looking like

  first, *rest = xs
  rest = tuple(rest)
because often I want to hash "rest".

Another trick I often use:

  the_only, = xs
This extracts the single element of a one-element iterable and throws an error if xs is empty or has more than one element.
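
A quick sketch of that single-element trick and its failure modes. `only` is a hypothetical wrapper, and the parenthesized `(item,)` form is equivalent to `item, = xs` but harder to misread.

```python
def only(xs):
    """Return the single element of xs, or raise ValueError."""
    (item,) = xs
    return item

single = only(["a"])

errors = []
for bad in ([], ["a", "b"]):      # too few items, then too many
    try:
        only(bad)
    except ValueError as exc:
        errors.append(str(exc))
```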


That's all cool and fine, but can you unpack a named tuple or struct (data class)? In Julia, this is one of my favorite quality-of-life improvements in 1.7: https://github.com/JuliaLang/julia/pull/39285

Example:

    julia> X = (this = 3, that = 5)
    (this = 3, that = 5)

    julia> this
    ERROR: UndefVarError: this not defined

    julia> (;that, this) = X;

    julia> this
    3

    julia> that
    5
this works with structs as well (in fact anything with getproperty() defined, so like dataframe column too)


If you don't mind questionable hacks, you can implement it in Python by abusing the import system:

https://stackoverflow.com/a/69860087


I've enjoyed using types.SimpleNamespace:

    >>> from types import SimpleNamespace
    >>> d = {"one":1, "two":2}
    >>> ns = SimpleNamespace(**d)
    >>> ns.two
    2
I consider it a bonus that the dict keys are grouped as attributes on an object :)


As a teacher, types.SimpleNamespace is wonderful for helping people bridge between their understanding of dicts (a fundamental) to classes/instances (a little less fundamental).

However, I almost never use SimpleNamespace in real code. Dicts offer extra capabilities (get, pop, popitem, clear, copy, keys, values, items). Classes/instances offer extra capabilities: unique and shared keys in distinct namespaces, instances knowing their own type, and an elegant transformation of method calls ``a.m(b, c)`` --> ``type(a).m(a, b, c)``. Dataclasses and named tuples have their advantages as well. In practice, it is almost never the case that SimpleNamespace beats one of the alternatives.


namedtuple yes, it's a tuple so can be unpacked (by position) and dataclasses can not be unpacked. But with the new pattern matching, you can "unpack" dataclass objects that way!

https://peps.python.org/pep-0636/


but I'm showing unpacking by name, not by position.


I answered the question you put a question mark behind :)


JavaScript has it too; it's called "object destructuring":

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...


That’s cool that this is in Julia Base! Do you know how it compares to Unpack.jl? I love this feature because you can do partial unpacking by name

https://github.com/mauro3/Unpack.jl


Unpack.jl happened before this got added to Julia Base.

You can always partially unpack, just only put what you need on the LHS


Looks like it can be done with some weird locals().__setitem__ hack.


I've long wanted some kind of "null-filling" destructuring operator in Python, to avoid cases like this:

    # oops, ValueError on `item = "noequals"`
    key, value = item.split("=", 1)
That alone makes destructuring somewhat limited, unless I'm either willing to validate my lengths ahead of time or do exception-driven control flow. Something like `?=` perhaps, to complement the walrus operator.


Is empty string acceptable in place of None? If so, consider partition:

  key, sep, value = item.partition("=")
If you need to know whether there was a split/match that occurred you could test sep's truthiness.
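
Putting the partition idea together, here is a minimal sketch. `parse_item` is a made-up name, and mapping "no separator" to None is an assumption about what the caller wants.

```python
def parse_item(item):
    """Split 'key=value'; value is None when there is no '='."""
    key, sep, value = item.partition("=")   # always returns exactly 3 parts
    return key, (value if sep else None)

with_value = parse_item("a=1")
empty_value = parse_item("a=")
no_equals = parse_item("noequals")
extra_equals = parse_item("a=b=c")          # only the first "=" splits
```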


Does this work for you?

  key, *value = item.split("=", 1)
  if value:
      value = value[0]
  else:
      value = None


Yep, that's a pattern I've applied before. It's not the worst, but I don't think it has the clarity that this would have:

    key, value ?= item.split("=", 1)
Modulo bike shedding over the actual syntax, of course.

(Edit: the semantics of this could be even more generic, like Ruby's `[...] rescue value` syntax for turning any raised exception into a default value instead.)


There was a similar PEP https://peps.python.org/pep-0505/


  value = value[0] if value else None


Destructuring is one of the things i love about Clojure!

I’m happy that js and python have also adopted it.


I'm fairly sure Python had destructuring syntax before Clojure existed. I remember using it in early 2000s.


I didn't see it mentioned (forgive me if I missed it) but you can also assign to slices:

    >>> n = list(range(10))
    >>> n[::3] = 'fizz'
    >>> n
    ['f', 1, 2, 'i', 4, 5, 'z', 7, 8, 'z']
        
It's weird but occasionally useful.
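
One wrinkle worth knowing, sketched below: a stepped (extended) slice needs exactly as many items as it has slots, while a plain slice lets the list grow or shrink.

```python
n = list(range(10))
n[::3] = "fizz"          # extended slice: 4 slots (indexes 0, 3, 6, 9), 4 items

try:
    n[::3] = "buzzz"     # 5 items for 4 slots
except ValueError as exc:
    mismatch = str(exc)

m = list(range(5))
m[1:3] = "abcd"          # plain slice: 2 elements replaced by 4
```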


While I agree with the article, readability is rarely precisely defined. What appears readable to a veteran programmer can be confusing to a novice. That raises the question: aren't veterans meant to understand a program faster? I have seen teams spend countless hours arguing about what makes code readable.


I share your perspective. When I saw this line below in the article, I thought it was neat and might use it when called for, but I can already see the comments on code review from some team members I've worked with who would find it less readable than the hard coded indices alternative.

    head, *middle, tail = numbers


I definitely expected "multiple assignment" to refer to this syntax (which Python also has)

  x = y = None


Yes, that would be correct.

   a = b = 10       # multiple assignment

   a, b = b, a      # simultaneous (or parallel) assignment
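
A small sketch of why the distinction matters in practice: the chained form binds every name to the same object, which bites with mutable values.

```python
a = b = []          # one list object, two names
a.append(1)         # visible through b as well

c, d = [], []       # two separate list objects
c.append(1)         # d is unaffected

x, y = 1, 2
x, y = y, x         # the right-hand side is fully evaluated before rebinding
```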


I suppose I’d like the smoke-filled rooms deciding readability to mandate square brackets on the left-hand side, or permit ,= without a space when a tool like black has an opinion.


Wouldn't it have been better for the language itself to have made a decision rather than another group of people who are in the business of having strong, enforced decisions about what is allowed?

The language specifies that simple statements can be separated by semicolons. Black eliminates this part of the language, essentially overriding Guido's decision.

The language specifies that four different quoting characters are available. Black eliminates this part of the language, mandating double quotes in code and single quotes in a repr output, essentially overriding Guido's decision.

The language specifies that in-line comments can start anywhere on a line. Black eliminates the writer's option to line-up their comments vertically, essentially overriding Guido's language decision.

The language allows and PEP 8 encourages using whitespace to aid readers in reading mathematical expressions just like sympy and LaTeX do: ``3*x**2 - 2*x + 5``. Black eliminates this possibility and mandates equal spacing regardless of operator precedence: ``3 * x ** 2 - 2 * x + 5``. Once again, this overrides Guido's judgment.

Wouldn't it be better for Guido to decide what the language allows rather than have a downstream tool narrow the range of what is possible?


Yes, and I'm looking forward to blue, and have used nero as well before I knew blue existed. Like the idea of black, and implementation, but its "taste" unfortunately leaves a bit to be desired.

Pushing extra spaces into slices is another one I don't care for.


gray seems more featureful, despite its slower-moving version odometer:

https://pypi.org/project/gray/


Hmm, looks interesting. However, I think I'd like to keep these tools separate, as long as they can stay compatible.


This is probably my most used C++17 feature too:

    for (auto& [key, value] : persons)
The nested bindings are neat. Don’t think you can do that in c++ yet.


For me it's these little things that make working with newer C++ standards so much easier. Always felt like iterating over anything was cumbersome. No longer the case.


this list is awesome. made my morning.

anyone know why this works in python

  lst=[]
  lst+=2, # note , in the end
  print(lst) #[2]


The tuple is sequence-like enough to be concatenated with a list: []+(2,) differs from []+[2] only by the type of the temporary object that contains 2.


The truly weird thing here is that

   [] + (2,)
is a

   TypeError: can only concatenate list (not "tuple") to list
but

   L = []
   L += (2,)
is totally cool.

In the end, it makes sense because I can't tell you if []+() should be a tuple or a list, and even if I did, I might have a different answer for ()+[]; whereas L+=bla looks like the type of L should not change.


They're different operations. The first is concatenation; the second is extension. Concatenation only works with other lists, but extension works with any iterable.
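
A short sketch of the two operations side by side. The variable names are arbitrary.

```python
L = []
L += (2,)            # in-place extension: any iterable is accepted
L += "ab"            # even a string, taken one character at a time
L += range(2)

try:
    [] + (2,)        # concatenation: both operands must be lists
except TypeError as exc:
    concat_error = str(exc)

alias = L
L += [99]            # mutates in place, so alias sees the change too
```

So for lists, `L += x` behaves like `L.extend(x)` rather than `L = L + x`.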


The comma operator makes tuples.

    lst += (2,)


Roughly, that is true. However, the actual pattern matching rules are a bit more complex:

   ()          # Makes a zero-tuple. The parens are required.

   2,          # Makes a one-tuple.
   (2,)        # Makes a one-tuple. The parens are optional.

   [2]         # Makes a list of size one.
   [(2)]       # Also makes a list of size one.
   [2,]        # Also makes a list of size one.

   f(x, y)     # Call a function with two arguments.
   f((x, y))   # Make a 2-tuple and call a function with one argument.


Yeah, you're right, of course. I should have mentioned the concept of "comma operator" is a heuristic, not something from the grammar or semantics.

BTW, kind of a tangent, I've always assumed (but not checked) that every "()" literal evaluates to one singleton zero-tuple, do you know if that's true?


It was, but that is not guaranteed and may stop being true due to work on subinterpreters, new memory managers, and heap types.

Also, it used to be true that ``frozenset() is frozenset()``, but that was lost in the war on singletons and global state.


Ah, cheers! You're a treasure!


I had no idea about the '*'!!



