Passing a collection argument without unpacking its contents
<p><strong>Question:</strong> What are the pros and cons of writing an <code>__init__</code> that takes a collection directly as an argument, rather than unpacking its contents?</p>
<p><strong>Context:</strong> I'm writing a class to process data from several fields in a database table. I iterate through a large (~100 million rows) query result, passing one row at a time to a class that performs the processing. Each row is retrieved from the database as a tuple (or optionally, as a dictionary).</p>
<p><strong>Discussion:</strong> Assume I'm interested in exactly three fields, but what gets passed into my class depends on the query, and the query is written by the user. The most basic approach might be one of the following:</p>
<pre><code>class Direct:
    def __init__(self, names):
        self.names = names

class Simple:
    def __init__(self, names):
        self.name1 = names[0]
        self.name2 = names[1]
        self.name3 = names[2]

class Unpack:
    def __init__(self, names):
        self.name1, self.name2, self.name3 = names
</code></pre>
<p>Here are some examples of rows that might be passed to a new instance:</p>
<pre><code>good = ('Simon', 'Marie', 'Kent')                # Exactly what we want
bad1 = ('Simon', 'Marie', 'Kent', '10 Main St')  # Extra field(s) behind
bad2 = ('15', 'Simon', 'Marie', 'Kent')          # Extra field(s) in front
bad3 = ('Simon', 'Marie')                        # Forgot a field
</code></pre>
<p>When faced with the above, <code>Direct</code> always runs (at least to this point) but is very likely to be buggy (GIGO). It takes one argument and assigns it exactly as given, so this could be a tuple or list of any size, <code>None</code>, a function reference, etc. This is the most quick-and-dirty way I can think of to initialize the object, but I feel like the class should complain immediately when I give it data it's clearly not designed to handle.</p>
<p><code>Simple</code> handles <code>bad1</code> correctly, is buggy when given <code>bad2</code> (it silently stores the wrong fields), and raises an <code>IndexError</code> when given <code>bad3</code>. It's convenient to be able to effectively truncate the inputs from <code>bad1</code>, but that is not worth the bugs that would come from <code>bad2</code>. This one feels naive and inconsistent.</p>
<p><code>Unpack</code> seems like the safest approach, because it raises an error in all three "bad" cases. The last thing we want to do is silently fill our database with bad information, right? It takes the tuple directly, but allows me to identify its contents as distinct attributes instead of forcing me to keep referring to indices, and it complains if the tuple is the wrong size.</p>
<p>On the other hand, why pass a collection at all? Since I know I always want three fields, I can define <code>__init__</code> to explicitly accept three arguments, and unpack the collection using the <code>*</code>-operator as I pass it to the new object:</p>
<pre><code>class Explicit:
    def __init__(self, name1, name2, name3):
        self.name1 = name1
        self.name2 = name2
        self.name3 = name3

names = ('Guy', 'Rose', 'Deb')
e = Explicit(*names)
</code></pre>
<p>The only differences I see are that the <code>__init__</code> definition is a bit more verbose and we raise <code>TypeError</code> instead of <code>ValueError</code> when the tuple is the wrong size. Philosophically, it seems to make sense that if we are taking some group of data (a row of a query) and examining its parts (three fields), we should pass a group of data (the tuple) but store its parts (the three attributes). So <code>Unpack</code> would be better.</p>
<p>If I wanted to accept an indeterminate number of fields, rather than always three, I would still have the choice to pass the tuple directly or to use arbitrary argument lists (<code>*args</code>, <code>**kwargs</code>) and <code>*</code>-operator unpacking. So I'm left wondering: is this a completely neutral style decision?</p>
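<p>The behaviour described above can be checked by running every constructor against every sample row. The following is only a sketch: the four classes and the sample rows are copied from the question, while the <code>outcome</code> helper and the <code>builders</code> and <code>results</code> dictionaries are names introduced here purely for illustration.</p>

```python
class Direct:
    def __init__(self, names):
        self.names = names


class Simple:
    def __init__(self, names):
        self.name1 = names[0]
        self.name2 = names[1]
        self.name3 = names[2]


class Unpack:
    def __init__(self, names):
        self.name1, self.name2, self.name3 = names


class Explicit:
    def __init__(self, name1, name2, name3):
        self.name1 = name1
        self.name2 = name2
        self.name3 = name3


rows = {
    "good": ("Simon", "Marie", "Kent"),
    "bad1": ("Simon", "Marie", "Kent", "10 Main St"),
    "bad2": ("15", "Simon", "Marie", "Kent"),
    "bad3": ("Simon", "Marie"),
}


def outcome(build, row):
    """Hypothetical helper: return 'ok', or the name of the exception raised."""
    try:
        build(row)
    except Exception as exc:
        return type(exc).__name__
    return "ok"


# Explicit receives the row through *-unpacking; the others take it whole.
builders = {
    "Direct": Direct,
    "Simple": Simple,
    "Unpack": Unpack,
    "Explicit": lambda row: Explicit(*row),
}

results = {
    name: {label: outcome(build, row) for label, row in rows.items()}
    for name, build in builders.items()
}

for name, per_row in results.items():
    print(name, per_row)
```

<p>Note that <code>Simple</code> reports <code>ok</code> for <code>bad2</code> even though <code>name1</code> ends up holding <code>'15'</code>, which is exactly the silent failure that makes it risky. <code>Unpack</code> and <code>Explicit</code> reject the same three rows and differ only in the exception type.</p>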