forked from bailiangrui/git-basics-tutorial
-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy paththinkpython2012.html
More file actions
660 lines (625 loc) · 45.1 KB
/
thinkpython2012.html
File metadata and controls
660 lines (625 loc) · 45.1 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<meta name="generator" content="hevea 2.09">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap.min.css" integrity="sha384-1q8mTJOASx8j1Au+a5WDVnPi2lkFfwwEAa8hDDdjZlpLegxhjVME1fgjWPGmkzs7" crossorigin="anonymous">
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/css/bootstrap-theme.min.css" integrity="sha384-fLW2N01lMqjakBkx3l/M9EahuwpSfeNvV63J5ezn3uZzapT0u7EYsXMjQV+0En5r" crossorigin="anonymous">
<link rel="stylesheet" type="text/css" href="thinkpython2.css">
<title>Dictionaries</title>
</head>
<body>
<nav class="navbar navbar-default navbar-fixed-top">
<div class="container-fluid">
<!-- Brand and toggle get grouped for better mobile display -->
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1" aria-expanded="false">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="#"><strong>Think Python</strong> - How to Think like a Computer Scientist (2e) <em>by Allen B. Downey</em></a>
</div>
<div>
<ul class="nav navbar-nav navbar-right">
<li><a href="http://greenteapress.com/thinkpython2/html/index.html"><span class="glyphicon glyphicon glyphicon-book" aria-hidden="true"></span></a></li>
<li><a href="thinkpython2011.html"><span class="glyphicon glyphicon glyphicon-menu-left" aria-hidden="true"></span></a></li>
<li><a href="index.html"><span class="glyphicon glyphicon glyphicon-home" aria-hidden="true"></span></a></li>
<li><a href="thinkpython2013.html"><span class="glyphicon glyphicon glyphicon-menu-right" aria-hidden="true"></span></a></li>
<li><a href="http://amzn.to/1VUYQUU"><span class="glyphicon glyphicon glyphicon-shopping-cart" aria-hidden="true"></span></a></li>
</ul>
<div>
</div><!-- /.container-fluid -->
</nav>
<table>
<tr>
<td valign="top" width="100" bgcolor="#b6459a" id="col-left">
</td>
<td valign="top" id="content">
<p>
<h1 class="chapter" id="sec129">Chapter 11  Dictionaries</h1>
<p>This chapter presents another built-in type called a dictionary.
Dictionaries are one of Python’s best features; they are the
building blocks of many efficient and elegant algorithms.</p>
<h2 class="section" id="sec130">11.1  A dictionary is a mapping</h2>
<p><a id="hevea_default885"></a>
<a id="hevea_default886"></a>
<a id="hevea_default887"></a>
<a id="hevea_default888"></a>
<a id="hevea_default889"></a>
<a id="hevea_default890"></a>
A <span class="c010">dictionary</span> is like a list, but more general. In a list,
the indices have to be integers; in a dictionary they can
be (almost) any type.</p><p>A dictionary contains a collection of indices, which are called <span class="c010">keys</span>, and a collection of values. Each key is associated with a
single value. The association of a key and a value is called a <span class="c010">key-value pair</span> or sometimes an <span class="c010">item</span>. <a id="hevea_default891"></a></p><p>In mathematical language, a dictionary represents a <span class="c010">mapping</span>
from keys to values, so you can also say that each key
“maps to” a value.
As an example, we’ll build a dictionary that maps from English
to Spanish words, so the keys and the values are all strings.</p><p>The function <span class="c004">dict</span> creates a new dictionary with no items.
Because <span class="c004">dict</span> is the name of a built-in function, you
should avoid using it as a variable name.
<a id="hevea_default892"></a>
<a id="hevea_default893"></a></p><pre class="verbatim">>>> eng2sp = dict()
>>> eng2sp
{}
</pre><p>The squiggly-brackets, <code>{}</code>, represent an empty dictionary.
To add items to the dictionary, you can use square brackets:
<a id="hevea_default894"></a>
<a id="hevea_default895"></a></p><pre class="verbatim">>>> eng2sp['one'] = 'uno'
</pre><p>This line creates an item that maps from the key
<code>'one'</code> to the value <code>'uno'</code>. If we print the
dictionary again, we see a key-value pair with a colon
between the key and value:</p><pre class="verbatim">>>> eng2sp
{'one': 'uno'}
</pre><p>This output format is also an input format. For example,
you can create a new dictionary with three items:</p><pre class="verbatim">>>> eng2sp = {'one': 'uno', 'two': 'dos', 'three': 'tres'}
</pre><p>But if you print <span class="c004">eng2sp</span>, you might be surprised:</p><pre class="verbatim">>>> eng2sp
{'one': 'uno', 'three': 'tres', 'two': 'dos'}
</pre><p>The order of the key-value pairs might not be the same. If
you type the same example on your computer, you might get a
different result. In general, the order of items in
a dictionary is unpredictable.</p><p>But that’s not a problem because
the elements of a dictionary are never indexed with integer indices.
Instead, you use the keys to look up the corresponding values:</p><pre class="verbatim">>>> eng2sp['two']
'dos'
</pre><p>The key <code>'two'</code> always maps to the value <code>'dos'</code> so the order
of the items doesn’t matter.</p><p>If the key isn’t in the dictionary, you get an exception:
<a id="hevea_default896"></a>
<a id="hevea_default897"></a></p><pre class="verbatim">>>> eng2sp['four']
KeyError: 'four'
</pre><p>The <span class="c004">len</span> function works on dictionaries; it returns the
number of key-value pairs:
<a id="hevea_default898"></a>
<a id="hevea_default899"></a></p><pre class="verbatim">>>> len(eng2sp)
3
</pre><p>The <span class="c004">in</span> operator works on dictionaries, too; it tells you whether
something appears as a <em>key</em> in the dictionary (appearing
as a value is not good enough).
<a id="hevea_default900"></a>
<a id="hevea_default901"></a>
<a id="hevea_default902"></a></p><pre class="verbatim">>>> 'one' in eng2sp
True
>>> 'uno' in eng2sp
False
</pre><p>To see whether something appears as a value in a dictionary, you
can use the method <span class="c004">values</span>, which returns a collection of
values, and then use the <span class="c004">in</span> operator:
<a id="hevea_default903"></a>
<a id="hevea_default904"></a></p><pre class="verbatim">>>> vals = eng2sp.values()
>>> 'uno' in vals
True
</pre><p>The <span class="c004">in</span> operator uses different algorithms for lists and
dictionaries. For lists, it searches the elements of the list in
order, as in Section <a href="thinkpython2009.html#find">8.6</a>. As the list gets longer, the search
time gets longer in direct proportion.</p><p>For dictionaries, Python uses an
algorithm called a <span class="c010">hashtable</span> that has a remarkable property: the
<span class="c004">in</span> operator takes about the same amount of time no matter how
many items are in the dictionary. I explain how that’s possible
in Section <a href="thinkpython2022.html#hashtable">B.4</a>, but the explanation might not make
sense until you’ve read a few more chapters.</p>
<h2 class="section" id="sec131">11.2  Dictionary as a collection of counters</h2>
<p>
<a id="histogram"></a>
<a id="hevea_default905"></a></p><p>Suppose you are given a string and you want to count how many
times each letter appears. There are several ways you could do it:</p><ol class="enumerate" type=1><li class="li-enumerate">You could create 26 variables, one for each letter of the
alphabet. Then you could traverse the string and, for each
character, increment the corresponding counter, probably using
a chained conditional.</li><li class="li-enumerate">You could create a list with 26 elements. Then you could
convert each character to a number (using the built-in function
<span class="c004">ord</span>), use the number as an index into the list, and increment
the appropriate counter.</li><li class="li-enumerate">You could create a dictionary with characters as keys
and counters as the corresponding values. The first time you
see a character, you would add an item to the dictionary. After
that you would increment the value of an existing item.</li></ol><p>Each of these options performs the same computation, but each
of them implements that computation in a different way.
<a id="hevea_default906"></a></p><p>An <span class="c010">implementation</span> is a way of performing a computation;
some implementations are better than others. For example,
an advantage of the dictionary implementation is that we don’t
have to know ahead of time which letters appear in the string
and we only have to make room for the letters that do appear.</p><p>Here is what the code might look like:</p><pre class="verbatim">def histogram(s):
d = dict()
for c in s:
if c not in d:
d[c] = 1
else:
d[c] += 1
return d
</pre><p>The name of the function is <span class="c004">histogram</span>, which is a statistical
term for a collection of counters (or frequencies).
<a id="hevea_default907"></a>
<a id="hevea_default908"></a>
<a id="hevea_default909"></a></p><p>The first line of the
function creates an empty dictionary. The <span class="c004">for</span> loop traverses
the string. Each time through the loop, if the character <span class="c004">c</span> is
not in the dictionary, we create a new item with key <span class="c004">c</span> and the
initial value 1 (since we have seen this letter once). If <span class="c004">c</span> is
already in the dictionary we increment <span class="c004">d[c]</span>.
<a id="hevea_default910"></a></p><p>Here’s how it works:</p><pre class="verbatim">>>> h = histogram('brontosaurus')
>>> h
{'a': 1, 'b': 1, 'o': 2, 'n': 1, 's': 2, 'r': 2, 'u': 2, 't': 1}
</pre><p>The histogram indicates that the letters <code>'a'</code> and <code>'b'</code>
appear once; <code>'o'</code> appears twice, and so on.</p><p><a id="hevea_default911"></a>
<a id="hevea_default912"></a>
Dictionaries have a method called <span class="c004">get</span> that takes a key
and a default value. If the key appears in the dictionary,
<span class="c004">get</span> returns the corresponding value; otherwise it returns
the default value. For example:</p><pre class="verbatim">>>> h = histogram('a')
>>> h
{'a': 1}
>>> h.get('a', 0)
1
>>> h.get('b', 0)
0
</pre><p>As an exercise, use <span class="c004">get</span> to write <span class="c004">histogram</span> more concisely. You
should be able to eliminate the <span class="c004">if</span> statement.</p>
<h2 class="section" id="sec132">11.3  Looping and dictionaries</h2>
<p>
<a id="hevea_default913"></a>
<a id="hevea_default914"></a>
<a id="hevea_default915"></a></p><p>If you use a dictionary in a <span class="c004">for</span> statement, it traverses
the keys of the dictionary. For example, <code>print_hist</code>
prints each key and the corresponding value:</p><pre class="verbatim">def print_hist(h):
for c in h:
print(c, h[c])
</pre><p>Here’s what the output looks like:</p><pre class="verbatim">>>> h = histogram('parrot')
>>> print_hist(h)
a 1
p 1
r 2
t 1
o 1
</pre><p>Again, the keys are in no particular order. To traverse the keys
in sorted order, you can use the built-in function <span class="c004">sorted</span>:
<a id="hevea_default916"></a>
<a id="hevea_default917"></a></p><pre class="verbatim">>>> for key in sorted(h):
... print(key, h[key])
a 1
o 1
p 1
r 2
t 1
</pre>
<h2 class="section" id="sec133">11.4  Reverse lookup</h2>
<p>
<a id="raise"></a>
<a id="hevea_default918"></a>
<a id="hevea_default919"></a>
<a id="hevea_default920"></a>
<a id="hevea_default921"></a></p><p>Given a dictionary <span class="c004">d</span> and a key <span class="c004">k</span>, it is easy to
find the corresponding value <span class="c004">v = d[k]</span>. This operation
is called a <span class="c010">lookup</span>.</p><p>But what if you have <span class="c004">v</span> and you want to find <span class="c004">k</span>?
You have two problems: first, there might be more than one
key that maps to the value <span class="c004">v</span>. Depending on the application,
you might be able to pick one, or you might have to make
a list that contains all of them. Second, there is no
simple syntax to do a <span class="c010">reverse lookup</span>; you have to search.</p><p>Here is a function that takes a value and returns the first
key that maps to that value:</p><pre class="verbatim">def reverse_lookup(d, v):
for k in d:
if d[k] == v:
return k
raise LookupError()
</pre><p>This function is yet another example of the search pattern, but it
uses a feature we haven’t seen before, <span class="c004">raise</span>. The
<span class="c010">raise statement</span> causes an exception; in this case it causes a
<span class="c004">LookupError</span>, which is a built-in exception used to indicate
that a lookup operation failed.
<a id="hevea_default922"></a>
<a id="hevea_default923"></a> <a id="hevea_default924"></a> <a id="hevea_default925"></a>
<a id="hevea_default926"></a> <a id="hevea_default927"></a></p><p>If we get to the end of the loop, that means <span class="c004">v</span>
doesn’t appear in the dictionary as a value, so we raise an
exception.</p><p>Here is an example of a successful reverse lookup:</p><pre class="verbatim">>>> h = histogram('parrot')
>>> key = reverse_lookup(h, 2)
>>> key
'r'
</pre><p>And an unsuccessful one:</p><pre class="verbatim">>>> key = reverse_lookup(h, 3)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 5, in reverse_lookup
LookupError
</pre><p>The effect when you raise an exception is the same as when
Python raises one: it prints a traceback and an error message.
<a id="hevea_default928"></a>
<a id="hevea_default929"></a>
<a id="hevea_default930"></a></p><p>The <span class="c004">raise</span> statement can take a detailed error message as an
optional argument. For example:</p><pre class="verbatim">>>> raise LookupError('value does not appear in the dictionary')
Traceback (most recent call last):
File "<stdin>", line 1, in ?
LookupError: value does not appear in the dictionary
</pre><p>A reverse lookup is much slower than a forward lookup; if you
have to do it often, or if the dictionary gets big, the performance
of your program will suffer.</p>
<h2 class="section" id="sec134">11.5  Dictionaries and lists</h2>
<p>
<a id="invert"></a></p><p>Lists can appear as values in a dictionary. For example, if you
are given a dictionary that maps from letters to frequencies, you
might want to invert it; that is, create a dictionary that maps
from frequencies to letters. Since there might be several letters
with the same frequency, each value in the inverted dictionary
should be a list of letters.
<a id="hevea_default931"></a>
<a id="hevea_default932"></a></p><p>Here is a function that inverts a dictionary:</p><pre class="verbatim">def invert_dict(d):
inverse = dict()
for key in d:
val = d[key]
if val not in inverse:
inverse[val] = [key]
else:
inverse[val].append(key)
return inverse
</pre><p>Each time through the loop, <span class="c004">key</span> gets a key from <span class="c004">d</span> and
<span class="c004">val</span> gets the corresponding value. If <span class="c004">val</span> is not in <span class="c004">inverse</span>, that means we haven’t seen it before, so we create a new
item and initialize it with a <span class="c010">singleton</span> (a list that contains a
single element). Otherwise we have seen this value before, so we
append the corresponding key to the list. <a id="hevea_default933"></a></p><p>Here is an example:</p><pre class="verbatim">>>> hist = histogram('parrot')
>>> hist
{'a': 1, 'p': 1, 'r': 2, 't': 1, 'o': 1}
>>> inverse = invert_dict(hist)
>>> inverse
{1: ['a', 'p', 't', 'o'], 2: ['r']}
</pre><blockquote class="figure"><div class="center"><hr class="c019"></div>
<div class="center"><img src="images/thinkpython2016.png"></div>
<div class="caption"><table class="c001 cellpading0"><tr><td class="c018">Figure 11.1: State diagram.</td></tr>
</table></div>
<a id="fig.dict1"></a>
<div class="center"><hr class="c019"></div></blockquote><p>Figure <a href="thinkpython2012.html#fig.dict1">11.1</a> is a state diagram showing <span class="c004">hist</span> and <span class="c004">inverse</span>.
A dictionary is represented as a box with the type <span class="c004">dict</span> above it
and the key-value pairs inside. If the values are integers, floats or
strings, I draw them inside the box, but I usually draw lists
outside the box, just to keep the diagram simple.
<a id="hevea_default934"></a>
<a id="hevea_default935"></a></p><p>Lists can be values in a dictionary, as this example shows, but they
cannot be keys. Here’s what happens if you try:
<a id="hevea_default936"></a>
<a id="hevea_default937"></a></p><pre class="verbatim">>>> t = [1, 2, 3]
>>> d = dict()
>>> d[t] = 'oops'
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: list objects are unhashable
</pre><p>I mentioned earlier that a dictionary is implemented using
a hashtable and that means that the keys have to be <span class="c010">hashable</span>.
<a id="hevea_default938"></a>
<a id="hevea_default939"></a></p><p>A <span class="c010">hash</span> is a function that takes a value (of any kind)
and returns an integer. Dictionaries use these integers,
called hash values, to store and look up key-value pairs.
<a id="hevea_default940"></a></p><p>This system works fine if the keys are immutable. But if the
keys are mutable, like lists, bad things happen. For example,
when you create a key-value pair, Python hashes the key and
stores it in the corresponding location. If you modify the
key and then hash it again, it would go to a different location.
In that case you might have two entries for the same key,
or you might not be able to find a key. Either way, the
dictionary wouldn’t work correctly.</p><p>That’s why keys have to be hashable, and why mutable types like
lists aren’t. The simplest way to get around this limitation is to
use tuples, which we will see in the next chapter.</p><p>Since dictionaries are mutable, they can’t be used as keys,
but they <em>can</em> be used as values.</p>
<h2 class="section" id="sec135">11.6  Memos</h2>
<p>
<a id="memoize"></a></p><p>If you played with the <span class="c004">fibonacci</span> function from
Section <a href="thinkpython2007.html#one.more.example">6.7</a>, you might have noticed that the bigger
the argument you provide, the longer the function takes to run.
Furthermore, the run time increases quickly.
<a id="hevea_default941"></a>
<a id="hevea_default942"></a></p><p>To understand why, consider Figure <a href="thinkpython2012.html#fig.fibonacci">11.2</a>, which shows
the <span class="c010">call graph</span> for <span class="c004">fibonacci</span> with <span class="c004">n=4</span>:</p><blockquote class="figure"><div class="center"><hr class="c019"></div>
<div class="center"><img src="images/thinkpython2017.png"></div>
<div class="caption"><table class="c001 cellpading0"><tr><td class="c018">Figure 11.2: Call graph.</td></tr>
</table></div>
<a id="fig.fibonacci"></a>
<div class="center"><hr class="c019"></div></blockquote><p>A call graph shows a set of function frames, with lines connecting each
frame to the frames of the functions it calls. At the top of the
graph, <span class="c004">fibonacci</span> with <span class="c004">n=4</span> calls <span class="c004">fibonacci</span> with <span class="c004">n=3</span> and <span class="c004">n=2</span>. In turn, <span class="c004">fibonacci</span> with <span class="c004">n=3</span> calls
<span class="c004">fibonacci</span> with <span class="c004">n=2</span> and <span class="c004">n=1</span>. And so on.
<a id="hevea_default943"></a>
<a id="hevea_default944"></a>
<a id="hevea_default945"></a></p><p>Count how many times <span class="c004">fibonacci(0)</span> and <span class="c004">fibonacci(1)</span> are
called. This is an inefficient solution to the problem, and it gets
worse as the argument gets bigger.
<a id="hevea_default946"></a></p><p>One solution is to keep track of values that have already been
computed by storing them in a dictionary. A previously computed value
that is stored for later use is called a <span class="c010">memo</span>. Here is a
“memoized” version of <span class="c004">fibonacci</span>:</p><pre class="verbatim">known = {0:0, 1:1}
def fibonacci(n):
if n in known:
return known[n]
res = fibonacci(n-1) + fibonacci(n-2)
known[n] = res
return res
</pre><p><span class="c004">known</span> is a dictionary that keeps track of the Fibonacci
numbers we already know. It starts with
two items: 0 maps to 0 and 1 maps to 1.</p><p>Whenever <span class="c004">fibonacci</span> is called, it checks <span class="c004">known</span>.
If the result is already there, it can return
immediately. Otherwise it has to
compute the new value, add it to the dictionary, and return it.</p><p>If you run this version of <span class="c004">fibonacci</span> and compare it with
the original, you will find that it is much faster.</p>
<h2 class="section" id="sec136">11.7  Global variables</h2>
<p>
<a id="hevea_default947"></a>
<a id="hevea_default948"></a></p><p>In the previous example, <span class="c004">known</span> is created outside the function,
so it belongs to the special frame called <code>__main__</code>.
Variables in <code>__main__</code> are sometimes called <span class="c010">global</span>
because they can be accessed from any function. Unlike local
variables, which disappear when their function ends, global variables
persist from one function call to the next.
<a id="hevea_default949"></a></p><p>It is common to use global variables for <span class="c010">flags</span>; that is,
boolean variables that indicate (“flag”) whether a condition
is true. For example, some programs use
a flag named <span class="c004">verbose</span> to control the level of detail in the
output:</p><pre class="verbatim">verbose = True
def example1():
if verbose:
print('Running example1')
</pre><p>If you try to reassign a global variable, you might be surprised.
The following example is supposed to keep track of whether the
function has been called:
<a id="hevea_default950"></a></p><pre class="verbatim">been_called = False
def example2():
been_called = True # WRONG
</pre><p>But if you run it you will see that the value of <code>been_called</code>
doesn’t change. The problem is that <span class="c004">example2</span> creates a new local
variable named <code>been_called</code>. The local variable goes away when
the function ends, and has no effect on the global variable.
<a id="hevea_default951"></a>
<a id="hevea_default952"></a>
<a id="hevea_default953"></a></p><p>To reassign a global variable inside a function you have to
<span class="c010">declare</span> the global variable before you use it:</p><pre class="verbatim">been_called = False
def example2():
global been_called
been_called = True
</pre><p>The <span class="c010">global statement</span> tells the interpreter
something like, “In this function, when I say <code>been_called</code>, I
mean the global variable; don’t create a local one.”
<a id="hevea_default954"></a>
<a id="hevea_default955"></a></p><p>Here’s an example that tries to update a global variable:</p><pre class="verbatim">count = 0
def example3():
count = count + 1 # WRONG
</pre><p>If you run it you get:
<a id="hevea_default956"></a>
<a id="hevea_default957"></a></p><pre class="verbatim">UnboundLocalError: local variable 'count' referenced before assignment
</pre><p>Python assumes that <span class="c004">count</span> is local, and under that assumption
you are reading it before writing it. The solution, again,
is to declare <span class="c004">count</span> global.
<a id="hevea_default958"></a></p><pre class="verbatim">def example3():
global count
count += 1
</pre><p>If a global variable refers to a mutable value, you can modify
the value without declaring the variable:
<a id="hevea_default959"></a></p><pre class="verbatim">known = {0:0, 1:1}
def example4():
known[2] = 1
</pre><p>So you can add, remove and replace elements of a global list or
dictionary, but if you want to reassign the variable, you
have to declare it:</p><pre class="verbatim">def example5():
global known
known = dict()
</pre><p>Global variables can be useful, but if you have a lot of them,
and you modify them frequently, they can make programs
hard to debug.</p>
<h2 class="section" id="sec137">11.8  Debugging</h2>
<p>
<a id="hevea_default960"></a></p><p>As you work with bigger datasets it can become unwieldy to
debug by printing and checking the output by hand. Here are some
suggestions for debugging large datasets:</p><dl class="description"><dt class="dt-description"><span class="c010">Scale down the input:</span></dt><dd class="dd-description"> If possible, reduce the size of the
dataset. For example if the program reads a text file, start with
just the first 10 lines, or with the smallest example you can find.
You can either edit the files themselves, or (better) modify the
program so it reads only the first <span class="c004">n</span> lines.<p>If there is an error, you can reduce <span class="c004">n</span> to the smallest
value that manifests the error, and then increase it gradually
as you find and correct errors.</p></dd><dt class="dt-description"><span class="c010">Check summaries and types:</span></dt><dd class="dd-description"> Instead of printing and checking the
entire dataset, consider printing summaries of the data: for example,
the number of items in a dictionary or the total of a list of numbers.<p>A common cause of runtime errors is a value that is not the right
type. For debugging this kind of error, it is often enough to print
the type of a value.</p></dd><dt class="dt-description"><span class="c010">Write self-checks:</span></dt><dd class="dd-description"> Sometimes you can write code to check
for errors automatically. For example, if you are computing the
average of a list of numbers, you could check that the result is
not greater than the largest element in the list or less than
the smallest. This is called a “sanity check” because it detects
results that are “insane”.
<a id="hevea_default961"></a>
<a id="hevea_default962"></a><p>Another kind of check compares the results of two different
computations to see if they are consistent. This is called a
“consistency check”.</p></dd><dt class="dt-description"><span class="c010">Format the output:</span></dt><dd class="dd-description"> Formatting debugging output
can make it easier to spot an error. We saw an example in
Section <a href="thinkpython2007.html#factdebug">6.9</a>. The <span class="c004">pprint</span> module provides
a <span class="c004">pprint</span> function that displays built-in types in
a more human-readable format (<span class="c004">pprint</span> stands for
“pretty print”).
<a id="hevea_default963"></a>
<a id="hevea_default964"></a>
<a id="hevea_default965"></a></dd></dl><p>Again, time you spend building scaffolding can reduce
the time you spend debugging.
<a id="hevea_default966"></a></p>
<h2 class="section" id="sec138">11.9  Glossary</h2>
<dl class="description"><dt class="dt-description"><span class="c010">mapping:</span></dt><dd class="dd-description"> A relationship in which each element of one set
corresponds to an element of another set.
<a id="hevea_default967"></a></dd><dt class="dt-description"><span class="c010">dictionary:</span></dt><dd class="dd-description"> A mapping from keys to their
corresponding values.
<a id="hevea_default968"></a></dd><dt class="dt-description"><span class="c010">key-value pair:</span></dt><dd class="dd-description"> The representation of the mapping from
a key to a value.
<a id="hevea_default969"></a></dd><dt class="dt-description"><span class="c010">item:</span></dt><dd class="dd-description"> In a dictionary, another name for a key-value
pair.
<a id="hevea_default970"></a></dd><dt class="dt-description"><span class="c010">key:</span></dt><dd class="dd-description"> An object that appears in a dictionary as the
first part of a key-value pair.
<a id="hevea_default971"></a></dd><dt class="dt-description"><span class="c010">value:</span></dt><dd class="dd-description"> An object that appears in a dictionary as the
second part of a key-value pair. This is more specific than
our previous use of the word “value”.
<a id="hevea_default972"></a></dd><dt class="dt-description"><span class="c010">implementation:</span></dt><dd class="dd-description"> A way of performing a computation.
<a id="hevea_default973"></a></dd><dt class="dt-description"><span class="c010">hashtable:</span></dt><dd class="dd-description"> The algorithm used to implement Python
dictionaries.
<a id="hevea_default974"></a></dd><dt class="dt-description"><span class="c010">hash function:</span></dt><dd class="dd-description"> A function used by a hashtable to compute the
location for a key.
<a id="hevea_default975"></a></dd><dt class="dt-description"><span class="c010">hashable:</span></dt><dd class="dd-description"> A type that has a hash function. Immutable
types like integers,
floats and strings are hashable; mutable types like lists and
dictionaries are not.
<a id="hevea_default976"></a></dd><dt class="dt-description"><span class="c010">lookup:</span></dt><dd class="dd-description"> A dictionary operation that takes a key and finds
the corresponding value.
<a id="hevea_default977"></a></dd><dt class="dt-description"><span class="c010">reverse lookup:</span></dt><dd class="dd-description"> A dictionary operation that takes a value and finds
one or more keys that map to it.
<a id="hevea_default978"></a></dd><dt class="dt-description"><span class="c010">raise statement:</span></dt><dd class="dd-description"> A statement that (deliberately) raises an exception.
<a id="hevea_default979"></a>
<a id="hevea_default980"></a></dd><dt class="dt-description"><span class="c010">singleton:</span></dt><dd class="dd-description"> A list (or other sequence) with a single element.
<a id="hevea_default981"></a></dd><dt class="dt-description"><span class="c010">call graph:</span></dt><dd class="dd-description"> A diagram that shows every frame created during
the execution of a program, with an arrow from each caller to
each callee.
<a id="hevea_default982"></a>
<a id="hevea_default983"></a></dd><dt class="dt-description"><span class="c010">memo:</span></dt><dd class="dd-description"> A computed value stored to avoid unnecessary future
computation.
<a id="hevea_default984"></a></dd><dt class="dt-description"><span class="c010">global variable:</span></dt><dd class="dd-description"> A variable defined outside a function. Global
variables can be accessed from any function.
<a id="hevea_default985"></a></dd><dt class="dt-description"><span class="c010">global statement:</span></dt><dd class="dd-description"> A statement that declares a variable name
global.
<a id="hevea_default986"></a>
<a id="hevea_default987"></a></dd><dt class="dt-description"><span class="c010">flag:</span></dt><dd class="dd-description"> A boolean variable used to indicate whether a condition
is true.
<a id="hevea_default988"></a></dd><dt class="dt-description"><span class="c010">declaration:</span></dt><dd class="dd-description"> A statement like <span class="c004">global</span> that tells the
interpreter something about a variable.
<a id="hevea_default989"></a></dd></dl>
<h2 class="section" id="sec139">11.10  Exercises</h2>
<div class="theorem"><span class="c010">Exercise 1</span>  
<a id="wordlist2"></a>
<a id="hevea_default990"></a>
<a id="hevea_default991"></a><p><em>Write a function that reads the words in <span class="c004">words.txt</span> and
stores them as keys in a dictionary. It doesn’t matter what the
values are. Then you can use the <span class="c004">in</span> operator
as a fast way to check whether a string is in
the dictionary.</em></p><p><em>If you did Exercise </em><a href="thinkpython2011.html#wordlist1"><em>10</em></a><em>, you can compare the speed
of this implementation with the list <span class="c004">in</span> operator and the
bisection search.</em></p></div><div class="theorem"><span class="c010">Exercise 2</span>  
<a id="setdefault"></a><p><em>Read the documentation of the dictionary method <span class="c004">setdefault</span>
and use it to write a more concise version of <code>invert_dict</code>.
Solution: </em><a href="http://thinkpython2.com/code/invert_dict.py"><span class="c004"><em>http://thinkpython2.com/code/invert_dict.py</em></span></a><em>.
</em><a id="hevea_default992"></a>
<a id="hevea_default993"></a></p></div><div class="theorem"><span class="c010">Exercise 3</span>  <em>
Memoize the Ackermann function from Exercise </em><a href="thinkpython2007.html#ackermann"><em>2</em></a><em> and see if
memoization makes it possible to evaluate the function with bigger
arguments. Hint: no.
Solution: </em><a href="http://thinkpython2.com/code/ackermann_memo.py"><span class="c004"><em>http://thinkpython2.com/code/ackermann_memo.py</em></span></a><em>.
</em><a id="hevea_default994"></a>
<a id="hevea_default995"></a></div><div class="theorem"><span class="c010">Exercise 4</span>  
<a id="hevea_default996"></a><p><em>If you did Exercise </em><a href="thinkpython2011.html#duplicate"><em>7</em></a><em>, you already have
a function named <code>has_duplicates</code> that takes a list
as a parameter and returns <span class="c004">True</span> if there is any object
that appears more than once in the list.</em></p><p><em>Use a dictionary to write a faster, simpler version of
<code>has_duplicates</code>.
Solution: </em><a href="http://thinkpython2.com/code/has_duplicates.py"><span class="c004"><em>http://thinkpython2.com/code/has_duplicates.py</em></span></a><em>.</em></p></div><div class="theorem"><span class="c010">Exercise 5</span>  
<a id="exrotatepairs"></a>
<a id="hevea_default997"></a>
<a id="hevea_default998"></a><p><em>Two words are “rotate pairs” if you can rotate one of them
and get the other (see <code>rotate_word</code> in Exercise </em><a href="thinkpython2009.html#exrotate"><em>5</em></a><em>).</em></p><p><em>Write a program that reads a wordlist and finds all the rotate
pairs. Solution: </em><a href="http://thinkpython2.com/code/rotate_pairs.py"><em><span class="c004">http://thinkpython2.com/code/rotate_pairs.py</span></em></a><em>.</em></p></div><div class="theorem"><span class="c010">Exercise 6</span>  
<a id="hevea_default999"></a>
<a id="hevea_default1000"></a><p><em>Here’s another Puzzler from </em>Car Talk<em>
(</em><a href="http://www.cartalk.com/content/puzzlers"><em><span class="c004">http://www.cartalk.com/content/puzzlers</span></em></a><em>):</em></p><blockquote class="quote"><em>
This was sent in by a fellow named Dan O’Leary. He came upon a common
one-syllable, five-letter word recently that has the following unique
property. When you remove the first letter, the remaining letters form
a homophone of the original word, that is a word that sounds exactly
the same. Replace the first letter, that is, put it back and remove
the second letter and the result is yet another homophone of the
original word. And the question is, what’s the word?</em><p><em>Now I’m going to give you an example that doesn’t work. Let’s look at
the five-letter word, ‘wrack.’ W-R-A-C-K, you know like to ‘wrack with
pain.’ If I remove the first letter, I am left with a four-letter
word, ’R-A-C-K.’ As in, ‘Holy cow, did you see the rack on that buck!
It must have been a nine-pointer!’ It’s a perfect homophone. If you
put the ‘w’ back, and remove the ‘r,’ instead, you’re left with the
word, ‘wack,’ which is a real word, it’s just not a homophone of the
other two words.</em></p><p><em>But there is, however, at least one word that Dan and we know of,
which will yield two homophones if you remove either of the first two
letters to make two, new four-letter words. The question is, what’s
the word?
</em></p></blockquote><p>
<a id="hevea_default1001"></a>
<a id="hevea_default1002"></a>
<a id="hevea_default1003"></a></p><p><em>You can use the dictionary from Exercise </em><a href="thinkpython2012.html#wordlist2"><em>1</em></a><em> to check
whether a string is in the word list.</em></p><p><em>To check whether two words are homophones, you can use the CMU
Pronouncing Dictionary. You can download it from
</em><a href="http://www.speech.cs.cmu.edu/cgi-bin/cmudict"><span class="c004"><em>http://www.speech.cs.cmu.edu/cgi-bin/cmudict</em></span></a><em> or from
</em><a href="http://thinkpython2.com/code/c06d"><span class="c004"><em>http://thinkpython2.com/code/c06d</em></span></a><em> and you can also download
</em><a href="http://thinkpython2.com/code/pronounce.py"><span class="c004"><em>http://thinkpython2.com/code/pronounce.py</em></span></a><em>, which provides a function
named <code>read_dictionary</code> that reads the pronouncing dictionary and
returns a Python dictionary that maps from each word to a string that
describes its primary pronunciation.</em></p><p><em>Write a program that lists all the words that solve the Puzzler.
Solution: </em><a href="http://thinkpython2.com/code/homophone.py"><em><span class="c004">http://thinkpython2.com/code/homophone.py</span></em></a><em>.</em></p></div>
<p>
</td>
<td width=130 valign="top" id="col-right">
<p>
<h4>Are you using one of our books in a class?</h4> We'd like to know
about it. Please consider filling out <a href="http://spreadsheets.google.com/viewform?formkey=dC0tNUZkMjBEdXVoRGljNm9FRmlTMHc6MA" onClick="javascript: pageTracker._trackPageview('/outbound/survey');">this short survey</a>.
<p>
<br>
<p>
<a rel="nofollow" href="http://www.amazon.com/gp/product/1491938455/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1491938455&linkCode=as2&tag=greenteapre01-20&linkId=2JJH4SWCAVVYSQHO">Think DSP</a><img class="c003" src="http://ir-na.amazon-adsystem.com/e/ir?t=greenteapre01-20&l=as2&o=1&a=1491938455" width="1" height="1" border="0" alt="">
<p>
<a rel="nofollow" href="http://www.amazon.com/gp/product/1491938455/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1491938455&linkCode=as2&tag=greenteapre01-20&linkId=CTV7PDT7E5EGGJUM"><img border="0" src="http://ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=1491938455&Format=_SL160_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=greenteapre01-20"></a><img class="c003" src="http://ir-na.amazon-adsystem.com/e/ir?t=greenteapre01-20&l=as2&o=1&a=1491938455" width="1" height="1" border="0" alt="">
<p>
<a rel="nofollow" href="http://www.amazon.com/gp/product/1491929561/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1491929561&linkCode=as2&tag=greenteapre01-20&linkId=ZY6MAYM33ZTNSCNZ">Think Java</a><img class="c003" src="http://ir-na.amazon-adsystem.com/e/ir?t=greenteapre01-20&l=as2&o=1&a=1491929561" width="1" height="1" border="0" alt="">
<p>
<a rel="nofollow" href="http://www.amazon.com/gp/product/1491929561/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1491929561&linkCode=as2&tag=greenteapre01-20&linkId=PT77ANWARUNNU3UK"><img border="0" src="http://ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=1491929561&Format=_SL160_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=greenteapre01-20"></a><img class="c003" src="http://ir-na.amazon-adsystem.com/e/ir?t=greenteapre01-20&l=as2&o=1&a=1491929561" width="1" height="1" border="0" alt="">
<p>
<a href="http://www.amazon.com/gp/product/1449370780/ref=as_li_qf_sp_asin_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1449370780&linkCode=as2&tag=greenteapre01-20">Think Bayes</a><img class="c003" src="http://ir-na.amazon-adsystem.com/e/ir?t=greenteapre01-20&l=as2&o=1&a=1449370780" width="1" height="1" border="0" alt="">
<p>
<a href="http://www.amazon.com/gp/product/1449370780/ref=as_li_qf_sp_asin_il?ie=UTF8&camp=1789&creative=9325&creativeASIN=1449370780&linkCode=as2&tag=greenteapre01-20"><img border="0" src="http://ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=1449370780&Format=_SL160_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=greenteapre01-20"></a><img class="c003" src="http://ir-na.amazon-adsystem.com/e/ir?t=greenteapre01-20&l=as2&o=1&a=1449370780" width="1" height="1" border="0" alt="">
<p>
<a rel="nofollow" href="http://www.amazon.com/gp/product/1491939362/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1491939362&linkCode=as2&tag=greenteapre01-20&linkId=FJKSQ3IHEMY2F2VA">Think Python 2e</a><img class="c003" src="http://ir-na.amazon-adsystem.com/e/ir?t=greenteapre01-20&l=as2&o=1&a=1491939362" width="1" height="1" border="0" alt="">
<p>
<a rel="nofollow" href="http://www.amazon.com/gp/product/1491939362/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1491939362&linkCode=as2&tag=greenteapre01-20&linkId=ZZ454DLQ3IXDHNHX"><img border="0" src="http://ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=1491939362&Format=_SL160_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=greenteapre01-20"></a><img class="c003" src="http://ir-na.amazon-adsystem.com/e/ir?t=greenteapre01-20&l=as2&o=1&a=1491939362" width="1" height="1" border="0" alt="">
<p>
<a href="http://www.amazon.com/gp/product/1491907339/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1491907339&linkCode=as2&tag=greenteapre01-20&linkId=O7WYM6H6YBYUFNWU">Think Stats 2e</a><img class="c003" src="http://ir-na.amazon-adsystem.com/e/ir?t=greenteapre01-20&l=as2&o=1&a=1491907339" width="1" height="1" border="0" alt="">
<p>
<a href="http://www.amazon.com/gp/product/1491907339/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=1491907339&linkCode=as2&tag=greenteapre01-20&linkId=JVSYKQHYSUIEYRHL"><img border="0" src="http://ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=1491907339&Format=_SL160_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=greenteapre01-20"></a><img class="c003" src="http://ir-na.amazon-adsystem.com/e/ir?t=greenteapre01-20&l=as2&o=1&a=1491907339" width="1" height="1" border="0" alt="">
<p>
<a href="http://www.amazon.com/gp/product/1449314635/ref=as_li_tf_tl?ie=UTF8&tag=greenteapre01-20&linkCode=as2&camp=1789&creative=9325&creativeASIN=1449314635">Think Complexity</a><img class="c003" src="http://www.assoc-amazon.com/e/ir?t=greenteapre01-20&l=as2&o=1&a=1449314635" width="1" height="1" border="0" alt="">
<p>
<a href="http://www.amazon.com/gp/product/1449314635/ref=as_li_tf_il?ie=UTF8&camp=1789&creative=9325&creativeASIN=1449314635&linkCode=as2&tag=greenteapre01-20"><img border="0" src="http://ws-na.amazon-adsystem.com/widgets/q?_encoding=UTF8&ASIN=1449314635&Format=_SL160_&ID=AsinImage&MarketPlace=US&ServiceVersion=20070822&WS=1&tag=greenteapre01-20"></a><img class="c003" src="http://www.assoc-amazon.com/e/ir?t=greenteapre01-20&l=as2&o=1&a=1449314635" width="1" height="1" border="0" alt="">
</td>
</tr>
</table>
<nav class="navbar navbar-default navbar-fixed-top">
<div class="container-fluid">
<!-- Brand and toggle get grouped for better mobile display -->
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1" aria-expanded="false">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<a class="navbar-brand" href="#"><strong>Think Python</strong> - How to Think like a Computer Scientist (2e) <em>by Allen B. Downey</em></a>
</div>
<div>
<ul class="nav navbar-nav navbar-right">
<li><a href="http://greenteapress.com/thinkpython2/html/index.html"><span class="glyphicon glyphicon glyphicon-book" aria-hidden="true"></span></a></li>
<li><a href="thinkpython2011.html"><span class="glyphicon glyphicon glyphicon-menu-left" aria-hidden="true"></span></a></li>
<li><a href="index.html"><span class="glyphicon glyphicon glyphicon-home" aria-hidden="true"></span></a></li>
<li><a href="thinkpython2013.html"><span class="glyphicon glyphicon glyphicon-menu-right" aria-hidden="true"></span></a></li>
<li><a href="http://amzn.to/1VUYQUU"><span class="glyphicon glyphicon glyphicon-shopping-cart" aria-hidden="true"></span></a></li>
</ul>
<div>
</div><!-- /.container-fluid -->
</nav></body>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.3/jquery.min.js"></script>
<script src="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.6/js/bootstrap.min.js" integrity="sha384-0mSbJDEHialfmuBBQP6A4Qrprq5OVfW37PRR3j5ELqxss1yVqOtnepnHVP9aJ7xS" crossorigin="anonymous"></script>
</html>