-
Notifications
You must be signed in to change notification settings - Fork 2
Expand file tree
/
Copy path05-iteration.Rmd
More file actions
executable file
·258 lines (181 loc) · 15.4 KB
/
05-iteration.Rmd
File metadata and controls
executable file
·258 lines (181 loc) · 15.4 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
# Iteration and Loops
One of the main benefits of using computers to perform tasks is that computers never get tired or bored, and so can do the same thing over and over and over and over and over and over again. This is a process known as **iteration**. Iteration represents another form of _control flow_ (similar to conditionals), and is specified in a program using a set of statements called **loops**. In this module, you will learn the basics of writing loops to perform iteration; more advanced iteration concepts will be covered in later modules.
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Contents**
- [Resources](#resources)
- [While Loops](#while-loops)
- [Counting and Loops](#counting-and-loops)
- [Conditionals and Sentinels](#conditionals-and-sentinels)
- [For Loops](#for-loops)
- [Difference from While Loops](#difference-from-while-loops)
- [Working with Files](#working-with-files)
- [Try/Except](#tryexcept)
<!-- END doctoc generated TOC please keep comment here to allow auto update -->
## Resources
- [Iterations (Downey)](https://books.trinket.io/pfe/05-iterations.html)
- [Flow Control (Sweigart)](https://automatetheboringstuff.com/chapter2/) (second half)
- [Iteration (Severance)](http://openbookproject.net/thinkcs/python/english3e/iteration.html)
- [Files (Downey)](https://books.trinket.io/pfe/07-files.html)
- [Files and File Paths (Sweigart)](https://automatetheboringstuff.com/chapter8/)
## While Loops
Loops are **control structures** (similar to `if` statements) that allow you to perform iteration. These statements specify that a _block_ of code should be executed repeatedly—the block is executed statement by statement, and then the flow of control "loops" back to the top to execute the statements again. Programming languages such as Python support a number of different kinds of loops, which differ primarily in how they determine whether or not to repeat the block (though this difference is reflected in the syntax).
In programming, the most basic _control structure_ used for iteration is known as a **while loop**, which is used for "indefinite iteration" A Python while loop has the following structure:
```python
while condition:
# lines of code to run if the condition is True
```
This construction looks much like an `if` statement, and is similar in many regards. As with an `if` statement, the **condition** can be any Boolean value (or any expression that evaluates to a boolean value), and it determines whether or not the loop's _block_ will be executed.
In order to understand how a while loop influences the flow of program control, consider a more concrete example:
```python
count = 5
while count > 0:
print(count)
count = count - 1
print("Blastoff!")
```
In this example, when the interpreter reaches the `while` statement, it first checks the condition (`count > 0`, where `count` is 5). Finding that condition to be `True`, the interpreter then executes the block, printing the `count` and then _decrementing_ it. Once the block is executed, the interpreter loops back to the `while` statement and rechecks the condition. Finding that `count` (4) is still greater than 0, it executes the block again (causing `count` to decrement again). This continues until the interpreter loops to the `while` statement and finds that `count` has reached 0, and thus is no longer greater than 0. Since the condition is now `False`, the interpreter does _not_ enter the loop, and proceeds to the following statement ("Blastoff!").
**Importantly**, the condition is only checked when the block is _about_ to execute: both at the "start" of the loop, and then at the beginning of each subsequent iteration. Having the condition become `True` in the middle of the block (e.g., temporarily) will have no impact on the control flow. It is also possible for the interpreter to "never enter" the loop if the condition is not initially `True`.
### Counting and Loops
The above example also demonstrates how to use a "counter" to determine whether or not the loop has run a sufficient number of times: this is known as a **loop control variable (LCV)**. The "standard" counting loop looks like:
```python
count = 0 # 1. initialize the counter
while count < 100: # 2. check if the counter has reached its target
print(count) # 3. do some work (this may be multiple statements)
count = count + 1 # 4. update the counter
```
In order for a counted loop to work properly, you need to be careful about steps 2 and 4: the condition and the update.
First, recall that the condition is _whether to run the loop_, not _whether to stop_:
```python
count = 0
while count == 100: # bad condition!
print(count)
count = count + 1
```
In this case, the condition is not initially True, so the interpreter never enters the loop.
- When writing conditions, think _"do we keep going"_ rather than _"are we there yet?"_. In loops (as in life), the journey is more important than the destination!
Second, consider what happens if you forget to update the loop control variable:
```python
count = 0
while count < 100
print(count)
# no counter update
```
In this case, the interpreter checks that `count` (0) is less than 100, then runs the loop. Then checks that `count` (still 0) is less than 100, then runs the loop. Then checks that `count` (_still_ 0) is less than 100, then runs the loop...
This is known as an **infinite loop**: the loop will run forever, never being able to "break out" and reach the next statement.
- If you hit an infinite loop in Jupyter Notebook, use `Kernel > Interrupt` to break it and try again. If running a `.py` file on the command-line, use `Ctrl-C` to cancel the script.
There are lots of ways to accidently produce an infinite loop:
1. Having a condition that is "too exact" can cause the loop control variable to "miss" a particular breaking value:
```python
count = 0
while count != 100: # if we aren't yet at 100
print(count)
count = count + 3 # this will never equal 100
```
Thus it is always safe to use **inequalities** (e.g., `<` or `>`) when writing loop conditions.
2. Resetting the counter in the body of the loop can cause it to never reach its goal:
```python
count = 0
while count < 100:
count = 0 # this resets the count!
print(count)
count = count + 1
```
The best way to catch these errors is to "play computer": pretend that you are the compiler, and go through each statement one by one, keeping track of the loop control variable (writing its value down on a sheet of paper does wonders). This will help you be able to "trace" what your program is doing and catch any bugs there may be.
### Conditionals and Sentinels
As with `if` statements the block for a `while` loop can contain any valid Python statements, including `if` statements or even other loops (called a "nested loop"). Control statements such as `if` are intended an extra step (4 spaces or 1 tab):
```python
# flip a coin until it shows up heads
still_flipping = True
while still_flipping:
flip = random.randint(0,1)
if flip == 0:
flip = "Heads"
else:
flip = "Tails"
print(flip)
if flip == "Heads":
still_flipping = False
```
In this example, the `still_flipping` boolean variable acts as the _loop control variable_, as it determines whether or not the loop is repeated. Using a Boolean as a LCV is known as using a **sentinel variable**. A sentinel (guard) variable is used to control whether or not the program flow gets out of the loop: as long as the sentinel is `True`, the loop continues to run. Thus the loop can be "exited" by assigning the sentinel to be `False`. This is particularly useful when there may be a complex set of conditions that need to be met before the program can carry on.
- It is of course possible to design a sentinel such as `done_flipping`, and then have the `while` condition check that the sentinel is **not** `True`. This may be useful depending on how you've structured the algorithm. In either case, be sure that your sentinels are named carefully and accurately reflect the information conveyed by the variable!
## For Loops
If you look back at the basic counting loop, you'll notice that tracking the `count` loop control variable can be problematic. It is easy to forget the update statement (`count = count + 1`), and the the `count` variable itself acts as an extra "global" (or "less local") variable that the interpreter needs to keep track of.
To avoid these problems, we can instead use a different kind of loop called a **for loop**. A for loop is used for "definite iteration", when we want execute a loop a specific number of times. The basic Python for loop has the following structure:
```python
for local_variable in range(maximum):
# lines of code to run for each number
```
For example, the basic counting while loop example could be rewritten using a simpler for loop:
```python
for count in range(100):
print(count)
```
[`range()`](https://docs.python.org/3/library/stdtypes.html#ranges) is a function that returns a value representing a sequence of numbers. It is an example of a **sequence** data structure, which is a way of representing multiple data items in a single variable. Sequences will be discussed more in later modules (including the most common type of sequence, a **list**). The `range()` function can be called with different arguments depending on the range and spread of numbers you wish to use:
```python
# numbers 0 to 10 (not inclusive)
# (0 through 9)
range(10)
# numbers 1 through 11 (not inclusive)
# (1 through 10)
range(1,11)
# numbers 0 through 10 (not inclusive), skipping by 3
# (0, 3, 6, 9)
range(0, 10, 3)
```
Thus `for count in range(100)` can be read as "for _each_ number (called count) from 0 to 100".
For loops may more properly be thought of as **"for each"** loops; they are used to go through the items in a collection (e.g., each number in a `range`), executing the loop body once for each item. The **`local_variable`** (e.g., `count`) in a for loop is _implicitly_ assigned the value of the "current" item in the collection (e.g., which number in the range we're on) at each iteration.
- For ranges, this basically means that the for loops keeps track of the current iteration; but the idea of this local variable will be more important as we introduce additional collection types.
### Difference from While Loops
The main difference between while loops and for loops is:
> while loops are used for **indefinite** iteration; for loops are used for **definite** iteration.
While loops are appropriate when the interpreter doesn't know _in advance_ (before the loop starts) how many times the loop block will be executed: the loop does not have a definite number of iterations. On the other hand, a for loop is appropriate when the interpreter does know _in advance_ (before the loop starts) how many times the block will be executed—a definite number of iterations. Note that that number of iterations may be a variable so not determined until runtime; however, the value of that variable will still be known when the `for` statement is executed.
All iteration can be written with a while loop; but when performing definite iteration, it is easier, faster, and more idiomatic to use a for loop!
## Working with Files
For loops can be used to iterate through any collection (technically any "iterable" type). One of the more useful collections when working with data is external **files** (e.g., text files). Files can be treated as a collection or sequence of _lines_ (each divided by a `\n` newline character), and thus Python can "read" a text file using a for loop to iterate over the lines of text in the file.
In order to read or write text file data, you use the built-in [`open()`](https://docs.python.org/3/library/functions.html#open) function, passing it the _path_ to the file you wish to access. This function will then return an object representing that particular file (e.g., it's location on the disk), with methods that you can use to read from and write to it.
- Remember to **always** use _relative_ paths. Note that when using a Jupyter Notebook, the "current working directory" is the direction in which you ran the `jupyter notebook` command to start the server.
```python
my_file = open('myfile.txt') # open the file
for line in my_file:
print(line) # print each line in the file
```
Once you have opened a file, you can use a for loop to iterate through its line (as in the example above). You can also use a while loop, calling the [`readline()`](https://docs.python.org/3.3/tutorial/inputoutput.html#methods-of-file-objects) method _on_ the file in order to read a single line at a time.
It is also possible to write out content to a file. To do this, you need to open the file with "write" access (allowing the program to write to and modify it) by passing `w` as the second argument to the `open()` function. You can then use the `write()` method to "print" text to the file:
```python
# "open" the file with "write" access
file = open('myfile.txt', 'w')
file.write("Hello world!\n")
file.write("It's a mighty fine morning\n")
```
- Note that unlike the `print()` function, `write()` does not include a line break at the end of each method call; you need to add those yourself!
### Try/Except
File operations rely on a context that is internal to the program itself: namely, that the file you wish to open actually exists at the location you specify! But that may not be the case—particularly if which file to open is specified by the user:
```python
filename = input("File to open: ") # which file to open
file = open(filename) # "open" the file
for line in file:
print(line)
```
If the user provides a bad file name, your program will encounter an error _through no fault of your own as a programmer_:
```bash
$ python script.py
File to open: neener neener
Traceback (most recent call last):
File "script.py", line 4, in <module>
file = open(filename)
FileNotFoundError: [Errno 2] No such file or directory: 'neener neener'
```
Since it's possible for the user to make a mistake, we could like the program not to simply fail with an error, but to instead be able to "recover" and keep running (e.g., by asking the user for a different file name). We can peform this kind of [**error handling**](https://docs.python.org/3/tutorial/errors.html) by utilizing a **`try`** statement with an **`except`** clause:
```python
filename = input("File to open: ")
try: # this might break (not our fault)
file = open(filename)
for line in file:
print(line)
except: # catching FileNotFoundError
print("No such file")
```
A `try` statement acts somewhat similar to an `if` statement; however, the `try` statement checks to see if any errors occur _within its block_. If such an error occurs, rather than the program catching, the interpreter will _immediately_ jump to the `except` class and execute that block, before continuing on with the rest of the program.
- Note that this is an exception to the general rule that conditions are only checked at the start of a block; a `try` block effectively tells the computer to keep an eye out for any errors ("try this, but it might break"), with the `except` clause specifying what to do if such an error occurs.
`try` statements are used when a program may hit an error that is _not caused by programmer's code, but by an external input_ (e.g., from a user or a file). You should not use a `try` statement to fix broken program logic or invalid syntax: instead, you should fix those problems directly!