Iteration

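In programming, and especially when dealing with randomness, we often want to repeat a process many times. For example, recall the game of betting on one roll of a die, with the following rules: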

If the die shows 1 or 2 spots, my net gain is -1 dollar.
If the die shows 3 or 4 spots, my net gain is 0 dollars.
If the die shows 5 or 6 spots, my net gain is 1 dollar.

The function bet_on_one_roll takes no argument. Each time it is called, it simulates one roll of a fair die and returns the net gain in dollars.

[In ]:

from datascience import *
path_data = '../../../assets/data/'
import matplotlib
matplotlib.use('Agg')
%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')
import numpy as np

[In ]:

def bet_on_one_roll():
    """Returns my net gain on one bet"""
    x = np.random.choice(np.arange(1, 7))  # roll a die once and record the number of spots
    if x <= 2:
        return -1
    elif x <= 4:
        return 0
    elif x <= 6:
        return 1

Playing this game once is easy:

[In ]:

bet_on_one_roll()

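To get a sense of how variable the results are, we have to play the game over and over again. We could run the cell repeatedly, but that is tedious, and it becomes impractical if we want to do it a thousand or a million times.

A more automated solution is to use a for statement to loop over the contents of a sequence. This is called "iteration". A for statement begins with the word for, followed by a name we want to give each item in the sequence, followed by the word in, and ending with an expression that evaluates to a sequence. The indented body of the for statement is executed once for each item in that sequence.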

[In ]:

for animal in make_array('cat', 'dog', 'rabbit'):
    print(animal)

cat
dog
rabbit

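It is helpful to write code that exactly replicates a for statement without using the for statement. This is called "unrolling" the loop.

A for statement simply replicates the code inside it, but before each iteration, it assigns a new value from the given sequence to the name we chose. For example, here is an unrolled version of the loop above.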

[In ]:

animal = make_array('cat', 'dog', 'rabbit').item(0)
print(animal)
animal = make_array('cat', 'dog', 'rabbit').item(1)
print(animal)
animal = make_array('cat', 'dog', 'rabbit').item(2)
print(animal)

cat
dog
rabbit

Notice that the name animal is arbitrary, just like any name we assign with =.
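As a small sketch of this point, the loop below is the same loop with the item name changed to pet, a name chosen here only for illustration. It uses np.array in place of make_array so that it runs with NumPy alone, and the list printed is a hypothetical helper that records what the loop printed.

```python
import numpy as np

animals = np.array(['cat', 'dog', 'rabbit'])

# Same loop as above, but the item name is `pet` instead of `animal`.
printed = []  # hypothetical helper list, used only to record the printed items
for pet in animals:
    print(pet)
    printed.append(str(pet))
```

The loop prints the same three lines as before; only the name used inside the body has changed.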

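Here we use a for statement in a more realistic way: we print the results of betting five times on the die as described earlier. This is called "simulating" the results of five bets. We use the word "simulating" to remind ourselves that we are not physically rolling dice and exchanging money but using Python to mimic the process.

To repeat a process n times, it is common to use the sequence np.arange(n) in the for statement. It is also common to use a very short name for each item. In our code we will use the name i to remind ourselves that it refers to an item in the sequence.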

[In ]:

for i in np.arange(5):
    print(bet_on_one_roll())

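In this case, we simply perform exactly the same (random) action several times, so the code in the body of our for statement does not actually refer to i.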
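Even when the body ignores i, the name is still assigned each item of np.arange(5) in turn. The sketch below records those values; values_of_i is a hypothetical name introduced only for this illustration.

```python
import numpy as np

values_of_i = []  # hypothetical list that records each value assigned to i
for i in np.arange(5):
    values_of_i.append(int(i))

print(values_of_i)  # [0, 1, 2, 3, 4]
```

The body runs five times because np.arange(5) contains five items, the integers 0 through 4.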

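Augmenting Arrays

While the for statement above does simulate the results of five bets, the results are simply printed and are not in a form that we can use for computation. An array of results would be more useful. Thus a typical use of a for statement is to create an array of results by augmenting the array each time.

The append function in NumPy helps us do this. The call np.append(array_name, value) evaluates to a new array that is array_name augmented by value. When you use append, keep in mind that all the entries of an array must have the same type.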

[In ]:

pets = make_array('Cat', 'Dog')
np.append(pets, 'Another Pet')

array(['Cat', 'Dog', 'Another Pet'], dtype='<U11')

This leaves the array pets unchanged:

[In ]:

pets

array(['Cat', 'Dog'], dtype='<U3')

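But when using for loops, it is often convenient to update the array as we augment it. Because np.append does not change an array in place, we do this by assigning the augmented array to the same name as the original.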

[In ]:

pets = np.append(pets, 'Another Pet')
pets

array(['Cat', 'Dog', 'Another Pet'], dtype='<U11')
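The sketch below illustrates what this reassignment does. It assumes np.array as a stand-in for make_array, and first_pets is a hypothetical second name for the original array.

```python
import numpy as np

pets = np.array(['Cat', 'Dog'])
first_pets = pets  # hypothetical second name for the same two-item array

pets = np.append(pets, 'Another Pet')  # pets now names a new, longer array

print(len(first_pets))  # 2
print(len(pets))        # 3
```

The original two-item array still exists under the name first_pets. The assignment only makes the name pets refer to the new array that np.append returned.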

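Example: Betting on 5 Rolls

We can now simulate five bets on the die and collect the results in an array that we will call the "collection array". We will start by creating an empty array, and then append the outcome of each bet. Notice that the body of the for loop contains two statements. Both statements are executed for each item in the given sequence.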

[In ]:

outcomes = make_array()

for i in np.arange(5):
    outcome_of_bet = bet_on_one_roll()
    outcomes = np.append(outcomes, outcome_of_bet)
    
outcomes

array([-1., -1.,  1.,  1., -1.])
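The outcomes are displayed as -1. and 1. rather than -1 and 1 because the collection array holds floating-point numbers. The sketch below shows one way to check this, assuming that np.array([]) behaves like make_array() called with no arguments.

```python
import numpy as np

empty = np.array([])  # assumed stand-in for make_array() with no arguments
print(empty.dtype)    # float64

augmented = np.append(empty, -1)  # the appended value -1 is an integer
print(augmented)                  # [-1.]
print(augmented.dtype)            # float64
```

Since all the entries of an array must have the same type, the appended integer is stored as a float.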

Let us rewrite the cell with the for statement unrolled:

[In ]:

outcomes = make_array()

i = np.arange(5).item(0)
outcome_of_bet = bet_on_one_roll()
outcomes = np.append(outcomes, outcome_of_bet)

i = np.arange(5).item(1)
outcome_of_bet = bet_on_one_roll()
outcomes = np.append(outcomes, outcome_of_bet)

i = np.arange(5).item(2)
outcome_of_bet = bet_on_one_roll()
outcomes = np.append(outcomes, outcome_of_bet)

i = np.arange(5).item(3)
outcome_of_bet = bet_on_one_roll()
outcomes = np.append(outcomes, outcome_of_bet)

i = np.arange(5).item(4)
outcome_of_bet = bet_on_one_roll()
outcomes = np.append(outcomes, outcome_of_bet)

outcomes

array([ 1.,  0.,  0., -1.,  1.])

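The contents of the array are likely to be different from the array that we got by running the previous cell, but that is because of randomness in rolling the die. The process for creating the array is exactly the same.

By capturing the results in an array, we gain the ability to compute with them using array functions. For example, we can use np.count_nonzero to count the number of times money changed hands.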

[In ]:

np.count_nonzero(outcomes)
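Other computations on the array work the same way. As a sketch, the code below uses the five outcomes displayed by the unrolled cell above (a new run would give different values) and computes a few summaries with standard NumPy functions.

```python
import numpy as np

# The five outcomes displayed by the unrolled cell above.
outcomes = np.array([1., 0., 0., -1., 1.])

print(np.count_nonzero(outcomes))       # 3 bets where money changed hands
print(np.sum(outcomes))                 # 1.0, the total net gain in dollars
print(np.count_nonzero(outcomes == 1))  # 2 winning bets
```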

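Example: Betting on 300 Rolls

Iteration is a powerful technique. For example, we can see the variation in the results of 300 bets by running exactly the same code for 300 bets instead of five.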

[In ]:

outcomes = make_array()

for i in np.arange(300):
    outcome_of_bet = bet_on_one_roll()
    outcomes = np.append(outcomes, outcome_of_bet)

The array outcomes contains the results of all 300 bets.

[In ]:

len(outcomes)
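Since the only thing that changed between the two examples is the number of bets, one possible refinement is to wrap the loop in a function. The sketch below does that. The name simulate_bets is hypothetical, np.array([]) is assumed to be a stand-in for make_array(), and bet_on_one_roll is repeated so that the block runs on its own.

```python
import numpy as np

def bet_on_one_roll():
    """Returns my net gain on one bet (same rules as above)."""
    x = np.random.choice(np.arange(1, 7))  # roll a die once
    if x <= 2:
        return -1
    elif x <= 4:
        return 0
    else:
        return 1

def simulate_bets(num_bets):
    """Hypothetical helper: returns an array of net gains from num_bets bets."""
    outcomes = np.array([])  # empty collection array
    for i in np.arange(num_bets):
        outcomes = np.append(outcomes, bet_on_one_roll())
    return outcomes

outcomes = simulate_bets(300)
print(len(outcomes))  # 300
```

Calling simulate_bets(5) would repeat the five-bet example without copying the loop.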

To see how often the three different possible results appeared, we can use the array outcomes and Table methods.

[In ]:

outcome_table = Table().with_column('Outcome', outcomes)
outcome_table.group('Outcome').barh(0)

Bar plot with 'count' on the x-axis labeled from 0 to 100 and 'Outcome' on the y-axis. The first bar is at -1.0 Outcome with length around 90. The second and third bars are at 0.0 and 1.0 Outcome with the same length a little over 100.

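Not surprisingly, each of the three outcomes -1, 0, and 1 appeared about 100 of the 300 times, give or take. We will examine the "give or take" amounts more closely in later chapters.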
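The same counts can be read off as numbers rather than from a bar chart. The sketch below is an alternative to the Table approach, not the method used in this section. It relies on np.unique with return_counts=True, and as a shortcut it draws all 300 rolls in one call to np.random.choice instead of using a for loop.

```python
import numpy as np

# Simulate 300 rolls at once (a shortcut assumed here in place of the for loop).
rolls = np.random.choice(np.arange(1, 7), 300)

# Map spots to net gain: 1 or 2 gives -1, 3 or 4 gives 0, 5 or 6 gives 1.
outcomes = (rolls - 1) // 2 - 1

# Count how many times each distinct outcome appeared.
values, counts = np.unique(outcomes, return_counts=True)
for value, count in zip(values, counts):
    print(value, count)
```

Each run prints different counts, but they always add up to 300 and each is typically close to 100.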