Pandas_Select_Data_iloc

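iloc[] selects purely by integer position (0-based); it also accepts a boolean array.

The following input types are allowed:

  • A single integer, e.g. 5;
  • A list or array of integers, e.g. [4, 3, 0];
  • A slice object with integers, e.g. 1:7;
  • A boolean array;
  • A callable function.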
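As a minimal sketch of the five input types listed above, the following uses a small made-up DataFrame. The name `demo` and its values are illustrative assumptions, not part of the iris data.

```python
import numpy as np
import pandas as pd

# Hypothetical 4x3 frame with string row labels, so positions and labels differ
demo = pd.DataFrame(
    np.arange(12).reshape(4, 3),
    index=list("wxyz"),
    columns=["A", "B", "C"],
)

single = demo.iloc[1]                 # single integer: row at position 1, as a Series
picked = demo.iloc[[0, 2]]            # list of integers: rows at positions 0 and 2
sliced = demo.iloc[1:3]               # integer slice: positions 1 and 2 (end excluded)
mask = np.array([True, False, False, True])
masked = demo.iloc[mask]              # boolean array: rows where the mask is True
called = demo.iloc[lambda d: [0, 1]]  # callable returning any of the indexers above
```

In every case the row labels 'w' to 'z' play no part in the selection; only positions do.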
import pandas as pd
import numpy as np

# Load the iris dataset; the first line of the file is the header row
iris = pd.read_csv('iris.csv', header=0)
iris.head()

out:
sepal_length    sepal_width petal_length    petal_width species
0   5.1 3.5 1.4 0.2 setosa
1   4.9 3.0 1.4 0.2 setosa
2   4.7 3.2 1.3 0.2 setosa
3   4.6 3.1 1.5 0.2 setosa
4   5.0 3.6 1.4 0.2 setosa

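Modifying values

# Assign through a single iloc call (column position 0 is sepal_length).
# The chained form iris.sepal_length.iloc[:3] = 4 does not update iris
# when Copy-on-Write is enabled (the default from pandas 3.0).
iris.iloc[:3, 0] = 4
iris.head()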

out:
sepal_length    sepal_width petal_length    petal_width species
0   4.0 3.5 1.4 0.2 setosa
1   4.0 3.0 1.4 0.2 setosa
2   4.0 3.2 1.3 0.2 setosa
3   4.6 3.1 1.5 0.2 setosa
4   5.0 3.6 1.4 0.2 setosa
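When the column is known by name rather than by position, one possible approach is to look up its position with `columns.get_loc` and then assign in a single iloc call. The sketch below uses a small hypothetical frame (`df`, with made-up values modeled on the first iris rows) so that it runs without iris.csv.

```python
import pandas as pd

# Hypothetical stand-in for the first rows of the iris data
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 4.7, 4.6, 5.0],
    "sepal_width": [3.5, 3.0, 3.2, 3.1, 3.6],
})

# Translate the column name into its integer position
col = df.columns.get_loc("sepal_length")

# One indexing call sets the first three rows of that column
df.iloc[:3, col] = 4.0
```

Because the assignment goes through one indexer on the DataFrame itself, it does not depend on whether an intermediate Series is a view or a copy.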

Selecting with slices

# Rows at positions 2-3 and columns at positions 3-4 (slice ends are excluded)
iris.iloc[2:4, 3:5]

out:
petal_width species
2   0.2 setosa
3   0.2 setosa

# First three rows, all columns
iris.iloc[:3]

out:
sepal_length    sepal_width petal_length    petal_width species
0   4.0 3.5 1.4 0.2 setosa
1   4.0 3.0 1.4 0.2 setosa
2   4.0 3.2 1.3 0.2 setosa

# All rows, columns at positions 2 and 3
iris.iloc[:, 2:4].head()

out:
petal_length    petal_width
0   1.4 0.2
1   1.4 0.2
2   1.3 0.2
3   1.5 0.2
4   1.4 0.2

# Rows at positions 10 through 19, all columns
iris.iloc[10:20, :]

out:
sepal_length    sepal_width petal_length    petal_width species
10  5.4 3.7 1.5 0.2 setosa
11  4.8 3.4 1.6 0.2 setosa
12  4.8 3.0 1.4 0.1 setosa
13  4.3 3.0 1.1 0.1 setosa
14  5.8 4.0 1.2 0.2 setosa
15  5.7 4.4 1.5 0.4 setosa
16  5.4 3.9 1.3 0.4 setosa
17  5.1 3.5 1.4 0.3 setosa
18  5.7 3.8 1.7 0.3 setosa
19  5.1 3.8 1.5 0.3 setosa
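The slices above follow ordinary Python slice rules: the end position is excluded, and negative positions and steps work as well. A small sketch, assuming a hypothetical frame `demo` whose index labels start at 100 so that labels and positions are easy to tell apart:

```python
import pandas as pd

# Hypothetical frame: 10 rows labeled 100..109, at positions 0..9
demo = pd.DataFrame(
    {"x": range(10, 20), "y": range(20, 30)},
    index=range(100, 110),
)

first_three = demo.iloc[:3]    # positions 0, 1, 2
last_two = demo.iloc[-2:]      # negative positions count from the end
every_third = demo.iloc[::3]   # positions 0, 3, 6, 9
block = demo.iloc[2:4, 0:1]    # rows 2-3 and column 0 only, still a DataFrame
```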

Getting a single value

# Row position 2, column position 3 (petal_width)
iris.iloc[2, 3]

out:
0.2
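The type of the result depends on the kind of indexer that is passed. The following sketch, on a made-up two-column frame named `demo`, contrasts a scalar lookup with the Series and DataFrame results of closely related calls.

```python
import pandas as pd

# Hypothetical 3x2 frame used only for illustration
demo = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})

value = demo.iloc[2, 1]   # two integers: a single scalar
row = demo.iloc[2]        # one integer: the row as a Series
frame = demo.iloc[[2]]    # list with one integer: a one-row DataFrame
```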

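Slice bounds may exceed the index range without raising an error; the result is simply truncated at the last row. A single integer or a list of integers that is out of range, by contrast, raises an IndexError.
In the example below, the end position 160 exceeds the number of rows, which is 150.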

iris.iloc[145:160]

out:
sepal_length    sepal_width petal_length    petal_width species
145 6.7 3.0 5.2 2.3 virginica
146 6.3 2.5 5.0 1.9 virginica
147 6.5 3.0 5.2 2.0 virginica
148 6.2 3.4 5.4 2.3 virginica
149 5.9 3.0 5.1 1.8 virginica
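The tolerance for out-of-range positions applies to slices only. A short sketch of the contrast, using a hypothetical five-row frame `demo`:

```python
import pandas as pd

# Hypothetical frame with 5 rows (positions 0..4)
demo = pd.DataFrame({"v": range(5)})

tail = demo.iloc[3:100]    # slice running past the end is truncated to rows 3 and 4
empty = demo.iloc[50:60]   # slice entirely out of range gives an empty frame

# A single out-of-range integer is an error rather than an empty result
try:
    demo.iloc[100]
    raised = False
except IndexError:
    raised = True
```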

Selecting with a callable

# The callable receives the DataFrame and returns any valid iloc indexer
df = pd.DataFrame(np.random.randn(6, 4), index=list('abcdef'), columns=list('ABCD'))
df.iloc[lambda df: [1, 3]]

out:
    A   B   C   D
b   0.801227    0.864968    1.398595    0.948333
d   0.424825    -0.411694   -0.012486   -0.442733

# A callable may also return a boolean array
df.iloc[lambda df: df.index.isin(['a', 'c'])]

out:
    A   B   C   D
a   -1.585934   0.841256    2.189439    -0.509777
c   -1.831144   -0.946563   1.413636    -0.090958

# Callable on the column axis: columns at positions 2 and 3
df.iloc[:, lambda df: [2, 3]]

out:
    C   D
a   2.189439    -0.509777
b   1.398595    0.948333
c   1.413636    -0.090958
d   -0.012486   -0.442733
e   -2.336897   3.220324
f   1.023109    -1.247364
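A callable indexer is mostly useful in method chains, where the intermediate frame has no name to refer to. The sketch below assumes a made-up frame `demo`; the callable receives the already sorted frame and selects every row except the last one, by position.

```python
import pandas as pd

# Hypothetical unsorted frame used only for illustration
demo = pd.DataFrame({"k": [3, 1, 2], "v": [30, 10, 20]})

# The lambda's argument d is the sorted intermediate result,
# so no temporary variable is needed to compute its length
result = (
    demo.sort_values("k")
        .iloc[lambda d: list(range(len(d) - 1))]
)
```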