pandas is the main tool of interest here: it provides data structures and data manipulation tools designed to make data cleaning and analysis fast and easy in Python.
While pandas adopts many coding idioms from NumPy, the biggest difference is that pandas is designed for working with tabular or heterogeneous data. NumPy, by contrast, is best suited for working with homogeneous numerical array data.
5.1 Introduction to pandas Data Structures
5.1.1 Series:
Characteristics:
- one-dimensional
- array-like object, similar to a NumPy ndarray
- holds a sequence of values
- has an associated array of data labels, called its index
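The characteristics above can be seen on a small Series. This is a minimal sketch; the variable names and sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# A Series built from a plain list gets a default integer index.
obj = pd.Series([4, 7, -5, 3])

values = obj.to_numpy()  # the sequence of values, as a NumPy ndarray
index = obj.index        # the associated data labels (a RangeIndex here)

print(obj.ndim)          # number of dimensions
print(values)
print(index)
```

The values and the index are stored separately but stay linked: each value is addressed by its label.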
Creation:
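1. From a list: with a user-defined index, or without one (the default index is the integers 0 to N - 1).
2. From a dictionary: by default the dict keys become the index; you can also pass an explicit index, in which case labels missing from the dict get NaN.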
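Both creation routes can be sketched as follows. The state names and numbers are illustrative sample data, not taken from these notes.

```python
import pandas as pd

# 1. From a list, without and with a user-defined index.
s1 = pd.Series([4, 7, -5, 3])
s2 = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])

# 2. From a dictionary: the keys become the index.
sdata = {"Ohio": 35000, "Texas": 71000, "Oregon": 16000, "Utah": 5000}
s3 = pd.Series(sdata)

# Passing an explicit index selects and orders the labels.
# "California" is not a key in sdata, so its value is NaN;
# "Utah" is not in the requested index, so it is dropped.
states = ["California", "Ohio", "Oregon", "Texas"]
s4 = pd.Series(sdata, index=states)

print(s4)
```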
Alter Index:
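One way to alter the index is to assign a new sequence of labels to the index attribute in place. A minimal sketch, assuming this is the technique the heading refers to; the names are illustrative.

```python
import pandas as pd

obj = pd.Series([4, 7, -5, 3])

# Replace the default 0..3 index in place.
# The new label list must have the same length as the Series.
obj.index = ["Bob", "Steve", "Jeff", "Ryan"]

print(obj)
```

The values are untouched; only the labels attached to them change.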
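Series Computation:
1. NumPy-style array operations (Boolean filtering, scalar multiplication, math functions) work on a Series and preserve the link between index and value.
2. pd.isnull(pd_series) and pd.notnull(pd_series) detect missing data; each returns a Boolean Series.
3. Important: arithmetic operations (+, -, *, /, ...) automatically align by index label; a label present in only one operand yields NaN.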
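The three points above can be sketched in one short example. The data is illustrative sample data, not taken from these notes.

```python
import numpy as np
import pandas as pd

obj = pd.Series([4, 7, -5, 3], index=["d", "b", "a", "c"])

# 1. NumPy-style operations keep each value attached to its label.
positive = obj[obj > 0]   # Boolean filtering
doubled = obj * 2         # scalar multiplication
exped = np.exp(obj)       # NumPy universal function

# 2. Detecting missing data.
sdata = {"Ohio": 35000, "Texas": 71000, "Oregon": 16000, "Utah": 5000}
obj3 = pd.Series(sdata)
obj4 = pd.Series(sdata, index=["California", "Ohio", "Oregon", "Texas"])
missing = pd.isnull(obj4)   # Boolean Series, True where the value is NaN
present = pd.notnull(obj4)  # the element-wise opposite

# 3. Arithmetic aligns by label, not by position.
# Labels found in only one operand ("California", "Utah") become NaN.
total = obj3 + obj4

print(total)
```

The alignment in step 3 behaves much like a join on the index labels, so the two Series do not need the same order or the same set of labels.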