ML in Action: Summary of Python Essentials (Part 1)

2018-01-20

1. Arrays and Matrices

Note: the two look alike, but they behave differently under many operations.

Array:

>>> from numpy import *
>>> a = array([[1,1],[1,2],[1,3],[1,4]])
>>> a
array([[1, 1],
       [1, 2],
       [1, 3],
       [1, 4]])

Matrix:

>>> b = mat(a)
>>> b
matrix([[1, 1],
        [1, 2],
        [1, 3],
        [1, 4]])
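
The most visible difference is probably the `*` operator. The sketch below is a minimal illustration on a small square array of my own choosing (not the `a` above), and it uses `np.asmatrix`, which does the same job as `mat`:

```python
import numpy as np

# Illustrative 2 x 2 data, chosen so the two products are easy to tell apart.
a = np.array([[1, 2], [3, 4]])
m = np.asmatrix(a)  # the same data viewed as a matrix

elementwise = a * a     # array: multiplies element by element
matrix_product = m * m  # matrix: true matrix multiplication
same_product = a @ a    # matrix multiplication on plain arrays

print(elementwise)
print(matrix_product)
```

With plain arrays, `@` (or `np.dot`) gives the matrix product, so the matrix type is mainly a notational convenience.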

2. numpy: using the shape attribute

Arrays and matrices are handled slightly differently:

Similarities:

>>> a
array([[1, 1],
       [1, 2],
       [1, 3],
       [1, 4]])
>>> b = mat(a)
>>> b
matrix([[1, 1],
        [1, 2],
        [1, 3],
        [1, 4]])
>>> a.shape
(4, 2)
>>> b.shape
(4, 2)

Differences:

>>> c = array([1,2,3,4])
>>> c.shape
(4,)
>>> d = mat(c)
>>> d.shape
(1, 4)

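For a 2-D array or a matrix, shape[0] is the number of rows and shape[1] is the number of columns. As c and d show, a matrix is always two-dimensional, whereas a 1-D array has a shape tuple with a single entry, so asking for shape[1] on it raises an IndexError: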

>>> a.shape[0]
4
>>> b.shape[1]
2
>>> c.shape[0]
4
>>> c.shape[1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: tuple index out of range
>>> d.shape[0]
1
>>> d.shape[1]
4
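
One way to avoid the IndexError is to make the second dimension explicit before reading shape[1]. This is a small sketch, not the only approach; the variable names are mine:

```python
import numpy as np

c = np.array([1, 2, 3, 4])

n_dims = c.ndim           # 1: a single axis, so shape has one entry
row = c.reshape(1, -1)    # explicit 1 x 4 row
col = c.reshape(-1, 1)    # explicit 4 x 1 column
safe = np.atleast_2d(c)   # also 1 x 4; 2-D input is returned unchanged

print(row.shape, col.shape, safe.shape)
```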

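3. numpy: using tile

Its main purpose is to replicate an array. tile(a, (row, column)) repeats a row times vertically and column times horizontally, so a 1-D array of length n becomes an array of shape (row, n*column):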

>>> a = [1,2,3]
>>> b = tile(a,2)
>>> b
array([1, 2, 3, 1, 2, 3])
>>> b = tile(a,(2,1))
>>> b
array([[1, 2, 3],
       [1, 2, 3]])
>>> b = tile(a,(1,2))
>>> b
array([[1, 2, 3, 1, 2, 3]])
>>> b = tile(a,(2,2))
>>> b
array([[1, 2, 3, 1, 2, 3],
       [1, 2, 3, 1, 2, 3]])
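
A typical use of tile is to stretch one sample so it can be subtracted from every row of a data set, as in a nearest-neighbour distance computation. The sketch below assumes hypothetical toy data and names (`data`, `query`):

```python
import numpy as np

# Hypothetical toy data: four 2-D samples and one query point.
data = np.array([[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]])
query = np.array([0.0, 0.0])

# Repeat the query once per sample so both operands are 4 x 2.
diff = np.tile(query, (data.shape[0], 1)) - data
distances = np.sqrt((diff ** 2).sum(axis=1))

# Broadcasting gives the same result without the explicit copy.
distances_broadcast = np.sqrt(((query - data) ** 2).sum(axis=1))

print(distances)
```

Since numpy broadcasts the 1-D query against each row automatically, tile is optional here; it mostly makes the shapes explicit.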

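4. numpy: using sum

Note that numpy's sum is not the same as Python's built-in sum; this section covers numpy's sum:

axis=0 adds down each column, axis=1 adds across each row, and omitting axis adds every element

Usage 1: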

>>> from numpy import *
>>> sum([[0,1,2],[1,2,3]],axis=1)
array([3, 6])
>>> sum([[0,1,2],[1,2,3]],axis=0)
array([1, 3, 5])
>>> sum([[0,1,2],[1,2,3]])
9

Usage 2:

>>> b = array([[1, 2, 3], [1, 2, 3]])
>>> b
array([[1, 2, 3],
       [1, 2, 3]])
>>> b.sum()
12
>>> b.sum(axis=0)
array([2, 4, 6])
>>> b.sum(axis=1)
array([6, 6])
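
To make the difference from the built-in sum concrete, the sketch below calls both on the same data. It reaches the built-in through the `builtins` module so that the example still works after `from numpy import *` has shadowed the name:

```python
import builtins
import numpy as np

data = np.array([[0, 1, 2], [1, 2, 3]])

total = np.sum(data)             # adds every element
by_builtin = builtins.sum(data)  # adds the rows together, like axis=0

# On a nested list the built-in fails: it starts from 0,
# and 0 + [0, 1, 2] is not defined.
try:
    builtins.sum([[0, 1, 2], [1, 2, 3]])
    builtin_failed = False
except TypeError:
    builtin_failed = True

print(total, by_builtin, builtin_failed)
```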

5. numpy: using argsort

It returns the indices that would sort the array in ascending order

Examples

One-dimensional array:

>>> import numpy as np
>>> x = np.array([3, 1, 2])
>>> np.argsort(x)
array([1, 2, 0])

Two-dimensional array:

>>> x = np.array([[0, 3], [2, 2]])
>>> x
array([[0, 3],
       [2, 2]])
>>> np.argsort(x, axis=0)  # sort down each column
array([[0, 1],
       [1, 0]])
>>> np.argsort(x, axis=1)  # sort along each row
array([[0, 1],
       [0, 1]])

Example 1:

>>> x = np.array([3, 1, 2])
>>> np.argsort(x)  # ascending order
array([1, 2, 0])
>>> np.argsort(-x)  # descending order
array([0, 2, 1])

>>> x[np.argsort(x)]  # the array reordered by the sort indices
array([1, 2, 3])
>>> x[np.argsort(-x)]
array([3, 2, 1])

Another way to sort in descending order:

>>> a = x[np.argsort(x)]
>>> a
array([1, 2, 3])
>>> a[::-1]
array([3, 2, 1])
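
The indices from argsort are most useful for reordering a second, parallel array. Here is a minimal sketch that picks the labels of the k nearest samples; the data and the names `distances`, `labels` and `k` are hypothetical:

```python
import numpy as np

# Hypothetical data: labels[i] belongs to distances[i].
distances = np.array([0.9, 0.1, 0.5, 0.3])
labels = np.array(['B', 'A', 'B', 'A'])

order = np.argsort(distances)  # indices from nearest to farthest
k = 3
nearest_labels = labels[order[:k]]

print(order, nearest_labels)
```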

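6. Python dictionaries: the get and iteritems methods

1. get():

Syntax of the get() method:\
dict.get(key, default=None)\
key: the key to look up in the dictionary.\
default: the value returned when the key is not in the dictionary.

For example:

>>> d = {'a': 1, 'b': 2, 'c': 3}
>>> d.get('d', 'error')
'error'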
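
A common use of the default value is counting, because a key seen for the first time needs no special case. A minimal sketch with hypothetical labels:

```python
# Hypothetical class labels to tally.
labels = ['A', 'B', 'A', 'A', 'B']

counts = {}
for label in labels:
    # A missing key is treated as a count of 0.
    counts[label] = counts.get(label, 0) + 1

print(counts)
```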

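2. iteritems():

Python dictionaries also have an items() method, and the two differ slightly.\
In Python 2, items() returns all of the dictionary's entries as a list.\
iteritems() does much the same thing, except that it returns an iterator instead of a list. It exists only in Python 2; Python 3 removed it, and items() there returns a view object.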

>>> d = {'1':'one', '2':'two', '3':'three'}  
>>> x = d.items()  
>>> x  
[('1', 'one'), ('3', 'three'), ('2', 'two')]  
>>> type(x)  
<type 'list'>  
>>> y = d.iteritems()  
>>> y  
<dictionary-itemiterator object at 0x025008A0>  
>>> type(y)  
<type 'dictionary-itemiterator'>
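
The session above is Python 2. For comparison, here is a rough Python 3 equivalent, where items() is the only one of the two methods available and returns a view rather than a list:

```python
d = {'1': 'one', '2': 'two', '3': 'three'}

view = d.items()
type_name = type(view).__name__  # the view type, not a list

d['4'] = 'four'
pairs = list(view)  # the view reflects the insertion made after it was created

for key, value in d.items():
    print(key, value)
```

Code that must run on both versions can usually just loop over d.items().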

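7. The operator.itemgetter function

The itemgetter function in the operator module retrieves the items of an object at the given positions; its arguments are the indices to fetch.

Note that operator.itemgetter does not return the values themselves. It returns a callable, and the values are obtained only when that callable is applied to an object.

For example: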

>>> import operator
>>> a = [1,2,3]
>>> b = operator.itemgetter(0,1)
>>> b(a)
(1, 2)
>>> b = operator.itemgetter(2,0)
>>> b(a)
(3, 1)
>>> b = operator.itemgetter(2,0,1)
>>> b(a)
(3, 1, 2)
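
Two details are worth a quick sketch: with a single argument itemgetter returns the bare value rather than a tuple, and the argument can be a dictionary key as well as an index. The data and names below are made up for illustration:

```python
import operator

# Illustrative data: one tuple record and one dict record.
row = ('john', 'A', 15)
record = {'name': 'jane', 'grade': 'B'}

get_age = operator.itemgetter(2)        # one index: returns the item itself
get_name = operator.itemgetter('name')  # keys work too

age = get_age(row)
name = get_name(record)
same_age = (lambda r: r[2])(row)        # the equivalent lambda

print(age, name, same_age)
```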

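8. The sorted function

The sorted function returns a new sorted list. Its Python 2 signature is sorted(iterable[, cmp[, key[, reverse]]])

Parameters:
iterable can be a list or an iterator;
cmp is a comparison function taking two arguments; it was removed in Python 3 and can be ignored;
key is a function taking one argument;
reverse is False or True;

The key argument takes a regular function or a lambda, so the callable returned by itemgetter can be passed as key.

Sort by the second field, then by the third: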

>>> a = [('john', 'A', 15), ('jane', 'B', 12), ('dave', 'B', 10)]
>>> sorted(a, key=operator.itemgetter(1,2))
[('john', 'A', 15), ('dave', 'B', 10), ('jane', 'B', 12)]
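
The same pattern can rank dictionary entries by value, for example to find the label with the most votes. This is a sketch with a hypothetical `votes` dictionary, written for Python 3 (under Python 2, iteritems() could replace items()):

```python
import operator

# Hypothetical vote counts per label.
votes = {'A': 2, 'B': 3, 'C': 1}

# Sort the (label, count) pairs by count, largest first.
ranked = sorted(votes.items(), key=operator.itemgetter(1), reverse=True)
winner = ranked[0][0]

print(ranked, winner)
```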
