Learning Python Multiprocessing

Basic Concepts of Processes

A process is a single execution of a program. Each process has its own address space, memory, data stack, and other auxiliary data that tracks its execution. Multiprocessing runs multiple tasks within one program, increasing a script's capacity for parallel execution. It is typically used for CPU-bound work such as scientific computing.

Creating a Process with fork

fork() is called once but returns twice: the operating system copies the current process (the parent) into a new process (the child) and then returns in both of them. The child process always receives a return value of 0, while the parent receives the child's PID.

import os
# os.fork() is only available on Unix/Linux platforms
print('Process {} started'.format(os.getpid()))
subprocess = os.fork()
source_num = 9
if subprocess == 0:
    print('I am in child process, my pid is {0}, and my parent pid is {1}'.format(os.getpid(), os.getppid()))
    source_num = source_num * 2
    print('The source_num in ***child*** process is {}'.format(source_num))
else:
    print('I am in parent process, my child process is {}'.format(subprocess))
    source_num = source_num ** 2
    print('The source_num in ---parent--- process is {}'.format(source_num))
print('The source_num is {}'.format(source_num))
Process 16600 started
I am in parent process, my child process is 19193
The source_num in ---parent--- process is 81
The source_num is 81
Process 16600 started
I am in child process, my pid is 19193, and my parent pid is 16600
The source_num in ***child*** process is 18
The source_num is 18

Clearly, the data in the two processes does not affect each other: the parent squared its own copy of source_num while the child doubled its own.

The multiprocessing Module

multiprocessing is a Python module that spawns processes using an API similar to that of the threading module. By using processes instead of threads, it effectively sidesteps the GIL for both local and remote concurrency, allowing the programmer to fully leverage the multiple processors of a given machine.

Classes for creating and managing processes:

  • Process (process creation): spawn a process by creating a Process object and calling its start() method. Process follows the API of threading.Thread.
  • Pool (process-pool management): creates a pool of worker processes that execute the tasks submitted to the Pool; useful when many child processes need to be managed.
  • Queue (inter-process communication, resource sharing): process-safe communication between processes.
  • Value, Array (inter-process communication, resource sharing): ctypes objects allocated in shared memory.
  • Pipe (pipe communication): a pipe with two connection ends.
  • Manager (resource sharing): creates data shared between processes, including network sharing between processes running on different machines.
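
Value, Array, and Manager appear in the list above but receive no example later in this article, so here is a minimal sketch (the function and variable names are my own, not from the original) of sharing state across processes through a Manager-backed dict:

```python
from multiprocessing import Process, Manager

def record_square(shared, n):
    # each child process writes its result into the manager-backed dict
    shared[n] = n * n

if __name__ == '__main__':
    with Manager() as manager:
        shared = manager.dict()  # a dict proxy shared between processes
        procs = [Process(target=record_square, args=(shared, i)) for i in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        # all four results are present: {0: 0, 1: 1, 2: 4, 3: 9} (key order may vary)
        print(dict(shared))
```

Unlike the fork example earlier, where each process modified its own copy of the data, writes through the manager proxy are visible to every process.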

Classes for synchronizing child processes:

  • Condition: a condition variable for coordinating processes.
  • Event: used for synchronization and signalling between processes.
  • Lock: when multiple processes need to access a shared resource, a Lock can be used to avoid conflicting accesses.
  • RLock: a reentrant lock.
  • Semaphore: controls the number of concurrent accesses to a shared resource, e.g. the maximum number of connections in a pool.
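
Of the synchronization classes above, only Lock and Semaphore are demonstrated later in this article. As a minimal illustration of Event (the names here are my own), one process can block until another signals it:

```python
import time
from multiprocessing import Process, Event

def waiter(started):
    print('Child waiting for the event...')
    started.wait()          # blocks until some process calls set()
    print('Child saw the event, continuing')

if __name__ == '__main__':
    started = Event()
    p = Process(target=waiter, args=(started,))
    p.start()
    time.sleep(1)           # simulate some work before signalling
    started.set()           # wake the waiting child
    p.join()
```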

1.Process

The class for creating processes: Process(group=None, target=None, name=None, args=(), kwargs={}, *, daemon=None):

group should always be None; it exists solely for compatibility with threading.Thread
target is the callable invoked by the run() method
name is the process name
args is the positional-argument tuple for the target call
kwargs is the keyword-argument dictionary for the target call
daemon sets whether the process is a daemon process

Methods and attributes

run(): the method representing the process's activity
start(): start the process
join(): block the calling process until the process whose join() was called has finished
name: the process's name
is_alive(): return whether the process is alive
daemon: the daemon flag of the process
pid: the process ID
terminate(): forcibly terminate the process
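
The examples below exercise most of these, but terminate() does not appear until the Queue section, so here is a minimal sketch of it (the function name is my own):

```python
import time
from multiprocessing import Process

def loop_forever():
    # a child that never exits on its own
    while True:
        time.sleep(0.1)

if __name__ == '__main__':
    p = Process(target=loop_forever)
    p.start()
    print(p.is_alive())   # True: the child is running
    p.terminate()         # forcibly stop it (sends SIGTERM on Unix)
    p.join()              # join afterwards so the process is reaped
    print(p.is_alive())   # False
```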

Creating a single process
import os
from multiprocessing import Process

def hello_pro(name):
    print('I am in process {0}, its PID is {1}'.format(name, os.getpid()))

if __name__ == '__main__':
    print('Parent Process PID is {}'.format(os.getpid()))
    p = Process(target=hello_pro, args=('test',), name='test_proc')
    # start the process
    p.start()
    print('Process\'s ID is {}'.format(p.pid))
    print('The Process is alive? {}'.format(p.is_alive()))
    print('Process\' name is {}'.format(p.name))
    # join() blocks the current process until the process p has finished
    p.join()

Parent Process PID is 16600
I am in process test, its PID is 19925
Process's ID is 19925
The Process is alive? True
Process' name is test_proc
Creating multiple processes
import os

from multiprocessing import Process, current_process


def doubler(number):
    """
    A doubling function that can be used by a process
    """
    result = number * 2
    proc_name = current_process().name
    print('{0} doubled to {1} by: {2}'.format(
        number, result, proc_name))


if __name__ == '__main__':
    numbers = [5, 10, 15, 20, 25]
    procs = []

    for index, number in enumerate(numbers):
        proc = Process(target=doubler, args=(number,))
        procs.append(proc)
        proc.start()

    # a process can also be given an explicit name
    proc = Process(target=doubler, name='Test', args=(2,))
    proc.start()
    procs.append(proc)

    for proc in procs:
        proc.join()
5 doubled to 10 by: Process-8
20 doubled to 40 by: Process-11
10 doubled to 20 by: Process-9
15 doubled to 30 by: Process-10
25 doubled to 50 by: Process-12
2 doubled to 4 by: Test
Creating a process as a class
from multiprocessing import Process, current_process

class DoublerProcess(Process):
    def __init__(self, numbers):
        Process.__init__(self)
        self.numbers = numbers

    # override the run() method
    def run(self):
        for number in self.numbers:
            result = number * 2
            proc_name = current_process().name
            print('{0} doubled to {1} by: {2}'.format(number, result, proc_name))


if __name__ == '__main__':
    dp = DoublerProcess([5, 20, 10, 15, 25])
    dp.start()
    dp.join()
        
5 doubled to 10 by: DoublerProcess-16
20 doubled to 40 by: DoublerProcess-16
10 doubled to 20 by: DoublerProcess-16
15 doubled to 30 by: DoublerProcess-16
25 doubled to 50 by: DoublerProcess-16

2.Lock

The code below is from Python多进程编程 (see the reading list at the end).

import multiprocessing

def worker_with(lock, f):
    # Lock supports the context-manager protocol, so it can be used in a with statement
    with lock:
        fs = open(f, 'a+')
        n = 10
        while n > 1:
            print('Lock acquired via with')
            fs.write("Lock acquired via with\n")
            n -= 1
        fs.close()

def worker_no_with(lock, f):
    # acquire the lock
    lock.acquire()
    try:
        fs = open(f, 'a+')
        n = 10
        while n > 1:
            print('Lock acquired directly')
            fs.write("Lock acquired directly\n")
            n -= 1
        fs.close()
    finally:
        # release the lock
        lock.release()

if __name__ == "__main__":
    lock = multiprocessing.Lock()
    f = "file.txt"
    w = multiprocessing.Process(target=worker_with, args=(lock, f))
    nw = multiprocessing.Process(target=worker_no_with, args=(lock, f))
    w.start()
    nw.start()
    w.join()
    nw.join()
    print('END!')
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired via with
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
END!

3.Pool

Pool provides a specified number of worker processes for the user to call on. When a new request is submitted to the pool and the pool is not yet full, a new process is created to execute the request; if the number of processes in the pool has already reached the configured maximum, the request waits until a process in the pool finishes and a worker becomes available to execute it.

import time
import os
from multiprocessing import Pool, cpu_count

def f(msg):
    print('Starting: {}, PID: {}, Time: {}'.format(msg, os.getpid(), time.ctime()))
    time.sleep(3)
    print('Ending:   {}, PID: {}, Time: {}'.format(msg, os.getpid(), time.ctime()))
    
if __name__ == '__main__':
    print('Starting Main Function')
    print('This Computer has {} CPU'.format(cpu_count()))
    # create a pool with 4 worker processes
    p = Pool(4)
    for i in range(5):
        msg = 'Process {}'.format(i)
        # submit the function and its arguments to the pool (non-blocking)
        p.apply_async(f, (msg, ))
    # close the pool to new tasks
    p.close()
    # block the current process until all pool workers have finished
    p.join()
    print('All Done!!!')
Starting Main Function
This Computer has 4 CPU
Starting: Process 2, PID: 8332, Time: Fri Sep  1 08:53:12 2017
Starting: Process 1, PID: 8331, Time: Fri Sep  1 08:53:12 2017
Starting: Process 0, PID: 8330, Time: Fri Sep  1 08:53:12 2017
Starting: Process 3, PID: 8333, Time: Fri Sep  1 08:53:12 2017
Ending:   Process 2, PID: 8332, Time: Fri Sep  1 08:53:15 2017
Ending:   Process 3, PID: 8333, Time: Fri Sep  1 08:53:15 2017
Starting: Process 4, PID: 8332, Time: Fri Sep  1 08:53:15 2017
Ending:   Process 1, PID: 8331, Time: Fri Sep  1 08:53:15 2017
Ending:   Process 0, PID: 8330, Time: Fri Sep  1 08:53:15 2017
Ending:   Process 4, PID: 8332, Time: Fri Sep  1 08:53:18 2017
All Done!!!

This machine has 4 CPUs, so processes 0-3 start at the same time while process 4 waits; as soon as one of processes 0-3 finishes, process 4 begins. Once every task has completed, the main process resumes and prints "All Done!!!". The apply_async() method is non-blocking, whereas apply() is blocking.

Replacing apply_async() with apply()

import time
import os
from multiprocessing import Pool, cpu_count

def f(msg):
    print('Starting: {}, PID: {}, Time: {}'.format(msg, os.getpid(), time.ctime()))
    time.sleep(3)
    print('Ending:   {}, PID: {}, Time: {}'.format(msg, os.getpid(), time.ctime()))
    
if __name__ == '__main__':
    print('Starting Main Function')
    print('This Computer has {} CPU'.format(cpu_count()))
    # create a pool with 4 worker processes
    p = Pool(4)
    for i in range(5):
        msg = 'Process {}'.format(i)
        # replace apply_async() with the blocking apply()
        p.apply(f, (msg, ))
    # close the pool to new tasks
    p.close()
    # block the current process until all pool workers have finished
    p.join()
    print('All Done!!!')
Starting Main Function
This Computer has 4 CPU
Starting: Process 0, PID: 8281, Time: Fri Sep  1 08:51:18 2017
Ending:   Process 0, PID: 8281, Time: Fri Sep  1 08:51:21 2017
Starting: Process 1, PID: 8282, Time: Fri Sep  1 08:51:21 2017
Ending:   Process 1, PID: 8282, Time: Fri Sep  1 08:51:24 2017
Starting: Process 2, PID: 8283, Time: Fri Sep  1 08:51:24 2017
Ending:   Process 2, PID: 8283, Time: Fri Sep  1 08:51:27 2017
Starting: Process 3, PID: 8284, Time: Fri Sep  1 08:51:27 2017
Ending:   Process 3, PID: 8284, Time: Fri Sep  1 08:51:30 2017
Starting: Process 4, PID: 8281, Time: Fri Sep  1 08:51:30 2017
Ending:   Process 4, PID: 8281, Time: Fri Sep  1 08:51:33 2017
All Done!!!

As you can see, the blocking version runs the tasks one after another; each task starts only after the previous one has finished.

Using the get method to retrieve results
import time
import os
from multiprocessing import Pool, cpu_count

def f(msg):
    print('Starting: {}, PID: {}, Time: {}'.format(msg, os.getpid(), time.ctime()))
    time.sleep(3)
    print('Ending:   {}, PID: {}, Time: {}'.format(msg, os.getpid(), time.ctime()))
    return 'Done {}'.format(msg)
    
if __name__ == '__main__':
    print('Starting Main Function')
    print('This Computer has {} CPU'.format(cpu_count()))
    # create a pool with 4 worker processes
    p = Pool(4)
    results = []
    for i in range(5):
        msg = 'Process {}'.format(i)
        results.append(p.apply_async(f, (msg, )))
    # close the pool to new tasks
    p.close()
    # block the current process until all pool workers have finished
    p.join()
    for result in results:
        print(result.get())
    print('All Done!!!')
Starting Main Function
This Computer has 4 CPU
Starting: Process 0, PID: 8526, Time: Fri Sep  1 09:00:04 2017
Starting: Process 1, PID: 8527, Time: Fri Sep  1 09:00:04 2017
Starting: Process 2, PID: 8528, Time: Fri Sep  1 09:00:04 2017
Starting: Process 3, PID: 8529, Time: Fri Sep  1 09:00:04 2017
Ending:   Process 1, PID: 8527, Time: Fri Sep  1 09:00:07 2017
Starting: Process 4, PID: 8527, Time: Fri Sep  1 09:00:07 2017
Ending:   Process 3, PID: 8529, Time: Fri Sep  1 09:00:07 2017
Ending:   Process 0, PID: 8526, Time: Fri Sep  1 09:00:07 2017
Ending:   Process 2, PID: 8528, Time: Fri Sep  1 09:00:07 2017
Ending:   Process 4, PID: 8527, Time: Fri Sep  1 09:00:10 2017
Done Process 0
Done Process 1
Done Process 2
Done Process 3
Done Process 4
All Done!!!

4.Queue

Queue is a process-safe queue; it can be used to pass data between multiple processes.

The put method inserts an item into the queue. It takes two optional arguments, block and timeout. If block is True (the default) and timeout is a positive number, the call blocks for at most timeout seconds waiting for free space in the queue and raises queue.Full on timeout. If block is False and the queue is full, queue.Full is raised immediately.

The get method reads and removes one element from the queue. It takes the same two optional arguments, block and timeout. If block is True (the default) and timeout is a positive number, queue.Empty is raised when no element arrives within the timeout. If block is False, there are two cases: if the queue has a value available, it is returned immediately; otherwise, if the queue is empty, queue.Empty is raised immediately.
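
The two paragraphs above can be demonstrated with a bounded queue; note that in Python 3 the exceptions live in the standard queue module:

```python
import queue
from multiprocessing import Queue

q = Queue(maxsize=1)   # a queue with room for a single item
q.put('first')         # fills the queue

try:
    q.put('second', block=False)   # queue is full, raises immediately
except queue.Full:
    print('queue.Full raised as expected')

print(q.get())         # 'first'

try:
    q.get(block=False)             # queue is now empty, raises immediately
except queue.Empty:
    print('queue.Empty raised as expected')
```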

import time
from multiprocessing import Queue, Process

def write_queue(q):
    for i in ['first', 'two', 'three', 'four', 'five']:
        print('Write "{}" to Queue'.format(i))
        q.put(i)
        time.sleep(3)
    print('Write Done!')

def read_queue(q):
    print('Start to read!')
    while True:
        data = q.get()
        print('Read "{}" from Queue!'.format(data))

if __name__ == '__main__':
    q = Queue()
    wq = Process(target=write_queue, args=(q,))
    rq = Process(target=read_queue, args=(q,))
    wq.start()
    rq.start()
    # Start both processes before joining: to let the reader consume items as
    # they are written, do not call wq.join() until after rq.start()
    wq.join()
    # read_queue loops forever, so once the writer is done, forcibly
    # terminate the reader
    rq.terminate()

Write "first" to Queue
Start to read!
Read "first" from Queue!
Write "two" to Queue
Read "two" from Queue!
Write "three" to Queue
Read "three" from Queue!
Write "four" to Queue
Read "four" from Queue!
Write "five" to Queue
Read "five" from Queue!
Write Done!

5.Pipe

The Pipe function returns a pair (conn1, conn2) representing the two ends of a pipe.

Pipe takes a duplex argument. If duplex is True (the default), the pipe is bidirectional: both conn1 and conn2 can send and receive. If duplex is False, conn1 can only receive messages and conn2 can only send them.

send and recv are the methods for sending and receiving messages. In duplex mode, for example, you can call conn1.send to send a message and conn1.recv to receive one. If there is no message to receive, recv blocks. If the other end of the pipe has been closed, recv raises EOFError.
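
A minimal sketch of the duplex=False case described above (the names are my own): the first connection returned can only receive, the second can only send, and closing every copy of the write end makes recv raise EOFError:

```python
from multiprocessing import Pipe, Process

def producer(conn):
    conn.send('hello')      # the write end can only send
    conn.close()

if __name__ == '__main__':
    recv_conn, send_conn = Pipe(duplex=False)   # (read end, write end)
    p = Process(target=producer, args=(send_conn,))
    p.start()
    print(recv_conn.recv())  # 'hello'
    p.join()
    send_conn.close()        # close the parent's copy of the write end too
    try:
        recv_conn.recv()     # every write end is now closed -> EOFError
    except EOFError:
        print('pipe closed, EOFError raised')
```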

See also: 使用pipe管道使python fork多进程之间通信.

import time
from multiprocessing import Pipe, Process

def send_pipe(p):
    for i in ['first', 'two', 'three', 'four', 'five']:
        print('Send "{}" to Pipe'.format(i))
        p.send(i)
        time.sleep(3)
    print('Send Done!')

def receive_pipe(p):
    print('Start to receive!')
    while True:
        data = p.recv()
        print('Read "{}" from Pipe!'.format(data))

if __name__ == '__main__':
    sp_pipe, rp_pipe = Pipe()
    sp = Process(target=send_pipe, args=(sp_pipe,))
    rp = Process(target=receive_pipe, args=(rp_pipe,))
    sp.start()
    rp.start()
    # wait for the sender to finish, then forcibly stop the endlessly
    # looping receiver
    sp.join()
    rp.terminate()
Start to receive!
Send "first" to Pipe
Read "first" from Pipe!
Send "two" to Pipe
Read "two" from Pipe!
Send "three" to Pipe
Read "three" from Pipe!
Send "four" to Pipe
Read "four" from Pipe!
Send "five" to Pipe
Read "five" from Pipe!
Send Done!

6.Semaphore

Semaphore controls the number of concurrent accesses to a shared resource, for example the maximum number of connections in a connection pool.

import multiprocessing
import time

def worker(s, i):
    s.acquire()
    print(multiprocessing.current_process().name + "acquire")
    time.sleep(i)
    print(multiprocessing.current_process().name + "release\n")
    s.release()

if __name__ == "__main__":
    # at most 3 processes may hold the semaphore at any one time
    s = multiprocessing.Semaphore(3)
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(s, i*2))
        p.start()
Process-170acquire
Process-168acquire
Process-168release
Process-169acquire

Process-171acquire
Process-169release

Process-172acquire
Process-170release

Process-171release

Process-172release

Further Reading

  1. Python 201:多进程教程
  2. Python多进程编程
  3. Python mutilprocessing Processing 父子进程共享文件对象?
  4. Python 多进程实践
  5. 廖雪峰Python教程-多进程
  6. python多进程的理解 multiprocessing Process join run
  7. The official Python documentation