Introduction to urllib.parse

urllib.parse is mainly used to split a URL string into its components, or to assemble components back into a URL string.

  • Splitting

    urllib.parse.urlparse(urlstring, scheme='', allow_fragments=True)
    

    Example

    from urllib.parse import urlparse
    
    # Returns a ParseResult with six fields:
    # scheme, netloc, path, params, query, fragment
    result = urlparse("http://www.baidu.com/index.html;user?id=5#comment")
    print(result)
    

    Or specify a default scheme

    from urllib.parse import urlparse
    
    # No "//" prefix here, so "www.baidu.com" ends up in path, not netloc
    result = urlparse("www.baidu.com/index.html;user?id=5#comment", scheme="https")
    print(result)
    

    If the URL string already contains a scheme, the scheme argument has no effect; it is only a default.
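
The default-scheme behaviour can be checked with a minimal sketch. The variable names are illustrative, and the third URL (with a leading "//") is an added assumption to show when the host actually lands in netloc.

```python
from urllib.parse import urlparse

# The URL already carries "http", so the default "https" is not applied.
with_scheme = urlparse("http://www.baidu.com/index.html", scheme="https")

# Without a leading "//", the host is treated as part of the path.
no_slashes = urlparse("www.baidu.com/index.html", scheme="https")

# With a leading "//", the host goes to netloc and the default scheme is used.
with_slashes = urlparse("//www.baidu.com/index.html", scheme="https")

print(with_scheme.scheme)                      # http
print(repr(no_slashes.netloc), no_slashes.path)
print(with_slashes.scheme, with_slashes.netloc, with_slashes.path)
```

If the host needs to be recognised as netloc, it is usually safer to make sure the input starts with a scheme or with "//" before parsing.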

  • Assembling

    urllib.parse.urlunparse(data)
    

    Example

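    from urllib.parse import urlunparse
    
    # Exactly six items, in order: scheme, netloc, path, params, query, fragment
    data = ['http','www.baidu.com','index.html','user','a=123','commit']
    print(urlunparse(data))  # http://www.baidu.com/index.html;user?a=123#commit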
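
Since urlunparse is the inverse of urlparse, the two can be combined to change one component of a URL. The following is a small sketch of that round trip; the replacement values ("https", "id=6") are made up for illustration.

```python
from urllib.parse import urlparse, urlunparse

url = "http://www.baidu.com/index.html;user?id=5#comment"
parts = urlparse(url)

# ParseResult is a named tuple, so _replace returns a modified copy.
secure = parts._replace(scheme="https", query="id=6")

rebuilt = urlunparse(parts)    # same as the original URL
changed = urlunparse(secure)   # scheme and query swapped out

print(rebuilt)
print(changed)  # https://www.baidu.com/index.html;user?id=6#comment
```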
    
  • Joining

    urllib.parse.urljoin(base, url, allow_fragments=True)
    

    Example

    from urllib.parse import urljoin
    
    # Relative reference: resolved against the base URL
    print(urljoin('http://www.baidu.com', 'FAQ.html'))
    # Absolute URL as second argument: replaces the base entirely
    print(urljoin('http://www.baidu.com', 'https://pythonsite.com/FAQ.html'))
    print(urljoin('http://www.baidu.com/about.html', 'https://pythonsite.com/FAQ.html'))
    print(urljoin('http://www.baidu.com/about.html', 'https://pythonsite.com/FAQ.html?question=2'))
    print(urljoin('http://www.baidu.com?wd=abc', 'https://pythonsite.com/index.php'))
    # Query/fragment only: attached to the base, replacing its own query/fragment
    print(urljoin('http://www.baidu.com', '?category=2#comment'))
    print(urljoin('www.baidu.com', '?category=2#comment'))
    print(urljoin('www.baidu.com#comment', '?category=2'))
    

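    When joining, the second URL takes precedence over the first: any component it supplies overrides the base, and the base only fills in what the second URL leaves out.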
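
The precedence rule is easiest to see with relative references of increasing "strength". This is a minimal sketch; the base URL with a nested /docs/guide/ path is a hypothetical one chosen for illustration.

```python
from urllib.parse import urljoin

# Hypothetical base URL with a nested path, used only for illustration.
base = "http://www.baidu.com/docs/guide/index.html"

sibling = urljoin(base, "FAQ.html")        # replaces only the last path segment
parent = urljoin(base, "../FAQ.html")      # climbs one directory first
rooted = urljoin(base, "/FAQ.html")        # keeps scheme and host, replaces path
other_host = urljoin(base, "//pythonsite.com/FAQ.html")  # keeps only the scheme

print(sibling)     # http://www.baidu.com/docs/guide/FAQ.html
print(parent)      # http://www.baidu.com/docs/FAQ.html
print(rooted)      # http://www.baidu.com/FAQ.html
print(other_host)  # http://pythonsite.com/FAQ.html
```

The more of the URL the second argument specifies, the less of the base survives.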

  • Converting a dict to a URL query string

    urllib.parse.urlencode(dict)
    

    Example

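    from urllib.parse import urlencode
    
    params = {
        "name":"zhaofan",
        "age":23,
    }
    base_url = "http://www.baidu.com?"
    
    # Non-string values such as 23 are converted to strings automatically
    url = base_url+urlencode(params)
    print(url)  # http://www.baidu.com?name=zhaofan&age=23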
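
Beyond joining key=value pairs, urlencode also escapes reserved characters and can expand list values. A short sketch with made-up parameter names ("q", "tag"):

```python
from urllib.parse import urlencode

# Reserved characters are percent-encoded and spaces become "+".
escaped = urlencode({"q": "a b&c"})

# With doseq=True, each item of a sequence becomes its own key=value pair.
repeated = urlencode({"tag": ["x", "y"]}, doseq=True)

print(escaped)   # q=a+b%26c
print(repeated)  # tag=x&tag=y
```

Without doseq=True, a list value is encoded as the string form of the whole list, which is rarely what a server expects.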
    
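  • Converting a URL query string to a dict

    urllib.parse.parse_qs(qs)

    urllib.parse.unquote(urlstr) only decodes percent-escapes (for example %20) and returns a string, not a dict.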
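
A minimal sketch of going from a URL back to a dict, reusing the URL produced in the urlencode example. It relies on parse_qs for the dict conversion, since unquote by itself only decodes percent-escapes; the "flat" helper dict is an illustrative addition.

```python
from urllib.parse import parse_qs, unquote, urlparse

url = "http://www.baidu.com?name=zhaofan&age=23"

# Extract the query component, then parse it into a dict of lists.
query = urlparse(url).query
params = parse_qs(query)

# Each value is a list because a key may repeat; keep the first item if
# single values are expected.
flat = {key: values[0] for key, values in params.items()}

# unquote returns a decoded string, not a dict.
decoded = unquote("a%20b%26c")

print(params)   # {'name': ['zhaofan'], 'age': ['23']}
print(flat)     # {'name': 'zhaofan', 'age': '23'}
print(decoded)  # a b&c
```

All parsed values come back as strings, so "23" needs an explicit int() if a number is required.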
    
