
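What is PyQuery:

PyQuery is a powerful and flexible library for parsing web pages. If regular expressions feel too tedious to write (I can't write them myself), if you find BeautifulSoup's syntax hard to remember, and if you already know jQuery's syntax, then PyQuery is your best choice.

Installing PyQuery: running pip3 install pyquery is all it takes.

Basic usage of PyQuery:

Initialization:

Initializing from a string: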

            
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="title" name="dromouse"><b>The Dormouse's story</b></p>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><!-- Elsie --></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="sister" id="link3">Title</a>
    ; and they lived at the bottom of a well.</p>
    <p class="story">...</p>
    </body></html>
    """

    # Build a PyQuery document from the string, then select every <a> element.
    doc = pq(html)
    print(doc('a'))


Initializing from a URL:

            
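    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Initialize from a URL
    from pyquery import PyQuery as pq

    # Pass the address with the url= keyword: since pyquery 2.0 a bare
    # 'http://...' string is parsed as markup instead of being fetched.
    doc = pq(url='http://www.baidu.com')
    print(doc('input'))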


Initializing from a file:

            
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Initialize from a file
    from pyquery import PyQuery as pq

    # baidu.html must exist in the current working directory.
    doc = pq(filename='baidu.html')
    print(doc('title'))


Selection works the same way as in jQuery: by id, by tag name, and by class, and many other selectors behave as they do in jQuery too.

Basic CSS selectors:

            
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # CSS selectors
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="title" name="dromouse"><b>The Dormouse's story</b></p>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><!-- Elsie --></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="title" id="link3">Title</a>
    ; and they lived at the bottom of a well.</p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    # A class selector matches every element with that class, whatever its tag.
    print(doc('.title'))


Finding elements:

Child elements:

            
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Child elements
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="title" name="dromouse"><b>The Dormouse's story</b></p>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><!-- Elsie --></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="title" id="link3">Title</a>
    ; and they lived at the bottom of a well.</p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    items = doc('.title')
    print(type(items))
    print(items)
    # find() searches inside each selected element for matching descendants.
    b = items.find('b')
    print(type(b))
    print(b)

This code selects the elements whose class is title. There are two of them: a p element and an a element. We then call find to pick out the b element nested inside them.

Note that find searches the descendants of every selected element.

children:

            
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Child elements
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="title" name="dromouse"><b>The Dormouse's story</b></p>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><!-- Elsie --></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="title" id="link3">Title</a>
    ; and they lived at the bottom of a well.</p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    items = doc('.title')
    # children() returns only the direct children of the selected elements.
    print(items.children())


You can also pass a selector to children() to filter the result:

            
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Child elements, filtered by a selector
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="title" name="dromouse"><b>The Dormouse's story</b></p>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><!-- Elsie --></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="title" id="link3">Title</a>
    ; and they lived at the bottom of a well.</p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    items = doc('.title')
    print(items.children('b'))

The output is the same as above.

Parent elements:

            
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Parent element
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="title" name="dromouse"><b>The Dormouse's story</b></p>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><!-- Elsie --></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="title" id="link3">Title</a>
    ; and they lived at the bottom of a well.</p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    items = doc('#link1')
    print(items)
    # parent() returns the element that directly contains the selection.
    print(items.parent())


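Only the direct parent is output here. The parents method, by contrast, returns all ancestor elements, and accepts an optional selector to filter them: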

            
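    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Ancestor elements
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><span>Elsie</span></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="sister" id="link3">Title</a>
    <a class="sister" id="link4">Title</a>
    </p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    items = doc('#link1')
    print(items)
    # Keep only the ancestors that match the selector 'body'.
    print(items.parents('body'))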


Sibling elements:

            
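    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Sibling elements
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><span>Elsie</span></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="sister" id="link3">Title</a>
    <a class="sister" id="link4">Title</a>
    </p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    items = doc('#link1')
    print(items)
    # siblings() returns the other children of the same parent;
    # the selector narrows them down to #link2.
    print(items.siblings('#link2'))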


That covers the methods for finding elements. Next, let's look at how to iterate over them.

Iteration

            
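    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Iteration
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><span>Elsie</span></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="sister" id="link3">Title</a>
    <a class="sister" id="link4">Title</a>
    </p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    items = doc('a')
    # items() yields each matched element wrapped as its own PyQuery object.
    for k, v in enumerate(items.items()):
        print(k, v)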


Getting information:

Getting attributes:

            
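    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Getting attributes
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><span>Elsie</span></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="sister" id="link3">Title</a>
    <a class="sister" id="link4">Title</a>
    </p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    items = doc('a')
    print(items)
    # Both forms read the href of the first matched element,
    # and give None when that element has no href attribute.
    print(items.attr('href'))
    print(items.attr.href)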


Getting text:

            
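    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Getting text
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><span>Elsie</span></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="sister" id="link3">Title</a>
    <a class="sister" id="link4">Title</a>
    </p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    items = doc('a')
    print(items)
    # text() returns the text of all matched elements as a single string.
    print(items.text())
    print(type(items.text()))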

Getting HTML:

            
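    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Getting HTML
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><span>Elsie</span></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="sister" id="link3">Title</a>
    <a class="sister" id="link4">Title</a>
    </p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    items = doc('a')
    # html() returns the inner HTML of the first matched element only.
    print(items.html())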


DOM manipulation:

addClass, removeClass

            
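    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # DOM manipulation: addClass, removeClass
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><span>Elsie</span></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="sister" id="link3">Title</a>
    <a class="sister" id="link4">Title</a>
    </p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    items = doc('#link2')
    print(items)
    items.addClass('addStyle')    # also available as add_class
    print(items)
    items.remove_class('sister')  # also available as removeClass
    print(items)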


attr, css:

            
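    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # DOM manipulation: attr, css
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><span>Elsie</span></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="sister" id="link3">Title</a>
    <a class="sister" id="link4">Title</a>
    </p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    items = doc('#link2')
    items.attr('name', 'addname')  # set (or overwrite) the name attribute
    print(items)
    items.css('width', '100px')    # written into the element's style attribute
    print(items)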

attr can add a new attribute; if the element already has that attribute, its value is overwritten.


remove:

            
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # DOM manipulation: remove
    from pyquery import PyQuery as pq

    html = """
    <div class="wrap">
        Hello World
        <p>This is a paragraph.</p>
    </div>
    """

    doc = pq(html)
    wrap = doc('.wrap')
    print(wrap.text())
    # Remove the <p> element from the document.
    wrap.find('p').remove()
    print("Data after remove")
    print(wrap)


There are many other DOM methods; readers who want to learn more can consult the official API documentation: https://pyquery.readthedocs.io/en/latest/api.html

Pseudo-class selectors:

            
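    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Pseudo-class selectors
    from pyquery import PyQuery as pq

    html = """
    <html><head><title>The Dormouse's story</title></head>
    <body>
    <p class="story">Once upon a time there were three little sisters; and their names were
    <a class="sister" id="link1"><span>Elsie</span></a>
    <a class="sister" id="link2">Lacie</a>
    and
    <a class="sister" id="link3">Title</a>
    <a class="sister" id="link4">Title</a>
    </p>
    <p class="story">...</p>
    </body></html>
    """

    doc = pq(html)
    # print(doc)
    wrap = doc('a:first-child')    # <a> elements that are the first child of their parent
    print(wrap)
    wrap = doc('a:last-child')     # <a> elements that are the last child of their parent
    print(wrap)
    wrap = doc('a:nth-child(2)')   # <a> elements that are the second child of their parent
    print(wrap)
    wrap = doc('a:gt(2)')          # <a> elements with index greater than 2; indices start at 0, not 1
    print(wrap)
    wrap = doc('a:nth-child(2n)')  # <a> elements in even positions among their siblings
    print(wrap)
    wrap = doc('a:contains(Lacie)')  # <a> elements containing the text "Lacie"
    print(wrap)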

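I won't list every selector here. To learn more about CSS selectors, see the W3School tutorial (a third-party site, not run by the W3C): http://www.w3school.com.cn/css/index.asp

That roughly covers how to use PyQuery. For more detail, read the official documentation: https://pyquery.readthedocs.io/en/latest/

The code above is available at: https://gitee.com/dwyui/pyQuery.git

Thanks for reading. If anything here is incorrect, corrections are very welcome.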
