
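Overall, starting from the Response object there are two paths. If the data is embedded in the HTML, we parse and extract it with the BeautifulSoup library. If the data is delivered as JSON, we parse it with the response.json() method, then extract and store it.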
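The two paths can be sketched as a single helper that picks a parser based on the response's Content-Type. This is only an illustration: the function name extract_titles, the 'data'/'title' layout of the JSON, and the choice of h2 tags for the HTML branch are assumptions, not something a particular site guarantees. json.loads(body) does the same job as response.json().

```python
import json


def extract_titles(content_type, body):
    """Return a list of titles from either a JSON body or an HTML body."""
    if 'json' in content_type.lower():
        # Path 2: the data arrives as JSON (equivalent to response.json()).
        payload = json.loads(body)
        return [item['title'] for item in payload['data']]
    # Path 1: the data sits inside the HTML, so parse it with BeautifulSoup.
    # Imported here so the JSON path works without the third-party package.
    from bs4 import BeautifulSoup
    soup = BeautifulSoup(body, 'html.parser')
    # Assumption: titles live in <h2> tags; adjust the selector per site.
    return [tag.get_text(strip=True) for tag in soup.find_all('h2')]


sample_json = '{"data": [{"title": "First post"}, {"title": "Second post"}]}'
print(extract_titles('application/json', sample_json))
# ['First post', 'Second post']
```

With a real request you would pass res.headers.get('content-type', '') and res.text.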
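The task: scrape the title, excerpt, and link of every article by the popular Zhihu author Zhang Jiawei (張佳瑋), and save them to a local file.
Zhang Jiawei's Zhihu article list is here: https://www.zhihu.com/people/zhang-jia-wei/posts?page=1
Fetch the data with requests.get(), then check whether the request succeeded.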

            
import requests

# Send a browser-like User-Agent with the request.
headers={'user-agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'}
# API endpoint that returns the article list as JSON.
url='https://www.zhihu.com/api/v4/members/zhang-jia-wei/articles?'
# Query parameters, copied from the request the browser sends.
params={
    'include':'data[*].comment_count,suggest_edit,is_normal,thumbnail_extra_info,thumbnail,can_comment,comment_permission,admin_closed_comment,content,voteup_count,created,updated,upvoted_followees,voting,review_info,is_labeled,label_info;data[*].author.badge[?(type=best_answerer)].topics',
    'offset':'10',
    'limit':'20',
    'sort_by':'voteups',
    }
# Send the request and store the response in the variable res.
res=requests.get(url,headers=headers,params=params)
# Print the status code to confirm that the request succeeded.
print(res.status_code)

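A status code of 200 means the request succeeded.
Now let's compare what the requests for the first page and the last page return:
Comparing the two, you will find that is_end is false on the first page and true on the last page, so this field can tell us when to stop the loop.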
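A loop that stops on is_end could look roughly like the sketch below. It assumes that is_end sits under the 'paging' key of the parsed JSON (next to totals) and that each page's articles are under 'data'. The names collect_articles and fake_fetch_page are made up for illustration; the fake function stands in for the real API so the loop can be run offline.

```python
def collect_articles(fetch_page, limit=20):
    """Request page after page until the response reports is_end."""
    articles = []
    offset = 0
    while True:
        page = fetch_page(offset, limit)   # parsed JSON (a dict) for one page
        articles.extend(page['data'])
        # Stop on the last page; also stop on an empty page as a safety net.
        if page['paging']['is_end'] or not page['data']:
            break
        offset += limit
    return articles


def fake_fetch_page(offset, limit, total=45):
    """Offline stand-in that mimics the assumed shape of the API response."""
    rows = [{'title': 'article %d' % i}
            for i in range(offset, min(offset + limit, total))]
    return {'data': rows, 'paging': {'is_end': offset + limit >= total}}


print(len(collect_articles(fake_fetch_page)))
# 45
```

Against the real API, fetch_page would wrap a requests.get() call that passes offset and limit as query parameters and returns res.json().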
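As for the totals: 919 field, I worked it out from the number of pages and the number of articles per page and concluded that it is the total number of articles, so it can also serve as the condition for ending the loop. Either field works; I use totals, combined with the offset value, which advances by one page of articles on each request.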
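As a quick check of the arithmetic, the offsets needed for a given totals can be listed with range(). The helper name page_offsets is hypothetical, and the page size of 20 is taken from the limit used in the first request above.

```python
def page_offsets(totals, limit):
    """List the offset of every page needed to cover `totals` articles."""
    return list(range(0, totals, limit))


offsets = page_offsets(919, 20)
print(len(offsets), offsets[0], offsets[-1])
# 46 0 900
```

So 919 articles at 20 per page means 46 requests, and the last one (offset 900) returns the remaining 19 articles. The step of the range has to equal the limit sent to the API, otherwise articles are skipped or fetched twice.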

            
import requests
import csv
import openpyxl

# Running article number.
n=0
# utf-8-sig replaces the original gbk, which raises UnicodeEncodeError on
# characters outside GBK; the BOM still lets Excel detect the encoding.
csv_file=open('zhihu.csv','w',newline='',encoding='utf-8-sig')
writer=csv.writer(csv_file)
writer.writerow(['No.','Title','Excerpt','Link'])

# Write the same rows to an Excel workbook as well.
wb=openpyxl.Workbook()
sheet=wb.active
sheet['A1']='No.'
sheet['B1']='Title'
sheet['C1']='Excerpt'
sheet['D1']='Link'

headers={'user-agent':'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36'}
url='https://www.zhihu.com/api/v4/members/zhang-jia-wei/articles?'
# Articles per request; the loop step below must use the same value.
limit=20
params={
'include':'data[*].comment_count,suggest_edit,is_normal,thumbnail_extra_info,thumbnail,can_comment,comment_permission,admin_closed_comment,content,voteup_count,created,updated,upvoted_followees,voting,review_info,is_labeled,label_info;data[*].author.badge[?(type=best_answerer)].topics',
'offset':0,
'limit':limit,
'sort_by':'voteups'
}
# First request: only used to read the total number of articles.
res=requests.get(url,headers=headers,params=params)
html=res.json()
totals=html['paging']['totals']
# The original stepped by 20 while requesting limit=10, which skipped half
# of the articles; stepping by limit covers every article exactly once.
for offset in range(0,totals,limit):
    params['offset']=offset
    res=requests.get(url,headers=headers,params=params)
    html=res.json()
    items=html['data']
    for item in items:
        n+=1
        title=item['title']
        abstract=item['excerpt']
        # Named link so it does not overwrite the API url defined above.
        link=item['url']
        writer.writerow([n,title,abstract,link])
        sheet.append([n,title,abstract,link])
csv_file.close()
wb.save('zhihu.xlsx')
