Scraping CSS Templates from 模板之家 (cssmoban.com) with Python

System 2019-09-27 17:51:30

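This script targets Python 2.7.9 and was tested successfully on Windows 8. Scraping is rather slow: the pages and downloads are fetched one after another. I had meant to use multithreading but got busy and left it out. The index number in the 模板之家 URLs does not match the page number, and I did not take the time to work out the mapping, so edit the url in the code by hand to scrape a different page. Experts, please go easy on me!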

The code is as follows:

          
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # by ustcwq
    # 2015-03-15

    import os
    import time
    import urllib
    import urllib2

    from bs4 import BeautifulSoup

    start = time.clock()

    # Output folder; the name means "templates scraped from 模板之家".
    # os.getcwdu() returns unicode, so a non-ASCII working directory does not break the join.
    path = os.getcwdu() + u'/模板之家抓取的模板/'
    if not os.path.isdir(path):
        os.mkdir(path)

    # How does the source site number the index pages? Edit this URL by hand.
    url = "http://www.cssmoban.com/cssthemes/index_80.shtml"
    theme_url = 'http://www.cssmoban.com/cssthemes/'

    # Fetch the listing page and collect the link to each template's detail page.
    response = urllib2.urlopen(url)
    soup = BeautifulSoup(response, 'html.parser')
    result = soup.select('p[class="title"] a')
    print result

    for item in result:
        link = item['href']
        # down_name = item.text  # file name
        new_url = theme_url + link.split('/')[-1]
        response = urllib2.urlopen(new_url)
        detail = BeautifulSoup(response, 'html.parser')
        buttons = detail.select('.btn a')
        down_url = buttons[1]['href']  # link to the file

        # Name the archive after the current timestamp.
        local = path + time.strftime('%Y%m%d%H%M%S', time.localtime(time.time())) + '.zip'
        urllib.urlretrieve(down_url, local)  # fetch the remote file and save it locally

    end = time.clock()
    print u'Template scraping finished!'
    print u'Total time:', end - start, u'seconds'
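The multithreading that was left out could be sketched roughly as below. This is not part of the original script. It assumes Python 3 (where `concurrent.futures` and `urllib.request` are in the standard library) rather than the 2.7.9 used above, and the names `local_name`, `fetch_to_file`, and `download_all` are hypothetical helpers introduced only for illustration. Files are named after their position and the last segment of the URL instead of a timestamp, because timestamps with one-second resolution could collide once several downloads run at the same time.

```python
import os
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlsplit


def local_name(down_url, index):
    """Build a file name from the URL's last path segment (hypothetical helper).

    The index prefix keeps names unique even when two URLs share a basename.
    """
    base = os.path.basename(urlsplit(down_url).path) or 'template.zip'
    return '%03d_%s' % (index, base)


def fetch_to_file(down_url, local):
    """Default fetcher: download one URL to a local path."""
    urllib.request.urlretrieve(down_url, local)


def download_all(down_urls, dest_dir, fetch=fetch_to_file, workers=4):
    """Download every URL into dest_dir using a small thread pool.

    Returns the local paths in the same order as down_urls.
    """
    os.makedirs(dest_dir, exist_ok=True)
    targets = [os.path.join(dest_dir, local_name(u, i))
               for i, u in enumerate(down_urls)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every task and re-raises the first download error.
        list(pool.map(fetch, down_urls, targets))
    return targets
```

To use it with the script above, the loop would append each `down_url` to a list instead of downloading immediately, and a single call such as `download_all(down_urls, path)` would follow the loop. The `fetch` parameter exists so the pool logic can be tried out with a stand-in function before hitting the real site; a small `workers` value keeps the load on the server modest.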

That is everything for this article; I hope you find it useful.
