
Scraping the Python site's event listings, using the previous section's URL

Topic source

'''
1. This tutorial is really more of an overview than a textbook; a beginner can't understand it from this alone and will need to study plenty of other material.
2. I couldn't find any worked example online of how to obtain the data parameters for the simulated Sina Weibo login; the few that exist are very complicated, and my own attempts to scrape the data failed. I hope Teacher Liao can find time to explain how those parameters are obtained; if any fellow student has figured it out, please share.
3. The URL given in the exercise is clearly wrong, probably expired, so I used the previous section's URL, fetched the page with urllib's request module, and parsed it with bs4. We already did this parsing in the previous section, so this part is easy. Please let me know whether the answer is correct; it runs without problems. Code and results below.
'''

~~~
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

from urllib import request
from bs4 import BeautifulSoup

# Fetch the events page and print only the upcoming-events part of the section text.
with request.urlopen('https://www.python.org/events/python-events/') as f:
    soup = BeautifulSoup(f, 'lxml', from_encoding='utf-8')
    fp = soup.section.text
    print((fp.split('More')[1]).split('You just missed...')[0])
~~~

# Output:

PyOhio 2017

29 July – 31 July 2017 Columbus, Ohio, USA

PyCon AU 2017

03 Aug. – 09 Aug. 2017 Melbourne Convention and Exhibition Centre, 1 Convention Centre Pl, South Wharf VIC 3006, Australia

PyCon KR 2017

12 Aug. – 16 Aug. 2017 COEX 513, Yeongdong-daero, Gangnam-gu Seoul 06164, Republic of Korea

PyCon Amazônia 2017

12 Aug. – 14 Aug. 2017 Manaus, Amazonas, Brazil

DjangoCon US 2017

13 Aug. – 19 Aug. 2017 Spokane, WA, USA

PyCon PL 2017

17 Aug. – 21 Aug. 2017 Hotel Ossa Congress & SPA, Ossa, Poland

~~~
# The same thing with requests instead of urllib:
import requests
from bs4 import BeautifulSoup

f = requests.get('https://www.python.org/events/python-events/')
soup = BeautifulSoup(f.text, 'lxml')
fp = soup.section.text
print((fp.split('More')[1]).split('You just missed...')[0])
~~~
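
Splitting the section text on 'More' and 'You just missed...' works here, but it breaks as soon as that wording changes. A rough sketch of a more structural parse is below; the selectors list-recent-events, event-title and event-location are my assumption about the page's current markup, not something given in the exercise, so check them against the actual HTML first.

~~~
import requests
from bs4 import BeautifulSoup

resp = requests.get('https://www.python.org/events/python-events/')
soup = BeautifulSoup(resp.text, 'lxml')

# Assumed markup: each upcoming event is an <li> inside ul.list-recent-events,
# holding an h3.event-title, a <time> tag and a span.event-location.
for li in soup.select('.list-recent-events li'):
    title = li.find('h3', class_='event-title')
    when = li.find('time')
    where = li.find('span', class_='event-location')
    if title and when and where:
        print(title.get_text(strip=True))
        print(when.get_text(strip=True), where.get_text(strip=True))
        print()
~~~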

井底蛙27

#4 Created at ...

I'm teaching myself Python, could we add each other as friends?

As an architecture student with no programming background at all, this section made me doubt my life and my intelligence. Reading your post, I feel it may not be so bad after all. Could I add you as a friend and hear more about how you worked through it?

抹烦

#6 Created at ...

Why can't I install the BeautifulSoup library? My Python version is 3.5.2; I can only install bs4 and beautifulsoup4, and they're awkward to use.
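
For what it's worth, on Python 3 the library is published on PyPI as beautifulsoup4 and imported as bs4; the old package literally named BeautifulSoup is the Python-2-era 3.x series and generally doesn't work on Python 3, which is probably why it won't install on 3.5.2. A minimal check, assuming pip points at the Python 3.5 interpreter:

~~~
# python3 -m pip install beautifulsoup4 lxml
from bs4 import BeautifulSoup

# Parse a tiny snippet just to confirm the install works.
soup = BeautifulSoup('<p>hello</p>', 'lxml')
print(soup.p.text)  # -> hello
~~~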

