I based this on the expert's post below; it really does seem to be the most concise version.
Which option do I have to pick so the code formatting doesn't get garbled? QAQ
from urllib import request
import re

# Regex patterns for the three fields pulled out of the page HTML
Name = r'<a href=".*">(.*)</a><br/>'
Location = r'<a href=".*" title=".*">(.*)</a></span>'
Time = r'<time datetime="(.*)T.*">.*</time>'


def main():
    URL = 'https://www.python.org/jobs/'
    data = delInf(URL)


def delInf(URL):
    # datalist = []
    # Download the page and decode the bytes as UTF-8 text
    strInf = request.urlopen(URL).read().decode('utf-8')
    name = re.findall(Name, strInf)
    location = re.findall(Location, strInf)
    time = re.findall(Time, strInf)
    # Print the fields entry by entry, matched up by position
    for index in range(0, len(name)):
        print('Conference name: ' + name[index])
        print('Conference location: ' + location[index])
        print('Conference date: ' + time[index] + '\n')


if __name__ == '__main__':
    main()
I tested the formatting somewhere else first, so hopefully it won't get garbled again. If it does, I'm giving up on posting QAQ
晴天也想拥有一只猫咪
I didn't use the teacher's URL. The datalist was already in the expert's original code, but it didn't do anything, so I commented it out.
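For anyone curious how the unused datalist could be put to work, here is a rough sketch, not the original author's method. It collects the matches into a list and returns it instead of printing inside the function. The name parse_entries, the dictionary keys, and SAMPLE_HTML are all made up for illustration. The sample markup is only an assumption about what the page looks like, and the patterns are switched to non-greedy matching as a precaution.

```python
import re

# Non-greedy variants of the three patterns (an assumption, not the original code):
# ".*?" stops at the first possible match instead of the last one on the line.
NAME = re.compile(r'<a href=".*?">(.*?)</a><br/>')
LOCATION = re.compile(r'<a href=".*?" title=".*?">(.*?)</a></span>')
TIME = re.compile(r'<time datetime="(.*?)T.*?">.*?</time>')


def parse_entries(html):
    """Return one dict per entry found in the given HTML text."""
    names = NAME.findall(html)
    locations = LOCATION.findall(html)
    dates = TIME.findall(html)
    datalist = []
    # zip stops at the shortest list, so a missing field cannot raise IndexError
    for name, location, date in zip(names, locations, dates):
        datalist.append({'name': name, 'location': location, 'date': date})
    return datalist


# Hypothetical sample markup, used so the sketch runs without network access
SAMPLE_HTML = '''
<a href="/jobs/1/">Backend Developer</a><br/>
<span><a href="/jobs/location/berlin/" title="More jobs in Berlin">Berlin, Germany</a></span>
<time datetime="2024-01-15T10:00:00+00:00">15 January 2024</time>
<a href="/jobs/2/">Data Engineer</a><br/>
<span><a href="/jobs/location/remote/" title="More jobs remote">Remote</a></span>
<time datetime="2024-02-01T08:30:00+00:00">01 February 2024</time>
'''

if __name__ == '__main__':
    for entry in parse_entries(SAMPLE_HTML):
        print(entry['name'], entry['location'], entry['date'])
```

Returning the list keeps the download, the parsing, and the printing separate, so the parsing part can be checked against a saved page without hitting the site every time.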