BeautifulSoup: scraping the street address
I am using the code at the bottom to get the weblink and the masjid name. However, I would also like to get the denomination and the address. Please help, I am stuck.
Currently I am getting the following:
Weblink:
<div class="subtitleLink"><a href="http://www.salatomatic.com/d/Tempe+5313+Masjid-Al-Hijrah">
and Masjid Name:
<b>Masjid Al-Hijrah</b>
But I would like to get the following:
Denomination:
<b>Denomination:</b> Sunni (Traditional)
and Address:
<br>45 Station Street (Sydney)
The code below scrapes the following HTML:
<td width=25><a href="http://www.salatomatic.com/d/Tempe+5313+Masjid-Al-Hijrah"><img src='http://www.halalfire.com/images/en/photo_small.jpg' alt='Masjid Al-Hijrah' title='Masjid Al-Hijrah' border=0 width=48 height=36></a></a></td><td width=10><img src="http://www.salatomatic.com/images/spacer.gif" width=10 border=0></td><td nowrap><div class="subtitleLink"><a href="http://www.salatomatic.com/d/Tempe+5313+Masjid-Al-Hijrah"><b>Masjid Al-Hijrah</b></a> </div><div class="tinyLink"><b>Denomination:</b> Sunni (Traditional)<br>45 Station Street (Sydney) </div></td><td align=right valign=center><div class="tinyLink"></div></td>
CODE:
from bs4 import BeautifulSoup
import urllib2

url1 = "http://www.salatomatic.com/c/Sydney+168"
content1 = urllib2.urlopen(url1).read()
soup = BeautifulSoup(content1)
results = soup.findAll("div", {"class" : "subtitleLink"})
for result in results :
    br = result.find('b')
    a = result.find('a')
    currenturl = a.get('href')
    if not currenturl.startswith("http"):
        currenturl = "http://www.salatomatic.com" + currenturl
        print currenturl
    elif currenturl.startswith("http"):
        print a.get('href')
    pos = br.get_text()
    print pos
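
One possible approach, going only by the HTML fragment shown above: the denomination and address sit in the `div` with class `tinyLink` that directly follows each `subtitleLink` div, so they might be reached by stepping to that sibling and reading the text nodes after the `<b>` label and the `<br>`. The sketch below is a minimal illustration under that assumption. The names `parse_listings`, `BASE_URL` and `SAMPLE_HTML` are hypothetical, the sample is a shortened copy of the row quoted above, and the real page may be structured differently elsewhere. It passes `"html.parser"` explicitly and avoids `print` statements so that it should run on Python 2 or 3.

```python
from bs4 import BeautifulSoup, NavigableString

try:
    from urllib.parse import urljoin  # Python 3
except ImportError:
    from urlparse import urljoin  # Python 2

# Assumption: base used to resolve relative links, as in the original code.
BASE_URL = "http://www.salatomatic.com"

# Shortened copy of the row quoted in the question, used instead of a live request.
SAMPLE_HTML = (
    '<td nowrap><div class="subtitleLink">'
    '<a href="http://www.salatomatic.com/d/Tempe+5313+Masjid-Al-Hijrah">'
    '<b>Masjid Al-Hijrah</b></a> </div>'
    '<div class="tinyLink"><b>Denomination:</b> Sunni (Traditional)'
    '<br>45 Station Street (Sydney) </div></td>'
)


def text_after(tag):
    """Return the stripped text node right after `tag`, or None if there is none."""
    if tag is None:
        return None
    sibling = tag.next_sibling
    if isinstance(sibling, NavigableString):
        return sibling.strip() or None
    return None


def parse_listings(html):
    """Hypothetical helper: collect url, name, denomination and address per listing."""
    soup = BeautifulSoup(html, "html.parser")
    listings = []
    for title_div in soup.find_all("div", {"class": "subtitleLink"}):
        link = title_div.find("a")
        name_tag = title_div.find("b")
        if link is None or name_tag is None:
            continue  # skip rows that do not match the expected layout

        # The details are assumed to live in the next sibling div.tinyLink.
        details = title_div.find_next_sibling("div", {"class": "tinyLink"})
        denomination = address = None
        if details is not None:
            denomination = text_after(details.find("b"))  # text after <b>Denomination:</b>
            address = text_after(details.find("br"))      # text after <br>

        listings.append({
            "url": urljoin(BASE_URL, link.get("href")),  # absolute hrefs pass through unchanged
            "name": name_tag.get_text(strip=True),
            "denomination": denomination,
            "address": address,
        })
    return listings


for row in parse_listings(SAMPLE_HTML):
    print(row["url"])
    print(row["name"])
    print(row["denomination"])
    print(row["address"])
```

For the live page, the same function could be given the `content1` string fetched in the code above instead of `SAMPLE_HTML`. Using `urljoin` replaces the `startswith("http")` check, since it leaves absolute links as they are and only prefixes relative ones.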