Python BeautifulSoup: extracting text

Question

Jun 24, 2013, 06:14 PM

I would like to extract the bold text, which shows the latest PSI reading, from this site: http://app2.nea.gov.sg/anti-pollution-radiation-protection/air-pollution/psi/psi-readings-over-the-last-24-hours. Does anyone know how to extract it using the code below?

I also need the two values that come just before the current PSI reading so that I can do a calculation. That makes three values in total (the latest one and the two before it).

Example: if the current (bold) value is 5AM: 51, I also need the 3AM and 4AM values. Does anyone know how to do this and can help me? Thanks in advance!

    from pprint import pprint
    import urllib2  # Python 2 only; on Python 3 this module is urllib.request
    from bs4 import BeautifulSoup as soup


    url = "http://app2.nea.gov.sg/anti-pollution-radiation-protection/air-pollution/psi/psi-readings-over-the-last-24-hours"
    web_soup = soup(urllib2.urlopen(url))

    # Locate the first table inside the third nested div of the "c1" container
    table = web_soup.find(name="div", attrs={'class': 'c1'}).find_all(name="div")[2].find_all('table')[0]

    # Collect the stripped cell text of every row
    table_rows = []
    for row in table.find_all('tr'):
        table_rows.append([td.text.strip() for td in row.find_all('td')])

    # Rows alternate: even rows hold the times, the following row holds the values
    data = {}
    for tr_index, tr in enumerate(table_rows):
        if tr_index % 2 == 0:
            for td_index, td in enumerate(tr):
                data[td] = table_rows[tr_index + 1][td_index]

    pprint(data)

This prints:

    {'10AM': '49',
     '10PM': '-',
     '11AM': '52',
     '11PM': '-',
     '12AM': '76',
     '12PM': '54',
     '1AM': '70',
     '1PM': '59',
     '2AM': '64',
     '2PM': '65',
     '3AM': '59',
     '3PM': '72',
     '4AM': '54',
     '4PM': '79',
     '5AM': '51',
     '5PM': '82',
     '6AM': '48',
     '6PM': '79',
     '7AM': '47',
     '7PM': '-',
     '8AM': '47',
     '8PM': '-',
     '9AM': '47',
     '9PM': '-',
     'Time': '3-hr PSI'}
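
One possible way to get the three values from a dict like the one printed above, without relying on the bold markup at all (the page's HTML for the bold cell is not shown here): sort the time labels chronologically, drop the entries that have no numeric reading, and take the last three. This is only a sketch. It assumes the table covers a single day from 12AM to 11PM and that hours not yet published appear as `'-'`. The names `hour_key` and `latest_readings` are hypothetical helpers, not part of the code above.

```python
def hour_key(label):
    """Convert a label such as '12AM' or '5PM' into an hour from 0 to 23."""
    hour = int(label[:-2]) % 12          # '12AM' -> 0, '12PM' -> 0 (12 is added below)
    if label[-2:].upper() == "PM":
        hour += 12
    return hour


def latest_readings(data, count=3):
    """Return the last `count` (time, value) pairs that carry a numeric reading.

    Assumption: non-numeric values ('-' or the 'Time' header cell) mean
    "no reading" and are skipped.
    """
    readings = [
        (label, int(value))
        for label, value in data.items()
        if value.isdigit()
    ]
    readings.sort(key=lambda pair: hour_key(pair[0]))
    return readings[-count:]


# A subset of the dict printed above
sample = {
    'Time': '3-hr PSI',
    '12AM': '76',
    '3PM': '72',
    '4PM': '79',
    '5PM': '82',
    '6PM': '79',
    '7PM': '-',
}

print(latest_readings(sample))
# → [('4PM', 79), ('5PM', 82), ('6PM', 79)]
```

The last pair in the result is the most recent reading and the two before it are the previous hours. If fewer than three numeric readings exist (for example just after midnight), the list is simply shorter.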