Python Scrapy dynamic websites

Question

Jan 19, 2014, 08:33 PM

I am trying to scrape a very simple web page with Scrapy and its XPath selectors, but for some reason the selectors I have do not work in Scrapy, even though they work in other XPath utilities.

I am trying to parse this HTML snippet:

<select id="chapterMenu" name="chapterMenu">
<option value="/111-3640-1/20th-century-boys/chapter-1.html" selected="selected">Chapter 1: Friend</option>
<option value="/111-3641-1/20th-century-boys/chapter-2.html">Chapter 2: Karaoke</option>
<option value="/111-3642-1/20th-century-boys/chapter-3.html">Chapter 3: The Boy Who Bought a Guitar</option>
<option value="/111-3643-1/20th-century-boys/chapter-4.html">Chapter 4: Snot Towel</option>
<option value="/111-3644-1/20th-century-boys/chapter-5.html">Chapter 5: Night of the Science Room</option>
</select>

Scrapy parse_item code:

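def parse_item(self, response):
    # Build a loader bound to the downloaded response
    itemLoader = XPathItemLoader(item=MangaItem(), response=response)
    # Take the text of the currently selected <option> as the chapter name
    itemLoader.add_xpath('chapter', '//select[@id="chapterMenu"]/option[@selected="selected"]/text()')
    return itemLoader.load_item()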

Scrapy does not extract any text from this, but if I take the same XPath and HTML snippet and run them here, it works fine.
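For reference, the kind of check described above (running the expression against the static snippet, outside Scrapy) can be sketched roughly as follows. This is only an illustration: it uses Python's standard-library xml.etree.ElementTree rather than Scrapy's selectors or the tool I actually used. ElementTree supports only a subset of XPath, so the trailing /text() step is replaced by reading the element's .text attribute. The wrapping <div>, the shortened snippet, and the helper name selected_chapter are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the snippet from the question, wrapped in a <div>
# (an assumption for this sketch) so that <select> is a descendant of the root.
HTML_SNIPPET = """
<div>
<select id="chapterMenu" name="chapterMenu">
<option value="/111-3640-1/20th-century-boys/chapter-1.html" selected="selected">Chapter 1: Friend</option>
<option value="/111-3641-1/20th-century-boys/chapter-2.html">Chapter 2: Karaoke</option>
</select>
</div>
"""


def selected_chapter(markup):
    """Return the text of every <option> marked selected inside #chapterMenu."""
    root = ET.fromstring(markup.strip())
    # ElementTree has no text() step, so select the elements and read .text.
    matches = root.findall(
        ".//select[@id='chapterMenu']/option[@selected='selected']"
    )
    return [option.text for option in matches]


if __name__ == "__main__":
    print(selected_chapter(HTML_SNIPPET))
```

This only confirms that the expression matches the static markup shown above. It says nothing about whether the response Scrapy downloads actually contains that markup.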

If I use this XPath: