Getting page source with HtmlUnit: the URL hangs

I am trying to fetch the page source of the following URL using HtmlUnit's getPage method.

http://denydesigns.com/collections/barbara-sherman-fleece-throw-blanket/products/barbara-sherman-antique-fleece-throw-blanket

The call gets stuck somewhere. I have tried to find the cause, but I cannot work it out. I also checked whether the thread used by HtmlUnit is BLOCKED or WAITING, but it is neither.

Below is the log generated by HtmlUnit.
18 Jan 2013 04:14:47,832 -  main - ERROR - com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter.runtimeError(StrictErrorReporter.java:79) - runtimeError: message=[The data necessary to complete this operation is not yet available.] sourceName=[http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js] line=[16] lineSource=[null] lineOffset=[0]
18 Jan 2013 04:14:47,924 -  main -  WARN - com.gargoylesoftware.htmlunit.javascript.host.html.HTMLDocument.jsxFunction_getElementById(HTMLDocument.java:1049) - getElementById(script1358500487923) did a getElementByName for Internet Explorer
18 Jan 2013 04:14:49,498 -  main - ERROR - com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter.runtimeError(StrictErrorReporter.java:79) - runtimeError: message=[The data necessary to complete this operation is not yet available.] sourceName=[http://code.jquery.com/jquery-latest.js] line=[911] lineSource=[null] lineOffset=[0]
18 Jan 2013 04:14:49,565 -  main -  WARN - com.gargoylesoftware.htmlunit.javascript.host.html.HTMLDocument.jsxFunction_getElementById(HTMLDocument.java:1049) - getElementById(sizzle-1358500489525) did a getElementByName for Internet Explorer
18 Jan 2013 04:14:53,047 -  main -  WARN - com.gargoylesoftware.htmlunit.javascript.host.ActiveXObject.jsConstructor(ActiveXObject.java:128) - Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash.7'.
18 Jan 2013 04:14:53,048 -  main - ERROR - com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter.runtimeError(StrictErrorReporter.java:79) - runtimeError: message=[Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash.7'.] sourceName=[http://www.google-analytics.com/ga.js] line=[18] lineSource=[null] lineOffset=[0]
18 Jan 2013 04:14:53,060 -  main -  WARN - com.gargoylesoftware.htmlunit.javascript.host.ActiveXObject.jsConstructor(ActiveXObject.java:128) - Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash.6'.
18 Jan 2013 04:14:53,061 -  main - ERROR - com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter.runtimeError(StrictErrorReporter.java:79) - runtimeError: message=[Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash.6'.] sourceName=[http://www.google-analytics.com/ga.js] line=[18] lineSource=[null] lineOffset=[0]
18 Jan 2013 04:14:53,061 -  main -  WARN - com.gargoylesoftware.htmlunit.javascript.host.ActiveXObject.jsConstructor(ActiveXObject.java:128) - Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash'.
18 Jan 2013 04:14:53,062 -  main - ERROR - com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter.runtimeError(StrictErrorReporter.java:79) - runtimeError: message=[Automation server can't create object for 'ShockwaveFlash.ShockwaveFlash'.] sourceName=[http://www.google-analytics.com/ga.js] line=[18] lineSource=[null] lineOffset=[0]
18 Jan 2013 04:14:53,829 -  main - ERROR - com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter.runtimeError(StrictErrorReporter.java:79) - runtimeError: message=[The data necessary to complete this operation is not yet available.] sourceName=[http://chat.livechatinc.net/licence/1051689/script.cgi?lang=en&groups=0] line=[60] lineSource=[null] lineOffset=[0]
18 Jan 2013 04:14:54,878 -  main - ERROR - com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter.runtimeError(StrictErrorReporter.java:79) - runtimeError: message=[The data necessary to complete this operation is not yet available.] sourceName=[http://platform.twitter.com/widgets.js] line=[5] lineSource=[null] lineOffset=[0]
18 Jan 2013 04:14:56,215 -  main -  WARN - com.gargoylesoftware.htmlunit.javascript.host.html.HTMLDocument.jsxFunction_getElementById(HTMLDocument.java:1049) - getElementById(sizzle-1358500496196) did a getElementByName for Internet Explorer
18 Jan 2013 04:14:56,458 -  main -  WARN - com.gargoylesoftware.htmlunit.javascript.host.html.HTMLDocument.jsxFunction_execCommand(HTMLDocument.java:1590) - Nothing done for execCommand(BackgroundImageCache, ...) (feature not implemented)
18 Jan 2013 04:14:58,086 -  main -  WARN - com.gargoylesoftware.htmlunit.javascript.host.html.HTMLDocument.jsxFunction_getElementById(HTMLDocument.java:1049) - getElementById(sizzle-1358500489525) did a getElementByName for Internet Explorer
And here is the thread dump of the process (taken with jstack):
2013-01-18 04:17:46
Full thread dump Java HotSpot(TM) 64-Bit Server VM (22.1-b02 mixed mode):

"Attach Listener" daemon prio=10 tid=0x0000000002955000 nid=0x16dd waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Service Thread" daemon prio=10 tid=0x00007feca00cc800 nid=0x154f runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x00007feca00ca000 nid=0x154e waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x00007feca00c7000 nid=0x154d waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x00007feca00c5000 nid=0x154c runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x00007feca007c800 nid=0x154b in Object.wait() [0x00007fec9fffe000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000c2369e20> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        - locked <0x00000000c2369e20> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

"Reference Handler" daemon prio=10 tid=0x00007feca007a000 nid=0x154a in Object.wait() [0x00007feca4157000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000000c23699e0> (a java.lang.ref.Reference$Lock)
        at java.lang.Object.wait(Object.java:503)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
        - locked <0x00000000c23699e0> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x00000000025d9000 nid=0x1546 runnable [0x00007fecaa8b6000]
   java.lang.Thread.State: RUNNABLE
        at net.sourceforge.htmlunit.corejs.javascript.ScriptableObject.getTopLevelScope(ScriptableObject.java:2007)
        at com.gargoylesoftware.htmlunit.javascript.SimpleScriptable.getWindow(SimpleScriptable.java:303)
        at com.gargoylesoftware.htmlunit.javascript.SimpleScriptable.getWindow(SimpleScriptable.java:293)
        at com.gargoylesoftware.htmlunit.javascript.SimpleScriptable.getPrototype(SimpleScriptable.java:251)
        at com.gargoylesoftware.htmlunit.javascript.host.html.HTMLCollection.<init>(HTMLCollection.java:99)
        at com.gargoylesoftware.htmlunit.javascript.host.html.HTMLCollection.<init>(HTMLCollection.java:110)
        at com.gargoylesoftware.htmlunit.javascript.host.HTMLCollectionFrames.<init>(Window.java:1751)
        at com.gargoylesoftware.htmlunit.javascript.host.Window.getFrames(Window.java:759)
        at com.gargoylesoftware.htmlunit.javascript.host.Window.jsxGet_length(Window.java:749)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at net.sourceforge.htmlunit.corejs.javascript.MemberBox.invoke(MemberBox.java:172)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptableObject$GetterSlot.getValue(ScriptableObject.java:342)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptableObject.getImpl(ScriptableObject.java:2523)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptableObject.get(ScriptableObject.java:438)
        at com.gargoylesoftware.htmlunit.javascript.SimpleScriptable.get(SimpleScriptable.java:75)
        at com.gargoylesoftware.htmlunit.javascript.host.Window.get(Window.java:1226)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptableObject.getProperty(ScriptableObject.java:2088)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.getObjectProp(ScriptRuntime.java:1527)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.getObjectProp(ScriptRuntime.java:1513)
        at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpretLoop(Interpreter.java:1398)
        at net.sourceforge.htmlunit.corejs.javascript.Interpreter.interpret(Interpreter.java:854)
        at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:164)
        at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.doTopCall(ContextFactory.java:429)
        at com.gargoylesoftware.htmlunit.javascript.HtmlUnitContextFactory.doTopCall(HtmlUnitContextFactory.java:267)
        at net.sourceforge.htmlunit.corejs.javascript.ScriptRuntime.doTopCall(ScriptRuntime.java:3183)
        at net.sourceforge.htmlunit.corejs.javascript.InterpretedFunction.call(InterpretedFunction.java:162)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$4.doRun(JavaScriptEngine.java:538)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine$HtmlUnitContextAction.run(JavaScriptEngine.java:589)
        - locked <0x00000000c274d308> (a com.gargoylesoftware.htmlunit.html.HtmlPage)
        at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:537)
        at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:538)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:545)
        at com.gargoylesoftware.htmlunit.javascript.JavaScriptEngine.callFunction(JavaScriptEngine.java:520)
        at com.gargoylesoftware.htmlunit.html.HtmlPage.executeJavaScriptFunctionIfPossible(HtmlPage.java:896)
        at com.gargoylesoftware.htmlunit.javascript.host.EventListenersContainer.executeEventListeners(EventListenersContainer.java:162)
        at com.gargoylesoftware.htmlunit.javascript.host.EventListenersContainer.executeBubblingListeners(EventListenersContainer.java:221)
        at com.gargoylesoftware.htmlunit.javascript.host.Node.fireEvent(Node.java:735)
        at com.gargoylesoftware.htmlunit.html.HtmlElement$2.run(HtmlElement.java:866)
        at net.sourceforge.htmlunit.corejs.javascript.Context.call(Context.java:537)
        at net.sourceforge.htmlunit.corejs.javascript.ContextFactory.call(ContextFactory.java:538)
        at com.gargoylesoftware.htmlunit.html.HtmlElement.fireEvent(HtmlElement.java:871)
        at com.gargoylesoftware.htmlunit.html.HtmlPage.executeEventHandlersIfNeeded(HtmlPage.java:1162)
        at com.gargoylesoftware.htmlunit.html.HtmlPage.initialize(HtmlPage.java:202)
        at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseInto(WebClient.java:440)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:311)
        at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:389)

"VM Thread" prio=10 tid=0x00007feca0072800 nid=0x1549 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x00000000025e4000 nid=0x1547 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x00000000025e5800 nid=0x1548 runnable

"VM Periodic Task Thread" prio=10 tid=0x00007feca00d7800 nid=0x1550 waiting on condition

JNI global references: 317
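
The same state check can also be done from inside the JVM instead of with jstack. Below is a minimal sketch using only JDK classes; the names ThreadStateDump and snapshot are made up for illustration and are not part of HtmlUnit. In the dump above, main is RUNNABLE inside the JavaScript interpreter loop (Interpreter.interpretLoop), which is consistent with (but does not prove) a script that simply keeps running rather than a thread waiting on a lock.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of HtmlUnit: lists the state of every live thread.
class ThreadStateDump {

    /** Returns "name#id" -> state for each live thread in this JVM, sorted by key. */
    static Map<String, Thread.State> snapshot() {
        Map<String, Thread.State> states = new TreeMap<String, Thread.State>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            // The id is appended because thread names are not guaranteed to be unique.
            states.put(t.getName() + "#" + t.getId(), t.getState());
        }
        return states;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Thread.State> e : snapshot().entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Calling snapshot() from a second thread while getPage is running would show whether main stays RUNNABLE over time.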

I am not sure why the call hangs on this URL. It never returns from the method. Can anybody look into this?

UPDATE: com.gargoylesoftware.htmlunit.html.HTMLParser.HtmlUnitDOMBuilder.parse(XMLInputSource)

    @Override
    public void parse(final XMLInputSource inputSource) throws XNIException, IOException {
        final HtmlUnitDOMBuilder oldBuilder = page_.getBuilder();
        page_.setBuilder(this);
        try {
            super.parse(inputSource);
        }
        finally {
            page_.setBuilder(oldBuilder);
        }
    }

I attached the HtmlUnit source code and stepped through it in the debugger. The method above never runs to completion.

In addition, I have set the timeout as follows:

webClient.setTimeout(120000);

So why does it not give up after 2 minutes and throw some kind of timeout exception?
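
As I read the HtmlUnit Javadoc, setTimeout applies to the web connection (connecting and reading the response), not to JavaScript that runs once the response has arrived, which would fit the behaviour seen here. One generic way to enforce a hard wall-clock limit is to run the call on a worker thread and bound the wait with Future.get. This is only a sketch using JDK classes, not an HtmlUnit feature; TimeBoxedCall and callWithTimeout are hypothetical names, and the sleeping task stands in for the real call.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of HtmlUnit: bounds how long the caller waits for a task.
class TimeBoxedCall {

    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis)
            throws InterruptedException, ExecutionException, TimeoutException {
        // Daemon worker: if the task never stops, it will not keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r, "time-boxed-call");
                t.setDaemon(true);
                return t;
            }
        });
        try {
            Future<T> future = executor.submit(task);
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // Requests interruption; code that never checks the interrupt flag keeps running.
                future.cancel(true);
                throw e;
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        try {
            callWithTimeout(new Callable<String>() {
                @Override
                public String call() throws Exception {
                    Thread.sleep(2000); // stand-in for a call that takes too long
                    return "finished";
                }
            }, 200);
        } catch (TimeoutException e) {
            System.out.println("gave up waiting");
        }
    }
}
```

In the scenario above, the Callable would wrap the webClient.getPage(...) call. Note that this only stops the caller from waiting: a CPU-bound script that ignores interruption may keep running on the worker thread in the background.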
