java - Can not connect to website using Jsoup behind company proxy
I want to scrape a webpage using Java's jsoup library, but I am behind a corporate proxy that prevents me from connecting to the webpage. I have researched the problem and know that I have to address the proxy and identify myself to the proxy. However, I am still not able to connect to the webpage. I am trying to test the connection by retrieving the title of www.google.com using the following code:
    import java.io.IOException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    public class Test {
        public static void main(String[] args) {
            System.out.println("1");
            try {
                System.setProperty("http.proxyHost", "myproxy");
                System.setProperty("http.proxyPort", "myport");
                System.setProperty("http.proxyUser", "myuser");
                System.setProperty("http.proxyPassword", "mypassword");
                Document doc = Jsoup.connect("http://google.com").get();
                String title = doc.title();
                System.out.println(title);
            } catch (IOException e) {
                System.out.println(e);
            }
        }
    }

The above code returns the following error:
    org.jsoup.UnsupportedMimeTypeException: Unhandled content type. Must be text/*, application/xml, or application/xhtml+xml. Mimetype=application/x-ns-proxy-autoconfig, URL=http://google.com
This tells me that something is being retrieved, but in a content type that cannot be processed, so I adjusted "Test" to ignore the content type in order to see what is retrieved, using the following code:
    import java.io.IOException;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    public class DemoII {
        public static void main(String[] args) {
            System.out.println("1");
            try {
                System.setProperty("http.proxyHost", "myproxy");
                System.setProperty("http.proxyPort", "myport");
                System.setProperty("http.proxyUser", "myuser");
                System.setProperty("http.proxyPassword", "mypassword");
                String script = Jsoup.connect("http://google.com").ignoreContentType(true).execute().body();
                System.out.println(script);
            } catch (IOException e) {
                System.out.println(e);
            }
        }
    }

It turns out that the "script" string contains source code from the proxy server. So I am making a connection to the proxy, but my request for www.google.com is not going through. Any ideas what I am doing wrong?
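A note on the MIME type in the error: application/x-ns-proxy-autoconfig is the type normally used for proxy auto-config (PAC) files, so the body printed above is probably a PAC script rather than a web page. A PAC script is JavaScript that defines FindProxyForURL(url, host) and returns strings such as "PROXY host:port; DIRECT". The following is only a rough sketch of how one might list the proxy candidates named in such a script. The class name PacProxyFinder, the sample script, and the host proxy.example.com are made up for illustration, and the code does not evaluate the script; it only searches its text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup or the JDK.
public class PacProxyFinder {
    // Matches directives such as: PROXY proxy.example.com:8080
    private static final Pattern PROXY_DIRECTIVE =
            Pattern.compile("PROXY\\s+([A-Za-z0-9.-]+):(\\d{1,5})");

    /** Returns every "host:port" found in a PROXY directive, in order, without duplicates. */
    public static List<String> findProxies(String pacScript) {
        List<String> proxies = new ArrayList<>();
        Matcher m = PROXY_DIRECTIVE.matcher(pacScript);
        while (m.find()) {
            String hostPort = m.group(1) + ":" + m.group(2);
            if (!proxies.contains(hostPort)) {
                proxies.add(hostPort);
            }
        }
        return proxies;
    }

    public static void main(String[] args) {
        // Invented sample; a real PAC file from a company network will look different.
        String samplePac =
                "function FindProxyForURL(url, host) {\n"
              + "  if (isPlainHostName(host)) return \"DIRECT\";\n"
              + "  return \"PROXY proxy.example.com:8080; DIRECT\";\n"
              + "}\n";
        System.out.println(findProxies(samplePac)); // prints [proxy.example.com:8080]
    }
}
```

Because a PAC script can return different proxies for different hosts, the list is only a set of candidates; which one applies to a given URL still has to be read from the script's logic.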
OP finds the solution:
@mcl Hey, thanks. I had no idea what that file did; after you told me what it does, I had a look inside, and there is a proxy name in there that differs from the one I used before, and now it works. – user3182273
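For anyone who lands here with the same problem, below is a minimal sketch of setting the proxy taken from that file and checking which proxy the JVM would actually pick for a URL before making the jsoup call. Everything named here is an assumption for illustration: the class ProxySetupCheck, the host proxy.example.com, port 8080, and the credentials. As far as I know, the JDK's built-in HTTP client does not read http.proxyUser or http.proxyPassword itself, so the sketch passes the credentials through java.net.Authenticator instead.

```java
import java.net.Authenticator;
import java.net.InetSocketAddress;
import java.net.PasswordAuthentication;
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

// Hypothetical helper; host, port, and credentials below are placeholders.
public class ProxySetupCheck {

    /** Sets the standard JVM HTTP proxy properties and registers proxy credentials. */
    public static void configureProxy(String host, int port, String user, String password) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", String.valueOf(port));
        Authenticator.setDefault(new Authenticator() {
            @Override
            protected PasswordAuthentication getPasswordAuthentication() {
                // Only answer challenges that come from a proxy, not from the target site.
                if (getRequestorType() == RequestorType.PROXY) {
                    return new PasswordAuthentication(user, password.toCharArray());
                }
                return null;
            }
        });
    }

    /** Describes the first proxy the JVM's default selector would use for the given URL. */
    public static String proxyFor(String url) {
        List<Proxy> proxies = ProxySelector.getDefault().select(URI.create(url));
        Proxy first = proxies.get(0);
        if (first.type() == Proxy.Type.DIRECT) {
            return "DIRECT";
        }
        InetSocketAddress address = (InetSocketAddress) first.address();
        return address.getHostString() + ":" + address.getPort();
    }

    public static void main(String[] args) {
        configureProxy("proxy.example.com", 8080, "myuser", "mypassword");
        // Shows which proxy a plain http:// request would be routed through.
        System.out.println(proxyFor("http://google.com"));
    }
}
```

If the printed proxy is not the one named in the PAC file, the request will likely keep ending up at the wrong host, as in the question. Recent jsoup versions also offer a proxy(host, port) method on the connection, which may be an alternative to the system properties.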