编程爱好者联盟 (Programming Enthusiasts Alliance) 2016-12-20
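1. Problem: I was using Jsoup to fetch data from several websites. It worked fine at first, but when I requested data from a certain site (某浪), Jsoup threw an exception. Judging by the error message, the cause is the Content-Type of the response: the server returns application/json, which Jsoup does not accept by default.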
Error message:
Exception in thread "main" org.jsoup.UnsupportedMimeTypeException: Unhandled content type. Must be text/*, application/xml, or application/xhtml+xml. Mimetype=application/json; charset=utf-8, URL=...
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:547)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:493)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:205)
    at org.jsoup.helper.HttpConnection.get(HttpConnection.java:194)
    at com.Interface.test.JsoupUtil.httpGet(JsoupUtil.java:30)
    at com.Interface.test.test.main(test.java:23)
Request method:
import java.io.IOException;
import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public static String httpGet(String url, String cookie) throws IOException {
    // Create the connection
    Connection con = Jsoup.connect(url);
    // Set the request headers, especially the cookie
    con.header("Accept", "text/html, application/xhtml+xml, */*");
    con.header("Content-Type", "application/x-www-form-urlencoded");
    con.header("User-Agent", "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)");
    con.header("Cookie", cookie);
    // Send the request and parse the response
    Document doc = con.get();
    // Print the page title
    System.out.println(doc.title());
    return doc.toString();
}
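2. Solution: Just chain ignoreContentType(true) onto Connection con = Jsoup.connect(url);. Here ignoreContentType(true) tells Jsoup to skip its check of the response's Content-Type, so a response such as application/json no longer triggers UnsupportedMimeTypeException.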
After the change:
// Create the connection
Connection con = Jsoup.connect(url).ignoreContentType(true);
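To make it clearer why application/json is rejected and what the flag changes, here is a minimal sketch of the rule quoted in the exception message (text/*, application/xml, or application/xhtml+xml). It is a simplified model written for illustration, not Jsoup's actual source; the class ContentTypeCheck and the methods isParseable and wouldAccept are hypothetical names, and the sketch assumes the header value is non-null.

```java
import java.util.Locale;

// Simplified model of the rule quoted in the exception message.
// NOT Jsoup's implementation; all names here are hypothetical.
class ContentTypeCheck {

    // True if the MIME type is one the error message lists as accepted:
    // text/*, application/xml, or application/xhtml+xml.
    // Assumes contentType is non-null.
    static boolean isParseable(String contentType) {
        // Drop parameters such as "; charset=utf-8" and normalize case.
        String mime = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return mime.startsWith("text/")
                || mime.equals("application/xml")
                || mime.equals("application/xhtml+xml");
    }

    // With the ignore flag set, the check is skipped and any type is accepted.
    static boolean wouldAccept(String contentType, boolean ignoreContentType) {
        return ignoreContentType || isParseable(contentType);
    }

    public static void main(String[] args) {
        String json = "application/json; charset=utf-8";
        System.out.println(wouldAccept(json, false));                       // false
        System.out.println(wouldAccept(json, true));                        // true
        System.out.println(wouldAccept("text/html; charset=utf-8", false)); // true
    }
}
```

Note that ignoreContentType(true) only disables the check. con.get() will still try to parse the body as HTML, so for a JSON response it is probably more useful to read the raw text with con.execute().body() than to return doc.toString().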