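While writing a crawler recently, I ran into a rather unusual set of request headers:
:authority, :method, :path, :scheme
These are HTTP/2 pseudo-headers. After several attempts, either the request headers could not be parsed or the response was not the one I expected. This cost me at least a day, so here is the whole search process as well. Lantern could not get me through to Google, Baidu was painful and none of its results answered the question, and Bing turned out to be a workable stand-in for Google. The queries I tried:
python requests http2伪请求 (HTTP/2 pseudo request)
python requests http2
python requests :authority
ValueError: Invalid header name b':authority'
hyper authority, :method, :path, :scheme
Bing has won me over. It comes in a domestic and an international version: http://cn.bing.com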
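Normally the request headers are plain HTTP/1.x headers, like this:

    import requests

    def getHeaders():
        # Ordinary HTTP/1.x headers: no name starts with ":"
        headers = {
            "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36",
            "X-Requested-With": "XMLHttpRequest"
        }
        return headers

    # url_search and playload (the form data) are defined elsewhere in the crawler
    r = requests.post(url_search, data=playload, headers=getHeaders())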
Putting the pseudo-headers into that dict in the same way failed with the following error:
ValueError: Invalid header name b':authority'
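After some searching (skip Baidu, it is weak; Google is unreachable, so Bing it is), I found the answer in the following places:
HTTP headers - Requests - Python. The problem turned out to be that the underlying source code cannot parse header names like these.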
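The error text quoted above matches, as far as I can tell, what the standard library's http.client raises for a header name that starts with a colon. That requests was sitting on top of it (via urllib3) in my setup is an assumption about the versions involved. A minimal sketch that reproduces the message without sending anything over the network; `example.invalid` is just a placeholder host:

```python
import http.client

# Creating the connection object does not open a socket.
conn = http.client.HTTPConnection("example.invalid")

# Start a request so that headers may be added; this only fills a buffer.
conn.putrequest("GET", "/")

error_message = None
try:
    # HTTP/1.x header names may not begin with ":", so this is rejected.
    conn.putheader(":authority", "example.invalid")
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

So the check happens before anything is sent, which is why changing the header values never helps: the name itself is refused.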
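Does that mean patching the source? No need. A problem like this is bound to have a library for it, and sure enough hyper handles HTTP/2. Combined with requests, the code looks like this:

    import requests
    from hyper.contrib import HTTP20Adapter

    def getHeaders():
        headers = {
            # HTTP/2 pseudo-headers, copied from the browser's request
            ":authority": "xxx.com",
            ":method": "POST",
            ":path": "/search/keyWords/more",
            ":scheme": "https",
            "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36",
            "X-Requested-With": "XMLHttpRequest"
        }
        return headers

    sessions = requests.session()
    # Send every URL that starts with this prefix through hyper's HTTP/2 adapter
    sessions.mount('https://xxxx.com', HTTP20Adapter())
    # url_search and playload (the form data) are defined elsewhere in the crawler
    r = sessions.post(url_search, data=playload, headers=getHeaders())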
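The four pseudo-headers in that dict simply restate the HTTP method and the pieces of the target URL, so they can be derived instead of hard-coded. A minimal sketch using only the standard library; the helper name `pseudo_headers` is made up for illustration, and it assumes the URL carries no `user:password@` part:

```python
from urllib.parse import urlsplit


def pseudo_headers(method, url):
    """Build the HTTP/2 request pseudo-headers for a method and URL."""
    parts = urlsplit(url)
    # :path is the path plus the query string; an empty path becomes "/".
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return {
        ":authority": parts.netloc,  # host, plus port if one is given
        ":method": method.upper(),
        ":path": path,
        ":scheme": parts.scheme,
    }


# Hypothetical URL built from the placeholder values used in this post.
example = pseudo_headers("post", "https://xxx.com/search/keyWords/more")
print(example)
```

The result can be merged with the regular headers, for example `{**pseudo_headers("POST", url_search), "X-Requested-With": "XMLHttpRequest"}`, which keeps the pseudo-headers in step with the URL actually being requested.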
References: