After a successful login, how do I scrape the pages that follow?

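After logging in to the Xiaomi store (小米商城) successfully, I want to scrape the user information on the personal center page. When I fetch the page directly, I find that the user information is loaded via AJAX and rendered from a JS template, so it is not in the returned HTML.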

```python
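        # Assumes: import json, time; from pprint import pprint; from bs4 import BeautifulSoup
        # POST the login form; `_dc` is a millisecond timestamp
        url = 'https://account.xiaomi.com/pass/serviceLoginAuth2?_dc=' + str(int(time.time() * 1000))
        res = self.sess.post(url, data=data)
        # The response body is JSON prefixed with '&&&START&&&'
        data = json.loads(res.content.decode('utf-8').replace('&&&START&&&', ''))
        if data['desc'] == '成功':  # the API returns '成功' ("success") on login
            pprint(data)
            res2 = self.sess.get('https://www.mi.com/user/portal')
            res2.encoding = 'utf-8'
            print(res2.text)
            soup = BeautifulSoup(res2.text, 'lxml')
            # Not found: this element is rendered by JS, so select_one returns None
            img_tag = soup.select_one('div.portal-content-box div.user-card>img')
            if img_tag is not None:
                print(img_tag['alt'], img_tag['src'])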
```
![image](https://user-images.githubusercontent.com/34160512/113375039-241f9780-93a1-11eb-8060-9ea577f92d7f.png)
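
The `select_one` call above finds nothing because the user card is not an element in the HTML the server sends. A minimal sketch of how to confirm this offline, using only the standard library: `SAMPLE_HTML` is a made-up stand-in for the portal page (the real markup, ids and template syntax may differ), and `TagCounter` / `inspect_static_html` are hypothetical helper names. An HTML parser treats the body of a `<script>` tag as raw text, so an `<img>` inside a template script never appears as an element.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the portal page; the real markup may differ.
SAMPLE_HTML = """
<div class="portal-content-box">
  <div id="J_userCard"></div>
</div>
<script type="text/template" id="userCardTpl">
  <div class="user-card"><img src="{{icon}}" alt="{{userName}}"></div>
</script>
"""


class TagCounter(HTMLParser):
    """Count real <img> elements and collect raw <script> bodies."""

    def __init__(self):
        super().__init__()
        self.img_count = 0
        self.in_script = False
        self.script_bodies = []

    def handle_starttag(self, tag, attrs):
        if tag == 'img':
            self.img_count += 1
        elif tag == 'script':
            self.in_script = True
            self.script_bodies.append('')

    def handle_endtag(self, tag):
        if tag == 'script':
            self.in_script = False

    def handle_data(self, data):
        # Script content arrives as plain text, possibly in several chunks.
        if self.in_script:
            self.script_bodies[-1] += data


def inspect_static_html(html):
    """Return (number of real <img> elements, list of script bodies)."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.img_count, parser.script_bodies


img_count, scripts = inspect_static_html(SAMPLE_HTML)
print(img_count)                  # 0: no <img> element in the static HTML
print('user-card' in scripts[0])  # True: the markup exists only as template text
```

If the real page behaves like this sample, the data has to come from whatever request the page's JavaScript makes, not from the HTML itself.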

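I then switched to calling the API directly, but the response has status code 500 and the body says the request source is invalid ("请求来源不合法").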

```python
        url = 'https://account.xiaomi.com/pass/serviceLoginAuth2?_dc=' + str(int(time.time() * 1000))
        res = self.sess.post(url, data=data)
        data = json.loads(res.content.decode('utf-8').replace('&&&START&&&', ''))
        if data['desc'] == '成功':  # the API returns '成功' ("success") on login
            pprint(data)
            self.sess.get('https://www.mi.com/user/portal')
            self.sess.headers[quote(':authority')] = 'api2.service.order.mi.com'
            self.sess.headers[quote(':method')] = 'GET'
            self.sess.headers[quote(':path')] = '/user/userinfo'
            self.sess.headers[quote(':scheme')] = 'https'
            # Call the API directly; this returns status code 500
            res2 = self.sess.get('https://api2.service.order.mi.com/user/userinfo')
            res2.encoding = 'utf-8'  # set the encoding before reading res2.text
            print(res2.text)
            data2 = json.loads(res2.text)
            pprint(data2)
```
![image](https://user-images.githubusercontent.com/34160512/113375827-e02d9200-93a2-11eb-9a55-dc402c29e71e.png)
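
Two observations on the attempt above, offered as a guess and not as a confirmed fix. First, `:authority`, `:method`, `:path` and `:scheme` are HTTP/2 pseudo-headers that browser dev tools display; `requests` uses HTTP/1.1 by default, and, assuming `quote` is `urllib.parse.quote`, `quote(':authority')` produces the literal header name `%3Aauthority`, which a server has no reason to recognize. Second, an error like "request source is invalid" often means the server checks `Referer` and/or `Origin`; whether this particular API does so is an assumption. The sketch below builds such headers with the standard library only. `build_api_headers` and `drop_pseudo_headers` are hypothetical helper names, and the `Accept` value is just a typical browser default.

```python
from urllib.parse import quote, urlsplit


def build_api_headers(referer):
    """Headers a browser would typically attach to an XHR made from `referer`."""
    parts = urlsplit(referer)
    return {
        'Referer': referer,
        'Origin': f'{parts.scheme}://{parts.netloc}',
        'Accept': 'application/json, text/plain, */*',
    }


def drop_pseudo_headers(headers):
    """Remove HTTP/2 pseudo-headers (raw or percent-encoded) from a header dict."""
    return {
        name: value
        for name, value in headers.items()
        if not name.startswith((':', '%3A'))
    }


# What the original snippet actually added to the session headers:
print(quote(':authority'))  # %3Aauthority

headers = build_api_headers('https://www.mi.com/user/portal')
print(headers['Origin'])    # https://www.mi.com
```

With a logged-in session (`self.sess` in the code above), these headers could be passed per request, for example `self.sess.get('https://api2.service.order.mi.com/user/userinfo', headers=build_api_headers('https://www.mi.com/user/portal'))`. If the API still rejects the call, the next thing to compare is the full request in the browser's network panel (cookies, query parameters) against what the session sends.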

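How can I solve the problems above?
Is driving a browser with selenium + webdriver the only way to scrape this page? If so, what is the point of implementing the JS-encrypted login?
I'd be glad to discuss this with anyone who has run into the same thing...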
