Loading Ajax with Python Requests

Question

For a personal project, I'm trying to get the full friends list of a user (myself, for now) from Facebook using Requests and BeautifulSoup. The main friends page, however, displays only 20 friends; the rest are loaded via Ajax as you scroll down.

The request URL looks something like this (the method is GET):

https://www.facebook.com/ajax/pagelet/generic.php/AllFriendsAppCollectionPagelet?dpr=1&data={"collection_token":"1244314824:2256358349:2","cursor":"MDpub3Rfc3RydWN0dXJlZDoxMzU2MDIxMTkw","tab_key":"friends","profile_id":1244214828,"overview":false,"ftid":null,"order":null,"sk":"friends","importer_state":null}&__user=1364274824&__a=1&__dyn=aihaFayfyGmagngDxfIJ3G85oWq2WiWF298yeqrWo8popyUW3F6wAxu13y78awHx24UJi28cWGzEgDKuEjKeCxicxabwTz9UcTCxaFEW58nVV8-cxnxm1typ9Voybx24oqyUf9UgC_UrQ4bBv-2jAxEhw&__af=o&__req=5&__be=-1&__pc=EXP1:DEFAULT&__rev=2677430&__srp_t=1474288976
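To make the structure of that URL easier to see, here is a minimal sketch that splits it into its query parameters using only the standard library. The URL in the snippet is a trimmed copy of the one above (only dpr, data, __user and __a are kept). The data parameter is plain JSON, and the cursor value decodes as base64 into readable text; what the decoded fields actually mean is not something the URL itself reveals.

```python
import base64
import json
from urllib.parse import parse_qs, urlsplit

# Trimmed copy of the captured URL: only dpr, data, __user and __a are kept.
url = (
    'https://www.facebook.com/ajax/pagelet/generic.php/'
    'AllFriendsAppCollectionPagelet?dpr=1&data={"collection_token":'
    '"1244314824:2256358349:2","cursor":"MDpub3Rfc3RydWN0dXJlZDoxMzU2MDIxMTkw",'
    '"tab_key":"friends","profile_id":1244214828,"overview":false,"ftid":null,'
    '"order":null,"sk":"friends","importer_state":null}&__user=1364274824&__a=1'
)

# Split the query string into a dict of parameter name -> list of values.
query = parse_qs(urlsplit(url).query)

# The "data" parameter is a JSON object embedded in the query string.
data = json.loads(query['data'][0])

# The cursor is base64-encoded ASCII text.
cursor_text = base64.b64decode(data['cursor']).decode('ascii')

print(sorted(data))
print(cursor_text)
```

The decoded cursor looks like three colon-separated fields, the last of which is numeric, so at least that token is an encoded value rather than an opaque hash.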

My question: is it possible to recreate the dynamically generated tokens, such as __dyn, cursor, and collection_token, so that I can send them manually in my request? Is there some way to figure out how they are generated, or is it a lost cause?
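For illustration, sending the request manually would amount to something like the sketch below, which rebuilds the query string from values captured in the browser. build_friends_url is a hypothetical helper, not part of any library, and the sketch leaves out __dyn, __req, __rev and the other session-specific parameters; whether the server accepts a request without them is exactly the open question here.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = ('https://www.facebook.com/ajax/pagelet/generic.php/'
            'AllFriendsAppCollectionPagelet')


def build_friends_url(collection_token, cursor, profile_id, user_id):
    """Assemble the Ajax URL from captured values (hypothetical helper)."""
    data = {
        'collection_token': collection_token,
        'cursor': cursor,
        'tab_key': 'friends',
        'profile_id': profile_id,
        'overview': False,
        'ftid': None,
        'order': None,
        'sk': 'friends',
        'importer_state': None,
    }
    params = {
        'dpr': 1,
        # Compact separators reproduce the JSON as it appears in the captured URL.
        'data': json.dumps(data, separators=(',', ':')),
        '__user': user_id,
        '__a': 1,
    }
    # urlencode percent-escapes the JSON so the URL is safe to send.
    return ENDPOINT + '?' + urlencode(params)


# Values copied from the captured request.
url = build_friends_url(
    '1244314824:2256358349:2',
    'MDpub3Rfc3RydWN0dXJlZDoxMzU2MDIxMTkw',
    1244214828,
    1364274824,
)

# Round-trip check: decoding the URL gives back the same JSON payload.
round_trip = json.loads(parse_qs(urlsplit(url).query)['data'][0])
print(round_trip['cursor'])
```

With a logged-in requests session, the resulting URL could be passed to the session's get method; the missing piece is still where the next cursor and the other tokens come from.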

I know that the current Facebook API does not support viewing a full friends list. I also know that I can do this with Selenium or some other browser simulator, but that feels far too slow. Ideally, I want to scrape thousands of friends lists (of users whose friends lists are public) in a reasonable time.

My current code is this:

import requests
from bs4 import BeautifulSoup

with requests.Session() as session:
    # Set the locale cookie to en_US before logging in.
    requests.utils.add_dict_to_cookiejar(session.cookies, {'locale': 'en_US'})
    form = {'email': 'myusername', 'pass': 'mypassword'}
    response = session.post('https://www.facebook.com/login.php?login_attempt=1&lwv=110', data=form)
    # I'm logged in at this point.
    page = session.get('https://www.facebook.com/yoshidakai/friends?source_ref=pb_friends_tl')
    # Parse the initial page, which only contains the first 20 friends.
    soup = BeautifulSoup(page.text, 'html.parser')

Any help would be appreciated, including other ways to achieve this :)


Tags: facebook, python, python-requests, web-scraping
2016-10-12 18:10

Answers ( 0 )
